Phylogenetic and Phylogenomic
                    Approaches to the
              Study of Microbial Communities
                                  March 7, 2012
                         IOM Forum on Microbial Threats
                           Social Biology of Microbes

                                Jonathan A. Eisen
                          University of California, Davis


Wednesday, March 7, 12
Acknowledgements
            • $$$
                  •      DOE
                  •      NSF
                  •      GBMF
                  •      Sloan
                  •      DARPA
                  •      DSMZ
                  •      DHS
            • People, places
                  • DOE JGI: Eddy Rubin, Phil Hugenholtz, Nikos Kyrpides
                  • UC Davis: Aaron Darling, Dongying Wu, Holly Bik, Russell Neches,
                    Jenna Morgan-Lang
                  • Other: Jessica Green, Katie Pollard, Martin Wu, Tom Slezak, Jack
                    Gilbert, Steven Kembel, J. Craig Venter, Naomi Ward, Hans-Peter Klenk


Outline


            • Introduction
            • Phylotyping and phylogenetic ecology
            • Functional prediction
            • Selecting organisms
            • Future needs




Phylogeny
 • Phylogeny is a description of the
   evolutionary history of
   relationships among organisms (or
   their parts).
 • This is frequently portrayed in a
   diagram called a phylogenetic tree.
 • Phylogenies can be more complex
   than a bifurcating tree (e.g.,
   lateral gene transfer,
   recombination, hybridization)




Whatever the History: Trying to Incorporate it is Critical




                                 Four Models for Rooting TOL
                         from Lake et al. doi: 10.1098/rstb.2009.0035

Uses of Phylogeny
                 in Genomics and Metagenomics

                             Example 1:

                     Phylotyping and Phylogenetic
                               Ecology

rRNA Phylotyping
                                    • Collect DNA from
                                      environment
                                    • PCR amplify rRNA genes
                                      using broad (so-called
                                      universal) primers
                                    • Sequence
                                    • Align to others
                                    • Infer evolutionary tree
                                    • Unknowns “identified” by
                                      placement on tree
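
The last step above, identifying an unknown by where it falls on the tree, can be sketched as follows. This is a minimal illustration and not the method used in the talk: the tree, the sequence names, and the rule "assign the taxon shared by the query's sister group" are all assumptions made for the example.

```python
from collections import Counter

# Toy rooted tree as nested tuples. Leaves are (name, taxon) pairs; the
# unknown sequence has taxon None. All names here are hypothetical.
TREE = (
    (("ref1", "Alphaproteobacteria"),
     (("query", None), ("ref2", "Alphaproteobacteria"))),
    (("ref3", "Firmicutes"), ("ref4", "Firmicutes")),
)

def is_leaf(node):
    # A leaf is a (name, taxon) pair; an internal node is a tuple of nodes.
    return len(node) == 2 and isinstance(node[0], str)

def leaves(node):
    if is_leaf(node):
        return [node]
    return [leaf for child in node for leaf in leaves(child)]

def sister_taxa(node, query):
    """Taxon labels in the query's sister group, or None if query is absent."""
    if is_leaf(node):
        return None
    for i, child in enumerate(node):
        if is_leaf(child) and child[0] == query:
            siblings = [c for j, c in enumerate(node) if j != i]
            return [taxon for s in siblings
                    for _, taxon in leaves(s) if taxon is not None]
        found = sister_taxa(child, query)
        if found is not None:
            return found
    return None

def phylotype(tree, query):
    labels = sister_taxa(tree, query)
    if not labels:
        return "unclassified"
    taxon, count = Counter(labels).most_common(1)[0]
    # Only assign when the whole sister group agrees (assumed rule).
    return taxon if count == len(labels) else "ambiguous"
```

Calling `phylotype(TREE, "query")` assigns the unknown to the taxon of its sister lineage. Real pipelines infer the tree from an alignment first and usually take branch support into account.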




Three Major Issues in Phylotyping

            • Beyond Moore’s Law
            • Metagenomics
            • Short reads




rRNA Phylotyping in Sargasso Sea Metagenomic Data

            Venter et al., Science 304: 66-74. 2004



RecA Phylotyping in Sargasso Data

                 Venter et al., Science 304: 66-74. 2004



Sargasso Phylotypes

   [Figure: bar chart. Y axis: Weighted % of Clones (ticks 0 to 0.500). X axis: Major Phylogenetic Group (legible labels include Alphaproteobacteria, Gammaproteobacteria, Deltaproteobacteria, Firmicutes, Chlorobi, Chloroflexi, Fusobacteria, Euryarchaeota). Series: EFG, EFTu, HSP70, RecA, RpoB, rRNA.]

   Venter et al., Science 304: 66-74. 2004

Solution: More Automation


            • BLAST????
            • Composition/word frequencies
            • Automation of trees




AutoPhylotyping 1:
                         Each Sequence is an Island




STAP

   Figure 1. A flow chart of the STAP pipeline.
   doi:10.1371/journal.pone.0002566.g001

   [...] STAP database, and the query sequence is aligned to them using the CLUSTALW profile alignment algorithm [40] as described above for domain assignment. By adapting the profile alignment [...]

   Each sequence analyzed separately

   Figure 2. Domain assignment. In Step 1, STAP assigns a domain to each query sequence based on its position in a maximum likelihood tree of representative ss-rRNA sequences. Because the tree illustrated here is not rooted, domain assignment would not be accurate and reliable (sequence similarity based methods cannot make an accurate assignment in this case either). However the figure illustrates an important role of the tree-based domain assignment step, namely automatic identification of deep-branching environmental ss-rRNAs.
   doi:10.1371/journal.pone.0002566.g002

   Wu et al. 2008 PLoS One

AMPHORA




   Wu and Eisen Genome Biology 2008 9:R151   doi:10.1186/gb-2008-9-10-r151


WGT




   Wu and Eisen Genome Biology 2008 9:R151   doi:10.1186/gb-2008-9-10-r151

AMPHORA
                         Guide tree

   Wu and Eisen Genome Biology 2008 9:R151   doi:10.1186/gb-2008-9-10-r151

Comparison of the phylotyping performance by AMPHORA and MEGAN. The sensitivity and specificity of the phylotyping
methods were measured across taxonomic ranks using simulated Sanger shotgun sequences of 31 genes from 100
representative bacterial genomes. The figure shows that AMPHORA significantly outperforms MEGAN in sensitivity without
sacrificing specificity.


Wu and Eisen Genome Biology 2008 9:R151                            doi:10.1186/gb-2008-9-10-r151
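
To make the two measures in the caption above concrete, here is a small sketch of how sensitivity and specificity could be scored for a simulated read set at one taxonomic rank. The definitions used are an assumption, not confirmed from the paper: sensitivity is taken as the fraction of all queries assigned correctly, and specificity as the fraction of assigned queries that are correct. The function name and input layout are hypothetical.

```python
def score_phylotyping(assignments):
    """Score (true_taxon, predicted_taxon_or_None) pairs at one rank.

    Assumed definitions (not taken from the paper):
      sensitivity = correct assignments / all query sequences
      specificity = correct assignments / sequences that got an assignment
    """
    total = len(assignments)
    assigned = [(t, p) for t, p in assignments if p is not None]
    correct = sum(1 for t, p in assigned if t == p)
    sensitivity = correct / total if total else 0.0
    specificity = correct / len(assigned) if assigned else 0.0
    return sensitivity, specificity
```

Under these definitions a method that declines to classify most reads can keep specificity high while sensitivity falls, which is why the two are reported together.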



AutoPhylotyping 2:
                         Most in the Family




Metagenomic Phylogenetic challenge

                         A single tree with everything

rRNA Phylotyping in Sargasso Sea Metagenomic Data




            Venter et al., Science 304: 66-74. 2004



Combine all into one alignment



               Figure 1. A flow chart of the STAP pipeline.

APPLIED AND ENVIRONMENTAL MICROBIOLOGY, Feb. 2006, p. 1680–1683, Vol. 72, No. 2
0099-2240/06/$08.00+0 doi:10.1128/AEM.72.2.1680–1683.2006
Copyright © 2006, American Society for Microbiology. All Rights Reserved.

Characterization of Bacterial Communities Associated with Deep-Sea Corals on Gulf of Alaska Seamounts†

Kevin Penn,1 Dongying Wu,1 Jonathan A. Eisen,1,2 and Naomi Ward1,3*

The Institute for Genomic Research, 9712 Medical Center Drive, Rockville, Maryland 20850 (1); Johns Hopkins University, Charles and 34th Streets, Baltimore, Maryland 21218 (2); and Center of Marine Biotechnology, 701 East Pratt Street, Baltimore, Maryland 21202 (3)

Received 22 June 2005/Accepted 8 November 2005

Although microbes associated with shallow-water corals have been reported, deepwater coral microbes are poorly characterized. A cultivation-independent analysis of Alaskan seamount octocoral microflora showed that Proteobacteria (classes Alphaproteobacteria and Gammaproteobacteria), Firmicutes, Bacteroidetes, and Acidobacteria dominate and vary in abundance. More sampling is needed to understand the basis and significance of this variation.

The most abundant corals on Gulf of Alaska seamounts are octocorals (9), which create a habitat structure for mobile fauna (4). Concerns about the benthic impacts of commercial fishing have renewed interest in habitat-forming deep-sea corals (4). Studies of shallow-water scleractinian corals (12) have revealed a diverse microflora and evidence of host-microbe interactions. Although studies of the deep-sea octocoral microflora are under way (10), there have been no published reports describing the microbial community composition.

Three Gulf of Alaska seamounts were visited during research cruise AT7-15/16 aboard the R/V Atlantis. The biological objectives of the cruise included sampling of deep-sea octocorals for studies of their dispersal and reproductive strategies, with a particular focus on the abundant bamboo corals (Isididae). We took advantage of available coral specimens to examine their associated microflora.

Coral, rock, and water column samples (Table 1) were collected from the Warwick, Murray, and Chirikof seamounts using the deep-submergence vehicle Alvin. Corals and rocks were harvested using the submersible’s manipulators and stored in a closed box during ascent to minimize physical disturbance by surface waters. The water adjacent to coral colonies was sampled using a Niskin bottle fired at depth. After submersible recovery, freshly extruded coral exopolysaccharide and scrapings of coral and rock surfaces were transferred to sterile cryovials. Water samples were prefiltered through 20-µm-pore-size Nitex, concentrated using a TFF apparatus (Millipore), and vacuum filtered (1.0-µm and 0.2-µm pore size). The 0.2-µm filter retentate was resuspended in sterile saline

...vitrogen). Amplifications were performed with an initial denaturation of 2 min at 94°C, followed by 29 cycles of 30 s at 94°C, 30 s at 55°C, and 2 min at 72°C, with a final extension of 5 min at 72°C. PCR products were cloned using a TOPO TA cloning kit (Invitrogen), and primers M13F and M13R were used to sequence positions 9 to 1545 of the 16S rRNA gene.

BLASTN (1) was used to compare our query sequences with reference sequences from the RDP2 (3) database. Representative sequences from the BLASTN output were aligned with our query sequences, using an RDP2-provided profile alignment. Neighbor-joining trees were created using PHYLIP (6) and used to assign putative taxonomy down to the family level. Detailed phylogenetic trees were constructed using the relevant sequences from each clone library, two reference sequences most closely related to the query sequence, and additional reference sequences. Alignments were generated using the RDP2 profile alignment, and bootstrapped neighbor-joining trees were reconstructed using PHYLIP (6).

The clones sequenced comprised 19 phyla (see Table S1 in the supplemental material), dominated by Proteobacteria (classes Alphaproteobacteria and Gammaproteobacteria), Firmicutes, Bacteroidetes, and Acidobacteria (Fig. 1). The relative proportions of these groups varied widely across the five coral samples, as did the degree to which a given library was dominated by a single group (Fig. 1; see Table S1 in the supplemental material). At the subphylum level, families occurring in major proportions included Rhizobiaceae, Rhodobacteraceae, and Sphingomonadaceae (Alphaproteobacteria); Pseudomona-

FIG. 1. Histogram showing percentages of composition (by taxon) for 16S rRNA gene libraries generated for this study, showing only taxa comprising at least 20% of sequences in at least one clone library.

[...] sequences obtained in this and previous (12, 13) studies fall within the marine roseobacters (see Fig. S1 in the supplemental material), a major clade of culturable marine heterotrophs (7), many of which play a role in sulfur cycles (e.g., see reference 8). One clade of six CGOF sequences is most closely related to NAC11-6 from a dimethylsulfoniopropionate-producing algal bloom (8), while CGOCA38 groups closely with NAC11-7 (from the same algal bloom study [8]) and an uncultivated marine bacterium, ZD0207, associated with dimethylsulfoniopropionate uptake (15). CGOAB33 is most similar to one (slope strain EI1*) of a group of thiosulfate-oxidizing bacteria from marine sediments and hydrothermal vents (14). Members of the family Pseudomonadaceae comprised 23 to 69% of the gammaproteobacterial sequences in those samples

[...] Cluster, cluster of more than three identical sequences.

RecA Phylotyping in Sargasso Data




                 Venter et al., Science 304: 66-74. 2004



Sargasso Phylotypes
                        0.500

                                                                                 EFG      EFTu        HSP70            RecA        RpoB            rRNA



                        0.375
 Weighted % of Clones




                        0.250




                        0.125




                           0
                                            ia




                                                                ia




                                                                                   ria




                                                                                                s




                                                                                                              i




                                                                                                                              xi




                                                                                                                                              ia




                                                                                                                                                                   a
                                                                                                           ob
                                                                                                te




                                                                                                                                                                ot
                                                                                                                              le
                                          er




                                                             er




                                                                                                                                           er
                                                                                    e




                                                                                              u




                                                                                                         or




                                                                                                                                                              ae
                                                                                                                           of
                                        ct




                                                             ct




                                                                                 ct




                                                                                                                                           ct
                                                                                           ic




                                                                                                        hl




                                                                                                                         or
                                      ba




                                                         ba




                                                                             ba




                                                                                                                                          ba




                                                                                                                                                              ch
                                                                                           rm




                                                                                                       C




                                                                                                                        hl




                         [Figure; x-axis: Major Phylogenetic Group]

                         Venter et al., Science 304: 66-74. 2004
AutoPhylotyping 3:
                          All in the Family




Metagenomic Phylogenetic challenge
                         [Diagram: rows of x's of differing lengths and offsets]




                         A single tree with everything




PhylOTU - Sharpton et al. PLoS Comp. Bio 2011
   Figure 1. PhylOTU Workflow. Computational processes are represented as squares and databases are represented as cylinders [...] workflow of PhylOTU. See Results section for details.
   doi:10.1371/journal.pcbi.1001061.g001
AutoPhylotyping 4:
                         All in the Genome




Challenge
            • Each gene poorly sampled in metagenomes
            • Can we combine all into a single tree?




AMPHORA ALL




          Kembel et al. The phylogenetic diversity of metagenomes. PLoS
          One 2011
[...] the communities combined (18), is a quantitative measure that accounts for different levels
of divergence between sequences. The phylogenetic test (P test), which measures the significance
of the association between environment and phylogeny (18), is typically used as a qualitative
measure because duplicate sequences are usually removed from the tree. However, the P test may be
used in a semiquantitative manner if all clones, even those with identical or near-identical
sequences, are included in the tree (13).

Here we describe a quantitative version of UniFrac that we call “weighted UniFrac.” We show that
weighted UniFrac behaves similarly to the FST test in situations where both are [...]

FIG. 1. Calculation of the unweighted and the weighted UniFrac measures. Squares and circles
represent sequences from two different environments. (a) In unweighted UniFrac, the distance
between the circle and square communities is calculated as the fraction of the branch length that
has descendants from either the square or the circle environment (black) but not both (gray).
(b) In weighted UniFrac, branch lengths are weighted by the relative abundance of sequences in
the square and circle communities; square sequences are weighted twice as much as circle sequences
because there are twice as many total circle sequences in the data set. The width of branches is
proportional to the degree to which each branch is weighted in the calculations, and gray branches
have no weight. Branches 1 and 2 have heavy weights since the descendants are biased toward the
square and circles, respectively. Branch 3 contributes no value since it has an equal contribution
from circle and square sequences after normalization.
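
As a rough illustration of the unweighted measure in panel (a) of the legend above, the following Python sketch computes the fraction of branch length whose descendants come from only one environment. The tree encoding (parent and length dictionaries), the function name, and the toy tree are assumptions made for this example; they are not taken from the UniFrac software.

```python
from collections import defaultdict


def unweighted_unifrac(parent, length, leaf_env):
    """Fraction of branch length whose descendants come from only one environment.

    parent:   maps each non-root node to its parent node
    length:   maps each non-root node to the length of the branch above it
    leaf_env: maps each leaf to its environment label (e.g. "square", "circle")
    """
    envs_below = defaultdict(set)
    for leaf, env in leaf_env.items():
        node = leaf
        while node in parent:          # the root has no branch above it
            envs_below[node].add(env)  # record which environments sit below this branch
            node = parent[node]
    unique = sum(length[n] for n, e in envs_below.items() if len(e) == 1)
    total = sum(length[n] for n in envs_below)
    return unique / total


# Hypothetical toy tree: root -> A, B; A -> s1, c1; B -> s2, s3 (all branch lengths 1).
parent = {"A": "root", "B": "root", "s1": "A", "c1": "A", "s2": "B", "s3": "B"}
length = {node: 1.0 for node in parent}
leaf_env = {"s1": "square", "c1": "circle", "s2": "square", "s3": "square"}

distance = unweighted_unifrac(parent, length, leaf_env)
print(round(distance, 3))
```

In this toy tree only the branch above A has descendants from both environments, so five of the six units of branch length count as unique. The weighted measure in panel (b), which additionally weights branches by relative abundance, is not covered by this sketch.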




         Figure 3. Taxonomic diversity and standardized phylogenetic diversity versus depth in
         environmental samples along an oceanic depth gradient at the HOT ALOHA site.

AutoPhylotyping 5:
                  Novel lineages and decluttering




RecA Tree of Life
                         Bacteria




                                                                 Archaea
                                                                        Other lineages?




                          Eukaryotes

                              Figure from Barton, Eisen et al.
                              “Evolution”, CSHL Press. 2007.
                           Based on tree from Pace 1997 Science
                                       276:734-740
Lek Clustering

                   [Graph figure: nodes joined by edges labeled 0.33, 0.75, and 1]

                                       Cutoff of 0.5
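
The slides do not spell out the Lek algorithm itself, so the Python sketch below is only a generic stand-in for the idea the figure suggests: drop edges whose score falls below a cutoff (0.5 here) and report the groups that remain connected. The function name, the node names, and the graph layout are hypothetical; only the edge values and the cutoff come from the slide.

```python
def threshold_clusters(edges, cutoff):
    """Group nodes connected by edges scoring at or above the cutoff (union-find)."""
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving keeps lookups short
            x = parent[x]
        return x

    for a, b, score in edges:
        root_a, root_b = find(a), find(b)  # registers both nodes even if the edge is dropped
        if score >= cutoff:
            parent[root_a] = root_b

    groups = {}
    for node in parent:
        groups.setdefault(find(node), set()).add(node)
    return sorted(sorted(group) for group in groups.values())


# Hypothetical graph using the edge values shown on the slide.
edges = [
    ("a", "b", 1.0), ("b", "c", 0.75), ("a", "c", 0.75),
    ("c", "d", 0.33),
    ("d", "e", 0.75), ("e", "f", 1.0), ("d", "f", 0.75),
]
clusters = threshold_clusters(edges, cutoff=0.5)
print(clusters)
```

With this layout the single 0.33 edge is the only link between the two groups, so a cutoff of 0.5 separates them into two clusters.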




                         [Tree figure: branches labeled RecA and GOS 1 to GOS 5]



RpoB Too




Side benefit: binning




Sulcia makes amino acids




    Baumannia makes vitamins and cofactors




                             Wu et al. 2006 PLoS Biology 4: e188.
Uses of Phylogeny
                in Genomics and Metagenomics

                               Example 2:

                         Functional Diversity and
                          Functional Predictions


Predicting Function

            • Key step in genome projects
            • More accurate predictions help guide
              experimental and computational analyses
            • Many diverse approaches
            • All improved by “phylogenomic” type
              analyses that integrate evolutionary
              reconstructions and understanding of how new
              functions evolve


PHYLOGENETIC PREDICTION OF GENE FUNCTION

            METHOD
            • CHOOSE GENE(S) OF INTEREST
            • IDENTIFY HOMOLOGS
            • ALIGN SEQUENCES
            • CALCULATE GENE TREE
            • OVERLAY KNOWN FUNCTIONS ONTO TREE
            • INFER LIKELY FUNCTION OF GENE(S) OF INTEREST
            • ACTUAL EVOLUTION (ASSUMED TO BE UNKNOWN)

            EXAMPLE A: gene of interest 2A; homologs 1A, 2A, 3A, 1B, 2B, 3B from Species 1, Species 2, and Species 3; gene tree marked "Duplication?"
            EXAMPLE B: gene of interest 5; homologs 1 to 6; inference marked "Ambiguous"

            Based on Eisen, 1998 Genome Res 8: 163-167.
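
The last two steps of the method (overlay known functions onto the tree, then infer the likely function of the gene of interest) can be sketched in Python as follows. This is a simplified illustration, not the procedure from Eisen 1998: it assumes a rooted gene tree written as nested tuples, looks for the smallest clade around the query gene that contains annotated genes, and reports "Ambiguous" when those annotations disagree. The function names, tree shapes, and function labels are hypothetical.

```python
def leaves(tree):
    """Return the leaf names of a tree given as nested tuples of strings."""
    if isinstance(tree, str):
        return [tree]
    return [leaf for child in tree for leaf in leaves(child)]


def predict_function(tree, query, known):
    """Predict the query's function from the smallest annotated clade around it."""
    if query not in leaves(tree):
        raise ValueError("query gene is not in the tree")
    if isinstance(tree, str):
        return None                      # reached the query leaf itself
    for child in tree:
        if query in leaves(child):       # descend toward the query first
            result = predict_function(child, query, known)
            if result is not None:
                return result
    labels = {known[g] for g in leaves(tree) if g in known and g != query}
    if not labels:
        return None                      # no annotated genes in this clade
    return labels.pop() if len(labels) == 1 else "Ambiguous"


# Example A (hypothetical annotations): a duplication separates the A and B paralogs.
tree_a = (("1A", ("2A", "3A")), ("1B", ("2B", "3B")))
known_a = {"1A": "function A", "3A": "function A",
           "1B": "function B", "2B": "function B", "3B": "function B"}
print(predict_function(tree_a, "2A", known_a))

# Example B (hypothetical annotations): the nearest annotated relatives disagree.
tree_b = ((("1", "2"), ("3", "4")), ("5", "6"))
known_b = {"1": "function X", "2": "function X", "3": "function Y", "4": "function Y"}
print(predict_function(tree_b, "5", known_b))
```

In Example A the query 2A sits next to an annotated ortholog, so the prediction follows that clade rather than the paralogs across the duplication. In Example B the closest annotated genes carry two different labels, so the sketch returns "Ambiguous".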
[Phylogenetic tree figure of haloarchaeal genomes; scale bar 0.01; numeric values shown on branches]

  Legend:
  Dataset genes: MA ammonia-lyase, MA mutase S subunit, MA mutase E subunit, PHA synthase, cellulase, CRISPRs, CAS
  Color ranges: New Genomes

  Taxa: Halorubrum lacusprofundi, Haloquadratum walsbyi, Halogeometricum borinquense,
  Haloferax mediterranei, Haloferax mucosum, Haloferax volcanii, Haloferax sulfurifontis,
  Haloferax denitrificans, Halalkalicoccus jeotgali, Halopiger xanaduensis, Natrialba magadii,
  Haloterrigena turkmenica, Halobacterium sp. NRC 1, Halobacterium salinarum R1,
  Natronomonas pharaonis, Halorhabdus utahensis, Halomicrobium mukohataei,
  Haloarcula vallismortis, Haloarcula marismortui, Haloarcula sinaiiensis, Haloarcula californiae




Haloarchaea TBPs

     [Phylogenetic tree figure of TBP homologs with support values on branches]


     Figure 8. Independent expansion of the TATA-binding protein family in two haloarchaeal genera. Phylogeny of TATA-binding protein (TBP) homologs identified by RAST, with bootstrap values
     shown. Colored branches represent duplication events (with the dark blue branch representing four duplications). Ancestral TBP (found in all genomes) is shown on the purple branch. Successive
     duplications are shown in darkening shades of green (Halobacterium) or blue (Haloferax).
                                                                                                                         Lynch et al. in preparation
Massive Diversity of Proteorhodopsins




                                                   Venter et al., 2004
Characterizing the niche-space distributions of components                                   Metagenomics DARPA

      [Heat map figure.
       Rows: Sites, labeled region_sample ID_habitat (e.g., Sargasso Sea_GS001c_Open Ocean), from the
       Polynesia Archipelagos, Indian Ocean, Galapagos Islands, Caribbean Sea, Eastern Tropical Pacific,
       Sargasso Sea, and North American East Coast.
       Columns: five panels, each labeled "Co" (label truncated).
       Legend, General: High, Medium, Low, NA.
       Legend, Water depth: >4000m, 2000-4000m, 900-2000m, 100-200m, 20-100m, 0-20m.
       Environmental variables: Chlorophyll, Water Depth, Salinity, Temperature, Sample Depth.]


                                                                                                                                                                                                                                                                    Insolation
                                                                                                  mp                         mp                         mp                       mp                       mp
                                                                                                       on                         on                         on                       on                       on
                                                                                                            en                         en                         en                       en                       en
                                                                                                                 t1                         t2                         t3                       t4                       t5




                                                                                                        (a)                                                                                                                             (b)                        (c)




 Figure 3: a) Niche-space distributions for our five components (H T ); b) the site-
                       ˆ ˆ
 similarity matrix (H T H); c) environmental variables for the sites. The matrices are
 aligned so that the same row corresponds to the same site in each matrix. Sites are
 ordered by applying spectral reordering to the similarity matrix (see Materials and
 Methods). Rows are aligned across the three matrices.
Wednesday, March 7, 12
Uses of Phylogeny
                in Genomics and Metagenomics

                             Example 3:

                    Selecting Organisms for Study




rRNA Tree of Life
[Figure: rRNA tree of life showing the three domains: Bacteria, Archaea, and Eukaryotes]
                              Figure from Barton, Eisen et al.
                              “Evolution”, CSHL Press. 2007.
                           Based on tree from Pace 1997 Science
                                       276:734-740
As of 2002
                         • At least 40 phyla of bacteria
                         • Most genomes from three phyla
                         • Some studies in other phyla
                         • Some other phyla are only sparsely sampled
                         • Same trend in Eukaryotes and in Viruses

[Figure: tree of bacterial phyla, based on Hugenholtz, 2002. Tip labels:]
                         Proteobacteria
                         TM6
                         OS-K
                         Acidobacteria
                         Termite Group
                         OP8
                         Nitrospira
                         Bacteroides
                         Chlorobi
                         Fibrobacteres
                         Marine Group A
                         WS3
                         Gemmimonas
                         Firmicutes
                         Fusobacteria
                         Actinobacteria
                         OP9
                         Cyanobacteria
                         Synergistes
                         Deferribacteres
                         Chrysiogenetes
                         NKB19
                         Verrucomicrobia
                         Chlamydia
                         OP3
                         Planctomycetes
                         Spirochaetes
                         Coprothermobacter
                         OP10
                         Thermomicrobia
                         Chloroflexi
                         TM7
                         Deinococcus-Thermus
                         Dictyoglomus
                         Aquificae
                         Thermodesulfobacteria
                         Thermotogae
                         OP1
                         OP11
http://www.jgi.doe.gov/programs/GEBA/pilot.html
GEBA Pilot Project: Components
         • Project overview (Phil Hugenholtz, Nikos Kyrpides, Jonathan Eisen,
           Eddy Rubin, Jim Bristow)
         • Project management (David Bruce, Eileen Dalin, Lynne Goodwin)
         • Culture collection and DNA prep (DSMZ, Hans-Peter Klenk)
         • Sequencing and closure (Eileen Dalin, Susan Lucas, Alla Lapidus, Mat
           Nolan, Alex Copeland, Cliff Han, Feng Chen, Jan-Fang Cheng)
         • Annotation and data release (Nikos Kyrpides, Victor Markowitz, et al.)
         • Analysis (Dongying Wu, Kostas Mavrommatis, Martin Wu, Victor
           Kunin, Neil Rawlings, Ian Paulsen, Patrick Chain, Patrik D’Haeseleer,
           Sean Hooper, Iain Anderson, Amrita Pati, Natalia N. Ivanova,
           Athanasios Lykidis, Adam Zemla)
         • Adopt a microbe education project (Cheryl Kerfeld)
         • Outreach (David Gilbert)
         • $$$ (DOE, Eddy Rubin, Jim Bristow)



GEBA Lesson 1:
                     Phylogeny-driven genome selection (and
                    phylogenetics) improves genome annotation
              • Compared results from 56 GEBA genomes vs. 56
                randomly sampled new genomes
              • Better definition of protein family sequence “patterns”
              • Greatly improves “comparative” and “evolutionary” based
                predictions
              • Conversion of hypothetical proteins into conserved hypotheticals
              • Linking distantly related members of protein families
              • Improved non-homology prediction
GEBA Lesson 2

                         Phylogeny-driven genome selection
                         helps discover new genetic diversity
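
As a rough illustration of what phylogeny-driven selection can mean in practice, the sketch below computes phylogenetic diversity (PD, taken here as the total branch length connecting a set of tips to the root) on a toy tree and greedily picks the tips that add the most PD. The tree, the tip names, and the functions `phylogenetic_diversity` and `greedy_select` are hypothetical and for illustration only. This is not presented as the selection procedure the GEBA project actually used.

```python
# Hypothetical toy tree: child -> (parent, branch length). "root" has no entry.
TREE = {
    "A": ("n1", 1.0),
    "B": ("n1", 1.0),
    "n1": ("n2", 0.5),
    "C": ("n2", 2.0),
    "n2": ("root", 0.5),
    "D": ("root", 4.0),
}


def phylogenetic_diversity(tree, tips):
    """Sum the lengths of all branches on the paths from the tips to the root."""
    edges = set()
    for tip in tips:
        node = tip
        while node in tree:        # stop once we reach the root
            edges.add(node)        # each branch is identified by its child node
            node = tree[node][0]
    return sum(tree[child][1] for child in edges)


def greedy_select(tree, candidates, k):
    """Pick k tips, each time adding the one that increases PD the most."""
    chosen = []
    remaining = sorted(candidates)
    for _ in range(k):
        best = max(remaining,
                   key=lambda t: phylogenetic_diversity(tree, chosen + [t]))
        chosen.append(best)
        remaining.remove(best)
    return chosen


selected = greedy_select(TREE, ["A", "B", "C", "D"], 2)
print(selected, phylogenetic_diversity(TREE, selected))
```

On this toy tree the close relatives A and B together cover less branch length than the distantly related pair the greedy step picks, which is the intuition behind sampling by phylogeny rather than at random.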




Protein Family Rarefaction Curves
            • Take data set of multiple complete genomes
            • Identify all protein families using MCL
            • Plot # of genomes vs. # of protein families
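
The steps above can be sketched as follows, assuming the protein families have already been identified (for example from MCL clusters) and each genome is represented as a set of family IDs. The genome names, family IDs, and the function `rarefaction_curve` are made up for illustration. Averaging over random genome orderings is one common way to smooth such a curve and is an assumption here, not something stated on the slide.

```python
import random

# Hypothetical input: genome name -> set of protein family IDs found in it.
TOY_GENOMES = {
    "genome1": {"fam1", "fam2", "fam3"},
    "genome2": {"fam2", "fam3", "fam4"},
    "genome3": {"fam5"},
}


def rarefaction_curve(genome_families, n_permutations=100, seed=0):
    """Mean number of distinct families seen after adding 1, 2, ..., N genomes."""
    rng = random.Random(seed)
    genomes = list(genome_families)
    totals = [0] * len(genomes)
    for _ in range(n_permutations):
        rng.shuffle(genomes)               # random order of genome addition
        seen = set()
        for i, genome in enumerate(genomes):
            seen |= genome_families[genome]
            totals[i] += len(seen)         # cumulative family count at step i
    return [total / n_permutations for total in totals]


curve = rarefaction_curve(TOY_GENOMES)
for n_genomes, n_families in enumerate(curve, start=1):
    print(n_genomes, n_families)
```

The printed pairs (number of genomes, mean number of protein families) are the points that would be plotted; a curve that keeps climbing suggests that new genomes are still adding new families.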




Wu et al. 2009 Nature 462, 1056-1060
Synapomorphies exist




Wu et al. 2009 Nature 462, 1056-1060

Families/PD not uniform
[Figure: plot residue; only the values 31 and 6 are recoverable]
GEBA Lesson 3

                         Improves analysis of genome data from
                                 uncultured organisms




Shotgun Sequencing Allows Use of Other Markers
GEBA Project improves metagenomic analysis
[Figure: "Sargasso Phylotypes" bar chart. Y-axis: Weighted % of Clones (0 to 0.500). X-axis: Major Phylogenetic Group; recoverable labels: Alphaproteobacteria, Gammaproteobacteria, Deltaproteobacteria, Firmicutes, Chlorobi, Chloroflexi, Fusobacteria, Euryarchaeota. Series: EFG, EFTu, HSP70, RecA, RpoB, rRNA.]
Venter et al., Science 304: 66-74. 2004
Shotgun Sequencing Allows Use of Other Markers
But not a lot
[Figure: "Sargasso Phylotypes" bar chart of Weighted % of Clones by Major Phylogenetic Group, with series EFG, EFTu, HSP70, RecA, RpoB, rRNA.]
Venter et al., Science 304: 66-74. 2004
Phylogeny and Metagenomics
                              Future 1
                             Need to adapt genomic and
                         metagenomic methods to make better
                                     use of data




iSEEM Project




Wednesday, March 7, 12
Wednesday, March 7, 12

Jonathan Eisen: Phylogenetic approaches to the analysis of genomes and metagenomes

  • 1. Phylogenetic and Phylogenomic Approaches to the Study of Microbial Communities March 7, 2012 IOM Forum on Microbial Threats Social Biology of Microbes Jonathan A. Eisen University of California, Davis
  • 2. Acknowledgements • $$$ • DOE • NSF • GBMF • Sloan • DARPA • DSMZ • DHS • People, places • DOE JGI: Eddy Rubin, Phil Hugenholtz, Nikos Kyrpides • UC Davis: Aaron Darling, Dongying Wu, Holly Bik, Russell Neches, Jenna Morgan-Lang • Other: Jessica Green, Katie Pollard, Martin Wu, Tom Slezak, Jack Gilbert, Steven Kembel, J. Craig Venter, Naomi Ward, Hans-Peter Klenk
  • 3. Outline • Introduction • Phylotyping and phylogenetic ecology • Functional prediction • Selecting organisms • Future needs
  • 4. Phylogeny • Phylogeny is a description of the evolutionary history of relationships among organisms (or their parts). • This is frequently portrayed in a diagram called a phylogenetic tree. • Phylogenies can be more complex than a bifurcating tree (e.g., lateral gene transfer, recombination, hybridization)
  • 5. Whatever the History: Trying to Incorporate it is Critical Four Models for Rooting TOL from Lake et al. doi: 10.1098/rstb.2009.0035
  • 6. Uses of Phylogeny in Genomics and Metagenomics Example 1: Phylotyping and Phylogenetic Ecology
  • 7. rRNA Phylotyping • Collect DNA from environment • PCR amplify rRNA genes using broad (so-called universal) primers • Sequence • Align to others • Infer evolutionary tree • Unknowns “identified” by placement on tree
  • 9. Three Major Issues in Phylotyping • Beyond Moore’s Law • Metagenomics • Short reads
  • 10. rRNA Phylotyping in Sargasso Sea Metagenomic Data Venter et al., Science 304: 66. 2004
  • 11. RecA Phylotyping in Sargasso Data Venter et al., Science 304: 66. 2004
  • 12. RecA Phylotyping in Sargasso Data Venter et al., Science 304: 66. 2004
  • 13. Sargasso Phylotypes [Figure: bar chart of weighted % of clones (y-axis, 0 to 0.500) by major phylogenetic group (x-axis) for the markers EFG, EFTu, HSP70, RecA, RpoB, and rRNA] Venter et al., Science 304: 66-74. 2004
  • 14. Solution: More Automation • BLAST???? • Composition/word frequencies • Automation of trees
  • 15. AutoPhylotyping 1: Each Sequence is an Island
  • 16. STAP Wu et al. 2008 PLoS One Figure 1. A flow chart of the STAP pipeline.
  • 17. STAP • Each sequence analyzed separately • Figure 1. A flow chart of the STAP pipeline. doi:10.1371/journal.pone.0002566.g001 • Figure 2. Domain assignment. In Step 1, STAP assigns a domain to each query sequence based on its position in a maximum likelihood tree of representative ss-rRNA sequences. Because the tree illustrated here is not rooted, domain assignment would not be accurate and reliable (sequence similarity based methods cannot make an accurate assignment in this case either). However the figure illustrates an important role of the tree-based domain assignment step, namely automatic identification of deep-branching environmental ss-rRNAs. doi:10.1371/journal.pone.0002566.g002 • Excerpt: “... STAP database, and the query sequence is aligned to them using the CLUSTALW profile alignment algorithm [40] as described above for domain assignment. By adapting the profile alignment ...” • Wu et al. 2008 PLoS One
  • 18. AMPHORA Wu and Eisen Genome Biology 2008 9:R151 doi:10.1186/gb-2008-9-10-r151
  • 19. WGT Wu and Eisen Genome Biology 2008 9:R151 doi:10.1186/gb-2008-9-10-r151
  • 20. AMPHORA • Guide tree • Wu and Eisen Genome Biology 2008 9:R151 doi:10.1186/gb-2008-9-10-r151
  • 21. Wu and Eisen Genome Biology 2008 9:R151 doi:10.1186/gb-2008-9-10-r151
  • 22. Comparison of the phylotyping performance by AMPHORA and MEGAN. The sensitivity and specificity of the phylotyping methods were measured across taxonomic ranks using simulated Sanger shotgun sequences of 31 genes from 100 representative bacterial genomes. The figure shows that AMPHORA significantly outperforms MEGAN in sensitivity without sacrificing specificity. Wu and Eisen Genome Biology 2008 9:R151 doi:10.1186/gb-2008-9-10-r151
  • 23. AutoPhylotyping 2: Most in the Family
  • 24. Metagenomic Phylogenetic challenge • A single tree with everything
  • 25. Metagenomic Phylogenetic challenge • A single tree with everything
  • 26. rRNA Phylotyping in Sargasso Sea Metagenomic Data Venter et al., Science 304: 66. 2004
  • 27. Combine all into one alignment Figure 1. A flow chart of the STAP pipeline.
  • 28. APPLIED AND ENVIRONMENTAL MICROBIOLOGY, Feb. 2006, p. 1680–1683, Vol. 72, No. 2. 0099-2240/06/$08.00+0 doi:10.1128/AEM.72.2.1680–1683.2006 Copyright © 2006, American Society for Microbiology. All Rights Reserved.
    Characterization of Bacterial Communities Associated with Deep-Sea Corals on Gulf of Alaska Seamounts†
    Kevin Penn (1), Dongying Wu (1), Jonathan A. Eisen (1,2), and Naomi Ward (1,3)*
    The Institute for Genomic Research, 9712 Medical Center Drive, Rockville, Maryland 20850 (1); Johns Hopkins University, Charles and 34th Streets, Baltimore, Maryland 21218 (2); and Center of Marine Biotechnology, 701 East Pratt Street, Baltimore, Maryland 20212 (3)
    Received 22 June 2005/Accepted 8 November 2005
    Although microbes associated with shallow-water corals have been reported, deepwater coral microbes are poorly characterized. A cultivation-independent analysis of Alaskan seamount octocoral microflora showed that Proteobacteria (classes Alphaproteobacteria and Gammaproteobacteria), Firmicutes, Bacteroidetes, and Acidobacteria dominate and vary in abundance. More sampling is needed to understand the basis and significance of this variation.
    The most abundant corals on Gulf of Alaska seamounts are octocorals (9), which create a habitat structure for mobile fauna (4). Concerns about the benthic impacts of commercial fishing have renewed interest in habitat-forming deep-sea corals (4). Studies of shallow-water scleractinian corals (12) have revealed a diverse microflora and evidence of host-microbe interactions. Although studies of the deep-sea octocoral microflora are under way (10), there have been no published reports describing the microbial community composition.
    Three Gulf of Alaska seamounts were visited during research cruise AT7-15/16 aboard the R/V Atlantis. The biological objectives of the cruise included sampling of deep-sea octocorals for studies of their dispersal and reproductive strategies, with a particular focus on the abundant bamboo corals (Isididae). We took advantage of available coral specimens to examine their associated microflora.
    Coral, rock, and water column samples (Table 1) were collected from the Warwick, Murray, and Chirikof seamounts using the deep-submergence vehicle Alvin. Corals and rocks were harvested using the submersible’s manipulators and stored in a closed box during ascent to minimize physical disturbance by surface waters. The water adjacent to coral colonies was sampled using a Niskin bottle fired at depth. After submersible recovery, freshly extruded coral exopolysaccharide and scrapings of coral and rock surfaces were transferred to sterile cryovials. Water samples were prefiltered through 20-µm-pore-size Nitex, concentrated using a TFF apparatus (Millipore), and vacuum filtered (1.0-µm and 0.2-µm pore size). The 0.2-µm filter retentate was resuspended in sterile saline [...]
    [...] Amplifications were performed with an initial denaturation of 2 min at 94°C, followed by 29 cycles of 30 s at 94°C, 30 s at 55°C, and 2 min at 72°C, with a final extension of 5 min at 72°C. PCR products were cloned using a TOPO TA cloning kit (Invitrogen), and primers M13F and M13R were used to sequence positions 9 to 1545 of the 16S rRNA gene.
    BLASTN (1) was used to compare our query sequences with reference sequences from the RDP2 (3) database. Representative sequences from the BLASTN output were aligned with our query sequences, using an RDP2-provided profile alignment. Neighbor-joining trees were created using PHYLIP (6) and used to assign putative taxonomy down to the family level. Detailed phylogenetic trees were constructed using the relevant sequences from each clone library, two reference sequences most closely related to the query sequence, and additional reference sequences. Alignments were generated using the RDP2 profile alignment, and bootstrapped neighbor-joining trees were reconstructed using PHYLIP (6).
    The clones sequenced comprised 19 phyla (see Table S1 in the supplemental material), dominated by Proteobacteria (classes Alphaproteobacteria and Gammaproteobacteria), Firmicutes, Bacteroidetes, and Acidobacteria (Fig. 1). The relative proportions of these groups varied widely across the five coral samples, as did the degree to which a given library was dominated by a single group (Fig. 1; see Table S1 in the supplemental material). At the subphylum level, families occurring in major proportions included Rhizobiaceae, Rhodobacteraceae, and Sphingomonadaceae (Alphaproteobacteria); Pseudomona-
    FIG. 1. Histogram showing percentages of composition (by taxon) for 16S rRNA gene libraries generated for this study, showing only taxa comprising at least 20% of sequences in at least one clone library.
    [...] Cluster, cluster of more than three identical sequences.
    [...] sequences obtained in this and previous (12, 13) studies fall within the marine roseobacters (see Fig. S1 in the supplemental material), a major clade of culturable marine heterotrophs (7), many of which play a role in sulfur cycles (e.g., see reference 8). One clade of six CGOF sequences is most closely related to NAC11-6 from a dimethylsulfoniopropionate-producing algal bloom (8), while CGOCA38 groups closely with NAC11-7 (from the same algal bloom study [8]) and an uncultivated marine bacterium, ZD0207, associated with dimethylsulfoniopropionate uptake (15). CGOAB33 is most similar to one (slope strain EI1*) of a group of thiosulfate-oxidizing bacteria from marine sediments and hydrothermal vents (14). Members of the family Pseudomonadaceae comprised 23 to 69% of the gammaproteobacterial sequences in those samples [...]
  • 29. RecA Phylotyping in Sargasso Data Venter et al., Science 304: 66. 2004
  • 30. RecA Phylotyping in Sargasso Data Venter et al., Science 304: 66. 2004
  • 31. Sargasso Phylotypes [Bar chart: weighted % of clones (y-axis, 0 to 0.500) by major phylogenetic group (x-axis) for the markers EFG, EFTu, HSP70, RecA, RpoB, and rRNA.] Venter et al., Science 304: 66-74. 2004
  • 32. AutoPhylotyping 3: All in the Family
  • 33. Metagenomic Phylogenetic challenge xxxxxxxxxxxxxxxxxxxxxxx xxxxxx xxxxxxxxxxxxx xxxxxxxxxxxxxx xxxxxxxxxxxxxx A single tree with everything
  • 34. Metagenomic Phylogenetic challenge A single tree with everything
  • 35. PhylOTU - Sharpton et al. PLoS Comp. Bio 2011. Figure 1. PhylOTU Workflow. Computational processes are represented as squares and databases are represented as cylinders [...] workflow of PhylOTU. See Results section for details. doi:10.1371/journal.pcbi.1001061.g001
  • 37. AutoPhylotyping 4: All in the Genome
  • 38. Challenge • Each gene poorly sampled in metagenomes • Can we combine all into a single tree?
  • 39. AMPHORA ALL Kembel et al. The phylogenetic diversity of metagenomes. PLoS One 2011
  • 41. [...] the communities combined (18), is a quantitative measure that accounts for different levels of divergence between sequences. The phylogenetic test (P test), which measures the significance of the association between environment and phylogeny (18), is typically used as a qualitative measure because duplicate sequences are usually removed from the tree. However, the P test may be used in a semiquantitative manner if all clones, even those with identical or near-identical sequences, are included in the tree (13). Here we describe a quantitative version of UniFrac that we call “weighted UniFrac.” We show that weighted UniFrac behaves similarly to the FST test in situations where both are [...]
    FIG. 1. Calculation of the unweighted and the weighted UniFrac measures. Squares and circles represent sequences from two different environments. (a) In unweighted UniFrac, the distance between the circle and square communities is calculated as the fraction of the branch length that has descendants from either the square or the circle environment (black) but not both (gray). (b) In weighted UniFrac, branch lengths are weighted by the relative abundance of sequences in the square and circle communities; square sequences are weighted twice as much as circle sequences because there are twice as many total circle sequences in the data set. The width of branches is proportional to the degree to which each branch is weighted in the calculations, and gray branches have no weight. Branches 1 and 2 have heavy weights since the descendants are biased toward the square and circles, respectively. Branch 3 contributes no value since it has an equal contribution from circle and square sequences after normalization.
    Figure 3. Taxonomic diversity and standardized phylogenetic diversity versus depth in environmental samples along an oceanic depth gradient at the HOT ALO site.
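    The two UniFrac calculations described in the FIG. 1 caption can be sketched on a toy tree. This is a minimal illustration, not the published implementation: the tree, branch lengths, and counts are hypothetical, and the weighted value is the raw (unnormalized) sum of branch length times the absolute difference in relative abundance.

```python
from collections import defaultdict

# Hypothetical toy tree: node -> (parent, length of the branch to the parent).
# The root has no entry of its own.
TREE = {
    "n1": ("root", 1.0),
    "n2": ("root", 1.0),
    "s1": ("n1", 0.5),
    "s2": ("n1", 0.5),
    "c1": ("n2", 0.5),
    "s3": ("n2", 0.5),
}
# leaf -> (sequence count in the "square" environment, count in "circle")
COUNTS = {"s1": (1, 0), "s2": (1, 0), "c1": (0, 2), "s3": (1, 0)}


def branch_counts(tree, counts):
    """Sum the per-environment counts of all leaves below each branch."""
    totals = defaultdict(lambda: [0, 0])
    for leaf, (a, b) in counts.items():
        node = leaf
        while node in tree:  # walk from the leaf up to the root
            totals[node][0] += a
            totals[node][1] += b
            node = tree[node][0]
    return totals


def unweighted_unifrac(tree, counts):
    """Fraction of branch length leading to descendants from only one environment."""
    unique = total = 0.0
    for node, (a, b) in branch_counts(tree, counts).items():
        length = tree[node][1]
        if a or b:
            total += length
            if (a > 0) != (b > 0):
                unique += length
    return unique / total


def weighted_unifrac(tree, counts):
    """Raw weighted UniFrac: branch length times |difference in relative abundance|."""
    a_total = sum(a for a, _ in counts.values())
    b_total = sum(b for _, b in counts.values())
    return sum(
        tree[node][1] * abs(a / a_total - b / b_total)
        for node, (a, b) in branch_counts(tree, counts).items()
    )


u = unweighted_unifrac(TREE, COUNTS)
w = weighted_unifrac(TREE, COUNTS)
```

    Dividing each count by its environment's total is the normalization step the caption refers to: a branch whose descendants make up the same fraction of both communities contributes nothing to the weighted sum.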
  • 42. AutoPhylotyping 5: Novel lineages and decluttering
  • 43. RecA Tree of Life Bacteria Archaea Other lineages? Eukaryotes Figure from Barton, Eisen et al. “Evolution”, CSHL Press. 2007. Based on tree from Pace 1997 Science 276:734-740
  • 44. Lek Clustering 0.75 0.75 0.33 1 1 0.75 0.75
  • 45. Lek Clustering Cutoff of 0.5 0.75 0.75 0.33 1 1 0.75 0.75
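    The slides do not say how the Lek scores are computed or how clusters are formed from them, so the following is only a generic sketch of applying a cutoff to pairwise scores. It assumes that pairs scoring at or above the cutoff are linked and that clusters are the resulting connected groups, which may not match the actual Lek clustering procedure. The function name `cluster_at_cutoff`, the item names, and the pairings are hypothetical; only the score values echo those on the slide.

```python
def cluster_at_cutoff(scores, cutoff):
    """Group items into connected clusters, linking pairs with score >= cutoff.

    scores: dict mapping (item_a, item_b) -> pairwise score.
    Assumption: this is single-linkage thresholding, used here purely
    for illustration.
    """
    parent = {}

    def find(x):
        # Union-find lookup with path halving.
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for (a, b), score in scores.items():
        root_a, root_b = find(a), find(b)
        if score >= cutoff and root_a != root_b:
            parent[root_a] = root_b

    clusters = {}
    for item in list(parent):
        clusters.setdefault(find(item), set()).add(item)
    return sorted(sorted(members) for members in clusters.values())


# Hypothetical pairings for the score values shown on the slide.
scores = {
    ("a", "b"): 0.75,
    ("b", "c"): 0.75,
    ("c", "d"): 0.33,
    ("d", "e"): 1.0,
    ("e", "f"): 0.75,
}
clusters = cluster_at_cutoff(scores, cutoff=0.5)
```

    With a cutoff of 0.5, the single 0.33 link is dropped and the items split into two groups.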
  • 46. GOS 1 RecA GOS 2 RecA GOS 3 GOS 4 GOS 5
  • 49. Sulcia makes amino acids Baumannia makes vitamins and cofactors Wu et al. 2006 PLoS Biology 4: e188.
  • 50. Uses of Phylogeny in Genomics and Metagenomics Example 2: Functional Diversity and Functional Predictions
  • 51. Predicting Function • Key step in genome projects • More accurate predictions help guide experimental and computational analyses • Many diverse approaches • All are improved by “phylogenomic” analyses that integrate evolutionary reconstructions with an understanding of how new functions evolve
  • 52. PHYLOGENETIC PREDICTION OF GENE FUNCTION [Flow chart with two worked examples, EXAMPLE A (genes 1A to 3B) and EXAMPLE B (genes 1 to 6).] METHOD: CHOOSE GENE(S) OF INTEREST; IDENTIFY HOMOLOGS; ALIGN SEQUENCES; CALCULATE GENE TREE; OVERLAY KNOWN FUNCTIONS ONTO TREE; INFER LIKELY FUNCTION OF GENE(S) OF INTEREST. Figure labels: Duplication?; Ambiguous; Species 1, Species 2, Species 3; ACTUAL EVOLUTION (ASSUMED TO BE UNKNOWN). Based on Eisen, 1998 Genome Res 8: 163-167.
  • 53. PHYLOGENETIC PREDICTION OF GENE FUNCTION [Flow chart with two worked examples, EXAMPLE A (genes 1A to 3B) and EXAMPLE B (genes 1 to 6).] METHOD: CHOOSE GENE(S) OF INTEREST; IDENTIFY HOMOLOGS; ALIGN SEQUENCES; CALCULATE GENE TREE; OVERLAY KNOWN FUNCTIONS ONTO TREE; INFER LIKELY FUNCTION OF GENE(S) OF INTEREST. Figure labels: Duplication?; Ambiguous; Species 1, Species 2, Species 3; ACTUAL EVOLUTION (ASSUMED TO BE UNKNOWN). Based on Eisen, 1998 Genome Res 8: 163-167.
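    The last two steps of the method (overlay known functions onto the tree, then infer the likely function of the gene of interest) can be sketched with a deliberately simple heuristic. It assumes the gene tree is already computed and takes the function shared by the annotated genes in the smallest clade that contains the query and at least one annotated gene, reporting "ambiguous" when they disagree. This is an illustrative simplification, not the full approach of the cited paper, which also reasons about duplication events. The tree, gene names, and function labels are hypothetical.

```python
def leaves(node):
    """Return all leaf names under a node (leaves are strings, clades are tuples)."""
    if isinstance(node, str):
        return [node]
    return [leaf for child in node for leaf in leaves(child)]


def predict_function(tree, query, known):
    """Predict the function of `query` from annotated relatives in the gene tree.

    tree: nested tuples of gene names.
    known: dict mapping gene name -> experimentally known function.
    Returns a function label, "ambiguous", or None if nothing is annotated.
    """
    def walk(node):
        # Returns (query found below this node, prediction made so far).
        if isinstance(node, str):
            return node == query, None
        for child in node:
            found, prediction = walk(child)
            if not found:
                continue
            if prediction is not None:
                return True, prediction
            functions = {known[g] for g in leaves(node) if g in known and g != query}
            if len(functions) == 1:
                return True, functions.pop()
            if len(functions) > 1:
                return True, "ambiguous"
            return True, None
        return False, None

    return walk(tree)[1]


# Hypothetical gene tree: an ancient duplication gave paralog groups A and B,
# each present in species 1 to 3.
gene_tree = ((("1A", "2A"), "3A"), (("1B", "2B"), "3B"))
known = {"1A": "function_A", "3A": "function_A", "1B": "function_B", "3B": "function_B"}
prediction = predict_function(gene_tree, "2A", known)
```

    Because the query 2A groups with the A paralogs in the tree, it receives the A function even if its best sequence-similarity hit happened to be a B paralog; that distinction is the motivation for the tree-based approach.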
  • 54. [Figure: phylogenetic tree of haloarchaeal genomes with branch support values; scale bar 0.01.] Taxa: Halorubrum lacusprofundi, Haloquadratum walsbyi, Halogeometricum borinquense, Haloferax mediterranei, Haloferax mucosum, Haloferax volcanii, Haloferax sulfurifontis, Haloferax denitrificans, Halalkalicoccus jeotgali, Halopiger xanaduensis, Natrialba magadii, Haloterrigena turkmenica, Halobacterium sp. NRC 1, Halobacterium salinarum R1, Natronomonas pharaonis, Halorhabdus utahensis, Halomicrobium mukohataei, Haloarcula vallismortis, Haloarcula marismortui, Haloarcula sinaiiensis, Haloarcula californiae. Legend (dataset genes): MA ammonialyase, MA mutase S subunit, MA mutase E subunit, PHA synthatase, cellulase, CRISPRs, CAS. Color ranges: New Genomes.
  • 55. Haloarchaea TBPs [Figure: phylogenetic tree of TBP homologs with bootstrap values; taxon labels not recoverable from the extraction.] Figure 8. Independent expansion of the TATA-binding protein family in two haloarchaeal genera. Phylogeny of TATA-binding protein (TBP) homologs identified by RAST with Bootstrap values shown. Colored branches represent duplication events (with the dark blue branch representing four duplications). Ancestral TBP (found in all genomes) is shown on the purple branch. Successive duplications are shown in darkening shades of green (Halobacterium) or blue (Haloferax). Lynch et al. in preparation
  • 56. Massive Diversity of Proteorhodopsins Venter et al., 2004
  • 57. Characterizing the niche-space distributions of components. Metagenomics DARPA. [Figure: three aligned heat maps with one row per GOS sampling site (Polynesia Archipelagos, Indian Ocean, Galapagos Islands, Caribbean Sea, Eastern Tropical Pacific, Sargasso Sea, North American East Coast). Columns: Component 1 to Component 5 in panel (a); sites in panel (b); Chlorophyll, Water Depth, Salinity, Temperature, Sample Depth, Insolation in panel (c). Legends: General (High, Medium, Low, NA); Water depth (>4000m, 2000-4000m, 900-2000m, 100-200m, 20-100m, 0-20m).] Figure 3: a) Niche-space distributions for our five components (H^T); b) the site-similarity matrix (Ĥ^T Ĥ); c) environmental variables for the sites. The matrices are aligned so that the same row corresponds to the same site in each matrix. Sites are ordered by applying spectral reordering to the similarity matrix (see Materials and Methods). Rows are aligned across the three matrices.
  • 58. Uses of Phylogeny in Genomics and Metagenomics Example 3: Selecting Organisms for Study
  • 59. rRNA Tree of Life Bacteria Archaea Eukaryotes Figure from Barton, Eisen et al. “Evolution”, CSHL Press. 2007. Based on tree from Pace 1997 Science 276:734-740
  • 60. As of 2002 • At least 40 phyla of bacteria [Tree of bacterial phyla:] Proteobacteria, TM6, OS-K, Acidobacteria, Termite Group, OP8, Nitrospira, Bacteroides, Chlorobi, Fibrobacteres, Marine GroupA, WS3, Gemmimonas, Firmicutes, Fusobacteria, Actinobacteria, OP9, Cyanobacteria, Synergistes, Deferribacteres, Chrysiogenetes, NKB19, Verrucomicrobia, Chlamydia, OP3, Planctomycetes, Spirochaetes, Coprothermobacter, OP10, Thermomicrobia, Chloroflexi, TM7, Deinococcus-Thermus, Dictyoglomus, Aquificae, Thermodesulfobacteria, Thermotogae, OP1, OP11. Based on Hugenholtz, 2002
  • 61. As of 2002 • At least 40 phyla of bacteria • Most genomes from three phyla [Tree of bacterial phyla:] Proteobacteria, TM6, OS-K, Acidobacteria, Termite Group, OP8, Nitrospira, Bacteroides, Chlorobi, Fibrobacteres, Marine GroupA, WS3, Gemmimonas, Firmicutes, Fusobacteria, Actinobacteria, OP9, Cyanobacteria, Synergistes, Deferribacteres, Chrysiogenetes, NKB19, Verrucomicrobia, Chlamydia, OP3, Planctomycetes, Spirochaetes, Coprothermobacter, OP10, Thermomicrobia, Chloroflexi, TM7, Deinococcus-Thermus, Dictyoglomus, Aquificae, Thermodesulfobacteria, Thermotogae, OP1, OP11. Based on Hugenholtz, 2002
  • 62. As of 2002 • At least 40 phyla of bacteria • Most genomes from three phyla • Some studies in other phyla [Tree of bacterial phyla:] Proteobacteria, TM6, OS-K, Acidobacteria, Termite Group, OP8, Nitrospira, Bacteroides, Chlorobi, Fibrobacteres, Marine GroupA, WS3, Gemmimonas, Firmicutes, Fusobacteria, Actinobacteria, OP9, Cyanobacteria, Synergistes, Deferribacteres, Chrysiogenetes, NKB19, Verrucomicrobia, Chlamydia, OP3, Planctomycetes, Spirochaetes, Coprothermobacter, OP10, Thermomicrobia, Chloroflexi, TM7, Deinococcus-Thermus, Dictyoglomus, Aquificae, Thermodesulfobacteria, Thermotogae, OP1, OP11. Based on Hugenholtz, 2002
  • 63. As of 2002 • At least 40 phyla of bacteria • Most genomes from three phyla • Some other phyla are only sparsely sampled • Same trend in Eukaryotes [Tree of bacterial phyla:] Proteobacteria, TM6, OS-K, Acidobacteria, Termite Group, OP8, Nitrospira, Bacteroides, Chlorobi, Fibrobacteres, Marine GroupA, WS3, Gemmimonas, Firmicutes, Fusobacteria, Actinobacteria, OP9, Cyanobacteria, Synergistes, Deferribacteres, Chrysiogenetes, NKB19, Verrucomicrobia, Chlamydia, OP3, Planctomycetes, Spirochaetes, Coprothermobacter, OP10, Thermomicrobia, Chloroflexi, TM7, Deinococcus-Thermus, Dictyoglomus, Aquificae, Thermodesulfobacteria, Thermotogae, OP1, OP11. Based on Hugenholtz, 2002
  • 64. As of 2002 • At least 40 phyla of bacteria • Most genomes from three phyla • Some other phyla are only sparsely sampled • Same trend in Viruses [Tree of bacterial phyla:] Proteobacteria, TM6, OS-K, Acidobacteria, Termite Group, OP8, Nitrospira, Bacteroides, Chlorobi, Fibrobacteres, Marine GroupA, WS3, Gemmimonas, Firmicutes, Fusobacteria, Actinobacteria, OP9, Cyanobacteria, Synergistes, Deferribacteres, Chrysiogenetes, NKB19, Verrucomicrobia, Chlamydia, OP3, Planctomycetes, Spirochaetes, Coprothermobacter, OP10, Thermomicrobia, Chloroflexi, TM7, Deinococcus-Thermus, Dictyoglomus, Aquificae, Thermodesulfobacteria, Thermotogae, OP1, OP11. Based on Hugenholtz, 2002
  • 67. GEBA Pilot Project: Components • Project overview (Phil Hugenholtz, Nikos Kyrpides, Jonathan Eisen, Eddy Rubin, Jim Bristow) • Project management (David Bruce, Eileen Dalin, Lynne Goodwin) • Culture collection and DNA prep (DSMZ, Hans-Peter Klenk) • Sequencing and closure (Eileen Dalin, Susan Lucas, Alla Lapidus, Mat Nolan, Alex Copeland, Cliff Han, Feng Chen, Jan-Fang Cheng) • Annotation and data release (Nikos Kyrpides, Victor Markowitz, et al) • Analysis (Dongying Wu, Kostas Mavrommatis, Martin Wu, Victor Kunin, Neil Rawlings, Ian Paulsen, Patrick Chain, Patrik D’Haeseleer, Sean Hooper, Iain Anderson, Amrita Pati, Natalia N. Ivanova, Athanasios Lykidis, Adam Zemla) • Adopt a microbe education project (Cheryl Kerfeld) • Outreach (David Gilbert) • $$$ (DOE, Eddy Rubin, Jim Bristow)
  • 68. GEBA Lesson 1: Phylogeny driven genome selection (and phylogenetics) improves genome annotation • Took 56 GEBA genomes and compared results vs. 56 randomly sampled new genomes • Better definition of protein family sequence “patterns” • Greatly improves “comparative” and “evolutionary” based predictions • Conversion of hypothetical into conserved hypotheticals • Linking distantly related members of protein families • Improved non-homology prediction
  • 69. GEBA Lesson 2 Phylogeny-driven genome selection helps discover new genetic diversity
  • 70. Protein Family Rarefaction Curves • Take data set of multiple complete genomes • Identify all protein families using MCL • Plot # of genomes vs. # of protein families
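    The rarefaction procedure outlined above can be sketched as follows. The sketch assumes the protein families have already been identified (for example by MCL, which is not run here) and that each genome is represented as a set of family identifiers. The curve is averaged over random genome orderings, which is a common way to smooth rarefaction curves but is an assumption rather than something stated on the slide. All genome and family names are hypothetical.

```python
import random


def rarefaction_curve(genome_families, n_permutations=100, seed=0):
    """Mean number of distinct protein families after sampling 1..N genomes.

    genome_families: dict mapping genome name -> set of family identifiers.
    Returns a list whose i-th entry is the mean family count for i + 1 genomes,
    averaged over random genome orderings.
    """
    rng = random.Random(seed)
    genomes = list(genome_families)
    sums = [0] * len(genomes)
    for _ in range(n_permutations):
        rng.shuffle(genomes)
        seen = set()
        for i, genome in enumerate(genomes):
            seen |= genome_families[genome]  # add families not seen before
            sums[i] += len(seen)
    return [total / n_permutations for total in sums]


# Hypothetical toy data: three genomes sharing some families.
toy = {
    "genome_1": {"famA", "famB"},
    "genome_2": {"famB", "famC"},
    "genome_3": {"famA", "famB", "famC", "famD"},
}
curve = rarefaction_curve(toy)
```

    Plotting the returned values against 1..N gives the "# of genomes vs. # of protein families" curve; a curve that keeps rising steeply indicates that newly added genomes are still contributing previously unseen families.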
  • 71. Wu et al. 2009 Nature 462, 1056-1060
  • 72. Wu et al. 2009 Nature 462, 1056-1060
  • 73. Wu et al. 2009 Nature 462, 1056-1060
  • 74. Wu et al. 2009 Nature 462, 1056-1060
  • 75. Wu et al. 2009 Nature 462, 1056-1060
  • 76. Synapomorphies exist Wu et al. 2009 Nature 462, 1056-1060
  • 77. Families/PD not uniform 31 6
  • 78. GEBA Lesson 3 Improves analysis of genome data from uncultured organisms
  • 79. Shotgun Sequencing Allows Use of Other Markers. GEBA Project improves metagenomic analysis. Sargasso Phylotypes [Bar chart: weighted % of clones (y-axis, 0 to 0.500) by major phylogenetic group (x-axis) for the markers EFG, EFTu, HSP70, RecA, RpoB, and rRNA.] Venter et al., Science 304: 66-74. 2004
  • 80. Shotgun Sequencing Allows Use of Other Markers. But not a lot. Sargasso Phylotypes [Bar chart: weighted % of clones (y-axis, 0 to 0.500) by major phylogenetic group (x-axis) for the markers EFG, EFTu, HSP70, RecA, RpoB, and rRNA.] Venter et al., Science 304: 66-74. 2004
  • 81. Phylogeny and Metagenomics Future 1 Need to adapt genomic and metagenomic methods to make better use of data