Phylogenomic Approaches to the
               Study of Microbial Diversity
                                   September 16, 2012
                           Lake Arrowhead Microbial Genomes
                                       #LAMG12

                                    Jonathan A. Eisen
                              University of California, Davis
                                   @phylogenomics



Sunday, September 16, 12
A Bit of History

        • For the real story about the Lake
          Arrowhead Microbial Genomes meetings
          see http://tinyurl.com/LAMG12
        • But the key to LAMG meetings is ...




Quotes

        • Space-time continuum of genes and
          genomes
        • Microbes not only have a lot of sex, they
          have a lot of weird sex
        • Gene sequences are the wormhole that
          allows one to tunnel into the past
        • This is how you do metagenomics on 50
          dollars, and that’s Canadian dollars
        • The human guts are a real milieu of stuff
        • Antibiotics do not kill things, they corrupt
          them
Quotes

        • There comes a point in life when you have
          to bring chemists into the picture
        • The rectal swabs are here in tan color
        • If I have time I will tell you about a dream
        • "Another thing you need to know" (pause)
          "Actually you don't NEED to know any of
          this"
        • I have been influenced by Fisher Price
          throughout my life
        • This is going to be ironic coming from
          someone who studies circumcision
Quotes

        • And we will bring out the unused cheese
          from yesterday
        • A paper came out next year
        • It takes 1000 nanobiologists to make one
          microbiologist
        • In an engineering sense, the vagina is a
          simple plug flow reactor




Phylogenomic Approaches to
                           Studying Microbial Diversity

                                   Example 1:

                                   Phylotyping
                                      and
                              Phylogenetic Diversity

rRNA Phylotyping

        DNA extraction -> PCR (makes lots of copies of the rRNA genes in sample) -> Sequence rRNA genes

        Sequenced rRNA genes:
              rRNA1     5’...ACACACATAGGTGGAGCTAGCGATCGATCGA... 3’
              rRNA2     5’...TACAGTATAGGTGGAGCTAGCGACGATCGA... 3’
              rRNA3     5’...ACGGCAAAATAGGTGGATTCTAGCGATATAGA... 3’
              rRNA4     5’...ACGGCCCGATAGGTGGATTCTAGCGCCATAGA... 3’

        Sequence alignment = Data matrix
              rRNA1     A   C   A   C   A   C
              rRNA2     T   A   C   A   G   T
              rRNA3     C   A   C   T   G   T
              rRNA4     C   A   C   A   G   T
              E. coli   A   G   A   C   A   G
              Humans    T   A   T   A   G   T
              Yeast     T   A   C   A   G   T
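
The "sequence alignment = data matrix" idea can be sketched in a few lines of Python. This is only an illustration: the six-column rows are the toy values from the slide, the names `alignment` and `p_distance` are made up here, and a simple nearest-reference distance stands in for the tree building that real phylotyping would use.

```python
from itertools import combinations

# Toy aligned rows copied from the slide's data matrix (six columns each).
alignment = {
    "rRNA1":   "ACACAC",
    "rRNA2":   "TACAGT",
    "rRNA3":   "CACTGT",
    "rRNA4":   "CACAGT",
    "E. coli": "AGACAG",
    "Humans":  "TATAGT",
    "Yeast":   "TACAGT",
}

def p_distance(a: str, b: str) -> float:
    """Fraction of alignment columns at which two sequences differ."""
    if len(a) != len(b):
        raise ValueError("aligned sequences must have equal length")
    return sum(x != y for x, y in zip(a, b)) / len(a)

# Distance for every unordered pair of rows in the matrix.
distances = {
    (m, n): p_distance(alignment[m], alignment[n])
    for m, n in combinations(alignment, 2)
}

# Closest reference organism for each environmental sequence
# (a distance-based shortcut, not a phylogenetic placement).
references = ("E. coli", "Humans", "Yeast")
for name in ("rRNA1", "rRNA2", "rRNA3", "rRNA4"):
    nearest = min(references, key=lambda r: p_distance(alignment[name], alignment[r]))
    print(name, "->", nearest)
```

Each row of the matrix is one sequence and each column is one aligned position, so any column-wise comparison (here a plain mismatch fraction) can be computed directly from it.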


        [Figure, built up over several slides. Panels: "Cluster" (groups A, B, C); "OTUs" (groups A, B, C; OTU1 to OTU4); a tree with tips OTU1 to OTU4, E. coli, Humans, and Yeast; and "Just Phylogeny" (a tree with E. coli, Humans, and Yeast).]

rRNA Phylotyping
        • OTUs
             • Taxonomic lists
             • Relative abundance of taxa
             • Ecological metrics (alpha and beta diversity)
        • Phylogenetic metrics
             • Binning
             • Identification of novel groups
             • Clades
             • Rates of change
             • LGT (lateral gene transfer)
             • Convergence
             • PD (phylogenetic diversity)
             • Phylogenetic ecology (e.g., UniFrac)
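
One of the phylogenetic metrics listed above, PD, can be sketched as follows. This is a minimal illustration, assuming the common convention that PD is the total branch length connecting a set of tips to the root; the tree, its branch lengths, and the function name `faith_pd` are hypothetical and not taken from the talk.

```python
# Hypothetical toy tree: child -> (parent, branch length). "root" has no entry.
tree = {
    "n1":   ("root", 0.5),
    "OTU1": ("n1", 0.2),
    "OTU2": ("n1", 0.3),
    "n2":   ("root", 0.4),
    "OTU3": ("n2", 0.1),
    "OTU4": ("n2", 0.6),
}

def faith_pd(tree, tips):
    """Sum the lengths of all branches on the paths from the given tips to the root."""
    seen = set()   # nodes whose branch to the parent has already been counted
    total = 0.0
    for tip in tips:
        node = tip
        # Walk toward the root, stopping once we reach an already-counted branch.
        while node in tree and node not in seen:
            seen.add(node)
            parent, length = tree[node]
            total += length
            node = parent
    return total

print(faith_pd(tree, ["OTU1", "OTU2"]))  # two close relatives
print(faith_pd(tree, ["OTU1", "OTU3"]))  # two distant relatives
```

Two samples with the same number of OTUs can differ in PD: the pair drawn from both sides of the toy tree spans more branch length than the pair of close relatives.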
What’s New in Phylotyping




What’s New in Phylotyping I

        • More PCR products

        • Deeper sequencing
             • The rare biosphere
             • Relative abundance estimates

        • More samples (with barcoding)
             • Time series
             • Spatially diverse sampling
             • Fine scale sampling

Beta-Diversity

        [Slide shows overlapping page excerpts from: Jennifer B. H. Martiny, Jonathan A. Eisen, Kevin Penn, Steven D. Allison, and M. Claire Horner-Devine, "Drivers of bacterial β-diversity depend on spatial scale," PNAS, May 10, 2011, vol. 108, no. 19, 7850–7854.]

        Points recoverable from the excerpts:
        • Studies of β-diversity (variation in community composition) yield insights into the maintenance of biodiversity, but are still relatively rare for microorganisms
        • β-Diversity, and therefore distance-decay patterns, could be driven solely by differences in environmental conditions across space ("everything is everywhere, the environment selects")
        • Dispersal limitation can also give rise to β-diversity
        • AOB community composition was characterized by Sanger sequencing of 16S rRNA genes
        • Fig. 1: 13 marshes sampled; six points were sampled along a 100-m transect, and a seventh point was sampled ∼1 km away
        • Across all samples, 4,931 quality Nitrosomonadales sequences grouped into 176 OTUs (operational taxonomic units) using an arbitrary 99% sequence similarity cutoff
        • Pairwise community similarity was calculated from the presence or absence of each OTU using a rarefied Sørensen’s index
        • The Nitrosomonadales display a significant, negative distance-decay curve (slope = −0.08)
        • Fig. 2: distance-decay curves for the Nitrosomonadales communities, with one regression across all spatial scales and separate regressions within marshes, regional (across marshes within regions), and continental (across regions)
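
The excerpt above mentions Sørensen's index and a distance-decay slope. A minimal sketch of both follows, assuming plain presence/absence sets and an ordinary least-squares fit; the three samples and their positions are invented for illustration, and no rarefaction is performed, so this is not the paper's actual analysis.

```python
from itertools import combinations

# Hypothetical samples: position along a transect (km) and the set of OTUs seen.
samples = {
    "s1": (0.0, {"otu1", "otu2", "otu3", "otu4"}),
    "s2": (1.0, {"otu1", "otu2", "otu3", "otu5"}),
    "s3": (10.0, {"otu1", "otu2", "otu6", "otu7"}),
}

def sorensen(a, b):
    """Sorensen similarity from presence/absence: 2 * shared / (|a| + |b|)."""
    if not a and not b:
        raise ValueError("both samples are empty")
    return 2 * len(a & b) / (len(a) + len(b))

def ols_slope(xs, ys):
    """Least-squares slope of ys regressed on xs."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    return sxy / sxx

# One (geographic distance, community similarity) point per pair of samples.
distances, similarities = [], []
for (_, (pos_a, otus_a)), (_, (pos_b, otus_b)) in combinations(samples.items(), 2):
    distances.append(abs(pos_a - pos_b))
    similarities.append(sorensen(otus_a, otus_b))

slope = ols_slope(distances, similarities)
print(f"distance-decay slope: {slope:.4f}")
```

A negative slope means that community similarity falls as samples get farther apart, which is what a distance-decay curve describes.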
Earth Microbiome Project




Microbial Range Maps




Things You Could Do
      • Mississippi River: 2320 miles long
            • 1 site / mile
            • 3 samples / site
            • 6960 samples
                  • rRNA PCR w/ barcodes
                  • metagenomics w/ barcodes
            • MiSeq run:
                  • 30 million sequence reads
                  • 4310 sequences / sample
            • HiSeq 2000 run:
                  • 6 billion sequence reads
                  • 862,068 sequences / sample

Things You Could Do
      • Mississippi River: 12,249,600 feet long
            • 1 site / 500 feet
            • 3 samples / site
            • 73,497 samples
                  • rRNA PCR w/ barcodes
                  • metagenomics w/ barcodes
            • MiSeq run:
                  • 30 million sequence reads
                  • 408 sequences / sample
            • HiSeq 2000 run:
                  • 6 billion sequence reads
                  • 81,635 sequences / sample
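
The arithmetic behind both sampling plans can be checked with a short sketch. The read counts, river length, and samples per site come from the slides; the function name `plan` and the choice to round down to whole samples and whole reads are assumptions made here, chosen because they match the slide figures.

```python
RIVER_MILES = 2320
FEET_PER_MILE = 5280
SAMPLES_PER_SITE = 3
MISEQ_READS = 30_000_000        # reads per MiSeq run, per the slide
HISEQ_READS = 6_000_000_000     # reads per HiSeq 2000 run, per the slide

def plan(river_length, site_spacing):
    """Return (samples, MiSeq reads per sample, HiSeq reads per sample).

    river_length and site_spacing must be in the same unit.
    All divisions round down to whole numbers.
    """
    samples = river_length * SAMPLES_PER_SITE // site_spacing
    return samples, MISEQ_READS // samples, HISEQ_READS // samples

# One site per mile.
print(plan(RIVER_MILES, 1))
# One site per 500 feet.
print(plan(RIVER_MILES * FEET_PER_MILE, 500))
```

Multiplying before dividing keeps everything in integers, so the 500-foot plan rounds 73,497.6 down to 73,497 samples without any floating-point step.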

What’s New in Phylotyping II

        • Metagenomics avoids biases of rRNA
          PCR




                                           shotgun
                                                sequence




Metagenomic Phylotyping

        [Figure. Panels: "Cluster" (groups A, B, C); "OTUs" (groups A, B, C; OTU1 to OTU4); a tree with tips OTU1 to OTU4, E. coli, Humans, and Yeast; and "Just Phylogeny" (a tree with E. coli, Humans, and Yeast).]

Phylogenetic Challenge




                                     ??


Phylogenetic Challenge




                               Multiple approaches


Method 1: Each is an island




         • Build alignment, models, trees for full length seqs
         • Analyze fragmented reads one at a time
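
The "one read at a time" idea can be sketched as below. This is a deliberately simplified stand-in: the slide describes alignments, models, and trees built from full-length sequences, whereas this sketch only scores each fragment against each reference by ungapped identity. The reference sequences, read, and function name `best_match` are all hypothetical.

```python
# Hypothetical full-length reference sequences (stand-ins for a curated reference set).
references = {
    "ref_A": "ACGTACGTTAGCCGATAGGCTA",
    "ref_B": "ACGTTCGTTAGGCGATTGGCAA",
    "ref_C": "TTGACCGTAAGCCGTTAGGCTT",
}

def best_match(read, references):
    """Return (reference name, identity) for the best ungapped placement of one read.

    The read is slid along every reference; identity is the fraction of
    matching positions in the best window. Ties keep the first hit found.
    """
    best_name, best_score = None, -1.0
    for name, ref in references.items():
        for start in range(len(ref) - len(read) + 1):
            window = ref[start:start + len(read)]
            score = sum(a == b for a, b in zip(read, window)) / len(read)
            if score > best_score:
                best_name, best_score = name, score
    return best_name, best_score

# Each fragment is handled on its own; no read influences another read's result.
reads = ["TAGCCGATAGG", "CGTTAGGCGAT"]
for read in reads:
    print(read, "->", best_match(read, references))
```

Because the reference set is fixed in advance and every read is scored independently, the loop over reads could be run in any order or in parallel without changing the results.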


Sunday, September 16, 12
STAP                                             ss-rRNA Taxonomy Pip
                                                       Figure 1. A flow chart of the STAP pipeline.
                                                       doi:10.1371/journal.pone.0002566.g001

                                                       STAP database, and the query sequence is aligned to them using               a
                                                       the CLUSTALW profile alignment algorithm [40] as described                   w
                                                       above for domain assignment. By adapting the profile alignment               s
Each sequence analyzed separately

Figure 2. Domain assignment. In Step 1, STAP assigns a domain to each query sequence based on its position in a maximum likelihood tree of representative ss-rRNA sequences. Because the tree illustrated here is not rooted, domain assignment would not be accurate and reliable (sequence similarity based methods cannot make an accurate assignment in this case either). However, the figure illustrates an important role of the tree-based domain assignment step, namely automatic identification of deep-branching environmental ss-rRNAs.
doi:10.1371/journal.pone.0002566.g002

Wu et al. 2008 PLoS One

Figure 1. A flow chart of the STAP pipeline.
Sunday, September 16, 12
AMPHORA




    Wu and Eisen Genome Biology 2008 9:R151
    doi:10.1186/gb-2008-9-10-r151

    Guide tree
Sunday, September 16, 12
Phylotyping w/ Proteins




   Wu and Eisen Genome Biology 2008 9:R151   doi:10.1186/gb-2008-9-10-r151
Sunday, September 16, 12
Method 2: Most in the Family




Sunday, September 16, 12
Phylogenetic Challenge

                                   xxxxxxxxxxxxxxxxxxxxxxx

                                  xxxxxx           xxxxxxxxxxxxx

                                                xxxxxxxxxxxxxx




                                  xxxxxxxxxxxxxx




                                           ??


Sunday, September 16, 12
Method 2: Most in family

                                      xxxxxxxxxxxxxxxxxxxxxxx

                                     xxxxxx           xxxxxxxxxxxxx

                                                  xxxxxxxxxxxxxx




                                     xxxxxxxxxxxxxx




                           One tree for those w/ overlap
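
A minimal sketch of the "one tree for those w/ overlap" idea, assuming each read has already been aligned to a reference alignment and is described by hypothetical (start, end) column coordinates. The function name, the region, and the minimum overlap are illustrative assumptions, not part of any tool named in this talk.

```python
def reads_covering(reads, region, min_overlap):
    """Select reads that share enough alignment columns with a chosen region.

    reads: dict mapping read name -> (start, end) alignment columns, end exclusive.
    region: (start, end) columns of the region the tree will be built from.
    min_overlap: minimum number of shared columns required to keep a read.
    """
    region_start, region_end = region
    keep = []
    for name, (start, end) in reads.items():
        # Number of columns shared by the read and the region (negative if disjoint).
        overlap = min(end, region_end) - max(start, region_start)
        if overlap >= min_overlap:
            keep.append(name)
    return keep


# Hypothetical fragments, loosely modeled on the partial reads drawn above.
fragments = {
    "full_length": (0, 1000),
    "read_1": (0, 300),
    "read_2": (250, 700),
    "read_3": (600, 1000),
}
in_tree = reads_covering(fragments, region=(200, 400), min_overlap=100)
```

Reads that fail the cutoff (here `read_3`) would be left out of this tree rather than forced into an alignment region they do not cover.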


Sunday, September 16, 12
rRNA in Sargasso Metagenome




     Venter et al., Science
     304: 66. 2004

Sunday, September 16, 12
RecA Phylotyping in Sargasso Data




     Venter et al., Science
     304: 66. 2004

Sunday, September 16, 12
Sargasso Phylotyping

[Figure: bar chart "Sargasso Phylotypes". Y axis: Weighted % of Clones (0 to 0.500). X axis: Major Phylogenetic Group (Alphaproteobacteria, Betaproteobacteria, Gammaproteobacteria, Epsilonproteobacteria, Deltaproteobacteria, Cyanobacteria, Firmicutes, Actinobacteria, Chlorobi, CFB, Chloroflexi, Spirochaetes, Fusobacteria, Deinococcus-Thermus, Euryarchaeota, Crenarchaeota). Series: EFG, EFTu, HSP70, RecA, RpoB, rRNA.]

     Venter et al., Science
     304: 66. 2004

Sunday, September 16, 12
STAP, QIIME, Mothur           ss-rRNA Taxonomy Pip




                                                              Combine all into
                                                              one alignment



               Figure 1. A flow chart of the STAP pipeline.
               doi:10.1371/journal.pone.0002566.g001
Sunday, September 16, 12
WATERs

Workflow steps shown on the slide: Align, Check chimeras, Cluster, Build Tree, Assign Taxonomy, Tree w/ Taxonomy, Diversity statistics & graphs, Cytoscape network, OTU table, Unifrac files.

Excerpts from the paper page shown on the slide (cropped at the slide edges):

"... all of these bioinformatics steps together in one package. To this end, we have built an automated, user-friendly, workflow-based system called WATERS: a Workflow for the Alignment, Taxonomy, and Ecology of Ribosomal Sequences (Fig. 1). In addition to being automated and simple to use, because WATERS is executed in the Kepler scientific workflow system (Fig. 2) it also has the advantage that it keeps track of the data lineage and provenance of data products [23,24]."

"Automation. The primary motivation in building WATERS was to minimize the technical, bioinformatics challenges that arise when performing DNA sequence clustering, phylo- ..."

"... therefore, to invest a large amount of time and effort to get to that list of microbes. But now that current efforts are significantly more advanced and often require comparison of dozens of factors and variables with datasets of thousands of sequences, it is not practically feasible to process these large collections "by hand", and hugely inefficient if instead automated methods can be successfully employed."

"Broadening the user base. A second motivation and perspective is that by minimizing the technical difficulty of 16S rDNA analysis through the use of WATERS, we aim to make the analysis of these datasets more widely available and allow individuals with ..."

"... chimeric sequences generated during PCR identifying closely related sets of sequences (also known as operational taxonomic units or OTUs), removing redundant sequences above a certain percent identity cutoff, assigning putative taxonomic identifiers to each sequence or representative of a group, inferring a phylogenetic tree of the sequences, and comparing the phylogenetic structure ..."

Figure 1 Overview of WATERS. Schema of WATERS where white boxes indicate "behind the scenes" analyses that are performed in WATERS. Quality control files are generated for white boxes, but not otherwise routinely analyzed. Black arrows indicate that metadata (e.g., sample type) has been overlaid on the data for downstream interpretation. Colored boxes indicate different types of results files that are generated for the user for further use and biological interpretation. Colors indicate different types of WATERS actors from Fig. 2 which were used: green, Diversity metrics, WriteGraphCoordinates, Diversity graphs; blue, Taxonomy, BuildTree, Rename Trees, Save Trees; CreateUnifrac; yellow, CreateOtuTable, CreateCytoscape, CreateOTUFile; white, remaining unnamed actors.

Figure 2 Screenshot of WATERS in Kepler software. Key features: the library of actors un-collapsed and displayed on the left-hand side, the input and output paths where the user declares the location of their input files and desired location for the results files. Each green box is an individual Kepler actor that performs a single action on the data stream. The connectors (black arrows) direct and hook up the actors in a defined sequence. Double-clicking on any actor or connector allows it to be manipulated and re-arranged.

Hartman et al 2010. W.A.T.E.R.S.: a Workflow for the Alignment, Taxonomy, and Ecology of Ribosomal Sequences. BMC Bioinformatics 2010, 11:317 doi:10.1186/1471-2105-11-317
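
The clustering step described in the excerpt (grouping sequences into OTUs at a percent identity cutoff) can be sketched roughly as follows. This is a simple greedy illustration on already-aligned sequences, not the method WATERS itself runs; the function names, the toy sequences, and the cutoffs are assumptions made for the example.

```python
def percent_identity(a, b):
    """Fraction of identical positions between two aligned, equal-length
    sequences, ignoring columns where either sequence has a gap ('-')."""
    pairs = [(x, y) for x, y in zip(a, b) if x != "-" and y != "-"]
    if not pairs:
        return 0.0
    return sum(x == y for x, y in pairs) / len(pairs)


def greedy_otus(seqs, cutoff=0.97):
    """Greedy OTU picking.

    seqs: dict mapping sequence name -> aligned sequence.
    Each sequence joins the first OTU whose representative it matches at
    >= cutoff identity; otherwise it founds a new OTU and becomes its
    representative. Returns a list of (representative, members) pairs.
    """
    otus = []
    for name, seq in seqs.items():
        for representative, members in otus:
            if percent_identity(seq, seqs[representative]) >= cutoff:
                members.append(name)
                break
        else:
            # No existing representative was close enough.
            otus.append((name, [name]))
    return otus


# Toy aligned sequences (illustrative only).
toy = {
    "r1": "ACGTACGTAC",
    "r2": "ACGTACGTAC",
    "r3": "ACGTACGTAA",
    "r4": "TTTTTTTTTT",
}
loose = greedy_otus(toy, cutoff=0.85)
strict = greedy_otus(toy, cutoff=0.97)
```

The result depends on input order and on the cutoff, which is one reason OTU counts from different pipelines are not directly comparable.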
  Sunday, September 16, 12
One Major Issue with rRNA

        • Copy number varies greatly between taxa
        • Can lead to significant errors in estimates
          of relative abundance from numbers of
          reads
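
A minimal sketch of why copy number matters, assuming hypothetical read counts and rRNA copy numbers for two made-up taxa (the names and numbers are illustrative and do not come from the talk). It simply divides reads by copy number before normalizing; it is not the correction procedure of Kembel et al.

```python
def relative_abundance(read_counts, copy_numbers=None):
    """Return relative abundances from rRNA read counts.

    If copy_numbers is given, each count is first divided by that taxon's
    rRNA copy number, so the values approximate cells rather than gene copies.
    """
    if copy_numbers is None:
        adjusted = dict(read_counts)
    else:
        adjusted = {taxon: n / copy_numbers[taxon] for taxon, n in read_counts.items()}
    total = sum(adjusted.values())
    return {taxon: value / total for taxon, value in adjusted.items()}


# Hypothetical values: taxon_A carries 7 rRNA copies per genome, taxon_B carries 1.
reads = {"taxon_A": 700, "taxon_B": 100}
copies = {"taxon_A": 7, "taxon_B": 1}

naive = relative_abundance(reads)              # based on reads alone
corrected = relative_abundance(reads, copies)  # reads divided by copy number
```

In this toy case the uncorrected estimate makes taxon_A look seven times more abundant than taxon_B, while the corrected estimate puts the two at equal abundance.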




Sunday, September 16, 12
Kembel Correction




                                      Kembel, Wu, Eisen, Green. In press.
                                      PLoS Computational Biology.
                                      Incorporating 16S gene copy number
                                      information improves estimates of
                                      microbial diversity and abundance

Sunday, September 16, 12
Method 3: All in the family




Sunday, September 16, 12
Phylogenetic Challenge




                                     ??


Sunday, September 16, 12
Phylogenetic Challenge




                           A single tree with everything?


Sunday, September 16, 12
rRNA analysis
[Figure: diagram with row labels "Cluster" and "OTUs" and a panel labeled "Just Phylogeny". Other labels shown: A, B, C; OTU1, OTU2, OTU3, OTU4; E. coli, Yeast, Humans.]



Sunday, September 16, 12
PhylOTU

Figure 1. PhylOTU Workflow. Computational processes are represented as squares and databases are represented as cylinders in workflow of PhylOTU. See Results section for details.
doi:10.1371/journal.pcbi.1001061.g001

Excerpts from the paper page shown on the slide (cropped at the slide edges):

"... alignment used to build the profile, resulting in a multiple sequence alignment of full-length reference sequences and metagenomic reads. The final step of the alignment process is a ..."

"... PD versus PID clustering, 2) to explore overlap between ... clusters and recognized taxonomic designations, and ... the accuracy of PhylOTU clusters from shotgun re..."

Sharpton TJ, Riesenfeld SJ, Kembel SW, Ladau J, O'Dwyer JP, Green JL, Eisen JA, Pollard KS. (2011) PhylOTU: A High-Throughput Procedure Quantifies Microbial Community Diversity and Resolves Novel Taxa from Metagenomic Data. PLoS Comput Biol 7(1): e1001061. doi:10.1371/journal.pcbi.1001061

Sunday, September 16, 12
RecA, RpoB in GOS

   [Figure labels: GOS 1, GOS 2, GOS 3, GOS 4, GOS 5]

   Wu D, Wu M, Halpern A, Rusch DB, Yooseph S, et al. (2011) Stalking the Fourth Domain in Metagenomic Data: Searching for, Discovering, and Interpreting Novel, Deep Branches in Marker Gene Phylogenetic Trees. PLoS ONE 6(3): e18011. doi:10.1371/journal.pone.0018011



Sunday, September 16, 12
Phylosift/ pplacer




     Aaron Darling, Guillaume Jospin, Holly Bik, Erik Matsen, Eric
     Lowe, and others
Sunday, September 16, 12
Phylosift

        • Probabilistic Phylogenetic Ecology
        • https://github.com/gjospin/PhyloSift
        • http://phylosift.wordpress.com




Sunday, September 16, 12
Method 4: All in the genome




Sunday, September 16, 12
Multiple Genes?




                           A single tree with everything?




Sunday, September 16, 12
Kembel Combiner




     Kembel SW, Eisen JA, Pollard KS, Green JL (2011) The Phylogenetic Diversity of Metagenomes. PLoS
     ONE 6(8): e23214. doi:10.1371/journal.pone.0023214

Sunday, September 16, 12
Kembel Combiner

Excerpt from the paper page shown on the slide (cropped at the slide edges):

"... typically used as a qualitative measure because duplicate sequences are usually removed from the tree. However, the test may be used in a semiquantitative manner if all clones, even those with identical or near-identical sequences, are included in the tree (13).

Here we describe a quantitative version of UniFrac that we call “weighted UniFrac.” We show that weighted UniFrac behaves similarly to the FST test in situations where both a ..."

FIG. 1. Calculation of the unweighted and the weighted UniFrac measures. Squares and circles represent sequences from two different environments. (a) In unweighted UniFrac, the distance between the circle and square communities is calculated as the fraction of the branch length that has descendants from either the square or the circle environment (black) but not both (gray). (b) In weighted UniFrac, branch lengths are weighted by the relative abundance of sequences in the square and circle communities; square sequences are weighted twice as much as circle sequences because there are twice as many total circle sequences in the data set. The width of branches is proportional to the degree to which each branch is weighted in the calculations, and gray branches have no weight. Branches 1 and 2 have heavy weights since the descendants are biased toward the square and circles, respectively. Branch 3 contributes no value since it has an equal contribution from circle and square sequences after normalization.
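
The two measures described in the figure caption can be sketched as below. This is a small illustration, assuming the tree is supplied as a hypothetical list of (branch length, descendant leaves) pairs and that the weighted value is the raw, unnormalized form; the toy tree and counts are made up for the example.

```python
def unifrac(branches, counts_a, counts_b):
    """Compute (unweighted, weighted) UniFrac between two communities.

    branches: list of (length, set_of_leaf_names), one entry per tree branch,
              where the set holds every leaf descending from that branch.
    counts_a, counts_b: dict mapping leaf name -> number of sequences observed
              in community A or B. Each community must contain at least one
              sequence. The weighted value is not normalized.
    """
    total_a = sum(counts_a.values())
    total_b = sum(counts_b.values())
    unique_length = 0.0    # branch length leading to only one community
    observed_length = 0.0  # branch length leading to either community
    weighted = 0.0
    for length, leaves in branches:
        a = sum(counts_a.get(leaf, 0) for leaf in leaves)
        b = sum(counts_b.get(leaf, 0) for leaf in leaves)
        if a or b:
            observed_length += length
            if not (a and b):
                unique_length += length
        # Weight each branch by the difference in relative abundance below it.
        weighted += length * abs(a / total_a - b / total_b)
    return unique_length / observed_length, weighted


# Toy tree ((x:1,y:1):0.5,(z:1,w:1):0.5) written out branch by branch.
toy_branches = [
    (1.0, {"x"}), (1.0, {"y"}), (0.5, {"x", "y"}),
    (1.0, {"z"}), (1.0, {"w"}), (0.5, {"z", "w"}),
]
community_a = {"x": 2, "y": 1}
community_b = {"y": 1, "z": 2}
unweighted_value, weighted_value = unifrac(toy_branches, community_a, community_b)
```

Branches with no observed sequences (the one leading to `w` here) are ignored by the unweighted measure and contribute zero to the weighted one.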




     Kembel SW, Eisen JA, Pollard KS, Green JL (2011) The Phylogenetic Diversity of Metagenomes. PLoS
     ONE 6(8): e23214. doi:10.1371/journal.pone.0023214

Sunday, September 16, 12
Uses of Phylogeny
                    in Genomics and Metagenomics

                                 Example 2:

                           Functional Diversity and
                            Functional Predictions




Sunday, September 16, 12
Phylogenomics
PHYLOGENETIC PREDICTION OF GENE FUNCTION

[Figure: METHOD column flanked by EXAMPLE A (genes 1A, 1B, 2A, 2B, 3A, 3B from Species 1, Species 2, Species 3; branch labeled "Duplication?") and EXAMPLE B (genes 1 to 6; one prediction labeled "Ambiguous").]

        • CHOOSE GENE(S) OF INTEREST
        • IDENTIFY HOMOLOGS
        • ALIGN SEQUENCES
        • CALCULATE GENE TREE
        • OVERLAY KNOWN FUNCTIONS ONTO TREE
        • INFER LIKELY FUNCTION OF GENE(S) OF INTEREST
        • ACTUAL EVOLUTION (ASSUMED TO BE UNKNOWN): the branch marked "Duplication?" in the inferred trees is shown as "Duplication"

Based on Eisen, 1998 Genome Res 8: 163-167.
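
The last two steps of the method (overlay known functions, then infer the function of the gene of interest) can be sketched roughly as follows. This assumes a rooted gene tree written as nested tuples and a simple rule chosen for illustration: take the smallest clade around the query that contains any annotated gene, and report "ambiguous" if those annotations disagree. The rule, the function names, and the example annotations are assumptions, not the procedure from the cited paper.

```python
def leaves(tree):
    """Return the leaf names under a nested-tuple tree (a leaf is a string)."""
    if isinstance(tree, str):
        return [tree]
    return [name for child in tree for name in leaves(child)]


def clades_containing(tree, query):
    """Yield the nested clades that contain the query, from the root down."""
    node = tree
    while not isinstance(node, str):
        yield node
        node = next(child for child in node if query in leaves(child))


def predict_function(tree, query, known):
    """Predict the query's function from its closest annotated relatives.

    known: dict mapping gene name -> experimentally known function.
    Returns a function name, "ambiguous", or "unknown".
    """
    if query not in leaves(tree):
        raise ValueError(f"{query!r} is not in the tree")
    # Walk outward from the smallest clade containing the query.
    for clade in reversed(list(clades_containing(tree, query))):
        functions = {known[g] for g in leaves(clade) if g in known and g != query}
        if functions:
            return functions.pop() if len(functions) == 1 else "ambiguous"
    return "unknown"


# Modeled on EXAMPLE A: a duplication at the root gives an "A" and a "B" subfamily.
tree_a = ((("1A", "2A"), "3A"), (("1B", "2B"), "3B"))
known_a = {"1A": "function A", "3A": "function A",
           "1B": "function B", "3B": "function B"}
prediction_2a = predict_function(tree_a, "2A", known_a)
prediction_2b = predict_function(tree_a, "2B", known_a)
```

Because the prediction follows the tree rather than the top similarity hit, 2A and 2B receive different functions even though all six genes are homologs.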

Sunday, September 16, 12
Diversity of Proteorhodopsins




                                                   Venter et al., 2004.
                                                   Science 304: 66.
Sunday, September 16, 12
Improving Functional Predictions

        • Same methods discussed for phylotyping
          improve phylogenomic functional
          prediction for protein families
        • Increase in sequence diversity helps too




Sunday, September 16, 12
Phylosift/ pplacer




     Aaron Darling, Guillaume Jospin, Holly Bik, Erik Matsen, Eric
     Lowe, and others
Sunday, September 16, 12
Carboxydothermus sporulates




                             Wu et al. 2005 PLoS Genetics 1: e65.
Sunday, September 16, 12
Wu et al. 2005 PLoS Genetics 1: e65.
Sunday, September 16, 12
NMF in Metagenomes

      Functional biogeography of ocean microbes revealed through non-negative matrix
      factorization. Jiang et al. In press, PLoS One. Comes out 9/18.
      w/ Weitz, Dushoff, Langille, Neches, Levin, etc.

    [Figure: Characterizing the niche-space distributions of components. Rows are the
    ocean sampling sites (GS sample IDs). Panel (a) columns: Component 1 to Component 5.
    Panel (c) columns: Chlorophyll, Salinity, Temperature, Water Depth, Sample Depth,
    Insolation. Legends: General (High, Medium, Low, NA) and Water depth (0-20m to >4000m).]

    Figure 3: a) Niche-space distributions for our five components ($H^T$); b) the site-
    similarity matrix ($\hat{H}^T \hat{H}$); c) environmental variables for the sites. The matrices are
    aligned so that the same row corresponds to the same site in each matrix. Sites are
    ordered by applying spectral reordering to the similarity matrix (see Materials and
    Methods). Rows are aligned across the three matrices.
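
The factorization named in the caption can be sketched on toy data. This is a minimal sketch under stated assumptions, not the method of Jiang et al.: it uses Lee-Seung multiplicative updates, it reads the hat on H as "each site's column scaled to unit length", and the names `nmf` and `site_similarity` and the random input matrix are hypothetical.

```python
import numpy as np

def nmf(V, k, n_iter=500, seed=0, eps=1e-9):
    """Factor a non-negative matrix V (features x sites) as W @ H.

    Uses Lee-Seung multiplicative updates for the Frobenius-norm objective.
    W is (features x k), H is (k x sites); both stay non-negative.
    """
    rng = np.random.default_rng(seed)
    n, m = V.shape
    W = rng.random((n, k)) + eps
    H = rng.random((k, m)) + eps
    for _ in range(n_iter):
        H *= (W.T @ V) / (W.T @ W @ H + eps)
        W *= (V @ H.T) / (W @ H @ H.T + eps)
    return W, H

def site_similarity(H):
    """Scale each site's column of H to unit length (assumed meaning of the
    hat), then return the sites x sites matrix H_hat.T @ H_hat."""
    norms = np.linalg.norm(H, axis=0, keepdims=True)
    H_hat = H / np.maximum(norms, 1e-12)
    return H_hat.T @ H_hat

# Toy input: 30 hypothetical protein families x 8 hypothetical sites,
# built from 3 hidden components so a rank-3 fit should be close.
rng = np.random.default_rng(1)
V = rng.random((30, 3)) @ rng.random((3, 8))

W, H = nmf(V, k=3)
S = site_similarity(H)
rel_err = np.linalg.norm(V - W @ H) / np.linalg.norm(V)
print("relative reconstruction error:", rel_err)
print("site-similarity matrix shape:", S.shape)
```

Here `H.T` plays the role of panel (a), one row per site and one column per component, and `S` plays the role of panel (b).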
Uses of Phylogeny
                    in Genomics and Metagenomics

                                Example 3:

                       Selecting Organisms for Study




GEBA




                           http://www.jgi.doe.gov/programs/GEBA/pilot.html
GEBA


                           THAT IS SO LAMG10
                           http://www.jgi.doe.gov/programs/GEBA/pilot.html
How To Keep Up?

        • IMG
        • Genomes Online
        • MicrobeDB
             • http://github.com/mlangill/microbedb/
             • Langille MG, Laird MR, Hsiao WW, Chiu TA, Eisen
               JA, Brinkman FS. MicrobeDB: a locally
               maintainable database of microbial genomic
               sequences. Bioinformatics. 2012 28(14):1947-8.




Improving Phylotyping




More Markers
                              Phylogenetic group      Genome   Gene     Marker
                                                      number   number   candidates
                              Archaea                 62       145415   106
                              Actinobacteria          63       267783   136
                              Alphaproteobacteria     94       347287   121
                              Betaproteobacteria      56       266362   311
                              Gammaproteobacteria     126      483632   118
                              Deltaproteobacteria     25       102115   206
                              Epsilonproteobacteria   18       33416    455
                              Bacteroidetes           25       71531    286
                              Chlamydiae              13       13823    560
                              Chloroflexi             10       33577    323
                              Cyanobacteria           36       124080   590
                              Firmicutes              106      312309   87
                              Spirochaetes            18       38832    176
                              Thermi                  5        14160    974
                              Thermotogae             9        17037    684




Better Reference Tree




    Morgan et al., submitted
Improving Functional Predictions




Sifting Families
    [Figure 1, Sharpton et al. submitted. Workflow diagram with panels A, B, C.
    Box labels: Representative Genomes; Extract Protein Annotation; All v. All BLAST;
    Homology Clustering (MCL); SFams; Align & Build HMMs; HMMs; New Genomes;
    Extract Protein Annotation; Screen for Homologs.]
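
The "Homology Clustering (MCL)" box can be illustrated with a toy version of Markov clustering. This is a minimal sketch, not the MCL program and not the SFams pipeline: the six-protein graph, the function name `mcl`, and the parameter defaults are assumptions for illustration. In a real run the edge weights would come from the all-vs-all BLAST step.

```python
import numpy as np

def mcl(adj, inflation=2.0, n_iter=100, tol=1e-6):
    """Toy Markov clustering on a symmetric similarity matrix.

    Repeats expansion (matrix squaring) and inflation (element-wise power
    followed by column normalization) until the matrix stops changing.
    Returns clusters as sorted tuples of node indices.
    """
    M = np.asarray(adj, dtype=float) + np.eye(len(adj))  # add self-loops
    M /= M.sum(axis=0, keepdims=True)                    # column-stochastic
    for _ in range(n_iter):
        prev = M
        M = M @ M                                        # expansion
        M = M ** inflation                               # inflation
        M /= M.sum(axis=0, keepdims=True)
        if np.abs(M - prev).max() < tol:
            break
    clusters = set()
    for row in M:
        members = tuple(np.flatnonzero(row > 1e-6).tolist())
        if members:
            clusters.add(members)
    return sorted(clusters)

# Hypothetical similarity graph: proteins 0-2 hit each other, as do 3-5,
# with no hits between the two groups.
adj = np.zeros((6, 6))
for group in ([0, 1, 2], [3, 4, 5]):
    for i in group:
        for j in group:
            if i != j:
                adj[i, j] = 1.0

families = mcl(adj)
print(families)
```

Each returned tuple is one candidate family. Raising `inflation` generally gives finer clusters, which is the usual knob for cluster granularity in Markov clustering.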


Zorro - Automated Masking
    [Plot: Distance to True Tree (y-axis, 0.0 to 9.0) versus Sequence Length
    (x-axis: 200, 400, 800, 1600, 3200). Series: no masking, zorro, gblocks.]




          Wu M, Chatterji S, Eisen JA (2012) Accounting For Alignment Uncertainty
          in Phylogenomics. PLoS ONE 7(1): e30288. doi:10.1371/journal.pone.0030288
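
Masking, in the sense of this slide, means dropping low-confidence alignment columns before building a tree. A minimal sketch, assuming per-column confidence scores are already available from some tool: the scores, the cutoff of 5.0, the toy sequences, and the function name `mask_alignment` are all hypothetical and are not taken from Zorro or Gblocks.

```python
def mask_alignment(seqs, scores, cutoff):
    """Keep only the alignment columns whose score is >= cutoff.

    seqs   : dict mapping taxon name -> aligned sequence (equal lengths)
    scores : one confidence score per alignment column
    """
    for name, seq in seqs.items():
        if len(seq) != len(scores):
            raise ValueError(
                f"{name}: length {len(seq)} does not match {len(scores)} scores"
            )
    keep = [i for i, s in enumerate(scores) if s >= cutoff]
    return {name: "".join(seq[i] for i in keep) for name, seq in seqs.items()}

# Hypothetical three-taxon alignment with made-up column scores.
alignment = {
    "taxonA": "MK-LVA",
    "taxonB": "MKQLIA",
    "taxonC": "MR--VA",
}
column_scores = [9.5, 8.0, 1.2, 3.0, 7.5, 9.9]

masked = mask_alignment(alignment, column_scores, cutoff=5.0)
for name, seq in masked.items():
    print(name, seq)
```

The two gappy middle columns score below the cutoff and are removed, leaving four columns per taxon.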

Phylogenetic Contrasts




GEBA Lesson



                           We have still only scratched the
                            surface of microbial diversity




PD: All




                               From Wu et al. 2009 Nature 462, 1056-1060
Families/PD not uniform


GEBA uncultured
      Number of SAGs from Candidate Phyla

                                              OD1    OP1    OP3    SAR406
      Site A: Hydrothermal vent                 4      1      -      -
      Site B: Gold Mine                         6     13      2      -
      Site C: Tropical gyres (Mesopelagic)      -      -      -      2
      Site D: Tropical gyres (Photic zone)      1      -      -      -

 Sample collections at 4 additional sites are underway.

                                                                              Phil Hugenholtz

GEBA Lesson



                           Need Experiments from Across
                                the Tree of Life too




Conclusion




MICROBES




Acknowledgements

             • $$$
                   •   DOE
                   •   NSF
                   •   GBMF
                   •   Sloan
                   •   DARPA
                   •   DSMZ
                   •   DHS
             • People, places
                   • DOE JGI: Eddy Rubin, Phil Hugenholtz, Nikos Kyrpides
                   • UC Davis: Aaron Darling, Dongying Wu, Holly Bik, Russell
                     Neches, Jenna Morgan-Lang
                   • Other: Jessica Green, Katie Pollard, Martin Wu, Tom Slezak,
                     Jack Gilbert, Steven Kembel, J. Craig Venter, Naomi Ward,
                     Hans-Peter Klenk



 
EVE198 Fall2020 "Covid Mass Testing" Class 8 Vaccines
EVE198 Fall2020 "Covid Mass Testing" Class 8 VaccinesEVE198 Fall2020 "Covid Mass Testing" Class 8 Vaccines
EVE198 Fall2020 "Covid Mass Testing" Class 8 Vaccines
 
EVE198 Fall2020 "Covid Mass Testing" Class 2: Viruses, COIVD and Testing
EVE198 Fall2020 "Covid Mass Testing" Class 2: Viruses, COIVD and TestingEVE198 Fall2020 "Covid Mass Testing" Class 2: Viruses, COIVD and Testing
EVE198 Fall2020 "Covid Mass Testing" Class 2: Viruses, COIVD and Testing
 
EVE198 Fall2020 "Covid Mass Testing" Class 1 Introduction
EVE198 Fall2020 "Covid Mass Testing" Class 1 IntroductionEVE198 Fall2020 "Covid Mass Testing" Class 1 Introduction
EVE198 Fall2020 "Covid Mass Testing" Class 1 Introduction
 


Jonathan Eisen @phylogenomics talk for #LAMG12

  • 1. Phylogenomic Approaches to the Study of Microbial Diversity September 16, 2012 Lake Arrowhead Microbial Genomes #LAMG12 Jonathan A. Eisen University of California, Davis @phylogenomics Sunday, September 16, 12
  • 2. A Bit of History • For the real story about the Lake Arrowhead Microbial Genomes meetings see http://tinyurl.com/LAMG12 • But the key to LAMG meetings are ...
  • 4-9. Quotes • Space-time continuum of genes and genomes • Microbes not only have a lot of sex, they have a lot of weird sex • Gene sequences are the wormhole that allows one to tunnel into the past • This is how you do metagenomics on 50 dollars, and that’s Canadian dollars • The human guts are a real milieu of stuff • Antibiotics do not kill things, they corrupt them
  • 10-15. Quotes • There comes a point in life when you have to bring chemists into the picture • The rectal swabs are here in tan color • If I have time I will tell you about a dream • "Another thing you need to know" (pause) "Actually you don't NEED to know any of this" • I have been influenced by Fisher Price throughout my life • This is going to be ironic coming from someone who studies circumcision
  • 16-19. Quotes • And we will bring out the unused cheese from yesterday • A paper came out next year • It takes 1000 nanobiologists to make one microbiologist • In an engineering sense, the vagina is a simple plug flow reactor
  • 20. Phylogenomic Approaches to Studying Microbial Diversity • Example 1: Phylotyping and Phylogenetic Diversity
  • 21. rRNA Phylotyping • DNA extraction • PCR: makes lots of copies of the rRNA genes in sample • Sequence rRNA genes • Sequence alignment = Data matrix
      Sequences:
        rRNA1  5’...ACACACATAGGTGGAGCTAGCGATCGATCGA... 3’
        rRNA2  5’..TACAGTATAGGTGGAGCTAGCGACGATCGA... 3’
        rRNA3  5’...ACGGCAAAATAGGTGGATTCTAGCGATATAGA... 3’
        rRNA4  5’...ACGGCCCGATAGGTGGATTCTAGCGCCATAGA... 3’
      Data matrix:
        rRNA1    A C A C A C
        rRNA2    T A C A G T
        rRNA3    C A C T G T
        rRNA4    C A C A G T
        E. coli  A G A C A G
        Humans   T A T A G T
        Yeast    T A C A G T
  • 23-31. rRNA Phylotyping (figure build; labels: Cluster, OTUs, Just Phylogeny; groups A, B, C; OTU1, OTU2, OTU3, OTU4; reference taxa E. coli, Humans, Yeast)
  • 32. rRNA Phylotyping • OTUs • Taxonomic lists • Relative abundance of taxa • Ecological metrics (alpha and beta diversity) • Phylogenetic metrics • Binning • Identification of novel groups • Clades • Rates of change • LGT • Convergence • PD • Phylogenetic ecology (e.g., Unifrac)
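The step from a data matrix to OTUs can be sketched in a few lines. This is a minimal illustration, not the procedure used in the talk: it takes the four rRNA rows of the slide's toy data matrix, scores pairwise identity over the aligned columns, and groups sequences greedily around seed sequences. The function names `identity` and `greedy_otus` and the 0.8 threshold are assumptions made for the example.

```python
from itertools import combinations

# Aligned columns copied from the slide's toy data matrix.
alignment = {
    "rRNA1": "ACACAC",
    "rRNA2": "TACAGT",
    "rRNA3": "CACTGT",
    "rRNA4": "CACAGT",
}


def identity(a, b):
    """Fraction of aligned columns where two equal-length sequences agree."""
    return sum(x == y for x, y in zip(a, b)) / len(a)


def greedy_otus(seqs, threshold):
    """Greedy seed-based clustering (hypothetical helper).

    Each sequence joins the first OTU whose seed it matches at or above
    the threshold; otherwise it becomes the seed of a new OTU.
    """
    otus = []  # list of (seed_name, [member_names])
    for name, seq in seqs.items():
        for seed, members in otus:
            if identity(seq, seqs[seed]) >= threshold:
                members.append(name)
                break
        else:
            otus.append((name, [name]))
    return otus


for a, b in combinations(alignment, 2):
    print(a, b, round(identity(alignment[a], alignment[b]), 3))

print(greedy_otus(alignment, threshold=0.8))
```

At the 0.8 threshold rRNA2 and rRNA4 fall into one OTU; at a stricter cutoff such as 0.99 every sequence is its own OTU. Greedy clustering depends on input order, which is one reason real pipelines document their clustering rules carefully.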
  • 33. What’s New in Phylotyping
  • 34. What’s New in Phylotyping I • More PCR products • Deeper sequencing • The rare biosphere • Relative abundance estimates • More samples (with barcoding) • Time series • Spatially diverse sampling • Fine scale sampling
  • 35. Beta-Diversity • Page excerpt from Martiny, Eisen, Penn, Allison, and Horner-Devine, "Drivers of bacterial β-diversity depend on spatial scale," PNAS, May 10, 2011, vol. 108, no. 19, 7850–7854 • Across all samples, 4,931 quality Nitrosomonadales sequences grouped into 176 OTUs (operational taxonomic units) using an arbitrary 99% sequence similarity cutoff • Pairwise community similarity between samples was calculated from the presence or absence of each OTU using a rarefied Sørensen’s index • The Nitrosomonadales display a significant, negative distance-decay curve (slope = −0.08) • Geographic distance contributed the largest partial regression coefficient (b = 0.40, P < 0.0001); sediment moisture, nitrate concentration, plant cover, salinity, and air and water temperature contributed smaller, but significant, partial regression coefficients (b = 0.09–0.17, P < 0.05) • Fig. 1: the 13 marshes sampled • Fig. 2: distance-decay curves for the Nitrosomonadales communities
  • 37. Microbial Range Maps
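The two quantities named in the Beta-Diversity excerpt, a Sørensen similarity between samples and the slope of a distance-decay curve, can be sketched as follows. This is a minimal sketch under stated assumptions: the three sites, their positions, and their OTU sets are invented for illustration, the index is the plain incidence-based Sørensen index without the rarefaction the paper describes, and the slope is an ordinary least-squares fit on untransformed values.

```python
from itertools import combinations


def sorensen(a, b):
    """Incidence-based Sørensen similarity: 2|A∩B| / (|A| + |B|).

    Assumes both samples contain at least one OTU.
    """
    a, b = set(a), set(b)
    return 2 * len(a & b) / (len(a) + len(b))


def ols_slope(xs, ys):
    """Least-squares slope of y on x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    return sxy / sxx


# Hypothetical sites: position along a transect (km) and the OTUs observed.
sites = {
    "site_A": (0.0, {"otu1", "otu2", "otu3", "otu4"}),
    "site_B": (1.0, {"otu1", "otu2", "otu3", "otu5"}),
    "site_C": (10.0, {"otu1", "otu2", "otu6", "otu7"}),
}

distances, similarities = [], []
for s1, s2 in combinations(sites, 2):
    (pos1, otus1), (pos2, otus2) = sites[s1], sites[s2]
    distances.append(abs(pos1 - pos2))
    similarities.append(sorensen(otus1, otus2))

# A negative slope means similarity decays with distance.
print(ols_slope(distances, similarities))
```

Because pairwise comparisons are not independent observations, significance for such curves is usually assessed with permutation approaches such as the Mantel test mentioned in the excerpt rather than with a standard regression test.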
  • 38-39. Things You Could Do • Mississippi River: 2320 miles long • 1 site / mile • 3 samples / site • 6960 samples • rRNA PCR w/ barcodes • metagenomics w/ barcodes • MiSeq run: • 30 million sequence reads • 4310 sequences / sample • HiSeq 2000 • 6 billion sequence reads • 862,068 sequences / sample
  • 40. Things You Could Do • Mississippi River: 12,249,600 feet long • 1 site / 500 feet • 3 samples / site • 73,497 samples • rRNA PCR w/ barcodes • metagenomics w/ barcodes • MiSeq run: • 30 million sequence reads • 408 sequences / sample • HiSeq 2000 • 6 billion sequence reads • 81,635 sequences / sample
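The sampling arithmetic on these two slides can be checked with a short script. It is only a sketch: the function names are made up here, and it assumes, as the slide numbers imply, that fractional sites and reads are rounded down and that reads are split evenly across barcoded samples.

```python
MISEQ_READS = 30_000_000      # reads per MiSeq run, as given on the slide
HISEQ_READS = 6_000_000_000   # reads per HiSeq 2000 run, as given on the slide
FEET_PER_MILE = 5280


def sample_count(river_length, site_spacing, samples_per_site):
    """Samples collected when sites are evenly spaced (same length units)."""
    return river_length * samples_per_site // site_spacing


def reads_per_sample(reads_per_run, n_samples):
    """Reads available per sample if a run is split evenly."""
    return reads_per_run // n_samples


# One site per mile along 2320 miles.
per_mile = sample_count(2320, 1, 3)
print(per_mile)                                  # 6960
print(reads_per_sample(MISEQ_READS, per_mile))   # 4310
print(reads_per_sample(HISEQ_READS, per_mile))   # 862068

# One site per 500 feet along the same river.
per_500_feet = sample_count(2320 * FEET_PER_MILE, 500, 3)
print(per_500_feet)                                  # 73497
print(reads_per_sample(MISEQ_READS, per_500_feet))   # 408
print(reads_per_sample(HISEQ_READS, per_500_feet))   # 81635
```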
  • 41. What’s New in Phylotyping II • Metagenomics (shotgun sequencing) avoids biases of rRNA PCR
  • 42. Metagenomic Phylotyping (figure labels: Cluster, OTUs, Just Phylogeny; groups A, B, C; OTU1, OTU2, OTU3, OTU4; reference taxa E. coli, Humans, Yeast)
  • 43-44. Phylogenetic Challenge ??
  • 45. Phylogenetic Challenge • Multiple approaches
  • 46-49. Method 1: Each is an island • Build alignment, models, trees for full length seqs • Analyze fragmented reads one at a time
  • 50. STAP • Each sequence analyzed separately • Figure 1. A flow chart of the STAP pipeline. doi:10.1371/journal.pone.0002566.g001 • Page text: "... STAP database, and the query sequence is aligned to them using the CLUSTALW profile alignment algorithm [40] as described above for domain assignment." • Figure 2. Domain assignment. In Step 1, STAP assigns a domain to each query sequence based on its position in a maximum likelihood tree of representative ss-rRNA sequences. Because the tree illustrated here is not rooted, domain assignment would not be accurate and reliable (sequence similarity based methods cannot make an accurate assignment in this case either). However the figure illustrates an important role of the tree-based domain assignment step, namely automatic identification of deep-branching environmental ss-rRNAs. doi:10.1371/journal.pone.0002566.g002 • Wu et al. 2008 PLoS One
  • 51. AMPHORA • Guide tree • Wu and Eisen Genome Biology 2008 9:R151 doi:10.1186/gb-2008-9-10-r151
  • 52. Phylotyping w/ Proteins • Wu and Eisen Genome Biology 2008 9:R151 doi:10.1186/gb-2008-9-10-r151
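The "each is an island" idea, a fixed reference built once from full-length sequences and each fragment then analyzed independently, can be illustrated with a deliberately simplified stand-in. The sketch below does not place reads on a tree as STAP and AMPHORA do; it just assigns each fragment to the most similar reference row over the columns the fragment covers. The reference rows come from the toy data matrix on the rRNA Phylotyping slide; the reads, the `-` gap convention, and the function names are assumptions.

```python
# Reference rows from the slide's toy data matrix (full-length sequences).
references = {
    "E. coli": "AGACAG",
    "Humans": "TATAGT",
    "Yeast": "TACAGT",
}


def covered_identity(read, ref):
    """Identity over only the columns the read covers ('-' means not covered)."""
    pairs = [(r, f) for r, f in zip(read, ref) if r != "-"]
    if not pairs:
        return 0.0
    return sum(r == f for r, f in pairs) / len(pairs)


def classify_read(read, refs):
    """Nearest reference for one read, independent of every other read.

    Simplification: similarity stands in for tree placement. Ties go to
    the first reference in dictionary order.
    """
    return max(refs, key=lambda name: covered_identity(read, refs[name]))


# Hypothetical fragments, already aligned to the reference columns.
reads = ["AGA---", "TAC---", "--TAGT"]

for read in reads:
    print(read, classify_read(read, references))
```

A fragment such as `---AGT` matches Humans and Yeast equally well here, which shows why short reads covering little of a gene can be hard to place.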
  • 53. Method 2: Most in the Family
  • 54. Phylogenetic Challenge ?? (figure: fragments drawn as rows of x of varying length)
  • 55. Method 2: Most in family • One tree for those w/ overlap (figure: fragments drawn as rows of x of varying length)
  • 56. rRNA in Sargasso Metagenome • Venter et al., Science 304: 66. 2004
  • 57. RecA Phylotyping in Sargasso Data • Venter et al., Science 304: 66. 2004
  • 58. Sargasso Phylotyping • Chart: Sargasso Phylotypes • Y axis: Weighted % of Clones (0 to 0.500) • X axis: Major Phylogenetic Group (Alphaproteobacteria, Betaproteobacteria, Gammaproteobacteria, Epsilonproteobacteria, Deltaproteobacteria, Cyanobacteria, Firmicutes, Actinobacteria, Chlorobi, CFB, Chloroflexi, Spirochaetes, Fusobacteria, Deinococcus-Thermus, Euryarchaeota, Crenarchaeota) • Series: EFG, EFTu, HSP70, RecA, RpoB, rRNA • Venter et al., Science 304: 66. 2004
  • 59. STAP, QIIME, Mothur • Combine all into one alignment • Figure 1. A flow chart of the STAP pipeline. doi:10.1371/journal.pone.0002566.g001
  • 60. WATERS • Hartman et al. 2010. W.A.T.E.R.S.: a Workflow for the Alignment, Taxonomy, and Ecology of Ribosomal Sequences. BMC Bioinformatics 2010, 11:317 doi:10.1186/1471-2105-11-317 • From the page excerpt: WATERS is an automated, user-friendly, workflow-based system; because it is executed in the Kepler scientific workflow system, it also keeps track of the data lineage and provenance of data products • Stated motivations: automation, and broadening the user base by minimizing the technical difficulty of 16S rDNA analysis • Figure 1. Overview of WATERS; boxes: Align, Check chimeras, Cluster, Build Tree, Assign Taxonomy, Tree w/ Taxonomy, Diversity statistics & graphs, Cytoscape network, OTU table, Unifrac files • Figure 2. Screenshot of WATERS in Kepler software
  • 61. One Major Issue with rRNA • Copy number varies greatly between taxa • Can lead to significant errors in estimates of relative abundance from numbers of reads
  • 62. Kembel Correction Kembel, Wu, Eisen, Green. In press. PLoS Computational Biology. Incorporating 16S gene copy number information improves estimates of microbial diversity and abundance
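To make the copy-number issue concrete, here is a minimal sketch of the simplest possible correction: divide each taxon's read count by its rRNA gene copy number, then renormalize. It assumes the copy number of every taxon is already known, which is not generally true for environmental samples, and it is not the method of the Kembel et al. paper. The function name, taxon names, and counts are all hypothetical.

```python
def correct_for_copy_number(read_counts, copy_numbers):
    """Turn rRNA read counts into copy-number-corrected relative abundances.

    Assumption (hypothetical): copy_numbers holds a known rRNA gene copy
    number for every taxon in read_counts.
    """
    # Estimated number of genomes (cells) = reads / gene copies per genome.
    cells = {taxon: reads / copy_numbers[taxon]
             for taxon, reads in read_counts.items()}
    total = sum(cells.values())
    return {taxon: n / total for taxon, n in cells.items()}


# Made-up example: equal numbers of cells, but taxon_A carries 7 rRNA
# gene copies per genome and taxon_B carries 1.
reads = {"taxon_A": 700, "taxon_B": 100}
copies = {"taxon_A": 7, "taxon_B": 1}

# Naive relative abundance straight from read counts.
naive = {t: r / sum(reads.values()) for t, r in reads.items()}
corrected = correct_for_copy_number(reads, copies)

print(naive)      # taxon_A appears to dominate
print(corrected)  # both taxa come out equally abundant
```

In this toy case the naive estimate puts taxon_A at 87.5% of the community, while the corrected estimate is 50% for each taxon.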
  • 63. Method 3: All in the family
  • 64. Phylogenetic Challenge ??
  • 65. Phylogenetic Challenge A single tree with everything?
  • 66. rRNA analysis [Figure, panels A-C; labels: Cluster, Phylogeny, OTUs (OTU1, OTU2, OTU3, OTU4); reference taxa: E. coli, Humans, Yeast]
  • 67. PhylOTU Figure 1. PhylOTU Workflow. Computational processes are represented as squares and databases are represented as cylinders in the PhylOTU workflow. See Results section for details. doi:10.1371/journal.pcbi.1001061.g001 Sharpton TJ, Riesenfeld SJ, Kembel SW, Ladau J, O'Dwyer JP, Green JL, Eisen JA, Pollard KS. (2011) PhylOTU: A High-Throughput Procedure Quantifies Microbial Community Diversity and Resolves Novel Taxa from Metagenomic Data. PLoS Comput Biol 7(1): e1001061. doi:10.1371/journal.pcbi.1001061
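The OTU step in the preceding slides groups sequences whose pairwise distance falls within a cutoff. Below is a minimal sketch of that idea using single-linkage clustering over a precomputed distance table. It is an illustration only, not the PhylOTU implementation; the read names, distances, and threshold are made up, and the distances could come from either percent identity or tree branch lengths.

```python
def cluster_otus(names, dist, threshold):
    """Single-linkage clustering of sequences into OTUs.

    names: list of sequence identifiers.
    dist: dict mapping (a, b) pairs, with a listed before b in names,
          to a pairwise distance.
    threshold: pairs at or below this distance end up in the same OTU.
    """
    parent = {n: n for n in names}

    def find(n):
        # Follow parent links to the cluster representative,
        # shortening the path along the way.
        while parent[n] != n:
            parent[n] = parent[parent[n]]
            n = parent[n]
        return n

    for i, a in enumerate(names):
        for b in names[i + 1:]:
            if dist[(a, b)] <= threshold:
                parent[find(a)] = find(b)  # merge the two clusters

    otus = {}
    for n in names:
        otus.setdefault(find(n), []).append(n)
    return sorted(otus.values())


# Hypothetical reads and distances.
names = ["r1", "r2", "r3", "r4"]
dist = {
    ("r1", "r2"): 0.01, ("r1", "r3"): 0.20, ("r1", "r4"): 0.25,
    ("r2", "r3"): 0.02, ("r2", "r4"): 0.22, ("r3", "r4"): 0.30,
}
otus = cluster_otus(names, dist, threshold=0.03)
print(otus)
```

Note that r1 and r3 land in the same OTU even though they are 0.20 apart, because both are close to r2. This chaining is a known property of single linkage; average or complete linkage would behave differently.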
  • 68. RecA, RpoB in GOS [Figure labels: GOS 1, GOS 2, GOS 3, GOS 4, GOS 5] Wu D, Wu M, Halpern A, Rusch DB, Yooseph S, et al. (2011) Stalking the Fourth Domain in Metagenomic Data: Searching for, Discovering, and Interpreting Novel, Deep Branches in Marker Gene Phylogenetic Trees. PLoS ONE 6(3): e18011. doi:10.1371/journal.pone.0018011
  • 69. Phylosift/ pplacer Aaron Darling, Guillaume Jospin, Holly Bik, Erik Matsen, Eric Lowe, and others
  • 70. Phylosift • Probabilistic Phylogenetic Ecology • https://github.com/gjospin/PhyloSift • http://phylosift.wordpress.com
  • 71. Method 4: All in the genome
  • 72. Multiple Genes? A single tree with everything?
  • 73. Kembel Combiner Kembel SW, Eisen JA, Pollard KS, Green JL (2011) The Phylogenetic Diversity of Metagenomes. PLoS ONE 6(8): e23214. doi:10.1371/journal.pone.0023214
  • 74. Kembel Combiner. Excerpt shown on slide: "... typically used as a qualitative measure because duplicate sequences are usually removed from the tree. However, the test may be used in a semiquantitative manner if all clones, even those with identical or near-identical sequences, are included in the tree (13). Here we describe a quantitative version of UniFrac that we call “weighted UniFrac.” We show that weighted UniFrac behaves similarly to the FST test in situations where both a..." FIG. 1. Calculation of the unweighted and the weighted UniFrac measures. Squares and circles represent sequences from two different environments. (a) In unweighted UniFrac, the distance between the circle and square communities is calculated as the fraction of the branch length that has descendants from either the square or the circle environment (black) but not both (gray). (b) In weighted UniFrac, branch lengths are weighted by the relative abundance of sequences in the square and circle communities; square sequences are weighted twice as much as circle sequences because there are twice as many total circle sequences in the data set. The width of branches is proportional to the degree to which each branch is weighted in the calculations, and gray branches have no weight. Branches 1 and 2 have heavy weights since the descendants are biased toward the square and circles, respectively. Branch 3 contributes no value since it has an equal contribution from circle and square sequences after normalization. Kembel SW, Eisen JA, Pollard KS, Green JL (2011) The Phylogenetic Diversity of Metagenomes. PLoS ONE 6(8): e23214. doi:10.1371/journal.pone.0023214
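The two UniFrac measures described in the figure caption can be sketched in a few lines. This version assumes the tree has already been reduced to a list of branches, each given as its length and the set of leaves below it. It computes the unweighted fraction of branch length unique to one community, and the raw (unnormalized) weighted sum of branch length times the difference in relative abundance. The tree, leaf names, and counts are hypothetical, and the code is a teaching sketch rather than the published software.

```python
def unifrac(branches, comm_a, comm_b):
    """Return (unweighted, raw weighted) UniFrac for two communities.

    branches: list of (length, set_of_descendant_leaves) for every branch.
    comm_a, comm_b: dicts mapping leaf name -> sequence count.
    """
    total_a = sum(comm_a.values())
    total_b = sum(comm_b.values())
    unique = observed = weighted = 0.0
    for length, leaves in branches:
        a = sum(comm_a.get(leaf, 0) for leaf in leaves)
        b = sum(comm_b.get(leaf, 0) for leaf in leaves)
        if a or b:
            observed += length        # branch leads to at least one community
            if not (a and b):
                unique += length      # branch leads to only one community
        # Weight each branch by the difference in relative abundance below it.
        weighted += length * abs(a / total_a - b / total_b)
    return unique / observed, weighted


# Hypothetical tree ((x:1, y:1):0.5, z:2), listed branch by branch.
branches = [
    (1.0, {"x"}),
    (1.0, {"y"}),
    (0.5, {"x", "y"}),
    (2.0, {"z"}),
]
squares = {"x": 2, "y": 1}   # community A
circles = {"y": 1, "z": 2}   # community B

unweighted, weighted = unifrac(branches, squares, circles)
print(unweighted, weighted)
```

The branch to y contributes nothing to the weighted value because both communities have the same relative abundance below it, which mirrors "Branch 3" in the caption.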
  • 75. Uses of Phylogeny in Genomics and Metagenomics Example 2: Functional Diversity and Functional Predictions
  • 76. Phylogenomics: Phylogenetic Prediction of Gene Function. Method: choose gene(s) of interest; identify homologs; align sequences; calculate gene tree; overlay known functions onto tree; infer likely function of gene(s) of interest. [Figure: Example A (genes 1A, 2A, 3A, 1B, 2B, 3B from Species 1, 2, and 3, with a possible duplication) and Example B (genes 1-6, with an ambiguous case) worked through each step, alongside the actual evolution (assumed to be unknown).] Based on Eisen, 1998 Genome Res 8: 163-167.
  • 77. Diversity of Proteorhodopsins Venter et al., 2004. Science 304: 66.
  • 78. Improving Functional Predictions • Same methods discussed for phylotyping improve phylogenomic functional prediction for protein families • Increase in sequence diversity helps too
  • 79. Phylosift/ pplacer Aaron Darling, Guillaume Jospin, Holly Bik, Erik Matsen, Eric Lowe, and others
  • 80. Carboxydothermus sporulates Wu et al. 2005 PLoS Genetics 1: e65.
  • 81. Wu et al. 2005 PLoS Genetics 1: e65.
  • 82. NMF in Metagenomes: Characterizing the niche-space distributions of components. Functional biogeography of ocean microbes revealed through non-negative matrix factorization, w/ Weitz, Dushoff, Jiang, Neches, Langille, Levin, etc. In press, PLoS One. Comes out 9/18. Figure 3: a) Niche-space distributions for our five components (H^T); b) the site-site similarity matrix (Ĥ^T Ĥ); c) environmental variables for the sites. The matrices are aligned so that the same row corresponds to the same site in each matrix. Sites are ordered by applying spectral reordering to the similarity matrix (see Materials and Methods). Rows are aligned across the three matrices. [Figure: rows are ocean sampling sites (GS001c through GS149; Sargasso Sea, North American East Coast, Caribbean Sea, Eastern Tropical Pacific, Galapagos Islands, Polynesia Archipelagos, Indian Ocean); columns in (a): Component 1-5; columns in (c): Chlorophyll, Salinity, Temperature, Water Depth, Sample Depth, Insolation; legend: General (High, Medium, Low, NA) and Water depth (0-20m, 20-100m, 100-200m, 900-2000m, 2000-4000m, >4000m)]
  • 83. Uses of Phylogeny in Genomics and Metagenomics Example 3: Selecting Organisms for Study
  • 84. GEBA http://www.jgi.doe.gov/programs/GEBA/pilot.html
  • 85. GEBA THAT IS SO LAMG10 http://www.jgi.doe.gov/programs/GEBA/pilot.html
  • 86. How To Keep Up? • IMG • Genomes Online • MicrobeDB • http://github.com/mlangill/microbedb/ • Langille MG, Laird MR, Hsiao WW, Chiu TA, Eisen JA, Brinkman FS. MicrobeDB: a locally maintainable database of microbial genomic sequences. Bioinformatics. 2012 28(14):1947-8.
  • 88. More Markers
      Phylogenetic group       Genome Number   Gene Number   Marker Candidates
      Archaea                  62              145415        106
      Actinobacteria           63              267783        136
      Alphaproteobacteria      94              347287        121
      Betaproteobacteria       56              266362        311
      Gammaproteobacteria      126             483632        118
      Deltaproteobacteria      25              102115        206
      Epsilonproteobacteria    18              33416         455
      Bacteroides              25              71531         286
      Chlamydiae               13              13823         560
      Chloroflexi              10              33577         323
      Cyanobacteria            36              124080        590
      Firmicutes               106             312309        87
      Spirochaetes             18              38832         176
      Thermi                   5               14160         974
      Thermotogae              9               17037         684
  • 89. Better Reference Tree Morgan et al. submitted
  • 91. Sifting Families [Figure 1, panels A-C; box labels: Representative Genomes, Extract Protein Annotation, All v. All BLAST, Homology Clustering (MCL), SFams, Align & Build HMMs, New Genomes, Extract Protein Annotation, Screen for Homologs, HMMs] Sharpton et al. submitted
  • 92. Zorro - Automated Masking [Figure: Distance to True Tree (y-axis, 0.0 to 9.0) versus Sequence Length (x-axis: 200, 400, 800, 1600, 3200) for no masking, zorro, and gblocks] Wu M, Chatterji S, Eisen JA (2012) Accounting For Alignment Uncertainty in Phylogenomics. PLoS ONE 7(1): e30288. doi:10.1371/journal.pone.0030288
  • 94. GEBA Lesson We have still only scratched the surface of microbial diversity
  • 95. PD: All From Wu et al. 2009 Nature 462, 1056-1060
  • 96. Families/PD not uniform 31 6
  • 97. GEBA uncultured: Number of SAGs from Candidate Phyla
      Site                                   OD1   OP11   OP3   SAR406
      Site A: Hydrothermal vent              4     1      -     -
      Site B: Gold Mine                      6     13     2     -
      Site C: Tropical gyres (Mesopelagic)   -     -      -     2
      Site D: Tropical gyres (Photic zone)   1     -      -     -
      Sample collections at 4 additional sites are underway. (Phil Hugenholtz)
  • 98. GEBA Lesson Need Experiments from Across the Tree of Life too
  • 102. Acknowledgements • $$$ • DOE • NSF • GBMF • Sloan • DARPA • DSMZ • DHS • People, places • DOE JGI: Eddy Rubin, Phil Hugenholtz, Nikos Kyrpides • UC Davis: Aaron Darling, Dongying Wu, Holly Bik, Russell Neches, Jenna Morgan-Lang • Other: Jessica Green, Katie Pollard, Martin Wu, Tom Slezak, Jack Gilbert, Steven Kembel, J. Craig Venter, Naomi Ward, Hans-Peter Klenk