Jonathan Eisen talk on "Genomic Encyclopedia" at Lake Arrowhead Small Genomes Meeting 2008
1. A Genomic Encyclopedia of
Bacteria and Archaea
(GEBA)
Jonathan A. Eisen
U. C. Davis and J. G. I.
2. Outline
• Background
– Why history matters
– Gaps in available genomes
• The GEBA pilot project
• Future needs
3. The Tree of Life
4. Famous Arrowhead 2004 Quotes
• Space-time continuum of genes and genomes
• Gene sequences are the wormhole that allows
one to tunnel into the past
• The human mind can conceive of things with
no basis in physical reality
• Thoughts can go faster than the speed of light
5. Famous Arrowhead Quotes 2006
• Publications, student degrees, etc.
• Not trying to say anything bad about
anyone
• The human guts are a real milieu
• Where’s your evening gown?
• You better kiss everybody
• This is how you do metagenomics on 50
dollars, and that’s Canadian dollars
6. (Image-only slide)
7. (Image-only slide; from http://genomesonline.org)
8. Major Microbial Sequencing Efforts
• Coordinated, top-down efforts
– Fungal Genome Initiative (Broad/Whitehead)
– Gordon and Betty Moore Foundation Marine Microbial Genome
Sequencing Project
– Sanger Center Pathogen Sequencing Unit
– NHGRI Human Gut Microbiome Project
– NIH Human Microbiome Program
• White paper or grant systems
– NIAID Microbial Sequencing Centers
– DOE/JGI Community Sequencing Program
– DOE/JGI BER Sequencing Program
– NSF/USDA Microbial Genome Sequencing
• Covers lots of ground and biological diversity
9. The Tree of Life
10. The Tree is not Happy
11. As of 2002
• At least 40 phyla of bacteria
Tree of bacterial phyla (based on Hugenholtz, 2002): Proteobacteria, TM6, OS-K, Acidobacteria, Termite Group, OP8, Nitrospira, Bacteroides, Chlorobi, Fibrobacteres, Marine Group A, WS3, Gemmimonas, Firmicutes, Fusobacteria, Actinobacteria, OP9, Cyanobacteria, Synergistes, Deferribacteres, Chrysiogenetes, NKB19, Verrucomicrobia, Chlamydia, OP3, Planctomycetes, Spirochaetes, Coprothermobacter, OP10, Thermomicrobia, Chloroflexi, TM7, Deinococcus-Thermus, Dictyoglomus, Aquificae, Thermodesulfobacteria, Thermotogae, OP1, OP11
12. As of 2002
• At least 40 phyla of bacteria
• Genome sequences are mostly from three phyla
(Tree of bacterial phyla as on slide 11; based on Hugenholtz, 2002)
13. As of 2002
• At least 40 phyla of bacteria
• Genome sequences are mostly from three phyla
• Some other phyla are only sparsely sampled
(Tree of bacterial phyla as on slide 11; based on Hugenholtz, 2002)
14. As of 2002
• At least 40 phyla of bacteria
• Genome sequences are mostly from three phyla
• Some other phyla are only sparsely sampled
• Same trend in Archaea, Eukaryotes
(Tree of bacterial phyla as on slide 11; based on Hugenholtz, 2002)
15. Need for Tree Guidance Well Established
• Common approach within some eukaryotic groups
– NHGRI animal projects
– FGI at Whitehead
– Plant sequencing at JGI
• Phylogenetic gaps in bacterial and archaeal projects
commonly lamented in literature
• Many small projects funded to fill in some gaps
– DOE/TIGR Sequencing
– Multiple CSP projects
– Multiple NSF/USDA projects
– Private projects (e.g., Integrated Genomics, Diversa)
– TIGR (Eisen, Ward) Bacterial Tree of Life Project
16. Why Increase Taxonomic Coverage?
• Mechanisms of diversification
• Gene discovery
• Annotation, functional prediction
• Metagenomic analysis
• Species phylogeny and classification
17. (Tree of bacterial phyla as on slide 11; based on Hugenholtz, 2002)
• At least 40 phyla of bacteria
• Genome sequences are mostly from three phyla
• Some other phyla are only sparsely sampled
• Solution I: sequence more phyla
• Eisen-Ward NSF Tree of Life Project
• A genome from each of eight phyla
18. The Tree of Life is Still Angry
19. Within Phyla Diversity Immense
• Each phylum represents billions of years of
evolution
• Some have hundreds of major lineages,
most with no genomes
• Need to sample within phyla too
20. Major Lineages of Actinobacteria
2.5.1 Acidimicrobidae
2.5.1.1 Unclassified
2.5.1.2 "Microthrixineae"
2.5.1.3 Acidimicrobineae
2.5.1.4 BD2-10
2.5.1.5 EB1017
2.5.2 Actinobacteridae
2.5.2.1 Unclassified
2.5.2.10 Ellin306/WR160
2.5.2.11 Ellin5012
2.5.2.12 Ellin5034
2.5.2.13 Frankineae
2.5.2.14 Glycomyces
2.5.2.15 Intrasporangiaceae
2.5.2.16 Kineosporiaceae
2.5.2.17 Microbacteriaceae
2.5.2.18 Micrococcaceae
2.5.2.19 Micromonosporaceae
2.5.2.2 Actinomyces
2.5.2.20 Propionibacterineae
2.5.2.21 Pseudonocardiaceae
2.5.2.22 Streptomycineae
2.5.2.23 Streptosporangineae
2.5.2.3 Actinomycineae
2.5.2.4 Actinosynnemataceae
2.5.2.5 Bifidobacteriaceae
2.5.2.6 Brevibacteriaceae
2.5.2.7 Cellulomonadaceae
2.5.2.8 Corynebacterineae
2.5.2.9 Dermabacteraceae
2.5.3 Coriobacteridae
2.5.3.1 Unclassified
2.5.3.2 Atopobiales
2.5.3.3 Coriobacteriales
2.5.3.4 Eggerthellales
2.5.4 OPB41
2.5.5 PK1
2.5.6 Rubrobacteridae
2.5.6.1 Unclassified
2.5.6.2 "Thermoleophilaceae"
2.5.6.3 MC47
2.5.6.4 Rubrobacteraceae
22. (Tree of bacterial phyla as on slide 11; legend: well sampled phyla)
• At least 100 phyla of bacteria
• Genome sequences are mostly from three phyla
• Most phyla with cultured species are sparsely sampled
• Lineages with no cultured taxa even more poorly sampled
• Solution - use tree to really fill gaps
23. (Image-only slide)
24. GEBA Pilot Project: Components
• Project management (David Bruce, Lynne Goodwin et al)
• Selection of strains (Phil Hugenholtz, Nikos Kyrpides,
Jonathan Eisen)
• Culture collection and DNA prep (DSMZ, Hans-Peter
Klenk)
• Libraries and DNA (Eileen Dalin et al.)
• Sequencing and closure (Susan Lucas, Alla Lapidus et al.)
• Annotation and database needs (Nikos Kyrpides)
• Analysis (Dongying Wu, Martin Wu, Jenna Morgan,
Victor Kunin, Marcel Huntemann, Neil Rawlings, Ian
Paulsen, Gary Xie, Patrick Chain, Patrik D’Haeseleer,
Sean Hooper, Iain Anderson, Mavrommatis Kostas)
• Adopt a microbe education project (Cheryl Kerfeld)
• Outreach (David Gilbert)
• $$$ (DOE, Eddy Rubin, Jim Bristow)
25. GEBA Pilot I:
Identifying Lineages without
Genomes
26. (Image-only slide)
27. (Image-only slide)
28. (Image-only slide)
29. GEBA Pilot II:
Selecting Targets
30. Key Criteria
• Phylogenetic novelty
– Working from top of tree down
– Also selected one phylum to fill in more fully: Actinobacteria
• Culturable
– Type strain preferred if all else is equal
• DOE mission relevance
• Ready availability to us and community
– Of strain
– Of DNA
31. GEBA Pilot III:
Partnership with DSMZ
32. GEBA Biggest Challenge:
Getting DNA
• Getting quality DNA is biggest bottleneck
• Decided to test as part of the GEBA pilot
the possibility of getting DNA directly from
culture collections
• DSMZ offered to do for free
• ATCC is doing a small number for a fee
• Working with other culture collections
33. (Image-only slide)
34. Quantification gel of the genomic DNA isolated from Microorganisms
Conexibacter woesei (DSM 14684T)
Lane 1: c(λ-Marker) = 15 ng
Lane 2: c(λ-Marker) = 30 ng
Lane 3: c(λ-Marker) = 50 ng
Lane 4: DNA Molecular Weight Marker II (Roche 236250)
Lane 5: DSM 13279, Collinsella stercoris
Lane 6: DSM 43043, Intrasporangium calvum
Lane 7: DSM 18053, Dyadobacter fermentans
Lane 8: DSM 20476, Slackia heliotrinireducens
Lane 9: DSM 18081, Patulibacter minatonensis
Lane 10: DSM 14684, Conexibacter woesei
Lane 11: DSM 11002, Dethiosulfovibrio peptidovorans
Lane 12: DSM 11551, Halogeometricum borinquense
Lane 13: DNA Molecular Weight Marker II (Roche 236250)
Lane 14: c(λ-Marker) = 125 ng
Lane 15: c(λ-Marker) = 250 ng
Lane 16: c(λ-Marker) = 500 ng
Conexibacter woesei (DSM 14684T) was taken from the German Collection of Microorganisms and Cell Cultures (DSMZ). The genomic DNA was isolated using the Qiagen Genomic 500 DNA Kit (Qiagen 10262). The genomic DNA was 10-250 kb in size as determined by Pulsed Field Gel Electrophoresis (PFGE). The bulk of DNA had a size of 50-250 kb (see attached PFGE image). The DNA concentration is 500 ng/µl as estimated from the gel. Spectrophotometric measurements yielded a DNA concentration of 450 µg/ml; 300 µl of genomic DNA are shipped (150 µg).
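The shipped amount on this slide is concentration times volume. As a quick check (a sketch only; the function name is illustrative and not from the slide), the gel estimate of 500 ng/µl reproduces the quoted 150 µg, while the spectrophotometric 450 µg/ml would correspond to 135 µg:

```python
def dna_mass_ug(conc_ng_per_ul: float, volume_ul: float) -> float:
    """Total DNA mass in micrograms from concentration (ng/µl) and volume (µl)."""
    return conc_ng_per_ul * volume_ul / 1000.0  # 1000 ng per µg


# Note: 450 µg/ml is numerically the same as 450 ng/µl.
gel_estimate = dna_mass_ug(500, 300)      # concentration estimated from the gel
spectro_estimate = dna_mass_ug(450, 300)  # spectrophotometric concentration
print(gel_estimate, spectro_estimate)
```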
35. GEBA Pilot IV:
Sequencing Progress
36. (Image-only slide)
37. GEBA Pilot Target List
(Bar chart: # of genomes per phylum, axis 0-35)
Bacteria (B): Aquificae, Chloroflexi, Deinococci, Firmicutes, Bacteroidetes, Fusobacteria, Spirochaetes, Aminanaerobia, Deferribacteres, Planctomycetes, Haloanaerobiales, Thermovenabulae, Thermodesulfobia, Gemmatimonadetes, Delta Proteobacteria, Epsilon Proteobacteria, Gamma Proteobacteria, Thermodesulfobacteria, Actinobacteria (High GC)
Archaea (A): Halobacteria, Thermococci, Archaeoglobi, Thermoprotei, Methanobacteria, Methanomicrobia
38. GEBA Pilot Status 5-12-08
(Bar chart: # of genomes per phylum, axis 0-35, by status: Closed, Post Draft, Production, Library, Awaiting Material; phyla as on slide 37)
39. Non Active Projects
(Bar chart: # of genomes per phylum, axis 0-16, by status: Abandoned, On Hold; phyla as on slide 37)
40. GEBA Pilot Data Release
(Bar chart: # of genomes per phylum, axis 0-30; phyla as on slide 37)
41. Progress Report
GEBA Status 5-12-08
Closed: 3%
Awaiting Material: 26%
Post Draft: 51%
Library: 9%
Production: 11%
42. Progress
43. Data
45. GEBA Pilot V:
Benefit?
46. Why Increase Taxonomic Coverage?
• Mechanisms of diversification
• Gene discovery
• Annotation, functional prediction
• Metagenomic analysis
• Species phylogeny and classification
47. Value of 100 diverse genomes I:
Gene discovery
• Gene families
– Will compare and contrast gene family
diversity in these genomes versus random
samples of previous genomes
– Will assess rate of gene family discovery and
whether / how much it is diminishing
• Specific examples of novelty
– Focusing on DOE mission areas
– Do we find novel forms of hydrogenases,
cellulases, C-fixation, etc
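One way the rate of gene family discovery could be assessed is with a collector's curve: add genomes in random order and count how many distinct families have been seen so far. The sketch below is illustrative only; the function name, the toy family labels, and the choice to average over random orderings are assumptions, not the project's actual pipeline.

```python
import random


def discovery_curve(genomes, n_shuffles=100, seed=0):
    """Mean number of distinct gene families seen after adding 1..N genomes.

    `genomes` is a list of sets of gene-family identifiers (hypothetical input).
    The curve is averaged over `n_shuffles` random genome orderings.
    """
    rng = random.Random(seed)
    totals = [0.0] * len(genomes)
    for _ in range(n_shuffles):
        order = genomes[:]
        rng.shuffle(order)
        seen = set()
        for i, families in enumerate(order):
            seen |= families
            totals[i] += len(seen)
    return [t / n_shuffles for t in totals]


# Toy data: four genomes, six families in total.
genomes = [
    {"famA", "famB", "famC"},
    {"famA", "famB", "famD"},
    {"famA", "famE", "famF"},
    {"famA", "famB", "famC"},
]
curve = discovery_curve(genomes)
# New families contributed, on average, by each additional genome.
new_per_genome = [curve[0]] + [b - a for a, b in zip(curve, curve[1:])]
print(curve)
print(new_per_genome)
```

A flattening curve (shrinking values in `new_per_genome`) would indicate diminishing discovery; comparing curves for phylogenetically selected genomes against random samples of earlier genomes is the contrast the slide describes.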
48. Value of 100 diverse genomes II:
Annotation
• Ortholog identification
– Filling in gaps will help identify orthologs between species
– Diverse GC content and amino acid composition should also
improve ortholog identification
• Examination of the rate of hypothetical protein conversion
to “known” proteins
• Non-homology functional prediction should improve
greatly
– Phylogenetic profiling
– Rosetta Stone domain sharing
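Phylogenetic profiling predicts functional links between gene families whose presence/absence patterns across genomes match, which is why a more diverse genome set should sharpen it. A minimal sketch follows; the function names, the Jaccard similarity measure, and the toy data are assumptions chosen for illustration.

```python
from itertools import combinations


def build_profiles(genome_to_families):
    """Map each gene family to a presence/absence tuple over genomes (sorted by name)."""
    genomes = sorted(genome_to_families)
    families = set().union(*genome_to_families.values())
    return {
        fam: tuple(int(fam in genome_to_families[g]) for g in genomes)
        for fam in families
    }


def jaccard(p, q):
    """Jaccard similarity of two presence/absence profiles."""
    both = sum(1 for a, b in zip(p, q) if a and b)
    either = sum(1 for a, b in zip(p, q) if a or b)
    return both / either if either else 0.0


def linked_pairs(profiles, threshold=1.0):
    """Family pairs whose profiles are at least `threshold` similar."""
    return [
        (f1, f2)
        for f1, f2 in combinations(sorted(profiles), 2)
        if jaccard(profiles[f1], profiles[f2]) >= threshold
    ]


# Toy data: famA and famB always co-occur, so they are the only linked pair.
toy = {
    "genome1": {"famA", "famB", "famC"},
    "genome2": {"famA", "famB"},
    "genome3": {"famC", "famD"},
    "genome4": {"famA", "famB", "famD"},
}
profiles = build_profiles(toy)
print(linked_pairs(profiles))
```

With only a few closely related genomes, most profiles look alike and carry little signal; adding phylogenetically distant genomes makes the profiles longer and more discriminating.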
49. (Image-only slide; based on Wu et al. 2005)
50. (Image-only slide)
51. Value of 100 diverse genomes III:
Metagenomics
• More diverse genomes should improve anchoring
and binning of all metagenomic data sets
• Will test by running phylotyping software
comparing to genome data sets with and without
GEBA genomes
– MEGAN
– AMPHORA
• Should be a good complement to reference
genome sequencing
52. (Chart: values from 0 to 0.7 per phylum, one series per marker gene)
Marker genes: dnaG, frr, infC, nusA, pgk, pyrG, rplA, rplB, rplC, rplD, rplE, rplF, rplK, rplL, rplM, rplN, rplP, rplS, rplT, rpmA, rpoB, rpsB, rpsC, rpsE, rpsI, rpsJ, rpsK, rpsM, rpsS, smpB, tsf
Phyla: Aquificae, Chlorobi, Chlamydiae, Firmicutes, Chloroflexi, Acidobacteria, Bacteroidetes, Spirochaetes, Cyanobacteria, Actinobacteria, Planctomycetes, Betaproteobacteria, Deltaproteobacteria, Alphaproteobacteria, Epsilonproteobacteria, Unclassified Bacteria, Gammaproteobacteria, Unclassified Proteobacteria
53. Value of 100 diverse genomes IV:
Mechanisms of Diversification
• Lateral gene transfer
– Lateral gene transfer is fundamentally important in
microbial evolution
– However, when we find “foreign” DNA in genomes we
usually cannot pinpoint the origin of that DNA
– Having more diverse genomes may help better pin
down source groups for each piece of foreign DNA
• Eukaryotic diversification
– Of ~200 eukaryotic specific gene families
– How many now show up in bacteria and archaea
– Any patterns to where they are found?
54. CRISPR - expanding the possible
34 out of 56 genomes contain CRISPR
1-13 arrays (loci) per genome
Haliangium ochraceum SMP-2, DSM 14365
807 repeats in total
a single array contains 382 repeats
Verminephrobacter eiseniae: 249 repeats
55. Value of 100 diverse genomes V:
Phylogeny
56. 16S Says Hyphomonas is in Rhodobacterales
Badger et al. 2005
57. WGT Says It's Related to Caulobacterales
Badger et al. 2005
58. Tree of Life Example II
59. (Image-only slide)
60. GEBA - What’s Next
• Repeat and/or scale up
• Need to determine the value of finished
versus unfinished genomes
• Apply this method to other groups
– Microbial eukaryotes
– Viruses
• Really fill in bacterial and archaeal tree
61. The slopes of the linear regression lines represent the PD contribution of the genomes
(each window contains 50 genomes)
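The windowed slopes can be sketched as follows. This assumes the input is cumulative PD recorded as genomes are added one at a time; the slide does not say how PD was computed or how genomes were ordered, so the function names and toy data are illustrative only.

```python
def ols_slope(xs, ys):
    """Least-squares slope of ys regressed on xs."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx


def window_slopes(cumulative_pd, window=50, step=1):
    """Regression slope of cumulative PD within each sliding window of genomes."""
    slopes = []
    for start in range(0, len(cumulative_pd) - window + 1, step):
        xs = list(range(start, start + window))
        ys = cumulative_pd[start:start + window]
        slopes.append(ols_slope(xs, ys))
    return slopes


# Toy data: the first 60 genomes add 2.0 PD each, the next 60 add 0.5 each.
toy = [2.0 * i for i in range(60)] + [2.0 * 59 + 0.5 * (i + 1) for i in range(60)]
slopes = window_slopes(toy, window=50)
print(slopes[0], slopes[-1])
```

A higher slope means each genome in that window contributes more phylogenetic diversity, so plotting slope against window position shows where in the series new genomes add the most.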
62. (Plot: slope (50 genome windows) vs. window position)
63. Greengenes ssrRNA
(Plot: slope (50 genome windows) vs. window position)
64. GEBA: Long Run
• Need active community input
• Involvement of multiple funding agencies,
labs, genome centers
• Integration/ communication among all large
scale projects
• Follow recommendations of NAS, ASM,
AAM reports
• Adopt a Microbe - Link to educational
initiatives
Editor's Notes
Gets better with more markers, but we do not have lots of sequences for these markers. We can get them from genomes. The more diverse the genomes, the better the marker set will be.