2. Agenda
• Visualizing omics data
• Re-introduction to 16S analysis
• Hands-on 16S analysis in RStudio
• There is so much to learn. How do I start?
CENTER FOR MICROBIAL COMMUNITIES | AALBORG UNIVERSITY
4. Who - when, where and why?
Re-introduction to 16S analysis
5. Who - when, where and why?
6. Who - when, where and why?
Accumulibacter
Competibacter
http://en.wikipedia.org/wiki/File:EBPR_FISH_Floc.jpg
P. Larsen 2012
Bacillus anthracis
http://phil.cdc.gov/phil/details.asp?pid=2226
7. Taking advantage of evolution
The affinities of all the beings of the same class have
sometimes been represented by a great tree... The
green and budding twigs may represent existing
species; and those produced during former years
may represent the long succession of extinct species.
C. Darwin, 1872
Nothing in biology makes sense,
except in the light of evolution.
T. Dobzhansky, 1973
http://tolweb.org
8. Why do we use the 16S gene?
Ribosomes are universal
rRNA = Structural RNA
http://www.rna.icmb.utexas.edu/SAE/2B/ConsStruc/Diagrams/cons.16.b.Bacteria.pdf
9. Why do we use the 16S gene?
8F universal primer
http://www.rna.icmb.utexas.edu/SAE/2B/ConsStruc/Diagrams/cons.16.b.Bacteria.pdf
10. Why do we use the 16S gene?
Ashelford et al. AEM. 2005;71:7724-7736
• Advantages:
• Universal gene (No horizontal gene transfer)
• Conserved regions
• Variable regions
• Great databases and alignments
• Problems:
• Variable copy number
• No universal (unbiased) primers
• (Not directly correlated with activity)
• (Lack of functional information)
http://www.rna.icmb.utexas.edu/SAE/2B/ConsStruc/Diagrams/cons.16.b.Bacteria.pdf
12. Typical workflow
Sampling
Extraction
Sample prep
Sequencing
Bioinformatics
• Standardisation, standardisation, standardisation!
• Use biological replicates and evaluate your variation!
• Design a good experiment with realistic expectations for the outcome (most studies fail here!)
AAU activated sludge standard @ midasfieldguide.org
13. Typical workflow
Sampling
Extraction
Sample prep
Sequencing
Bioinformatics
[Figure: sample handling and extraction tests. Panels: storage (fresh, 24 h @ 4 °C, 24 h @ 20 °C); input (mg); eDNA removal with PMA (650 W, 10 min); bead beating duration (20, 40, 80, 160, 400 s) and intensity (4 and 6 m s⁻¹).]
14. Typical workflow
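Sampling
Extraction
Sample prep
Sequencing
Bioinformatics
[Figure: conservation profile of the 16S gene. Y-axis: mean frequency of the most common residue in a 50 bp window; x-axis: position (bp, 0 to 1500). The variable regions V1 to V9 are marked, together with the amplicon regions V1.3, V3.4 and V4.]
Ashelford et al. AEM. 2005;71:7724-7736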
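The y-axis of the conservation plot (mean frequency of the most common residue in a window) can be illustrated with a minimal Python sketch. This is not the method used in the cited paper: the alignment is a hypothetical toy example, the window is 5 columns instead of 50 bp, and the windows are non-overlapping for simplicity. The function names are illustrative only.

```python
from collections import Counter


def column_conservation(alignment):
    """Return, for each alignment column, the frequency of its most common residue."""
    length = len(alignment[0])
    if any(len(seq) != length for seq in alignment):
        raise ValueError("all aligned sequences must have the same length")
    scores = []
    for column in zip(*alignment):
        most_common_count = Counter(column).most_common(1)[0][1]
        scores.append(most_common_count / len(column))
    return scores


def windowed_mean(scores, window):
    """Average the scores in consecutive, non-overlapping windows."""
    means = []
    for start in range(0, len(scores), window):
        chunk = scores[start:start + window]
        means.append(sum(chunk) / len(chunk))
    return means


# Hypothetical toy alignment (not real 16S data): a conserved stretch
# followed by two fully variable columns.
toy_alignment = [
    "ACGTACGTAA",
    "ACGTACGTCC",
    "ACGTACGTGG",
    "ACGTACGTTT",
]

scores = column_conservation(toy_alignment)
profile = windowed_mean(scores, window=5)
print(profile)
```

Windows with a mean close to 1.0 correspond to conserved regions (good primer sites), while lower values correspond to variable regions that carry the taxonomic signal.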
15. Typical workflow
Sampling
Extraction
Sample prep
Sequencing
Bioinformatics
PCR with modified 16S primers
Forward primer (Illumina adapter, pad, linker, 27F):
5’-AATGATACGGCGACCACCGAGATCTACAC GTACGTACG GT AGAGTTTGATCCTGGCTCAG-3’
Reverse primer (Illumina adapter, barcode, pad, linker, 534R):
5’-CAAGCAGAAGACGGCATACGAGAT TCCCTTGTCTCC ACGTACGTAC CCG ATTACCGCGGCTGCTGG-3’
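The modular layout of the two primers can be made explicit in a short Python sketch. The segment sequences are copied from the slide, and the labels assume the space-separated blocks follow the order of the slide's annotations. The helper names and the replacement barcode are hypothetical and only show how a different barcode could be swapped in per sample.

```python
# Segments copied from the slide; labels assume the space-separated blocks
# follow the order of the slide's annotations.
FORWARD_PARTS = [
    ("Illumina adapter", "AATGATACGGCGACCACCGAGATCTACAC"),
    ("pad", "GTACGTACG"),
    ("linker", "GT"),
    ("27F", "AGAGTTTGATCCTGGCTCAG"),
]
REVERSE_PARTS = [
    ("Illumina adapter", "CAAGCAGAAGACGGCATACGAGAT"),
    ("barcode", "TCCCTTGTCTCC"),
    ("pad", "ACGTACGTAC"),
    ("linker", "CCG"),
    ("534R", "ATTACCGCGGCTGCTGG"),
]


def assemble(parts):
    """Join the segments 5' to 3' into one primer sequence."""
    primer = "".join(seq for _, seq in parts)
    if set(primer) - set("ACGT"):
        raise ValueError("primer contains non-ACGT characters")
    return primer


def with_barcode(parts, barcode):
    """Return a copy of the segment list with the barcode segment replaced."""
    return [(name, barcode if name == "barcode" else seq) for name, seq in parts]


forward = assemble(FORWARD_PARTS)
reverse = assemble(REVERSE_PARTS)

# Hypothetical second barcode, used only to illustrate per-sample indexing.
other_reverse = assemble(with_barcode(REVERSE_PARTS, "ACGAGACTGATT"))

for name, primer in [("forward", forward), ("reverse", reverse)]:
    print(name, len(primer), primer)
```

Only the barcode changes between samples; the adapter, pad, linker and 16S-specific part stay the same, which is what allows many samples to be pooled in one sequencing run.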
[Figure: PCR cycles 1 to 3 around the target region]
18. Typical workflow
Sampling
Extraction
Sample prep
Sequencing
Bioinformatics
How many sequences are needed? It depends on your question!
(Although 50,000 raw sequences per sample is usually enough.)
AAU raw kit and chemical costs (DKK)
                                        Cost      Cost v2
DNA extraction                          105       70 (a)
Library preparation                     40        40
Sequencing (min 100k reads / sample)    190 (b)   70 (c)
Total                                   335       180
(a) Kits discounted
(b) 50 samples per run
(c) 150 samples per run (can run up to 300)
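The totals in the cost table can be checked, and scaled to a project, with a few lines of Python. The per-step figures are taken from the slide. Treating them as per-sample costs and using 96 samples are assumptions made only for this illustration.

```python
# Costs in DKK, copied from the slide (assumed here to be per sample).
COSTS_DKK = {
    "Cost": {
        "DNA extraction": 105,
        "Library preparation": 40,
        "Sequencing": 190,
    },
    "Cost v2": {
        "DNA extraction": 70,
        "Library preparation": 40,
        "Sequencing": 70,
    },
}


def total_cost(steps):
    """Sum the cost of all workflow steps."""
    return sum(steps.values())


def project_cost(steps, n_samples):
    """Scale the summed cost to a number of samples."""
    return total_cost(steps) * n_samples


N_SAMPLES = 96  # hypothetical project size

for scheme, steps in COSTS_DKK.items():
    print(scheme, total_cost(steps), project_cost(steps, N_SAMPLES))
```

Most of the difference between the two columns comes from sequencing, where more samples share one run.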
22. Typical workflow
Sampling
Extraction
Sample prep
Sequencing
Bioinformatics
Merge
Cluster
OTU counts: 3, 11, 3
Assign taxonomy (compare to database)
OTU table
Count  Taxonomy
3      Accumulibacter
11     Unknown
3      Competibacter
Removing unique sequences makes the subsequent steps 10-100x faster and removes the majority of errors and chimeras.
This depends on sequencing depth and sample complexity. Be careful!
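The idea of removing unique sequences before clustering can be sketched in Python as dereplication followed by a minimum-count filter. This is a simplified illustration, not the implementation used in any particular pipeline; the reads are hypothetical toy data and the function names are illustrative.

```python
from collections import Counter


def dereplicate(reads):
    """Collapse identical reads and count how often each sequence occurs."""
    return Counter(reads)


def drop_unique(counts, min_count=2):
    """Keep only sequences observed at least `min_count` times."""
    return {seq: n for seq, n in counts.items() if n >= min_count}


# Hypothetical toy reads: two real sequences plus two one-off variants,
# as would be produced by sequencing errors or chimeras.
toy_reads = [
    "ACGTACGTAC", "ACGTACGTAC", "ACGTACGTAC",
    "TTGACCGTAA", "TTGACCGTAA",
    "ACGTACGTAT",  # single occurrence
    "TTGACCGAAA",  # single occurrence
]

counts = dereplicate(toy_reads)
kept = drop_unique(counts)

print(len(counts), "distinct sequences before filtering")
print(len(kept), "distinct sequences after filtering")
```

With shallow sequencing or very diverse samples, many genuine low-abundance organisms are also seen only once, which is why the filter has to be used with care.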
23. AAU workflow
Sampling
Extraction
Sample prep
Sequencing
Bioinformatics
Find sample IDs on Google Drive
Plain text file
16SAMP-145
16SAMP-146
16SAMP-147
16SAMP-148
16SAMP-149
16SAMP-150
16S.V13.workflow.sh
OTU table (+ R version)
A  B  OTU
2  1  Accumulibacter
3  8  Unknown
3  0  Competibacter
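As a minimal sketch of what can be done with such an OTU table, the Python below stores the example counts for samples A and B and converts them to relative abundance in percent. The dictionary layout and function names are assumptions for illustration and do not reflect the format written by the workflow script.

```python
# Example counts from the slide: taxon -> {sample: read count}.
otu_table = {
    "Accumulibacter": {"A": 2, "B": 1},
    "Unknown": {"A": 3, "B": 8},
    "Competibacter": {"A": 3, "B": 0},
}


def sample_totals(table):
    """Total number of reads per sample."""
    totals = {}
    for counts in table.values():
        for sample, n in counts.items():
            totals[sample] = totals.get(sample, 0) + n
    return totals


def relative_abundance(table):
    """Convert read counts to percent of reads within each sample."""
    totals = sample_totals(table)
    return {
        taxon: {s: 100 * n / totals[s] for s, n in counts.items()}
        for taxon, counts in table.items()
    }


totals = sample_totals(otu_table)
percent = relative_abundance(otu_table)

for taxon, values in percent.items():
    print(taxon, {s: round(v, 1) for s, v in values.items()})
```

Converting to relative abundance makes samples with different read depths comparable, which matters because sequencing depth usually differs between samples.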
24. AAU workflow
Sampling
Extraction
Sample prep
Sequencing
Bioinformatics
What 16S.V13.workflow.sh does:
1. Find and unpack your samples
2. Optional subsampling
3. Remove potential phiX contamination (bowtie2)
4. Merge read 1 and read 2 (FLASH)
5. Remove reads outside length criteria
6. Optional removal of unique reads and subsampling to even depth
7. Format reads for QIIME
8. Cluster reads to OTUs (UCLUST, QIIME)
9. Assign taxonomy (RDP classifier, QIIME + database: MiDAS, Greengenes or SILVA)
10. Generate OTU table (QIIME)
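Steps 5 and 6 of the list above can be illustrated with a small Python sketch. It is not the code in 16S.V13.workflow.sh: the length limits, depth, sample names and reads are hypothetical, and dropping samples that are too shallow is an assumption of this example.

```python
import random


def filter_by_length(reads, min_len, max_len):
    """Keep reads whose length lies within [min_len, max_len]."""
    return [read for read in reads if min_len <= len(read) <= max_len]


def subsample_even(samples, depth, seed=0):
    """Randomly draw `depth` reads per sample, without replacement.

    Samples with fewer than `depth` reads are dropped (an assumption of
    this sketch; other strategies are possible).
    """
    rng = random.Random(seed)  # fixed seed so the draw is reproducible
    return {
        name: rng.sample(reads, depth)
        for name, reads in samples.items()
        if len(reads) >= depth
    }


# Hypothetical toy data: sample name -> merged reads.
toy_samples = {
    "sample_1": ["ACGT" * 3, "ACGT" * 3, "ACGT" * 4, "AC", "ACGT" * 10],
    "sample_2": ["ACGT" * 3, "ACGT" * 4, "ACGT" * 4],
    "sample_3": ["ACGT" * 3, "AC"],
}

filtered = {
    name: filter_by_length(reads, min_len=10, max_len=20)
    for name, reads in toy_samples.items()
}
even = subsample_even(filtered, depth=2)

for name, reads in even.items():
    print(name, len(reads))
```

Subsampling to an even depth discards data, so the chosen depth is a trade-off between keeping shallow samples and keeping reads in deep ones.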
25. Where do I start?
• Get online (Twitter, blogs, seqanswers.com)
• Learn basic multivariate statistics
• Learn R (with RStudio)
• Analyzing Ecological Data (2007) by Zuur, Ieno & Smith