1. Next-generation sequencing methods such as Roche 454, Illumina GAII, and ABI SOLiD achieve high-throughput DNA sequencing through massively parallel sequencing.
2. These methods involve clonal amplification of DNA fragments, either on a solid surface or by emulsion PCR, followed by pyrosequencing, sequencing by synthesis with reversible terminators, or sequencing by ligation.
3. The large volumes of sequence data produced require high-throughput data management and analysis pipelines.
1. High throughput DNA sequencing
Cosentino Cristian, PhD
Genomics and Bioinformatics unit
Filarete Foundation – Milan
cosentia@gmail.com
2. Summary
1 Classical sequencing method (Sanger)
2 Next-generation sequencing methods
Roche 454
ABI SOLiD
Illumina GAII
3 High throughput data management
4 High throughput sample preparation
6 Next-NGS sequencing
Helicos HeliScope
4. Approaching NGS
Timeline, 1953 to 2010:
1953 Discovery of DNA structure (Cold Spring Harb. Symp. Quant. Biol. 1953;18:123-31)
1977 Sanger sequencing method by F. Sanger (PNAS, 1977, 74: 5463-5467)
1983 PCR by K. Mullis (Cold Spring Harb Symp Quant Biol. 1986;51 Pt 1:263-73)
1993 Development of pyrosequencing (Anal. Biochem., 1993, 208: 171-175; Science, 1998, 281: 363-365)
1998 Single molecule emulsion PCR
1998 Solexa founded
2000 454 Life Sciences founded
2001 Human Genome Project (Nature, 2001, 409: 860–92; Science, 2001, 291: 1304–1351)
2005 454 GS20 sequencer (first NGS sequencer)
2006 Solexa Genome Analyzer (first short-read NGS sequencer)
2006 Illumina acquires Solexa (Illumina enters the NGS business)
2007 ABI SOLiD (short-read sequencer based upon ligation)
2007 Roche acquires 454 Life Sciences (Roche enters the NGS business)
2008 GS FLX sequencer (NGS with 400-500 bp read length)
2008 NGS human genome sequencing (first human genome sequencing based upon NGS technology)
2010 HiSeq 2000 (200 Gbp per flow cell)
5. Sequencing technologies
DNA sequencing
• Classical approach: individual sequencing reaction
  Platform: Sanger method
  Output: single sequence ranging from 500 to 1000 bp
• Next-generation sequencing: massively parallel sequencing
  Clonally amplified DNAs (NGS): Roche 454, Illumina GAII, ABI SOLiD
  Single molecule DNA (N-NGS): Helicos HeliScope
  Output: Gbp of sequences ranging from 25 to 500 bp (high throughput)
6. Sanger method
Sanger method with labeled dNTPs
- The Sanger method is based on the idea that chain-terminating inhibitors (dideoxynucleotides) can terminate DNA elongation at specific points
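The chain-termination idea can be illustrated with a toy model. This is a minimal sketch, not the laboratory protocol: the function names are hypothetical, the template is assumed to be written 3'->5', and termination is assumed to be possible at every position.

```python
# Toy model of Sanger chain termination (illustrative only).
# All names are hypothetical; nothing here comes from the slides.

COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def terminated_fragments(template: str) -> list[str]:
    """Return every chain-terminated extension product of the template.

    The template is given 3'->5', so the copied strand reads 5'->3'.
    A terminator may be incorporated at any position, which yields one
    fragment per possible stop point.
    """
    new_strand = "".join(COMPLEMENT[base] for base in template)
    return [new_strand[:stop] for stop in range(1, len(new_strand) + 1)]


def read_from_fragments(fragments: list[str]) -> str:
    """Sort fragments by length (the job of the gel or capillary) and
    read the labeled terminal base of each fragment in turn."""
    ordered = sorted(fragments, key=len)
    return "".join(fragment[-1] for fragment in ordered)


fragments = terminated_fragments("TACGGT")
print(fragments)
print(read_from_fragments(fragments))  # ATGCCA
```

Sorting by length stands in for electrophoresis: the shortest fragment reveals the first base and each longer fragment reveals the next one.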
8. Next-generation sequencing platforms
Workflow: isolation and purification of target DNA → sample preparation → library validation → amplification → sequencing → imaging → data analysis
Amplification: cluster generation on solid-phase, or emulsion PCR
Sequencing chemistry by platform:
• Illumina GAII: sequencing by synthesis with 3'-blocked reversible terminators
• Roche 454: pyrosequencing
• ABI SOLiD: sequencing by ligation
• Helicos HeliScope: sequencing by synthesis with 3'-unblocked reversible terminators
Imaging: four colour imaging or single colour imaging
15. ABI SOLiD
Sequencing by ligation
Annu. Rev. Genomics Hum. Genet., 2008, 9: 387-402
5 universal primer rounds (n to n-4), each with 7 probe ligations: 35 bp reads
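The arithmetic behind "5 rounds x 7 ligations = 35 bp" can be checked with a short sketch. The slide does not give the probe geometry, so the following are assumptions: each probe interrogates two adjacent bases, each ligation advances 5 bases, and the five primer rounds are shifted by one base relative to each other. The real chemistry may order the shifts differently.

```python
# Sketch: which dinucleotides 5 primer rounds x 7 ligations interrogate.
# Assumptions (not on the slide): each probe reads two adjacent bases,
# each ligation advances 5 bases, and successive primer rounds are
# shifted by one base. Position 0 is the last adapter base.

PRIMER_ROUNDS = 5
LIGATIONS_PER_ROUND = 7
STEP = 5  # assumed number of bases advanced per ligation


def dinucleotide_starts(round_shift: int) -> list[int]:
    """Start positions of the base pairs read in one primer round."""
    return [round_shift + STEP * k for k in range(LIGATIONS_PER_ROUND)]


def base_coverage() -> dict[int, int]:
    """How many times each template base (1..35) is interrogated."""
    read_length = PRIMER_ROUNDS * LIGATIONS_PER_ROUND
    counts = {position: 0 for position in range(1, read_length + 1)}
    for shift in range(PRIMER_ROUNDS):
        for start in dinucleotide_starts(shift):
            for position in (start, start + 1):
                if position in counts:
                    counts[position] += 1
    return counts


all_starts = sorted(
    start for shift in range(PRIMER_ROUNDS) for start in dinucleotide_starts(shift)
)
print(len(all_starts))      # 35 colour calls
print(base_coverage()[10])  # 2: interior bases are read twice
```

Under these assumptions every interior base is interrogated twice, in two different primer rounds, which is what makes error checking with two-base encoding possible.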
16. ABI SOLiD
Colour encoding
Annu. Rev. Genomics Hum. Genet., 2008, 9: 387-402
Base zero is known
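A minimal sketch of two-base colour encoding shows why a known base zero matters. It assumes the commonly described four-colour scheme, in which a colour is the XOR of the 2-bit codes of two adjacent bases (A=0, C=1, G=2, T=3); the function names are hypothetical.

```python
# Sketch of SOLiD-style two-base ("colour space") encoding.
# Assumption: a colour is the XOR of the 2-bit codes of two adjacent
# bases, with A=0, C=1, G=2, T=3.

BASE_TO_CODE = {"A": 0, "C": 1, "G": 2, "T": 3}
CODE_TO_BASE = "ACGT"


def encode(sequence: str) -> list[int]:
    """Turn a base sequence into colours, one per adjacent base pair."""
    codes = [BASE_TO_CODE[base] for base in sequence]
    return [first ^ second for first, second in zip(codes, codes[1:])]


def decode(base_zero: str, colours: list[int]) -> str:
    """Rebuild the bases that follow base zero from the colour calls.

    Each colour is ambiguous on its own (four base pairs share it), so
    decoding only works because base zero is known.
    """
    current = BASE_TO_CODE[base_zero]
    bases = []
    for colour in colours:
        current ^= colour
        bases.append(CODE_TO_BASE[current])
    return "".join(bases)


colours = encode("TACGGA")   # "T" plays the role of base zero
print(colours)               # [3, 1, 3, 0, 2]
print(decode("T", colours))  # ACGGA
```

Because every base takes part in two colours, a single wrong colour call changes all downstream decoded bases, whereas a true single-base variant changes two adjacent colours. That difference is the basis of the error correction attributed to this system.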
17. ABI SOLiD
Base zero
Annu. Rev. Genomics Hum. Genet., 2008, 9: 387-402
18. Illumina GAII
Sequencing by synthesis with reversible terminators
19. Illumina GAII
Instrumentation
Section outline: introduction, sample preparation, clusters amplification, sequencing by synthesis, analysis pipeline, high throughput
• Bioanalyzer 2100
• Cluster station
• Paired-end module
• Genome Analyzer IIx
• Linux server
20. Illumina GAII
GAII applications
Application | Source
De novo gDNA sequencing | gDNA
Whole-genome resequencing | gDNA
Target resequencing | Target enriched DNA sequences
mRNA-seq | Total RNA
small RNA-seq | Total RNA
ChIP-seq | ChIP-DNA fragments
Sequencing modes:
• Single-read
• Paired-end
• Multiplexing
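Multiplexing means several indexed libraries share one lane and are separated again after the run. The sketch below shows only that sorting step. The index sequences, sample names, and the `(index, sequence)` read layout are hypothetical placeholders, not taken from the slides.

```python
from collections import defaultdict

# Hypothetical index-to-sample table; real index sets differ.
SAMPLE_BY_INDEX = {"ATCACG": "sample_1", "CGATGT": "sample_2"}


def demultiplex(reads):
    """Group (index, sequence) pairs by sample.

    Reads whose index is not in the table go to an 'undetermined' bin.
    """
    bins = defaultdict(list)
    for index, sequence in reads:
        sample = SAMPLE_BY_INDEX.get(index, "undetermined")
        bins[sample].append(sequence)
    return dict(bins)


reads = [
    ("ATCACG", "GGCTA"),
    ("CGATGT", "TTAGC"),
    ("ATCACG", "CCGAT"),
    ("NNNNNN", "ACGTA"),
]
print(demultiplex(reads))
```

Exact matching is the simplest policy. Tolerating one mismatch in the index is a common refinement, but it is left out here to keep the example short.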
21. Illumina GAII
Sequencing workflow
Sample preparation and library validation
Cluster station:
• Wash cluster station
• Cluster generation
• Clusters amplification
• Linearization, blocking and primer hybridization
GAIIx & PE module (SBS sequencing):
• Read 1
• Prepare read 2
• Read 2
Analysis:
• Pipeline base call
• Data analysis
22. Illumina GAII
Library preparation
gDNA → fragmented DNA → adaptor-ligated DNA → double strand denaturation
24. Illumina GAII
Cluster generation
Starting material: validated library
Cluster station:
• Cluster station washing (plus weekly maintenance wash)
• Load reagents on Cluster Station
• Load DNA on Cluster Station
• Amplification on Cluster Station
• Linearization, blocking and primer hybridization
Next step: SBS sequencing on GAIIx
25. Illumina GAII
Flow cell
26. Illumina GAII
Bridge amplification
Hybridize adapter-ligated forward
fragment and extend
Extension is completed
Denature dsDNA and wash original forward template;
reverse template stays covalently attached to the array
27. Illumina GAII
Bridge amplification
Bridge amplification of the reverse fragment
Double-strand bridge is formed
Double-strand bridge is denatured, and reverse as well as forward fragments are covalently attached to the array
28. Illumina GAII
Bridge amplification
Bridge amplification is repeated to enlarge the cluster
Double-strand bridges are denatured
Reverse strands fragments are cleaved and washed away
29. Illumina GAII
Bridge amplification
Cluster with forward strands only,
covalently attached to the array
Sequencing primers start the SBS
process
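The bridge amplification steps described on the preceding slides can be summarised as an idealised strand count. This is a sketch under a strong assumption: every attached strand is copied exactly once per cycle, an efficiency that real clusters do not reach. The function names and the cycle count in the example are hypothetical.

```python
def cluster_after_bridge_amplification(cycles: int) -> dict[str, int]:
    """Idealised strand counts for one cluster seeded by one template.

    Assumes every attached strand is copied once per cycle (perfect
    efficiency).
    """
    # After the first extension the original forward template is washed
    # away, so only its attached reverse copy remains.
    forward, reverse = 0, 1
    for _ in range(cycles):
        # Each reverse strand templates a new forward strand and each
        # forward strand templates a new reverse strand.
        forward, reverse = forward + reverse, reverse + forward
    return {"forward": forward, "reverse": reverse}


def linearise(cluster: dict[str, int]) -> dict[str, int]:
    """Cleave and wash away reverse strands, leaving forward strands only."""
    return {"forward": cluster["forward"], "reverse": 0}


cluster = cluster_after_bridge_amplification(10)
print(cluster)             # {'forward': 512, 'reverse': 512}
print(linearise(cluster))  # {'forward': 512, 'reverse': 0}
```

In this model the strand count doubles with each cycle once both orientations are present, and cleaving the reverse strands leaves a cluster in which every molecule has the same orientation for the sequencing primer.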
30. Illumina GAII
Sequencing with GAIIx
Single read, starting from a cluster-amplified flow cell
GAIIx:
• Wash GA & PEM (plus weekly maintenance wash)
• Install prism
• Install flow cell
• Apply oil
• First-base incorporation
• Adjust focus
• Check quality metrics
• 36-100 cycles sequencing run for Read 1/2, with real-time monitoring
• Post-run wash
Linux server:
• Analysis pipeline
31. Illumina GAII
SBS technology
32. Illumina GAII
GAIIx optical path
• Two colour excitation
• Four colour emission detection
33. Illumina GAII
Paired-end sequencing workflow
GAIIx:
• Wash GA & PEM
• Install prism
• Install flow cell
• Apply oil
• First-base incorporation
• Adjust focus
• Check quality metrics
• 36-100 cycles sequencing run for Read 1/2, with real-time monitoring
• Post-run wash
PEM:
• Prepare Read 2
Linux server:
• Analysis pipeline
35. Illumina GAII
Paired-end technology
Paired-end sequencing runs on the GA and uses reagents from the PE module to perform cluster amplification of the reverse strand
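Because read 2 is taken from the regenerated reverse strand, it corresponds to the reverse complement of the far end of the fragment. A minimal sketch, assuming an idealised error-free fragment; the function names and the 4-base read length are hypothetical.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(sequence: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return sequence.translate(COMPLEMENT)[::-1]


def paired_end_reads(fragment: str, read_length: int) -> tuple[str, str]:
    """Simulate an error-free read pair from one fragment.

    Read 1 starts at the 5' end of the forward strand; read 2 starts at
    the 5' end of the reverse strand.
    """
    read_1 = fragment[:read_length]
    read_2 = reverse_complement(fragment)[:read_length]
    return read_1, read_2


print(paired_end_reads("AACCGGTTAC", 4))  # ('AACC', 'GTAA')
```

The two reads point toward each other across the fragment, so their known approximate distance can help place reads in repetitive regions during alignment.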
36. Illumina GAII
Firecrest and CASAVA
• Firecrest: from image to intensity (image files → intensity files)
• Bustard: from intensity to reads (intensity files → base calls files)
• Gerald: reads alignment (base calls files → alignment files)
• CASAVA: consensus assembly (alignment files → assembly)
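The "from intensity to reads" step can be pictured with a deliberately naive base caller. This is not how Bustard works internally: production base callers also correct for effects such as channel cross-talk, which are ignored here. The channel order, the function names, and the intensity values are all assumptions made for illustration.

```python
CHANNELS = "ACGT"  # assumed channel order


def call_bases(intensities) -> str:
    """Call one base per cycle by picking the brightest of four channels."""
    read = []
    for cycle in intensities:
        brightest = max(range(len(CHANNELS)), key=lambda channel: cycle[channel])
        read.append(CHANNELS[brightest])
    return "".join(read)


def purity(cycle) -> float:
    """Brightest intensity divided by the sum of the two brightest.

    A hypothetical per-cycle confidence score: values near 1.0 mean one
    channel clearly dominates; values near 0.5 mean an ambiguous call.
    """
    top, second = sorted(cycle, reverse=True)[:2]
    return top / (top + second)


cluster_intensities = [
    (900, 50, 30, 20),
    (10, 40, 800, 60),
    (5, 700, 20, 90),
]
print(call_bases(cluster_intensities))  # AGC
print([round(purity(cycle), 2) for cycle in cluster_intensities])
```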
37. NGS technologies comparison
Sequencing | Amplification | Chemistry | Read length (bp) | Run time (d) | Gbp/day | DNA required (μg) | $/sequencer (ref. 2008)
Roche 454 GS FLX Titanium | emPCR | Pyrosequencing | 250-400 * | 0.35 | 1.3 | 3-5 | 500,000
ABI SOLiD | emPCR | Sequencing by ligation | 25-50 | 7-14 | 3.6 | 0.1-20 | 595,000
Illumina GAII | Solid-phase | Reversible terminator | 36-100 | 4-9 | 3.9 | 0.1-1 | 430,000
* Average
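The Gbp/day figures can be turned into a rough instrument-time estimate. The sketch below uses only the throughput values from the table. The 3 Gbp genome size and 30x coverage are illustrative assumptions, not figures from the slides, and the calculation ignores run overheads and failed runs.

```python
# Gbp/day values from the comparison table.
GBP_PER_DAY = {
    "Roche 454 GS FLX Titanium": 1.3,
    "ABI SOLiD": 3.6,
    "Illumina GAII": 3.9,
}


def days_for_coverage(gbp_per_day: float,
                      genome_gbp: float = 3.0,
                      coverage: float = 30.0) -> float:
    """Instrument days needed to produce genome_gbp * coverage of sequence.

    The default genome size and coverage are illustrative assumptions.
    """
    return genome_gbp * coverage / gbp_per_day


for platform, throughput in GBP_PER_DAY.items():
    print(f"{platform}: {days_for_coverage(throughput):.1f} days")
```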
38. NGS technologies comparison
Roche 454 ($/Mbp in 2008*: 60)
Advantages:
• Long reads, even > 400 bp, improving de novo sequencing
• Rare substitution errors
Disadvantages:
• High indel rate in homopolymer stretches > 6 nucleotides
• High reagent cost
• Longest reads only in single-read mode (2x150 bp)
ABI SOLiD ($/Mbp in 2008*: 2)
Advantages:
• Error correction with the two-base encoding system
Disadvantages:
• Long run time
• Needs a cluster station to perform base calling, and up to 1 week to align
• Alignment must be performed against a reference db
Illumina GAII ($/Mbp in 2008*: 2)
Advantages:
• Most widely used platform (> 90 Science/Nature publications)
• Sample preparation automatable
• SBS, real-time analysis and base calling are performed simultaneously with the run
• Automated cluster generation
Disadvantages:
• Low multiplexing capability
• Substitution errors
* Nat. Biotech., 2008, 26: 1135-1145
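The $/Mbp column translates directly into a reagent cost per project. A minimal sketch using the 2008 prices above; the 3,000 Mbp genome and 30x coverage are illustrative assumptions, not figures from the slides.

```python
# $/Mbp values (2008) from the comparison above.
USD_PER_MBP = {"Roche 454": 60, "ABI SOLiD": 2, "Illumina GAII": 2}


def sequencing_cost_usd(usd_per_mbp: int,
                        genome_mbp: int = 3000,
                        coverage: int = 30) -> int:
    """Cost of sequencing genome_mbp at the given coverage.

    The default genome size and coverage are illustrative assumptions.
    """
    return usd_per_mbp * genome_mbp * coverage


for platform, price in USD_PER_MBP.items():
    print(f"{platform}: ${sequencing_cost_usd(price):,}")
```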
40. Illumina GAII
High throughput data storage
• Genotyping units: 0.5-14 GB/BeadChip
• Sequencing unit: 1-6 TB/flow cell
• Data storage management: 200 TB storage capacity
• Tape recording unit for offline backup
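These storage figures imply a limit on how many runs can be kept online. A short sketch, reading the slide's "Tb" as terabytes and assuming the whole capacity is available for sequencing data; the function name is hypothetical.

```python
CAPACITY_TB = 200          # storage capacity from the slide
TB_PER_FLOW_CELL = (1, 6)  # smallest and largest flow cell footprint


def flow_cells_that_fit(capacity_tb: int, tb_per_flow_cell: int) -> int:
    """Number of complete flow cell data sets that fit in the storage."""
    return capacity_tb // tb_per_flow_cell


smallest, largest = TB_PER_FLOW_CELL
print(flow_cells_that_fit(CAPACITY_TB, largest))   # 33
print(flow_cells_that_fit(CAPACITY_TB, smallest))  # 200
```

At the upper end of the per-flow-cell range the storage holds only a few dozen runs, which is one reason an offline tape backup appears alongside it.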
41. Illumina GAII
High throughput data analysis
Server network:
• Data storage
• Sequencing pipeline
• Database server
• Genotyping applications
INSERT: Total Gb RAM, Total CPU, Total Tb storage for genotyping and sequencing services
43. Illumina GAII
High throughput sample preparation
Nature Methods, 2010, 7: 111-118
44. Illumina GAII
High throughput sample preparation
Nature Methods, 2010, 7: 111-118
• RainDance: microdroplet PCR. Reported 84% capture efficiency
• Roche NimbleGen: solid-phase capture with custom-designed oligonucleotide microarray. Reported 65-90% capture efficiency
45. Illumina GAII
High throughput sample preparation
Agilent SureSelect: solution-phase capture with streptavidin-coated magnetic beads
Reported 60-80% capture efficiency
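Capture efficiency determines how much raw sequence is needed to reach a given depth on the target. The sketch below interprets capture efficiency as the fraction of sequenced bases that fall on target, which is an assumption about how the reported figures are defined. The 30 Mbp target size and 40x depth are hypothetical example values.

```python
def required_yield_mbp(target_mbp: float,
                       depth: float,
                       capture_efficiency: float) -> float:
    """Raw sequence (Mbp) needed to reach the given mean depth on target.

    capture_efficiency is taken to be the on-target fraction of all
    sequenced bases (an assumption, see the text above).
    """
    if not 0 < capture_efficiency <= 1:
        raise ValueError("capture_efficiency must be in (0, 1]")
    return target_mbp * depth / capture_efficiency


# Hypothetical 30 Mbp target at 40x depth, across the reported 60-80% range.
for efficiency in (0.6, 0.8):
    needed = required_yield_mbp(30, 40, efficiency)
    print(f"{efficiency:.0%} on target: {needed:.0f} Mbp of raw sequence")
```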
47. HeliScope
Next-NGS: single molecule sequencing
Nature Reviews Genetics, 2010, 11: 31-46
• No gDNA amplification is required, eliminating the bias from clonally amplified templates
• Low amount of starting gDNA (< 1 μg)
• More effective quantification in mRNA-seq without the amplification step
• HeliScope was the first commercialized single molecule sequencer
Poly(A) adaptor linked to the template fragment
One-pass sequencing: poly(T) adaptors are linked to the solid phase
48. HeliScope
SBS with reversible terminators
Nature Reviews Genetics, 2010, 11: 31-46
One-colour, real-time detection, different from the Illumina GAII system
49. Nanopore
Challenges of Next-NGS sequencing
Oxford Nanopore: strand sequencing using ionic current blockage
Pacific Biosciences: real-time DNA sequencing from single polymerase molecules