The document discusses DNA sequencing techniques. It defines DNA sequencing as determining the exact order of nucleotides within a DNA molecule. The first DNA sequences were obtained in the 1970s using 2D chromatography. Sanger and Maxam-Gilbert sequencing were the first generation techniques, with Sanger using DNA polymerase and Maxam-Gilbert using chemical degradation. Next generation sequencing allows millions of reactions in parallel and produces short reads quickly and at low cost without electrophoresis. It utilizes cluster generation and sequencing methods like pyrosequencing, reversible terminators, semiconductor, and ligation. Data analysis involves separating reads, clustering, pairing strands, and aligning to reference genomes.
2. Definition
DNA sequencing is the process of determining the exact sequence of
nucleotides within a DNA molecule.
This means that by sequencing a stretch of DNA, it will be possible to know
the order in which the four nucleotide bases – adenine, guanine, cytosine and
thymine – occur within that nucleic acid molecule.
The first DNA sequences were obtained by academic researchers in the early
1970s, using laboratory methods based on two-dimensional chromatography.
With the development of dye-based sequencing methods with automated
analysis, DNA sequencing has become easier and faster.
3. First generation sequencing
Sanger and Maxam-Gilbert sequencing are classified as first-generation
sequencing technologies; their publication in 1977 initiated the field of DNA
sequencing.
– Sequencing by synthesis (Sanger)
– Sequencing by degradation (Maxam-Gilbert)
4. Maxam-Gilbert sequencing
A first-generation sequencing method, known as the chemical degradation
method.
Performed without DNA cloning.
Relies on the chemical cleavage of nucleotides.
Chemical treatment generates breaks at a small proportion of one or two
of the four nucleotide bases in each of the four reactions (C, T+C, G, A+G).
These reactions produce a series of labeled fragments
that can be separated according to their size by electrophoresis.
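The way a sequence is read from the four reaction lanes can be sketched in Python. This is an idealized illustration rather than a lab protocol: the names `simulate_lanes` and `read_gel` are hypothetical, and each band is assumed to map directly to the position of the cleaved base, ignoring the offsets and unreadable ends of a real gel.

```python
def simulate_lanes(seq):
    """Return the cleaved positions (1-based) seen in each of the four reactions."""
    lanes = {"G": set(), "A+G": set(), "C": set(), "T+C": set()}
    for pos, base in enumerate(seq, start=1):
        if base == "G":
            lanes["G"].add(pos)
            lanes["A+G"].add(pos)
        elif base == "A":
            lanes["A+G"].add(pos)
        elif base == "C":
            lanes["C"].add(pos)
            lanes["T+C"].add(pos)
        elif base == "T":
            lanes["T+C"].add(pos)
        else:
            raise ValueError(f"unexpected base {base!r}")
    return lanes


def read_gel(lanes):
    """Deduce the sequence by comparing bands across the four lanes."""
    positions = sorted(set().union(*lanes.values()))
    bases = []
    for pos in positions:
        if pos in lanes["A+G"]:
            # A band in both G and A+G means G; a band in A+G only means A.
            bases.append("G" if pos in lanes["G"] else "A")
        else:
            # A band in both C and T+C means C; a band in T+C only means T.
            bases.append("C" if pos in lanes["C"] else "T")
    return "".join(bases)


lanes = simulate_lanes("GATTACA")
print(read_gel(lanes))  # GATTACA
```

A and T are never cleaved alone, so they are inferred from bands that appear only in the combined A+G or T+C lane.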
5. Maxam-Gilbert sequencing
Chemicals
• G: dimethyl sulphate followed by piperidine
• G+A: formic acid followed by piperidine
• C+T: hydrazine followed by piperidine
• C: hydrazine in 2 M NaCl followed by piperidine
The method is also considered hazardous
because it uses toxic and radioactive
chemicals.
6. Sanger sequencing
Uses DNA polymerase
All four nucleotides (dNTPs), plus one dideoxynucleotide (ddNTP) per reaction
Random termination at specific bases
Separation by gel electrophoresis
Incorporation of dideoxynucleotides terminates DNA elongation
Individual reactions for each base
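The chain-termination principle on this slide can be illustrated with a small Python sketch. It assumes an idealized experiment in which every possible termination product is present; the names `sanger_fragments` and `read_sequence` are hypothetical, and the sequence read is that of the newly synthesized strand.

```python
def sanger_fragments(synthesized_strand):
    """For each ddNTP reaction, list the lengths of the terminated fragments.

    A fragment of length n ends up in the reaction whose ddNTP matches
    base n of the synthesized strand.
    """
    reactions = {base: [] for base in "ACGT"}
    for length, base in enumerate(synthesized_strand, start=1):
        reactions[base].append(length)
    return reactions


def read_sequence(reactions):
    """Read the gel from the shortest to the longest fragment across all four lanes."""
    bands = sorted(
        (length, base)
        for base, lengths in reactions.items()
        for length in lengths
    )
    return "".join(base for _, base in bands)


reactions = sanger_fragments("ACGTTGCA")
print(reactions["T"])            # [4, 5]
print(read_sequence(reactions))  # ACGTTGCA
```

Sorting by fragment length plays the role of electrophoresis here: the lane in which each successive band appears gives the next base.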
10. Next generation sequencing (NGS)
In 2005, a new generation of sequencers emerged to overcome the
limitations of the first generation.
The basic characteristics of the second generation are:
(1) the generation of many millions of short reads in parallel,
(2) a faster sequencing process compared to the
first generation,
(3) the low cost of sequencing, and
(4) direct detection of the sequencing output without the
need for electrophoresis.
11. Next generation sequencing (NGS)
Next generation sequencing, also called massively parallel sequencing,
allows millions of sequencing reactions to be carried out in
parallel.
It allows multiplexing of different patient samples.
Sequencing and detection take place simultaneously.
Sequencing is performed on clonally amplified DNA templates, each
amplified from a single fragment.
It mostly produces short reads (<400 bp).
Read numbers vary from ~1 million to ~1 billion per run.
12. Next generation sequencing (NGS)
With massively parallel sequencing, new methods for sequencing template
preparation are required.
Current NGS platforms utilize clonal amplification on solid supports via
two main methods:
– emulsion PCR (emPCR)
– bridge amplification (DNA cluster generation)
17. Cluster generation
Emulsion PCR
Emulsion PCR is a method of clonal amplification which allows for millions of
unique PCRs to be performed at once through the generation of micro-reactors.
18. Cluster generation
Bridge amplification (Illumina)
• Takes place on the sequencing instrument (flow cell).
The surface of the flow cell is densely coated with primers that are
complementary to the primers attached to the DNA library fragments
19. Sequencing
There are four main sequencing methods:
Pyrosequencing (454)
Reversible terminator sequencing (Illumina)
Sequencing by ligation (SOLiD)
Semiconductor sequencing (Ion Torrent)
20. Pyrosequencing
Roche/454 sequencing
DNA samples are randomly fragmented and each fragment is attached to a
bead.
Each bead is then isolated and its fragment is amplified using PCR.
The beads are then transferred to a plate containing many wells, called a
picotiter plate (PTP).
The pyrosequencing technique is then applied: a series of
downstream reactions produces light at each nucleotide incorporation.
By detecting the light emission, the sequence of the DNA fragment is deduced.
The picotiter plate allows hundreds of thousands of reactions to occur in parallel,
with read lengths of up to 1000 bp and ~1 million reads per run.
22. Pyrosequencing
Disadvantage
The main sequencing errors are insertions and deletions due to the
presence of homopolymer regions. The length of a homopolymer
has to be determined from the intensity of the light emitted during pyrosequencing.
Signals with too high or too low an intensity lead to over- or underestimation of the
number of nucleotides, which causes errors in nucleotide identification.
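The homopolymer problem described above can be made concrete with a short Python sketch. It assumes an idealized flowgram in which the normalized light signal of each flow is proportional to the number of bases incorporated; the function name `call_flowgram`, the flow order `TACG`, and the signal values are illustrative assumptions, not instrument output.

```python
def call_flowgram(signals, flow_order="TACG"):
    """Turn per-flow light intensities into a base sequence.

    Each flow adds one nucleotide type; the signal is rounded to the nearest
    whole number to estimate how many bases were incorporated.
    """
    bases = []
    for i, signal in enumerate(signals):
        base = flow_order[i % len(flow_order)]
        count = int(signal + 0.5)  # round to the nearest integer
        bases.append(base * count)
    return "".join(bases)


clean = [1.0, 0.1, 2.1, 0.0, 0.0, 3.4, 0.9, 0.0]
noisy = [1.0, 0.1, 2.1, 0.0, 0.0, 3.6, 0.9, 0.0]  # A signal slightly too high

print(call_flowgram(clean))  # TCCAAAC
print(call_flowgram(noisy))  # TCCAAAAC
```

A shift of only 0.2 in the signal of the A homopolymer is enough to turn three A's into four, which shows up as an insertion in the read.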
23. Reversible terminator sequencing (Illumina)
In the first step, the DNA samples are randomly fragmented and
adapters are ligated to both ends of each fragment.
These adapters then bind to their complementary
adapters.
The latter are anchored on a slide, a solid plate carrying many variants of
adapters.
The bound fragments are amplified by bridge PCR amplification.
Illumina uses the sequencing-by-synthesis approach:
four modified nucleotides, sequencing primers and DNA polymerases are
added as a mix, and the primers are hybridized to the sequences.
The nucleotides are fluorescently labeled.
24. Reversible terminator sequencing (Illumina)
Clusters are excited by a laser to emit a light signal specific to each
nucleotide, which is detected by a charge-coupled device (CCD)
camera.
Computer programs translate these signals into a nucleotide
sequence.
Read lengths are about 125 bp, and the output of Illumina sequencers is currently
higher than 600 Gbp.
Drawbacks
The Illumina/Solexa platform has a high requirement for sample loading control,
because overloading can result in overlapping clusters and poor sequencing
quality.
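The translation of fluorescence signals into bases, and the effect of overlapping clusters, can be sketched as follows. This is a simplified illustration and not the vendor's base-calling software: the function `call_bases`, the channel order, the intensity values and the `min_purity` threshold are all assumptions made for the example.

```python
def call_bases(cycles, channels="ACGT", min_purity=0.6):
    """Call one base per cycle from four fluorescence intensities.

    The brightest channel wins. If the brightest signal is not clearly
    stronger than the second brightest (as can happen when two clusters
    overlap), the base is reported as 'N'.
    """
    calls = []
    for intensities in cycles:
        ranked = sorted(zip(intensities, channels), reverse=True)
        (top, base), (second, _) = ranked[0], ranked[1]
        total = top + second
        if total == 0 or top / total < min_purity:
            calls.append("N")
        else:
            calls.append(base)
    return "".join(calls)


cycles = [
    (0.90, 0.10, 0.00, 0.10),  # clear A
    (0.10, 0.80, 0.10, 0.00),  # clear C
    (0.50, 0.00, 0.00, 0.45),  # mixed A/T signal, e.g. overlapping clusters
    (0.00, 0.10, 0.10, 0.70),  # clear T
]
print(call_bases(cycles))  # ACNT
```

The third cycle shows why loading control matters: when two clusters contribute comparable signal, no single base can be called with confidence.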
26. Semiconductor sequencing (Ion Torrent)
Detection of hydrogen ions released during DNA polymerization
Sequencing occurs in micro wells with ion sensors
No modified nucleotides
No optics
Fragments attached to beads are placed in micro wells
Read lengths of 200 bp, 400 bp and 600 bp, with a throughput
that can reach 10 Gb for
the Ion Proton sequencer
27. Semiconductor sequencing (Ion Torrent)
• The major advantages of this sequencing
method are its read lengths, which are
longer than those of other SGS sequencers, and its fast
sequencing time of between 2 and 8 hours.
• The major disadvantage is the difficulty of
interpreting homopolymer sequences.
28. Sequencing by ligation (SOLiD)
Adapters are first attached to the DNA fragments, which are fixed on beads and
clonally amplified by emulsion PCR.
These beads are then placed on a glass slide.
8-mer probes with a fluorescent label at the end are sequentially ligated to
the DNA fragments,
and the color emitted by the label is recorded.
The output format is color space, an encoded form of the
nucleotide sequence in which four fluorescent colors represent the 16 possible
combinations of two bases.
The data recovered in color space can be translated to the letters of the
DNA bases, and the sequence of the DNA fragment can be deduced.
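How four colors can cover 16 dinucleotides, and how color space is translated back to bases, can be sketched in Python. The sketch uses the commonly described two-base encoding in which the color is the XOR of two-bit base codes (A=0, C=1, G=2, T=3); the function names `encode` and `decode` are hypothetical, and the first base is assumed to be known, as decoding requires.

```python
BASE_TO_CODE = {"A": 0, "C": 1, "G": 2, "T": 3}
CODE_TO_BASE = "ACGT"


def encode(seq):
    """Return the first base and one color (0-3) per adjacent pair of bases."""
    colors = [BASE_TO_CODE[a] ^ BASE_TO_CODE[b] for a, b in zip(seq, seq[1:])]
    return seq[0], colors


def decode(first_base, colors):
    """Rebuild the base sequence from a known first base and the color calls."""
    bases = [first_base]
    for color in colors:
        previous = BASE_TO_CODE[bases[-1]]
        bases.append(CODE_TO_BASE[previous ^ color])
    return "".join(bases)


first, colors = encode("ATGGC")
print(colors)                 # [3, 1, 0, 3]
print(decode(first, colors))  # ATGGC
```

Each color stands for four of the 16 dinucleotides (for example, color 0 covers AA, CC, GG and TT), so a color alone is ambiguous and only becomes a base once the preceding base is known. A single wrong color therefore changes every base decoded after it.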
29. Sequencing by ligation (SOLiD)
The first SOLiD sequencers produced short reads with a length of 35 bp and an output of 3
Gb/run; later improvements increased the
read length to 75 bp with an output of up to 30 Gb/run.
Sequencing errors in this technology are due to noise during the
ligation cycle, which causes misidentification of bases.
31. Data analysis
Sequences produced from a pooled library are separated on the basis of the
indices introduced during library preparation.
Reads with similar stretches are locally clustered.
Forward and reverse strands are paired to create contiguous sequences.
These sequences are aligned to the reference genome for variant
identification.
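The first analysis step, separating pooled reads by index, can be sketched in Python. This is a minimal illustration that assumes exact index matches and reads already given as (index, sequence) pairs; the function `demultiplex`, the sample names and the index sequences are made up for the example.

```python
from collections import defaultdict


def demultiplex(reads, sample_indices):
    """Group reads by the sample whose index they carry.

    reads: iterable of (index_sequence, read_sequence) pairs.
    sample_indices: dict mapping index sequence to sample name.
    Reads with an unknown index go to the 'undetermined' group.
    """
    by_sample = defaultdict(list)
    for index, sequence in reads:
        sample = sample_indices.get(index, "undetermined")
        by_sample[sample].append(sequence)
    return dict(by_sample)


sample_indices = {"ACGT": "patient_1", "GGCA": "patient_2"}
reads = [
    ("ACGT", "TTGACC"),
    ("GGCA", "CCATGA"),
    ("ACGT", "GGGTTT"),
    ("NNNN", "AAAAAA"),
]
for sample, sequences in demultiplex(reads, sample_indices).items():
    print(sample, sequences)
```

Real pipelines usually tolerate one or more mismatches in the index; exact matching is used here only to keep the example short.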