2. What are DNA and genes?
• DNA is the molecule that is the
hereditary material in all living cells.
Genes are made of DNA.
• A gene consists of enough DNA to code
for one protein, and a genome is
simply the sum total of an organism's
DNA.
3. What is the Function of a Gene?
• DNA is pivotal to our growth,
reproduction, and health.
• A gene is the basic physical and
functional unit of heredity.
• It regulates the construction of the
proteins necessary for the cell to
perform all of its functions.
4. Why do we want to know the sequence of an entire
genome?
• To know all the genes – then
proteins, then pathways…
• We can understand:
• the biochemistry of the organism.
• genetic diseases.
• Regulation.
5. History of DNA sequencing
1953: Discovery of the structure of the DNA double helix.
1972: Development of recombinant DNA technology.
1977: The first complete DNA genome to be sequenced is that of bacteriophage φX174; Frederick Sanger publishes "DNA sequencing with chain-terminating inhibitors".
1984: Medical Research Council scientists decipher the complete DNA sequence of the Epstein-Barr virus, 170 kb.
1987: Applied Biosystems markets the first automated sequencing machine, the model ABI 370.
1990: The U.S. National Institutes of Health (NIH) begins large-scale sequencing trials on M. capricolum, E. coli.
1995: Craig Venter, Hamilton Smith and colleagues publish the first complete genome of a bacterium, H. influenzae (whole-genome shotgun sequencing).
1996: Pål Nyrén and his student Mostafa Ronaghi at the Royal Institute of Technology in Stockholm publish their method of pyrosequencing.
1998: Phil Green and Brent Ewing of the University of Washington publish "phred" for sequencer data analysis.
2001: A draft sequence of the human genome is published.
2004: 454 Life Sciences markets a parallelized version of pyrosequencing.
2006: Era of next-generation sequencing: 454 sequencing, Illumina, etc.
6. Era of sequencing
1st generation sequencing:
• Sequence many identical molecules.
• Sequencing in large gels or capillary tubing limits scale.
• Sanger chain termination (1977)
• Maxam-Gilbert sequencing (1977)
7. Era of sequencing
2nd generation sequencing:
• Sequence many identical (clonally amplified) molecules in parallel.
• No large gels or capillary tubing: massively parallel arrays remove that limit on scale.
Platforms:
• Illumina MiSeq
• Life Technologies/Applied Biosystems SOLiD 5500
• Roche/454 pyrosequencer
• QIAGEN GeneReader
8. Gene Sequencing Techniques
• It is also known as DNA sequencing.
• Gene sequencing may be defined as the process of determining the nucleic acid sequence, that is, the order of nucleotides in DNA. It includes any method or technology that is used to determine the order of the four bases: adenine, guanine, cytosine, and thymine.
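In software, the result of sequencing is usually handled as a string over the four bases. The sketch below is only an illustration of that idea: the function name `base_counts` and the example read are made up here and do not come from any particular sequencing toolkit.

```python
from collections import Counter

BASES = "ACGT"  # adenine, cytosine, guanine, thymine


def base_counts(read: str) -> dict:
    """Count each of the four bases in a read; reject any other symbol."""
    read = read.upper()
    unexpected = set(read) - set(BASES)
    if unexpected:
        raise ValueError(f"not a DNA base: {sorted(unexpected)}")
    counts = Counter(read)
    return {base: counts[base] for base in BASES}


# Hypothetical example read, chosen only for illustration.
print(base_counts("GATTACA"))
```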
9. Generation of Gene Sequencing
• 1st Generation sequencing:
• Maxam- Gilbert sequencing
• Sanger sequencing.
• Next Generation sequencing:
• Sequencing by ligation
• Pyrosequencing
• Single molecule real-time sequencing
• Advanced Generation sequencing (shotgun):
• Whole genome shotgun
• Double barrel shotgun
• Hierarchical shotgun
10. Maxam-Gilbert
• Walter Gilbert
• Harvard physicist
• Knew James Watson
• Became intrigued with
the biological side
• Became a biophysicist
• Allan Maxam
11. 1.0 The Maxam-Gilbert
Technique
• Principle: base-specific chemical modification of end-labeled DNA (purines by dimethyl sulphate, pyrimidines by hydrazine), followed by cleavage at the modified bases.
1. Aliquot A + dimethyl sulphate, which methylates guanine
residues
2. Aliquot B + formic acid, which modifies adenine and
guanine residues
3. Aliquot C + Hydrazine, which modifies thymine + cytosine
residues
4. Aliquot D + Hydrazine + 5 mol/l NaCl, which makes the
reaction specific for cytosine
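The four aliquots give four gel lanes (G, A+G, C+T, C), and the base at each position follows from which lanes show a band. The sketch below illustrates that read-out logic only. As a simplifying assumption, each band is identified by the 1-based position of the modified base rather than by a real fragment length, and the function names are invented for this example.

```python
LANES = ("G", "A+G", "C+T", "C")


def simulate_lanes(seq: str) -> dict:
    """Return, per lane, the positions at which a band would appear."""
    lanes = {name: set() for name in LANES}
    for pos, base in enumerate(seq, start=1):
        if base == "G":          # hit by aliquots A (G) and B (A+G)
            lanes["G"].add(pos)
            lanes["A+G"].add(pos)
        elif base == "A":        # hit by aliquot B only
            lanes["A+G"].add(pos)
        elif base == "C":        # hit by aliquots C (C+T) and D (C)
            lanes["C+T"].add(pos)
            lanes["C"].add(pos)
        elif base == "T":        # hit by aliquot C only
            lanes["C+T"].add(pos)
        else:
            raise ValueError(f"not a DNA base: {base!r}")
    return lanes


def read_lanes(lanes: dict) -> str:
    """Deduce the sequence by comparing the four lanes position by position."""
    positions = sorted(set().union(*lanes.values()))
    bases = []
    for pos in positions:
        if pos in lanes["G"] and pos in lanes["A+G"]:
            bases.append("G")
        elif pos in lanes["A+G"]:
            bases.append("A")
        elif pos in lanes["C"] and pos in lanes["C+T"]:
            bases.append("C")
        elif pos in lanes["C+T"]:
            bases.append("T")
        else:
            raise ValueError(f"inconsistent bands at position {pos}")
    return "".join(bases)


print(read_lanes(simulate_lanes("GATTACA")))
```

A band in both the G and A+G lanes is read as G, while a band in the A+G lane alone is read as A; the pyrimidine lanes work the same way for C and T.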
13. Advantages/disadvantages
Maxam-Gilbert sequencing
• Requires lots of purified DNA, and many intermediate
purification steps
• Relatively short reads
• No automation available (no automated sequencers)
• In contrast, the Sanger sequencing methodology
requires little if any DNA purification, no restriction
digests, and no labeling of the DNA sequencing
template
14. 2.0 Sanger Method:
• Fred Sanger (Nobel Prize in Chemistry, 1958)
• Was originally a
protein chemist
• Made his first mark
in sequencing
proteins
• Made his second
mark in
sequencing RNA
• 1980: second Nobel Prize, for nucleic acid
sequencing (dideoxy method, published 1977)
15. Sanger Method process:
• In-vitro DNA synthesis using 'terminators': dideoxynucleotides that do not permit chain elongation after their incorporation.
• Synthesis terminates at specific nucleotides.
• Requires a primer, DNA polymerase, a template, a mixture of nucleotides, and a detection system.
• Incorporation of dideoxynucleotides into the growing strand terminates synthesis.
• Synthesized strand sizes are determined for each dideoxynucleotide by gel or capillary electrophoresis.
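The chain-termination idea can be sketched in a few lines: each dideoxynucleotide stops synthesis wherever its base would be incorporated, and ordering the terminated fragments by size spells out the new strand. This is a toy model under stated assumptions: the template is written 3'→5' so the new strand reads left to right, every possible termination is observed exactly once, and the function names are hypothetical.

```python
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def terminated_lengths(template: str) -> dict:
    """For each ddNTP, list the strand lengths at which synthesis can stop.

    The new strand is complementary to the template; a ddNTP of a given
    base can end the chain wherever that base would be incorporated.
    """
    new_strand = "".join(COMPLEMENT[base] for base in template)
    lengths = {base: [] for base in "ACGT"}
    for length, base in enumerate(new_strand, start=1):
        lengths[base].append(length)
    return lengths


def read_gel(lengths: dict) -> str:
    """Read the synthesized strand by ordering fragments from short to long."""
    bands = sorted((n, base) for base, ns in lengths.items() for n in ns)
    return "".join(base for _, base in bands)


fragments = terminated_lengths("TACG")  # hypothetical template
print(fragments)
print(read_gel(fragments))
```

Sorting by fragment length plays the role of electrophoresis here: the shortest fragment reveals the first base added, the next shortest the second, and so on.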
19. 2.0 Advanced Generation of sequencing:
• Clearly, sequencing 1500 bases at a time is not going to work if we ever want to make real progress.
• So, what do the professionals do?
• They use genome sequencing strategies.
• We will talk about three 'classical' methods:
• Whole-genome shotgun
• Double-barrel shotgun
• Hierarchical shotgun
21. 2.2 Double-barrel shotgun
• Double-barrel shotgun sequencing is also referred to as
“pairwise‐end sequencing”.
• Same as Whole‐genome shotgun with one difference.
• Sequencing is performed from both ends of DNA inserts as opposed to just one. The method was conceived to reduce "gaps" and to reduce assembly errors.
DISADVANTAGE:
• More data is generated, so assembly is more difficult.
ADVANTAGE:
• Theoretically it is very accurate.
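Reading an insert from both ends can be sketched as follows. The example assumes the insert is given as its forward strand and that the second read comes from the opposite strand, so it is the reverse complement of the insert's far end. The function names, the insert, and the read length are all made up for illustration.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the opposite strand, read in its own 5'→3' direction."""
    return seq.translate(COMPLEMENT)[::-1]


def paired_end_reads(insert: str, read_len: int) -> tuple:
    """Read `read_len` bases from each end of an insert."""
    forward = insert[:read_len]
    reverse = reverse_complement(insert[-read_len:])
    return forward, reverse


# Hypothetical 10-base insert and 3-base reads.
print(paired_end_reads("AACCGGTTAC", 3))
```

Because the two reads of a pair come from the same insert, they are a known approximate distance apart, which is what lets an assembler bridge gaps between them.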
23. 3.0 NEXT GENERATION SEQUENCING
• Next-generation sequencing (NGS), also known as high-throughput
sequencing, is the catch-all term used to
describe a number of different modern sequencing
technologies including:
• Illumina (Solexa) sequencing
• Roche 454 sequencing
• SOLiD sequencing
• Single Molecule Real Time Sequencing (SMRT):
25. NGS WORKFLOW
1. Sample extraction, DNA fragmentation and in vitro adapter ligation
2. Clonal amplification, then sequencing:
• Emulsion PCR → sequencing by ligation (SOLiD platform)
• Emulsion PCR → pyrosequencing (454 sequencing)
• Bridge PCR → sequencing by synthesis (Solexa technology)
26. NGS WORKFLOW
1. Create DNA fragments.
2. Add platform-specific adapter sequences to every fragment.
• Adapter molecules: bind the library to a flow cell or bead; add sequencing primer binding sites; add barcodes for multiplexing.
[Figure: adapter molecule, adapter ligation point, and adapter molecule bound to DNA]
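To show what barcodes for multiplexing buy you, the sketch below attaches an adapter and a sample barcode to each fragment, pools the molecules, and then sorts them back out by barcode. The adapter sequences `LEFT` and `RIGHT`, the barcodes, and the function names are placeholders invented for this example, not real platform adapters.

```python
LEFT = "AAAA"    # placeholder adapter, not a real platform sequence
RIGHT = "TTTT"   # placeholder adapter, not a real platform sequence


def add_adapters(fragment: str, barcode: str) -> str:
    """Build a library molecule: adapter + barcode + fragment + adapter."""
    return LEFT + barcode + fragment + RIGHT


def demultiplex(molecules: list, barcodes: dict) -> dict:
    """Group pooled molecules by sample, stripping adapters and barcode."""
    by_sample = {name: [] for name in barcodes}
    for molecule in molecules:
        if not (molecule.startswith(LEFT) and molecule.endswith(RIGHT)):
            continue  # no recognizable adapters: skip the molecule
        body = molecule[len(LEFT):len(molecule) - len(RIGHT)]
        for name, barcode in barcodes.items():
            if body.startswith(barcode):
                by_sample[name].append(body[len(barcode):])
                break
    return by_sample


barcodes = {"sample1": "ACG", "sample2": "TGC"}  # hypothetical barcodes
pool = [add_adapters("GATTACA", "ACG"), add_adapters("CCGG", "TGC")]
print(demultiplex(pool, barcodes))
```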
28. Cluster Amplification: (Bridge PCR)
• DNA fragments are ligated to adaptors; together
they form the library.
• A solid surface is coated with primers
complementary to the two adaptor sequences
• Isothermal amplification, with one end of each
“bridge” attached to the surface
• Clusters of DNA molecules are generated on the
chip. Each cluster is originated from a single DNA
fragment, and is thus a clonal population.
29. Cluster Amplification
(Emulsion PCR)
• Fragments with adaptors (the library) are PCR amplified within
a water drop in oil.
• One PCR primer is attached to the surface of a bead.
• DNA molecules are synthesized on the beads in the water
droplet. Each bead bears clonal DNA originated from a single
DNA fragment
• Beads (with attached DNA) are then deposited into the wells of
sequencing chips: one well, one bead.
31. 3.2 Pyrosequencing:
• It is a unique detection technology based on the
principle of sequencing-by-synthesis.
• It provides quantitative real-time data without the
need for gels, probes, or labels.
• It is a non-electrophoretic, bioluminescence method
that measures the release of inorganic pyrophosphate
by proportionally converting it into visible light using a
series of enzymatic reactions.
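Because the light signal is proportional to the pyrophosphate released, a run of identical bases gives a proportionally stronger signal in a single nucleotide flow. The sketch below models that as an idealized, noise-free flowgram. The flow order "TACG" and the function names are assumptions made for this example.

```python
def flowgram(synthesized: str, flow_order: str = "TACG") -> list:
    """Return (nucleotide, signal) per flow for the strand being built.

    The signal is the number of bases incorporated in that flow, which
    stands in for light intensity in this idealized model.
    """
    unexpected = set(synthesized) - set(flow_order)
    if unexpected:
        raise ValueError(f"base not in flow order: {sorted(unexpected)}")
    signals = []
    pos = 0
    while pos < len(synthesized):
        for nucleotide in flow_order:
            run = 0
            # Incorporate every consecutive matching base in this flow.
            while pos < len(synthesized) and synthesized[pos] == nucleotide:
                run += 1
                pos += 1
            signals.append((nucleotide, run))
    return signals


def decode(signals: list) -> str:
    """Rebuild the sequence: repeat each nucleotide by its signal."""
    return "".join(nucleotide * run for nucleotide, run in signals)


flows = flowgram("TTAGGC")  # hypothetical strand
print(flows)
print(decode(flows))
```

In this model a flow with signal 0 means the dispensed nucleotide did not match the next template position, and a signal of 2 means a two-base homopolymer was incorporated at once.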
34. 3.1 Sequencing by Ligation
Sequencing by Ligation (SBL) uses the enzyme DNA ligase to identify the
nucleotide present at a given position in a DNA sequence, following the
base-pairing rules.
[Figure: linker with dye]
35. 3.3 Single Molecule Real Time Sequencing
(SMRT):
• Single Molecule Real Time Sequencing (SMRT) is an approach to DNA sequencing offered by Pacific Biosciences.
• DNA polymerase incorporates nucleotides into a growing chain inside a detection volume small enough that only the labeled nucleotide being incorporated is excited and detected.
• In SMRT, differently dyed phospholinked nucleotides, one for each nucleotide type (A, G, T, C), are used so that the specific nucleotide type being incorporated by DNA polymerase during chain extension can be identified.
• For this to be achieved effectively and accurately, a specially designed excitation and detection chamber, called a Zero-Mode Waveguide (ZMW), is used.
• Template + polymerase + phospholinked labeled dNTPs are deposited in the microwells (ZMWs) of a specially designed microarray called a "Sequencing Chip".
• Real-time detection occurs in the ZMW, allowing for real-time sequencing.
39. ILLUMINA/SOLEXA SEQUENCING
Illumina dye sequencing is a technique used to determine the series
of base pairs in DNA, also known as DNA sequencing. The reversible
terminator chemistry concept was invented by Bruno Canard and
Simon Sarfati at the Pasteur Institute in Paris.
• Run time: 1–10 days
• Produces: 2–1000 Gb of sequence
• Read length: 2 x 50 bp – 2 x 250 bp (paired-end)
• Cost: $0.05–$0.40/Mb
• Clonal amplification: bridge PCR
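The per-megabase cost range above translates into a project cost by simple multiplication. The sketch below works one example; the genome size (3,100 Mb) and the 30x coverage are assumptions chosen for illustration and do not appear on the slide, and `run_cost_usd` is a hypothetical helper.

```python
def run_cost_usd(megabases: float, low: float = 0.05, high: float = 0.40) -> tuple:
    """Return the (low, high) cost in USD for sequencing `megabases` Mb."""
    return megabases * low, megabases * high


genome_mb = 3_100   # assumed genome size in Mb (illustrative)
coverage = 30       # assumed sequencing depth (illustrative)

total_mb = genome_mb * coverage
low, high = run_cost_usd(total_mb)
print(f"{total_mb} Mb costs between ${low:,.0f} and ${high:,.0f}")
```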
40. Applications
• DNA sequencing
• Gene Regulation Analysis
• Sequencing-based Transcriptome Analysis
• SNPs and SVs discovery
• Cytogenetic Analysis
• ChIP-sequencing
• Small RNA discovery analysis
41. ROCHE/454 SEQUENCING
• Produces much longer reads, sequencing many fragments at once by reading optical signals as bases are added.
• The DNA or RNA is fragmented into pieces of up to 1 kb.
• Uses emulsion PCR for clonal amplification.
• Uses pyrosequencing as the sequencing approach.
42. • The sequence reads we get from 454 all have
different lengths, because a different number of
bases is added in each cycle.
Application:
• Whole genome sequencing
• Targeted resequencing
• Sequencing-based Transcriptome Analysis
• Metagenomics
43. LIFE/APG/ABI- SOLiD SEQUENCING
AB SOLiD™ 3 System generates over 20 gigabases & 400 M tags per run.
Workflow:
1. Library preparation
2. Emulsion PCR / bead enrichment
3. Bead deposition (chemical crosslinking to an amino-coated glass surface)
4. Sequencing by ligation
45. Application of gene sequencing
• Information obtained using sequencing allows researchers to identify
changes in genes, associations with diseases and phenotypes, and
identify potential drug targets.
• Used in evolutionary biology to study how different organisms are
related and how they evolved.
• In forensic science, e.g. the DNA fingerprinting technique.
• Useful for determining the risk of genetic disorders.
• DNA sequencing may be useful for identifying a specific bacterium, to
allow for more precise antibiotic treatments.
• Viral sequencing (gene sequencing) can be used during epidemics to
determine the origins of an outbreak using the molecular clock technique.
• Mutation discovery
• Transcriptome Analysis – RNA-Seq
• Sequencing clinical isolates in strain-to-reference mechanisms.
• Discovering non-coding RNAs
• Molecular diagnostics for Oncology & Inherited Disease study.
• Gene Regulation Analysis
• Exploring Chromatin Packaging