Phylogenetic Tree Construction
Uddalok Jana (17mslsbf09)
Phylogeny is the evolutionary history of a kind
of organism:
the evolution of a genetically related
group of organisms, as distinguished
from the development of the
individual organism;
more generally, the history or course of the
development of something.
• Phylogeny is the inference of evolutionary
relationships
• All forms of life share a common origin.
– here the goal is to deduce the correct trees for all
species of life
– to estimate the time of divergence between
organisms since the time they last shared a
common ancestor
Cladogram vs. Phylogram
Basics of Tree Construction
Comparison of speciation with genetic change
Species tree versus gene tree
• In a species tree an internal node represents a
speciation event
• In a gene tree an internal node represents the
divergence of an ancestral gene into two new
genes with distinct sequences
• Species tree ≠ gene tree, for example because of:
– horizontal gene transfer
– gene duplications
Species tree versus gene tree
Gray et al.
Phylogenetic Tree development steps
1. Selection of sequences or any parameter for
analysis
2. Multiple sequence alignment
3. Tree building
4. Tree evaluation
DNA:
– Higher phylogenetic signal:
• Synonymous vs. non-synonymous substitutions
(detect negative and positive selection)
Protein:
– Weaker phylogenetic signal than DNA
– Better suited to constructing a tree for evolutionarily distant
species or genes
RNA: rRNA is often used for constructing species
trees
Selection of sequences for analysis
Multiple sequence alignment
• This is a critical step in the analysis: in many cases, placing
amino acids or nucleotides in the same column implies that they share a common
ancestor
• If you misalign a group of sequences you will still be able to produce a
tree. However, it is not likely to be biologically meaningful.
• Garbage in, garbage out!
• Inspect the alignment to be sure that all sequences are homologous
• Sometimes ClustalW does not align distantly related sequences
well. Try different gap opening and gap extension parameters to improve the
alignment
• Only use those columns of the multiple alignment for which you have data
for all organisms or sequences. Delete the columns for which this is not
the case.
• Delete columns with gaps
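The column-filtering advice above can be sketched in a few lines of Python. This is a minimal illustration, assuming the alignment is held as a list of equal-length strings and that gaps are written as "-" or "."; the function name `drop_gap_columns` is hypothetical and not part of any package mentioned in these slides.

```python
def drop_gap_columns(alignment, gap_chars="-."):
    """Return the alignment with every gap-containing column removed."""
    if not alignment:
        return []
    length = len(alignment[0])
    if any(len(seq) != length for seq in alignment):
        raise ValueError("all aligned sequences must have the same length")
    # Keep only the column indices where no sequence has a gap character.
    keep = [i for i in range(length)
            if not any(seq[i] in gap_chars for seq in alignment)]
    return ["".join(seq[i] for i in keep) for seq in alignment]


# Toy alignment (made up for illustration): columns 3 and 4 contain gaps.
aligned = ["ACG-TA", "AC-TTA", "ACGTTA"]
print(drop_gap_columns(aligned))
```

Removing every gapped column is the strictest option; with many sequences it can discard most of the alignment, so in practice a tolerance threshold is sometimes preferred.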
Tree building
Character-based methods
• Based on an explicit model of evolution: Maximum Likelihood methods / Bayesian phylogeny
• Not based on an explicit model of evolution: Maximum Parsimony methods
Non-character-based methods
• Based on an explicit model of evolution: pairwise distance methods
Distance based methods
Distance-based methods:
– calculate the distances between molecular sequences using
a distance measure
– use a clustering method (UPGMA, neighbor joining) to infer
the tree from the pairwise distance matrix
– treat the sequences from a horizontal (parental) perspective, by
calculating a single distance between entire sequences
Advantages:
• Fast
• Allow the use of evolutionary models
Disadvantage:
• Sequence information is reduced to a single number
Character based methods
Character-based methods:
– treat the sequences from a vertical (evolutionary)
perspective.
– for each column of the alignment, they search for the
simplest explanation of how the characters
evolved.
– For instance, MP (Maximum Parsimony) involves a
search for the tree with the fewest amino
acid (or nucleotide) character changes that account
for the observed differences between the protein
(gene) sequences.
Tree evaluation: bootstrapping
• sampling technique for estimating the statistical
error in situations where the underlying sampling
distribution is unknown
• evaluating the reliability of the inferred tree - or
better the reliability of specific branches
How to proceed:
• From the original alignment, columns are
chosen at random by 'sampling with replacement'
• a new alignment is constructed with the same size as the original one
• a tree is constructed from the new alignment
This process is repeated hundreds of times
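The resampling step can be sketched as follows. This is a minimal illustration, assuming the alignment is a list of equal-length strings; the function names and the fixed seed are hypothetical choices, and the tree-building step that would be run on each replicate is left out.

```python
import random


def bootstrap_alignment(alignment, rng):
    """Build one pseudo-alignment by sampling columns with replacement."""
    length = len(alignment[0])
    # Draw as many column indices as the original has columns; repeats are allowed.
    cols = [rng.randrange(length) for _ in range(length)]
    return ["".join(seq[i] for i in cols) for seq in alignment]


def bootstrap_replicates(alignment, n_replicates=100, seed=0):
    """Return n_replicates resampled alignments of the same size as the original."""
    rng = random.Random(seed)  # seeded so that runs are reproducible
    return [bootstrap_alignment(alignment, rng) for _ in range(n_replicates)]


# Toy alignment (made up for illustration).
alignment = ["ACGT", "ACGA", "TCGA"]
replicates = bootstrap_replicates(alignment, n_replicates=100)
print(len(replicates), replicates[0])
```

Whole columns are sampled, not individual characters, so each resampled site keeps the same pattern across all sequences.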
Evaluation
Show bootstrap values on Phylogenetic trees
• majority-rule consensus tree
• map bootstrap values on the original tree
• when evaluating bootstrap values, we check how
often a given branch (grouping) of the tree recurs among
the replicate trees. As a rule of thumb here: 60 out of 100
is taken as significant, more than 50 as acceptable, and
a branch below 50 is rejected.
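Mapping bootstrap values onto a tree amounts to counting how often each grouping recurs. The sketch below assumes, purely for illustration, that every tree has already been reduced to a set of clades, each clade being a frozenset of taxon names; `clade_support` is a hypothetical helper, not a function of any package listed in these slides.

```python
def clade_support(reference_clades, replicate_trees):
    """Percentage of replicate trees that contain each reference clade."""
    n = len(replicate_trees)
    if n == 0:
        raise ValueError("need at least one replicate tree")
    return {
        clade: 100.0 * sum(clade in tree for tree in replicate_trees) / n
        for clade in reference_clades
    }


# Hypothetical toy data: 3 replicates support (A,B) and (C,D), 2 support (A,C).
ab, cd, ac = frozenset("AB"), frozenset("CD"), frozenset("AC")
replicates = [{ab, cd}] * 3 + [{ac}] * 2
for clade, pct in clade_support([ab, cd, ac], replicates).items():
    print(sorted(clade), pct)
```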
Pairwise distance methods
• Distance calculation
• Inferring the tree topology
Pairwise distance methods
Approach:
• align pairs of sequences and count the number of
differences (Hamming distance).
• For an alignment of length N with n sites at which there
are differences: D = (n/N) × 100.
Problem:
• observed differences ≠ actual genetic distances between
the sequences.
=> the observed dissimilarity underestimates the true
evolutionary distance, because some
sequence positions have undergone multiple substitution events
Solution:
• Use an evolutionary model that corrects for multiple
mutations
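As an illustration of the approach, problem and solution above, the sketch below computes the observed proportion of differing sites and then applies the Jukes-Cantor (JC69) correction. JC69 is used here only as one well-known example of a model that corrects for multiple substitutions; it is an assumption of this sketch, not necessarily the model intended on these slides. The function returns a proportion, so multiply by 100 to obtain D as a percentage.

```python
import math


def p_distance(seq_a, seq_b):
    """Observed proportion of differing sites, n / N, for two aligned sequences."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    n_diff = sum(a != b for a, b in zip(seq_a, seq_b))
    return n_diff / len(seq_a)


def jukes_cantor(p):
    """JC69 corrected distance: d = -(3/4) * ln(1 - 4p/3), nucleotides only."""
    if p >= 0.75:
        raise ValueError("JC69 distance is undefined for p >= 0.75")
    return -0.75 * math.log(1.0 - 4.0 * p / 3.0)


# Toy sequences (made up): 2 differences over 10 sites.
p = p_distance("ACGTACGTAC", "ACGTTCGTAA")
print(p, jukes_cantor(p))
```

The corrected distance is always at least as large as the observed one, which reflects the underestimation described above.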
Distance calculation
Pairwise distance methods
Other evolutionary models
Distance calculation
Pairwise distance methods
UPGMA (Unweighted Pair Group Method with
Arithmetic Mean):
This method is generally attributed to Sokal and Michener
• assumes a molecular clock, i.e. that all sequences
evolve at a similar rate
• distance between two OTUs = twice the height of the node joining them
• forces distances to be ultrametric (for any three
species, the two largest distances are equal)
• produces a rooted tree (in this case the root is incorrect
but the topology is otherwise correct)
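The ultrametric property can be checked directly on a distance matrix. The sketch below assumes a symmetric matrix given as a list of lists; the function name and the tolerance parameter are illustrative choices.

```python
from itertools import combinations


def is_ultrametric(d, tol=1e-9):
    """True if, for every triple of taxa, the two largest distances are equal."""
    n = len(d)
    for i, j, k in combinations(range(n), 3):
        _, second, largest = sorted((d[i][j], d[i][k], d[j][k]))
        if abs(largest - second) > tol:
            return False
    return True


# Toy matrices (made up): the first is clock-like, the second is not.
clock_like = [[0, 2, 6, 10],
              [2, 0, 6, 10],
              [6, 6, 0, 10],
              [10, 10, 10, 0]]
not_clock_like = [[0, 5, 9],
                  [5, 0, 10],
                  [9, 10, 0]]
print(is_ultrametric(clock_like), is_ultrametric(not_clock_like))
```

When real data fail this test, UPGMA still returns a tree, but the molecular clock assumption behind it is violated.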
Pairwise distance methods
• when two OTUs are grouped, we treat them as a new single OTU
• when OTUs A, B (which have been grouped before) and C are grouped into a
new node ‘u’, then the distance from node ‘u’ to any other node ‘k’ (e.g. grouping
D and E) is simply computed as follows:
Tree inference: UPGMA
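In the usual formulation of UPGMA, the distance from the new node u to any other node k is the size-weighted average d(u,k) = (n_AB · d(AB,k) + n_C · d(C,k)) / (n_AB + n_C), where n_AB and n_C are the numbers of OTUs in the two clusters being joined. The sketch below implements that textbook rule; the matrix layout, the function name and the nested-tuple output are assumptions made for illustration.

```python
def upgma(names, matrix):
    """Cluster taxa with UPGMA; returns nested tuples (left, right, node_height)."""
    # clusters: id -> (subtree, number of OTUs in the cluster)
    clusters = {i: (name, 1) for i, name in enumerate(names)}
    n = len(names)
    # Distances keyed by (smaller id, larger id).
    d = {(i, j): matrix[i][j] for i in range(n) for j in range(i + 1, n)}
    next_id = n
    while len(clusters) > 1:
        # Join the closest pair of clusters.
        (i, j), dij = min(d.items(), key=lambda kv: kv[1])
        (ti, ni), (tj, nj) = clusters.pop(i), clusters.pop(j)
        u, next_id = next_id, next_id + 1
        # Size-weighted average distance from the new node u to every other node k.
        new_d = {}
        for k in clusters:
            dik = d[(min(i, k), max(i, k))]
            djk = d[(min(j, k), max(j, k))]
            new_d[(k, u)] = (ni * dik + nj * djk) / (ni + nj)
        d = {pair: v for pair, v in d.items() if i not in pair and j not in pair}
        d.update(new_d)
        # Node height is half the distance between the joined clusters.
        clusters[u] = ((ti, tj, dij / 2), ni + nj)
    (tree, _), = clusters.values()
    return tree


# Toy ultrametric matrix (made up for illustration).
names = ["A", "B", "C", "D"]
matrix = [[0, 2, 6, 10],
          [2, 0, 6, 10],
          [6, 6, 0, 10],
          [10, 10, 10, 0]]
print(upgma(names, matrix))
```

Each internal node is stored with its height, which matches the statement that the distance between two OTUs equals twice the height of the node joining them.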
Pairwise distance methods
Advantages:
• Fast
• Allows incorporation of evolutionary models
Disadvantages:
• Assumption of a molecular clock
• Unrealistic evolutionary picture, since all OTUs
end up equidistant from the root.
Tree inference: UPGMA
Neighbor Joining
• Very popular method
• Does not assume a molecular clock:
a modified distance matrix is constructed to adjust
for differences in the evolutionary rate of each taxon
• Produces an unrooted tree
• Assumes additivity: distance between pairs of
leaves = sum of lengths of edges connecting them
• Like UPGMA, constructs tree by sequentially
joining sub-trees
Pairwise distance methods
• Additive distances can be fitted to an unrooted
tree such that the evolutionary distance
between a pair of OTUs equals the sum of the
lengths of the branches connecting them, rather
than being an average as in the case of cluster
analysis
• Tree construction methods: the neighbour
joining (NJ) method, developed by Saitou and
Nei (1987), offers a heuristic approach to solve
this problem
Tree inference: neighbor joining
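The "modified distance matrix" of neighbor joining is usually written Q(i,j) = (n − 2)·d(i,j) − Σ_k d(i,k) − Σ_k d(j,k); the pair with the smallest Q is joined first. The sketch below performs only this first join and computes the two new branch lengths, following the standard Saitou and Nei formulation. The function names and the example matrix are illustrative assumptions.

```python
def nj_q_matrix(d):
    """Rate-corrected matrix Q used by neighbor joining to choose a pair."""
    n = len(d)
    row_sums = [sum(row) for row in d]
    return [[(n - 2) * d[i][j] - row_sums[i] - row_sums[j] if i != j else 0.0
             for j in range(n)] for i in range(n)]


def nj_first_join(d):
    """Return the first pair joined and the branch lengths to their new node."""
    n = len(d)
    q = nj_q_matrix(d)
    pairs = ((i, j) for i in range(n) for j in range(i + 1, n))
    i, j = min(pairs, key=lambda p: q[p[0]][p[1]])
    # Branch lengths differ when the two taxa evolve at different rates.
    length_i = 0.5 * d[i][j] + (sum(d[i]) - sum(d[j])) / (2 * (n - 2))
    length_j = d[i][j] - length_i
    return (i, j), (length_i, length_j)


# Toy additive distance matrix for five taxa (made up for illustration).
d = [[0, 5, 9, 9, 8],
     [5, 0, 10, 10, 9],
     [9, 10, 0, 8, 7],
     [9, 10, 8, 0, 3],
     [8, 9, 7, 3, 0]]
print(nj_first_join(d))
```

Note that the pair with the smallest raw distance (taxa 3 and 4, distance 3) is not the pair joined first; the correction for each taxon's overall divergence is what removes the molecular clock assumption.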
Pairwise distance methods
Advantages:
• Fast
• Allows incorporation of evolutionary models
• No assumption of a molecular clock
Disadvantages:
• The constructed tree is sometimes purely
hypothetical, with no guaranteed correspondence to the
true tree
Tree inference: neighbor joining
Maximum parsimony
Principle
• Select the tree that minimizes the total tree length, i.e. the
number of nucleic acid substitutions or amino acid replacements
required to explain a given set of data.
Method
• a particular topology is considered
• for this topology, the ancestral sequences at each branching point
are reconstructed
• the minimum number of events needed to explain the sequence differences
over the whole tree is computed: the minimum number of
substitutions is computed for each nucleotide (or amino acid) site,
and the numbers for all sites are added
• another tree topology is chosen and the procedure is repeated; the
topology requiring the fewest changes is retained
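The per-site counting step can be sketched with Fitch's algorithm, a standard way to obtain the minimum number of changes for one fixed topology. Fitch's algorithm is named here as an assumption; the slides do not say which algorithm is meant. Trees are written as nested 2-tuples of leaf names and the alignment as a dict, both purely illustrative conventions.

```python
def fitch_site(tree, states):
    """Return (possible ancestral states, minimum changes) for one alignment column."""
    if isinstance(tree, str):  # a leaf: its state is observed, no change needed
        return {states[tree]}, 0
    left, right = tree
    left_set, left_cost = fitch_site(left, states)
    right_set, right_cost = fitch_site(right, states)
    common = left_set & right_set
    if common:  # the children can agree: no extra change at this node
        return common, left_cost + right_cost
    return left_set | right_set, left_cost + right_cost + 1


def parsimony_score(tree, alignment):
    """Tree length: the per-site minimum numbers of changes, summed over all sites."""
    length = len(next(iter(alignment.values())))
    return sum(
        fitch_site(tree, {name: seq[i] for name, seq in alignment.items()})[1]
        for i in range(length)
    )


# Toy alignment (made up for illustration) and the three possible 4-taxon groupings.
alignment = {"A": "AGT", "B": "AGT", "C": "CGC", "D": "CTC"}
for tree in [(("A", "B"), ("C", "D")),
             (("A", "C"), ("B", "D")),
             (("A", "D"), ("B", "C"))]:
    print(tree, parsimony_score(tree, alignment))
```

For this toy data the grouping ((A,B),(C,D)) needs the fewest changes and would be the one retained.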
Maximum parsimony
Number of possible tree topologies for n OTUs:
Rooted trees: $N_R = \frac{(2n-3)!}{2^{n-2}\,(n-2)!}$
Unrooted trees: $N_U = \frac{(2n-5)!}{2^{n-3}\,(n-3)!}$

OTUs   Rooted tree topologies   Unrooted tree topologies
3      3                        1
4      15                       3
5      105                      15
6      945                      105
7      10395                    945
8      135135                   10395
9      2027025                  135135
• Exhaustive search impossible
• Heuristics needed
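The growth in the number of topologies can be reproduced with a short calculation. This sketch uses the standard closed forms N_R = (2n−3)! / (2^(n−2) (n−2)!) for rooted and N_U = (2n−5)! / (2^(n−3) (n−3)!) for unrooted bifurcating trees; the function names are illustrative.

```python
from math import factorial


def n_rooted(n):
    """Number of rooted bifurcating tree topologies for n OTUs (n >= 2)."""
    return factorial(2 * n - 3) // (2 ** (n - 2) * factorial(n - 2))


def n_unrooted(n):
    """Number of unrooted bifurcating tree topologies for n OTUs (n >= 3)."""
    return factorial(2 * n - 5) // (2 ** (n - 3) * factorial(n - 3))


for n in range(3, 10):
    print(n, n_rooted(n), n_unrooted(n))
```

For 6 OTUs this gives 945 rooted topologies, and for 20 OTUs the rooted count already exceeds 10^21, which is why exhaustive search is not feasible beyond small data sets.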
Maximum parsimony
Maximum Parsimony
Assumptions
• Equal rate of evolution in all branches
Advantages
• sequence information is not reduced to one number (such
as for example in pairwise distance methods)
Disadvantages of maximum parsimony methods
• can be slow for very large datasets
• no correction for multiple mutations, i.e. no substitution
model can be applied
• sensitive to unequal rates of evolution in different lineages
Bayesian
Comparison of the character-based methods
Parsimony vs. Maximum Likelihood
• There is an efficient algorithm to calculate the parsimony score for a
given topology, therefore parsimony is faster than ML.
• Parsimony is an approximation to ML when mutations are rare
events.
• Weighted parsimony schemes can be used to treat most of the
different evolutionary models used with ML.
• Parsimony throws away information from non-informative sites
that is informative in ML and distance matrix methods.
• Parsimony gives little information about branch lengths.
• Parsimony is inconsistent in certain cases (Felsenstein zone), and
suffers badly from long branch attraction.
Commonly used Phylogeny packages
• 369 phylogeny packages
(http://evolution.gs.washington.edu/phylip/software.html) and 54 free
servers (as of Sep 30, 2011)
– PHYLIP (general package: protdist, NJ, parsimony, maximum likelihood,
etc.)
– PAUP (parsimony)
– PAML (maximum likelihood)
– TREE-PUZZLE (quartet based)
– PhyML (maximum likelihood)
– MrBayes (Bayesian inference)
– MEGA (biologist-centric)
Thank you...

Phylogenetic tree construction

  • 2. Its the evolutionary history of a kind of organism... the evolution of a genetically related group of organisms as distinguished from the development of the individual organism the history or course of the development of something.
  • 3. • Phylogeny is the inference of evolutionary relationships • All forms of life share a common origin. – here the goal is to deduce the correct trees for all species of life – to estimate the time of divergence between organisms since the time they last shared a common ancestor
  • 5. Basics of Tree Construct
  • 6. Comparison of Speciation with the Genetic change Species tree versus gene tree • In a species tree an internal node represents a speciation event • In a gene tree an internal node represents the divergence of an ancestral gene into two new genes with distinct sequences • Species tree <> Gene tree – horizontal gene transfer – gene duplications
  • 7. Species tree versus gene tree Gray et al.
  • 8. Phylogenetic Tree development steps 1. Selection of sequences or any parameter for analysis 2. Multiple sequence alignment 3. Tree building 4. Tree evaluation
  • 9. DNA: – Higher Phylogenetic signal: • Synonymous vs. non-synonymous substitutions (detect negative and positive selection) Protein: – Phylogenetic signal less predominant than in DNA – Better to construct a tree for evolutionary distant species or genes RNA: rRNA often used for constructing species trees Selection of sequences for analysis
  • 10. Multiple sequence alignment  This is a critical step in the analysis as in many cases the alignment of amino acids or nucleotides in a column implies that they share a common ancestor  If you misalign a group of sequences you will still be able to produce a tree. However, it is not likely to be biologically meaningful.  Crap in is crap out!  Inspect the alignment to be sure that all sequences are homologous  Some times with ClustalW distantly related sequences are not well aligned. Try different gap and extension parameters to improve the alignment  Only use these columns of the multiple alignment for which you have data for all organisms or sequences. Delete the columns for which this is not the case.  Delete columns with gaps
  • 11. Tree building Character-based methods Non-character based methods Methods based on an explicit model of evolution Maximum Likelihood Methods/Bayesian Phylogeny Pairwise distance methods Methods not based on an explicit model of evolution Maximum Parsimony Methods
  • 12. Distance based methods Distance based methods: – calculate the distances between molecular sequences using some distance matrices – A clustering method (UPGMA, neighbor joining) is used to infer the tree from the pair wise distance matrix – treat the sequence from a horizontal(parental) perspective, by calculating a single distance between entire sequences Advantage: • Fast • Allow using evolutionary models Disadvantage: • sequences reduced to one number
  • 13. Character based methods Character based methods: – treat the sequences from a vertical(evolutionary) perspective. – they search for each column of the alignment, the simplest explanation for how the characters evolved. – For instance, MP(Maximum Parsimony) involves a search for a tree with the fewest number of amino acid (or nucleotide character changes that account for the observed differences between the protein (gene) sequences.
  • 14. Tree evaluation: bootstrapping • sampling technique for estimating the statistical error in situations where the underlying sampling distribution is unknown • evaluating the reliability of the inferred tree - or better the reliability of specific branches How to proceed: • From the original alignment, columns in the sequence alignment are chosen at random ‘sampling with replacement’ • a new alignment is constructed with the same size as the original one • a tree is constructed This process is repeated 100 of times**
  • 15. Evaluation Show bootstrap values on Phylogenetic trees • majority-rule consensus tree • map bootstrap values on the original tree • now while evaluating from bootstrap value we are going to check a certain tree’s occurrence number !!! If its 60 out of 100 times its significant, more than 50 is accountable but bellow 50 definitely rejected.
  • 16.
  • 17. Pairwise distance methods • Distance calculation • Inferring the tree topology
  • 18. Pairwise distance methods Approach: • align pairs of sequences and count the number of differences (Hamming distance). • For an alignment of length N with n sites at which there are differences: D= (n/N*100). Problem: • observed differences <> actual genetic distances between the sequences. => dissimilarity is an underestimation of the true evolutionary distance, because of the fact that some of the sequence positions are the result of multiple events Solution: • Use an evolutionary model that corrects for multiple mutations Distance calculation
  • 20. Pairwise distance methods: distance calculation. Other evolutionary models
  • 22. Pairwise distance methods: UPGMA
    UPGMA (Unweighted Pair Group Method with Arithmetic Mean) is generally attributed to Sokal and Michener.
    • Assumes a molecular clock, i.e. that all sequences evolve at a similar rate
    • Distance = twice node height
    • Forces distances to be ultrametric (for any three species, the two largest distances are equal)
    • Produces a rooted tree (in this case the root is incorrect but the topology is otherwise correct)
  • 23. Pairwise distance methods. Tree inference: UPGMA
    • When two OTUs are grouped, we treat them as a new single OTU
    • When OTUs A, B (which have been grouped before) and C are grouped into a new node ‘u’, the distance from node ‘u’ to any other node ‘k’ (e.g. the grouping of D and E) is computed as the average of the distances between the members of ‘u’ and the members of ‘k’
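The formula on the slide itself is not preserved in the transcript. As a hedged reconstruction, the standard UPGMA rule takes the unweighted arithmetic mean over all pairs of original OTUs; the symbols $|u|$, $|k|$ and $d_{ij}$ are notation introduced here, not taken from the slide:

```latex
d_{uk} = \frac{1}{|u|\,|k|} \sum_{i \in u} \sum_{j \in k} d_{ij}
\qquad \text{e.g. } u = \{A, B, C\},\ k = \{D, E\}: \quad
d_{uk} = \frac{d_{AD} + d_{AE} + d_{BD} + d_{BE} + d_{CD} + d_{CE}}{6}

% Equivalent update when u is formed by joining the cluster (AB) with C:
d_{uk} = \frac{2\, d_{(AB)k} + d_{Ck}}{3}
```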
  • 26. Pairwise distance methods Tree inference: UPGMA
  • 27. Pairwise distance methods. Tree inference: UPGMA
    Advantages:
    • Fast
    • Allows incorporation of evolutionary models
    Disadvantages:
    • Assumes a molecular clock
    • Unrealistic evolutionary model, as all leaves are equally distant from the root
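The UPGMA procedure described above can be sketched as a short Python function. This is an illustrative implementation under stated assumptions, not the code behind the slides: the function name `upgma`, the nested-tuple tree representation, the `frozenset`-keyed distance dictionary and the toy ultrametric matrix are all choices made here.

```python
from itertools import combinations

def upgma(names, dist):
    """Cluster OTUs by UPGMA.

    names: list of OTU labels.
    dist:  dict mapping frozenset({a, b}) to the distance between a and b.
    Returns (tree, root_height), where tree is a nested tuple of labels.
    """
    size = {n: 1 for n in names}      # number of OTUs in each cluster
    height = {n: 0.0 for n in names}  # node height of each cluster
    d = dict(dist)
    while len(size) > 1:
        # Join the two clusters separated by the smallest distance.
        a, b = min(combinations(size, 2), key=lambda p: d[frozenset(p)])
        u = (a, b)
        h = d[frozenset((a, b))] / 2  # distance = twice node height
        for k in size:
            if k not in (a, b):
                # Size-weighted mean of the two old distances, which equals
                # the plain average over all pairs of original OTUs.
                d[frozenset((u, k))] = (
                    size[a] * d[frozenset((a, k))]
                    + size[b] * d[frozenset((b, k))]
                ) / (size[a] + size[b])
        size[u] = size[a] + size[b]
        height[u] = h
        del size[a], size[b]
    (root,) = size
    return root, height[root]

# Hypothetical ultrametric distances for four OTUs.
pairs = {("A", "B"): 2, ("A", "C"): 6, ("B", "C"): 6,
         ("A", "D"): 10, ("B", "D"): 10, ("C", "D"): 10}
dist = {frozenset(p): v for p, v in pairs.items()}
tree, root_height = upgma(["A", "B", "C", "D"], dist)
```

For this toy matrix the sketch joins A with B first, then adds C, then D, and places the root at height 5, half the largest distance.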
  • 28. Neighbor Joining
    • Very popular method
    • Does not make the molecular clock assumption: a modified distance matrix is constructed to adjust for differences in the evolution rate of each taxon
    • Produces an unrooted tree
    • Assumes additivity: the distance between a pair of leaves = the sum of the lengths of the edges connecting them
    • Like UPGMA, constructs the tree by sequentially joining sub-trees
  • 29. Pairwise distance methods. Tree inference: neighbor joining
    • Additive distances can be fitted to an unrooted tree such that the evolutionary distance between a pair of OTUs equals the sum of the lengths of the branches connecting them, rather than being an average as in cluster analysis
    • Tree construction methods: the neighbour joining (NJ) method, developed by Saitou and Nei (1987), offers a heuristic approach to solve this problem
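A compact sketch of neighbor joining is given below, assuming the usual formulation in which the pair minimizing Q(i, j) = (n - 2) d(i, j) - r(i) - r(j) is joined, with r(i) the sum of distances from i to all other nodes. The function name, the edge-list output, the internal node labels (`node0`, `node1`, ...) and the toy distances are assumptions made for illustration.

```python
from itertools import combinations

def neighbor_joining(names, dist):
    """Return the edges (node, node, branch_length) of an unrooted NJ tree.

    names: list of OTU labels.
    dist:  dict mapping frozenset({a, b}) to the distance between a and b.
    """
    nodes = list(names)
    d = dict(dist)
    edges = []
    next_id = 0
    while len(nodes) > 2:
        n = len(nodes)
        # Net divergence of each node: sum of its distances to all others.
        r = {i: sum(d[frozenset((i, k))] for k in nodes if k != i) for i in nodes}

        def q(pair):
            # Rate-corrected criterion; the pair with the smallest Q is joined.
            i, j = pair
            return (n - 2) * d[frozenset((i, j))] - r[i] - r[j]

        i, j = min(combinations(nodes, 2), key=q)
        dij = d[frozenset((i, j))]
        u = f"node{next_id}"
        next_id += 1
        # Branch lengths from i and j to the new internal node u.
        li = dij / 2 + (r[i] - r[j]) / (2 * (n - 2))
        edges.append((i, u, li))
        edges.append((j, u, dij - li))
        # Distances from u to every remaining node.
        for k in nodes:
            if k not in (i, j):
                d[frozenset((u, k))] = (
                    d[frozenset((i, k))] + d[frozenset((j, k))] - dij
                ) / 2
        nodes = [k for k in nodes if k not in (i, j)] + [u]
    a, b = nodes
    edges.append((a, b, d[frozenset((a, b))]))
    return edges

# Hypothetical additive distances for five OTUs.
pairs = {("a", "b"): 5, ("a", "c"): 9, ("a", "d"): 9, ("a", "e"): 8,
         ("b", "c"): 10, ("b", "d"): 10, ("b", "e"): 9,
         ("c", "d"): 8, ("c", "e"): 7, ("d", "e"): 3}
dist = {frozenset(p): v for p, v in pairs.items()}
edges = neighbor_joining(["a", "b", "c", "d", "e"], dist)
leaf_lengths = {x: length for x, _, length in edges if x in "abcde"}
```

Unlike UPGMA, the two branches leading to a joined pair may have different lengths, which is how the method accommodates unequal rates of evolution.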
  • 30-32. Pairwise distance methods. Tree inference: neighbor joining
  • 33. Pairwise distance methods. Tree inference: neighbor joining
    Advantages:
    • Fast
    • Allows incorporation of evolutionary models
    • No assumption of a molecular clock
    Disadvantages:
    • The constructed tree is a heuristic estimate and may not match the true tree
  • 34. Maximum parsimony
    Principle:
    • Select the tree that minimizes the total tree length, i.e. the number of nucleotide substitutions or amino acid replacements required to explain a given set of data
    Method:
    • A particular topology is considered
    • For this topology, the ancestral sequences at each branching point are reconstructed
    • The minimum number of events needed to explain the sequence differences over the whole tree is computed: the minimum number of substitutions is computed for each nucleotide (or amino acid) site, and the numbers for all sites are added
    • Another tree topology is chosen, and the topology with the smallest total is retained
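The per-site minimum for a fixed topology can be computed with Fitch's algorithm, sketched below. The slides do not name the algorithm, so its use here is an assumption; the function names, the nested-tuple tree encoding and the toy alignment are likewise hypothetical.

```python
def fitch_site(tree, states):
    """Return (possible ancestral states, minimum changes) for one site.

    tree:   a leaf name, or a pair (left, right) of subtrees (rooted, binary).
    states: dict mapping each leaf name to its character at this site.
    """
    if isinstance(tree, str):
        return {states[tree]}, 0
    left_set, left_changes = fitch_site(tree[0], states)
    right_set, right_changes = fitch_site(tree[1], states)
    common = left_set & right_set
    if common:
        # The children can share a state: no extra change at this node.
        return common, left_changes + right_changes
    # Otherwise one substitution is needed on a branch below this node.
    return left_set | right_set, left_changes + right_changes + 1

def parsimony_score(tree, alignment):
    """Sum the per-site minimum number of changes over all columns."""
    n_sites = len(next(iter(alignment.values())))
    return sum(
        fitch_site(tree, {name: seq[i] for name, seq in alignment.items()})[1]
        for i in range(n_sites)
    )

# Hypothetical alignment and two candidate topologies.
alignment = {"A": "AAG", "B": "AAG", "C": "ACT", "D": "TCT"}
tree1 = (("A", "B"), ("C", "D"))
tree2 = (("A", "C"), ("B", "D"))
score1 = parsimony_score(tree1, alignment)
score2 = parsimony_score(tree2, alignment)
```

Here `tree1` needs 3 changes and `tree2` needs 5, so parsimony would prefer `tree1`.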
  • 35. Maximum parsimony
    Number of possible tree topologies for n OTUs:
    $N_R = \frac{(2n-3)!}{2^{n-2}\,(n-2)!}$ (rooted), $N_U = \frac{(2n-5)!}{2^{n-3}\,(n-3)!}$ (unrooted)
    OTUs   Rooted tree topologies   Unrooted tree topologies
    3      3                        1
    4      15                       3
    5      105                      15
    6      945                      105
    7      10395                    945
    8      135135                   10395
    9      2027025                  135135
    • Exhaustive search quickly becomes infeasible as the number of OTUs grows
    • Heuristics needed
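The two closed-form counts are easy to evaluate directly, which makes the growth rate concrete. The function names below are chosen for illustration only.

```python
from math import factorial

def rooted_topologies(n):
    """Number of rooted, bifurcating tree topologies for n >= 2 OTUs."""
    return factorial(2 * n - 3) // (2 ** (n - 2) * factorial(n - 2))

def unrooted_topologies(n):
    """Number of unrooted, bifurcating tree topologies for n >= 3 OTUs."""
    return factorial(2 * n - 5) // (2 ** (n - 3) * factorial(n - 3))

# One row per number of OTUs: (n, rooted count, unrooted count).
table = [(n, rooted_topologies(n), unrooted_topologies(n)) for n in range(3, 11)]
```

The unrooted count for n OTUs equals the rooted count for n - 1, since rooting an unrooted tree amounts to attaching one extra lineage.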
  • 37. Maximum Parsimony
    Assumptions:
    • Equal rate of evolution in all branches
    Advantages:
    • Sequence information is not reduced to one number (as it is, for example, in pairwise distance methods)
    Disadvantages:
    • Can be slow for very large datasets
    • No correction for multiple mutations, i.e. no substitution model can be applied
    • Sensitive to unequal rates of evolution in different lineages
  • 41. Comparison of the character based methods: parsimony vs. maximum likelihood
    • There is an efficient algorithm to calculate the parsimony score for a given topology; therefore parsimony is faster than ML
    • Parsimony is an approximation to ML when mutations are rare events
    • Weighted parsimony schemes can be used to treat most of the different evolutionary models used with ML
    • Parsimony discards sites that are not parsimony-informative, even though they carry information for ML and distance matrix methods
    • Parsimony gives little information about branch lengths
    • Parsimony is inconsistent in certain cases (Felsenstein zone), and suffers badly from long branch attraction
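To make the point about non-informative sites concrete, the sketch below flags parsimony-informative columns using the usual definition: at least two different characters, each present in at least two sequences. The function names and the toy alignment are assumptions for illustration, and gaps are not treated specially.

```python
from collections import Counter

def is_parsimony_informative(column):
    """True if at least two characters each occur in two or more sequences."""
    counts = Counter(column)
    return sum(1 for c in counts.values() if c >= 2) >= 2

def informative_sites(alignment):
    """Indices of the alignment columns that parsimony can actually use."""
    return [
        i for i, column in enumerate(zip(*alignment))
        if is_parsimony_informative(column)
    ]

# Hypothetical alignment: column 0 is constant, column 3 has only one
# character that occurs more than once.
alignment = ["AAGA", "AAGT", "ACTT", "ACTC"]
sites = informative_sites(alignment)
```

Only columns 1 and 2 are informative here; the other columns require the same number of changes on every topology and so cannot discriminate between trees under parsimony.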
  • 43. Commonly used phylogeny packages
    • 369 phylogeny packages (http://evolution.gs.washington.edu/phylip/software.html) and 54 free servers (as of Sep 30, 2011)
      – Phylip (general package: protdist, NJ, parsimony, maximum likelihood, etc.)
      – PAUP (parsimony)
      – PAML (maximum likelihood)
      – TreePuzzle (quartet based)
      – PhyML (maximum likelihood)
      – MrBayes
      – MEGA (biologist-centric)

Editor's Notes

  1. ** When evaluating bootstrap values, check how often a given grouping occurs among the replicate trees. If it occurs in 60 out of 100 replicates it is considered significant; more than 50 is acceptable, but below 50 it is rejected.
  2. OTU = operational taxonomic unit