• A scoring system is a set of values that quantifies the likelihood of
one residue being substituted by another in an alignment.
• It is also known as a substitution matrix.
• Scoring matrices for nucleotides are relatively simple.
• A positive value or high score is given for a match, and a
negative value or low score is given for a mismatch.
• Scoring matrices for amino acids are more complicated
because the scoring has to reflect the physicochemical
properties of the amino acid residues.
Transitions --- substitutions in which a purine (A/G) is replaced by
another purine (A/G), or a pyrimidine (C/T) is replaced by
another pyrimidine (C/T).
Transversions --- substitutions in which a purine (A/G) is replaced by
a pyrimidine (C/T), or vice versa: (A/G) <-> (C/T).
    A   T   C   G
A   1   0   0   0
T   0   1   0   0
C   0   0   1   0
G   0   0   0   1
Identity matrix
    A   T   C   G
A   1  -5  -5  -1
T  -5   1  -1  -5
C  -5  -1   1  -5
G  -1  -5  -5   1
Transition-transversion matrix
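A transition-transversion matrix of this kind can be generated from the purine/pyrimidine classes rather than typed in by hand. The following is a minimal sketch, assuming the scores of +1 (match), -1 (transition) and -5 (transversion); the function and variable names are illustrative only.

```python
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def substitution_score(x, y, match=1, transition=-1, transversion=-5):
    """Score one nucleotide pair under a transition-transversion scheme."""
    if x == y:
        return match
    # A transition stays within one chemical class (purine or pyrimidine).
    same_class = {x, y} <= PURINES or {x, y} <= PYRIMIDINES
    return transition if same_class else transversion


bases = "ATCG"
matrix = {(x, y): substitution_score(x, y) for x in bases for y in bases}

# Print the matrix row by row, in the column order A T C G.
for x in bases:
    print(x, " ".join(f"{matrix[(x, y)]:3d}" for y in bases))
```

Setting `transition` and `transversion` both to 0 reproduces the identity matrix.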
• Match score: +1
• Mismatch score: 0
• Gap penalty: –1
• Alignment:
ACGTCTGATACGCCGTATAGTCTATCT
    ||||| |||   || ||||||||
----CTGATTCGC---ATCGTCTATCT
• Matches: 18 × (+1) = +18
• Mismatches: 2 × 0 = 0
• Gaps: 7 × (–1) = –7
• Score = 18 + 0 – 7 = +11
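The arithmetic above can be checked with a short script. This is a sketch under the slide's scheme (match +1, mismatch 0, and a linear gap penalty of -1 per gapped position); the function name `score_alignment` is an assumption made for illustration.

```python
def score_alignment(top, bottom, match=1, mismatch=0, gap=-1):
    """Score a pairwise alignment column by column with a linear gap penalty."""
    if len(top) != len(bottom):
        raise ValueError("aligned sequences must have equal length")
    counts = {"match": 0, "mismatch": 0, "gap": 0}
    for a, b in zip(top, bottom):
        if a == "-" or b == "-":
            counts["gap"] += 1
        elif a == b:
            counts["match"] += 1
        else:
            counts["mismatch"] += 1
    total = (counts["match"] * match
             + counts["mismatch"] * mismatch
             + counts["gap"] * gap)
    return total, counts


top = "ACGTCTGATACGCCGTATAGTCTATCT"
bottom = "----CTGATTCGC---ATCGTCTATCT"
score, counts = score_alignment(top, bottom)
print(counts, score)  # {'match': 18, 'mismatch': 2, 'gap': 7} 11
```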
PAM - Point Accepted Mutation; based on
global alignments [evolutionary model]
BLOSUM - BLOcks SUbstitution Matrix; based on
local alignments [similarity among
conserved sequences]
• First introduced by Dayhoff, who compiled alignments of 71
groups of very closely related protein sequences.
• PAM stands for Point Accepted Mutation.
• PAM matrices were derived from the evolutionary
divergence between sequences of closely related proteins.
• Construction of the PAM1 matrix involves aligning full-length
sequences and then constructing
phylogenetic trees using the parsimony principle.
• Ancestral sequence information is used to count the number
of substitutions along each branch of the tree.
• Positive scores in the matrix denote substitutions that occur
more frequently than expected among evolutionarily
conserved replacements.
• Negative scores correspond to substitutions that occur less
frequently than expected.
• One PAM unit is defined as a 1% amino acid change, or one accepted
mutation per 100 residues.
• Increasing PAM numbers correspond to more PAM
units and thus to greater evolutionary distances between protein sequences.
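Under the Dayhoff model, substitution probabilities for larger distances are conventionally obtained by multiplying the PAM1 probability matrix by itself, so that PAM250 corresponds to PAM1 raised to the 250th power. The sketch below illustrates this idea only: it uses a hypothetical three-letter alphabet with made-up probabilities, not real PAM1 values, and the helper names are assumptions.

```python
def mat_mul(a, b):
    """Multiply two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def mat_pow(m, power):
    """Raise a square matrix to a non-negative integer power by repeated squaring."""
    size = len(m)
    result = [[1.0 if i == j else 0.0 for j in range(size)] for i in range(size)]
    base = [row[:] for row in m]
    while power > 0:
        if power & 1:
            result = mat_mul(result, base)
        base = mat_mul(base, base)
        power >>= 1
    return result


# Hypothetical toy matrix: entry [i][j] is the probability that residue i
# is replaced by residue j over one PAM unit. Each row sums to 1.
toy_pam1 = [
    [0.990, 0.007, 0.003],
    [0.007, 0.990, 0.003],
    [0.003, 0.003, 0.994],
]

toy_pam250 = mat_pow(toy_pam1, 250)
for row in toy_pam250:
    print(" ".join(f"{p:.3f}" for p in row))
```

The diagonal entries shrink as the power grows, reflecting that residues are less likely to remain unchanged over longer evolutionary distances. A real PAM scoring matrix would additionally convert these probabilities to log-odds scores.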
• PAM matrices are constructed from phylogenetic
relationships that must be inferred before mutations are scored;
• Ancestral relationships among sequences
are difficult to determine;
• The matrices are based on a small set of closely related
proteins.
• BLOSUM is a series of blocks amino acid substitution matrices.
• The matrices are derived from direct observation of every
possible amino acid substitution in multiple sequence
alignments.
• The sequence patterns used are also called blocks.
• Blocks are ungapped alignments of fewer than 60 amino acid residues in
length.
• The number in a BLOSUM matrix name refers to the percentage identity
of the sequences selected for constructing the matrix.
• BLOSUM62 indicates that, when the matrix was constructed,
sequences sharing 62% identity or more were clustered and counted as one.
• The BLOSUM score for a particular residue pair is derived
from the log ratio of the observed substitution frequency
to the frequency expected by chance.
• The lower the BLOSUM number, the more divergent the sequences
the matrix represents.
     C   S   T   P   A   G
C    9
S   -1   4
T   -1   1   5
P   -3  -1  -1   7
A    0   1   0  -1   4
G   -3   0  -2  -2   0   6
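As an illustration of how such a table is used, the sketch below loads only the six-residue excerpt shown above (not the full 20 × 20 BLOSUM62 matrix) and sums the pair scores of an ungapped alignment. The peptide strings and the function name are hypothetical.

```python
RESIDUES = "CSTPAG"

# Lower triangle of the excerpt, one row per residue in RESIDUES order.
LOWER_TRIANGLE = [
    [9],
    [-1, 4],
    [-1, 1, 5],
    [-3, -1, -1, 7],
    [0, 1, 0, -1, 4],
    [-3, 0, -2, -2, 0, 6],
]

# The matrix is symmetric, so store each score under both orderings.
SCORES = {}
for i, row in enumerate(LOWER_TRIANGLE):
    for j, value in enumerate(row):
        SCORES[(RESIDUES[i], RESIDUES[j])] = value
        SCORES[(RESIDUES[j], RESIDUES[i])] = value


def score_ungapped(seq1, seq2):
    """Sum the substitution scores of two equal-length, ungapped sequences."""
    if len(seq1) != len(seq2):
        raise ValueError("sequences must have equal length")
    return sum(SCORES[(a, b)] for a, b in zip(seq1, seq2))


print(score_ungapped("CAST", "CGSP"))  # 9 + 0 + 4 + (-1) = 12
```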
• BLOSUM62 was
derived from blocks
in which sequences
with 62% or more
identical amino
acids were clustered.
Log-odds = log( chance to see the pair in homologous proteins / chance to see the pair in unrelated proteins by chance )
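The log-odds ratio can be turned into an integer score as follows. This is a sketch assuming the BLOSUM convention of half-bit units (2 × log2 of the ratio, rounded), with the expected frequency taken as p_a × p_b for identical residues and 2 × p_a × p_b for different residues. The frequencies used are made-up illustrative numbers, not values from real block data.

```python
import math


def log_odds_score(observed_pair_freq, freq_a, freq_b, same_residue):
    """Return a rounded half-bit log-odds score for one residue pair."""
    # Chance expectation: an unordered pair of different residues can arise
    # in two orders, hence the factor of 2.
    expected = freq_a * freq_b if same_residue else 2 * freq_a * freq_b
    return round(2 * math.log2(observed_pair_freq / expected))


# Hypothetical pair seen 8 times more often than chance predicts.
print(log_odds_score(0.02, 0.05, 0.05, same_residue=True))     # 6
# Hypothetical pair seen half as often as chance predicts.
print(log_odds_score(0.0025, 0.05, 0.05, same_residue=False))  # -2
```

A positive score therefore means the pair is observed more often in homologous proteins than chance would predict, and a negative score means it is observed less often.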
• PAM
› Based on a mutational
model of evolution
(Markov process)
› PAM1 is based on
sequences with at least 85%
identity
› Designed to track the
evolutionary origins of proteins
• BLOSUM
› Based on multiple
alignments of blocks
› Well suited to
comparing distant
sequences
› Designed to find
conserved domains
in proteins
Scoring matrices
