In silico analysis of accurate
proteomics, complemented by
selective isolation of peptides.

Yasset Perez-Riverol
yperez@ebi.ac.uk
yasset.perez@biocomp.cigb.edu.cu
Aniel Sanchez Puentes
aniel.sanchez@cigb.edu.cu

EBI is an Outstation of the European Molecular Biology Laboratory.
ABSTRACT
Protein identification by mass spectrometry is mainly based on MS/MS spectra and the accuracy
of molecular mass determination. However, the high complexity and dynamic range of proteomic
samples from any species surpass the separation capacity and detection power of the most
advanced multidimensional liquid chromatographs and mass spectrometers. Only a tiny portion
of the signals is selected for MS/MS experiments, and a considerable number of those still do
not provide reliable peptide identifications.

The approach uses mass accuracy, isoelectric point (pI), retention time (tR) and N-terminal
amino acid determination as protein identification criteria, independently of the availability
of high-quality MS/MS spectra. When the methodology was combined with selective isolation
methods, the number of unique peptides and identified proteins increased. Finally, to
demonstrate the feasibility of the methodology, an OFFGEL-LC-MS/MS experiment was also
performed. Our results show that, using the information provided by these features together
with selective isolation methods, we could find 93% of the high-confidence proteins identified
by MS/MS, with a false-positive rate lower than 5%.
[Figure: Drosophila cell sample. Selective isolation scheme with peptide groups RH0, RH1 and RH2 (schematic peptides ~~~~K, ~~~~R, ~~H~~K, ~~H~~R; charge classes 0, 1+ and 2+/3+), a chromatogram with absorbance at 215 nm, and MS/MS spectra (relative intensity in %) labeled 373-RPEGENASYHLAYDKDR-389, 221-DSSIVTHDNDIFR-233 and 373-RPEGENASYHLAYDK-389.]

MS/MS spectra were interpreted by the X! Tandem software using the Flybase sequence
database. The database search results were validated using PeptideProphet.
This work analyzed only the four isoelectric focusing fractions with the lowest pI, which,
according to previous work, show the best agreement between theoretical and experimental
values. In addition, these fractions cover 50% of the identified peptides. We also used only
highly reliable peptide identifications, filtering out those with a PeptideProphet probability
lower than 0.97 (FDR = 0.01) or with post-translational modifications. For the experimental
tR analysis the acceptance error was set at 748.42 s, and the mass tolerance was set at 10 ppm.
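
The acceptance criteria above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `Identification` record and the function names are hypothetical and are not taken from the original work; only the three thresholds come from the text.

```python
from dataclasses import dataclass

# Thresholds quoted in the text above.
MIN_PROBABILITY = 0.97      # PeptideProphet probability cut-off
MASS_TOLERANCE_PPM = 10.0   # precursor mass tolerance, in ppm
RT_TOLERANCE_S = 748.42     # retention-time acceptance error, in seconds


@dataclass
class Identification:
    """Hypothetical record for one PeptideProphet-validated peptide."""
    sequence: str
    probability: float
    modified: bool  # True if the peptide carries a post-translational modification


def ppm_error(observed_mass: float, theoretical_mass: float) -> float:
    """Signed mass error in parts per million."""
    return (observed_mass - theoretical_mass) / theoretical_mass * 1e6


def within_mass_tolerance(observed: float, theoretical: float,
                          tol_ppm: float = MASS_TOLERANCE_PPM) -> bool:
    """True if the absolute mass error does not exceed the ppm tolerance."""
    return abs(ppm_error(observed, theoretical)) <= tol_ppm


def within_rt_tolerance(observed_s: float, predicted_s: float,
                        tol_s: float = RT_TOLERANCE_S) -> bool:
    """True if observed and predicted retention times differ by at most tol_s."""
    return abs(observed_s - predicted_s) <= tol_s


def is_reliable(ident: Identification,
                min_probability: float = MIN_PROBABILITY) -> bool:
    """Keep unmodified peptides at or above the probability cut-off."""
    return ident.probability >= min_probability and not ident.modified
```

For example, an observed mass of 1000.005 Da against a theoretical mass of 1000.000 Da is an error of about 5 ppm and would be accepted, whereas 1000.020 Da (about 20 ppm) would be rejected.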
Proteomic Research: N-Term Identification

Anal Chem. 2010 Oct 15;82(20):8492-501.
Create an in silico tryptic peptide database, then:

1. Annotate the in silico peptides with theoretical rt, pI, mass, MW and N-term.
2. Annotate the experimentally identified sequences from the PeptideProphet output with
   probability higher than 0.97 (rt, pI, N-term, MW, sequence).
3. Search the precursor masses of the experimental sequences against the in silico database.
   [No sequence matches] Peptide is out of the ppm range.
   [Exactly one sequence matches] Go to step 5.
   [More than one sequence matches] Go to step 4.
4. Search the candidate list of theoretical sequences for the peptide by the current property
   (pI, rt, MW, N-term).
   [No sequence matches] Peptide is out of the error range for that property.
   [More than one sequence matches] Repeat step 4 with the next property.
   [Exactly one sequence matches] Go to step 5.
5. Compare with the MS/MS sequence result.
   [In silico sequence does not match the MS/MS sequence] Annotate as a false-positive
   identification.
   [In silico sequence matches the MS/MS sequence] Annotate as a peptide identification.
A tree-based algorithm to identify unique peptides in the experimental set was constructed
in a similar fashion to the one designed for theoretical analysis. The final list of unique
peptides was validated by using the sequence predicted from PeptideProphet. In cases where
the PeptideProphet sequences and the sequences identified by our approach did not match, the
identifications achieved by our algorithm were considered as false-positive identifications.
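
A minimal sketch of such a successive-filtering search is given below. It is not the authors' implementation: the property order, the dictionary layout, the pI tolerance and the toy values in the usage example are all assumptions made for illustration. Only the 10 ppm and 748.42 s tolerances come from the text.

```python
def matches(prop: str, observed, theoretical, tolerance) -> bool:
    """Compare one property of an observed peptide with a theoretical one."""
    if prop == "n_term":
        return observed == theoretical  # exact match on the N-terminal residue
    if prop == "mass":
        # Mass tolerance is expressed in ppm.
        return abs(observed - theoretical) / theoretical * 1e6 <= tolerance
    return abs(observed - theoretical) <= tolerance  # absolute error (pI, rt)


def find_unique(observed: dict, database: list, tolerances: dict,
                order=("mass", "pI", "rt", "n_term")):
    """Narrow the candidate list property by property.

    Returns (sequence, "unique") as soon as exactly one candidate remains,
    (None, "out of range for <property>") when a filter removes every
    candidate, and (None, "ambiguous") if several candidates survive all filters.
    """
    candidates = database
    for prop in order:
        candidates = [
            c for c in candidates
            if matches(prop, observed[prop], c[prop], tolerances.get(prop))
        ]
        if not candidates:
            return None, f"out of range for {prop}"
        if len(candidates) == 1:
            return candidates[0]["sequence"], "unique"
    return None, "ambiguous"


def validate(predicted, msms_sequence: str) -> str:
    """Score the property-based assignment against the MS/MS sequence."""
    if predicted is None:
        return "unassigned"
    return "identification" if predicted == msms_sequence else "false positive"
```

A usage example with made-up values (the masses below are toy numbers, not the real masses of these sequences):

```python
database = [
    {"sequence": "PEPTIDEK", "mass": 1000.0, "pI": 4.2, "rt": 1200.0, "n_term": "P"},
    {"sequence": "EPPTIDEK", "mass": 1000.0, "pI": 4.2, "rt": 3000.0, "n_term": "E"},
    {"sequence": "AAAAAAK", "mass": 1500.0, "pI": 9.0, "rt": 500.0, "n_term": "A"},
]
tolerances = {"mass": 10.0, "pI": 0.5, "rt": 748.42}  # pI tolerance is assumed
observed = {"mass": 1000.004, "pI": 4.3, "rt": 1300.0, "n_term": "P"}

sequence, status = find_unique(observed, database, tolerances)
print(sequence, status, validate(sequence, "PEPTIDEK"))
```

Here two candidates share the precursor mass and the pI, and the retention time is the property that separates them, so the N-terminal residue is never consulted.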
CONCLUSION
The information provided by some analytical tools can help to compensate for missing peptide
sequence information, although the approach is more efficient when a prokaryotic proteome is
analyzed.
A foreseeable drawback is the limited precision (accuracy) with which the variables used can
be predicted, which may hinder accurate mass proteomics analysis by producing false-positive
hits. Including only certain types of peptides and reducing sample complexity increases the
percentage of unique peptides compared with a standard analysis. The combination of several
selective methods (RH0, RH1 and RH2) on the same sample could increase the percentage of
proteins with unique peptides. The theoretical analysis described here does not exclude
combining it with the MS/MS information obtained in any proteomic experiment.
