Important Protein Databases
&
Proteomics Software
MOHD KYUM
Punjab Agricultural University
Introduction to Protein Databases
• Protein databases have become a crucial part of modern biology.
• Searching databases is often the first step in the study of a new protein.
• Huge amounts of data on protein structures, functions, and particularly sequences are being
generated, which cannot be handled without computer databases.
• Without the prior knowledge obtained from such searches, known information about the
protein could be missed, or an experiment could be repeated unnecessarily.
• Comparison between proteins and protein classification provide information about the
relationship between proteins within a genome or across different species, and hence offer
much more information than can be obtained by studying only an isolated protein.
Protein Databases
• The databases can be classified into the following categories:
  • Sequence Databases
  • 2D Gel Databases
  • 3D Structure Databases
  • Polymorphism and Mutation Databases
  • Chemistry Databases
  • Enzyme and Pathway Databases
  • Ontologies, Specialized Protein Databases
  • Family and Domain Databases
  • Gene Expression Databases
  • Genome Annotation Databases
  • Organism Specific Databases
  • Phylogenomic Databases
  • Protein-Protein Interaction Databases
  • Proteomic Databases
  • PTM Databases
  • Other Miscellaneous Databases
Protein Sequence Databases
https://www.ncbi.nlm.nih.gov/refseq/
https://www.uniprot.org/
UniProt
• UniProt provides more annotations than any other sequence database with a minimal level
of redundancy. It has the following three components:
1. UniProtKB (protein knowledgebase): includes Swiss-Prot (manually annotated and reviewed) and
TrEMBL (automatically annotated, not reviewed).
2. UniRef: sequence clusters for fast sequence similarity searches.
3. UniParc: sequence archive for keeping track of sequences and their identifiers.
• As a curated protein sequence database, UniProt offers a portal to a wide range of
annotations, covering areas such as function, family, domain parsing, post-translational
modifications, and variants.
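The Swiss-Prot/TrEMBL split is visible in the FASTA files that UniProtKB distributes: each header begins with `sp` for a reviewed Swiss-Prot entry or `tr` for an unreviewed TrEMBL entry, followed by the accession and entry name. The sketch below is a minimal illustration of reading that header; the class and function names are hypothetical, and the header strings are illustrative inputs rather than records fetched from UniProt.

```python
import re
from typing import NamedTuple


class UniProtHeader(NamedTuple):
    reviewed: bool     # True for Swiss-Prot ("sp"), False for TrEMBL ("tr")
    accession: str     # e.g. "P69905"
    entry_name: str    # e.g. "HBA_HUMAN"
    description: str   # remainder of the header line


# Header layout: >db|accession|entry_name free-text description
_HEADER = re.compile(r"^>(sp|tr)\|([^|]+)\|(\S+)\s*(.*)$")


def parse_uniprot_header(line: str) -> UniProtHeader:
    """Split a UniProtKB FASTA header into its main fields."""
    match = _HEADER.match(line.strip())
    if match is None:
        raise ValueError(f"not a UniProtKB FASTA header: {line!r}")
    db, accession, entry_name, description = match.groups()
    return UniProtHeader(db == "sp", accession, entry_name, description)


header = parse_uniprot_header(
    ">sp|P69905|HBA_HUMAN Hemoglobin subunit alpha OS=Homo sapiens OX=9606 GN=HBA1 PE=1 SV=2"
)
print(header.accession, "reviewed" if header.reviewed else "unreviewed")
```

Filtering a downloaded FASTA file on the `reviewed` flag is one simple way to restrict an analysis to manually curated entries.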
RefSeq-NCBI
• The National Center for Biotechnology Information Reference Sequence (NCBI RefSeq) database provides
curated, non-redundant sequences of genomic regions, transcripts and proteins for taxonomically diverse
organisms including Archaea, Bacteria, Eukaryotes, and Viruses.
• The RefSeq database is derived from the sequence data available in the redundant archival database GenBank.
• RefSeq records include features such as coding regions, conserved domains and variations, plus enhanced
annotations such as publications, names, symbols, aliases, Gene IDs, and database cross-references.
• The sequences and annotations are generated using a combined approach of collaboration, automated
prediction, and manual curation.
• RefSeq records can be accessed directly from NCBI web sites by searching the Nucleotide or Protein
databases, by BLAST searches against selected databases, and by FTP download.
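RefSeq accessions carry a two-letter prefix that indicates the molecule type and whether the record is a curated "known" record or a computationally predicted model. The sketch below maps a subset of the common prefixes; the table is deliberately incomplete, the function name is hypothetical, and the accessions are used only as example inputs.

```python
# Subset of RefSeq accession prefixes: prefix -> (molecule type, predicted model?)
REFSEQ_PREFIXES = {
    "NC_": ("genomic (complete molecule)", False),
    "NG_": ("genomic (region)", False),
    "NM_": ("mRNA", False),
    "NR_": ("non-coding RNA", False),
    "NP_": ("protein", False),
    "XM_": ("mRNA", True),
    "XR_": ("non-coding RNA", True),
    "XP_": ("protein", True),
    "WP_": ("protein (non-redundant, prokaryotic)", False),
}


def describe_refseq(accession: str) -> tuple[str, bool]:
    """Return (molecule type, is_predicted_model) for a RefSeq accession."""
    prefix = accession[:3].upper()
    if prefix not in REFSEQ_PREFIXES:
        raise ValueError(f"prefix {prefix!r} is not in this (partial) table")
    return REFSEQ_PREFIXES[prefix]


for acc in ("NP_000509.1", "XP_000001.1"):
    molecule, predicted = describe_refseq(acc)
    print(acc, molecule, "model" if predicted else "known")
```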
Protein Structure Databases
http://www.wwpdb.org/ https://scop.mrc-lmb.cam.ac.uk/
wwPDB
• The Worldwide Protein Data Bank (wwPDB) was established in 2003 as an
international collaboration to maintain a single, publicly available PDB Archive of
protein structural data.
• The “PDB Archive” is a collection of flat files in three different formats:
(A) Legacy PDB format
(B) PDBx/mmCIF format
(C) Protein Data Bank Markup Language (PDBML) format
• Each member site serves as a deposition, data processing and distribution site for the PDB
Archive, and each provides its own view of the primary data and a variety of tools and
resources.
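Of the three formats, the legacy PDB format is a fixed-column text layout, so fields are read by character position rather than by splitting on whitespace. The sketch below parses the main fields of one ATOM record under that assumption; the class and function names are hypothetical, and the sample line is a made-up record used only to exercise the column offsets.

```python
from typing import NamedTuple


class Atom(NamedTuple):
    serial: int
    name: str
    res_name: str
    chain_id: str
    res_seq: int
    x: float
    y: float
    z: float


def parse_atom_record(line: str) -> Atom:
    """Read one ATOM/HETATM line using the legacy fixed-column layout."""
    if not line.startswith(("ATOM  ", "HETATM")):
        raise ValueError("not an ATOM/HETATM record")
    return Atom(
        serial=int(line[6:11]),       # columns 7-11
        name=line[12:16].strip(),     # columns 13-16
        res_name=line[17:20].strip(), # columns 18-20
        chain_id=line[21],            # column 22
        res_seq=int(line[22:26]),     # columns 23-26
        x=float(line[30:38]),         # columns 31-38
        y=float(line[38:46]),         # columns 39-46
        z=float(line[46:54]),         # columns 47-54
    )


# Made-up example record (not taken from a real PDB entry).
SAMPLE = "ATOM      1  N   MET A   1      27.340  24.430   2.614  1.00  9.67"
atom = parse_atom_record(SAMPLE)
print(atom.res_name, atom.chain_id, atom.x, atom.y, atom.z)
```

For real work an established parser is the safer choice; the point here is only that column positions, not delimiters, define the fields.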
SCOP
• SCOP (Structural Classification of Proteins) contains information about the classification
of protein structures along with their sequence information.
• It classifies proteins in a hierarchy of levels, each with a defining feature:
1. Class - Global characteristics
2. Fold - Similar “topology”
3. Superfamily - Clear structural homology
4. Family - Clear sequence homology
5. Protein - Functionally identical
6. Species - Unique sequences
• It aims to provide an accurate, detailed, and comprehensive description of the structural and
evolutionary relationships amongst all proteins of known structure.
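The top four levels of this hierarchy are often written as a compact dotted string, the "sccs" identifier used in SCOP 1.x and SCOPe releases, for example `a.1.1.2` for class a, fold 1, superfamily 1, family 2. The sketch below expands such a string into its levels; the function name is hypothetical, and only the first four class letters are listed.

```python
# First four SCOP classes; later letters exist but are omitted in this sketch.
SCOP_CLASSES = {
    "a": "all alpha",
    "b": "all beta",
    "c": "alpha/beta",
    "d": "alpha+beta",
}

LEVELS = ("class", "fold", "superfamily", "family")


def expand_sccs(sccs: str) -> dict[str, str]:
    """Map an sccs string such as 'a.1.1.2' to the identifier at each level."""
    parts = sccs.split(".")
    if len(parts) != len(LEVELS):
        raise ValueError(f"expected class.fold.superfamily.family, got {sccs!r}")
    # Each level is identified by the prefix of the string up to that level.
    return {level: ".".join(parts[: i + 1]) for i, level in enumerate(LEVELS)}


levels = expand_sccs("a.1.1.2")
print(levels, SCOP_CLASSES.get(levels["class"], "other class"))
```

Two domains sharing the same `superfamily` value but different `family` values are, in SCOP's terms, structurally homologous without clear sequence homology.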
Protein Family Databases
http://pfam.xfam.org/ http://pantherdb.org/ https://prosite.expasy.org/
Pfam
• Pfam is a database of protein families represented as multiple sequence alignments and Hidden
Markov Models (HMMs).
• Pfam entries are classified as Family (related protein regions), Domain (protein structural unit),
Repeat (multiple short protein structural units), or Motif (short protein structural unit outside global
domains).
• Related Pfam entries are grouped into clans based on sequence, structure or profile HMM
similarity.
• The Pfam web site provides search interfaces for querying by sequence, keyword, domain
architecture and taxonomy, and browse interfaces for analyzing protein sequences for Pfam matches
and viewing Pfam annotations in domain architectures, sequence alignments, interactions, species
and protein structures in PDB.
PANTHER
• PANTHER is a database of gene families, including a phylogenetic tree for each family in which nodes of the
tree are annotated with gene attributes.
• The main goal of PANTHER is the accurate inference (and practical application) of gene and protein function
over large sequence databases, using phylogenetic trees to extrapolate from the relatively sparse experimental
information from a few model organisms.
• The three types of gene attributes currently annotated in PANTHER are:
(A) Subfamily membership (B) Protein class and (C) Gene function
• The PANTHER website provides tools for functional analysis of lists of genes or proteins.
• PANTHER now includes stable database identifiers for inferred ancestral genes, which are used to associate
inferred gene attributes with particular genes in the common ancestral genomes of extant species.
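Functional analysis of a gene list usually asks whether some annotation is over-represented in the list relative to a reference set. The sketch below shows one generic way to score that, a one-sided hypergeometric test; it is an illustration of the idea, not a description of the statistics PANTHER itself applies, and the function name and the numbers in the example are assumptions.

```python
from math import comb


def enrichment_p_value(population: int, annotated: int, sample: int, hits: int) -> float:
    """P(X >= hits) when drawing `sample` genes from `population`,
    of which `annotated` carry the annotation (hypergeometric upper tail)."""
    total = comb(population, sample)
    upper = min(sample, annotated)
    tail = sum(
        comb(annotated, i) * comb(population - annotated, sample - i)
        for i in range(hits, upper + 1)
    )
    return tail / total


# Hypothetical numbers: 20,000 reference genes, 200 with the annotation,
# a list of 100 genes of which 8 carry it.
p = enrichment_p_value(population=20_000, annotated=200, sample=100, hits=8)
print(f"p = {p:.3g}")
```

When many annotations are tested against the same list, the resulting p-values need a multiple-testing correction before they are interpreted.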
PROSITE
• PROSITE is a database of documentation entries describing protein domains, families and
functional sites as well as associated patterns and profiles to identify them.
• The entries are derived from multiple alignments of homologous sequences and have the
advantage of identifying distant relationships between sequences.
• PROSITE includes a collection of ProRules based on profiles and patterns of functionally
and/or structurally critical amino acids that can be used to increase PROSITE’s
discriminatory power.
• The PROSITE web site provides keyword-based search and allows browsing by
documentation entry, ProRule description, taxonomic scope and number of positive hits.
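PROSITE patterns use a compact syntax: elements are separated by `-`, `x` matches any residue, `[ABC]` lists allowed residues, `{ABC}` lists forbidden residues, and `(n)` or `(n,m)` gives a repeat count. The sketch below translates that core syntax into a Python regular expression; the function name is hypothetical, and rarer constructs (such as anchors inside square brackets) are not handled.

```python
import re

# An element is a core (x, a letter, [..] or {..}) with an optional (n) or (n,m).
_ELEMENT = re.compile(r"(.+?)(?:\((\d+)(?:,(\d+))?\))?")


def prosite_to_regex(pattern: str) -> str:
    """Translate the core PROSITE pattern syntax into a Python regex string."""
    body = pattern.strip().rstrip(".")
    pieces = []
    for element in body.split("-"):
        prefix = suffix = ""
        if element.startswith("<"):      # anchored at the N-terminus
            prefix, element = "^", element[1:]
        if element.endswith(">"):        # anchored at the C-terminus
            suffix, element = "$", element[:-1]
        core, low, high = _ELEMENT.fullmatch(element).groups()
        if core == "x":                  # any residue
            core = "."
        elif core.startswith("{") and core.endswith("}"):
            core = "[^" + core[1:-1] + "]"   # forbidden residues
        repeat = ""
        if low is not None:
            repeat = "{" + low + ("," + high if high else "") + "}"
        pieces.append(prefix + core + repeat + suffix)
    return "".join(pieces)


regex = prosite_to_regex("N-{P}-[ST]-{P}.")
print(regex, bool(re.search(regex, "AANGSA")))
```

Short patterns like this one match by chance in many sequences, which is part of why PROSITE pairs patterns with profiles and ProRules.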
Proteomics – An Introduction
• Proteomics is a relatively recent branch of molecular biology concerned with the study of
the proteome.
• The term “proteome” was coined in 1994, and “proteomics” followed soon after.
• Its roles in molecular biology include the study of protein structure and
function, determination of 3D protein structures, and qualitative and quantitative
analysis of proteins.
• Its applications include clinical research, drug discovery, biomarkers,
neurology, etc.
Proteomics Software
http://www.funrich.org/ http://prohitsms.com/Prohits_download/list.php http://proteowizard.sourceforge.net/
FunRich
• FunRich is open-access software that facilitates the analysis of
proteomics data, providing tools for functional enrichment and interaction
network analysis of genes and proteins.
• FunRich is a standalone tool that combines ease of use with customizable
databases, free access, and graphical representations.
ProHits
• ProHits is a complete open-source software solution for mass spectrometry (MS)-based
interaction proteomics that manages the entire pipeline from raw MS data files to fully
annotated protein-protein interaction data sets.
• It was designed to provide an intuitive user interface from the biologist’s perspective and
can accommodate multiple instruments within a facility, multiple user groups, multiple
laboratory locations and any number of parallel projects.
• ProHits can manage all project scales and supports common experimental pipelines,
including those using gel-based separation, gel-free analysis and multidimensional protein
or peptide separation.
ProteoWizard
• ProteoWizard provides a modular and extensible set of open-source, cross-platform tools
and libraries.
• The tools perform proteomics data analyses; the libraries enable rapid tool creation by
providing a robust, pluggable development framework that simplifies and unifies data file
access, and performs standard chemistry and LC-MS dataset computations.
• The primary goal of ProteoWizard is to eliminate the existing barriers to proteomic
software development so that researchers can focus on the development of new analytic
approaches, rather than having to dedicate significant resources to mundane (if important)
tasks, like reading data files.
Proteomics Databases
https://www.proteomicsdb.org/
http://ppdb.tc.cornell.edu https://www.ebi.ac.uk/pride/
PPDB
• PPDB is a Plant Proteome DataBase for Arabidopsis thaliana and maize (Zea
mays).
• Initially PPDB was dedicated to plant plastids, but it has now expanded to the
whole plant proteome, hence it was renamed from Plastid Proteome DataBase to
Plant Proteome DataBase in November 2007.
• The PPDB stores experimental data from in-house proteome and mass
spectrometry analysis, curated information about protein function, protein
properties and subcellular localization.
PRIDE
• The PRoteomics IDEntifications database (PRIDE) is a repository for mass spectrometry-based
proteomics data including identifications of proteins, peptides and post-translational
modifications that have been described in the scientific literature, together with supporting
mass spectra and related technical and biological metadata.
• PRIDE supports tandem MS (MS/MS) and peptide fingerprinting datasets, stored with the
search and analysis results originally produced by the submitters.
• PRIDE provides several services such as the Protein Identifier Cross-Reference (PICR),
the Ontology Lookup Service (OLS) and Database on Demand.
ProteomicsDB
• ProteomicsDB is a database effort of the Technische Universität
München (TUM).
• It is dedicated to expediting the identification of the human proteome
and its use across the scientific community.