Archives and
Information Retrieval
Reading:
Introduction to Bioinformatics.
Arthur M. Lesk. Fourth Edition
Chapter 4
Introduction
• Learning objectives:
• What is the general arrangement of biological data in the
public databases?
• To know the information retrieval skills that will allow you to
make effective use of the databases.
• To become familiar with basic operations.
• How does one retrieve information on a particular subject in
the literature?
Primary public domain bioinformatics servers
• Public domain bioinformatics facilities, each providing databases and analysis tools:
• European Bioinformatics Institute (EBI), United Kingdom
• National Center for Biotechnology Information (NCBI), United States
• GenomeNet (KEGG & DDBJ), Japan
The Archives
• Massive amounts of biological experimental data
• Biological information databases can be classified into two types
• The first level databases
• Hold the raw data obtained directly from experiments (“simple”)
• The second level databases
• Further reorganized based on.. in order to achieve some
specific goals
The Archives
• Some examples:
• The first level databases
• Nucleic acid sequence databases: GenBank, EMBL Data Library,
DNA Data Bank of Japan (DDBJ)
• Protein sequence database: SWISS-PROT, PIR
• Protein structure database: PDB
• The second level databases
• GDB
• TRANSFAC
• SCOP
Nucleic acid sequence databases
• International Nucleotide Sequence Database Collaboration (INSDC)
• NCBI (GenBank) – USA (1982)
• EMBL (Data Library) – Europe (1982)
• DDBJ (DNA Data Bank) – Japan (1988)
NCBI
• Established in the USA in 1988 as a national resource
for molecular biology information
• creates public databases
• conducts research in computational biology
• develops software tools for analyzing genome data
• disseminates biomedical information
Nucleic acid sequence databases
• GenBank
• Nucleic acid sequences and the corresponding protein sequences
• Literature references
• Biological annotation
• A new release is made every two months
• GenBank information retrieval system
NCBI ENTREZ
• A platform that provides access to and links to
databases with biological information
• Databases linked through ENTREZ: PubMed, MedLine, GenBank, Protein
databases, Genomes, PopSet, Taxonomy, OMIM
NCBI ENTREZ
• MedLine: literature database
• GenBank: database of all publicly available DNA sequences
• Protein databases: database of amino acid sequences from SwissProt, PIR, PRF,
PDB, and translations from annotated coding regions in
GenBank and RefSeq
• Genomes: database of genomes from organisms and viruses
• PopSet: database of DNA sequences that have been collected to
analyze the evolutionary relatedness of a population
• Taxonomy: database of names of organisms with sequences in GenBank or Prot
• OMIM: database of human genes and genetic disorders
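As a rough illustration of how the ENTREZ databases above can be queried programmatically, the sketch below builds request URLs for NCBI's E-utilities web service (ESearch to find record IDs, EFetch to download records). It only constructs the URLs and does not contact NCBI. The helper names `esearch_url` and `efetch_url` are hypothetical, and the database names ("pubmed", "nucleotide") and the accession used in the demo are assumptions chosen for illustration, not taken from the slides.

```python
from typing import Sequence
from urllib.parse import urlencode

# Base address of the NCBI E-utilities service.
EUTILS_BASE = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils"


def esearch_url(db: str, term: str, retmax: int = 20) -> str:
    """Build an ESearch URL: search database `db` for `term` (hypothetical helper)."""
    query = urlencode({"db": db, "term": term, "retmax": retmax})
    return f"{EUTILS_BASE}/esearch.fcgi?{query}"


def efetch_url(db: str, ids: Sequence[str], rettype: str = "gb",
               retmode: str = "text") -> str:
    """Build an EFetch URL: download the records with the given IDs (hypothetical helper)."""
    query = urlencode({"db": db, "id": ",".join(ids),
                       "rettype": rettype, "retmode": retmode})
    return f"{EUTILS_BASE}/efetch.fcgi?{query}"


if __name__ == "__main__":
    # Literature search in PubMed.
    print(esearch_url("pubmed", "protein databases"))
    # A GenBank flat-file record from the nucleotide database;
    # the accession is an assumption used only as an example.
    print(efetch_url("nucleotide", ["U49845"]))
```

The returned URLs can be opened in a browser or passed to any HTTP client; keeping URL construction separate from the network call makes the query parameters easy to inspect.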
PubMed Central (PMC)
• The U.S. National Library of Medicine's digital
archive of life sciences journal literature
• Access to the full text of articles in PMC is free,
except where a journal requires a subscription for
access to recent articles
OMIM: Online Mendelian
Inheritance in Man
• A catalog of human genes linked to diseases
• Started by Victor A. McKusick at Johns Hopkins University
• A good place to start when you want to research a certain
disease or biological molecule
• This database is cross-referenced to PubMed and other NCBI-based
databases
Complete ENTREZ database divisions
How to submit sequence data to
GenBank
• BankIt: a web-based submission interface
• http://www.ncbi.nlm.nih.gov/BankIt
• Sequin: a stand-alone submission program
• http://www.ncbi.nlm.nih.gov/Sequin
Protein databases
• The Protein Information Resource (PIR) was established in
1984 by the National Biomedical Research Foundation
(NBRF).
• The PIR Protein Sequence Database evolved from the
original NBRF Protein Sequence Database, developed over
20 years
• PIR-International is a collaboration between NBRF, the
Munich Information Center for Protein Sequences (MIPS),
and the Japan International Protein Information Database
(JIPID)
• Together, the partners collect and publish what is now the oldest and largest
database of biomolecular sequence, source, literature, and
feature information.
PIR
• PIR-International Protein Sequence Database: an annotated,
non-redundant and cross-referenced database of protein sequences.
• PIR Alignment Database, PIR-ALN: contains sequence alignments of
superfamilies, families and homology domains produced from
information in the Protein Sequence Database.
• FAMBASE Family Database: a searchable database containing a single
representative sequence from each protein family.
• RESID Database of Amino Acid Modifications: based on feature
information in the Protein Sequence Database.
PIR
• http://www-nbrf.georgetown.edu/pir/
SWISS-PROT
• http://www.ebi.ac.uk/swissprot/
• A well-annotated protein sequence database established in 1986.
• It is maintained collaboratively by the Swiss Institute of Bioinformatics
(SIB) and the European Bioinformatics Institute (EBI).
• A curated protein sequence database that provides a high level of
annotation, a minimal level of redundancy and a high level of
integration with other databases.
Note: Swiss-Prot and TrEMBL have been incorporated into UniProt
(Universal Protein Resource) as UniProtKB/Swiss-Prot and UniProtKB/TrEMBL.
UniProt is a one-stop shop allowing easy access to all publicly available
information about protein sequences.
PROSITE
• http://ca.expasy.org/prosite/
• A method of determining the function of
uncharacterized proteins translated from genomic
or cDNA sequences.
• A database of biologically significant sites and patterns
• The patterns are formulated so that, with appropriate
computational tools, they can rapidly and reliably identify
which known family of proteins (if any) a new sequence
belongs to.
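To make the idea of matching a new sequence against a pattern concrete, here is a minimal sketch that translates a pattern written in PROSITE syntax into a Python regular expression and scans a sequence with it. It covers only the basic syntax (`x`, `[...]`, `{...}`, repetition counts, and the `<` / `>` anchors at the ends of a pattern); the function names and the test sequence are made up for illustration, and this is not the scanning software PROSITE itself provides.

```python
import re
from typing import List, Tuple


def prosite_to_regex(pattern: str) -> str:
    """Translate a basic PROSITE pattern into a Python regex (illustrative sketch)."""
    parts = []
    for element in pattern.strip().rstrip(".").split("-"):
        prefix = suffix = ""
        if element.startswith("<"):        # pattern anchored at the N-terminus
            prefix, element = "^", element[1:]
        if element.endswith(">"):          # pattern anchored at the C-terminus
            suffix, element = "$", element[:-1]
        # Split "core(n)" or "core(n,m)" into the residue part and the repeat counts.
        match = re.fullmatch(r"(.+?)(?:\((\d+)(?:,(\d+))?\))?", element)
        core, low, high = match.group(1), match.group(2), match.group(3)
        if core == "x":                    # x: any amino acid
            core = "."
        elif core.startswith("{"):         # {ABC}: any amino acid except A, B, C
            core = "[^" + core[1:-1] + "]"
        # [ABC] and single residues are already valid regex syntax.
        repeat = ""
        if low:
            repeat = "{" + low + ("," + high if high else "") + "}"
        parts.append(prefix + core + repeat + suffix)
    return "".join(parts)


def find_sites(pattern: str, sequence: str) -> List[Tuple[int, str]]:
    """Return (1-based start position, matched residues) for non-overlapping hits."""
    regex = re.compile(prosite_to_regex(pattern))
    return [(m.start() + 1, m.group()) for m in regex.finditer(sequence)]


if __name__ == "__main__":
    # N-glycosylation site pattern; the sequence is an invented example.
    print(prosite_to_regex("N-{P}-[ST]-{P}"))
    print(find_sites("N-{P}-[ST]-{P}", "MKNGSAANPTLNQSW"))
```

Short patterns like this one match by chance fairly often, so a hit is a hint about function rather than proof of family membership.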
PDB
• http://www.rcsb.org/pdb/
• The single international repository for public data on the
three-dimensional structures of biological macromolecules
• Established at Brookhaven National Laboratory in the United
States
• The contents are primarily experimental data derived from
X-ray crystallography and NMR experiments
• RasMol can display the 3D structure of a biological
macromolecule from a PDB file
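Viewers such as RasMol read atomic coordinates from the fixed-column ATOM and HETATM records of a PDB file. The sketch below shows one way to pull those coordinates out with plain string slicing. The `Atom` class and the helper functions are hypothetical names, the two atoms in the demo are invented values rather than data from a real entry, and only a subset of the record's fields is read.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Atom:
    serial: int
    name: str
    res_name: str
    chain_id: str
    res_seq: int
    x: float
    y: float
    z: float


def parse_atom_line(line: str) -> Atom:
    """Read one ATOM/HETATM record using the PDB format's fixed columns."""
    return Atom(
        serial=int(line[6:11]),        # columns 7-11: atom serial number
        name=line[12:16].strip(),      # columns 13-16: atom name
        res_name=line[17:20].strip(),  # columns 18-20: residue name
        chain_id=line[21],             # column 22: chain identifier
        res_seq=int(line[22:26]),      # columns 23-26: residue sequence number
        x=float(line[30:38]),          # columns 31-38: x coordinate (angstroms)
        y=float(line[38:46]),          # columns 39-46: y coordinate
        z=float(line[46:54]),          # columns 47-54: z coordinate
    )


def parse_atoms(text: str) -> List[Atom]:
    """Collect every ATOM and HETATM record from the text of a PDB file."""
    return [parse_atom_line(line) for line in text.splitlines()
            if line.startswith(("ATOM", "HETATM"))]


def make_atom_line(serial, name, res_name, chain_id, res_seq, x, y, z) -> str:
    """Build a fixed-column ATOM line (hypothetical helper, used only for the demo)."""
    return (f"{'ATOM':<6}{serial:>5} {name:<4} {res_name:>3} {chain_id}"
            f"{res_seq:>4}{'':4}{x:8.3f}{y:8.3f}{z:8.3f}")


if __name__ == "__main__":
    # Two invented atoms of a methionine residue, followed by an END record.
    sample = "\n".join([
        make_atom_line(1, " N", "MET", "A", 1, 27.340, 24.430, 2.614),
        make_atom_line(2, " CA", "MET", "A", 1, 26.266, 25.413, 2.842),
        "END",
    ])
    for atom in parse_atoms(sample):
        print(atom)
```

Slicing by column rather than splitting on whitespace matters here, because adjacent fields in a PDB record can run together with no space between them.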