BIOINFORMATICS
Name: Anuja Vilas Konde
Msc II
CONTENT
What is bioinformatics?
What is data and information?
Biological databases
Types of biological databases
Retrieval of databases
Advantages of biological databases
Bioinformatics
Definition
Marriage between computer science and Molecular Biology.
Applies techniques of computer science to problems of molecular biology.
Information technology applied to analyse biological data.
Helps to gain understanding of biological data.
Plays an important role in molecular medicine, evolutionary studies, drug
development, and biotechnology.
Covers analysis of gene and protein expression, comparison of genomic data, and storage of
biological information.
Data and Information
Raw data -> processed and analysed -> information
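As a small illustration of the step from raw data to information, the sketch below takes a raw nucleotide string and derives a few summary facts from it. The function name `summarize_sequence` and the example read are made up for illustration; this is only one of many ways raw data can be processed.

```python
from collections import Counter


def summarize_sequence(raw: str) -> dict:
    """Turn a raw nucleotide string into a few summary facts."""
    # Clean the raw input: drop whitespace/newlines and normalise the case.
    seq = "".join(raw.split()).upper()
    counts = Counter(seq)
    gc = counts["G"] + counts["C"]
    return {
        "length": len(seq),
        "base_counts": {base: counts[base] for base in "ACGT"},
        "gc_fraction": gc / len(seq) if seq else 0.0,
    }


# Hypothetical raw read, as it might come out of an experiment.
raw_read = "atgc gcta\nGGCC"
info = summarize_sequence(raw_read)
print(info)
```

The raw string on its own says little; the length, base composition, and GC fraction are the processed "information" in the sense used above.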
BIOLOGICAL DATABASES
Biological: relating to living things.
Database: a collection of data in an organized manner (i.e. information).
Libraries of life sciences information, collected from scientific experiments and
stored using computational analysis.
Information can be accessed, managed, and updated.
Features:
Data heterogeneity
High volume data
Data curation
Types of biological databases
On the basis of source of data:
Primary
Secondary
On the basis of data stored:
Sequence: nucleic acid, protein
Structure: PDB, SCOP, CATH
On the basis of data sources
Primary databases:
Contain experimentally derived data, i.e. raw data
Examples of content: nucleotide sequences, protein or other macromolecular sequences
Experimental results are submitted directly into the databases.
Examples: Swiss-Prot, PIR, GenBank, DDBJ.
Secondary databases:
Contain data derived from analysing primary data, i.e. information
Examples of content: conserved regions, signature sequences, etc.
Submitted data are analysed and then stored.
Examples: SCOP, CATH
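The relationship between primary and secondary data can be sketched in a few lines: given some already-aligned primary sequences, the code derives the fully conserved positions, which is the kind of derived information a secondary database would store. The sequences and the function name `conserved_positions` are invented for illustration, and real resources use far more sophisticated methods.

```python
def conserved_positions(aligned):
    """Return (index, residue) pairs for columns where every sequence agrees."""
    if not aligned:
        return []
    width = len(aligned[0])
    if any(len(seq) != width for seq in aligned):
        raise ValueError("sequences must be aligned to the same length")
    hits = []
    # zip(*aligned) walks the alignment column by column.
    for index, column in enumerate(zip(*aligned)):
        residues = set(column)
        # A column is conserved if it holds one residue and no gap character.
        if len(residues) == 1 and "-" not in residues:
            hits.append((index, column[0]))
    return hits


# Hypothetical aligned protein fragments standing in for primary entries.
primary_entries = ["MKTAYIAK", "MKSAYLAK", "MKTAYIGK"]
conserved = conserved_positions(primary_entries)
print(conserved)
```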
On the basis of data stored
Structural databases:
Include experimentally derived structures of proteins and domains
Main aim is to organize protein structures so that the biological community can access the
information
A. PDB (Protein Data Bank):
Database of experimentally determined 3D structures of proteins
Stored about 80,000 protein structures at the time of writing
Structures are obtained mainly by NMR spectroscopy and X-ray crystallography
Easily accessible; entries can be downloaded and used
B. SCOP (Structural Classification of Proteins):
Contains information about the classification and structures of proteins
Also describes evolutionary relationships between proteins.
Contained 38,000 protein structures at the time of writing
Freely accessible over the internet
C. CATH (Class, Architecture, Topology, Homologous superfamily):
Contains information about the classification and structures of proteins.
Also gives information on bonding in proteins and on evolutionary relationships between proteins.
Contained information on 8,078 protein domains at the time of writing
Sequence databases:
Composed of large collections of nucleic acid and protein sequences stored on computers.
Mainly of two types
Nucleic acid sequence:
Contains collections of genome, gene, and transcript sequences.
Three chief databases store raw nucleic acid data and make it available to the public:
GenBank, EMBL, and DDBJ
These are referred to as primary sequence databases.
GenBank:
Located in the USA.
Accessible through the NCBI portal
Contains annotated collections of nucleotide sequences and their protein translations.
Receives sequences from more than 100,000 distinct organisms from all over the world.
EMBL (European Molecular Biology Laboratory):
Maintained by the EBI (European Bioinformatics Institute)
Comprises primary nucleotide sequences
Receives data from genome sequencing centers.
DDBJ (DNA Data Bank of Japan):
1. Located at the National Institute of Genetics (NIG).
2. The only nucleotide sequence data bank in Asia.
3. Exchanges data with GenBank and EMBL.
4. Mainly receives data from Japanese researchers.
Protein sequence:
1. Databases that include a protein’s amino acid sequence, conformation, structure, and features
such as active sites.
2. Compiled by translating DNA sequences from different gene databases.
3. An important resource because proteins mediate most biological functions.
4. Includes PIR, Swiss-Prot, PDB.
PIR (Protein Information Resource):
1. Established in 1984 by the National Biomedical Research Foundation
2. Provides a high level of annotation.
3. Contains amino acid sequences and information on protein function prediction
4. Also contains sequences of domains.
Swiss-Prot:
1. A databank provided by the Swiss Institute of Bioinformatics in collaboration with the EMBL Data Library
2. Very high-quality and consistent annotations
3. It incorporates:
A. Functions of proteins
B. Post-translational modifications such as phosphorylation and acetylation
C. Domains and sites
D. Secondary structural features and quaternary structure of the protein.
PDB (Protein Data Bank):
1. Includes sequences of proteins.
2. Helps in predicting the 3D structure of proteins.
3. Holds data derived mainly from two sources: structures determined by X-ray
crystallography and by NMR experiments
Retrieval of biological databases
Accessing the stored data of an organism or a particular gene from the databases.
When obtaining a new DNA sequence, one needs to know whether it has already been deposited in
the databanks.
Requirements for retrieval:
name of organism
name of gene
Data retrieval systems:
Entrez
SRS
BLAST
Entrez:
Molecular biology database and retrieval system
Developed by NCBI
Covers nucleotide and protein sequence data and 3D structure data
Easy to access, but offers limited information to search
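To make the retrieval requirements (organism name and gene name) concrete, here is a minimal sketch that builds an Entrez search URL for NCBI's E-utilities `esearch` endpoint. It only constructs the URL and does not contact the server. The helper name `build_esearch_url`, the default database, and the example organism and gene are assumptions chosen for illustration.

```python
from urllib.parse import urlencode

# Base address of NCBI's E-utilities web service.
EUTILS_BASE = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils"


def build_esearch_url(organism: str, gene: str,
                      db: str = "nucleotide", retmax: int = 5) -> str:
    """Build an esearch URL that filters by organism and gene name."""
    # Entrez field tags in square brackets restrict each term to one field.
    term = f'"{organism}"[Organism] AND {gene}[Gene Name]'
    params = {"db": db, "term": term, "retmax": retmax}
    return f"{EUTILS_BASE}/esearch.fcgi?{urlencode(params)}"


# Illustrative query: organism and gene are arbitrary examples.
url = build_esearch_url("Homo sapiens", "BRCA1")
print(url)
```

Opening such a URL returns a list of matching record identifiers; fetching the records themselves is a separate request.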
SRS (Sequence Retrieval System):
Home to over 80,000 biological databases
Developed by European Bioinformatics Institute (EBI)
Includes sequences, metabolic pathways, transcription factors, and conserved regions.
Provides the description of a gene and the dates on which its entry was uploaded and updated.
BLAST (Basic Local Alignment Search Tool):
Developed by NCBI
BLAST programs were designed for fast database searching.
Helps to retrieve data
Also helps in comparing primary biological sequence information
1. Raw data obtained from experiment
2. Submit that data to databases
3. Accession number is assigned
4. Enter the accession number in BLAST
5. Search
6. Find relationships among the sequences
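The idea of a fast search that finds related sequences can be sketched with a toy word-matching search. This is not the BLAST algorithm itself, only a simplified illustration of the short-word "seed" idea that makes such searches fast: database entries are ranked by how many short words (k-mers) they share with the query. The accession numbers, sequences, and function names are all made up.

```python
def kmers(seq, k):
    """Return the set of all length-k words in a sequence."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def rank_by_shared_words(query, database, k=3):
    """Rank database entries by the number of k-mers shared with the query."""
    query_words = kmers(query, k)
    scores = []
    for accession, seq in database.items():
        shared = len(query_words & kmers(seq, k))
        if shared:  # keep only entries with at least one shared word
            scores.append((accession, shared))
    # Highest score first; ties broken by accession for a stable order.
    return sorted(scores, key=lambda item: (-item[1], item[0]))


# Hypothetical mini-database keyed by invented accession numbers.
toy_db = {
    "ACC0001": "ATGGCGTACGTTAGC",
    "ACC0002": "TTTTTTTTTTTT",
    "ACC0003": "CCATGGCGTAAA",
}
hits = rank_by_shared_words("ATGGCGTAC", toy_db)
print(hits)
```

Entries that share many words with the query are candidates for a closer alignment, which is where the relationships between sequences are actually established.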
Variants of BLAST
BLASTN - Compares a DNA query to a DNA database.
BLASTP - Compares a protein query to a protein database.
BLASTX - Compares a DNA query to a protein database by translating the query in all 6 possible reading frames.
TBLASTN - Compares a protein query to a DNA database by translating the database sequences in all 6 possible reading frames.
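The "6 possible frames" used by BLASTX and TBLASTN can be shown with a short six-frame translation sketch: three reading frames on the given strand and three on its reverse complement. The code assumes the standard genetic code and plain A/C/G/T input; the function names and the example sequence are illustrative, and this is not how the BLAST programs are implemented internally.

```python
from itertools import product

# Standard genetic code, with codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna):
    """Return the reverse complement of a DNA string."""
    return dna.translate(COMPLEMENT)[::-1]


def translate(dna):
    """Translate complete codons from the start; unknown codons become 'X'."""
    codons = (dna[i:i + 3] for i in range(0, len(dna) - 2, 3))
    return "".join(CODON_TABLE.get(codon, "X") for codon in codons)


def six_frame_translation(dna):
    """Translate a DNA sequence in frames +1..+3 and -1..-3."""
    dna = dna.upper()
    frames = {}
    for strand, seq in (("+", dna), ("-", reverse_complement(dna))):
        for offset in range(3):
            frames[f"{strand}{offset + 1}"] = translate(seq[offset:])
    return frames


frames = six_frame_translation("ATGGCC")
for name, protein in frames.items():
    print(name, protein)
```

Each of the six protein strings is then compared against the protein sequences on the other side of the search.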
Advantages
Databases act as a storehouse of information.
They are used to store and organize data in such a way that information can be retrieved
easily via a variety of search criteria.
They allow knowledge discovery, which refers to the identification of connections
between pieces of information.
Databases are important tools in assisting scientists to analyze and explain a
host of biological phenomena from the structure of biomolecules and their
interaction, to the whole metabolism of organisms and to understanding the
evolution of species.
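Retrieval "via a variety of search criteria" can be sketched with a tiny in-memory database built with Python's standard `sqlite3` module. The table layout, the records, and the `search` helper are invented for illustration and do not reflect how any of the public databases are actually implemented.

```python
import sqlite3

# In-memory database with one table of hypothetical sequence entries.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE entries ("
    "accession TEXT PRIMARY KEY, organism TEXT, gene TEXT, sequence TEXT)"
)
rows = [
    ("ACC0001", "Homo sapiens", "geneA", "ATGGCGTAC"),
    ("ACC0002", "Mus musculus", "geneA", "ATGGCGTAT"),
    ("ACC0003", "Homo sapiens", "geneB", "ATGCCCTAA"),
]
conn.executemany("INSERT INTO entries VALUES (?, ?, ?, ?)", rows)


def search(connection, **criteria):
    """Return accessions matching every given column=value criterion."""
    allowed = {"accession", "organism", "gene"}
    unknown = set(criteria) - allowed
    if unknown:
        raise ValueError(f"unsupported search fields: {sorted(unknown)}")
    # Column names come from the allow-list; values are bound as parameters.
    where = " AND ".join(f"{column} = ?" for column in criteria)
    sql = "SELECT accession FROM entries"
    if where:
        sql += f" WHERE {where}"
    sql += " ORDER BY accession"
    return [row[0] for row in connection.execute(sql, tuple(criteria.values()))]


print(search(conn, organism="Homo sapiens"))
print(search(conn, gene="geneA"))
print(search(conn, organism="Homo sapiens", gene="geneA"))
```

The same stored records answer different questions depending on which criteria are combined, which is the organizing benefit described above.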
THANK YOU
