Even in the era of computers, life science databases are still underappreciated. Here is my presentation on biological databases, with a complete classification of the different databases.
For more presentations and work, visit:
https://www.linkedin.com/in/shradheya-r-r-gupta-54492984/
1. Biological Databases
Submitted
For the M.Sc. (BIOTECHNOLOGY) III-SEMESTER EXAM.DEC.2018 (REGULAR)
Subject
SEMINAR, SCIENTIFIC WRITING AND PRESENTATION
Submitted by
Shradheya R.R. Gupta
M.Sc. Biotechnology
Roll Number
1832128
2. 1. Content
Biological Databases
Types
A. On the basis of type:-
1. Sequence Databases
2. Structure Databases
3. Functional Databases
B. On the basis of order:-
1. Primary Databases
2. Secondary Databases
3. Composite Databases
Conclusion
3. 1. Biological Databases
Biological databases are storehouses of life science information.
Information is collected from scientific experiments, published literature, high-throughput experimental technology, and computational analysis.
4. A. On the basis of type
1. Sequence database:-
Composed of a large collection of nucleic acid and protein sequences.
BLAST is the most commonly used tool for sequence similarity searches.
Many sequence annotations are based on the results of similarity searches against previously annotated sequences.
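To make the idea of a similarity search concrete, here is a minimal sketch in Python. It is not BLAST: it only scores sequences by the short words (k-mers) they share, which is the intuition behind the word-matching step that BLAST-like tools start from. The function names and the toy database are hypothetical and used only for illustration.

```python
def kmers(seq, k=3):
    """Return the set of all overlapping words of length k in seq."""
    seq = seq.upper()
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def shared_word_score(query, subject, k=3):
    """Jaccard similarity of the two k-mer sets (0.0 = nothing shared, 1.0 = identical sets)."""
    a, b = kmers(query, k), kmers(subject, k)
    if not a or not b:
        return 0.0
    return len(a & b) / len(a | b)


def rank_hits(query, database, k=3):
    """Rank database entries (id -> sequence) by descending score against the query."""
    scores = {name: shared_word_score(query, seq, k) for name, seq in database.items()}
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Hypothetical toy database; real sequence databases hold millions of records.
toy_db = {"seq_a": "ATGGCCATTG", "seq_b": "TTTTTTTTTT"}
best_hit = rank_hits("ATGGCCATTG", toy_db)[0]
```

In this toy case the best hit is the entry identical to the query. A real search tool would also extend and score alignments and report statistical significance, which this sketch leaves out.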
5. 2. Structure database:-
The main aim is to organize and annotate protein structures.
Examples:-
1. PDB (Protein Data Bank)
2. Database of Macromolecular Movements
3. Functional database:-
Describes the physiological roles of gene products: enzyme activities, mutant phenotypes, biological pathways, etc.
Examples:-
1. KEGG PATHWAY Database
2. BRENDA
3. Reactome
4. HMDB
6. B. On the basis of order
1. Primary database:-
A primary database contains information obtained experimentally.
Experimental results are submitted directly into the database by researchers,
and the data are essentially archival in nature.
7. A. Nucleotide Primary database:-
Three chief databases store and make available raw nucleic acid sequences.
1. GenBank:-
Located in the U.S.A.
2. DDBJ:-
Located in Japan.
3. EMBL:-
Located in the U.K.
They use similar (but not identical) data formats and exchange data on a daily basis.
8. B. Protein Primary database:-
PIR-PSD is a comprehensive, non-redundant and annotated protein sequence database.
It classifies protein sequences based on the superfamily concept.
SWISS-PROT provides a high level of annotation.
Both PIR-PSD and SWISS-PROT have software that enables the user to easily search through the database to obtain only the required information.
TrEMBL contains the translations of all coding sequences present in the EMBL nucleotide database.
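As an illustration of what translating a coding sequence involves, here is a minimal sketch in Python using the standard genetic code. It is not the pipeline used to build TrEMBL, and it makes simplifying assumptions: the sequence is read from its first base in frame, translation stops at the first stop codon, and unknown codons become "X". The name translate_cds is hypothetical.

```python
from itertools import product

# Standard genetic code, with codons ordered T, C, A, G at each position.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate_cds(cds):
    """Translate a coding sequence into one-letter amino acid codes.

    Assumes the sequence starts in frame; stops at the first stop codon.
    """
    cds = cds.upper().replace("U", "T")  # accept RNA input as well
    protein = []
    for i in range(0, len(cds) - 2, 3):
        aa = CODON_TABLE.get(cds[i:i + 3], "X")  # "X" marks an unknown codon
        if aa == "*":  # stop codon
            break
        protein.append(aa)
    return "".join(protein)


# Hypothetical example sequence: Met, Ala, then a stop codon.
example_protein = translate_cds("ATGGCCTAA")
```

Real database entries also record which genetic code table and reading frame apply to each coding sequence, so a production translation step would take those as inputs.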
9. 2. Secondary database:-
Comprises data derived from the analysis of primary data.
Secondary databases have become the molecular biologist’s reference library
over the past decade.
10. A. Nucleotide Secondary database:-
UniGene automatically partitions GenBank sequences into a non-redundant set of gene-oriented clusters.
Ensembl provides a centralized resource for geneticists, molecular biologists and other researchers studying genomes.
Microbial Resource: all the focus is on one organism.
ACeDB was originally developed for the C. elegans (a nematode worm) genome project. It is a repository of sequence, genetic map and phenotypic information about C. elegans.
FlyBase covers the genome of the fruit fly D. melanogaster to a high degree of completeness and quality.
11. B. Protein Secondary database:-
InterPro is a database of protein families, domains and functional sites in which identifiable features found in known proteins can be applied to new proteins.
UniProt is a database of protein sequence and functional information.
GPCRDB is a database focused on a single protein family, the G protein-coupled receptors (GPCRs). These are transmembrane proteins used by cells to communicate with the outside world.
CluSTr (Clusters of SWISS-PROT and TrEMBL) offers an automatic classification of the entries in the SWISS-PROT and TrEMBL databases into groups of related proteins.
COGs is the Clusters of Orthologous Groups of proteins database.
12. 3. Composite database:-
It is an amalgamation of different primary database sources, which removes the need to search multiple resources.
NCBI hosts such resources for researchers.
Examples:-
1. OMIM
Catalog of human genes, genetic disorders and related literature.
2. GENE
Molecular data and literature related to genes, with extensive links to other databases.
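The idea of amalgamating several sources into one searchable resource can be sketched in Python as follows. This is only an illustration under assumed inputs, not how NCBI builds its resources: the source names, record identifiers and the choice to key records by sequence are all hypothetical.

```python
def build_composite(sources):
    """Merge several source databases into one non-redundant index.

    sources maps a source name to a dict of {record_id: sequence}.
    The result maps each distinct sequence to the list of
    (source_name, record_id) pairs where it was found.
    """
    composite = {}
    for source_name, records in sources.items():
        for record_id, sequence in records.items():
            key = sequence.upper()  # treat case variants as the same sequence
            composite.setdefault(key, []).append((source_name, record_id))
    return composite


def search(composite, sequence):
    """Look up a sequence once instead of querying every source separately."""
    return composite.get(sequence.upper(), [])


# Hypothetical sources with made-up identifiers and sequences.
sources = {
    "db_one": {"A1": "MKT"},
    "db_two": {"B7": "mkt", "B8": "MAL"},
}
composite = build_composite(sources)
hits = search(composite, "MKT")
```

Here a single lookup returns the matching records from both sources, while the duplicated sequence is stored only once in the merged index.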
13. Conclusion
The present challenges are to:-
1. Handle the huge volume of data.
2. Improve database design.
3. Develop software for database access and manipulation.
There is no doubt that bioinformatics contributes to the biological sciences and to the betterment of human lives.