Primary and secondary database
1. PRIMARY AND SECONDARY BIOLOGICAL DATABASE
By
KAUSHAL KUMAR SAHU
Assistant Professor (Ad Hoc)
Department of Biotechnology
Govt. Digvijay Autonomous P. G. College
Raj-Nandgaon ( C. G. )
2. CONTENTS
• INTRODUCTION
• WHAT IS DATA AND DATABASE?
• WHAT IS BIOLOGICAL DATABASE?
• TYPES OF BIOLOGICAL DATABASE
– PRIMARY DATABASE
• Nucleic acid sequence database
• Protein sequence database
– SECONDARY DATABASE
– COMPOSITE DATABASE
– TERTIARY DATABASE
• WHY ARE BIOLOGICAL DATABASES NEEDED?
• CONCLUSION
• REFERENCES
5/11/2020
2
3. INTRODUCTION
Bioinformatics: the application of computational techniques to the management and analysis of biological data.
History:
• The first English use of the word "data" is from the 1640s.
• The word "data" was first used to mean "transmittable and storable computer information" in 1946.
• The first database was created in 1956.
• Insulin was the first protein to be sequenced.
4. DATA
• A series of observations, measurements, or facts; in computing, also called information.
DATABASE
• A large, systematized collection of data that can be expanded, updated, and retrieved rapidly for a specific purpose.
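The three operations in this definition (expand, update, retrieve) can be sketched with Python's built-in sqlite3 module. This is only a minimal illustration: the table layout, the accession numbers (DEMO001, DEMO002) and the short sequences are made-up assumptions, not records from any real biological database.

```python
import sqlite3

# Hypothetical schema and records, for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE sequences (accession TEXT PRIMARY KEY, organism TEXT, sequence TEXT)"
)

# Expand: add new records to the collection.
conn.executemany(
    "INSERT INTO sequences VALUES (?, ?, ?)",
    [("DEMO001", "Homo sapiens", "ATGGCC"), ("DEMO002", "Mus musculus", "ATGGCA")],
)

# Update: correct an existing record in place.
conn.execute(
    "UPDATE sequences SET sequence = ? WHERE accession = ?", ("ATGGCCTAA", "DEMO001")
)

# Retrieve: fetch one record rapidly by its key.
row = conn.execute(
    "SELECT organism, sequence FROM sequences WHERE accession = ?", ("DEMO001",)
).fetchone()
print(row)
```

The primary key on the accession column is what makes retrieval by accession fast; real sequence databases use far richer record formats than this three-column table.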
5. BIOLOGICAL DATABASE
• Storage of biological information (nucleic acid sequence, protein sequence and structure).
6. DEFINITION
Biological databases are computer sites that organise, store and disseminate files containing literature references, nucleic acid sequences, and protein sequences and structures.
9. Primary Database
• Stores biomolecular sequences (protein or nucleic acid) and associated annotation information (organism, species, mutations linked to particular diseases, bibliographic references, etc.).
• Primary sources are the original materials on which research is based.
• They are neither interpreted, condensed, nor evaluated by other writers.
11. NCBI
• Located in Bethesda, Maryland; founded in 1988 through legislation sponsored by Senator Claude Pepper.
• Was directed by David Lipman, one of the original authors of BLAST.
• The NCBI houses a series of databases. Ex.:
– GenBank: DNA sequences.
– PubMed (a bibliographic database): the biomedical literature.
– Other databases, such as the Epigenomics database.
12. GenBank
• Part of the International Nucleotide Sequence Database Collaboration, which comprises EMBL, DDBJ, and GenBank at NCBI.
• The database was started in 1982 by Walter Goad and Los Alamos National Laboratory.
• As of 15 August 2017, GenBank release 221.0 had 203,180,606 loci and 240,343,378,258 bases, from 203,180,606 reported sequences.
https://www.revolvy.com/main/index.php?s=GenBank
13. EMBL-EBI
• Established in 1980 at the EMBL laboratories in
Heidelberg, Germany.
• An international, innovative and interdisciplinary
research organisation funded by 23 member states and
two associate member states.
• Location: Hinxton, Cambridge, UK.
14. DDBJ
• DDBJ release 1 was provided in 1987.
• Situated in Mishima, Japan.
17. SECONDARY DATABASE
• Derived from the analysis of primary data.
• Present in the form of regular expressions (patterns), fingerprints, or blocks.
• Examples of secondary databases: PROSITE, PRINTS.
18. PROSITE
• Consists of entries describing protein families, domains and functional sites, as well as the amino acid patterns and profiles in them.
• Complemented by ProRule, a collection of rules based on profiles and patterns.
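Because PROSITE patterns are essentially regular expressions over amino acids, a pattern can be translated into a Python regex and scanned against a sequence. The sketch below is a simplified converter written for this illustration (the function name prosite_to_regex is hypothetical and it is not PROSITE's own software). It handles the common syntax elements: x for any residue, [..] for allowed residues, {..} for excluded residues, (n) or (n,m) for repetition, and < or > for sequence ends. The example pattern N-{P}-[ST]-{P} is the well-known N-glycosylation site; the test sequence is made up.

```python
import re

def prosite_to_regex(pattern: str) -> str:
    """Translate a PROSITE-style pattern into a Python regular expression.

    Simplified sketch: assumes '<' and '>' appear only at the ends of elements.
    """
    parts = []
    for element in pattern.rstrip(".").split("-"):
        anchor_start = element.startswith("<")
        anchor_end = element.endswith(">")
        element = element.strip("<>")
        # Split off an optional repetition suffix such as (3) or (2,4).
        m = re.fullmatch(r"(.+?)(?:\((\d+(?:,\d+)?)\))?", element)
        core, repeat = m.group(1), m.group(2)
        if core == "x":
            core = "."                      # any amino acid
        elif core.startswith("{"):
            core = "[^" + core[1:-1] + "]"  # any amino acid except these
        # A single letter or a [..] set is already valid regex syntax.
        if repeat:
            core += "{" + repeat + "}"
        parts.append(("^" if anchor_start else "") + core + ("$" if anchor_end else ""))
    return "".join(parts)

regex = prosite_to_regex("N-{P}-[ST]-{P}")
hits = [(m.start(), m.group()) for m in re.finditer(regex, "MKNGSAANPTQ")]
print(regex, hits)
```

In the made-up sequence, the site NGSA matches while NPTQ does not, because a proline directly after the asparagine is excluded by {P}.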
19. PRINTS
• Collection of protein motif fingerprints.
• The motifs do not overlap, but are separated along a
sequence, though they may be contiguous in 3D-space.
• Fingerprints can encode protein folds and functionalities
more flexibly and powerfully than can single motifs, full
diagnostic potency deriving from the mutual context
provided by motif neighbours.
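The idea of a fingerprint as a group of non-overlapping motifs that must all occur in order can be sketched as follows. This is a deliberately simplified assumption: here each motif is a short, fixed-length regular expression, whereas PRINTS itself derives its motifs from aligned sequences. The function name match_fingerprint, the three motifs and the test sequences are all hypothetical.

```python
import re

def match_fingerprint(sequence, motifs):
    """Return the start position of each motif if all motifs occur in the
    given order without overlapping; return None if any motif is missing."""
    positions = []
    cursor = 0
    for motif in motifs:
        hit = re.compile(motif).search(sequence, cursor)
        if hit is None:
            return None          # one missing motif breaks the fingerprint
        positions.append(hit.start())
        cursor = hit.end()       # the next motif must start after this one ends
    return positions

# Hypothetical three-motif fingerprint and made-up sequences.
fingerprint = ["GDSGGP", "C.W", "HR"]
print(match_fingerprint("MKGDSGGPLVCAWTTHRQ", fingerprint))
print(match_fingerprint("MKHRCAWGDSGGP", fingerprint))
```

The first sequence contains all three motifs in order, so it yields one position per motif; the second contains the same motifs in the wrong order and is rejected. This is the sense in which neighbouring motifs give each other context that a single motif lacks.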
20. COMPOSITE DATABASE
• Represent an amalgamation of several primary database sources and are easy to use.
• Users can access all the relevant information from a single source rather than connecting to multiple resources.
Ex.: NCBI, UniProt, etc.
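A composite database can be pictured as one merged index built over several primary sources. The sketch below uses two made-up sources with hypothetical accession numbers and toy sequence fragments. Keying the merged index by sequence, so that identical sequences from different sources collapse into a single entry, is an assumption of this illustration and not a description of how NCBI or UniProt are built.

```python
from collections import defaultdict

# Hypothetical toy "primary sources": accession -> record.
source_a = {"A1": {"name": "insulin", "sequence": "GIVEQ"}}
source_b = {
    "B7": {"name": "insulin", "sequence": "GIVEQ"},
    "B9": {"name": "lysozyme", "sequence": "KVFGR"},
}

def build_composite(sources):
    """Merge several sources into one index keyed by sequence, so identical
    sequences from different sources end up in the same entry."""
    composite = defaultdict(list)
    for source_name, records in sources.items():
        for accession, record in records.items():
            composite[record["sequence"]].append(
                (source_name, accession, record["name"])
            )
    return dict(composite)

composite = build_composite({"source_a": source_a, "source_b": source_b})
print(composite["GIVEQ"])
```

A single lookup now returns the entries from both sources, which is the convenience a composite database offers over querying each primary source separately.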
21. CONCLUSION
• Bioinformatics is the application of information technology to store and organize biological data and to make it available in computer-readable form.
• We can easily analyze the vast amount of biological data available in the form of sequences and structures of proteins (the building blocks of organisms) and nucleic acids (the information carriers).
• The need for storing and communicating large datasets has grown.
• Biological databases make biological data available to scientists.