S. BITUILA
II M.Sc.
Sequence and structural databases of DNA and protein, and their significance in scientific research.
DNA Databases:
• Sequence Databases
• Structural Databases
DNA Sequence Databases:
• NCBI
• EMBL
• DDBJ
• Ensembl
• GenBank
• EBI
• UniGene
NCBI (National Center for Biotechnology Information)
• Established in 1988.
• It aims to create public databases, develop software tools for sequence analysis, and disseminate biomedical information, mainly to aid research in computational biology.
• Roles:
- Maintains several biological databases, e.g. GenBank, the nucleic acid sequence database.
- Provides a data retrieval system (e.g. Entrez).
- Provides computational resources for the analysis of GenBank data and a variety of other biological databases.
Tools available in NCBI:
• BLAST, Entrez, standard BLAST, MegaBLAST, PSI-BLAST, RPS-BLAST
• Types of databases:
- Nucleotide databases
- Literature databases
- Protein databases
- Gene expression databases
- Structural databases
- Chemical and other databases
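As a rough illustration of Entrez retrieval, the sketch below builds a request URL for the `efetch` endpoint of NCBI's E-utilities, the programmatic interface to Entrez. The helper name `efetch_url` and the example accession are illustrative assumptions, and the code only constructs the URL; it does not contact NCBI.

```python
from urllib.parse import urlencode

# Base address of the NCBI E-utilities efetch service.
EFETCH_BASE = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi"


def efetch_url(accession, db="nucleotide", rettype="fasta"):
    """Build (but do not send) an efetch URL for a single record.

    `efetch_url` is a hypothetical helper, not part of any NCBI library.
    """
    params = {
        "db": db,            # Entrez database to query
        "id": accession,     # accession or UID of the record
        "rettype": rettype,  # requested record format
        "retmode": "text",   # plain-text response
    }
    return f"{EFETCH_BASE}?{urlencode(params)}"


# Example accession chosen only for illustration.
print(efetch_url("NM_000546"))
```

The resulting URL can be opened in a browser or passed to any HTTP client to download the record in FASTA format.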
EMBL (European Molecular Biology Laboratory)
• Established in 1974, following a proposal by Leo Szilard, James Watson and John Kendrew.
• Roles:
- Incorporates, organizes and distributes nucleotide sequences from public sources.
- Performs basic research in molecular biology and medicine, and trains scientists, students and visitors.
• Tools:
- PPSearch, GeneQuiz, FASTA, DALI, BLAST2, Radar, DaliLite, etc.
DDBJ (DNA Data Bank of Japan)
• Established in 1986.
• Roles:
- Collects nucleotide sequence data and makes it freely available.
- Provides a supercomputer system to support research activities in the life sciences.
• Tools:
- getentry, SRS, TXSearch, LIBRA, GIB.
Ensembl:
• Launched in 1999 in response to the imminent completion of the Human Genome Project.
• A joint project between the European Bioinformatics Institute and the Wellcome Trust Sanger Institute.
• It aims to provide a centralized resource for geneticists, molecular biologists and other researchers studying the genomes of our own species and of other vertebrates and model organisms.
• Provides genome databases for vertebrates and other eukaryotic species.
• It is one of the well-known genome browsers for the retrieval of genomic information.
• Plays a major role in the ENCODE (Encyclopedia of DNA Elements) Consortium project.
• Tools: BLAST, Data Slicer, Variant Effect Predictor, Assembly Converter, etc.
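Besides the browser, Ensembl data can be retrieved programmatically through the Ensembl REST service. The sketch below assembles a URL for its `lookup/symbol` endpoint, which returns basic information about a gene given its symbol. The helper name `lookup_symbol_url` and the example gene symbol are assumptions made for illustration; the code builds the URL only and sends no request.

```python
from urllib.parse import quote, urlencode

# Public address of the Ensembl REST service.
ENSEMBL_REST = "https://rest.ensembl.org"


def lookup_symbol_url(symbol, species="homo_sapiens"):
    """Build (but do not send) a lookup-by-symbol URL.

    `lookup_symbol_url` is a hypothetical helper written for this example.
    """
    path = f"/lookup/symbol/{quote(species)}/{quote(symbol)}"
    # Ask the server for a JSON response.
    query = urlencode({"content-type": "application/json"})
    return f"{ENSEMBL_REST}{path}?{query}"


# Example gene symbol chosen only for illustration.
print(lookup_symbol_url("BRCA2"))
```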
GenBank:
• Started in 1982 by Walter Goad at Los Alamos National Laboratory.
• Produced and maintained by the National Center for Biotechnology Information (NCBI) as part of the International Nucleotide Sequence Database Collaboration (INSDC).
• Roles:
- An open-access, annotated collection of all publicly available nucleotide sequences and their protein translations.
- Provides and encourages access within the scientific community.
• Tools: BarSTool, Sequin, BLAST.
EBI (European Bioinformatics Institute):
• Traces its origins to the EMBL Nucleotide Sequence Data Library, established in 1980.
• EMBL-EBI is a centre for research and services in bioinformatics, and is part of the European Molecular Biology Laboratory (EMBL).
• It hosts a number of publicly open, free-to-use life science resources, including biomedical databases, analysis tools and bio-ontologies, such as:
- ArrayExpress: an archive of gene expression experiments.
- BioModels: a database of computational models relevant to the life sciences.
- BioStudies: a generic data archive at EMBL-EBI for biomolecular datasets.
- European Nucleotide Archive (ENA): a resource for nucleotide sequencing information.
UniGene:
• It is an NCBI database of the transcriptome and thus, despite its name, not primarily a database of genes.
• It provides information on protein similarities, gene expression, cDNA clones and genomic location.
DNA Structural Databases:
RNase P Database:
• A compilation of RNase P sequences, sequence alignments, secondary structures, three-dimensional models and accessory information.
• Also contains secondary structures of bacterial and archaeal RNAs, including specially annotated "reference" secondary structures of the E. coli and Bacillus subtilis RNase P RNAs, a minimum phylogenetic consensus structure, and coordinates for models of three-dimensional structure.
Protein Databases:
• Protein Sequence Databases
• Protein Structural Databases
Protein Sequence Databases:
• PIR
• Swiss-Prot
• TrEMBL
• iProClass
• Pfam
PIR (Protein Information Resource):
• Established in 1984 by the National Biomedical Research Foundation (NBRF).
• Roles:
- A source of annotated protein databases and analysis tools for researchers.
- Provides an introduction to a range of biological databases.
- Highlights the distinction between different data types and indicates where the most important resources are maintained.
- Supports genomic and proteomic research and scientific discovery.
PIR is split into four sections:
• PIR1: contains fully classified and annotated entries.
• PIR2: includes preliminary entries, which have not been thoroughly reviewed and may contain redundancy.
• PIR3: contains unverified entries, which have not been reviewed.
• PIR4: entries fall into one of four categories:
- Conceptual translations of artefactual sequences.
- Conceptual translations of sequences that are not transcribed or translated.
- Protein sequences or conceptual translations that are extensively genetically engineered.
- Sequences that are not genetically encoded and not produced on ribosomes.
Swiss-Prot:
• Founded in 1986 by Amos Bairoch, developed by the Swiss Institute of Bioinformatics and subsequently also by Rolf Apweiler at the EBI.
• Provides high-level annotations, including descriptions of the function of the protein, the structure of its domains, its post-translational modifications, variants, etc.
• Offers minimal redundancy and integration with other databases.
TrEMBL (Translated EMBL)
• Founded in 1996 as a computer-annotated supplement to Swiss-Prot.
• Contains translations of all coding sequences present in the EMBL, GenBank and DDBJ nucleotide sequence databases, as well as protein sequences extracted from the literature or submitted to Swiss-Prot.
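To make the idea of translating a coding sequence concrete, the following minimal sketch translates a DNA coding sequence into a protein sequence using the standard genetic code. It is an illustration only, not the pipeline used by TrEMBL; the function name `translate` and the toy input are assumptions, and alternative genetic codes, frame detection and ambiguity codes are not handled (unknown codons become "X").

```python
from itertools import product

# Standard genetic code, listed in the conventional T, C, A, G codon order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(cds):
    """Translate a coding sequence in frame 0, stopping at the first stop codon."""
    cds = cds.upper().replace("U", "T")  # accept RNA input as well
    protein = []
    for i in range(0, len(cds) - 2, 3):
        amino_acid = CODON_TABLE.get(cds[i:i + 3], "X")  # "X" for unknown codons
        if amino_acid == "*":  # stop codon ends the translation
            break
        protein.append(amino_acid)
    return "".join(protein)


# Toy coding sequence: ATG GCC TGG TAA
print(translate("ATGGCCTGGTAA"))
```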
iProClass (Integrated Protein Knowledgebase)
• First released in 2000.
• Provides a comprehensive description of protein family, function and structure for UniProt protein sequences.
• It contains value-added descriptions of proteins, including family relationships at global and local levels.
• Serves as a framework for data integration in a distributed networking environment.
• It can also be used to support protein sequence annotation and genomic/proteomic research, and to obtain comprehensive, up-to-date information on proteins.
Uses:
• iProClass provides two types of protein sequence report. The first covers information on genetics, gene, family, structure, function, taxonomy and literature, with cross-references to molecular databases. The second presents PIR superfamily membership information with length, taxonomy and keyword statistics.
• It also provides links to various molecular biology databases.
Pfam
• Founded in 1995 by Erik Sonnhammer, Sean Eddy and Richard Durbin as a collection of commonly occurring protein domains that could be used to annotate the protein-coding genes of multicellular animals.
• It is a database of protein families.
• Includes annotations and multiple sequence alignments of protein families, generated using hidden Markov models.
• The general purpose of the Pfam database is to provide a complete and accurate classification of protein families.
• This resource has been widely adopted by biologists because of its wide coverage of proteins and sensible naming conventions.
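The idea of scoring a sequence against a family alignment can be sketched with a much simpler model than the profile hidden Markov models Pfam actually uses. The example below builds position-specific log-odds scores from an ungapped toy alignment; it has match positions only, with no insert or delete states, so it is a simplification and not an HMM. The function names, the toy alignment, the uniform background and the pseudocount of 1 are all assumptions.

```python
import math
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def build_profile(alignment, pseudocount=1.0):
    """Return one dict of log-odds scores per alignment column.

    Scores compare column frequencies with a uniform background.
    """
    background = 1.0 / len(AMINO_ACIDS)
    total = len(alignment) + pseudocount * len(AMINO_ACIDS)
    profile = []
    for column in range(len(alignment[0])):
        counts = Counter(seq[column] for seq in alignment)
        profile.append({
            aa: math.log(((counts[aa] + pseudocount) / total) / background)
            for aa in AMINO_ACIDS
        })
    return profile


def score(profile, sequence):
    """Sum the per-position scores of a sequence of the same length."""
    if len(sequence) != len(profile):
        raise ValueError("sequence and profile lengths differ")
    return sum(column[aa] for column, aa in zip(profile, sequence))


# Toy ungapped alignment of a hypothetical family.
family = ["ACDE", "ACDE", "ACDF"]
profile = build_profile(family)
print(score(profile, "ACDE") > score(profile, "WWWW"))
```

A family member scores higher than an unrelated sequence because its residues are the ones observed most often in each column.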
Uses:
• It is used by experimental biologists researching specific proteins, by structural biologists to identify new targets for structure determination, by computational biologists to organize sequences, and by evolutionary biologists to trace the origins of proteins.
• It also allows users to submit protein or DNA sequences to search for matches to families in the database.
Protein Structural Databases:
• PDB
• CATH
• SCOP
• Gene3D
• DBAli
• E-MSD
PDB (Protein Data Bank):
• Established in 1971 at Brookhaven National Laboratory, New York.
• It is a database of three-dimensional structural data for large biological molecules, such as proteins and nucleic acids.
• Roles:
- It is a key resource in areas of structural biology, such as structural genomics.
- Provides protein structures to many other databases, e.g. SCOP and CATH.
• Tools:
- ADIT (AutoDep Input Tool), pdb_extract, OOSTAR, OpenRasMol, CIFTr, MAXIT, Biopython, mmLib, XML2PDB.
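To show what the structural data look like in practice, here is a minimal sketch that reads one `ATOM` record from the legacy fixed-column PDB file format and extracts the atom name, residue, chain and Cartesian coordinates. The function name `parse_atom_line` and the example record are assumptions made for illustration; for real work a maintained parser (for example the one in Biopython) is preferable.

```python
def parse_atom_line(line):
    """Extract selected fields from one ATOM/HETATM record.

    Slices follow the fixed column layout of the legacy PDB format
    (0-based Python indices, end-exclusive).
    """
    if not line.startswith(("ATOM", "HETATM")):
        raise ValueError("not an ATOM/HETATM record")
    return {
        "serial": int(line[6:11]),        # atom serial number
        "name": line[12:16].strip(),      # atom name
        "res_name": line[17:20].strip(),  # residue name
        "chain": line[21],                # chain identifier
        "res_seq": int(line[22:26]),      # residue sequence number
        "x": float(line[30:38]),          # orthogonal coordinates
        "y": float(line[38:46]),
        "z": float(line[46:54]),
    }


# Made-up example record, built with a format string so that every field
# lands in the right columns.
example = "ATOM  {:>5} {:<4} {:>3} {}{:>4}    {:8.3f}{:8.3f}{:8.3f}".format(
    1, " N", "MET", "A", 1, 27.340, 24.430, 2.614
)
print(parse_atom_line(example))
```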
CATH (Class, Architecture, Topology and Homology)
• Created in the mid-1990s by Professor Christine Orengo and colleagues, including Janet Thornton and David Jones, at University College London.
- It is a protein structure classification database and shares many broad features with the SCOP resource.
- It provides information on the evolutionary relationships of protein domains.
• Roles:
- Class: at this level, domains are assigned according to their secondary structure content.
- Architecture: at this level, information on the arrangement of secondary structures in three-dimensional space is used for assignment. It describes the gross secondary structure content and packing.
- Topology: encompasses both the overall shape and the connectivity of the secondary structures.
- Homology: groups domains that share more than 35% sequence identity and are thought to share a common ancestor.
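A sequence identity threshold such as the 35% mentioned above is computed from a pairwise alignment. The sketch below shows one common way to do this for two already aligned sequences: count identical residues and divide by the number of columns in which neither sequence has a gap. The function name, the gap symbol "-" and this choice of denominator are assumptions; other definitions of percent identity are also in use.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identity over columns where neither sequence has a gap."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    # Keep only columns in which both sequences carry a residue.
    pairs = [
        (a, b) for a, b in zip(aligned_a, aligned_b)
        if a != gap and b != gap
    ]
    if not pairs:
        return 0.0
    matches = sum(a == b for a, b in pairs)
    return 100.0 * matches / len(pairs)


# Toy aligned sequences: three of four columns are identical.
print(percent_identity("ACDE", "ACDF"))
```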
The four levels of the CATH hierarchy:
#   Level                    Description
1   Class                    The overall secondary structure content of the domain.
2   Architecture             High structural similarity but no evidence of homology.
3   Topology                 A large-scale grouping of topologies which share particular structural features.
4   Homologous superfamily   Indicative of a demonstrable evolutionary relationship.
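CATH superfamilies are conventionally identified by four dot-separated numbers, one per level of the hierarchy (for example 3.40.50.300). As a small sketch of how the hierarchy can be used, the code below parses such identifiers and reports the deepest level two of them have in common. The names `CathId`, `parse_cath_id` and `deepest_shared_level` are hypothetical helpers written for this example.

```python
from typing import NamedTuple, Optional

LEVELS = ("class", "architecture", "topology", "homologous superfamily")


class CathId(NamedTuple):
    c: int  # Class
    a: int  # Architecture
    t: int  # Topology
    h: int  # Homologous superfamily


def parse_cath_id(text: str) -> CathId:
    """Parse a four-part identifier such as '3.40.50.300'."""
    parts = text.strip().split(".")
    if len(parts) != 4:
        raise ValueError("expected four dot-separated numbers")
    return CathId(*(int(part) for part in parts))


def deepest_shared_level(x: CathId, y: CathId) -> Optional[str]:
    """Return the name of the deepest level shared by x and y, or None."""
    shared = None
    for level, (p, q) in zip(LEVELS, zip(x, y)):
        if p != q:
            break
        shared = level
    return shared


first = parse_cath_id("3.40.50.300")
second = parse_cath_id("3.40.50.720")
print(deepest_shared_level(first, second))
```

Two domains that agree on the first three numbers share a topology but are placed in different homologous superfamilies.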
SCOP (Structural Classification of Proteins)
• Created in 1994.
• Developed at the Centre for Protein Engineering and the Laboratory of Molecular Biology.
• Roles:
- Describes the structural and evolutionary relationships between proteins of known structure.
- Provides a broad survey of all known protein folds, detailed information about the close relatives of any particular protein, and a framework for future research and classification.
E-MSD (EBI Macromolecular Structure Database)
• Established in 1996.
• Provides clean macromolecular structure data.
• Accepts and processes depositions to the PDB.
• Transforms the PDB flat-file archive into a relational database system.
• Manages and distributes data on molecular structures in close collaboration with the PDB.
• Tools: AutoDep and EMDep.
Gene3D:
• Provides structural annotation for proteins in the CATH sequence database.
• It uses the information in CATH to predict the locations of structural domains on millions of protein sequences available in public databases.
• Provides comprehensive structural and functional annotation of most available protein sequences, including those in the UniProt, RefSeq and Integr8 resources.
Thank you

Sequence and Structural Databases of DNA and Protein, and its significance in Scientific Researches

  • 1. S.BITUILA II MSC. Sequence and structural databases of Dna and protein , and its significances in scientific researches.
  • 2. DNA Databases:  Sequence Databases  Structural Databases
  • 3. DNA Sequence Databases:  NCBI  EMBL  DDBJ  Ensembl  GenBank  EBI  UniGene
  • 4. NCBI (National Centre for Biotechnological Information)  Established in the year 1988  It aims to create public databases , develop software tools for sequence analysis and disseminate biomedical information, mainly to aid the research in computational biology.  Roles: -Maintains several biological databases eg.GenBank,the nucleic acid sequence database. -provides data retrieval system (eg.Entrez) -provides computational resources for the analysis of GenBank data and a variety of other biological databases.
  • 5. Tools available in NCBI:  BLAST,Entrez,standard BLAST,megaBLAST, mega BLAST,PSI- BLAST,RPS-BLAST  Types of Databases : -Nucleotide database -Literature database -protein database -Gene expression -Structural database -Chemical and others.
  • 6.
  • 7. EMBL(European Molecular Biology Laboratory)  Established in the year 1974 by Leo Sjilard , James Watson and John Kendrew.  Roles: -Incorporates , Organizes and Distributes nucleotide sequences from the public sources. -Performs basic researches in molecular biology and medicine as well as trains Scientists, students and visitors.  Tools: -Ppsearch,GeneQuiz,FASTA,DALI,BLAST- 2,Radar,Dali-Lite etc.
  • 8.
  • 9. DDBJ(DNA Databank of Japan)  Established in the year 1986  Roles: -Collects nucleotide sequence data and provides freely available nucleotide sequence data. -Provides supercomputer system to support research activities in Life Sciences. Tools: -Getentry,SRS,TXSearch,LIBRA,GIB.
  • 10.
  • 11. Ensemble:  Launched in the year 1999 in response to the imminent completion of the Human Genome Project.  Joint Project between the European Bioinformatics Institute and the welcome Trust Sanger Institute.  It aims to provide a centralized resource for geneticists, molecular biologists and other researchers studying the genomes of our own species and other vertebrates and model organisms.  Genome databases for vertebrates and other eukaryotic species .  It is one of the well known genome browsers for the retrieval of genomic information.  Plays a major role in ENCODE (Encyclopaedia of DNA Elements Consortium) Project.  Tools: BLAST ,Data Slicer, Variant Effect Predictor, Assembly converter etc.
  • 12. GenBank:  Started in the year 1982 by Walter Goad and Los Alamos National Laboratory.  Produced and maintained by the National Centre for Biotechnology Information (NCBI) as a part of the International Nucleotide Sequence Database Collaboration(INSDC)  Roles: -open access ,annotated collection of all publicly available nucleotide sequences and their protein translations. -Provide and encourage access within the scientific community.  Tools: Bar S Tool, Sequin, BLAST,
  • 13. EBI(European Bioinformatics Institute):  1980  EMBL-EBI is a centre for research and services in bioinformatics ,and is a part of European Molecular Biology Laboratory(EMBL)  It hosts a number of publicly open ,free to use life sciences resources ,including biomedical databases, analysis tools and bio-ontologies which includes-; - ArrayExpress -archive of gene expression experiments. - BioModels - a database of computational models relevant to the life sciences. - BioStudies -a database that serves as a generic data archive at EMBL-EBI for biomolecular datasets. -European Nucleotide Archive (ENA) – resource of Nucleotide sequencing information.
  • 15. UniGene:  It is an NCBI database of the transcriptome and thus, despite the name, not primarily a database of genes.  It provides information on protein similarities, gene expression, cDNA clones and genomic location.
  • 17. RNase P Database:  Compilation of RNase P sequences, sequence alignments, secondary structures, three-dimensional models and accessory information.  Also contains secondary structures of bacterial and archaeal RNase P RNAs, including specially annotated 'reference' secondary structures of the E. coli and Bacillus subtilis RNase P RNAs, a minimum phylogenetic consensus structure, and coordinates for models of three-dimensional structure.
  • 19. Protein Databases:  Protein Sequence Databases  Protein Structural Databases
  • 20. Protein Sequence Databases:  PIR  Swiss-Prot  TrEMBL  iProClass  Pfam
  • 21. PIR (Protein Information Resource):  Established in 1984 by the National Biomedical Research Foundation (NBRF).  Roles: -Source of annotated protein databases and analysis tools for researchers. -Provides an introduction to a range of biological databases. -Highlights the distinction between different data types and indicates where the most important resources are maintained. -Supports genomic and proteomic research and scientific discovery.
  • 22. PIR is split into four sections:  PIR1: contains fully classified and annotated entries.  PIR2: includes preliminary entries, which have not been thoroughly reviewed and may contain redundancy.  PIR3: contains unverified entries, which have not been reviewed.  PIR4: entries fall into one of four categories: -conceptual translations of artefactual sequences -conceptual translations of sequences that are not transcribed or translated -protein sequences or conceptual translations that are extensively genetically engineered -sequences that are not genetically encoded and not produced on ribosomes.
  • 24. Swiss-Prot:  Founded in the year 1986 by Amos Bairoch, developed by the Swiss Institute of Bioinformatics and subsequently developed further with Rolf Apweiler at EBI.  Provides high-level annotation, including descriptions of the function of the protein, the structure of its domains, its post-translational modifications, variants etc.  Minimal redundancy and integration with other databases.
  • 25. TrEMBL (Translated EMBL)  Founded in the year 1996 as a computer-annotated supplement to Swiss-Prot.  Contains translations of all coding sequences present in the EMBL, GenBank and DDBJ nucleotide sequence databases, and also protein sequences extracted from the literature or submitted to Swiss-Prot.
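The "translation of coding sequences" that TrEMBL relies on can be illustrated with a minimal Python sketch. It assumes the standard genetic code only, a sequence that is already in frame, and translation that stops at the first stop codon; the function name `translate` and the example sequence are made up for illustration and are not part of any TrEMBL tooling.

```python
from itertools import product

# Standard genetic code, with codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(cds: str) -> str:
    """Conceptually translate an in-frame coding sequence into protein.

    Unknown codons (e.g. containing N) become 'X'; translation stops at
    the first stop codon; a trailing partial codon is ignored.
    """
    cds = cds.upper().replace("U", "T")
    protein = []
    for i in range(0, len(cds) - len(cds) % 3, 3):
        amino_acid = CODON_TABLE.get(cds[i:i + 3], "X")
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


if __name__ == "__main__":
    # Hypothetical 4-codon sequence: Met, Ala, Lys, stop.
    print(translate("ATGGCCAAATGA"))
```

Real annotation pipelines also handle alternative genetic codes, reading frames and partial coding sequences, which this sketch deliberately leaves out.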
  • 27. iProClass (Integrated Protein Knowledgebase)  First released in 2000.  Provides a comprehensive description of protein family, function and structure for UniProt protein sequences.  It contains value-added descriptions of proteins, including family relationships at global and local levels.  Serves as a framework for data integration in a distributed networking environment.  It can also be used to support protein sequence annotation and genomic/proteomic research, and to obtain comprehensive, up-to-date information on proteins.
  • 28. Uses:  iProClass provides two types of report. The first, the protein sequence report, covers information on gene, family, structure, function, taxonomy and literature, with cross-references to molecular databases. The second presents PIR superfamily membership information with length, taxonomy and keyword statistics.  It also provides links to various molecular biology databases.
  • 30. Pfam  Founded in 1995 by Erik Sonnhammer, Sean Eddy and Richard Durbin as a collection of commonly occurring protein domains that could be used to annotate the protein-coding genes of multicellular animals.  It is a database of protein families.  Includes annotations and multiple sequence alignments of protein families, generated using hidden Markov models.  The general purpose of the Pfam database is to provide a complete and accurate classification of protein families.  It has been widely adopted by biologists because of its wide coverage of proteins and sensible naming conventions.
  • 32. Uses:  It is used by experimental biologists researching specific proteins, by structural biologists to identify new targets for structure determination, by computational biologists to organize sequences and by evolutionary biologists for tracing the origins of proteins.  It also allows users to submit protein or DNA sequences to search for matches to families in the database.
  • 33. Structural Databases of Protein:  PDB  CATH  SCOP  Gene3D  DBAli  E-MSD
  • 34. PDB (Protein Data Bank):  Established in 1971 at Brookhaven National Laboratory, New York.  It is a database of three-dimensional structural data for large biological molecules, such as proteins and nucleic acids.  Roles: -It is a key resource in areas of structural biology, such as structural genomics. -Provides protein structures to many other databases, e.g. SCOP and CATH.  Tools: -ADIT (AutoDep Input Tool), pdb_extract, OOSTAR, OpenRasMol, CIFTr, MAXIT, Biopython, mmLib, XML2PDB.
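To make "three-dimensional structural data" concrete, here is a minimal Python sketch that reads one ATOM record from the legacy fixed-column PDB text format using only the standard library. The `Atom` class, the `parse_atom_record` function and the sample line are illustrative assumptions; the column positions follow the published PDB format description, but this is not a full parser and it ignores fields such as occupancy and B-factor.

```python
from dataclasses import dataclass


@dataclass
class Atom:
    serial: int
    name: str
    res_name: str
    chain_id: str
    res_seq: int
    x: float
    y: float
    z: float
    element: str


def parse_atom_record(line: str) -> Atom:
    """Parse a single ATOM/HETATM line of a legacy PDB-format file."""
    if not line.startswith(("ATOM", "HETATM")):
        raise ValueError("not an ATOM/HETATM record")
    return Atom(
        serial=int(line[6:11]),        # columns 7-11: atom serial number
        name=line[12:16].strip(),      # columns 13-16: atom name
        res_name=line[17:20].strip(),  # columns 18-20: residue name
        chain_id=line[21].strip(),     # column 22: chain identifier
        res_seq=int(line[22:26]),      # columns 23-26: residue number
        x=float(line[30:38]),          # columns 31-38: x coordinate (angstroms)
        y=float(line[38:46]),          # columns 39-46: y coordinate
        z=float(line[46:54]),          # columns 47-54: z coordinate
        element=line[76:78].strip(),   # columns 77-78: element symbol
    )


# Made-up example record (backbone nitrogen of a methionine in chain A).
SAMPLE = "ATOM      1  N   MET A   1      27.340  24.430   2.614  1.00  9.67           N"

if __name__ == "__main__":
    print(parse_atom_record(SAMPLE))
```

For real work, a maintained library such as Biopython (listed among the tools above) is a better choice, since it also handles the newer mmCIF format, alternate locations and multi-model files.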
  • 36. CATH (Class, Architecture, Topology and Homology)  Created in the mid-1990s by Professor Christine Orengo and colleagues, including Janet Thornton and David Jones, at University College London. -It is a protein structure classification database and shares many broad features with the SCOP resource. -It provides information on the evolutionary relationships of protein domains.  Levels: -Class: at this level, domains are assigned according to their secondary structure content. -Architecture: at this level, information on the arrangement of secondary structures in three-dimensional space is used for assignment. It describes the gross arrangement and packing of secondary structures. -Topology: encompasses both the overall shape and the connectivity of secondary structures. -Homology: groups domains that are thought to share a common ancestor, for example those that share more than 35% sequence identity.
  • 37. The four levels of the CATH hierarchy (level: description):  1. Class: the overall secondary structure content of the domain.  2. Architecture: a large-scale grouping of topologies which share particular structural features.  3. Topology: high structural similarity but no evidence of homology.  4. Homologous superfamily: indicative of a demonstrable evolutionary relationship.
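The four levels map onto the dotted four-number identifiers that CATH assigns to domains, one number per level. The Python sketch below splits such an identifier into its levels and counts how many levels two domains share. The names `CathCode`, `parse_cath_code` and `shared_levels` are made up for illustration, the example codes are used only as sample input, and the class-name table covers only the four main classes.

```python
from typing import NamedTuple

# Names of the four main CATH classes (first number of the code).
CLASS_NAMES = {
    1: "Mainly Alpha",
    2: "Mainly Beta",
    3: "Alpha Beta",
    4: "Few Secondary Structures",
}


class CathCode(NamedTuple):
    cath_class: int
    architecture: int
    topology: int
    superfamily: int


def parse_cath_code(code: str) -> CathCode:
    """Split a code such as '3.40.50.300' into its C, A, T and H numbers."""
    parts = code.strip().split(".")
    if len(parts) != 4 or not all(p.isdigit() for p in parts):
        raise ValueError(f"expected four dot-separated integers, got {code!r}")
    return CathCode(*(int(p) for p in parts))


def shared_levels(a: CathCode, b: CathCode) -> int:
    """Count how many levels, from Class downwards, two codes have in common."""
    count = 0
    for x, y in zip(a, b):
        if x != y:
            break
        count += 1
    return count


if __name__ == "__main__":
    first = parse_cath_code("3.40.50.300")
    second = parse_cath_code("3.40.50.720")
    print(CLASS_NAMES[first.cath_class])   # class of the first domain
    print(shared_levels(first, second))    # same C, A and T; different H
```

Two domains that share three levels have the same topology (fold) but are placed in different homologous superfamilies, which is the distinction the hierarchy is designed to express.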
  • 39. SCOP (Structural Classification of Proteins)  Created in 1994 at the Centre for Protein Engineering and the Laboratory of Molecular Biology.  Roles: -Describes the structural and evolutionary relationships between proteins of known structure. -Provides a broad survey of all known protein folds, detailed information about the close relatives of any particular protein, and a framework for future research and classification.
  • 41. E-MSD  Established in 1996.  Provides clean macromolecular structure data.  Accepts and processes depositions to the PDB.  Transforms the PDB flat-file archive into a relational database system.  Manages and distributes data on molecular structures in close collaboration with the PDB.  Tools: AutoDep and EMDep.
  • 42. Gene3D:  Provides structural annotation for protein sequences, based on the CATH domain database.  It uses the information in CATH to predict the locations of structural domains on millions of protein sequences available in public databases.  Provides comprehensive structural and functional annotation of most available protein sequences, including those in the UniProt, RefSeq and Integr8 resources.
  • 43. References: -Bioinformatics by Sabu M Thampi -Bioinformatics by Dardel -Bioinformatics for Biologists by Dr. Murtada Alshareifi -https://bioinf.comav.upv.es