Mr. Yogesh Joshi
WCBT
Structure database: PDB
Protein Data Bank
• http://www.rcsb.org/pdb/home/home.do
• A repository for 3-D structures of biological macromolecules.
• It includes proteins and nucleic acids.
• Structures are obtained by X-ray crystallography (80%) or NMR spectroscopy (16%).
• Transferred to the Research Collaboratory for Structural Bioinformatics (RCSB) in 1998.
• Currently it holds 141616 released structures.
• It is freely accessible on the Internet via the websites of its member organisations (PDBe, PDBj, and RCSB).
• The PDB is overseen by an organization called the Worldwide Protein Data Bank (wwPDB).
• The PDB is a key resource in areas of structural biology, such as structural genomics.
• Most major scientific journals, and some funding agencies, now require scientists to submit their structure data to the PDB.
• Many other databases use protein structures deposited in the PDB. For example, SCOP and CATH classify protein structures, while PDBsum provides a graphic overview of PDB entries using information from other sources, such as Gene Ontology.
History
• Founded in 1971 at Brookhaven National Laboratory, New York.
• In October 1998, the PDB was transferred to the Research Collaboratory for Structural Bioinformatics (RCSB).
• In 2003, with the formation of the wwPDB, the PDB became an international organization.
• The founding members are PDBe (Europe), RCSB (USA), and PDBj (Japan).
Content
• The PDB database is updated weekly.
• As of 12 May 2017, the breakdown of current holdings is as follows:
• In the past, the number of structures in the PDB has grown at an approximately exponential rate, passing the 100,000-structure milestone in 2014.
PDB data formats/File Formats
• The file format initially used by the PDB was called the PDB file format.
• The PDB file format was used to hold the atomic coordinates and related information.
• In the late 1990s, the macromolecular Crystallographic Information File (mmCIF) format evolved.
• mmCIF and PDBML: a push to make structure files completely self-contained descriptions of the experiment and the details of the structure determination.
• The PDB file format is comparatively unstructured and is now considered obsolete.
PDB File Format
• Text file: you can edit it with a text editor, e.g. WordPad
• Atomic coordinates
• Rich annotation
• Citation
• Experimental method
• Biological source
• Etc.
Viewing the data
• The structure files may be viewed using one of several open-source programs, including Jmol, PyMOL, and RasMol.
• Other free, but not open-source, programs include ICM-Browser, VMD, MDL Chime, UCSF Chimera, Swiss-PDB Viewer, StarBiochem (a Java-based interactive molecular viewer with integrated search of the Protein Data Bank), Sirius, and VisProt3DS (a tool for protein visualization in 3D stereoscopic view and other modes).
• The RCSB PDB website contains an extensive list of both free and commercial molecule visualization programs and web browser plugins.
• Advanced search
• New features
• File format
PDB File Format
• A deposited set of protein coordinates becomes an entry in the PDB.
• One can search for a structure in the PDB using its four-character ID code or keywords related to its annotation.
• The identified structure can be viewed directly online or downloaded to a local computer for analysis.
• The PDB website provides options for retrieval, analysis, and direct viewing of macromolecular structures.
• It also provides links to protein structural classification results available in databases such as SCOP and CATH.
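Retrieval by ID code can be sketched in a few lines of Python. This is only an illustration: the ID pattern (a digit followed by three alphanumeric characters) and the `files.rcsb.org/download` address are assumptions not stated in the slides, and the function names are hypothetical.

```python
import re

# Assumed shape of a classic PDB ID: one digit (1-9) followed by
# three letters or digits, e.g. 1CRN.
PDB_ID_PATTERN = re.compile(r"[1-9][A-Za-z0-9]{3}")

# Assumption: RCSB file download service (not named in the slides).
DOWNLOAD_BASE = "https://files.rcsb.org/download"


def is_valid_pdb_id(code: str) -> bool:
    """Return True if the whole string looks like a four-character PDB ID."""
    return PDB_ID_PATTERN.fullmatch(code) is not None


def download_url(code: str, extension: str = "pdb") -> str:
    """Build the download address for one entry, e.g. extension 'pdb' or 'cif'."""
    if not is_valid_pdb_id(code):
        raise ValueError(f"not a four-character PDB ID: {code!r}")
    return f"{DOWNLOAD_BASE}/{code.upper()}.{extension}"


if __name__ == "__main__":
    print(download_url("1crn"))
    print(download_url("1crn", "cif"))
```

The resulting address could then be passed to any HTTP client or opened in a browser; the sketch itself makes no network request.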
• The PDB data format was designed to be FORTRAN-compatible.
• Header:
• The header section provides an overview of the protein and the quality of the structure.
• It contains information about the name of the molecule, source organism, bibliographic reference, method of structure determination, resolution, crystallographic parameters, protein sequence, cofactors, and descriptions of structure types and locations, and sometimes secondary structure information.
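Because the format is column-based, a header line can be read by slicing fixed character positions. The sketch below assumes the usual HEADER layout (classification in columns 11-50, deposition date in columns 51-59, ID code in columns 63-66); the sample record is built in code with illustrative values rather than copied from a real entry.

```python
def parse_header_record(line: str) -> dict:
    """Split a HEADER record into its fixed-column fields."""
    if not line.startswith("HEADER"):
        raise ValueError("not a HEADER record")
    line = line.ljust(80)  # pad short lines so every slice is safe
    return {
        "classification": line[10:50].strip(),   # columns 11-50
        "deposition_date": line[50:59].strip(),  # columns 51-59
        "id_code": line[62:66].strip(),          # columns 63-66
    }


# Illustrative record, assembled so that each field sits in its column range.
SAMPLE_HEADER = (
    "HEADER".ljust(10)
    + "PLANT PROTEIN".ljust(40)
    + "30-APR-81"
    + " " * 3
    + "1CRN"
)

if __name__ == "__main__":
    print(parse_header_record(SAMPLE_HEADER))
```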
• Structure coordinates: there are a specified number of columns with predetermined contents.
• The ATOM part refers to protein atom information, whereas the HETATM (for heteroatom group) part refers to atoms of cofactor or substrate molecules.
• Approximately ten columns of text and numbers are designated.
• They include information for the atom number, atom name, residue name, polypeptide chain identifier, residue number, x, y, and z Cartesian coordinates, occupancy factor, and temperature factor.
• The last two parameters, occupancy and temperature factors, relate to disorder of atomic positions in crystals.
• END
• Restriction:
• The field for the polypeptide chain identifier is only one character wide, meaning that no more than 26 chains can be used in a multisubunit protein model when chains are labeled with uppercase letters.
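The coordinate columns can be read the same way. The following is a minimal sketch, assuming the standard ATOM/HETATM column positions; the class and function names are hypothetical, and the sample line is assembled in code with illustrative values so that every field lands in its column range.

```python
from dataclasses import dataclass


@dataclass
class AtomRecord:
    record: str        # "ATOM" or "HETATM"
    serial: int        # atom number
    name: str          # atom name
    res_name: str      # residue name
    chain_id: str      # single-character chain identifier
    res_seq: int       # residue number
    x: float
    y: float
    z: float
    occupancy: float
    temp_factor: float


def parse_atom_line(line: str) -> AtomRecord:
    """Slice one ATOM/HETATM line by its fixed columns (0-based slices)."""
    record = line[0:6].strip()
    if record not in ("ATOM", "HETATM"):
        raise ValueError(f"not a coordinate record: {record!r}")
    return AtomRecord(
        record=record,
        serial=int(line[6:11]),            # columns 7-11
        name=line[12:16].strip(),          # columns 13-16
        res_name=line[17:20].strip(),      # columns 18-20
        chain_id=line[21],                 # column 22, one character only
        res_seq=int(line[22:26]),          # columns 23-26
        x=float(line[30:38]),              # columns 31-38
        y=float(line[38:46]),              # columns 39-46
        z=float(line[46:54]),              # columns 47-54
        occupancy=float(line[54:60]),      # columns 55-60
        temp_factor=float(line[60:66]),    # columns 61-66
    )


# Illustrative line built field by field to guarantee the column layout.
SAMPLE_ATOM = (
    "ATOM  " + "1".rjust(5) + " " + " N  " + " " + "THR" + " " + "A"
    + "1".rjust(4) + " " * 4
    + f"{17.047:8.3f}{14.099:8.3f}{3.625:8.3f}{1.00:6.2f}{13.79:6.2f}"
)

if __name__ == "__main__":
    print(parse_atom_line(SAMPLE_ATOM))
```

Note that `chain_id` is taken from a single character position, which is the restriction described above.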
mmCIF and MMDB Formats
• The most popular new formats include the macromolecular crystallographic information file (mmCIF) and the molecular modeling database (MMDB) file.
• Both formats are highly parsable by computer software, meaning that information in each field of a record can be retrieved separately.
• These new formats facilitate the retrieval and organization of information from database structures.
• The mmCIF format is similar to the format for a relational database in which a set of tables is used to organize database records.
• Each table or field of information is explicitly assigned by a tag and linked to other fields through a special syntax.
• A single line of description in the header section of a PDB file is divided into many lines or fields, with each field having explicit assignment of item names and item values.
• Each field starts with an underscore character followed by the category name and keyword description, separated by a period.
• Using multiple fields with tags for the same information has the advantage of providing a one-to-one relationship between item names and item values.
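The tag syntax (`_category.keyword value`) lends itself to a table-like lookup. Below is a rough sketch that handles only simple single-line items; it ignores `loop_` tables and multi-line values, so it is not a full mmCIF parser. The function name is hypothetical and the example values are illustrative, not taken from a real entry.

```python
def parse_mmcif_items(text: str) -> dict:
    """Collect simple `_category.keyword value` lines into nested dicts."""
    table = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line.startswith("_"):
            continue  # skips data_ headers, loop_ markers, data rows, blanks
        parts = line.split(None, 1)
        if len(parts) != 2:
            continue  # a tag with no value on the same line (e.g. inside loop_)
        tag, value = parts
        category, _, keyword = tag[1:].partition(".")
        # Strip surrounding quotes used for values that contain spaces.
        table.setdefault(category, {})[keyword] = value.strip().strip("'\"")
    return table


EXAMPLE = """\
data_1ABC
_entry.id 1ABC
_exptl.method 'X-RAY DIFFRACTION'
_cell.length_a 40.960
"""

if __name__ == "__main__":
    items = parse_mmcif_items(EXAMPLE)
    print(items["exptl"]["method"])
```

Each item name maps to exactly one value, which is the one-to-one relationship mentioned above.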
MMDB
• Another new format is the MMDB format, developed by the NCBI to parse and sort pieces of information in the PDB.
• The objective is to allow the information to be more easily integrated with GenBank and Medline through Entrez.
• An MMDB file is written in the ASN.1 format, in which the information in a record is structured as a nested hierarchy.
• This allows faster retrieval than mmCIF and PDB.
• Furthermore, the MMDB format includes bond connectivity information for each molecule, called a “chemical graph,” which is recorded in the ASN.1 file.
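The idea of a nested record that carries a chemical graph can be illustrated with plain Python data. This is a toy model only: the field names below are hypothetical and do not reproduce the actual MMDB ASN.1 schema.

```python
from collections import defaultdict


def build_adjacency(bonds):
    """Turn a list of (atom_id, atom_id) bonds into a neighbor lookup."""
    neighbors = defaultdict(set)
    for a, b in bonds:
        neighbors[a].add(b)
        neighbors[b].add(a)
    return {atom: sorted(partners) for atom, partners in neighbors.items()}


# Nested hierarchy: structure -> molecule -> atoms and bonds.
# A water molecule is used as the smallest useful example.
record = {
    "structure": {
        "molecules": [
            {
                "name": "water",
                "atoms": {1: "O", 2: "H", 3: "H"},
                "bonds": [(1, 2), (1, 3)],  # the chemical graph edges
            }
        ]
    }
}

if __name__ == "__main__":
    molecule = record["structure"]["molecules"][0]
    print(build_adjacency(molecule["bonds"]))
```

Storing the bonds explicitly means connectivity does not have to be inferred from interatomic distances, as it often is when only coordinates are available.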
