Brief introduction to bioinformatics, applications of bioinformatics, protein modeling, types of protein modeling, applications of protein modeling, homology modeling (how-to), and steps in performing homology modeling in SWISS-MODEL.
2. CONTENT
• DEFINITION OF BIOINFORMATICS
• IN SILICO?
• APPLICATIONS OF BIOINFORMATICS
• DEFINITION OF PROTEIN MODELING
• TYPES OF PROTEIN MODELING
• APPLICATIONS OF PROTEIN MODELING
• WHAT IS HOMOLOGY MODELING?
• STEPS IN HOMOLOGY MODELING
• PERFORMING MODELING WITH SWISS MODEL
3. DEFINITION OF BIOINFORMATICS
• This is an integrative field that provides methodologies and software tools for analyzing biological data, particularly large and complex data sets (Bilotta et al., 2018).
• Bioinformatics combines biology, chemistry, physics, computer science, information engineering, mathematics and statistics to analyze and interpret biological data (NCBI, 2001).
4. IN SILICO?
• In silico means performed on a computer or via computer simulation.
• It is a pseudo-Latin phrase ("in silicon") that refers to analysis done with the aid of the silicon chips in a computer system.
• In silico studies have a range of applications in the biomedical world. A few are: homology modeling, drug discovery (molecular docking), lead optimization, de novo drug design, pharmacophore-based drug design, quantitative structure-activity relationship (QSAR) modeling, molecular dynamics (MD), virtual screening, in silico pharmacokinetics (ADMET), and toxicological and drug safety screening.
• See next slide.
7. APPLICATIONS OF BIOINFORMATICS
• DRUG DISCOVERY AND DESIGN: Translational bioinformatics is the field of bioinformatics that deals with drug discovery and development (AMIA, 2006). Computer-aided drug design (CADD) plays a role in pharmaceutics.
8. OTHER AREAS OF APPLICATION OF BIOINFORMATICS
• Waste management (Microbial Genome Program scientists) (Arora & Shi, 2010).
• Personalized medicine and gene therapy (Hong et al., 2012).
• Climate change studies (Sinha, 2015).
9. • N.B.: Bioinformatics has made scientific research and projects more cost-effective. It has also shortened the time that research takes, so more can now be achieved in a shorter period.
10. PROTEIN MODELING
• The molecular functionality of any living organism depends on its proteins (their structure, location, etc.) (Breda et al., 2007).
• Proteins need to be modeled in order to create medical opportunities for disease treatment.
• The 3D structures of certain proteins (e.g. membrane proteins) are not available for in vitro and in silico analysis.
[Figure: 3D structure of a protein]
11. TYPES OF PROTEIN MODELING
• There are two major ways to model proteins. They are:
• Experimentally derived methods, e.g. X-ray crystallography, nuclear magnetic resonance (NMR) spectroscopy, cryo-electron microscopy (cryo-EM), etc.
• Knowledge-based methods, e.g. homology modeling, threading/fold recognition and ab initio modeling.
12. HOMOLOGY MODELING
• This involves determining the 3D structure of a protein based on the experimentally determined structures of proteins with closely related sequences.
• It refers to constructing an atomic-resolution model of the "target" protein from its amino acid sequence and an experimental three-dimensional structure of a related homologous protein (the "template").
• It is also called comparative modeling of proteins.
13. HOMOLOGY MODELING
• It rests on the principle of sequence homology.
• If two proteins have high sequence similarity, then they are likely to have similar 3D structures.
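The idea of "sequence similarity" can be made concrete as percent identity over an existing target-template alignment. The following is a minimal sketch, not the method of any particular tool: the function name `percent_identity` and the two short sequences are hypothetical, and identity is counted only over columns where neither sequence has a gap (one of several conventions in use).

```python
def percent_identity(aligned_target: str, aligned_template: str) -> float:
    """Percent identity over aligned columns where neither sequence has a gap ('-')."""
    if len(aligned_target) != len(aligned_template):
        raise ValueError("aligned sequences must have the same length")
    compared = 0  # columns with a residue in both sequences
    matches = 0   # columns where the two residues are identical
    for a, b in zip(aligned_target, aligned_template):
        if a == "-" or b == "-":
            continue  # skip gap columns
        compared += 1
        if a == b:
            matches += 1
    return 100.0 * matches / compared if compared else 0.0


# Hypothetical aligned sequences, for illustration only.
target = "MKTAYIAK-QR"
template = "MKTAHIAKDQR"
print(percent_identity(target, template))  # 9 identical out of 10 compared columns
```

A higher value suggests the template is a better structural stand-in for the target, which is the assumption homology modeling builds on.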
15. STEPS IN HOMOLOGY MODELING
• TEMPLATE SELECTION (USING BLAST: BASIC LOCAL ALIGNMENT SEARCH TOOL)
• SEQUENCE ALIGNMENT
• BACKBONE MODEL BUILDING
• LOOP MODELING AND SIDE CHAIN REFINEMENT
• MODEL REFINEMENT USING ENERGY FUNCTION
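The sequence alignment step above can be illustrated with a toy global alignment (the Needleman-Wunsch dynamic-programming algorithm). This is a sketch under simplifying assumptions: the function name `global_align`, the sequences, and the scores (match +1, mismatch -1, linear gap -2) are chosen for illustration. Real alignment tools typically use substitution matrices such as BLOSUM62 and more elaborate gap penalties, and this is not a description of what SWISS-MODEL runs internally.

```python
def global_align(seq_a: str, seq_b: str, match: int = 1, mismatch: int = -1, gap: int = -2):
    """Needleman-Wunsch global alignment with a linear gap penalty.

    Returns (aligned_a, aligned_b, score), with '-' marking gaps.
    """
    n, m = len(seq_a), len(seq_b)

    def sub(i: int, j: int) -> int:
        # Substitution score for aligning seq_a[i-1] with seq_b[j-1].
        return match if seq_a[i - 1] == seq_b[j - 1] else mismatch

    # score[i][j] = best score for aligning seq_a[:i] with seq_b[:j].
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            score[i][j] = max(
                score[i - 1][j - 1] + sub(i, j),  # align two residues
                score[i - 1][j] + gap,            # gap in seq_b
                score[i][j - 1] + gap,            # gap in seq_a
            )

    # Trace back from the bottom-right corner to recover one optimal alignment.
    out_a, out_b = [], []
    i, j = n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + sub(i, j):
            out_a.append(seq_a[i - 1])
            out_b.append(seq_b[j - 1])
            i, j = i - 1, j - 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(seq_a[i - 1])
            out_b.append("-")
            i -= 1
        else:
            out_a.append("-")
            out_b.append(seq_b[j - 1])
            j -= 1
    return "".join(reversed(out_a)), "".join(reversed(out_b)), score[n][m]


# Hypothetical target and template fragments, for illustration only.
aligned_target, aligned_template, total = global_align("MKTA", "MTA")
print(aligned_target)
print(aligned_template)
print(total)
```

The gap characters in the result mark where the target has residues with no template counterpart, which is where loop modeling is later needed.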
18. TERMS IN HOMOLOGY MODELING
• GMQE score: Global Model Quality Estimation is a quality estimate that combines properties from the target-template alignment and the template structure. It ranges from 0 to 1, reflecting the expected accuracy of a model built with that alignment and template, normalized by the coverage of the target sequence. Higher numbers indicate higher reliability.
• QUERY COVERAGE: the percentage of the query (contig) length that aligns with the NCBI hit. A small query coverage means only a tiny portion of the query is aligning. If an alignment has 100% identity but only 5% query coverage, the sequence probably does not belong to that taxon.
• QSQE score: The Quaternary Structure Quality Estimate score is a number between 0 and 1, reflecting the expected accuracy of the interchain contacts for a model built based on a given alignment and template. In general, a higher QSQE is "better", and a value above 0.7 can be considered reliable enough to follow the predicted quaternary structure in the modeling process.
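The terms above can be tied together in a small sketch: computing query coverage from alignment coordinates, ranking candidate templates by GMQE, and applying the 0.7 QSQE guideline. Everything here is hypothetical and for illustration: the `TemplateHit` record, the function names, the template identifiers and all score values are made up and are not output from SWISS-MODEL or BLAST.

```python
from dataclasses import dataclass


def query_coverage(query_length: int, aligned_start: int, aligned_end: int) -> float:
    """Percent of the query covered by an alignment (1-based, inclusive coordinates)."""
    if query_length <= 0:
        raise ValueError("query_length must be positive")
    if not 1 <= aligned_start <= aligned_end <= query_length:
        raise ValueError("alignment coordinates must lie within the query")
    return 100.0 * (aligned_end - aligned_start + 1) / query_length


@dataclass
class TemplateHit:
    """Hypothetical record for one candidate template."""
    template_id: str
    gmqe: float  # 0 to 1, higher means a more reliable expected model
    qsqe: float  # 0 to 1, expected accuracy of interchain contacts


def rank_by_gmqe(hits: list[TemplateHit]) -> list[TemplateHit]:
    """Order candidate templates from highest to lowest GMQE."""
    return sorted(hits, key=lambda hit: hit.gmqe, reverse=True)


def quaternary_structure_reliable(hit: TemplateHit, threshold: float = 0.7) -> bool:
    """Apply the 'QSQE above 0.7' guideline from the slide."""
    return hit.qsqe > threshold


# Made-up values, for illustration only.
hits = [
    TemplateHit("tmpl_A", gmqe=0.62, qsqe=0.81),
    TemplateHit("tmpl_B", gmqe=0.78, qsqe=0.55),
]
best = rank_by_gmqe(hits)[0]
print(best.template_id, quaternary_structure_reliable(best))
print(query_coverage(query_length=200, aligned_start=11, aligned_end=20))
```

In this made-up case the highest-GMQE template falls below the QSQE guideline, so its predicted oligomeric state would be treated with caution, and the last line reproduces the 5% coverage situation described above.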
19. REFERENCES
• Bilotta, M., Tradigo, G., & Veltri, P. (2019). Bioinformatics Data Models, Representation and Storage. doi:10.1016/B978-0-12-809633-8.20410-X.
• National Center for Biotechnology Information (2001). Definition of bioinformatics.
• Arora, P., & Shi, W. (2010). Tools of Bioinformatics in Biodegradation. Reviews in Environmental Science and Bio/Technology. doi:10.1007/s11157-010-9211-x.
• Sinha, S. (2015). Role of bioinformatics in climatic change studies.
• Breda, A., Valadares, N. F., Norberto de Souza, O., et al. Protein Structure, Modelling and Applications. 2006 May 1 [Updated 2007 Sep 14]. In: Gruber, A., Durham, A. M., Huynh, C., et al., editors. Bioinformatics in Tropical Disease Research: A Practical and Case-Study Approach [Internet]. Bethesda (MD): National Center for Biotechnology Information (US); 2008. Chapter A06. Available from: https://www.ncbi.nlm.nih.gov/books/NBK6824