SlideShare a Scribd company logo
1 of 23
Download to read offline
GCB 2012

Computation and visualization of protein topology
graphs including ligands

Tim Schäfer, Patrick May, Ina Koch
Goethe-University Frankfurt am Main
Institute for Computer Science
Department for Molecular Bioinformatics
Contents
●

Introduction
●

●

Protein structure, ligands and graphs

Methods
●

●

Types of protein ligand graphs

●

●

Computing protein ligand graphs from PDB and DSSP data
Visualization of protein ligand graphs

Results
●

The VPLG software

●

Case studies

Computation and visualization of protein topology graphs including ligands

2
Protein structure
●

Primary structure
●

●

●

String of 20 AAs
3D: peptide bonds, steric constraints, Ramachandran

Secondary structure
●

●

super-secondary structure

●

●

local, H-bonds, SSE types
protein domains

Tertiary structure
●

●

atom coordinates

Quaternary structure
●

multi-chain proteins

Computation and visualization of protein topology graphs including ligands

3
Modeling protein structure as a graph
●

●

●

Graph G = (V, E)
Pervasive data structure in
computer science and
bioinformatics
Usage of graps to model
proteins
●

CATH [Orengo et al.]

●

SCOP [Murzin et al.]

●

TOPS+ [Gilbert, Veeramalai]

●

PTGL [May, Koch]

Computation and visualization of protein topology graphs including ligands

4
Protein ligand interactions
●

●

●

Proteins interact with their environment to fulfill their biological
function: other proteins, ligands, ...
Knowledge on PLI is important in many applications, e.g. drug
design, molecular medicine
> 10,000 different ligands occur in PDB files

Computation and visualization of protein topology graphs including ligands

[PDB Ligand Expo, 1AVD]

5
A graph model of proteins on the supersecondary structure level
●

Secondary Structure Elements (SSEs)
●

●

●

Few: 40,000 atoms => 400 residues => 40 SSEs
Automated assignment from 3D data possible (e.g., DSSP)

Protein model
●

●

undirected, labeled graph for each chain of a protein

Protein graph
●

vertices: SSEs or ligands

●

edges: contacts and spatial relations between them

●

graph types: alpha-beta-graph, alpha-graph or beta-graph

Computation and visualization of protein topology graphs including ligands

[Koch 1997, May et al. 2004]

6
Protein ligand graphs
●

Extension of protein graph model

●

Protein ligand graph G = (V, E)
●

Vertex: SSE || Ligand

●

Edge: Spatial contact (parallel, antiparallel, mixed, ligand, backbone)

Computation and visualization of protein topology graphs including ligands

7
Computing protein ligand graphs
1. Obtain sequence from PDB file

SKVVVPAQGKKITLQNGKLNVPENPIIPYIEGDGIGVDVTPAM

Computation and visualization of protein topology graphs including ligands

8
Computing protein ligand graphs
1. Obtain sequence from PDB file
2. Obtain secondary structure assignments from DSSP file

SKVVVPAQGKKITLQNGKLNVPENPIIPYIEGDGIGVDVTPAM
HHHHHHHHHH

EEEEEEE

HHHHHH EEEEEEE

Computation and visualization of protein topology graphs including ligands

9
Computing protein ligand graphs
1. Obtain sequence from PDB file
2. Obtain secondary structure assignments from DSSP file
3. Add ligand information from PDB file

SKVVVPAQGKKITLQNGKLNVPENPIIPYIEGDGIGVDVTPAMJ
HHHHHHHHHH

EEEEEEE

HHHHHH EEEEEEE

Computation and visualization of protein topology graphs including ligands

L
10
Computing protein ligand graphs
1. Obtain sequence from PDB file
2. Obtain secondary structure assignments from DSSP file
3. Add ligand information from PDB file
4. Build graph: add a vertex for each SSE

H

E

H

Computation and visualization of protein topology graphs including ligands

E

L
11
Computing protein ligand graphs
1. Obtain sequence from PDB file
2. Obtain secondary structure assignments from DSSP file
3. Add ligand information from PDB file
4. Build graph: add a vertex for each SSE
5. Compute spatial contacts and add edges between SSEs

H

E

H

Computation and visualization of protein topology graphs including ligands

E

L
12
VPLG – an open source implementation in Java

Computation and visualization of protein topology graphs including ligands

13
Overview:
Computing protein ligand graphs

Computation and visualization of protein topology graphs including ligands

14
Contact computation
●

Parser for PDB and DSSP files

●

Atom level contacts
●

●

●

vdW radius overlap, radius 2 A
complexity for n atoms is Ο (n*n)

Residue level contacts
●

●

Protein-ligand contacts
●

●

collision spheres, CA is center

collision spheres, min max center

SSE level contacts
●

rules depending on SSE type, difficult

Computation and visualization of protein topology graphs including ligands

15
Types of protein graphs
●

Alpha

●

Beta

●

Alpha-beta

●

Ligands

●

Not ness. connected
●

Connected components

●

Backbone option

Computation and visualization of protein topology graphs including ligands

16
The VPLG software
●

Command line tool

●

GUI in Java/Swing

●

Screenshots

●

Graph file formats

●

Database support

Computation and visualization of protein topology graphs including ligands

17
Protein graphs for the whole PDB –
Some statistics
●

62,364 proteins consisting of 151,025 chains

●

1.4 million helices, 1.2 million beta strands, 300k ligands

●

Contacts
●

●

Parallel: 450.000

●

Antiparallel: 870.000

●

Total non-ligand: 2.000.000

●

●

Mixed: 710.000

Ligand: 760.000 => 2.50 contacts per ligand

=> 0.76 contacts per SSE

High number of isolated ligands (~30 %) at first run
●

Atom radius too small?

●

Contact definition?

●

Small ligands (few atoms?)

Computation and visualization of protein topology graphs including ligands

18
Outlook and possible improvements
●

VPLG software
●

●

New spatial relation 'close to' for ligands

●

●

Post-processing of DSSP output (short SSEs)
RNA / DNA as ligands

Backend
●

Get Web server online

●

Protein structure comparison (not necessarily graph-based)

●

Use VPLG to get ligands into the PTGL

Computation and visualization of protein topology graphs including ligands

19
Summary

●

Ligands are part of protein graph

●

Visualization based on primary structure ordering of SSEs

●

VPLG software is open source implementation
●

●

●

Visualization of protein graphs from PDB and DSSP files
http://www.bioinformatik.uni-frankfurt.de
https://sourceforge.net/projects/vplg/

Evaluation
●

Statistics

●

Hemoglobin as an example

Computation and visualization of protein topology graphs including ligands

20
Acknowledgments
●

●

Supervisor Prof. Ina Koch
Molecular Bininformatics group at Goethe University Frankfurt
am Main (MolBI)

Computation and visualization of protein topology graphs including ligands

21
Thanks for your attention!

Questions?

Computation and visualization of protein topology graphs including ligands

22
(Appendix slides follow)

Computation and visualization of protein topology graphs including ligands

23

More Related Content

Viewers also liked

Pmi pmp-resume template-11
Pmi pmp-resume template-11Pmi pmp-resume template-11
Pmi pmp-resume template-11mission_vishvas
 
Интернет для специальности психолога - важнейшие информационные сайты
Интернет для специальности психолога - важнейшие информационные сайтыИнтернет для специальности психолога - важнейшие информационные сайты
Интернет для специальности психолога - важнейшие информационные сайтыFyisher
 
よね村の再開村についてのご提案
よね村の再開村についてのご提案よね村の再開村についてのご提案
よね村の再開村についてのご提案doku_sho
 
Почему ИграютВсе
Почему ИграютВсеПочему ИграютВсе
Почему ИграютВсеIgrayutVse
 
GelecekHane Halil Aksu - Açık İnovasyon Buluşması - Akıl Gıdıklayıcı İnovasyo...
GelecekHane Halil Aksu - Açık İnovasyon Buluşması - Akıl Gıdıklayıcı İnovasyo...GelecekHane Halil Aksu - Açık İnovasyon Buluşması - Akıl Gıdıklayıcı İnovasyo...
GelecekHane Halil Aksu - Açık İnovasyon Buluşması - Akıl Gıdıklayıcı İnovasyo...Gelecek Hane
 
Роман Кокин «Организация тестирования в больших командах»
Роман Кокин «Организация тестирования в больших командах»Роман Кокин «Организация тестирования в больших командах»
Роман Кокин «Организация тестирования в больших командах»DataArt
 
Trinh dien ho so bai day
Trinh dien ho so bai dayTrinh dien ho so bai day
Trinh dien ho so bai dayheocon020192
 

Viewers also liked (10)

0 rchel installation_os
0 rchel installation_os0 rchel installation_os
0 rchel installation_os
 
Pmi pmp-resume template-11
Pmi pmp-resume template-11Pmi pmp-resume template-11
Pmi pmp-resume template-11
 
Интернет для специальности психолога - важнейшие информационные сайты
Интернет для специальности психолога - важнейшие информационные сайтыИнтернет для специальности психолога - важнейшие информационные сайты
Интернет для специальности психолога - важнейшие информационные сайты
 
よね村の再開村についてのご提案
よね村の再開村についてのご提案よね村の再開村についてのご提案
よね村の再開村についてのご提案
 
Почему ИграютВсе
Почему ИграютВсеПочему ИграютВсе
Почему ИграютВсе
 
GelecekHane Halil Aksu - Açık İnovasyon Buluşması - Akıl Gıdıklayıcı İnovasyo...
GelecekHane Halil Aksu - Açık İnovasyon Buluşması - Akıl Gıdıklayıcı İnovasyo...GelecekHane Halil Aksu - Açık İnovasyon Buluşması - Akıl Gıdıklayıcı İnovasyo...
GelecekHane Halil Aksu - Açık İnovasyon Buluşması - Akıl Gıdıklayıcı İnovasyo...
 
Роман Кокин «Организация тестирования в больших командах»
Роман Кокин «Организация тестирования в больших командах»Роман Кокин «Организация тестирования в больших командах»
Роман Кокин «Организация тестирования в больших командах»
 
Joomla! na MS Windows
Joomla! na MS WindowsJoomla! na MS Windows
Joomla! na MS Windows
 
Trinh dien ho so bai day
Trinh dien ho so bai dayTrinh dien ho so bai day
Trinh dien ho so bai day
 
Bluetooth speaker
Bluetooth speakerBluetooth speaker
Bluetooth speaker
 

Similar to Computation and visualization of protein topology graphs including ligands

upload.pdf
upload.pdfupload.pdf
upload.pdfzohra72
 
MULISA : A New Strategy for Discovery of Protein Functional Motifs and Residues
MULISA : A New Strategy for Discovery of Protein Functional Motifs and ResiduesMULISA : A New Strategy for Discovery of Protein Functional Motifs and Residues
MULISA : A New Strategy for Discovery of Protein Functional Motifs and Residuescsandit
 
modelling assignment
modelling assignmentmodelling assignment
modelling assignmentShwetA Kumari
 
Zarlish attique 187104 project assignment modeller
Zarlish attique 187104 project assignment modellerZarlish attique 187104 project assignment modeller
Zarlish attique 187104 project assignment modellerZarlishAttique1
 
Session ii g2 overview metabolic network modeling mcc
Session ii g2 overview metabolic network modeling mccSession ii g2 overview metabolic network modeling mcc
Session ii g2 overview metabolic network modeling mccUSD Bioinformatics
 
Recent trends in bioinformatics
Recent trends in bioinformaticsRecent trends in bioinformatics
Recent trends in bioinformaticsZeeshan Hanjra
 
Exploring Large Chemical Data Sets
Exploring Large Chemical Data SetsExploring Large Chemical Data Sets
Exploring Large Chemical Data Setskylelutz
 
PaulZRXie_Nucl. Acids Res.-2013
PaulZRXie_Nucl. Acids Res.-2013PaulZRXie_Nucl. Acids Res.-2013
PaulZRXie_Nucl. Acids Res.-2013Zhongru (Paul) Xie
 
PCA-CompChem_seminar
PCA-CompChem_seminarPCA-CompChem_seminar
PCA-CompChem_seminarAnne D'cruz
 
EnrichNet: Graph-based statistic and web-application for gene/protein set enr...
EnrichNet: Graph-based statistic and web-application for gene/protein set enr...EnrichNet: Graph-based statistic and web-application for gene/protein set enr...
EnrichNet: Graph-based statistic and web-application for gene/protein set enr...Enrico Glaab
 
Neo4j_Cypher.pdf
Neo4j_Cypher.pdfNeo4j_Cypher.pdf
Neo4j_Cypher.pdfJaberRad1
 
Doctoral Thesis Dissertation 2014-03-20 @PoliMi
Doctoral Thesis Dissertation 2014-03-20 @PoliMiDoctoral Thesis Dissertation 2014-03-20 @PoliMi
Doctoral Thesis Dissertation 2014-03-20 @PoliMiDavide Chicco
 
Lecture 9 molecular descriptors
Lecture 9  molecular descriptorsLecture 9  molecular descriptors
Lecture 9 molecular descriptorsRAJAN ROLTA
 
[DigiHealth 22] Budget friendly sample sizes for genomics research - Ognjen M...
[DigiHealth 22] Budget friendly sample sizes for genomics research - Ognjen M...[DigiHealth 22] Budget friendly sample sizes for genomics research - Ognjen M...
[DigiHealth 22] Budget friendly sample sizes for genomics research - Ognjen M...DataScienceConferenc1
 

Similar to Computation and visualization of protein topology graphs including ligands (20)

upload.pdf
upload.pdfupload.pdf
upload.pdf
 
MULISA : A New Strategy for Discovery of Protein Functional Motifs and Residues
MULISA : A New Strategy for Discovery of Protein Functional Motifs and ResiduesMULISA : A New Strategy for Discovery of Protein Functional Motifs and Residues
MULISA : A New Strategy for Discovery of Protein Functional Motifs and Residues
 
modelling assignment
modelling assignmentmodelling assignment
modelling assignment
 
Zarlish attique 187104 project assignment modeller
Zarlish attique 187104 project assignment modellerZarlish attique 187104 project assignment modeller
Zarlish attique 187104 project assignment modeller
 
Session ii g2 overview metabolic network modeling mcc
Session ii g2 overview metabolic network modeling mccSession ii g2 overview metabolic network modeling mcc
Session ii g2 overview metabolic network modeling mcc
 
Recent trends in bioinformatics
Recent trends in bioinformaticsRecent trends in bioinformatics
Recent trends in bioinformatics
 
Exploring Large Chemical Data Sets
Exploring Large Chemical Data SetsExploring Large Chemical Data Sets
Exploring Large Chemical Data Sets
 
May workshop
May workshopMay workshop
May workshop
 
PaulZRXie_Nucl. Acids Res.-2013
PaulZRXie_Nucl. Acids Res.-2013PaulZRXie_Nucl. Acids Res.-2013
PaulZRXie_Nucl. Acids Res.-2013
 
May 15 workshop
May 15  workshopMay 15  workshop
May 15 workshop
 
CADD meeting 08-30-2016
CADD meeting 08-30-2016CADD meeting 08-30-2016
CADD meeting 08-30-2016
 
PCA-CompChem_seminar
PCA-CompChem_seminarPCA-CompChem_seminar
PCA-CompChem_seminar
 
EnrichNet: Graph-based statistic and web-application for gene/protein set enr...
EnrichNet: Graph-based statistic and web-application for gene/protein set enr...EnrichNet: Graph-based statistic and web-application for gene/protein set enr...
EnrichNet: Graph-based statistic and web-application for gene/protein set enr...
 
ProCheck
ProCheckProCheck
ProCheck
 
Swaati modeling
Swaati modeling Swaati modeling
Swaati modeling
 
Neo4j_Cypher.pdf
Neo4j_Cypher.pdfNeo4j_Cypher.pdf
Neo4j_Cypher.pdf
 
Doctoral Thesis Dissertation 2014-03-20 @PoliMi
Doctoral Thesis Dissertation 2014-03-20 @PoliMiDoctoral Thesis Dissertation 2014-03-20 @PoliMi
Doctoral Thesis Dissertation 2014-03-20 @PoliMi
 
Lecture 9 molecular descriptors
Lecture 9  molecular descriptorsLecture 9  molecular descriptors
Lecture 9 molecular descriptors
 
[DigiHealth 22] Budget friendly sample sizes for genomics research - Ognjen M...
[DigiHealth 22] Budget friendly sample sizes for genomics research - Ognjen M...[DigiHealth 22] Budget friendly sample sizes for genomics research - Ognjen M...
[DigiHealth 22] Budget friendly sample sizes for genomics research - Ognjen M...
 
VLife SCOPE for Lead Optimization
VLife SCOPE for Lead OptimizationVLife SCOPE for Lead Optimization
VLife SCOPE for Lead Optimization
 

Recently uploaded

How to convert PDF to text with Nanonets
How to convert PDF to text with NanonetsHow to convert PDF to text with Nanonets
How to convert PDF to text with Nanonetsnaman860154
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerThousandEyes
 
Advantages of Hiring UIUX Design Service Providers for Your Business
Advantages of Hiring UIUX Design Service Providers for Your BusinessAdvantages of Hiring UIUX Design Service Providers for Your Business
Advantages of Hiring UIUX Design Service Providers for Your BusinessPixlogix Infotech
 
Histor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slideHistor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slidevu2urc
 
08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking MenDelhi Call girls
 
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc
 
What Are The Drone Anti-jamming Systems Technology?
What Are The Drone Anti-jamming Systems Technology?What Are The Drone Anti-jamming Systems Technology?
What Are The Drone Anti-jamming Systems Technology?Antenna Manufacturer Coco
 
Data Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonData Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonAnna Loughnan Colquhoun
 
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationFrom Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationSafe Software
 
Real Time Object Detection Using Open CV
Real Time Object Detection Using Open CVReal Time Object Detection Using Open CV
Real Time Object Detection Using Open CVKhem
 
A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)Gabriella Davis
 
🐬 The future of MySQL is Postgres 🐘
🐬  The future of MySQL is Postgres   🐘🐬  The future of MySQL is Postgres   🐘
🐬 The future of MySQL is Postgres 🐘RTylerCroy
 
Slack Application Development 101 Slides
Slack Application Development 101 SlidesSlack Application Development 101 Slides
Slack Application Development 101 Slidespraypatel2
 
Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)wesley chun
 
Presentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreterPresentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreternaman860154
 
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdfThe Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdfEnterprise Knowledge
 
Breaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountBreaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountPuma Security, LLC
 
A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?Igalia
 
Understanding Discord NSFW Servers A Guide for Responsible Users.pdf
Understanding Discord NSFW Servers A Guide for Responsible Users.pdfUnderstanding Discord NSFW Servers A Guide for Responsible Users.pdf
Understanding Discord NSFW Servers A Guide for Responsible Users.pdfUK Journal
 
The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024Rafal Los
 

Recently uploaded (20)

How to convert PDF to text with Nanonets
How to convert PDF to text with NanonetsHow to convert PDF to text with Nanonets
How to convert PDF to text with Nanonets
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected Worker
 
Advantages of Hiring UIUX Design Service Providers for Your Business
Advantages of Hiring UIUX Design Service Providers for Your BusinessAdvantages of Hiring UIUX Design Service Providers for Your Business
Advantages of Hiring UIUX Design Service Providers for Your Business
 
Histor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slideHistor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slide
 
08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men
 
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
 
What Are The Drone Anti-jamming Systems Technology?
What Are The Drone Anti-jamming Systems Technology?What Are The Drone Anti-jamming Systems Technology?
What Are The Drone Anti-jamming Systems Technology?
 
Data Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonData Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt Robison
 
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationFrom Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
 
Real Time Object Detection Using Open CV
Real Time Object Detection Using Open CVReal Time Object Detection Using Open CV
Real Time Object Detection Using Open CV
 
A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)
 
🐬 The future of MySQL is Postgres 🐘
🐬  The future of MySQL is Postgres   🐘🐬  The future of MySQL is Postgres   🐘
🐬 The future of MySQL is Postgres 🐘
 
Slack Application Development 101 Slides
Slack Application Development 101 SlidesSlack Application Development 101 Slides
Slack Application Development 101 Slides
 
Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)
 
Presentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreterPresentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreter
 
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdfThe Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
 
Breaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountBreaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path Mount
 
A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?
 
Understanding Discord NSFW Servers A Guide for Responsible Users.pdf
Understanding Discord NSFW Servers A Guide for Responsible Users.pdfUnderstanding Discord NSFW Servers A Guide for Responsible Users.pdf
Understanding Discord NSFW Servers A Guide for Responsible Users.pdf
 
The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024
 

Computation and visualization of protein topology graphs including ligands

  • 1. GCB 2012 Computation and visualization of protein topology graphs including ligands Tim Schäfer, Patrick May, Ina Koch Goethe-University Frankfurt am Main Institute for Computer Science Department for Molecular Bioinformatics
  • 2. Contents ● Introduction ● ● Protein structure, ligands and graphs Methods ● ● Types of protein ligand graphs ● ● Computing protein ligand graphs from PDB and DSSP data Visualization of protein ligand graphs Results ● The VPLG software ● Case studies Computation and visualization of protein topology graphs including ligands 2
  • 3. Protein structure ● Primary structure ● ● ● String of 20 AAs 3D: peptide bonds, steric constraints, Ramachandran Secondary structure ● ● super-secondary structure ● ● local, H-bonds, SSE types protein domains Tertiary structure ● ● atom coordinates Quaternary structure ● multi-chain proteins Computation and visualization of protein topology graphs including ligands 3
  • 4. Modeling protein structure as a graph ● ● ● Graph G = (V, E) Pervasive data structure in computer science and bioinformatics Usage of graps to model proteins ● CATH [Orengo et al.] ● SCOP [Murzin et al.] ● TOPS+ [Gilbert, Veeramalai] ● PTGL [May, Koch] Computation and visualization of protein topology graphs including ligands 4
  • 5. Protein ligand interactions ● ● ● Proteins interact with their environment to fulfill their biological function: other proteins, ligands, ... Knowledge on PLI is important in many applications, e.g. drug design, molecular medicine > 10,000 different ligands occur in PDB files Computation and visualization of protein topology graphs including ligands [PDB Ligand Expo, 1AVD] 5
  • 6. A graph model of proteins on the supersecondary structure level ● Secondary Structure Elements (SSEs) ● ● ● Few: 40,000 atoms => 400 residues => 40 SSEs Automated assignment from 3D data possible (e.g., DSSP) Protein model ● ● undirected, labeled graph for each chain of a protein Protein graph ● vertices: SSEs or ligands ● edges: contacts and spatial relations between them ● graph types: alpha-beta-graph, alpha-graph or beta-graph Computation and visualization of protein topology graphs including ligands [Koch 1997, May et al. 2004] 6
  • 7. Protein ligand graphs ● Extension of protein graph model ● Protein ligand graph G = (V, E) ● Vertex: SSE || Ligand ● Edge: Spatial contact (parallel, antiparallel, mixed, ligand, backbone) Computation and visualization of protein topology graphs including ligands 7
  • 8. Computing protein ligand graphs 1. Obtain sequence from PDB file SKVVVPAQGKKITLQNGKLNVPENPIIPYIEGDGIGVDVTPAM Computation and visualization of protein topology graphs including ligands 8
  • 9. Computing protein ligand graphs 1. Obtain sequence from PDB file 2. Obtain secondary structure assignments from DSSP file SKVVVPAQGKKITLQNGKLNVPENPIIPYIEGDGIGVDVTPAM HHHHHHHHHH EEEEEEE HHHHHH EEEEEEE Computation and visualization of protein topology graphs including ligands 9
  • 10. Computing protein ligand graphs 1. Obtain sequence from PDB file 2. Obtain secondary structure assignments from DSSP file 3. Add ligand information from PDB file SKVVVPAQGKKITLQNGKLNVPENPIIPYIEGDGIGVDVTPAMJ HHHHHHHHHH EEEEEEE HHHHHH EEEEEEE Computation and visualization of protein topology graphs including ligands L 10
  • 11. Computing protein ligand graphs 1. Obtain sequence from PDB file 2. Obtain secondary structure assignments from DSSP file 3. Add ligand information from PDB file 4. Build graph: add a vertex for each SSE H E H Computation and visualization of protein topology graphs including ligands E L 11
  • 12. Computing protein ligand graphs 1. Obtain sequence from PDB file 2. Obtain secondary structure assignments from DSSP file 3. Add ligand information from PDB file 4. Build graph: add a vertex for each SSE 5. Compute spatial contacts and add edges between SSEs H E H Computation and visualization of protein topology graphs including ligands E L 12
  • 13. VPLG – an open source implementation in Java Computation and visualization of protein topology graphs including ligands 13
  • 14. Overview: Computing protein ligand graphs Computation and visualization of protein topology graphs including ligands 14
  • 15. Contact computation ● Parser for PDB and DSSP files ● Atom level contacts ● ● ● vdW radius overlap, radius 2 A complexity for n atoms is Ο (n*n) Residue level contacts ● ● Protein-ligand contacts ● ● collision spheres, CA is center collision spheres, min max center SSE level contacts ● rules depending on SSE type, difficult Computation and visualization of protein topology graphs including ligands 15
  • 16. Types of protein graphs ● Alpha ● Beta ● Alpha-beta ● Ligands ● Not ness. connected ● Connected components ● Backbone option Computation and visualization of protein topology graphs including ligands 16
  • 17. The VPLG software ● Command line tool ● GUI in Java/Swing ● Screenshots ● Graph file formats ● Database support Computation and visualization of protein topology graphs including ligands 17
  • 18. Protein graphs for the whole PDB – Some statistics ● 62,364 proteins consisting of 151,025 chains ● 1.4 million helices, 1.2 million beta strands, 300k ligands ● Contacts ● ● Parallel: 450.000 ● Antiparallel: 870.000 ● Total non-ligand: 2.000.000 ● ● Mixed: 710.000 Ligand: 760.000 => 2.50 contacts per ligand => 0.76 contacts per SSE High number of isolated ligands (~30 %) at first run ● Atom radius too small? ● Contact definition? ● Small ligands (few atoms?) Computation and visualization of protein topology graphs including ligands 18
  • 19. Outlook and possible improvements ● VPLG software ● ● New spatial relation 'close to' for ligands ● ● Post-processing of DSSP output (short SSEs) RNA / DNA as ligands Backend ● Get Web server online ● Protein structure comparison (not necessarily graph-based) ● Use VPLG to get ligands into the PTGL Computation and visualization of protein topology graphs including ligands 19
  • 20. Summary ● Ligands are part of protein graph ● Visualization based on primary structure ordering of SSEs ● VPLG software is open source implementation ● ● ● Visualization of protein graphs from PDB and DSSP files http://www.bioinformatik.uni-frankfurt.de https://sourceforge.net/projects/vplg/ Evaluation ● Statistics ● Hemoglobin as an example Computation and visualization of protein topology graphs including ligands 20
  • 21. Acknowledgments ● ● Supervisor Prof. Ina Koch Molecular Bininformatics group at Goethe University Frankfurt am Main (MolBI) Computation and visualization of protein topology graphs including ligands 21
  • 22. Thanks for your attention! Questions? Computation and visualization of protein topology graphs including ligands 22
  • 23. (Appendix slides follow) Computation and visualization of protein topology graphs including ligands 23