SlideShare uma empresa Scribd logo
1 de 30
Bringing cheminformatics toolkits into tune Noel M. O’Boyle OpenBabel May 2011 Molecular Informatics Open Source Software EMBL-EBI, Cambridge, UK
Toolkits, toolkits and more toolkits Commercial cheminformatics toolkits:
Toolkits, toolkits and more toolkits Open Source cheminformatics toolkits: CDK OpenBabel OASA PerlMol
The importance of being interoperable Good for users Can take advantage of complementary features CDK:Gasteigerπ charges, maximal common substructure, shape similarity with ultrafast shape descriptors, mass-spectrometry analysis RDKit:RECAP fragmentation, calculation of R/S, atom pair fingerprints, shape similarity with volume overlap OpenBabel:several forcefields, crystallography, large number of file formats, conformer searching, InChIKey
The importance of being interoperable Good for users Can take advantage of complementary features Can choose between different implementations Faster SMARTS searching, better 2D depiction, more accurate 3D structure generation Avoid vendor lock-in Good for developers Less reinvention of wheel, more time to spend on development of complementary features Avoid balkanisation of field Bigger pool of users
J. Chem. Inf. Model., 2006, 46, 991 http://www.blueobelisk.org
J. Chem. Inf. Model., 2006, 46, 991 http://www.blueobelisk.org
Bringing it all together with Cinfony Different languages Java (CDK, OPSIN), C++ (Open Babel, RDKit, Indigo) Use Python, a higher-level language that can bridge to both Different APIs Each toolkit uses different commands to carry out the same tasks Implement a common API Different chemical models Different internal representation of a molecule Use existing method for storage and transfer of chemical information: chemical file formats MDL mol file for 2D and 3D, SMILES for 0D
Cinfony API
One API to rule them all Example - create a Molecule from a SMILES string: mol = Chem.MolFromSmiles(SMILESstring) mol = Indigo.loadMolecule(SMILESstring) mol = openbabel.OBMol() obconversion = openbabel.OBConversion() obconversion.SetInFormat("smi") obconversion.ReadString(mol, SMILESstring) OpenBabel CDK builder = cdk.DefaultChemObjectBuilder.getInstance() sp = cdk.smiles.SmilesParser(builder) mol = sp.parseSmiles(SMILESstring) RDKit Indigo mol = toolkit.readstring("smi", SMILESstring) wheretoolkit is either obabel,cdk, indyorrdk
Design of Cinfony API API is small (“fits your brain”) Covers core functionality of toolkits Corollary: need to access underlying toolkit for additional functionality Makes it easy to carry out common tasks API is stable Make it easy to find relevant methods Example: add hydrogens to a molecule atommanip = cdk.tools.manipulator.AtomContainerManipulatoratommanip.convertImplicitToExplicitHydrogens(molecule) CDK molecule.addh()
cinfony.toolkit
cinfony.toolkit.Molecule
Examples of use Chemistry Toolkit Rosetta http://ctr.wikia.com Andrew Dalke
Combining toolkits >>> from cinfony import rdk, cdk, obabel>>> obabelmol = obabel.readstring("smi", "CCC")>>> rdkmol = rdk.Molecule(obabelmol)>>> rdkmol.draw(show=False, filename="propane.png")>>> print cdk.Molecule(rdkmol).calcdesc(){'chi0C': 2.7071067811865475, 'BCUT.4': 4.4795252101839402, 'rotatableBondsCount': 2, 'mde.9': 0.0, 'mde.8': 0.0, ... } Import Cinfony Read in a molecule from a SMILES string with Open Babel Convert it to an RDKit Molecule Create a 2D depiction of the molecule with RDKit Convert it to a CDK Molecule and calculate descriptor values
Comparing toolkits >>> fromcinfonyimportrdk, cdk, obabel, indy, webel>>> for toolkit in [rdk, cdk, obabel, indy, webel]: ...     mol = toolkit.readstring("smi", "CCC") ...     printmol.molwt ...     mol.draw(filename="%s.png" % toolkit.__name__) Import Cinfony For each toolkit... ... Read in a molecule from a SMILES string ... Print its molecular weight ... Create a 2D depiction Useful for sanity checks, identifying limitations, bugs Calculating the molecular weight (http://tinyurl.com/chemacs3) implicit hydrogen, isotopes Comparison of descriptor values (http://tinyurl.com/chemacs2) Should be highly correlated Comparison of depictions (http://tinyurl.com/chemacs1)
Cinfony and the Web
Webel - Chemistry for Web 2.0 Webel is a Cinfony module that runs entirely using web services CDK webservices by RajarshiGuha, hosted by Ola Spjuth at Uppsala University NCI/CADD Chemical Identifier Resolver by Markus Sitzmann (uses Cactvs for much of backend) Easy to install – no dependencies Can be used in environments where installing a cheminformatics toolkit is not possible Web services may provide additional services not available elsewhere Example: how similar is aspirin to Dr. Scholl’s Wart Remover Kit? >>> from cinfony import webel >>> aspirin = webel.readstring("name", "aspirin")>>> wartremover = webel.readstring("name", ...                     "Dr. Scholl’s Wart Remover Kit")>>> print aspirin.calcfp() | wartremover.calcfp() 0.59375
Webel - Chemistry for Web 2.0 Webel is a Cinfony module that runs entirely using web services CDK webservices by RajarshiGuha, hosted by Ola Spjuth at Uppsala University NCI/CADD Chemical Identifier Resolver by Markus Sitzmann (uses Cactvs for much of backend) Easy to install – no dependencies Can be used in environments where installing a cheminformatics toolkit is not possible Web services may provide additional services not available elsewhere Example: how similar is aspirin to Dr. Scholl’s Wart Remover Kit? >>> from cinfony import webel >>> aspirin = webel.readstring("name", "aspirin")>>> wartremover = webel.readstring("name", ...                     "Dr. Scholl’s Wart Remover Kit")>>> print aspirin.calcfp() | wartremover.calcfp() 0.59375
Cheminformatics in the browser See http://tinyurl.com/cm7005-b or just Google “webelsilverlight”
makes it easy to... Start using a new toolkit Carry out common tasks Combine functionality from different toolkits Compare results from different toolkits Do cheminformatics through the web, and on the web
Food for thought Inclusion of cheminformatics toolkits in Linux distributions “apt-get install cinfony” DebiChem can help Binary versions for Linux API stability – and associated version numbering Needed to handle dependencies “Sorry - This version of Cinfony will work only with the 1.2.x series of Toolkit Y” What other toolkits or functionality should Cinfony support? Would be nice if various toolkits promoted Cinfony Even nicer if they ran the test suite and fixed problems, and added in new features (new fps, etc.)! Using Cinfony, it’s easy for toolkits to test against other toolkits Quality Control RDKit - Java bindings on Windows Licensing of Cinfony’s components Related point: Science is BSD Let’s support Python 3 already
Bringing cheminformatics toolkits into tune Chem. Cent. J., 2008, 2, 24. http://cinfony.googlecode.com http://baoilleach.blogspot.com Acknowledgements CDK:EgonWillighagen, RajarshiGuha Open Babel: Chris Morley, Tim Vandermeersch RDKit: Greg Landrum Indigo: Dmitry Pavlov OASA: Beda Kosata OPSIN: Daniel Lowe JPype: Steve Ménard Chemical Identifier Resolver: Markus Sitzmann Interactive Tutorial: Michael Foord Image: Tintin44 (Flickr)
Cheminformatics in the browser As Webel is pure Python, it can run places where traditional cheminformatics software cannot... ...such as in a web browser Microsoft have developed a browser plugin called Silverlight for developing applications for the web It includes a Python interpreter (IronPython) So you can use Webel in Silverlight applications Michael Foord has developed an interactive Python tutorial using Silverlight See http://ironpython.net/tutorial/ I have combined this with Webel to develop an interactive Cheminformatics tutorial
Performance
import this VirtualBox, Double click on MIOSS Applications/Accessories/Terminal user:~$ cd apps/cinfony user:~/apps/cinfony$ ./myjython.sh Jython 2.5.2 (Release_2_5_2:7206, Mar 2 2011) >>> from cinfony import cdk, indy, opsin, webel >>> See API and “How to Use” at http://cinfony.googlecode.com

Mais conteúdo relacionado

Destaque

Presentación sobre .com
Presentación sobre .comPresentación sobre .com
Presentación sobre .com
mekhael
 
Agenda calendario 2013
Agenda calendario 2013Agenda calendario 2013
Agenda calendario 2013
Poder Judicial
 
User experience calendar
User experience calendar User experience calendar
User experience calendar
Rinco Cheng
 
Google Analytics, Seo, Social Media
Google Analytics, Seo, Social MediaGoogle Analytics, Seo, Social Media
Google Analytics, Seo, Social Media
Johaeli92
 
Tips on how to get followers on pinterest
Tips on how to get followers on pinterestTips on how to get followers on pinterest
Tips on how to get followers on pinterest
micheal235
 
Understanding the Bible Intro - Session 1
Understanding the Bible Intro - Session 1Understanding the Bible Intro - Session 1
Understanding the Bible Intro - Session 1
techhelper
 

Destaque (19)

Spreadsheets
SpreadsheetsSpreadsheets
Spreadsheets
 
Presentación sobre .com
Presentación sobre .comPresentación sobre .com
Presentación sobre .com
 
A simulated annealing algorithm for solving the school bus
A  simulated annealing algorithm for solving the school busA  simulated annealing algorithm for solving the school bus
A simulated annealing algorithm for solving the school bus
 
Agenda calendario 2013
Agenda calendario 2013Agenda calendario 2013
Agenda calendario 2013
 
How evo used data-driven personalization to acquire more traffic and retain m...
How evo used data-driven personalization to acquire more traffic and retain m...How evo used data-driven personalization to acquire more traffic and retain m...
How evo used data-driven personalization to acquire more traffic and retain m...
 
User experience calendar
User experience calendar User experience calendar
User experience calendar
 
RESUME2 PATRICIA HOWE DIRENZO
RESUME2 PATRICIA HOWE DIRENZORESUME2 PATRICIA HOWE DIRENZO
RESUME2 PATRICIA HOWE DIRENZO
 
The Inner Game of Success - Positivity Place second presentation)
The Inner Game of Success - Positivity Place second presentation)The Inner Game of Success - Positivity Place second presentation)
The Inner Game of Success - Positivity Place second presentation)
 
Dica como investir num negócio lucrativo
Dica   como investir num negócio lucrativo Dica   como investir num negócio lucrativo
Dica como investir num negócio lucrativo
 
デジタルサイネージ×緊急地震速報の可能性
デジタルサイネージ×緊急地震速報の可能性デジタルサイネージ×緊急地震速報の可能性
デジタルサイネージ×緊急地震速報の可能性
 
Ready for the digital workplace?
Ready for the digital workplace?Ready for the digital workplace?
Ready for the digital workplace?
 
Google Analytics, Seo, Social Media
Google Analytics, Seo, Social MediaGoogle Analytics, Seo, Social Media
Google Analytics, Seo, Social Media
 
Tips on how to get followers on pinterest
Tips on how to get followers on pinterestTips on how to get followers on pinterest
Tips on how to get followers on pinterest
 
Schedule
ScheduleSchedule
Schedule
 
Zaragoza turismo-51
Zaragoza turismo-51Zaragoza turismo-51
Zaragoza turismo-51
 
URL to IRL: Networking
URL to IRL: NetworkingURL to IRL: Networking
URL to IRL: Networking
 
Understanding the Bible Intro - Session 1
Understanding the Bible Intro - Session 1Understanding the Bible Intro - Session 1
Understanding the Bible Intro - Session 1
 
Fatla Bloque Cierre
Fatla Bloque CierreFatla Bloque Cierre
Fatla Bloque Cierre
 
Anexo 1.-propuesta-técnica-y-económica
Anexo 1.-propuesta-técnica-y-económicaAnexo 1.-propuesta-técnica-y-económica
Anexo 1.-propuesta-técnica-y-económica
 

Semelhante a Cinfony - Bring cheminformatics toolkits into tune

Building Server Applications Using ObjectiveC And GNUstep
Building Server Applications Using ObjectiveC And GNUstepBuilding Server Applications Using ObjectiveC And GNUstep
Building Server Applications Using ObjectiveC And GNUstep
guest9efd1a1
 

Semelhante a Cinfony - Bring cheminformatics toolkits into tune (20)

Cinfony - Combining disparate cheminformatics resources into a single toolkit
Cinfony - Combining disparate cheminformatics resources into a single toolkitCinfony - Combining disparate cheminformatics resources into a single toolkit
Cinfony - Combining disparate cheminformatics resources into a single toolkit
 
A DevOps guide to Kubernetes
A DevOps guide to KubernetesA DevOps guide to Kubernetes
A DevOps guide to Kubernetes
 
Common primitives in Docker environments
Common primitives in Docker environmentsCommon primitives in Docker environments
Common primitives in Docker environments
 
The DevOps paradigm - the evolution of IT professionals and opensource toolkit
The DevOps paradigm - the evolution of IT professionals and opensource toolkitThe DevOps paradigm - the evolution of IT professionals and opensource toolkit
The DevOps paradigm - the evolution of IT professionals and opensource toolkit
 
The DevOps Paradigm
The DevOps ParadigmThe DevOps Paradigm
The DevOps Paradigm
 
Docker Enables DevOps - Boston
Docker Enables DevOps - BostonDocker Enables DevOps - Boston
Docker Enables DevOps - Boston
 
Openstack Summit Container Day Keynote
Openstack Summit Container Day KeynoteOpenstack Summit Container Day Keynote
Openstack Summit Container Day Keynote
 
Cloud Native Dünyada CI/CD
Cloud Native Dünyada CI/CDCloud Native Dünyada CI/CD
Cloud Native Dünyada CI/CD
 
Rajiv Profile
Rajiv ProfileRajiv Profile
Rajiv Profile
 
Provisioning Toolchain Introduction for Velocity Online Conference (March 2010)
Provisioning Toolchain Introduction for Velocity Online Conference (March 2010)Provisioning Toolchain Introduction for Velocity Online Conference (March 2010)
Provisioning Toolchain Introduction for Velocity Online Conference (March 2010)
 
The Nuxeo Way: leveraging open source to build a world-class ECM platform
The Nuxeo Way: leveraging open source to build a world-class ECM platformThe Nuxeo Way: leveraging open source to build a world-class ECM platform
The Nuxeo Way: leveraging open source to build a world-class ECM platform
 
UniK - a unikernel compiler and runtime
UniK - a unikernel compiler and runtimeUniK - a unikernel compiler and runtime
UniK - a unikernel compiler and runtime
 
Accelerate Your Automation Testing Effort using TestProject & Docker | Docker...
Accelerate Your Automation Testing Effort using TestProject & Docker | Docker...Accelerate Your Automation Testing Effort using TestProject & Docker | Docker...
Accelerate Your Automation Testing Effort using TestProject & Docker | Docker...
 
DockerCon SF 2015: Keynote Day 1
DockerCon SF 2015: Keynote Day 1DockerCon SF 2015: Keynote Day 1
DockerCon SF 2015: Keynote Day 1
 
Continuous Delivery With Containers
Continuous Delivery With ContainersContinuous Delivery With Containers
Continuous Delivery With Containers
 
The Ring programming language version 1.7 book - Part 75 of 196
The Ring programming language version 1.7 book - Part 75 of 196The Ring programming language version 1.7 book - Part 75 of 196
The Ring programming language version 1.7 book - Part 75 of 196
 
Extending DevOps to Big Data Applications with Kubernetes
Extending DevOps to Big Data Applications with KubernetesExtending DevOps to Big Data Applications with Kubernetes
Extending DevOps to Big Data Applications with Kubernetes
 
The Ring programming language version 1.8 book - Part 77 of 202
The Ring programming language version 1.8 book - Part 77 of 202The Ring programming language version 1.8 book - Part 77 of 202
The Ring programming language version 1.8 book - Part 77 of 202
 
Elyra - a set of AI-centric extensions to JupyterLab Notebooks.
Elyra - a set of AI-centric extensions to JupyterLab Notebooks.Elyra - a set of AI-centric extensions to JupyterLab Notebooks.
Elyra - a set of AI-centric extensions to JupyterLab Notebooks.
 
Building Server Applications Using ObjectiveC And GNUstep
Building Server Applications Using ObjectiveC And GNUstepBuilding Server Applications Using ObjectiveC And GNUstep
Building Server Applications Using ObjectiveC And GNUstep
 

Mais de baoilleach

Universal Smiles: Finally a canonical SMILES string
Universal Smiles: Finally a canonical SMILES stringUniversal Smiles: Finally a canonical SMILES string
Universal Smiles: Finally a canonical SMILES string
baoilleach
 
What's New and Cooking in Open Babel 2.3.2
What's New and Cooking in Open Babel 2.3.2What's New and Cooking in Open Babel 2.3.2
What's New and Cooking in Open Babel 2.3.2
baoilleach
 
Large-scale computational design and selection of polymers for solar cells
Large-scale computational design and selection of polymers for solar cellsLarge-scale computational design and selection of polymers for solar cells
Large-scale computational design and selection of polymers for solar cells
baoilleach
 
Improving the quality of chemical databases with community-developed tools (a...
Improving the quality of chemical databases with community-developed tools (a...Improving the quality of chemical databases with community-developed tools (a...
Improving the quality of chemical databases with community-developed tools (a...
baoilleach
 
Why multiple scoring functions can improve docking performance - Testing hypo...
Why multiple scoring functions can improve docking performance - Testing hypo...Why multiple scoring functions can improve docking performance - Testing hypo...
Why multiple scoring functions can improve docking performance - Testing hypo...
baoilleach
 
Improving enrichment rates
Improving enrichment ratesImproving enrichment rates
Improving enrichment rates
baoilleach
 
The Blue Obelisk community
The Blue Obelisk communityThe Blue Obelisk community
The Blue Obelisk community
baoilleach
 

Mais de baoilleach (20)

We need to talk about Kekulization, Aromaticity and SMILES
We need to talk about Kekulization, Aromaticity and SMILESWe need to talk about Kekulization, Aromaticity and SMILES
We need to talk about Kekulization, Aromaticity and SMILES
 
So I have an SD File... What do I do next?
So I have an SD File... What do I do next?So I have an SD File... What do I do next?
So I have an SD File... What do I do next?
 
Universal Smiles: Finally a canonical SMILES string
Universal Smiles: Finally a canonical SMILES stringUniversal Smiles: Finally a canonical SMILES string
Universal Smiles: Finally a canonical SMILES string
 
What's New and Cooking in Open Babel 2.3.2
What's New and Cooking in Open Babel 2.3.2What's New and Cooking in Open Babel 2.3.2
What's New and Cooking in Open Babel 2.3.2
 
Intro to Open Babel
Intro to Open BabelIntro to Open Babel
Intro to Open Babel
 
Protein-ligand docking
Protein-ligand dockingProtein-ligand docking
Protein-ligand docking
 
Cheminformatics
CheminformaticsCheminformatics
Cheminformatics
 
Making the most of a QM calculation
Making the most of a QM calculationMaking the most of a QM calculation
Making the most of a QM calculation
 
Data Analysis in QSAR
Data Analysis in QSARData Analysis in QSAR
Data Analysis in QSAR
 
Large-scale computational design and selection of polymers for solar cells
Large-scale computational design and selection of polymers for solar cellsLarge-scale computational design and selection of polymers for solar cells
Large-scale computational design and selection of polymers for solar cells
 
My Open Access papers
My Open Access papersMy Open Access papers
My Open Access papers
 
Improving the quality of chemical databases with community-developed tools (a...
Improving the quality of chemical databases with community-developed tools (a...Improving the quality of chemical databases with community-developed tools (a...
Improving the quality of chemical databases with community-developed tools (a...
 
De novo design of molecular wires with optimal properties for solar energy co...
De novo design of molecular wires with optimal properties for solar energy co...De novo design of molecular wires with optimal properties for solar energy co...
De novo design of molecular wires with optimal properties for solar energy co...
 
Density functional theory calculations on Ruthenium polypyridyl complexes inc...
Density functional theory calculations on Ruthenium polypyridyl complexes inc...Density functional theory calculations on Ruthenium polypyridyl complexes inc...
Density functional theory calculations on Ruthenium polypyridyl complexes inc...
 
Application of Density Functional Theory to Scanning Tunneling Microscopy
Application of Density Functional Theory to Scanning Tunneling MicroscopyApplication of Density Functional Theory to Scanning Tunneling Microscopy
Application of Density Functional Theory to Scanning Tunneling Microscopy
 
Towards Practical Molecular Devices
Towards Practical Molecular DevicesTowards Practical Molecular Devices
Towards Practical Molecular Devices
 
Why multiple scoring functions can improve docking performance - Testing hypo...
Why multiple scoring functions can improve docking performance - Testing hypo...Why multiple scoring functions can improve docking performance - Testing hypo...
Why multiple scoring functions can improve docking performance - Testing hypo...
 
Why multiple scoring functions can improve docking performance - Testing hypo...
Why multiple scoring functions can improve docking performance - Testing hypo...Why multiple scoring functions can improve docking performance - Testing hypo...
Why multiple scoring functions can improve docking performance - Testing hypo...
 
Improving enrichment rates
Improving enrichment ratesImproving enrichment rates
Improving enrichment rates
 
The Blue Obelisk community
The Blue Obelisk communityThe Blue Obelisk community
The Blue Obelisk community
 

Último

Último (20)

How to Give a Domain for a Field in Odoo 17
How to Give a Domain for a Field in Odoo 17How to Give a Domain for a Field in Odoo 17
How to Give a Domain for a Field in Odoo 17
 
Exploring_the_Narrative_Style_of_Amitav_Ghoshs_Gun_Island.pptx
Exploring_the_Narrative_Style_of_Amitav_Ghoshs_Gun_Island.pptxExploring_the_Narrative_Style_of_Amitav_Ghoshs_Gun_Island.pptx
Exploring_the_Narrative_Style_of_Amitav_Ghoshs_Gun_Island.pptx
 
UGC NET Paper 1 Mathematical Reasoning & Aptitude.pdf
UGC NET Paper 1 Mathematical Reasoning & Aptitude.pdfUGC NET Paper 1 Mathematical Reasoning & Aptitude.pdf
UGC NET Paper 1 Mathematical Reasoning & Aptitude.pdf
 
Sociology 101 Demonstration of Learning Exhibit
Sociology 101 Demonstration of Learning ExhibitSociology 101 Demonstration of Learning Exhibit
Sociology 101 Demonstration of Learning Exhibit
 
FSB Advising Checklist - Orientation 2024
FSB Advising Checklist - Orientation 2024FSB Advising Checklist - Orientation 2024
FSB Advising Checklist - Orientation 2024
 
80 ĐỀ THI THỬ TUYỂN SINH TIẾNG ANH VÀO 10 SỞ GD – ĐT THÀNH PHỐ HỒ CHÍ MINH NĂ...
80 ĐỀ THI THỬ TUYỂN SINH TIẾNG ANH VÀO 10 SỞ GD – ĐT THÀNH PHỐ HỒ CHÍ MINH NĂ...80 ĐỀ THI THỬ TUYỂN SINH TIẾNG ANH VÀO 10 SỞ GD – ĐT THÀNH PHỐ HỒ CHÍ MINH NĂ...
80 ĐỀ THI THỬ TUYỂN SINH TIẾNG ANH VÀO 10 SỞ GD – ĐT THÀNH PHỐ HỒ CHÍ MINH NĂ...
 
How to Add New Custom Addons Path in Odoo 17
How to Add New Custom Addons Path in Odoo 17How to Add New Custom Addons Path in Odoo 17
How to Add New Custom Addons Path in Odoo 17
 
Mehran University Newsletter Vol-X, Issue-I, 2024
Mehran University Newsletter Vol-X, Issue-I, 2024Mehran University Newsletter Vol-X, Issue-I, 2024
Mehran University Newsletter Vol-X, Issue-I, 2024
 
Python Notes for mca i year students osmania university.docx
Python Notes for mca i year students osmania university.docxPython Notes for mca i year students osmania university.docx
Python Notes for mca i year students osmania university.docx
 
Basic Civil Engineering first year Notes- Chapter 4 Building.pptx
Basic Civil Engineering first year Notes- Chapter 4 Building.pptxBasic Civil Engineering first year Notes- Chapter 4 Building.pptx
Basic Civil Engineering first year Notes- Chapter 4 Building.pptx
 
Kodo Millet PPT made by Ghanshyam bairwa college of Agriculture kumher bhara...
Kodo Millet  PPT made by Ghanshyam bairwa college of Agriculture kumher bhara...Kodo Millet  PPT made by Ghanshyam bairwa college of Agriculture kumher bhara...
Kodo Millet PPT made by Ghanshyam bairwa college of Agriculture kumher bhara...
 
ICT role in 21st century education and it's challenges.
ICT role in 21st century education and it's challenges.ICT role in 21st century education and it's challenges.
ICT role in 21st century education and it's challenges.
 
SOC 101 Demonstration of Learning Presentation
SOC 101 Demonstration of Learning PresentationSOC 101 Demonstration of Learning Presentation
SOC 101 Demonstration of Learning Presentation
 
Interdisciplinary_Insights_Data_Collection_Methods.pptx
Interdisciplinary_Insights_Data_Collection_Methods.pptxInterdisciplinary_Insights_Data_Collection_Methods.pptx
Interdisciplinary_Insights_Data_Collection_Methods.pptx
 
Beyond_Borders_Understanding_Anime_and_Manga_Fandom_A_Comprehensive_Audience_...
Beyond_Borders_Understanding_Anime_and_Manga_Fandom_A_Comprehensive_Audience_...Beyond_Borders_Understanding_Anime_and_Manga_Fandom_A_Comprehensive_Audience_...
Beyond_Borders_Understanding_Anime_and_Manga_Fandom_A_Comprehensive_Audience_...
 
How to Manage Global Discount in Odoo 17 POS
How to Manage Global Discount in Odoo 17 POSHow to Manage Global Discount in Odoo 17 POS
How to Manage Global Discount in Odoo 17 POS
 
How to setup Pycharm environment for Odoo 17.pptx
How to setup Pycharm environment for Odoo 17.pptxHow to setup Pycharm environment for Odoo 17.pptx
How to setup Pycharm environment for Odoo 17.pptx
 
Graduate Outcomes Presentation Slides - English
Graduate Outcomes Presentation Slides - EnglishGraduate Outcomes Presentation Slides - English
Graduate Outcomes Presentation Slides - English
 
Food safety_Challenges food safety laboratories_.pdf
Food safety_Challenges food safety laboratories_.pdfFood safety_Challenges food safety laboratories_.pdf
Food safety_Challenges food safety laboratories_.pdf
 
Unit 3 Emotional Intelligence and Spiritual Intelligence.pdf
Unit 3 Emotional Intelligence and Spiritual Intelligence.pdfUnit 3 Emotional Intelligence and Spiritual Intelligence.pdf
Unit 3 Emotional Intelligence and Spiritual Intelligence.pdf
 

Cinfony - Bring cheminformatics toolkits into tune

  • 1. Bringing cheminformatics toolkits into tune Noel M. O’Boyle OpenBabel May 2011 Molecular Informatics Open Source Software EMBL-EBI, Cambridge, UK
  • 2. Toolkits, toolkits and more toolkits Commercial cheminformatics toolkits:
  • 3. Toolkits, toolkits and more toolkits Open Source cheminformatics toolkits: CDK OpenBabel OASA PerlMol
  • 4. The importance of being interoperable Good for users Can take advantage of complementary features CDK:Gasteigerπ charges, maximal common substructure, shape similarity with ultrafast shape descriptors, mass-spectrometry analysis RDKit:RECAP fragmentation, calculation of R/S, atom pair fingerprints, shape similarity with volume overlap OpenBabel:several forcefields, crystallography, large number of file formats, conformer searching, InChIKey
  • 5. The importance of being interoperable Good for users Can take advantage of complementary features Can choose between different implementations Faster SMARTS searching, better 2D depiction, more accurate 3D structure generation Avoid vendor lock-in Good for developers Less reinvention of wheel, more time to spend on development of complementary features Avoid balkanisation of field Bigger pool of users
  • 6. J. Chem. Inf. Model., 2006, 46, 991 http://www.blueobelisk.org
  • 7. J. Chem. Inf. Model., 2006, 46, 991 http://www.blueobelisk.org
  • 8. Bringing it all together with Cinfony Different languages Java (CDK, OPSIN), C++ (Open Babel, RDKit, Indigo) Use Python, a higher-level language that can bridge to both Different APIs Each toolkit uses different commands to carry out the same tasks Implement a common API Different chemical models Different internal representation of a molecule Use existing method for storage and transfer of chemical information: chemical file formats MDL mol file for 2D and 3D, SMILES for 0D
  • 9.
  • 10.
  • 11.
  • 13. One API to rule them all Example - create a Molecule from a SMILES string: mol = Chem.MolFromSmiles(SMILESstring) mol = Indigo.loadMolecule(SMILESstring) mol = openbabel.OBMol() obconversion = openbabel.OBConversion() obconversion.SetInFormat("smi") obconversion.ReadString(mol, SMILESstring) OpenBabel CDK builder = cdk.DefaultChemObjectBuilder.getInstance() sp = cdk.smiles.SmilesParser(builder) mol = sp.parseSmiles(SMILESstring) RDKit Indigo mol = toolkit.readstring("smi", SMILESstring) wheretoolkit is either obabel,cdk, indyorrdk
  • 14. Design of Cinfony API API is small (“fits your brain”) Covers core functionality of toolkits Corollary: need to access underlying toolkit for additional functionality Makes it easy to carry out common tasks API is stable Make it easy to find relevant methods Example: add hydrogens to a molecule atommanip = cdk.tools.manipulator.AtomContainerManipulatoratommanip.convertImplicitToExplicitHydrogens(molecule) CDK molecule.addh()
  • 17. Examples of use Chemistry Toolkit Rosetta http://ctr.wikia.com Andrew Dalke
  • 18. Combining toolkits >>> from cinfony import rdk, cdk, obabel>>> obabelmol = obabel.readstring("smi", "CCC")>>> rdkmol = rdk.Molecule(obabelmol)>>> rdkmol.draw(show=False, filename="propane.png")>>> print cdk.Molecule(rdkmol).calcdesc(){'chi0C': 2.7071067811865475, 'BCUT.4': 4.4795252101839402, 'rotatableBondsCount': 2, 'mde.9': 0.0, 'mde.8': 0.0, ... } Import Cinfony Read in a molecule from a SMILES string with Open Babel Convert it to an RDKit Molecule Create a 2D depiction of the molecule with RDKit Convert it to a CDK Molecule and calculate descriptor values
  • 19. Comparing toolkits >>> fromcinfonyimportrdk, cdk, obabel, indy, webel>>> for toolkit in [rdk, cdk, obabel, indy, webel]: ... mol = toolkit.readstring("smi", "CCC") ... printmol.molwt ... mol.draw(filename="%s.png" % toolkit.__name__) Import Cinfony For each toolkit... ... Read in a molecule from a SMILES string ... Print its molecular weight ... Create a 2D depiction Useful for sanity checks, identifying limitations, bugs Calculating the molecular weight (http://tinyurl.com/chemacs3) implicit hydrogen, isotopes Comparison of descriptor values (http://tinyurl.com/chemacs2) Should be highly correlated Comparison of depictions (http://tinyurl.com/chemacs1)
  • 21. Webel - Chemistry for Web 2.0 Webel is a Cinfony module that runs entirely using web services CDK webservices by RajarshiGuha, hosted by Ola Spjuth at Uppsala University NCI/CADD Chemical Identifier Resolver by Markus Sitzmann (uses Cactvs for much of backend) Easy to install – no dependencies Can be used in environments where installing a cheminformatics toolkit is not possible Web services may provide additional services not available elsewhere Example: how similar is aspirin to Dr. Scholl’s Wart Remover Kit? >>> from cinfony import webel >>> aspirin = webel.readstring("name", "aspirin")>>> wartremover = webel.readstring("name", ... "Dr. Scholl’s Wart Remover Kit")>>> print aspirin.calcfp() | wartremover.calcfp() 0.59375
  • 22. Webel - Chemistry for Web 2.0 Webel is a Cinfony module that runs entirely using web services CDK webservices by RajarshiGuha, hosted by Ola Spjuth at Uppsala University NCI/CADD Chemical Identifier Resolver by Markus Sitzmann (uses Cactvs for much of backend) Easy to install – no dependencies Can be used in environments where installing a cheminformatics toolkit is not possible Web services may provide additional services not available elsewhere Example: how similar is aspirin to Dr. Scholl’s Wart Remover Kit? >>> from cinfony import webel >>> aspirin = webel.readstring("name", "aspirin")>>> wartremover = webel.readstring("name", ... "Dr. Scholl’s Wart Remover Kit")>>> print aspirin.calcfp() | wartremover.calcfp() 0.59375
  • 23. Cheminformatics in the browser See http://tinyurl.com/cm7005-b or just Google “webelsilverlight”
  • 24. makes it easy to... Start using a new toolkit Carry out common tasks Combine functionality from different toolkits Compare results from different toolkits Do cheminformatics through the web, and on the web
  • 25. Food for thought Inclusion of cheminformatics toolkits in Linux distributions “apt-get install cinfony” DebiChem can help Binary versions for Linux API stability – and associated version numbering Needed to handle dependencies “Sorry - This version of Cinfony will work only with the 1.2.x series of Toolkit Y” What other toolkits or functionality should Cinfony support? Would be nice if various toolkits promoted Cinfony Even nicer if they ran the test suite and fixed problems, and added in new features (new fps, etc.)! Using Cinfony, it’s easy for toolkits to test against other toolkits Quality Control RDKit - Java bindings on Windows Licensing of Cinfony’s components Related point: Science is BSD Let’s support Python 3 already
  • 26. Bringing cheminformatics toolkits into tune Chem. Cent. J., 2008, 2, 24. http://cinfony.googlecode.com http://baoilleach.blogspot.com Acknowledgements CDK:EgonWillighagen, RajarshiGuha Open Babel: Chris Morley, Tim Vandermeersch RDKit: Greg Landrum Indigo: Dmitry Pavlov OASA: Beda Kosata OPSIN: Daniel Lowe JPype: Steve Ménard Chemical Identifier Resolver: Markus Sitzmann Interactive Tutorial: Michael Foord Image: Tintin44 (Flickr)
  • 27.
  • 28. Cheminformatics in the browser As Webel is pure Python, it can run places where traditional cheminformatics software cannot... ...such as in a web browser Microsoft have developed a browser plugin called Silverlight for developing applications for the web It includes a Python interpreter (IronPython) So you can use Webel in Silverlight applications Michael Foord has developed an interactive Python tutorial using Silverlight See http://ironpython.net/tutorial/ I have combined this with Webel to develop an interactive Cheminformatics tutorial
  • 30. import this VirtualBox, Double click on MIOSS Applications/Accessories/Terminal user:~$ cd apps/cinfony user:~/apps/cinfony$ ./myjython.sh Jython 2.5.2 (Release_2_5_2:7206, Mar 2 2011) >>> from cinfony import cdk, indy, opsin, webel >>> See API and “How to Use” at http://cinfony.googlecode.com

Notas do Editor

  1. OB has around 112 classes, 227 global functions. CDK has 882 classes.
  2. Note to self: Add canonical SMILES support for RDKit