Open PHACTS - Chemistry
Platform Update and learnings
Antony Williams and Valery Tkachenko
ORCID ID:0000-0002-2668-4821
@gray_alasdair Big Data Integration 2
OpenPHACTS and CRS Diagram
The Chemical Registration Service
Chemistry processing
•Validation
•Standardization
•Properties generation
•Properties retrieval
Export
•RDF
•SDF
API
•Domain-specific searches
•Chemical visualization
•Properties
•Conversions
Subsystems
• “CVSP” (frontend, backend, database)
• Compounds (frontend, database)
• OpenPHACTS API (frontend, database)
• Datasources registry (frontend, database)
• Processing farm (optional)
Structure-Based Database linking
• Open PHACTS, and many other projects
requiring the linking of structure databases,
depend on mappings
• Different databases use different processes
for standardization prior to deposition
• Examples: PubChem, EBI databases,
ChemSpider, etc.
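The dependence on mappings can be sketched as follows. This is a minimal illustration, assuming each source exposes one structure key per record (for example a Standard InChIKey). The database names, record IDs, keys, and the helper `link_by_key` are hypothetical placeholders, not part of Open PHACTS.

```python
from collections import defaultdict


def link_by_key(sources):
    """Group record IDs from several databases by a shared structure key.

    `sources` maps a database name to {record_id: structure_key}.
    Returns {structure_key: [(database, record_id), ...]} for keys that
    occur in more than one database.
    """
    groups = defaultdict(list)
    for db_name, records in sources.items():
        for record_id, key in records.items():
            groups[key].append((db_name, record_id))
    return {
        key: sorted(members)
        for key, members in groups.items()
        if len({db for db, _ in members}) > 1
    }


# Placeholder data: KEY-A, KEY-B, KEY-C stand in for real structure keys.
sources = {
    "db_one": {"R1": "KEY-A", "R2": "KEY-B"},
    "db_two": {"X9": "KEY-A", "X7": "KEY-C"},
}
links = link_by_key(sources)
print(links)
```

If two databases standardize the same compound differently before deposition, their keys differ and the link is silently missed, which is why the standardization process matters for linking.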
DrugBank
• ~60 records can’t be dearomatized unambiguously
• ~40 records where InChIs did not match structure
• 2 records where SMILES, InChI and name did not
match the structure
• 7 records with 2 stereo bonds at chiral atoms
DB04283 DB04462
Standardizers
• EBI Standardizer:
https://wwwdev.ebi.ac.uk/chembl/extra/francis/sta
/
• PubChem Standardizer:
https://pubchem.ncbi.nlm.nih.gov/standardize/standardi
• NCGC Standardizer: https://tripod.nih.gov/?p=61
• The CVSP Standardizer work in Open
PHACTS http://cvsp.chemspider.com/
Standardization Rules
• Available from: http://tinyurl.com/hwapem3
• Use the SRS as guidance for standardization
• Adjust as necessary to our needs
Nitro groups
Salt and Ionic Bonds
The CVSP System
http://cvsp.chemspider.com
Supports various file formats
Comptox Chemistry Dashboard
Prior to deposition check a deposition…
>3450 compounds in one SDF
98 Errors, 1571 Warnings
Review Errors
Validation Rule Set
Various Rule Sets Available
CVSP – My own custom rules
ChEMBL Validation Review
(of 1.3 million records)
• 11,020 records with 4 bonds and zero charge, e.g.
CHEMBL501101 or CHEMBL501973
• 271 records with hypervalent oxygen (e.g.
CHEMBL2219679), carbon (e.g. 1005895), boron,
chlorine, iodine or phosphine
• 6,177 records where direction of bond makes no
sense, e.g. CHEMBL12760 and CHEMBL34704
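A valence check of the kind that produces findings like these can be sketched as below. This is a toy illustration, not the CVSP rule set: the valence limits in `MAX_NEUTRAL_VALENCE`, the tuple-based molecule representation, and the function name are assumptions, and only explicit bonds are counted (implicit hydrogens are ignored).

```python
# Assumed upper limits on total bond order for neutral atoms (illustrative only).
MAX_NEUTRAL_VALENCE = {"B": 3, "C": 4, "N": 3, "O": 2}


def find_valence_issues(atoms, bonds):
    """Flag neutral atoms whose explicit bond-order sum exceeds the assumed limit.

    atoms: list of (element, formal_charge)
    bonds: list of (atom_index_a, atom_index_b, bond_order), 0-based indices
    Returns a list of (atom_index, element, bond_order_sum).
    """
    totals = [0] * len(atoms)
    for a, b, order in bonds:
        totals[a] += order
        totals[b] += order
    issues = []
    for index, (element, charge) in enumerate(atoms):
        limit = MAX_NEUTRAL_VALENCE.get(element)
        if limit is not None and charge == 0 and totals[index] > limit:
            issues.append((index, element, totals[index]))
    return issues


# Hypothetical input: a nitrogen drawn with four single bonds and no charge.
bonds = [(0, 1, 1), (0, 2, 1), (0, 3, 1), (0, 4, 1)]
neutral = [("N", 0)] + [("C", 0)] * 4
charged = [("N", 1)] + [("C", 0)] * 4
print(find_valence_issues(neutral, bonds))
print(find_valence_issues(charged, bonds))
```

The same connectivity passes once the nitrogen carries a positive charge, so a check like this reports a charge or bonding mistake rather than deciding which of the two is wrong.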
Chemical Validation first…
Standardization Second
• Chemical Validation detects errors –
Standardization FIXES them according to rules
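The validate-then-standardize order can be sketched with a deliberately simple pipeline. Everything here is a placeholder: the two validation rules, the single fix, and the record layout are assumptions chosen to keep the example self-contained, and they are not CVSP rules.

```python
from collections import Counter


def rule_empty_structure(record):
    """Error: the record carries no structure at all."""
    return [("error", "empty structure")] if not record["smiles"].strip() else []


def rule_untrimmed(record):
    """Warning: the structure string has surrounding whitespace."""
    smiles = record["smiles"]
    return [("warning", "surrounding whitespace")] if smiles != smiles.strip() else []


def fix_trim(record):
    """Standardization step that repairs what rule_untrimmed detects."""
    return {**record, "smiles": record["smiles"].strip()}


def validate(records, rules):
    """Run every rule on every record and tally issues by severity."""
    tally = Counter()
    for record in records:
        for rule in rules:
            tally.update(severity for severity, _ in rule(record))
    return tally


def standardize(records, fixes):
    """Apply every fix, in order, to every record."""
    result = []
    for record in records:
        for fix in fixes:
            record = fix(record)
        result.append(record)
    return result


records = [
    {"id": "a", "smiles": " CCO "},
    {"id": "b", "smiles": ""},
    {"id": "c", "smiles": "CC"},
]
rules = [rule_empty_structure, rule_untrimmed]
before = validate(records, rules)
cleaned = standardize(records, [fix_trim])
after = validate(cleaned, rules)
print(dict(before), dict(after))
```

Re-running validation after standardization shows which findings a rule could fix and which still need a curator: here the warning disappears and the error remains.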
• SMIRKS transformations are based on both
InChI Normalization and FDA SRS rules
Standardization SMIRKS
Examples of InChI normalization
[*;H+:1]>>[*;H:1]
[O,S,Se,Te:1]=[O+,S+,Se+,Te+:2][C-;v3:3]>>[O,S,Se,Te:1]=[O,S,Se,Te:2]=[C:3]
[N-,P-,As-,Sb-:1]=[C+;v3:2]>>[N,P,As,Sb:1]#[C:2]
Examples of FDA SRS rules
[n:1]=[O:2]>>[n+:1][O-:2]
[*:1]=[N:2]#[N:3]>>[*:1]=[N+:2]=[N-:3]
[N+0;H3:1].[C:3](=[O:4])[O:5][H:6]>>[N+1;H4:1].[C:3](=[O:4])[O-:5]
Thiopurine
[H:1][S:2][c:3]1[n:8][c:7]([H,*:13])[n:6][c:5]2[c:4]1[n:11][c:10]([H,*:12])[n:9]2>>[H:1][N:8]1[C:7]([H,*:13])=[N:6][C:5]2=[C:4]([N:11]=[C:10]([H,*:12])[N:9]2)[C:3]1=[S:2]
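One cheap sanity check on rules like those above is to compare the atom-map numbers on each side of `>>`. The sketch below uses only the standard library and does not apply the transformation; the helper names are hypothetical, and a cheminformatics toolkit would be needed to run the rules on real structures.

```python
import re

# An atom map is the ":<number>" immediately before the closing bracket of an atom.
ATOM_MAP = re.compile(r":(\d+)\]")


def atom_maps(pattern):
    """Return the atom-map numbers found in one side of a SMIRKS string."""
    return [int(number) for number in ATOM_MAP.findall(pattern)]


def check_smirks(smirks):
    """Report duplicated, created, and deleted atom maps for a SMIRKS rule."""
    reactant, separator, product = smirks.partition(">>")
    if not separator:
        raise ValueError("SMIRKS must contain '>>'")
    reactant_maps = atom_maps(reactant)
    product_maps = atom_maps(product)
    return {
        "duplicate_maps": (
            len(set(reactant_maps)) != len(reactant_maps)
            or len(set(product_maps)) != len(product_maps)
        ),
        "created": sorted(set(product_maps) - set(reactant_maps)),
        "deleted": sorted(set(reactant_maps) - set(product_maps)),
    }


n_oxide = "[n:1]=[O:2]>>[n+:1][O-:2]"
amine_acid = "[N+0;H3:1].[C:3](=[O:4])[O:5][H:6]>>[N+1;H4:1].[C:3](=[O:4])[O-:5]"
print(check_smirks(n_oxide))
print(check_smirks(amine_acid))
```

For the amine and acid rule the check reports map 6 as deleted, which matches the explicit acid hydrogen that no longer appears on the product side.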
Examples of Standardization
Double bond with adjacent wiggly single bond
Collapse hydrogen atoms with no stereo bonds
Examples of Standardization
Remove symmetric stereocenters
Turn off chiral flag if no up or down bonds
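The chiral-flag rule can be sketched directly on a V2000 molfile, where the flag is the fifth three-character field of the counts line and each bond line carries a stereo code (1 for up, 6 for down). This is a minimal sketch under those format assumptions; the function name and the sample molblock are hypothetical, and this is not the CVSP implementation.

```python
UP, DOWN = 1, 6  # V2000 stereo codes for wedged (up) and hashed (down) single bonds


def clear_unused_chiral_flag(molblock):
    """Set the chiral flag to 0 when no bond is marked up or down."""
    lines = molblock.split("\n")
    counts = lines[3]  # line 4 of a molfile is the counts line
    n_atoms = int(counts[0:3])
    n_bonds = int(counts[3:6])
    first_bond = 4 + n_atoms
    bond_lines = lines[first_bond:first_bond + n_bonds]
    has_wedge = any(
        int(line[9:12].strip() or "0") in (UP, DOWN) for line in bond_lines
    )
    if not has_wedge:
        # Columns 13-15 of the counts line hold the chiral flag.
        lines[3] = counts[:12] + "  0" + counts[15:]
    return "\n".join(lines)


# Hypothetical three-atom molblock with the chiral flag set but no stereo bonds.
MOLBLOCK = "\n".join([
    "example",
    "  sketch",
    "",
    "  3  2  0  0  1  0  0  0  0  0999 V2000",
    "    0.0000    0.0000    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0",
    "    1.0000    0.0000    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0",
    "    2.0000    0.0000    0.0000 O   0  0  0  0  0  0  0  0  0  0  0  0",
    "  1  2  1  0",
    "  2  3  1  0",
    "M  END",
])
fixed = clear_unused_chiral_flag(MOLBLOCK)
print(fixed.split("\n")[3])
```

A molblock that does contain an up or down bond is returned unchanged, so the flag is only cleared when nothing in the drawing supports it.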
Defining a Community Rule Set
• There are multiple standardizers, each with
its own rule set
• Can we decide on a default community rule
set, like Standard InChI, that could be used
by ALL standardizers?
• A joint meeting between the Research Data
Alliance (RDA), IUPAC and ACS Division of
Chemical Information discussed the value
and possibilities of this approach (July 2016)
EPA is investigating CVSP
• EPA is investigating CVSP as a validation
and standardization platform
• Considering the API aspects of CVSP to
integrate with our registration system
• CVSP is a reference implementation and
“starting point” for a community rule set
CVSP code is now Open Source
• Open Source CVSP code now released
• Code is hosted on Open PHACTS GitHub
https://github.com/openphacts/ops-crs
• Valery Tkachenko will offer future support
• Hoping for additional community engagement
and support
• Some details of availability….
Virtual Machines
• OPS_FRONT (all websites and API)
• OPS_BACK (all heavy-lifting)
• OPS_DB (databases)
• VMs are VMware images
• Can be converted to other hypervisors
Thank you
Emails: tony27587@gmail.com and tkachenko.valery@gmail.com
SLIDES: www.slideshare.net/AntonyWilliams

Open PHACTS Webinar Series - Chemistry Platform


Editor's Notes

  1. Open PHACTS was developed to support the key questions of drug discovery. Business questions have been at the heart of Open PHACTS and have driven the development of the platform. Mx/psa, how calculated, who did it? Mash up. With your data too, top layer join together but need them all commercial. Data provided by many publishers. Originally in many formats: relational, SD files and RDF. Worked closely with publishers. Data licensing was a major issue. Over 5 billion triples, 14 datasets and growing. Hosted on beefy hardware; data in memory (aim). Extensive memcaching. Pose complex queries to extract data.