SlideShare uma empresa Scribd logo
1 de 22
ChemSpider Reactions:
   Delivering a free community
resource of chemical syntheses

Valery Tkachenko, Colin Batchelor, Daniel Lowe, Ken
     Karapetyan, David Sharpe and Antony Williams
                   ACS New Orleans April 2013
Overview
•   Motivation
•   The RSC and chemical reaction data
•   New sources of chemical reaction data
•   ChemSpider Reactions: bringing it all together
•   Experiments with reaction classification
•   The National Chemical Database Service
Who needs another reaction
           database?
• Those who cannot afford to license access…
• Those who would like to access data that is
  not abstracted
• Those who might like to contribute data to a
  database
• Anybody wanting to integrate their systems in
  and to pull data out.
RSC and chemical reaction data 1



Graphical abstracting journals:
Methods in Organic Synthesis (monthly, 1990 to present)
Catalysts and Catalysed Reactions (monthly, 2005 to
present)

These constitute a backfile of over 50000 novel reactions
RSC and chemical reaction data 2
RSC and chemical reaction data 3
New sources of reaction data



Daniel Lowe’s PhD thesis (Cantab, 2012) was on
extracting reactions from US patent data.
We can apply this technology to the RSC Journal
archive.
ChemSpider Reactions
  bringing it all together
http://csr.dev.rsc-us.org/


WORK IN PROGRESS
Reaction classification              1
Project Prospect has text-mined RSC journal
articles for named reactions and molecular
processes, annotated according to Creative
Commons-licensed ontologies:

See http://rxno.googlecode.com/
Reaction classification   2




Classification of Daniel’s US
Patent data
Reaction InChI
To do for reactions what InChI has done for
structures
•Think online searching
•Deduplication and linking

http://www-rinchi.ch.cam.ac.uk/help.html
Reaction InChI
Early work – RInChIs layered on to a few
hundred thousand reactions
•Not generated for a few 10s of thousands of
reactions
•Reaction deduplication results differ based on
algorithm – GGA software versus RInChI
•Under investigation
Other sources
ChemSpider SyntheticPages

•Electronic Lab Notebooks
•University repositories

Please send theses
What will ChemSpider Reactions serve?
• Chemical Database Service
• Linking back to original
  publications/supplementary data
• Underpinning other tools e.g. retrosynthetic
  analysis (depends on data quality and
  mapping)
Chemical Database
Service
National Chemical Database Service
for UK academics
Integrates commercial databases and
services
Chemicals, analytical data, prediction
algorithms
Development of data repository
ARChem from SimBioSys                 1
Synthesis planning tool which performs rule-
and precedent-based retrosynthetic analysis
back to commercially available starting
materials.
ARChem from SimBiosys   2
ARChem from SimBioSys   3
But what about data quality?
• Data validation and curation
  required
• Encouraging participation with
  Rewards and RECOGNITION
Manual curation
• Integrated commenting, curating and validation
  platform across ALL eScience and Publishing
  platforms
• All integrated to a central RSC profile and
  feeding the alt-metrics tools
The other kind of RDF
               (made-up example)
Chemical reactions are unusually well-suited to representation. (Donald
Davidson’s event semantics)

_:r1 a obo:RXNO_0000004 ; # Diels–Alder
 obo:has_participant_ceasing_to_exist _:m1 ;
# a diene
 obo:has_participant_ceasing_to_exist _:m2 ;
# an olefin
 obo:has_participant_starting_to_exist _:m3 .
# a substituted cyclohexene
_:m1 a <http://rdf.chemspider.com/233000> .
_:m2 a <http://rdf.chemspider.com/233001> .
_:m3 a <http://rdf.chemspider.com/233002> .
Questions?

E-mail: tkachenkov@rsc.org, batchelorc@rsc.org

Mais conteúdo relacionado

Mais procurados

Delivering The Benefits of Chemical-Biological Integration in Computational T...
Delivering The Benefits of Chemical-Biological Integration in Computational T...Delivering The Benefits of Chemical-Biological Integration in Computational T...
Delivering The Benefits of Chemical-Biological Integration in Computational T...
US Environmental Protection Agency (EPA), Center for Computational Toxicology and Exposure
 
Structure Identification Using High Resolution Mass Spectrometry Data and the...
Structure Identification Using High Resolution Mass Spectrometry Data and the...Structure Identification Using High Resolution Mass Spectrometry Data and the...
Structure Identification Using High Resolution Mass Spectrometry Data and the...
US Environmental Protection Agency (EPA), Center for Computational Toxicology and Exposure
 
Non-targeted analysis supported by data and cheminformatics delivered via the...
Non-targeted analysis supported by data and cheminformatics delivered via the...Non-targeted analysis supported by data and cheminformatics delivered via the...
Non-targeted analysis supported by data and cheminformatics delivered via the...
US Environmental Protection Agency (EPA), Center for Computational Toxicology and Exposure
 
The EPA Comptox Chemistry Dashboard: A Web-Based Data Integration Hub for Env...
The EPA Comptox Chemistry Dashboard: A Web-Based Data Integration Hub for Env...The EPA Comptox Chemistry Dashboard: A Web-Based Data Integration Hub for Env...
The EPA Comptox Chemistry Dashboard: A Web-Based Data Integration Hub for Env...
US Environmental Protection Agency (EPA), Center for Computational Toxicology and Exposure
 
The EPA CompTox Chemistry Dashboard and Underpinning Software Architecture
The EPA CompTox Chemistry Dashboard and Underpinning Software Architecture The EPA CompTox Chemistry Dashboard and Underpinning Software Architecture
The EPA CompTox Chemistry Dashboard and Underpinning Software Architecture
US Environmental Protection Agency (EPA), Center for Computational Toxicology and Exposure
 
Accessing information for chemicals in hydraulic fracturing fluids using the ...
Accessing information for chemicals in hydraulic fracturing fluids using the ...Accessing information for chemicals in hydraulic fracturing fluids using the ...
Accessing information for chemicals in hydraulic fracturing fluids using the ...
US Environmental Protection Agency (EPA), Center for Computational Toxicology and Exposure
 
Using the US EPA’s CompTox Chemistry Dashboard for structure identification a...
Using the US EPA’s CompTox Chemistry Dashboard for structure identification a...Using the US EPA’s CompTox Chemistry Dashboard for structure identification a...
Using the US EPA’s CompTox Chemistry Dashboard for structure identification a...
US Environmental Protection Agency (EPA), Center for Computational Toxicology and Exposure
 
Applications of the US EPA’s CompTox chemicals dashboard to support structure...
Applications of the US EPA’s CompTox chemicals dashboard to support structure...Applications of the US EPA’s CompTox chemicals dashboard to support structure...
Applications of the US EPA’s CompTox chemicals dashboard to support structure...
US Environmental Protection Agency (EPA), Center for Computational Toxicology and Exposure
 
Chemical identification of unknowns in high resolution mass spectrometry usin...
Chemical identification of unknowns in high resolution mass spectrometry usin...Chemical identification of unknowns in high resolution mass spectrometry usin...
Chemical identification of unknowns in high resolution mass spectrometry usin...
US Environmental Protection Agency (EPA), Center for Computational Toxicology and Exposure
 
Incorporating new technologies and High Throughput Screening in the design an...
Incorporating new technologies and High Throughput Screening in the design an...Incorporating new technologies and High Throughput Screening in the design an...
Incorporating new technologies and High Throughput Screening in the design an...
US Environmental Protection Agency (EPA), Center for Computational Toxicology and Exposure
 
Chemistry Validation and Standardization Platform v2.0
Chemistry Validation and Standardization Platform v2.0Chemistry Validation and Standardization Platform v2.0
Chemistry Validation and Standardization Platform v2.0
Valery Tkachenko
 
Tools and approaches for data deposition into nanomaterial databases
Tools and approaches for data deposition into nanomaterial databasesTools and approaches for data deposition into nanomaterial databases
Tools and approaches for data deposition into nanomaterial databases
Valery Tkachenko
 
The EPA CompTox Dashboard as a Data Integration Hub for Environmental Chemist...
The EPA CompTox Dashboard as a Data Integration Hub for Environmental Chemist...The EPA CompTox Dashboard as a Data Integration Hub for Environmental Chemist...
The EPA CompTox Dashboard as a Data Integration Hub for Environmental Chemist...
Andrew McEachran
 

Mais procurados (20)

Delivering The Benefits of Chemical-Biological Integration in Computational T...
Delivering The Benefits of Chemical-Biological Integration in Computational T...Delivering The Benefits of Chemical-Biological Integration in Computational T...
Delivering The Benefits of Chemical-Biological Integration in Computational T...
 
The needs for chemistry standards, database tools and data curation at the ch...
The needs for chemistry standards, database tools and data curation at the ch...The needs for chemistry standards, database tools and data curation at the ch...
The needs for chemistry standards, database tools and data curation at the ch...
 
Structure Identification Using High Resolution Mass Spectrometry Data and the...
Structure Identification Using High Resolution Mass Spectrometry Data and the...Structure Identification Using High Resolution Mass Spectrometry Data and the...
Structure Identification Using High Resolution Mass Spectrometry Data and the...
 
Structure Identification Using High Resolution Mass Spectrometry Data and the...
Structure Identification Using High Resolution Mass Spectrometry Data and the...Structure Identification Using High Resolution Mass Spectrometry Data and the...
Structure Identification Using High Resolution Mass Spectrometry Data and the...
 
The application of cloud computing to royal society of chemistry data platforms
The application of cloud computing to royal society of chemistry data platformsThe application of cloud computing to royal society of chemistry data platforms
The application of cloud computing to royal society of chemistry data platforms
 
An examination of data quality on QSAR Modeling in regards to the environment...
An examination of data quality on QSAR Modeling in regards to the environment...An examination of data quality on QSAR Modeling in regards to the environment...
An examination of data quality on QSAR Modeling in regards to the environment...
 
Non-targeted analysis supported by data and cheminformatics delivered via the...
Non-targeted analysis supported by data and cheminformatics delivered via the...Non-targeted analysis supported by data and cheminformatics delivered via the...
Non-targeted analysis supported by data and cheminformatics delivered via the...
 
The EPA Comptox Chemistry Dashboard: A Web-Based Data Integration Hub for Env...
The EPA Comptox Chemistry Dashboard: A Web-Based Data Integration Hub for Env...The EPA Comptox Chemistry Dashboard: A Web-Based Data Integration Hub for Env...
The EPA Comptox Chemistry Dashboard: A Web-Based Data Integration Hub for Env...
 
The EPA CompTox Chemistry Dashboard and Underpinning Software Architecture
The EPA CompTox Chemistry Dashboard and Underpinning Software Architecture The EPA CompTox Chemistry Dashboard and Underpinning Software Architecture
The EPA CompTox Chemistry Dashboard and Underpinning Software Architecture
 
Accessing information for chemicals in hydraulic fracturing fluids using the ...
Accessing information for chemicals in hydraulic fracturing fluids using the ...Accessing information for chemicals in hydraulic fracturing fluids using the ...
Accessing information for chemicals in hydraulic fracturing fluids using the ...
 
Using the US EPA’s CompTox Chemistry Dashboard for structure identification a...
Using the US EPA’s CompTox Chemistry Dashboard for structure identification a...Using the US EPA’s CompTox Chemistry Dashboard for structure identification a...
Using the US EPA’s CompTox Chemistry Dashboard for structure identification a...
 
Applications of the US EPA’s CompTox chemicals dashboard to support structure...
Applications of the US EPA’s CompTox chemicals dashboard to support structure...Applications of the US EPA’s CompTox chemicals dashboard to support structure...
Applications of the US EPA’s CompTox chemicals dashboard to support structure...
 
Chemical identification of unknowns in high resolution mass spectrometry usin...
Chemical identification of unknowns in high resolution mass spectrometry usin...Chemical identification of unknowns in high resolution mass spectrometry usin...
Chemical identification of unknowns in high resolution mass spectrometry usin...
 
Incorporating new technologies and High Throughput Screening in the design an...
Incorporating new technologies and High Throughput Screening in the design an...Incorporating new technologies and High Throughput Screening in the design an...
Incorporating new technologies and High Throughput Screening in the design an...
 
Chemistry Validation and Standardization Platform v2.0
Chemistry Validation and Standardization Platform v2.0Chemistry Validation and Standardization Platform v2.0
Chemistry Validation and Standardization Platform v2.0
 
ACRL Trust in Science Talk
ACRL Trust in Science TalkACRL Trust in Science Talk
ACRL Trust in Science Talk
 
Tools and approaches for data deposition into nanomaterial databases
Tools and approaches for data deposition into nanomaterial databasesTools and approaches for data deposition into nanomaterial databases
Tools and approaches for data deposition into nanomaterial databases
 
The importance of data curation on QSAR Modeling: PHYSPROP open data as a cas...
The importance of data curation on QSAR Modeling: PHYSPROP open data as a cas...The importance of data curation on QSAR Modeling: PHYSPROP open data as a cas...
The importance of data curation on QSAR Modeling: PHYSPROP open data as a cas...
 
The EPA CompTox Dashboard as a Data Integration Hub for Environmental Chemist...
The EPA CompTox Dashboard as a Data Integration Hub for Environmental Chemist...The EPA CompTox Dashboard as a Data Integration Hub for Environmental Chemist...
The EPA CompTox Dashboard as a Data Integration Hub for Environmental Chemist...
 
Opportunities in chemical structure standardization
Opportunities in chemical structure standardizationOpportunities in chemical structure standardization
Opportunities in chemical structure standardization
 

Destaque

Lançando versões em um clique - deploy contínuo
Lançando versões em um clique - deploy contínuoLançando versões em um clique - deploy contínuo
Lançando versões em um clique - deploy contínuo
Hélio Medeiros
 
2012 ACS Skolnik Symposium - ChemSpotlight
2012 ACS Skolnik Symposium - ChemSpotlight2012 ACS Skolnik Symposium - ChemSpotlight
2012 ACS Skolnik Symposium - ChemSpotlight
Geoffrey Hutchison
 
Rapid Product Design in the Wild - Agile Iceland
Rapid Product Design in the Wild - Agile IcelandRapid Product Design in the Wild - Agile Iceland
Rapid Product Design in the Wild - Agile Iceland
Michele Ide-Smith
 

Destaque (20)

SXSW 2013: How Twitter is Changing How We Watch TV
SXSW 2013: How Twitter is Changing How We Watch TVSXSW 2013: How Twitter is Changing How We Watch TV
SXSW 2013: How Twitter is Changing How We Watch TV
 
Lynne Cazaly - Insights & Connections
Lynne Cazaly - Insights & ConnectionsLynne Cazaly - Insights & Connections
Lynne Cazaly - Insights & Connections
 
Morgan e xt_062811
Morgan e xt_062811Morgan e xt_062811
Morgan e xt_062811
 
Coordinating Senior Care with Intuit's Weave
Coordinating Senior Care with Intuit's Weave Coordinating Senior Care with Intuit's Weave
Coordinating Senior Care with Intuit's Weave
 
Scrummaster Needed Desperately at 2016 Scrum Australia
Scrummaster Needed Desperately at 2016 Scrum AustraliaScrummaster Needed Desperately at 2016 Scrum Australia
Scrummaster Needed Desperately at 2016 Scrum Australia
 
Engaging students in publishing on the internet early in their careers
Engaging students in publishing on the internet early in their careersEngaging students in publishing on the internet early in their careers
Engaging students in publishing on the internet early in their careers
 
Vetting Plugins : WordCamp Columbus 2015
Vetting Plugins : WordCamp Columbus 2015Vetting Plugins : WordCamp Columbus 2015
Vetting Plugins : WordCamp Columbus 2015
 
Lançando versões em um clique - deploy contínuo
Lançando versões em um clique - deploy contínuoLançando versões em um clique - deploy contínuo
Lançando versões em um clique - deploy contínuo
 
2012 ACS Skolnik Symposium - ChemSpotlight
2012 ACS Skolnik Symposium - ChemSpotlight2012 ACS Skolnik Symposium - ChemSpotlight
2012 ACS Skolnik Symposium - ChemSpotlight
 
The State of PHPUnit
The State of PHPUnitThe State of PHPUnit
The State of PHPUnit
 
Rapid Product Design in the Wild - Agile Iceland
Rapid Product Design in the Wild - Agile IcelandRapid Product Design in the Wild - Agile Iceland
Rapid Product Design in the Wild - Agile Iceland
 
Web Frontend development: tools and good practices to (re)organize the chaos
Web Frontend development: tools and good practices to (re)organize the chaosWeb Frontend development: tools and good practices to (re)organize the chaos
Web Frontend development: tools and good practices to (re)organize the chaos
 
Introduction to Perl Best Practices
Introduction to Perl Best PracticesIntroduction to Perl Best Practices
Introduction to Perl Best Practices
 
Martijn van Exel - Collaborate to compete: Regain your Competitive Edge with osm
Martijn van Exel - Collaborate to compete: Regain your Competitive Edge with osmMartijn van Exel - Collaborate to compete: Regain your Competitive Edge with osm
Martijn van Exel - Collaborate to compete: Regain your Competitive Edge with osm
 
Mobile Content Prototyping - Jump Start Your Mobile Project
Mobile Content Prototyping - Jump Start Your Mobile ProjectMobile Content Prototyping - Jump Start Your Mobile Project
Mobile Content Prototyping - Jump Start Your Mobile Project
 
Cowboy development with Django
Cowboy development with DjangoCowboy development with Django
Cowboy development with Django
 
What may I do with your data? What do I have to do with your data? Policie...
What may I do with your data? What do I have to do with your data? Policie...What may I do with your data? What do I have to do with your data? Policie...
What may I do with your data? What do I have to do with your data? Policie...
 
Make your web apps "Go, Go" like Power Rangers
Make your web apps "Go, Go" like Power RangersMake your web apps "Go, Go" like Power Rangers
Make your web apps "Go, Go" like Power Rangers
 
Beyond PHP - It's not (just) about the code
Beyond PHP - It's not (just) about the codeBeyond PHP - It's not (just) about the code
Beyond PHP - It's not (just) about the code
 
7 reasons to start using Docker
7 reasons to start using Docker7 reasons to start using Docker
7 reasons to start using Docker
 

Semelhante a ChemSpider reactions – delivering a free community resource of chemical syntheses

Structure identification approaches using the EPA CompTox Chemicals Dashboard...
Structure identification approaches using the EPA CompTox Chemicals Dashboard...Structure identification approaches using the EPA CompTox Chemicals Dashboard...
Structure identification approaches using the EPA CompTox Chemicals Dashboard...
US Environmental Protection Agency (EPA), Center for Computational Toxicology and Exposure
 
Serving the medicinal chemistry community with Royal Society of Chemistry che...
Serving the medicinal chemistry community with Royal Society of Chemistry che...Serving the medicinal chemistry community with Royal Society of Chemistry che...
Serving the medicinal chemistry community with Royal Society of Chemistry che...
US Environmental Protection Agency (EPA), Center for Computational Toxicology and Exposure
 
Accessing Environmental Chemistry Data via Data Dashboards and Applications t...
Accessing Environmental Chemistry Data via Data Dashboards and Applications t...Accessing Environmental Chemistry Data via Data Dashboards and Applications t...
Accessing Environmental Chemistry Data via Data Dashboards and Applications t...
US Environmental Protection Agency (EPA), Center for Computational Toxicology and Exposure
 
ICIC 2013 Conference Proceedings Antony Williams Royal Society of Chemistry
ICIC 2013 Conference Proceedings Antony Williams Royal Society of ChemistryICIC 2013 Conference Proceedings Antony Williams Royal Society of Chemistry
ICIC 2013 Conference Proceedings Antony Williams Royal Society of Chemistry
Dr. Haxel Consult
 
Big data challenges associated with building a national data repository for c...
Big data challenges associated with building a national data repository for c...Big data challenges associated with building a national data repository for c...
Big data challenges associated with building a national data repository for c...
US Environmental Protection Agency (EPA), Center for Computational Toxicology and Exposure
 
US-EPA Chemicals Dashboard – an integrated data hub for environmental science
US-EPA Chemicals Dashboard – an integrated data hub for environmental scienceUS-EPA Chemicals Dashboard – an integrated data hub for environmental science
US-EPA Chemicals Dashboard – an integrated data hub for environmental science
US Environmental Protection Agency (EPA), Center for Computational Toxicology and Exposure
 
Structure identification by Mass Spectrometry Non-Targeted Analysis using the...
Structure identification by Mass Spectrometry Non-Targeted Analysis using the...Structure identification by Mass Spectrometry Non-Targeted Analysis using the...
Structure identification by Mass Spectrometry Non-Targeted Analysis using the...
US Environmental Protection Agency (EPA), Center for Computational Toxicology and Exposure
 
ICIC 2017: Tutorial - Digging bioactive chemistry out of patents using open r...
ICIC 2017: Tutorial - Digging bioactive chemistry out of patents using open r...ICIC 2017: Tutorial - Digging bioactive chemistry out of patents using open r...
ICIC 2017: Tutorial - Digging bioactive chemistry out of patents using open r...
Dr. Haxel Consult
 
CompTox Chemicals Dashboard: Data and tools to support chemical and environme...
CompTox Chemicals Dashboard: Data and tools to support chemical and environme...CompTox Chemicals Dashboard: Data and tools to support chemical and environme...
CompTox Chemicals Dashboard: Data and tools to support chemical and environme...
US Environmental Protection Agency (EPA), Center for Computational Toxicology and Exposure
 
eScience at the Royal Society of Chemistry and our current initiatives
eScience at the Royal Society of Chemistry and our current initiativeseScience at the Royal Society of Chemistry and our current initiatives
eScience at the Royal Society of Chemistry and our current initiatives
US Environmental Protection Agency (EPA), Center for Computational Toxicology and Exposure
 
Activities at the Royal Society of Chemistry to gather, extract and analyze b...
Activities at the Royal Society of Chemistry to gather, extract and analyze b...Activities at the Royal Society of Chemistry to gather, extract and analyze b...
Activities at the Royal Society of Chemistry to gather, extract and analyze b...
US Environmental Protection Agency (EPA), Center for Computational Toxicology and Exposure
 

Semelhante a ChemSpider reactions – delivering a free community resource of chemical syntheses (20)

ChemSpider reactions – delivering a free community resource of chemical synth...
ChemSpider reactions – delivering a free community resource of chemical synth...ChemSpider reactions – delivering a free community resource of chemical synth...
ChemSpider reactions – delivering a free community resource of chemical synth...
 
Structure identification approaches using the EPA CompTox Chemicals Dashboard...
Structure identification approaches using the EPA CompTox Chemicals Dashboard...Structure identification approaches using the EPA CompTox Chemicals Dashboard...
Structure identification approaches using the EPA CompTox Chemicals Dashboard...
 
Serving the medicinal chemistry community with Royal Society of Chemistry che...
Serving the medicinal chemistry community with Royal Society of Chemistry che...Serving the medicinal chemistry community with Royal Society of Chemistry che...
Serving the medicinal chemistry community with Royal Society of Chemistry che...
 
Accessing Environmental Chemistry Data via Data Dashboards and Applications t...
Accessing Environmental Chemistry Data via Data Dashboards and Applications t...Accessing Environmental Chemistry Data via Data Dashboards and Applications t...
Accessing Environmental Chemistry Data via Data Dashboards and Applications t...
 
ICIC 2013 Conference Proceedings Antony Williams Royal Society of Chemistry
ICIC 2013 Conference Proceedings Antony Williams Royal Society of ChemistryICIC 2013 Conference Proceedings Antony Williams Royal Society of Chemistry
ICIC 2013 Conference Proceedings Antony Williams Royal Society of Chemistry
 
Big data challenges associated with building a national data repository for c...
Big data challenges associated with building a national data repository for c...Big data challenges associated with building a national data repository for c...
Big data challenges associated with building a national data repository for c...
 
A chemistry data repository to serve them all
A chemistry data repository to serve them allA chemistry data repository to serve them all
A chemistry data repository to serve them all
 
US-EPA Chemicals Dashboard – an integrated data hub for environmental science
US-EPA Chemicals Dashboard – an integrated data hub for environmental scienceUS-EPA Chemicals Dashboard – an integrated data hub for environmental science
US-EPA Chemicals Dashboard – an integrated data hub for environmental science
 
Structure identification by Mass Spectrometry Non-Targeted Analysis using the...
Structure identification by Mass Spectrometry Non-Targeted Analysis using the...Structure identification by Mass Spectrometry Non-Targeted Analysis using the...
Structure identification by Mass Spectrometry Non-Targeted Analysis using the...
 
ChemSpider – disseminating data and enabling an abundance of chemistry platforms
ChemSpider – disseminating data and enabling an abundance of chemistry platformsChemSpider – disseminating data and enabling an abundance of chemistry platforms
ChemSpider – disseminating data and enabling an abundance of chemistry platforms
 
Introduction to Cheminformatics: Accessing data through the CompTox Chemicals...
Introduction to Cheminformatics: Accessing data through the CompTox Chemicals...Introduction to Cheminformatics: Accessing data through the CompTox Chemicals...
Introduction to Cheminformatics: Accessing data through the CompTox Chemicals...
 
Reaxys rmc unified platform_ webinar_
Reaxys rmc unified platform_ webinar_Reaxys rmc unified platform_ webinar_
Reaxys rmc unified platform_ webinar_
 
Cheminformatics tools and chemistry data underpinning mass spectrometry analy...
Cheminformatics tools and chemistry data underpinning mass spectrometry analy...Cheminformatics tools and chemistry data underpinning mass spectrometry analy...
Cheminformatics tools and chemistry data underpinning mass spectrometry analy...
 
UDM (Unified Data Model) - Enabling Exchange of Comprehensive Reaction Inform...
UDM (Unified Data Model) - Enabling Exchange of Comprehensive Reaction Inform...UDM (Unified Data Model) - Enabling Exchange of Comprehensive Reaction Inform...
UDM (Unified Data Model) - Enabling Exchange of Comprehensive Reaction Inform...
 
ICIC 2017: Tutorial - Digging bioactive chemistry out of patents using open r...
ICIC 2017: Tutorial - Digging bioactive chemistry out of patents using open r...ICIC 2017: Tutorial - Digging bioactive chemistry out of patents using open r...
ICIC 2017: Tutorial - Digging bioactive chemistry out of patents using open r...
 
CompTox Chemicals Dashboard: Data and tools to support chemical and environme...
CompTox Chemicals Dashboard: Data and tools to support chemical and environme...CompTox Chemicals Dashboard: Data and tools to support chemical and environme...
CompTox Chemicals Dashboard: Data and tools to support chemical and environme...
 
Data drivenapproach to medicinalchemistry
Data drivenapproach to medicinalchemistryData drivenapproach to medicinalchemistry
Data drivenapproach to medicinalchemistry
 
eScience at the Royal Society of Chemistry and our current initiatives
eScience at the Royal Society of Chemistry and our current initiativeseScience at the Royal Society of Chemistry and our current initiatives
eScience at the Royal Society of Chemistry and our current initiatives
 
The US-EPA CompTox Chemicals Dashboard to support Non-Targeted Analysis
The US-EPA CompTox Chemicals Dashboard to support Non-Targeted AnalysisThe US-EPA CompTox Chemicals Dashboard to support Non-Targeted Analysis
The US-EPA CompTox Chemicals Dashboard to support Non-Targeted Analysis
 
Activities at the Royal Society of Chemistry to gather, extract and analyze b...
Activities at the Royal Society of Chemistry to gather, extract and analyze b...Activities at the Royal Society of Chemistry to gather, extract and analyze b...
Activities at the Royal Society of Chemistry to gather, extract and analyze b...
 

Último

Architecting Cloud Native Applications
Architecting Cloud Native ApplicationsArchitecting Cloud Native Applications
Architecting Cloud Native Applications
WSO2
 
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
?#DUbAI#??##{{(☎️+971_581248768%)**%*]'#abortion pills for sale in dubai@
 
Why Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire businessWhy Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire business
panagenda
 

Último (20)

GenAI Risks & Security Meetup 01052024.pdf
GenAI Risks & Security Meetup 01052024.pdfGenAI Risks & Security Meetup 01052024.pdf
GenAI Risks & Security Meetup 01052024.pdf
 
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
 
Architecting Cloud Native Applications
Architecting Cloud Native ApplicationsArchitecting Cloud Native Applications
Architecting Cloud Native Applications
 
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
 
Manulife - Insurer Transformation Award 2024
Manulife - Insurer Transformation Award 2024Manulife - Insurer Transformation Award 2024
Manulife - Insurer Transformation Award 2024
 
Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...
Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...
Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...
 
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
 
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot TakeoffStrategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
 
Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected Worker
 
2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...
 
Data Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonData Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt Robison
 
FWD Group - Insurer Innovation Award 2024
FWD Group - Insurer Innovation Award 2024FWD Group - Insurer Innovation Award 2024
FWD Group - Insurer Innovation Award 2024
 
Artificial Intelligence Chap.5 : Uncertainty
Artificial Intelligence Chap.5 : UncertaintyArtificial Intelligence Chap.5 : Uncertainty
Artificial Intelligence Chap.5 : Uncertainty
 
Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)
 
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
 
Polkadot JAM Slides - Token2049 - By Dr. Gavin Wood
Polkadot JAM Slides - Token2049 - By Dr. Gavin WoodPolkadot JAM Slides - Token2049 - By Dr. Gavin Wood
Polkadot JAM Slides - Token2049 - By Dr. Gavin Wood
 
EMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWER
EMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWEREMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWER
EMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWER
 
MINDCTI Revenue Release Quarter One 2024
MINDCTI Revenue Release Quarter One 2024MINDCTI Revenue Release Quarter One 2024
MINDCTI Revenue Release Quarter One 2024
 
Why Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire businessWhy Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire business
 

ChemSpider reactions – delivering a free community resource of chemical syntheses

  • 1. ChemSpider Reactions: Delivering a free community resource of chemical syntheses Valery Tkachenko, Colin Batchelor, Daniel Lowe, Ken Karapetyan, David Sharpe and Antony Williams ACS New Orleans April 2013
  • 2. Overview • Motivation • The RSC and chemical reaction data • New sources of chemical reaction data • ChemSpider Reactions: bringing it all together • Experiments with reaction classification • The National Chemical Database Service
  • 3. Who needs another reaction database? • Those who cannot afford to license access… • Those who would like to access data that is not abstracted • Those who might like to contribute data to a database • Anybody wanting to integrate their systems in and to pull data out.
  • 4. RSC and chemical reaction data 1 Graphical abstracting journals: Methods in Organic Synthesis (monthly, 1990 to present) Catalysts and Catalysed Reactions (monthly, 2005 to present) These constitute a backfile of over 50000 novel reactions
  • 5. RSC and chemical reaction data 2
  • 6. RSC and chemical reaction data 3
  • 7. New sources of reaction data Daniel Lowe’s PhD thesis (Cantab, 2012) was on extracting reactions from US patent data. We can apply this technology to the RSC Journal archive.
  • 8. ChemSpider Reactions bringing it all together http://csr.dev.rsc-us.org/ WORK IN PROGRESS
  • 9. Reaction classification 1 Project Prospect has text-mined RSC journal articles for named reactions and molecular processes, annotated according to Creative Commons-licensed ontologies: See http://rxno.googlecode.com/
  • 10. Reaction classification 2 Classification of Daniel’s US Patent data
  • 11. Reaction InChI To do for reactions what InChI has done for structures •Think online searching •Deduplication and linking http://www-rinchi.ch.cam.ac.uk/help.html
  • 12. Reaction InChI Early work – RInChIs layered on to a few hundred thousand reactions •Not generated for a few 10s of thousands of reactions •Reaction deduplication results differ based on algorithm – GGA software versus RInChI •Under investigation
  • 13. Other sources ChemSpider SyntheticPages •Electronic Lab Notebooks •University repositories Please send theses
  • 14. What will ChemSpider Reactions serve? • Chemical Database Service • Linking back to original publications/supplementary data • Underpinning other tools e.g. retrosynthetic analysis (depends on data quality and mapping)
  • 15. Chemical Database Service National Chemical Database Service for UK academics Integrates commercial databases and services Chemicals, analytical data, prediction algorithms Development of data repository
  • 16. ARChem from SimBioSys 1 Synthesis planning tool which performs rule- and precedent-based retrosynthetic analysis back to commercially available starting materials.
  • 19. But what about data quality? • Data validation and curation required • Encouraging participation with Rewards and RECOGNITION
  • 20. Manual curation • Integrated commenting, curating and validation platform across ALL eScience and Publishing platforms • All integrated to a central RSC profile and feeding the alt-metrics tools
  • 21. The other kind of RDF (made-up example) Chemical reactions are unusually well-suited to representation. (Donald Davidson’s event semantics) _:r1 a obo:RXNO_0000004 ; # Diels–Alder obo:has_participant_ceasing_to_exist _:m1 ; # a diene obo:has_participant_ceasing_to_exist _:m2 ; # an olefin obo:has_participant_starting_to_exist _:m3 . # a substituted cyclohexene _:m1 a <http://rdf.chemspider.com/233000> . _:m2 a <http://rdf.chemspider.com/233001> . _:m3 a <http://rdf.chemspider.com/233002> .