CINF 13, ACS Fall 2017, Washington, D.C.
pistachio
Search and Faceting of Large Reaction Databases
John Mayfield, Daniel Lowe, Roger Sayle
What do Synthetic Chemists Want from Their
Reaction Systems?
Data, Classification, Diagrams, Search
HazELNut Filbert NameRXN Cobnut
Accelrys Pipeline Pilot (AstraZeneca, AbbVie & Hoffmann-La Roche)
ChemAxon JChem Cartridge (GlaxoSmithKline & Novartis)
Elsevier Reaxys (Hoffmann-La Roche, AstraZeneca, Merck)
Perkin Elmer Informatics (formerly CambridgeSoft) eNotebook v9, v11 or v13, or Symyx ELN v5.x or v6.x
Oracle Server version 10, 11 or
Microsoft Windows, Linux or Mac OS
Infrastructure for liberating and processing
reactions from Electronic Lab Notebooks (ELNs)
To 7-chloro-4-oxo-4,5-dihydrofuro[2,3-d]pyridazine-2-carboxylic acid (Peakdale) (220 mg, 1.025 mmol) and (3,4-dimethoxyphenyl)boronic acid (187 mg, 1.025 mmol) in 1,4-dioxane (3 mL) and water (1.5 mL) was added sodium carbonate (435 mg, 4.10 mmol) and tetrakis(triphenylphosphine)palladium(0) (110 mg, 0.095 mmol). The reaction was heated in the microwave at 80° C. for 2 hours and at 100° C. for a further 2 hours. The solvent was removed and the residue was suspended in DMSO, filtered and purified by MDAP. Appropriate fractions were combined and the solvent removed to give 7-(3,4-dimethoxyphenyl)-4-oxo-4,5-dihydrofuro[2,3-d]pyridazine-2-carboxylic acid (25 mg, 7%) as a yellow solid.
US 2016/16966 A1 [0517]
Daniel M. Lowe. Extraction of chemical structures and reactions from the literature. Ph.D. Thesis,
University of Cambridge, 2012
Product Properties
• 7-(3,4-dimethoxyphenyl)-4-oxo-4,5-dihydrofuro[2,3-d]pyridazine-2-carboxylic acid: 25 mg, 7% yield, yellow solid
Reactant Properties
• 7-chloro-4-oxo-4,5-dihydrofuro[2,3-d]pyridazine-2-carboxylic acid: 220 mg, 1.025 mmol
• (3,4-dimethoxyphenyl)boronic acid: 187 mg, 1.025 mmol
Agent Properties
• 1,4-dioxane: 3 mL
• water: 1.5 mL
• sodium carbonate: 435 mg, 4.10 mmol
• tetrakis(triphenylphosphine)palladium(0): 110 mg, 0.095 mmol
• DMSO
Unstructured text to a structured reaction table
US 2016/16966 A1
LeadMine + Chemical Tagger
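One way to picture the structured reaction table is as a list of typed records. The sketch below is illustrative only: the `Component` class, its field names, and the `equivalents` helper are assumptions made for this example, not the LeadMine or Pistachio data model. The amounts are transcribed from the patent paragraph for US 2016/16966 A1 [0517].

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class Component:
    """One row of a structured reaction table (hypothetical schema)."""
    name: str
    role: str  # "reactant", "agent" or "product"
    mass_mg: Optional[float] = None
    volume_ml: Optional[float] = None
    amount_mmol: Optional[float] = None


# Rows transcribed from US 2016/16966 A1 [0517].
REACTION: List[Component] = [
    Component("7-chloro-4-oxo-4,5-dihydrofuro[2,3-d]pyridazine-2-carboxylic acid",
              "reactant", mass_mg=220, amount_mmol=1.025),
    Component("(3,4-dimethoxyphenyl)boronic acid",
              "reactant", mass_mg=187, amount_mmol=1.025),
    Component("1,4-dioxane", "agent", volume_ml=3),
    Component("water", "agent", volume_ml=1.5),
    Component("sodium carbonate", "agent", mass_mg=435, amount_mmol=4.10),
    Component("tetrakis(triphenylphosphine)palladium(0)",
              "agent", mass_mg=110, amount_mmol=0.095),
    Component("DMSO", "agent"),
    Component("7-(3,4-dimethoxyphenyl)-4-oxo-4,5-dihydrofuro[2,3-d]pyridazine-2-carboxylic acid",
              "product", mass_mg=25),
]


def equivalents(components: List[Component], reference: str) -> Dict[str, float]:
    """Molar equivalents of each non-product component relative to `reference`."""
    ref = next(c for c in components if c.name == reference)
    return {
        c.name: c.amount_mmol / ref.amount_mmol
        for c in components
        if c.amount_mmol is not None and c.role != "product"
    }


EQUIV = equivalents(REACTION, "(3,4-dimethoxyphenyl)boronic acid")
```

Once the text is in this form, simple questions become one-liners: here sodium carbonate works out at about 4 equivalents and the palladium catalyst at about 0.09 equivalents relative to the boronic acid.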
Christos Nicolaou et al. The Proximal Lilly Collection: Mapping, Exploring and Exploiting
Feasible Chemical Space J. Chem. Inf. Model., 2016, 56 (7), pp 1253–1266
Nadine Schneider et al. Big Data from Pharmaceutical Patents: A Computational Analysis of
Medicinal Chemists’ Bread and Butter. J. Med. Chem., 2016, 59 (9), pp 4385–4402
Nadine Schneider et al. Development of a Novel Fingerprint for Chemical Reactions and Its
Application to Large-Scale Reaction Classification and Similarity J. Chem. Inf.
Model., 2015, 55 (1), pp 39–53
Nadine Schneider et al. What’s What: The (Nearly) Definitive Guide to Reaction Role
Assignment. J. Chem. Inf. Model., 2016, 56 (12), pp 2336–2346
Connor Coley et al. Prediction of Organic Reaction Outcomes Using Machine Learning. ACS
Cent. Sci., 2017, 3 (5), pp 434–443
Data impact
Public subset released in 2014 as CC-Zero
Pistachio expands the scope of the data and uses Atom-Atom Maps from NameRxn
Example 26. Epizyme Inc. 1-phenoxy-3-(alkylamino)-propan-2-ol derivatives as CARM1 inhibitors and uses thereof (US 09718816 B2), Aug. 1, 2017
John May, et al. Sketchy Sketches: Hiding Chemistry in Plain Sight. Seventh Joint Sheffield Conference on Cheminformatics. 2016
[Figure: sketch from Example 26, US 09718816 B2, annotated Step 1, Step 2, Step 3, Step 4, etc.]
sketch extraction
NextMove’s Praline
total reactions over time
[Figure: cumulative reaction details by year, 1997 to 2018, y-axis 0 to 3.5M; series: EPO Applications, EPO Grants, USPTO Applications, USPTO Grants]
reaction DIAGRAMS
Good reaction diagrams are essential in
communicating synthetic chemistry
Layout can be stored or generated
• When extracting from text, layout must be generated
• Generated diagrams can be unsatisfactory for display
[Figure: the same reaction depicted by ChemDraw, OEChem, ChemAxon and BIOVIA, each generated from SMILES for US 2016/16966 A1 [0517]]
diagram improvements
Typical workarounds:
• Separately render molecules
• Hide agents and list separately
What do humans do:
• Wrap products below
• Abbreviate functional groups and agents
• Orientate reactants to products and vice versa
• Hide agents and list as text
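The "hide agents and list as text" idea can be sketched in a few lines. This is a simplified illustration, not the Pistachio+CDK implementation: the `AGENT_LABELS` table, the function names, and the example reaction SMILES are all made up for this sketch. Splitting on "." also separates the ions of a salt, which a real system would need to regroup.

```python
# Hypothetical abbreviation table keyed by agent SMILES (illustrative only).
AGENT_LABELS = {
    "O": "H2O",
    "C1COCCO1": "dioxane",
    "CS(C)=O": "DMSO",
}


def split_reaction(rxn_smiles):
    """Split 'reactants>agents>products' into three lists of component SMILES."""
    parts = rxn_smiles.split(">")
    if len(parts) != 3:
        raise ValueError("expected reaction SMILES of the form R>A>P")
    reactants, agents, products = (
        [s for s in part.split(".") if s] for part in parts
    )
    return reactants, agents, products


def depiction_plan(rxn_smiles):
    """Decide what to draw as structures and what to print as text over the arrow."""
    reactants, agents, products = split_reaction(rxn_smiles)
    # Agents with a known abbreviation become text; unknown ones fall back to SMILES.
    arrow_text = [AGENT_LABELS.get(a, a) for a in agents]
    return {"draw": reactants + products, "arrow_text": arrow_text}


# Made-up Suzuki-type example with dioxane and water as agents.
PLAN = depiction_plan("Clc1ccccc1.OB(O)c1ccccc1>C1COCCO1.O>c1ccc(-c2ccccc2)cc1")
```

Only the two reactants and the product are left to lay out as structures, which is usually what makes a generated diagram fit on one line.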
[Figure: Pistachio+CDK depictions, (Abbreviated) and (Abbreviated+Aligned), generated from SMILES for US 2016/16966 A1 [0517]]
reaction detail view
4.1.6 Cyclic Beckmann rearrangement
Assigns names to 900+ reactions using transformations
Can guarantee perfect Atom-Atom Mapping
• Atom-Atom Mapping is an output not an input
• MCS mappers struggle with rearrangements:
namerxn
concepts and rxno
1 Heteroatom alkylation and arylation
    .7 O-substitution
        .1 Chan-Lam ether coupling
        .2 Diazomethane esterification
        .3 Ethyl esterification
        .4 Hydroxy to methoxy
        .5 Hydroxy to triflyloxy
        .6 Methyl esterification
        .n
2 Acylation and related processes
    .6 O-acylation to ester
        .1 Ester Schotten-Baumann
        .2 Esterification (generic)
        .3 Fischer-Speier esterification
        .4 Baeyer-Villiger oxidation
        .5 Yamaguchi esterification
        .6 Hydroxy to imidazolecarbonyloxy
        .7 Imidazolecarbonyl to ester
        .8 Hydroxy to acetoxy
        .9 Steglich esterification
        .n
Esterification (7)
Chan-Lam coupling (3)
Schotten-Baumann Reaction (9)
RXNO: http://github.com/rsc-ontologies/rxno
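A concept such as "esterification" cuts across the class hierarchy, which a short sketch can make concrete. The dotted codes below are assembled from the partial labels on the slide (for example "1" + ".7" + ".2"), and the keyword match on class names is only a stand-in for a real ontology mapping such as RXNO; the dictionary and function names are assumptions for this example.

```python
# Class codes assembled from the hierarchy excerpt; names as shown on the slide.
NAMERXN_CLASSES = {
    "1.7.1": "Chan-Lam ether coupling",
    "1.7.2": "Diazomethane esterification",
    "1.7.3": "Ethyl esterification",
    "1.7.4": "Hydroxy to methoxy",
    "1.7.5": "Hydroxy to triflyloxy",
    "1.7.6": "Methyl esterification",
    "2.6.1": "Ester Schotten-Baumann",
    "2.6.2": "Esterification (generic)",
    "2.6.3": "Fischer-Speier esterification",
    "2.6.4": "Baeyer-Villiger oxidation",
    "2.6.5": "Yamaguchi esterification",
    "2.6.6": "Hydroxy to imidazolecarbonyloxy",
    "2.6.7": "Imidazolecarbonyl to ester",
    "2.6.8": "Hydroxy to acetoxy",
    "2.6.9": "Steglich esterification",
}


def classes_for_concept(keyword, classes=NAMERXN_CLASSES):
    """Return the codes whose name mentions the keyword (case-insensitive)."""
    key = keyword.lower()
    return sorted(code for code, name in classes.items() if key in name.lower())


ESTERIFICATIONS = classes_for_concept("esterification")
# Top-level branches that the concept touches.
BRANCHES = {code.split(".")[0] for code in ESTERIFICATIONS}
```

In this excerpt the matching classes sit under both "1 Heteroatom alkylation and arylation" and "2 Acylation and related processes", so a search by concept returns reactions that a single branch of the tree would miss.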
result FACETS
Provides summary over the key concepts of results
Cut through information deluge and refine search
• Reaction Types (NextMove ontology tree)
• Drug Targets (ChEMBL ontology tree)
• Disease Targets (MESH ontology tree)
• Yields
• Affiliation (NextMove ontology tree)
• Publication Date, Documents, Authors
facet calculation
Resource expensive: O(n) in the size of the result set
• Client, server, or database?
• Overhead copying and transferring data that is not needed
• Calculate when requested or up-front?
Custom cartridge: 2.9 seconds to summarise all 6.6 million rows (Intel(R) Core(TM) i7-6900K CPU @ 3.20GHz)
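To see why facet calculation is O(n) in the size of the result set, consider a minimal in-memory sketch: every facet needs one pass over every hit. The row layout (`rxn_class`, `yield`), the bin width, and the function names are assumptions for this illustration and say nothing about how the custom cartridge is implemented.

```python
from collections import Counter


def tree_facet(rows, field="rxn_class"):
    """Count hits per ontology node, rolling each leaf up to all its ancestors."""
    counts = Counter()
    for row in rows:
        parts = row[field].split(".")
        for depth in range(1, len(parts) + 1):
            counts[".".join(parts[:depth])] += 1
    return counts


def yield_facet(rows, width=20):
    """Bucket yields into fixed-width percentage bins; rows without a yield are skipped."""
    bins = Counter()
    for row in rows:
        y = row.get("yield")
        if y is None:
            continue
        low = min(int(y // width) * width, 100 - width)  # keep 100% in the top bin
        bins[f"{low}-{low + width}%"] += 1
    return bins


# Made-up result set of four reactions.
ROWS = [
    {"rxn_class": "1.7.2", "yield": 80},
    {"rxn_class": "1.7.6", "yield": 7},
    {"rxn_class": "2.6.3", "yield": None},
    {"rxn_class": "2.6.3", "yield": 95},
]
CLASS_COUNTS = tree_facet(ROWS)
YIELD_COUNTS = yield_facet(ROWS)
```

Doing this on the client means shipping every row across the network first, which is the copying overhead the slide warns about; counting next to the data avoids it.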
one entry point
Systematic Name, Date Range, Trivial Name, Yield Range, Affiliation, Reaction SMARTS,
Disease Target, Document, Line Formula, SMILES, InChI, Author, Protein Target,
Collection, Reaction Type (NameRxn), SMARTS, Source
…and logical combinations thereof
suggestions
Based on global frequency
Based on context frequency
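The difference between the two ranking modes can be sketched with a prefix matcher that is handed either global term counts or counts taken from the current result set. The term lists and counts below are invented for illustration, and `suggest` is a hypothetical helper rather than Pistachio's suggestion engine.

```python
def suggest(prefix, term_counts, limit=3):
    """Return up to `limit` terms starting with `prefix`, most frequent first."""
    p = prefix.lower()
    matches = [(term, n) for term, n in term_counts.items() if term.lower().startswith(p)]
    # Sort by descending count, then alphabetically so ties are stable.
    matches.sort(key=lambda item: (-item[1], item[0]))
    return [term for term, _ in matches[:limit]]


# Invented counts over the whole database.
GLOBAL_COUNTS = {
    "Suzuki coupling": 500,
    "SNAr ether synthesis": 300,
    "Stille reaction": 120,
    "Steglich esterification": 40,
}
# Invented counts within the current result set only.
CONTEXT_COUNTS = {
    "Steglich esterification": 12,
    "Suzuki coupling": 3,
}

GLOBAL_SUGGESTIONS = suggest("s", GLOBAL_COUNTS)
CONTEXT_SUGGESTIONS = suggest("s", CONTEXT_COUNTS)
```

With global counts the common Suzuki coupling comes first; with context counts the term that dominates the current results is promoted, and terms absent from the results are not offered at all.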
structure search technology
NextMove’s Arthor Technology
Up to 100x faster than state-of-the-art
Combination of SMARTS
compilation and efficient storage
Preliminary PostgreSQL integration
36s Arthor
56m BIOVIA Direct (Oracle)
1h Bingo (NoSQL)
1h54m Bingo (PostgreSQL)
2h6m Bingo (Oracle)
2h41m JChem (Oracle)
5h9m RDCart (PostgreSQL)
13h54m pgchem (PostgreSQL)
1d1h52m mychem (MySQL)
3d1h13m orchem (Oracle)
Benchmark: ~3.5K queries against ~7M structures (eMolecules 2014) all on the same hardware.
John May and Roger Sayle, Substructure Search Face-off, May 2015
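The benchmark timings are easier to compare once converted to seconds. The sketch below parses the duration strings exactly as listed and expresses each system's time as a multiple of Arthor's; the helper names are assumptions for this example.

```python
import re

UNIT_SECONDS = {"d": 86400, "h": 3600, "m": 60, "s": 1}


def to_seconds(text):
    """Convert a duration such as '1d1h52m' or '36s' to seconds."""
    pairs = re.findall(r"(\d+)([dhms])", text)
    if not pairs or "".join(n + u for n, u in pairs) != text:
        raise ValueError(f"unrecognised duration: {text!r}")
    return sum(int(n) * UNIT_SECONDS[u] for n, u in pairs)


# Timings as listed in the benchmark.
TIMINGS = {
    "Arthor": "36s",
    "BIOVIA Direct (Oracle)": "56m",
    "Bingo (NoSQL)": "1h",
    "Bingo (PostgreSQL)": "1h54m",
    "Bingo (Oracle)": "2h6m",
    "JChem (Oracle)": "2h41m",
    "RDCart (PostgreSQL)": "5h9m",
    "pgchem (PostgreSQL)": "13h54m",
    "mychem (MySQL)": "1d1h52m",
    "orchem (Oracle)": "3d1h13m",
}


def relative_to(baseline, timings=TIMINGS):
    """Each system's run time divided by the baseline system's run time."""
    base = to_seconds(timings[baseline])
    return {name: to_seconds(t) / base for name, t in timings.items()}


RATIOS = relative_to("Arthor")
```

On these figures the next-fastest system takes roughly 93 times as long as Arthor, in line with the "up to 100x" claim, and the slowest takes several thousand times as long.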
Intention can be refined by qualifiers
• Role: {structure} product
• Substructure: {structure} substructure; {structure} substructure product
• Make/Break: Synthesis of {structure}
• Combined with other terms: {structure} substructure product and yield of 80%
refining structure search
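A toy parser shows how such qualifiers could be peeled off a free-text query. It is a sketch under stated assumptions, not Pistachio's query grammar: the output field names are invented, "reactant" and "agent" are assumed as roles alongside the "product" role shown on the slide, and the yield term is stored as given without deciding whether it means a minimum or an exact value.

```python
import re

ROLES = ("product", "reactant", "agent")  # only "product" appears in the source


def parse_query(query):
    """Split a query into structure text plus role, mode, intent and yield qualifiers."""
    result = {"structure": None, "mode": "exact", "role": None,
              "intent": None, "yield_percent": None}
    text = query.strip()

    # Trailing "... and yield of N%"
    m = re.search(r"\s+and\s+yield\s+of\s+(\d+)%\s*$", text, re.IGNORECASE)
    if m:
        result["yield_percent"] = int(m.group(1))
        text = text[:m.start()]

    # Leading "Synthesis of ..." marks the structure as something being made
    m = re.match(r"synthesis\s+of\s+(.+)$", text, re.IGNORECASE)
    if m:
        result["intent"] = "make"
        text = m.group(1)

    words = text.split()
    if words and words[-1].lower() in ROLES:
        result["role"] = words.pop().lower()
    if words and words[-1].lower() == "substructure":
        result["mode"] = "substructure"
        words.pop()
    result["structure"] = " ".join(words)
    return result


COMBINED = parse_query("7H-purine substructure product and yield of 80%")
MAKE = parse_query("Synthesis of 7H-purine")
```

The structure text that remains (here "7H-purine") would still need to be resolved to a structure by name-to-structure conversion or by parsing it as SMILES.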
Find: 7H-purine substructure product
Find: Synthesis of 7H-purine
make/break example
Find: 7H-purine-8-one substructure chlorination
Find: [*:1][CH2:2]Cl>>[*:1][CH2:2]F
Namerxn example
Acknowledgements
Noel O’Boyle (NextMove Software), Egon Willighagen (CDK)
James Davison, Matt Swain (Vernalis)
pistachio
http://www.nextmovesoftware.com/pistachio.html
Come find me around ACS for a demo!
See also: CINF 90

More Related Content

What's hot

Dimensionality reduction with t-SNE(Rtsne) and UMAP(uwot) using R packages.
Dimensionality reduction with t-SNE(Rtsne) and UMAP(uwot) using R packages. Dimensionality reduction with t-SNE(Rtsne) and UMAP(uwot) using R packages.
Dimensionality reduction with t-SNE(Rtsne) and UMAP(uwot) using R packages. Satoshi Kato
 
ランダムフォレスト
ランダムフォレストランダムフォレスト
ランダムフォレストKinki University
 
Cluster Analysis Introduction
Cluster Analysis IntroductionCluster Analysis Introduction
Cluster Analysis IntroductionPrasiddhaSarma
 
Causal discovery and prediction mechanisms
Causal discovery and prediction mechanismsCausal discovery and prediction mechanisms
Causal discovery and prediction mechanismsShiga University, RIKEN
 
Deep residual learning for image recognition
Deep residual learning for image recognitionDeep residual learning for image recognition
Deep residual learning for image recognitionYoonho Shin
 
Densely Connected Convolutional Networks
Densely Connected Convolutional NetworksDensely Connected Convolutional Networks
Densely Connected Convolutional Networksharmonylab
 
[第2回3D勉強会 研究紹介] Neural 3D Mesh Renderer (CVPR 2018)
[第2回3D勉強会 研究紹介] Neural 3D Mesh Renderer (CVPR 2018)[第2回3D勉強会 研究紹介] Neural 3D Mesh Renderer (CVPR 2018)
[第2回3D勉強会 研究紹介] Neural 3D Mesh Renderer (CVPR 2018)Hiroharu Kato
 
【DL輪読会】Deep Transformers without Shortcuts: Modifying Self-attention for Fait...
【DL輪読会】Deep Transformers without Shortcuts: Modifying Self-attention for Fait...【DL輪読会】Deep Transformers without Shortcuts: Modifying Self-attention for Fait...
【DL輪読会】Deep Transformers without Shortcuts: Modifying Self-attention for Fait...Deep Learning JP
 
[DL輪読会]ICLR2020の分布外検知速報
[DL輪読会]ICLR2020の分布外検知速報[DL輪読会]ICLR2020の分布外検知速報
[DL輪読会]ICLR2020の分布外検知速報Deep Learning JP
 
[DL輪読会]GENESIS: Generative Scene Inference and Sampling with Object-Centric L...
[DL輪読会]GENESIS: Generative Scene Inference and Sampling with Object-Centric L...[DL輪読会]GENESIS: Generative Scene Inference and Sampling with Object-Centric L...
[DL輪読会]GENESIS: Generative Scene Inference and Sampling with Object-Centric L...Deep Learning JP
 
[DL輪読会]Adversarial Learning for Zero-shot Domain Adaptation
[DL輪読会]Adversarial Learning for Zero-shot Domain Adaptation[DL輪読会]Adversarial Learning for Zero-shot Domain Adaptation
[DL輪読会]Adversarial Learning for Zero-shot Domain AdaptationDeep Learning JP
 
[DL Hacks 実装]MIDINET: A Convolutional Generative Adversarial Network For Symb...
[DL Hacks 実装]MIDINET: A Convolutional Generative Adversarial Network For Symb...[DL Hacks 実装]MIDINET: A Convolutional Generative Adversarial Network For Symb...
[DL Hacks 実装]MIDINET: A Convolutional Generative Adversarial Network For Symb...Deep Learning JP
 
Cs231n 2017 lecture13 Generative Model
Cs231n 2017 lecture13 Generative ModelCs231n 2017 lecture13 Generative Model
Cs231n 2017 lecture13 Generative ModelYanbin Kong
 
確率モデルを使ったグラフクラスタリング
確率モデルを使ったグラフクラスタリング確率モデルを使ったグラフクラスタリング
確率モデルを使ったグラフクラスタリング正志 坪坂
 
[DL輪読会]AdaShare: Learning What To Share For Efficient Deep Multi-Task Learning
[DL輪読会]AdaShare: Learning What To Share For Efficient Deep Multi-Task Learning[DL輪読会]AdaShare: Learning What To Share For Efficient Deep Multi-Task Learning
[DL輪読会]AdaShare: Learning What To Share For Efficient Deep Multi-Task LearningDeep Learning JP
 
Introduction to Cassandra
Introduction to CassandraIntroduction to Cassandra
Introduction to CassandraGokhan Atil
 
非制約最小二乗密度比推定法 uLSIF を用いた外れ値検出
非制約最小二乗密度比推定法 uLSIF を用いた外れ値検出非制約最小二乗密度比推定法 uLSIF を用いた外れ値検出
非制約最小二乗密度比推定法 uLSIF を用いた外れ値検出hoxo_m
 
AI勉強会用スライド
AI勉強会用スライドAI勉強会用スライド
AI勉強会用スライドharmonylab
 
夏のトップカンファレンス論文読み会 / Realtime Multi-Person 2D Pose Estimation using Part Affin...
夏のトップカンファレンス論文読み会 / Realtime Multi-Person 2D Pose Estimation using Part Affin...夏のトップカンファレンス論文読み会 / Realtime Multi-Person 2D Pose Estimation using Part Affin...
夏のトップカンファレンス論文読み会 / Realtime Multi-Person 2D Pose Estimation using Part Affin...Shunsuke Ono
 
Deeplearning bank marketing dataset
Deeplearning bank marketing datasetDeeplearning bank marketing dataset
Deeplearning bank marketing datasetTellSun
 

What's hot (20)

Dimensionality reduction with t-SNE(Rtsne) and UMAP(uwot) using R packages.
Dimensionality reduction with t-SNE(Rtsne) and UMAP(uwot) using R packages. Dimensionality reduction with t-SNE(Rtsne) and UMAP(uwot) using R packages.
Dimensionality reduction with t-SNE(Rtsne) and UMAP(uwot) using R packages.
 
ランダムフォレスト
ランダムフォレストランダムフォレスト
ランダムフォレスト
 
Cluster Analysis Introduction
Cluster Analysis IntroductionCluster Analysis Introduction
Cluster Analysis Introduction
 
Causal discovery and prediction mechanisms
Causal discovery and prediction mechanismsCausal discovery and prediction mechanisms
Causal discovery and prediction mechanisms
 
Deep residual learning for image recognition
Deep residual learning for image recognitionDeep residual learning for image recognition
Deep residual learning for image recognition
 
Densely Connected Convolutional Networks
Densely Connected Convolutional NetworksDensely Connected Convolutional Networks
Densely Connected Convolutional Networks
 
[第2回3D勉強会 研究紹介] Neural 3D Mesh Renderer (CVPR 2018)
[第2回3D勉強会 研究紹介] Neural 3D Mesh Renderer (CVPR 2018)[第2回3D勉強会 研究紹介] Neural 3D Mesh Renderer (CVPR 2018)
[第2回3D勉強会 研究紹介] Neural 3D Mesh Renderer (CVPR 2018)
 
【DL輪読会】Deep Transformers without Shortcuts: Modifying Self-attention for Fait...
【DL輪読会】Deep Transformers without Shortcuts: Modifying Self-attention for Fait...【DL輪読会】Deep Transformers without Shortcuts: Modifying Self-attention for Fait...
【DL輪読会】Deep Transformers without Shortcuts: Modifying Self-attention for Fait...
 
[DL輪読会]ICLR2020の分布外検知速報
[DL輪読会]ICLR2020の分布外検知速報[DL輪読会]ICLR2020の分布外検知速報
[DL輪読会]ICLR2020の分布外検知速報
 
[DL輪読会]GENESIS: Generative Scene Inference and Sampling with Object-Centric L...
[DL輪読会]GENESIS: Generative Scene Inference and Sampling with Object-Centric L...[DL輪読会]GENESIS: Generative Scene Inference and Sampling with Object-Centric L...
[DL輪読会]GENESIS: Generative Scene Inference and Sampling with Object-Centric L...
 
[DL輪読会]Adversarial Learning for Zero-shot Domain Adaptation
[DL輪読会]Adversarial Learning for Zero-shot Domain Adaptation[DL輪読会]Adversarial Learning for Zero-shot Domain Adaptation
[DL輪読会]Adversarial Learning for Zero-shot Domain Adaptation
 
[DL Hacks 実装]MIDINET: A Convolutional Generative Adversarial Network For Symb...
[DL Hacks 実装]MIDINET: A Convolutional Generative Adversarial Network For Symb...[DL Hacks 実装]MIDINET: A Convolutional Generative Adversarial Network For Symb...
[DL Hacks 実装]MIDINET: A Convolutional Generative Adversarial Network For Symb...
 
Cs231n 2017 lecture13 Generative Model
Cs231n 2017 lecture13 Generative ModelCs231n 2017 lecture13 Generative Model
Cs231n 2017 lecture13 Generative Model
 
確率モデルを使ったグラフクラスタリング
確率モデルを使ったグラフクラスタリング確率モデルを使ったグラフクラスタリング
確率モデルを使ったグラフクラスタリング
 
[DL輪読会]AdaShare: Learning What To Share For Efficient Deep Multi-Task Learning
[DL輪読会]AdaShare: Learning What To Share For Efficient Deep Multi-Task Learning[DL輪読会]AdaShare: Learning What To Share For Efficient Deep Multi-Task Learning
[DL輪読会]AdaShare: Learning What To Share For Efficient Deep Multi-Task Learning
 
Introduction to Cassandra
Introduction to CassandraIntroduction to Cassandra
Introduction to Cassandra
 
非制約最小二乗密度比推定法 uLSIF を用いた外れ値検出
非制約最小二乗密度比推定法 uLSIF を用いた外れ値検出非制約最小二乗密度比推定法 uLSIF を用いた外れ値検出
非制約最小二乗密度比推定法 uLSIF を用いた外れ値検出
 
AI勉強会用スライド
AI勉強会用スライドAI勉強会用スライド
AI勉強会用スライド
 
夏のトップカンファレンス論文読み会 / Realtime Multi-Person 2D Pose Estimation using Part Affin...
夏のトップカンファレンス論文読み会 / Realtime Multi-Person 2D Pose Estimation using Part Affin...夏のトップカンファレンス論文読み会 / Realtime Multi-Person 2D Pose Estimation using Part Affin...
夏のトップカンファレンス論文読み会 / Realtime Multi-Person 2D Pose Estimation using Part Affin...
 
Deeplearning bank marketing dataset
Deeplearning bank marketing datasetDeeplearning bank marketing dataset
Deeplearning bank marketing dataset
 

Similar to CINF 13: Pistachio - Search and Faceting of Large Reaction Databases

Free online access to experimental and predicted chemical properties through ...
Free online access to experimental and predicted chemical properties through ...Free online access to experimental and predicted chemical properties through ...
Free online access to experimental and predicted chemical properties through ...Kamel Mansouri
 
ICIC 2013 Conference Proceedings Sebastian Radestock
ICIC 2013 Conference Proceedings Sebastian RadestockICIC 2013 Conference Proceedings Sebastian Radestock
ICIC 2013 Conference Proceedings Sebastian RadestockDr. Haxel Consult
 
Open-source tools for querying and organizing large reaction databases
Open-source tools for querying and organizing large reaction databasesOpen-source tools for querying and organizing large reaction databases
Open-source tools for querying and organizing large reaction databasesGreg Landrum
 
MMDS 2014: Myria (and Scalable Graph Clustering with RelaxMap)
MMDS 2014: Myria (and Scalable Graph Clustering with RelaxMap)MMDS 2014: Myria (and Scalable Graph Clustering with RelaxMap)
MMDS 2014: Myria (and Scalable Graph Clustering with RelaxMap)University of Washington
 
Cheminformatics II
Cheminformatics IICheminformatics II
Cheminformatics IIbaoilleach
 
Review of some successes
Review of some successesReview of some successes
Review of some successesAndrea Zaliani
 
Getting the Big Picture by Joining up the SAR dots
Getting the Big Picture by Joining up the SAR dotsGetting the Big Picture by Joining up the SAR dots
Getting the Big Picture by Joining up the SAR dotsSorel Muresan
 
Practical 9 protein structure and function (3)
Practical 9 protein structure and function  (3)Practical 9 protein structure and function  (3)
Practical 9 protein structure and function (3)Osama Barayan
 
The influence of data curation on QSAR Modeling – Presented at American Chemi...
The influence of data curation on QSAR Modeling – Presented at American Chemi...The influence of data curation on QSAR Modeling – Presented at American Chemi...
The influence of data curation on QSAR Modeling – Presented at American Chemi...Kamel Mansouri
 
R Analytics in the Cloud
R Analytics in the CloudR Analytics in the Cloud
R Analytics in the CloudDataMine Lab
 
Automatic extraction of bioactivity data from patents
Automatic extraction of bioactivity data from patentsAutomatic extraction of bioactivity data from patents
Automatic extraction of bioactivity data from patentsNextMove Software
 
Using open bioactivity data for developing machine-learning prediction models...
Using open bioactivity data for developing machine-learning prediction models...Using open bioactivity data for developing machine-learning prediction models...
Using open bioactivity data for developing machine-learning prediction models...Sunghwan Kim
 
2015-02-10 The Open PHACTS Discovery Platform: Semantic Data Integration for ...
2015-02-10 The Open PHACTS Discovery Platform: Semantic Data Integration for ...2015-02-10 The Open PHACTS Discovery Platform: Semantic Data Integration for ...
2015-02-10 The Open PHACTS Discovery Platform: Semantic Data Integration for ...open_phacts
 
CINF 2012 talk Recrystallization App
CINF 2012 talk Recrystallization AppCINF 2012 talk Recrystallization App
CINF 2012 talk Recrystallization AppJean-Claude Bradley
 

Similar to CINF 13: Pistachio - Search and Faceting of Large Reaction Databases (20)

The EPA Online Prediction Physicochemical Prediction Platform to Support Envi...
The EPA Online Prediction Physicochemical Prediction Platform to Support Envi...The EPA Online Prediction Physicochemical Prediction Platform to Support Envi...
The EPA Online Prediction Physicochemical Prediction Platform to Support Envi...
 
Free online access to experimental and predicted chemical properties through ...
Free online access to experimental and predicted chemical properties through ...Free online access to experimental and predicted chemical properties through ...
Free online access to experimental and predicted chemical properties through ...
 
ICIC 2013 Conference Proceedings Sebastian Radestock
ICIC 2013 Conference Proceedings Sebastian RadestockICIC 2013 Conference Proceedings Sebastian Radestock
ICIC 2013 Conference Proceedings Sebastian Radestock
 
Open-source tools for querying and organizing large reaction databases
Open-source tools for querying and organizing large reaction databasesOpen-source tools for querying and organizing large reaction databases
Open-source tools for querying and organizing large reaction databases
 
MMDS 2014: Myria (and Scalable Graph Clustering with RelaxMap)
MMDS 2014: Myria (and Scalable Graph Clustering with RelaxMap)MMDS 2014: Myria (and Scalable Graph Clustering with RelaxMap)
MMDS 2014: Myria (and Scalable Graph Clustering with RelaxMap)
 
Cheminformatics II
Cheminformatics IICheminformatics II
Cheminformatics II
 
Review of some successes
Review of some successesReview of some successes
Review of some successes
 
Getting the Big Picture by Joining up the SAR dots
Getting the Big Picture by Joining up the SAR dotsGetting the Big Picture by Joining up the SAR dots
Getting the Big Picture by Joining up the SAR dots
 
Websci17 final
Websci17 finalWebsci17 final
Websci17 final
 
Practical 9 protein structure and function (3)
Practical 9 protein structure and function  (3)Practical 9 protein structure and function  (3)
Practical 9 protein structure and function (3)
 
The influence of data curation on QSAR Modeling – Presented at American Chemi...
The influence of data curation on QSAR Modeling – Presented at American Chemi...The influence of data curation on QSAR Modeling – Presented at American Chemi...
The influence of data curation on QSAR Modeling – Presented at American Chemi...
 
Can a Free Access Structure-Centric Community for Chemists Benefit Drug Disco...
Can a Free Access Structure-Centric Community for Chemists Benefit Drug Disco...Can a Free Access Structure-Centric Community for Chemists Benefit Drug Disco...
Can a Free Access Structure-Centric Community for Chemists Benefit Drug Disco...
 
The influence of data curation on QSAR Modeling – examining issues of qualit...
 The influence of data curation on QSAR Modeling – examining issues of qualit... The influence of data curation on QSAR Modeling – examining issues of qualit...
The influence of data curation on QSAR Modeling – examining issues of qualit...
 
R Analytics in the Cloud
R Analytics in the CloudR Analytics in the Cloud
R Analytics in the Cloud
 
Automatic extraction of bioactivity data from patents
Automatic extraction of bioactivity data from patentsAutomatic extraction of bioactivity data from patents
Automatic extraction of bioactivity data from patents
 
Using open bioactivity data for developing machine-learning prediction models...
Using open bioactivity data for developing machine-learning prediction models...Using open bioactivity data for developing machine-learning prediction models...
Using open bioactivity data for developing machine-learning prediction models...
 
2015-02-10 The Open PHACTS Discovery Platform: Semantic Data Integration for ...
2015-02-10 The Open PHACTS Discovery Platform: Semantic Data Integration for ...2015-02-10 The Open PHACTS Discovery Platform: Semantic Data Integration for ...
2015-02-10 The Open PHACTS Discovery Platform: Semantic Data Integration for ...
 
CINF 2012 talk Recrystallization App
CINF 2012 talk Recrystallization AppCINF 2012 talk Recrystallization App
CINF 2012 talk Recrystallization App
 
GiTools
GiToolsGiTools
GiTools
 
Delivering The Benefits of Chemical-Biological Integration in Computational T...
Delivering The Benefits of Chemical-Biological Integration in Computational T...Delivering The Benefits of Chemical-Biological Integration in Computational T...
Delivering The Benefits of Chemical-Biological Integration in Computational T...
 

More from NextMove Software

CINF 170: Regioselectivity: An application of expert systems and ontologies t...
CINF 170: Regioselectivity: An application of expert systems and ontologies t...CINF 170: Regioselectivity: An application of expert systems and ontologies t...
CINF 170: Regioselectivity: An application of expert systems and ontologies t...NextMove Software
 
Building a bridge between human-readable and machine-readable representations...
Building a bridge between human-readable and machine-readable representations...Building a bridge between human-readable and machine-readable representations...
Building a bridge between human-readable and machine-readable representations...NextMove Software
 
CINF 35: Structure searching for patent information: The need for speed
CINF 35: Structure searching for patent information: The need for speedCINF 35: Structure searching for patent information: The need for speed
CINF 35: Structure searching for patent information: The need for speedNextMove Software
 
A de facto standard or a free-for-all? A benchmark for reading SMILES
A de facto standard or a free-for-all? A benchmark for reading SMILESA de facto standard or a free-for-all? A benchmark for reading SMILES
A de facto standard or a free-for-all? A benchmark for reading SMILESNextMove Software
 
Recent Advances in Chemical & Biological Search Systems: Evolution vs Revolution
Recent Advances in Chemical & Biological Search Systems: Evolution vs RevolutionRecent Advances in Chemical & Biological Search Systems: Evolution vs Revolution
Recent Advances in Chemical & Biological Search Systems: Evolution vs RevolutionNextMove Software
 
Can we agree on the structure represented by a SMILES string? A benchmark dat...
Can we agree on the structure represented by a SMILES string? A benchmark dat...Can we agree on the structure represented by a SMILES string? A benchmark dat...
Can we agree on the structure represented by a SMILES string? A benchmark dat...NextMove Software
 
Comparing Cahn-Ingold-Prelog Rule Implementations
Comparing Cahn-Ingold-Prelog Rule ImplementationsComparing Cahn-Ingold-Prelog Rule Implementations
Comparing Cahn-Ingold-Prelog Rule ImplementationsNextMove Software
 
Eugene Garfield: the father of chemical text mining and artificial intelligen...
Eugene Garfield: the father of chemical text mining and artificial intelligen...Eugene Garfield: the father of chemical text mining and artificial intelligen...
Eugene Garfield: the father of chemical text mining and artificial intelligen...NextMove Software
 
Chemical similarity using multi-terabyte graph databases: 68 billion nodes an...
Chemical similarity using multi-terabyte graph databases: 68 billion nodes an...Chemical similarity using multi-terabyte graph databases: 68 billion nodes an...
Chemical similarity using multi-terabyte graph databases: 68 billion nodes an...NextMove Software
 
Recent improvements to the RDKit
Recent improvements to the RDKitRecent improvements to the RDKit
Recent improvements to the RDKitNextMove Software
 
Pharmaceutical industry best practices in lessons learned: ELN implementation...
Pharmaceutical industry best practices in lessons learned: ELN implementation...Pharmaceutical industry best practices in lessons learned: ELN implementation...
Pharmaceutical industry best practices in lessons learned: ELN implementation...NextMove Software
 
Digital Chemical Representations
Digital Chemical RepresentationsDigital Chemical Representations
Digital Chemical RepresentationsNextMove Software
 
Challenges and successes in machine interpretation of Markush descriptions
Challenges and successes in machine interpretation of Markush descriptionsChallenges and successes in machine interpretation of Markush descriptions
Challenges and successes in machine interpretation of Markush descriptionsNextMove Software
 
PubChem as a Biologics Database
PubChem as a Biologics DatabasePubChem as a Biologics Database
PubChem as a Biologics DatabaseNextMove Software
 
CINF 17: Comparing Cahn-Ingold-Prelog Rule Implementations: The need for an o...
CINF 17: Comparing Cahn-Ingold-Prelog Rule Implementations: The need for an o...CINF 17: Comparing Cahn-Ingold-Prelog Rule Implementations: The need for an o...
CINF 17: Comparing Cahn-Ingold-Prelog Rule Implementations: The need for an o...NextMove Software
 
Building on Sand: Standard InChIs on non-standard molfiles
Building on Sand: Standard InChIs on non-standard molfilesBuilding on Sand: Standard InChIs on non-standard molfiles
Building on Sand: Standard InChIs on non-standard molfilesNextMove Software
 
Chemical Structure Representation of Inorganic Salts and Mixtures of Gases: A...
Chemical Structure Representation of Inorganic Salts and Mixtures of Gases: A...Chemical Structure Representation of Inorganic Salts and Mixtures of Gases: A...
Chemical Structure Representation of Inorganic Salts and Mixtures of Gases: A...NextMove Software
 
Advanced grammars for state-of-the-art named entity recognition (NER)
Advanced grammars for state-of-the-art named entity recognition (NER)Advanced grammars for state-of-the-art named entity recognition (NER)
Advanced grammars for state-of-the-art named entity recognition (NER)NextMove Software
 
Challenges in Chemical Information Exchange
Challenges in Chemical Information ExchangeChallenges in Chemical Information Exchange
Challenges in Chemical Information ExchangeNextMove Software
 


CINF 13: Pistachio - Search and Faceting of Large Reaction Databases

  • 1. pistachio: Search and Faceting of Large Reaction Databases. John Mayfield, Daniel Lowe, Roger Sayle. CINF 13, ACS Fall 2017, Washington, D.C.
  • 2. What do Synthetic Chemists Want from Their Reaction Systems? Data, Classification, Diagrams, Search
  • 3. What do Synthetic Chemists Want from Their Reaction Systems? Data, Classification, Diagrams, Search
  • 4. Infrastructure for liberating and processing reactions from Electronic Lab Notebooks (ELNs). Components: HazELNut, Filbert, NameRXN, Cobnut. Accelrys Pipeline Pilot (AstraZeneca, AbbVie & Hoffmann-La Roche); ChemAxon JChem Cartridge (GlaxoSmithKline & Novartis); Elsevier Reaxys (Hoffmann-La Roche, AstraZeneca, Merck). Perkin Elmer Informatics (formerly CambridgeSoft) eNotebook v9, v11 or v13, or Symyx ELN v5.x or v6.x; Oracle Server version 10, 11 or …; Microsoft Windows, Linux or Mac OS.
  • 5. To 7-chloro-4-oxo-4,5-dihydrofuro[2,3-d]pyridazine-2-carboxylic acid (Peakdale) (220 mg, 1.025 mmol) and (3,4-dimethoxyphenyl)boronic acid (187 mg, 1.025 mmol) in 1,4-dioxane (3 mL) and water (1.5 mL) was added sodium carbonate (435 mg, 4.10 mmol) and tetrakis(triphenylphosphine)palladium(0) (110 mg, 0.095 mmol). The reaction was heated in the microwave at 80° C. for 2 hours and at 100° C. for a further 2 hours. The solvent was removed and the residue was suspended in DMSO, filtered and purified by MDAP. Appropriate fractions were combined and the solvent removed to give 7-(3,4-dimethoxyphenyl)-4-oxo-4,5-dihydrofuro[2,3-d]pyridazine-2-carboxylic acid (25 mg, 7%) as a yellow solid. [0517], US 2016/16966 A1. Daniel M. Lowe. Extraction of chemical structures and reactions from the literature. Ph.D. Thesis, University of Cambridge, 2012
  • 6. Unstructured text to a structured reaction table (LeadMine + Chemical Tagger), applied to paragraph [0517] of US 2016/16966 A1, the text quoted on slide 5. Daniel M. Lowe. Extraction of chemical structures and reactions from the literature. Ph.D. Thesis, University of Cambridge, 2012
      Product Properties
        7-(3,4-dimethoxyphenyl)-4-oxo-4,5-dihydrofuro[2,3-d]pyridazine-2-carboxylic acid: 25 mg, 7% yield, yellow solid
      Reactant Properties
        7-chloro-4-oxo-4,5-dihydrofuro[2,3-d]pyridazine-2-carboxylic acid: 220 mg, 1.025 mmol
        (3,4-dimethoxyphenyl)boronic acid: 187 mg, 1.025 mmol
      Agent Properties
        1,4-dioxane: 3 mL
        water: 1.5 mL
        sodium carbonate: 435 mg, 4.10 mmol
        tetrakis(triphenylphosphine)palladium(0): 110 mg, 0.095 mmol
        DMSO
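The step from free text to a table row can be illustrated with a small sketch. This is not how LeadMine or Chemical Tagger work; it is a toy regular-expression pass, and the names `Component`, `parse_mention`, `QUANTITY_GROUP` and the unit list are all assumptions made for illustration. Roles are supplied by hand here, whereas assigning them automatically is the hard part of the real task.

```python
import re
from dataclasses import dataclass, field

# Assumed unit vocabulary: only the units that occur in paragraph [0517].
UNIT = r"mg|mmol|mL|%"
# A parenthesised, comma-separated run of "<number> <unit>" pairs, e.g. "(220 mg, 1.025 mmol)".
QUANTITY_GROUP = re.compile(rf"\(((?:\d+(?:\.\d+)?\s*(?:{UNIT})(?:,\s*)?)+)\)")
QUANTITY = re.compile(rf"(\d+(?:\.\d+)?)\s*({UNIT})")


@dataclass
class Component:
    """One row of the structured reaction table (hypothetical layout)."""
    name: str
    role: str
    quantities: dict = field(default_factory=dict)


def parse_mention(role, mention):
    """Split 'name (quantities)' into a Component; the role is given by the caller."""
    match = QUANTITY_GROUP.search(mention)
    if match is None:
        return Component(name=mention.strip(), role=role)
    quantities = {unit: float(value) for value, unit in QUANTITY.findall(match.group(1))}
    return Component(name=mention[:match.start()].strip(), role=role, quantities=quantities)


if __name__ == "__main__":
    mentions = [
        ("reactant", "(3,4-dimethoxyphenyl)boronic acid (187 mg, 1.025 mmol)"),
        ("agent", "1,4-dioxane (3 mL)"),
        ("agent", "sodium carbonate(435 mg, 4.10 mmol)"),
        ("agent", "DMSO"),
    ]
    for role, mention in mentions:
        print(parse_mention(role, mention))
```

Parentheses that belong to the chemical name, such as `(3,4-dimethoxyphenyl)` or `palladium(0)`, are skipped because they do not contain a number followed by one of the listed units.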
  • 7. Data impact. Public subset released in 2014 as CC-Zero. Pistachio expands the scope of the data and uses Atom-Atom Maps from NameRxn.
      Christos Nicolaou et al. The Proximal Lilly Collection: Mapping, Exploring and Exploiting Feasible Chemical Space. J. Chem. Inf. Model., 2016, 56 (7), pp 1253–1266
      Nadine Schneider et al. Big Data from Pharmaceutical Patents: A Computational Analysis of Medicinal Chemists’ Bread and Butter. J. Med. Chem., 2016, 59 (9), pp 4385–4402
      Nadine Schneider et al. Development of a Novel Fingerprint for Chemical Reactions and Its Application to Large-Scale Reaction Classification and Similarity. J. Chem. Inf. Model., 2015, 55 (1), pp 39–53
      Nadine Schneider et al. What’s What: The (Nearly) Definitive Guide to Reaction Role Assignment. J. Chem. Inf. Model., 2016, 56 (12), pp 2336–2346
      Connor Coley et al. Prediction of Organic Reaction Outcomes Using Machine Learning. ACS Cent. Sci., 2017, 3 (5), pp 434–443
  • 8. Sketch extraction with NextMove’s Praline. Example 26, US 09718816 B2 (Epizyme Inc., 1-phenoxy-3-(alkylamino)-propan-2-ol derivatives as CARM1 inhibitors and uses thereof, Aug. 1, 2017). [Figure: reaction scheme sketch labelled Step 1, Step 2, Step 3, Step 4, etc.] John May, et al. Sketchy Sketches: Hiding Chemistry in Plain Sight. Seventh Joint Sheffield Conference on Cheminformatics. 2016
  • 9. Total reactions over time. [Figure: cumulative reaction details, 0 to 3.5M, by year from 1997 to 2018, for EPO Applications, EPO Grants, USPTO Applications and USPTO Grants]
  • 10. What do Synthetic Chemists Want from Their Reaction Systems? Data, Classification, Diagrams, Search
  • 11. Reaction diagrams. Good reaction diagrams are essential in communicating synthetic chemistry. Layout can be stored or generated: • When extracting from text, layout must be generated • Generated diagrams can be unsatisfactory for display
  • 13. ChemAxon and BIOVIA diagrams generated from SMILES for US 2016/16966 A1 [0517]
  • 14. Diagram improvements. Typical workarounds: • Separately render molecules • Hide agents and list separately. What humans do: • Wrap products below • Abbreviate functional groups and agents • Orientate reactants to products and vice versa • Hide agents and list as text
  • 17. What do Synthetic Chemists Want from Their Reaction Systems? Data, Classification, Diagrams, Search
  • 18. NameRxn. Assigns names to 900+ reactions using transformations. Can guarantee perfect Atom-Atom Mapping: • Atom-Atom Mapping is an output not an input • MCS mappers struggle with rearrangements, e.g. 4.1.6 Cyclic Beckmann rearrangement
  • 19. concepts and rxno
      1 Heteroatom alkylation and arylation
        .7 O-substitution
          .1 Chan-Lam ether coupling
          .2 Diazomethane esterification
          .3 Ethyl esterification
          .4 Hydroxy to methoxy
          .5 Hydroxy to triflyloxy
          .6 Methyl esterification
          .n
      2 Acylation and related processes
        .6 O-acylation to ester
          .1 Ester Schotten-Baumann
          .2 Esterification (generic)
          .3 Fischer-Speier esterification
          .4 Baeyer-Villiger oxidation
          .5 Yamaguchi esterification
          .6 Hydroxy to imidazolecarbonyloxy
          .7 Imidazolecarbonyl to ester
          .8 Hydroxy to acetoxy
          .9 Steglich esterification
          .n
  • 20. concepts and rxno. The hierarchy from slide 19, annotated with RXNO concepts: Esterification (7), Chan-Lam coupling (3), Schotten-Baumann Reaction (9). RXNO: http://github.com/rsc-ontologies/rxno
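The dotted class codes lend themselves to simple prefix handling. The sketch below is only an illustration of that idea, using a handful of entries copied from the hierarchy above; the function names `ancestors` and `label_path` are made up here and are not part of NameRxn.

```python
# A few classes from the slide; a code joins the three levels with dots, e.g. "1.7.1".
CLASSES = {
    "1": "Heteroatom alkylation and arylation",
    "1.7": "O-substitution",
    "1.7.1": "Chan-Lam ether coupling",
    "1.7.6": "Methyl esterification",
    "2": "Acylation and related processes",
    "2.6": "O-acylation to ester",
    "2.6.3": "Fischer-Speier esterification",
    "2.6.9": "Steglich esterification",
}


def ancestors(code):
    """Return the code followed by each parent, most specific first."""
    parts = code.split(".")
    return [".".join(parts[:depth]) for depth in range(len(parts), 0, -1)]


def label_path(code):
    """Return the class names from the top-level category down to the given code."""
    return [CLASSES[c] for c in reversed(ancestors(code))]


if __name__ == "__main__":
    print(ancestors("2.6.9"))
    print(" > ".join(label_path("1.7.1")))
```

Because a parent is always a string prefix of its children, a query on `2.6` can match every O-acylation class without consulting a separate tree structure.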
  • 21. Result facets. Provide a summary over the key concepts of the results. Cut through the information deluge and refine the search. • Reaction Types (NextMove ontology tree) • Drug Targets (ChEMBL ontology tree) • Disease Targets (MeSH ontology tree) • Yields • Affiliation (NextMove ontology tree) • Publication Date, Documents, Authors
  • 22. Facet calculation. Resource expensive: O(n) in the size of the result set. • Client, server, or database? • Overhead copying and transferring data that is not needed • Calculate when requested or up-front? Custom cartridge: 2.9 seconds to summarise all 6.6 million rows on an Intel(R) Core(TM) i7-6900K CPU @ 3.20GHz
  • 23. What do Synthetic Chemists Want from Their Reaction Systems? Data, Classification, Diagrams, Search
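To make the O(n) cost concrete, here is a minimal sketch of facet counting as a single pass over a result set. It is not the custom cartridge described on the slide; the row layout, the field names `reaction_class` and `yield`, the 25% yield bins and the four example rows are all invented for illustration.

```python
from collections import Counter


def facet_counts(rows):
    """Count reaction-type and yield facets in one pass: O(n) in the number of rows."""
    reaction_types = Counter()
    yield_bins = Counter()
    for row in rows:
        # Roll each hit up the class tree: "1.7.1" also counts towards "1.7" and "1".
        parts = row["reaction_class"].split(".")
        for depth in range(1, len(parts) + 1):
            reaction_types[".".join(parts[:depth])] += 1
        # Bin yields into 25% wide buckets, keyed by the lower bound.
        yield_bins[row["yield"] // 25 * 25] += 1
    return reaction_types, yield_bins


if __name__ == "__main__":
    # Invented result rows, for illustration only.
    rows = [
        {"reaction_class": "1.7.1", "yield": 80},
        {"reaction_class": "1.7.6", "yield": 55},
        {"reaction_class": "2.6.3", "yield": 91},
        {"reaction_class": "2.6.9", "yield": 7},
    ]
    reaction_types, yield_bins = facet_counts(rows)
    print(reaction_types.most_common())
    print(sorted(yield_bins.items()))
```

Every row has to be visited, which is why where the loop runs (client, server or database) and how much data must be copied to reach it dominate the cost.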
  • 24. One entry point: Systematic Name, Trivial Name, Line Formula, SMILES, InChI, SMARTS, Reaction SMARTS, Reaction Type (NameRxn), Yield Range, Date Range, Affiliation, Author, Document, Source, Collection, Disease Target, Protein Target …and logical combinations thereof
  • 25. Suggestions: based on global frequency; based on context frequency
  • 26. Structure search technology. NextMove’s Arthor Technology: • Up to 100x faster than state-of-the-art • Combination of SMARTS compilation and efficient storage • Preliminary PostgreSQL integration. Benchmark: ~3.5K queries against ~7M structures (eMolecules 2014), all on the same hardware:
      36s        Arthor
      56m        BIOVIA Direct (Oracle)
      1h         Bingo (NoSQL)
      1h54m      Bingo (PostgreSQL)
      2h6m       Bingo (Oracle)
      2h41m      JChem (Oracle)
      5h9m       RDCart (PostgreSQL)
      13h54m     pgchem (PostgreSQL)
      1d1h52m    mychem (MySQL)
      3d1h13m    orchem (Oracle)
      John May and Roger Sayle, Substructure Search Face-off, May 2015
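The "up to 100x" figure can be checked against the benchmark times with a few lines of arithmetic. The sketch below only converts the durations quoted above to seconds and divides by the Arthor time; the helper name `to_seconds` is an assumption for this example.

```python
import re

UNIT_SECONDS = {"d": 86400, "h": 3600, "m": 60, "s": 1}

# Total times as quoted on the slide.
TIMINGS = {
    "Arthor": "36s",
    "BIOVIA Direct (Oracle)": "56m",
    "Bingo (NoSQL)": "1h",
    "Bingo (PostgreSQL)": "1h54m",
    "Bingo (Oracle)": "2h6m",
    "JChem (Oracle)": "2h41m",
    "RDCart (PostgreSQL)": "5h9m",
    "pgchem (PostgreSQL)": "13h54m",
    "mychem (MySQL)": "1d1h52m",
    "orchem (Oracle)": "3d1h13m",
}


def to_seconds(duration):
    """Convert a duration such as '1d1h52m' or '36s' to seconds."""
    return sum(int(n) * UNIT_SECONDS[u] for n, u in re.findall(r"(\d+)([dhms])", duration))


if __name__ == "__main__":
    baseline = to_seconds(TIMINGS["Arthor"])
    for name, duration in TIMINGS.items():
        ratio = to_seconds(duration) / baseline
        print(f"{name:26s} {duration:>8s} {ratio:10.1f}x")
```

On these numbers the two closest competitors come out at roughly 93x and 100x the Arthor time, and the slower systems are further behind still.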
  • 27. Refining structure search. Intention can be refined by qualifiers:
      Role: {structure} product
      Substructure: {structure} substructure; {structure} substructure product
      Make/Break: Synthesis of {structure}
      Combined with other terms: {structure} substructure product and yield of 80%
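A toy parser shows how trailing qualifiers could turn one line of text into a structured query. Pistachio's actual query grammar is not given in the slides, so everything here is an assumption: the names `StructureTerm` and `parse_query`, the `reactant` and `agent` roles (only `product` appears above), and the handling of a single `and yield of N%` clause. The "Synthesis of" form is deliberately left out.

```python
import re
from dataclasses import dataclass
from typing import Optional

# Only "product" appears on the slide; the other roles are assumed for illustration.
ROLES = {"product", "reactant", "agent"}


@dataclass
class StructureTerm:
    structure: str
    substructure: bool = False
    role: Optional[str] = None


def parse_structure_term(term):
    """Peel trailing qualifiers off a term such as 'c1ccccc1 substructure product'."""
    words = term.split()
    role = None
    substructure = False
    if words and words[-1].lower() in ROLES:
        role = words.pop().lower()
    if words and words[-1].lower() == "substructure":
        substructure = True
        words.pop()
    return StructureTerm(" ".join(words), substructure, role)


def parse_query(query):
    """Split '<structure term> and yield of N%' into a term and an optional yield."""
    structure_part, _, rest = query.partition(" and ")
    match = re.fullmatch(r"yield of (\d+)%", rest.strip())
    yield_percent = int(match.group(1)) if match else None
    return parse_structure_term(structure_part), yield_percent


if __name__ == "__main__":
    print(parse_query("c1ccccc1 substructure product and yield of 80%"))
    print(parse_query("c1ccccc1 product"))
```

Parsing qualifiers from the end of the term keeps the structure itself opaque, so the same approach works whether the structure is given as SMILES, a name or a formula.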
  • 30. Acknowledgements: Noel O’Boyle (NextMove Software), Egon Willighagen (CDK), James Davison, Matt Swain (Vernalis). What do Synthetic Chemists Want from Their Reaction Systems? Data, Classification, Diagrams, Search. pistachio: http://www.nextmovesoftware.com/pistachio.html Come find me around ACS for a demo! See also: CINF 90