Getting the Big Picture by Joining up the SAR dots
Large-scale integration of structure and bioactivity data

The 9th Annual Pharmaceutical IT Congress 2011

Sorel Muresan
AstraZeneca R&D Mölndal
DECS Computational Sciences
WO patents with the classification code C07D




 Query performed using the European Patent Office search interface

                                                         DECS | CompSci
Driver – explosion in SAR data

       • Chemical information landscape changing fast

       • Databases, journal articles, patents, internal docs


                      2006                                        2008




Southan, C.; Varkonyi, P.; Muresan, S., J. Cheminfo. 2009, 1:10
The Challenge – Information deluge




• Volume

• Complexity

• Unstructured content




Since 2006 >1M chemistry publications per year




  Number of articles (diamonds) and patents (open boxes) abstracted
  annually by Chemical Abstracts
  Bachrach, S. M., J. Cheminfo. 2009, 1:2
Number of structures per year from J Med Chem




  W. Patrick Walters; Jeremy Green; Jonathan R. Weiss; Mark A. Murcko;
  J. Med. Chem. Article ASAP
  DOI: 10.1021/jm200504p Copyright © 2011 American Chemical Society

SAR key entities and relationships




     Unstructured Data from Documents → Expert Extraction or Text Mining → Structured Entries in Relational Databases

Southan, C.; Boppana, K.; Jagarlapudi, S.; Muresan, S., J. Cheminfo. 2011, 3:14
Manually extracted SAR data (commercial)

• GOSTAR (GVKBIO Online Structure Activity Relationship Database) is a
  comprehensive database that captures explicit relationships between the three
  entities of publications, compounds and sequences.

• It includes 2.6 million compounds linked to 3,500 sequences with 12.5M SAR
  points extracted from 43,000 patents and 67,000 articles from 125 journals




SAR data (public)

• PubChem
  • The NCBI public informatics backbone for the NIH Molecular Libraries
    Initiative, focused on small molecules as systems biology probes and
    potential therapeutic agents. It holds 30.5 million compounds with
    85.6 million links; of these compounds, 1,654K have been tested in
    504K assays.


• ChEMBL
  • includes drugs, small molecules from the medicinal chemistry or
    biochemical literature and their targets. It contains 1,060,258 distinct
    compounds extracted by expert manual curation from 42,516
    publications with 5,479,146 activities, including SAR and ADMET
    values. This data is mapped to 8,603 targets.


Extracting chemical entities from text


Collaboration with IBM Research Almaden to apply
 text analytics technology to analyze intellectual
 property and scientific literature

 - 10 million full text patents
 - 11 million structures

 - 12% out of 46M parent structures in Chemistry Connect




Chemical Named Entity Recognition (NER)

                    7-CHLORO-1,3-DIHYDRO-1-METHYL-5-
                    PHENYL-2H-1,4-BENZODIAZEPIN-2-ONE


                                        Name-to-Structure
                                        software



                     CN1c2ccc(cc2C(=NCC1=O)c3ccccc3)Cl
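The name-to-structure step above relies on dedicated software that the slides do not name. As a much simpler illustration of the recognition step only, the sketch below rejoins a chemical name that has been wrapped across lines after a hyphen and looks it up in a dictionary. The dictionary `NAME_TO_SMILES` and both function names are hypothetical; the single entry is the name and SMILES pair shown on the slide.

```python
import re

# Hypothetical one-entry dictionary; the name/SMILES pair is the one on the slide.
NAME_TO_SMILES = {
    "7-CHLORO-1,3-DIHYDRO-1-METHYL-5-PHENYL-2H-1,4-BENZODIAZEPIN-2-ONE":
        "CN1c2ccc(cc2C(=NCC1=O)c3ccccc3)Cl",
}


def rejoin_wrapped_names(text):
    """Remove line breaks that directly follow a hyphen, so a wrapped name becomes one token."""
    return re.sub(r"-\s*\n\s*", "-", text)


def find_known_names(text):
    """Return (name, SMILES) pairs for every dictionary name found in the text."""
    cleaned = rejoin_wrapped_names(text).upper()
    return [(name, smiles) for name, smiles in NAME_TO_SMILES.items() if name in cleaned]
```

A plain dictionary lookup only finds names it already knows; real name-to-structure tools parse the nomenclature itself, which this sketch does not attempt.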




Extracting chemical entities from text


The biggest cause of missing compounds when extracting
 chemical entities from text is the presence of typographical
 errors: human errors, OCR failures, hyphenation and
 multiple line issues, etc.

• Automated spelling correction with CaffeineFix from
 NextMove Software

   • CaffeineFix significantly improves extraction rates (22%
     increase from D=0 to D=1)

   • Name-to-structure (n2s) programs are complementary (40% of the
     structures come from a single n2s program)
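To make the effect of the correction distance concrete, here is a minimal sketch that assumes D denotes the maximum edit distance allowed between an extracted term and a dictionary entry. It is a brute-force illustration in plain Python, not CaffeineFix's algorithm (the slides do not describe it); the function names and the tiny dictionary in the usage note are hypothetical.

```python
def edit_distance(a, b):
    """Levenshtein distance between two strings (insert, delete, substitute)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,         # deletion
                            curr[j - 1] + 1,     # insertion
                            prev[j - 1] + cost)) # substitution or match
        prev = curr
    return prev[-1]


def correct(term, dictionary, max_distance):
    """Return the closest dictionary entry within max_distance, or None."""
    best = None
    for candidate in dictionary:
        d = edit_distance(term.lower(), candidate.lower())
        if d <= max_distance and (best is None or d < best[1]):
            best = (candidate, d)
    return best[0] if best else None
```

With a dictionary containing "acetaminophen", the misspelling "acetominophen" is rejected at D=0 and recovered at D=1, which is the kind of gain the extraction-rate figure above refers to.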


Structure standardisation

                           “The big merge” requires:
                • A common set of chemistry and biology rules
                applied carefully & consistently across databases




Muresan, S.; Sitzmann, M.; Southan, C., Biocomputing and Drug Discovery, 2011
Chemistry Connect




Technical Overview - ETL

   • Data sources: text files, Oracle DB, web service
   • Extraction and transformation (chemistry): Python scripts; structure
     normalization and property calculation
   • Extraction and transformation (biological results): Pipeline Pilot
   • Loading: Oracle PL/SQL (external tables)
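The extract, transform, load flow above can be sketched in a few lines. This is an illustration under stated assumptions, not the production pipeline: `sqlite3` stands in for Oracle, the tab-separated input layout (`source`, `identifier`, `smiles`) is invented, and the transform step only drops empty and duplicate structures because real structure normalization needs a chemistry toolkit.

```python
import sqlite3


def extract(lines):
    """Parse tab-separated 'source, identifier, smiles' records, skipping malformed lines."""
    for line in lines:
        parts = line.rstrip("\n").split("\t")
        if len(parts) == 3:
            yield tuple(p.strip() for p in parts)


def transform(records):
    """Placeholder normalization: drop empty structures and per-source duplicates."""
    seen = set()
    for source, identifier, smiles in records:
        if not smiles or (source, smiles) in seen:
            continue
        seen.add((source, smiles))
        yield source, identifier, smiles


def load(conn, records):
    """Insert the transformed records into a single table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS structure (source TEXT, identifier TEXT, smiles TEXT)"
    )
    conn.executemany("INSERT INTO structure VALUES (?, ?, ?)", records)
    conn.commit()


def run_etl(lines, conn):
    """Run all three stages and return the number of rows now in the table."""
    load(conn, transform(extract(lines)))
    return conn.execute("SELECT COUNT(*) FROM structure").fetchone()[0]
```

Keeping the three stages as separate generators means each source can swap in its own extractor while sharing the same transform and load steps.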
Technical Overview - Application


   • Back end: Oracle 11g with Direct 7
   • Middle tier: WebLogic Server exposing REST (and SOAP) services
   • Clients: HTML, Java, .Net, PipelinePilot, Knime, Excel
Source content in Chemistry Connect
  Source                Structures        % unique       Cpd/Str       Syn/Str
  ChemSpider                18922316           50         1.07            1.8
  Reaxys                    15535377           59         1.12            2.0
  IBM patents               11038533           51         1.00            n/a
  PubChemBE                  4675643           n/a        1.03            n/a
  ACD                        4452644           73         1.01            1.3
  eMolecules                 4213813           19         1.01            n/a
  TRPharma                   3268613           n/a        1.03            n/a
  GOSTAR                     3128567           27         1.00            3.3
  ChEMBL                      940905           n/a        1.05             1.6
  TRIntegrity                 307685           27         1.00             1.3
  AZReagents                   78265           3.4        1.73             3.4
  TRPartnering                 17901           10         1.00            1.0
  ChEBI                        13191           n/a        1.31             5.2
  HMDB                          7789           53         1.00            13.4
  DrugBank                      6359           n/a        1.04            5.0
  TTD                           2663           4.9        1.27             n/a
  Bioprint                      2481           n/a        1.00             n/a


Muresan, S. et al, Drug Discovery Today 2011, in print
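The slides do not define how "% unique" is computed. One plausible reading, sketched below as an assumption, is the share of a source's structures that appear in no other source, with structures compared by some canonical key (for example a standardized structure identifier). The function name and the toy keys in the usage note are hypothetical.

```python
def percent_unique(sources):
    """Map each source name to the percentage of its keys found in no other source.

    `sources` maps a source name to a set of canonical structure keys.
    """
    result = {}
    for name, keys in sources.items():
        # Union of every other source's keys; empty when there is only one source.
        others = set().union(*(k for n, k in sources.items() if n != name))
        unique = keys - others
        result[name] = 100.0 * len(unique) / len(keys) if keys else 0.0
    return result
```

For three toy sources `{"A": {"k1", "k2", "k3", "k4"}, "B": {"k3", "k4"}, "C": {"k4", "k5"}}` this gives 50% for A, 0% for B and 50% for C. Exact-match comparisons of this kind only make sense after all sources have been standardized with the same rules.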
Finding a common language
Acetaminophen: >1000 synonyms..

[3H]Acetaminophen; 10066-90-7; 103-90-2; 1047-607-00; 1169-894-12; 16110-10-4;
222 AF; 222-AF; 3-(glutathion-S-yl)acetaminophen; 37519-14-5;
3-hydroxyacetaminophen; 4-(Acetylamino)phenol; 4-13-00-01091;
4-ACETAMIDOPHENOL; 4-Acetaminophenol; 4-ACETYLAMINOPHENOL;
4'-Hydroxyacetanilide; 4-HYDROXYACETANILIDE; 4-HYDROXYANILID KYSELINY OCTOVE;
4-hydroxyphenolacetamide; 644/4046; 644/7502; 64889-81-2; 659/9501;
77097-85-9; 840-416-00; 872-667-00; 878-022-04; 878-022-09; 878-022-14;
878-022-19; 882-720-04; 882-720-07; 882-720-10; 882-720-13; 882-720-16;
882-720-20; A F ANACIN; A PER; A.F. ANACIN; AAP; aa-sulfate; AA-sulphate;
Abenol; Abensanil; ABROL; ABROLET; AC112578; AC112579; Acamol; Accu-Tap;
Acenol; Acenol (pharmaceutical); Acephen; Acertol; Aceta; Aceta Elixir;
Aceta Tablets; Acetaco; Acetagesic; Acetalgin;
ACETAMIDE, N-(4-HYDROXYPHENYL)-; ACETAMIDE, N-(P-HYDROXYPHENYL)-;
Acetamidophenol; Acetaminofen; Acetaminophen;
Acetaminophen (4-hydroxyacetanilide); Acetaminophen glucuronide(55%);
acetaminophen sulfate; Acetaminophen sulfate(30%); acetaminophen sulphate;
Acetaminophen Uniserts; acetaminophene; Acetamol; ACETANILIDE, 4'-HYDROXY-;
Acetavance; Acetofen; ACETOMINOPHEN; Actamin; Actamin Extra; Actamin Super;
Actifed Plus; Actimol; Actimol Chewable Tablets;
Actimol Children's Suspension; Actimol Infants' Suspension;
Actimol Junior Strength Caplets; Actron; Afebrin; Afebryl; Aferadol; AG10223;
AG12029; AG124687; AG12800; AG12948; Amadil; Aminofen; Aminofen Max; Anacin;
Anacin-3; Anacin-3 Extra Strength; Anadin dla dzieci; Anaflon; Analter;
Anapap; Andox; Anelix; Anexsia; Anexsia 10/660; Anexsia 5/325;
Anexsia 7.5/325; Anexsia 7.5/650; Anhiba; Anoquan; Anti-Algos; Antidol;
Apacet; Apacet Capsules
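A synonym list like this mixes registry-style numbers with names, and only some of the numbers are CAS Registry Numbers. A small sketch of how a cross-mapping dictionary might tell them apart uses the CAS check digit: the final digit equals the weighted sum of the preceding digits (weights 1, 2, 3, ... counted from the right) modulo 10. The function name is hypothetical, and the slides do not say whether Chemistry Connect performs this check.

```python
import re

# CAS Registry Number layout: 2-7 digits, 2 digits, 1 check digit.
CAS_PATTERN = re.compile(r"^(\d{2,7})-(\d{2})-(\d)$")


def is_cas_number(term):
    """Return True when `term` has the CAS layout and a valid check digit."""
    match = CAS_PATTERN.match(term.strip())
    if not match:
        return False
    digits = match.group(1) + match.group(2)
    check = int(match.group(3))
    # Weight the digits 1, 2, 3, ... starting from the rightmost one.
    total = sum(weight * int(d) for weight, d in enumerate(reversed(digits), start=1))
    return total % 10 == check
```

Applied to the list, `103-90-2` passes the check, while `882-720-13` and `222-AF` fail on layout alone and would be treated as other identifiers or names.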
Word of the Day: Crowdsourcing




Exact match source comparisons




    Figure groups: sources that include known drugs; predominantly patent-derived compounds
Chemistry Connect - Synonyms Searches




Chemistry Connect - Structure Searches




Chemistry Connect - Patent Searches




Chemistry Connect - Test & Result Searches




Different Questions, Common Language

Question
• What compounds have been described in document D?
• What compounds bind target X with an affinity greater than A?
• What targets does compound C bind with an affinity greater than A?
• What compounds have AZ patented on target X?
• What is the structure for this development compound?
• How can I quickly get the SAR data from this patent?

Concepts
Compound, Target, Test, Pathway, Disease, MoA, Bioprocess, Drug, Study,
Species, BMO (AE), Institute, People
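Once compounds, targets, documents and test results share one schema, several of these questions reduce to short joins. The sketch below is illustrative only: the three-table schema, the column names and every row are invented, `sqlite3` stands in for the Oracle back end, and affinity is assumed to be stored as a pActivity-style value where larger means more potent.

```python
import sqlite3


def build_demo_db():
    """Create an in-memory database with invented compounds, targets and results."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE compound (id INTEGER PRIMARY KEY, name TEXT);
        CREATE TABLE target   (id INTEGER PRIMARY KEY, name TEXT);
        CREATE TABLE result   (compound_id INTEGER REFERENCES compound(id),
                               target_id   INTEGER REFERENCES target(id),
                               document    TEXT,
                               p_activity  REAL);
    """)
    conn.executemany("INSERT INTO compound VALUES (?, ?)",
                     [(1, "cpd-1"), (2, "cpd-2"), (3, "cpd-3")])
    conn.executemany("INSERT INTO target VALUES (?, ?)",
                     [(1, "target-X"), (2, "target-Y")])
    conn.executemany("INSERT INTO result VALUES (?, ?, ?, ?)",
                     [(1, 1, "doc-A", 7.5), (2, 1, "doc-A", 5.0),
                      (3, 1, "doc-B", 8.2), (1, 2, "doc-B", 6.1)])
    return conn


def compounds_for_target(conn, target, min_p_activity):
    """What compounds bind target X with an affinity greater than A?"""
    rows = conn.execute(
        """SELECT DISTINCT c.name FROM result r
           JOIN compound c ON c.id = r.compound_id
           JOIN target t ON t.id = r.target_id
           WHERE t.name = ? AND r.p_activity > ? ORDER BY c.name""",
        (target, min_p_activity))
    return [name for (name,) in rows]


def targets_for_compound(conn, compound, min_p_activity):
    """What targets does compound C bind with an affinity greater than A?"""
    rows = conn.execute(
        """SELECT DISTINCT t.name FROM result r
           JOIN compound c ON c.id = r.compound_id
           JOIN target t ON t.id = r.target_id
           WHERE c.name = ? AND r.p_activity > ? ORDER BY t.name""",
        (compound, min_p_activity))
    return [name for (name,) in rows]


def compounds_in_document(conn, document):
    """What compounds have been described in document D?"""
    rows = conn.execute(
        """SELECT DISTINCT c.name FROM result r
           JOIN compound c ON c.id = r.compound_id
           WHERE r.document = ? ORDER BY c.name""",
        (document,))
    return [name for (name,) in rows]
```

The same three tables answer all three questions; only the filter column changes, which is the practical payoff of agreeing on a common set of concepts.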
Take-home messages

• Chemistry Connect is enabling AZ to intensify its exploitation of
  synergies between internal and external SAR estate and to shorten
  the time needed for hypothesis generation during
  design-make-test-analyse (DMTA) cycles


• Our Chemical Dictionary of 120 million chemical terms has become a
  crucial cross-mapping resource between chemistry and the scientific
  literature


• We cannot wave a magic wand over data quality, provenance issues,
  drug name space, and the inherent challenges of chemistry
  representation, but Chemistry Connect gives us a unique overview and
  amelioration options for each source


A Democracy of Ideas (Acknowledgements)


• Plamen Petrov           • Niklas Blomberg
• Chris Southan           • Kay Brickmann
• Paul Xie                • Ola Engkvist
• Peter Varkonyi          • Yidong Yang
• Thierry Kogej           • Hongming Chen
• Christian Tyrchan       • and many others…
• Magnus Kjellberg
• Håkan Nilsson
• Mats Ericsson
• Jonas Ekengren
• Marcus Gelderman
• Ithipol Suriyawongkul




Thank you!




 

Mais de Sorel Muresan (10)

Alcoguard® H5941 – The sustainable bio-polymer
Alcoguard® H5941 – The sustainable bio-polymerAlcoguard® H5941 – The sustainable bio-polymer
Alcoguard® H5941 – The sustainable bio-polymer
 
Multifunctional Surfactants - Essential ingredients for efficient cleaning pr...
Multifunctional Surfactants - Essential ingredients for efficient cleaning pr...Multifunctional Surfactants - Essential ingredients for efficient cleaning pr...
Multifunctional Surfactants - Essential ingredients for efficient cleaning pr...
 
AkzoNobel NRE HH Conference Poland 2016-04-26 - FINAL
AkzoNobel NRE HH Conference Poland 2016-04-26 - FINALAkzoNobel NRE HH Conference Poland 2016-04-26 - FINAL
AkzoNobel NRE HH Conference Poland 2016-04-26 - FINAL
 
Narrow Range Ethoxylates - Highly targeted performance for more effective cle...
Narrow Range Ethoxylates - Highly targeted performance for more effective cle...Narrow Range Ethoxylates - Highly targeted performance for more effective cle...
Narrow Range Ethoxylates - Highly targeted performance for more effective cle...
 
Berol® SurfBoost AD15 - a sustainable solution for cleaning
Berol® SurfBoost AD15 - a sustainable solution for cleaningBerol® SurfBoost AD15 - a sustainable solution for cleaning
Berol® SurfBoost AD15 - a sustainable solution for cleaning
 
Multifunctional hydrotropes
Multifunctional hydrotropesMultifunctional hydrotropes
Multifunctional hydrotropes
 
Thickening with cationic surfactants
Thickening with cationic surfactantsThickening with cationic surfactants
Thickening with cationic surfactants
 
Challenges with CLP and solutions offered by AkzoNobel
Challenges with CLP and solutions offered by AkzoNobelChallenges with CLP and solutions offered by AkzoNobel
Challenges with CLP and solutions offered by AkzoNobel
 
Berol® ENV226 Plus - a formulator’s delight
Berol® ENV226 Plus - a formulator’s delightBerol® ENV226 Plus - a formulator’s delight
Berol® ENV226 Plus - a formulator’s delight
 
Chemistry Connect
Chemistry ConnectChemistry Connect
Chemistry Connect
 

Último

Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)Mark Simos
 
Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?Mattias Andersson
 
Commit 2024 - Secret Management made easy
Commit 2024 - Secret Management made easyCommit 2024 - Secret Management made easy
Commit 2024 - Secret Management made easyAlfredo García Lavilla
 
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptxThe Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptxLoriGlavin3
 
The State of Passkeys with FIDO Alliance.pptx
The State of Passkeys with FIDO Alliance.pptxThe State of Passkeys with FIDO Alliance.pptx
The State of Passkeys with FIDO Alliance.pptxLoriGlavin3
 
Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024BookNet Canada
 
What is DBT - The Ultimate Data Build Tool.pdf
What is DBT - The Ultimate Data Build Tool.pdfWhat is DBT - The Ultimate Data Build Tool.pdf
What is DBT - The Ultimate Data Build Tool.pdfMounikaPolabathina
 
Hyperautomation and AI/ML: A Strategy for Digital Transformation Success.pdf
Hyperautomation and AI/ML: A Strategy for Digital Transformation Success.pdfHyperautomation and AI/ML: A Strategy for Digital Transformation Success.pdf
Hyperautomation and AI/ML: A Strategy for Digital Transformation Success.pdfPrecisely
 
Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 365Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 3652toLead Limited
 
"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii Soldatenko"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii SoldatenkoFwdays
 
Take control of your SAP testing with UiPath Test Suite
Take control of your SAP testing with UiPath Test SuiteTake control of your SAP testing with UiPath Test Suite
Take control of your SAP testing with UiPath Test SuiteDianaGray10
 
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptx
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptxPasskey Providers and Enabling Portability: FIDO Paris Seminar.pptx
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptxLoriGlavin3
 
Unraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdfUnraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdfAlex Barbosa Coqueiro
 
TrustArc Webinar - How to Build Consumer Trust Through Data Privacy
TrustArc Webinar - How to Build Consumer Trust Through Data PrivacyTrustArc Webinar - How to Build Consumer Trust Through Data Privacy
TrustArc Webinar - How to Build Consumer Trust Through Data PrivacyTrustArc
 
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024BookNet Canada
 
unit 4 immunoblotting technique complete.pptx
unit 4 immunoblotting technique complete.pptxunit 4 immunoblotting technique complete.pptx
unit 4 immunoblotting technique complete.pptxBkGupta21
 
Unleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding ClubUnleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding ClubKalema Edgar
 
Dev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio WebDev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio WebUiPathCommunity
 
TeamStation AI System Report LATAM IT Salaries 2024
TeamStation AI System Report LATAM IT Salaries 2024TeamStation AI System Report LATAM IT Salaries 2024
TeamStation AI System Report LATAM IT Salaries 2024Lonnie McRorey
 
Scanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL CertsScanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL CertsRizwan Syed
 

Último (20)

Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
 
Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?
 
Commit 2024 - Secret Management made easy
Commit 2024 - Secret Management made easyCommit 2024 - Secret Management made easy
Commit 2024 - Secret Management made easy
 
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptxThe Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
 
The State of Passkeys with FIDO Alliance.pptx
The State of Passkeys with FIDO Alliance.pptxThe State of Passkeys with FIDO Alliance.pptx
The State of Passkeys with FIDO Alliance.pptx
 
Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
 
What is DBT - The Ultimate Data Build Tool.pdf
What is DBT - The Ultimate Data Build Tool.pdfWhat is DBT - The Ultimate Data Build Tool.pdf
What is DBT - The Ultimate Data Build Tool.pdf
 
Hyperautomation and AI/ML: A Strategy for Digital Transformation Success.pdf
Hyperautomation and AI/ML: A Strategy for Digital Transformation Success.pdfHyperautomation and AI/ML: A Strategy for Digital Transformation Success.pdf
Hyperautomation and AI/ML: A Strategy for Digital Transformation Success.pdf
 
Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 365Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 365
 
"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii Soldatenko"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii Soldatenko
 
Take control of your SAP testing with UiPath Test Suite
Take control of your SAP testing with UiPath Test SuiteTake control of your SAP testing with UiPath Test Suite
Take control of your SAP testing with UiPath Test Suite
 
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptx
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptxPasskey Providers and Enabling Portability: FIDO Paris Seminar.pptx
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptx
 
Unraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdfUnraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdf
 
TrustArc Webinar - How to Build Consumer Trust Through Data Privacy
TrustArc Webinar - How to Build Consumer Trust Through Data PrivacyTrustArc Webinar - How to Build Consumer Trust Through Data Privacy
TrustArc Webinar - How to Build Consumer Trust Through Data Privacy
 
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
 
unit 4 immunoblotting technique complete.pptx
unit 4 immunoblotting technique complete.pptxunit 4 immunoblotting technique complete.pptx
unit 4 immunoblotting technique complete.pptx
 
Unleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding ClubUnleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding Club
 
Dev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio WebDev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio Web
 
TeamStation AI System Report LATAM IT Salaries 2024
TeamStation AI System Report LATAM IT Salaries 2024TeamStation AI System Report LATAM IT Salaries 2024
TeamStation AI System Report LATAM IT Salaries 2024
 
Scanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL CertsScanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL Certs
 

Getting the Big Picture by Joining up the SAR dots

  • 1. Getting the Big Picture by Joining up the SAR dots: Large-scale integration of structure and bioactivity data. The 9th Annual Pharmaceutical IT Congress 2011. Sorel Muresan, AstraZeneca R&D Mölndal, DECS Computational Sciences
  • 2. WO patents with the classification code C07D Query performed using the European Patent Office search interface DECS | CompSci
  • 3. Driver – explosion in SAR data • Chemical information landscape changing fast • Databases, journal articles, patents, internal docs [Figure: panels labelled 2006 and 2008] Southan, C.; Varkonyi, P.; Muresan, S., J. Cheminfo. 2009, 1:10
  • 4. The Challenge – Information deluge • Volume • Complexity • Unstructured content
  • 5. Since 2006 >1M chemistry publications per year. [Figure: number of articles (diamonds) and patents (open boxes) abstracted annually by Chemical Abstracts] Bachrach, J. Cheminformatics 2009, 1:2
  • 6. Number of structures per year from J Med Chem. [Figure] W. Patrick Walters; Jeremy Green; Jonathan R. Weiss; Mark A. Murcko; J. Med. Chem. Article ASAP, DOI: 10.1021/jm200504p. Copyright © 2011 American Chemical Society
  • 7. SAR key entities and relationships: Unstructured Data from Documents → Expert Extraction or Text Mining → Structured Entries in Relational Databases. Southan, C.; Boppana, K.; Jagarlapudi, S.; Muresan, S. J. Cheminfo. 2011, 3:14
  • 8. Manually extracted SAR data (commercial) • GOSTAR (GVKBIO Online Structure Activity Relationship Database) is a comprehensive database that captures explicit relationships between the three entities of publications, compounds and sequences. • It includes 2.6 million compounds linked to 3,500 sequences, with 12.5M SAR points extracted from 43,000 patents and 67,000 articles from 125 journals.
  • 9. SAR data (public) • PubChem: the NCBI public informatics backbone for the NIH Molecular Libraries Initiative, focused on small molecules as systems biology probes and potential therapeutic agents. It holds 30.5 million compounds with 85.6 million links; of these compounds, 1654K have been tested in 504K assays. • ChEMBL: includes drugs and small molecules from the medicinal chemistry or biochemical literature, together with their targets. It contains 1,060,258 distinct compounds extracted by expert manual curation from 42,516 publications, with 5,479,146 activities including SAR and ADMET values. These data are mapped to 8,603 targets.
  • 10. Extracting chemical entities from text: Collaboration with IBM Research Almaden to apply text analytics technology to analyze intellectual property and scientific literature • 10 million full text patents • 11 million structures • 12% out of 46M parent structures in Chemistry Connect
  • 11. Chemical Named Entity Recognition (NER): 7-CHLORO-1,3-DIHYDRO-1-METHYL-5-PHENYL-2H-1,4-BENZODIAZEPIN-2-ONE → Name-to-Structure software → CN1c2ccc(cc2C(=NCC1=O)c3ccccc3)Cl
  • 12. Extracting chemical entities from text: The biggest cause of missing compounds when extracting chemical entities from text is typographical errors: human errors, OCR failures, hyphenation and multiple-line issues, etc. • Automated spelling correction with CaffeineFix from NextMove Software • CaffeineFix significantly improves extraction rates (22% increase from D=0 to D=1) • Name-to-structure (n2s) tools are complementary (40% of the structures come from single n2s contributions)
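
The D=0 versus D=1 comparison on slide 12 can be illustrated with a small sketch. It assumes that D denotes the number of character edits tolerated when matching a token against a dictionary; this is a plain Levenshtein lookup written for illustration, not CaffeineFix's actual algorithm or API. The dictionary entries and function names are hypothetical.

```python
from typing import Iterable, Optional


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance: minimum insertions, deletions and substitutions."""
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(previous[j] + 1,          # delete from a
                               current[j - 1] + 1,       # insert into a
                               previous[j - 1] + cost))  # substitute
        previous = current
    return previous[-1]


def correct_term(term: str, dictionary: Iterable[str],
                 max_distance: int) -> Optional[str]:
    """Return the closest dictionary entry within max_distance edits, else None."""
    best, best_d = None, max_distance + 1
    for entry in dictionary:
        d = edit_distance(term.lower(), entry.lower())
        if d < best_d:
            best, best_d = entry, d
    return best


# Hypothetical mini-dictionary of chemical name fragments (assumption).
DICTIONARY = ["benzodiazepin", "chloro", "methyl", "phenyl"]

if __name__ == "__main__":
    # "benzodiazepln" simulates an OCR error ("i" read as "l").
    print(correct_term("benzodiazepln", DICTIONARY, max_distance=0))  # no match
    print(correct_term("benzodiazepln", DICTIONARY, max_distance=1))  # recovered
```

With exact matching the misspelled fragment is lost, while allowing one edit recovers it, which is the kind of gain the slide reports when moving from D=0 to D=1.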
  • 13. Structure standardisation: “The big merge” requires: • A common set of chemistry and biology rules applied carefully & consistently across databases. Muresan, S.; Sitzmann, M.; Southan, C., Biocomputing and Drug Discovery, 2011
  • 14. Chemistry Connect
  • 15. Technical Overview - ETL [Diagram] Stages: Data Sources, Extraction, Transformation, Loading. Labels: Text Files; Python Scripts; Structure Normalization; Property calc; (chemistry); Oracle PL/SQL; Oracle DB (ext tables); Pipeline Pilot; (biological results); Web Service
  • 16. Technical Overview - Application [Diagram] Labels: HTML; Java; Oracle 11g; WebLogic Server; Direct 7; REST (and SOAP) services; .Net; PipelinePilot; Knime; Excel
  • 17. Source content in Chemistry Connect
        Source         Structures   % unique   Cpd/Str   Syn/Str
        ChemSpider       18922316   50         1.07      1.8
        Reaxys           15535377   59         1.12      2.0
        IBM patents      11038533   51         1.00      n/a
        PubChemBE         4675643   n/a        1.03      n/a
        ACD               4452644   73         1.01      1.3
        eMolecules        4213813   19         1.01      n/a
        TRPharma          3268613   n/a        1.03      n/a
        GOSTAR            3128567   27         1.00      3.3
        ChEMBL             940905   n/a        1.05      1.6
        TRIntegrity        307685   27         1.00      1.3
        AZReagents          78265   3.4        1.73      3.4
        TRPartnering        17901   10         1.00      1.0
        ChEBI               13191   n/a        1.31      5.2
        HMDB                 7789   53         1.00      13.4
        DrugBank             6359   n/a        1.04      5.0
        TTD                  2663   4.9        1.27      n/a
        Bioprint             2481   n/a        1.00      n/a
      Muresan, S. et al, Drug Discovery Today 2011, in print
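
A per-source figure like the "% unique" column could be derived along the following lines. This is a minimal sketch that assumes "% unique" means the share of a source's standardised structures that occur in no other source; the slide itself does not define the column. The source names and structure keys below are made up, and each key stands in for whatever standardised identifier is used for exact matching.

```python
from collections import Counter
from typing import Dict, Set


def percent_unique(sources: Dict[str, Set[str]]) -> Dict[str, float]:
    """For each source, the percentage of its keys found in no other source."""
    # Each source holds a set, so a key is counted once per source it appears in.
    counts = Counter(key for keys in sources.values() for key in keys)
    result = {}
    for name, keys in sources.items():
        if not keys:
            result[name] = 0.0
            continue
        unique = sum(1 for key in keys if counts[key] == 1)
        result[name] = 100.0 * unique / len(keys)
    return result


# Hypothetical sources keyed by standardised structure identifiers (assumption).
SOURCES = {
    "source_a": {"K1", "K2", "K3", "K4"},
    "source_b": {"K3", "K4"},
    "source_c": {"K4", "K5"},
}

if __name__ == "__main__":
    for name, pct in percent_unique(SOURCES).items():
        print(f"{name}: {pct:.1f}% unique")
```

The result depends entirely on the keys being comparable across sources, which is why consistent structure standardisation has to come before any exact-match comparison.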
  • 18. Finding a common language [Figure: cloud of names, brand names and identifiers for acetaminophen, e.g. 4-ACETAMIDOPHENOL, 4'-Hydroxyacetanilide, 4-(Acetylamino)phenol, Acamol, Actamin, Anacin, 103-90-2] Acetaminophen: >1000 synonyms
  • 19. Word of the Day : Crowdsourcing
  • 20. Exact match source comparisons [Figure labels: sources that include known drugs; predominantly patent-derived compounds]
  • 21. Chemistry Connect - Synonyms Searches
  • 22. Chemistry Connect - Structure Searches
  • 23. Chemistry Connect - Patent Searches
  • 24. Chemistry Connect - Test & Result Searches
  • 25. Different Questions, Common Language. Questions: • What compounds have been described in document D? • What compounds bind target X with an affinity greater than A? • What targets does compound C bind with an affinity greater than A? • What compounds have AZ patented on target X? • What is the structure for this development compound? • How can I quickly get the SAR data from this patent? Concepts [diagram labels]: Target, Pathway, Institute, People, Disease, Compound, Bioprocess, MoA, Test, Study, Drug, Species, BMO (AE)
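
Two of the questions on slide 25 are mirror images of the same compound-target-activity relationship, which a small sketch can make concrete. The schema, table names, the `pki` column and the sample rows below are all hypothetical and use an in-memory SQLite database; they are not the Chemistry Connect data model.

```python
import sqlite3

# Hypothetical three-table schema: compounds, targets and measured affinities.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE compound (compound_id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE target   (target_id   INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE activity (
    compound_id INTEGER REFERENCES compound(compound_id),
    target_id   INTEGER REFERENCES target(target_id),
    pki         REAL  -- affinity on a negative-log scale: higher binds tighter
);
INSERT INTO compound VALUES (1, 'cpd_A'), (2, 'cpd_B'), (3, 'cpd_C');
INSERT INTO target   VALUES (1, 'target_X'), (2, 'target_Y');
INSERT INTO activity VALUES (1, 1, 8.2), (2, 1, 5.9), (3, 1, 7.1), (1, 2, 6.5);
""")


def compounds_binding(conn, target_name, min_pki):
    """What compounds bind target X with an affinity greater than A?"""
    return conn.execute(
        """
        SELECT c.name, a.pki
        FROM activity a
        JOIN compound c ON c.compound_id = a.compound_id
        JOIN target   t ON t.target_id   = a.target_id
        WHERE t.name = ? AND a.pki > ?
        ORDER BY a.pki DESC
        """,
        (target_name, min_pki),
    ).fetchall()


def targets_bound(conn, compound_name, min_pki):
    """What targets does compound C bind with an affinity greater than A?"""
    return conn.execute(
        """
        SELECT t.name, a.pki
        FROM activity a
        JOIN compound c ON c.compound_id = a.compound_id
        JOIN target   t ON t.target_id   = a.target_id
        WHERE c.name = ? AND a.pki > ?
        ORDER BY a.pki DESC
        """,
        (compound_name, min_pki),
    ).fetchall()


if __name__ == "__main__":
    print(compounds_binding(conn, "target_X", 7.0))
    print(targets_bound(conn, "cpd_A", 6.0))
```

Both functions share the same join and differ only in which entity is fixed, which is one way to read the slide's point that different questions map onto a common set of concepts.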
  • 26. Take-home messages • Chemistry Connect is enabling AZ to intensify its exploitation of synergies between internal and external SAR estate and to shorten the time between hypothesis generation during DMTA cycles • Our Chemical Dictionary of 120 million chemical terms has become a crucial cross-mapping resource between chemistry and the scientific literature • We cannot wave a magic wand over data quality, provenance issues, drug name space, and the inherent challenges of chemistry representation, but Chemistry Connect gives us a unique overview and amelioration options for each source
  • 27. A Democracy of Ideas (Acknowledgements) • Plamen Petrov • Niklas Blomberg • Chris Southan • Kay Brickmann • Paul Xie • Ola Engkvist • Peter Varkonyi • Yidong Yang • Thierry Kogej • Hongming Chen • Christian Tyrchan • Magnus Kjellberg • Håkan Nilsson • Mats Ericsson • Jonas Ekengren • Marcus Gelderman • Ithipol Suriyawongkul • and many others…
  • 28. Thank you!