SlideShare uma empresa Scribd logo
1 de 37
Baixar para ler offline
Developing chemoinformatics models: One can’t
embrace unembraceable




Igor V. Tetko
Helmholtz Zentrum München - German Research Center for Environmental Health (GmbH)
Institute of Bioinformatics & Systems Biology


Kyiv, 10 August 2009, Summer School
Representation of Molecules
Can be defined with calculated
   properties (logP, quantum-
   chemical parameters, etc.)             HO               "12.3%
                                                           $     '
                                                           $ 4.6 '
Can be defined with a set of                               $ M '
   structural descriptors                                  $
                                                            13.2'
                                                                 '
                                                           $
   (topological 2D, 3D, etc.).                     N
                                                           $10.1'
                                                           #     &

Goal is to correlate descriptors
  with some properties.                                !
                                     HO

One of these sets of descriptors                           "13.7%
                                                           $     '
  could be used for determine an                           $ 4.8 '
  applicability domain of a model.                         $ M '
                                               N           $     '
                                                           $15.8'
                                                           $12.0'
                                                           #     &



                                                       !
Distance to model:
"One can not embrace the unembraceable.”

Possible: 1060 - 10100 molecules theoretically exist
( > 1080 atoms in the Universe)

Achievable: 1020 - 1024 can be synthesized now
by companies (weight of the Moon is ca 1023 kg)

Available: 2*107 molecules are on the market

Measured: 102 - 104 molecules with data
                                                             Kozma Prutkov
Problem: To predict these properties of just molecules
on the market we must extrapolate data from one to
1,000 - 100,000 molecules!                 OH      O




                                                   O     N




 There is a need for methods
 which can estimate
 the accuracy of predictions!
Models can fail due to chemical diversity
   of training & test sets (i.e. outside of applicability
   domain)


          Training set data used
          to develop a model

Our model given
the training set




                                           New data to be estimated



                                   Correct model
Declining R&D productivity in the
pharmaceutical industry




                                    Approved medicine




  http://www.frost.com/prod/servlet/market-insight-top.pag?docid=128394740
Benchmarking data: logP prediction

Public dataset:                                   Industrial datasets:
N=266 molecules1                                  N=95809 (Pfizer Inc.)2
 • N=233 Star set (supported with                 N=882 (Nycomed GmbH)3
   experimental values from CLOGP
   v5.0 program)
                                                  2logP
                                                      and logD (pH 7.4)
 • N=43 Non-Star set (no
   experimental logP values in                    measurements
   CLOGP v5.0)                                    3logP   measurements only


1Provided by A. Avdeef, “Absorption
and drug development. Solubility,
permeability and charge state”, ed.
Hoboken, NJ: Wiley–Interscience,
2003.




                            Mannhold, R. et al, J. Pharm. Sci., 2009, 98(3), 861-893.
Performance of algorithms for the public dataset
                                                         Star set (N = 223)                  Non-Star set (N = 43)
                               Method                            % within error range                 % within error range
                                                 RMSE    rank    <0.5     0.5-1    >1   RMSE rank <0.5         0.5-1    >1
                               AB/LogP            0.41     I     84       12      4     1.00      I     42      23     35
                               S+logP             0.45     I     76       22      3     0.87      I     40      35     26
                               ACD/logP           0.50     I     75       17      7     1.00      I     44      33     23
                               Consensus log P    0.50     I     74       18      8     0.80      I     47      28     26
                               CLOGP              0.52    II     74       20      6     0.91      I     47      28     26
                               VLOGP OPS          0.52    II     64       21      7     1.07      I     33      28     26
AAM = average logP used        ALOGPS             0.53    II     71       23      6     0.82      I     42      30     28
as predicted value for all     MiLogP             0.57    II     69       22      9     0.86      I     49      30     21
                               XLOGP              0.62    II     60       30      10    0.89      I     47      23     30
molecules R2=0                 KowWIN             0.64    II     68       21      11    1.05      I     40      30     30
                               CSlogP             0.65    II     66       22      12    0.93      I     58      19     23
Bootstrap test:                ALOGP (Dragon)     0.69    II     60       25      16    0.92      I     28      40     33
• rank I - similar to “best    MolLogP            0.69    II     61       25      14    0.93      I     40      35     26
                               ALOGP98            0.70    II     61       26      13    1.00      I     30      37     33
model”                         OsirisP            0.71    II     59       26      16    0.94      I     42      26     33
• rank II -- better than AAM   VLOGP              0.72    II     65       22      14    1.13      I     40      28     33
• rank III - similar to AAM    TLOGP              0.74    II     67       16      13    1.12      I     30      37     30
                               ABSOLV             0.75    II     53       30      17    1.02      I     49      28     23
                               QikProp            0.77    II     53       30      17    1.24     II     40      26     35
                               QuantlogP          0.80    II     47       30      22    1.17     II     35      26     40
                               SLIPPER-2002       0.80    II     62       22      15    1.16     II     35      23     42
                               COSMOFrag          0.84    II     48       26      19    1.23     II     26      40     33
1Provided  by A. Avdeef,       XLOGP2             0.87    II     57       22      20    1.16     II     35      23     42
Absorption and drug            QLOGP              0.96    II     48       26      25    1.42     II     21      26     53
                               VEGA               1.04    II     47       27      26    1.24     II     28      30     42
development. Solubility,
                               CLIP               1.05    II     41       25      30    1.54     III    33      9      49
permeability and charge        LSER               1.07    II     44       26      30    1.26     II     35      16     49
state, ed. Hoboken, NJ:        MLOGP (Sim+)       1.26    II     38       30      33    1.56     III    26      28     47
Wiley–Interscience,            NC+NHET            1.35    III    29       26      45    1.71     III    19      16     65
                               SPARC              1.36    III    45       22      32    1.70     III    28      21     49
2003.                          MLOGP(Dragon)      1.52    III    39       26      35    2.45     III    23      30     47
                               LSER UFZ           1.60    III    36       23      41    2.79     III    19      12     67
                               AAM                1.62    III    22       24      53    2.10     III    19      28     53
                               VLOGP-NOPS         1.76    III    1         1      7     1.39     III    7       0       7
                               HINT               1.80    III    34       22      44    2.72     III    30      5      65
                               GBLOGP             1.98    III    32       26      42    1.75     III    19      16     65
Performance of algorithms for in-house datasets
 Benchmarking of logP                                                                 Pfizer set (N = 95 809)                      Nycomed set (N = 882)

 methods for in-house                                   Method
                                                                       RMSE     Failed1   rank   % in error range
                                                                                                 <0.5   0.5-
                                                                                                         1
                                                                                                               >1
                                                                                                                     RMSE,
                                                                                                                    zwitterions
                                                                                                                    excluded2
                                                                                                                                  RMSE   rank   % in error range
                                                                                                                                                <0.5   0.5-
                                                                                                                                                        1
                                                                                                                                                              >1


 data of Pfizer & Nycomed                        Consensus log P       0.95                I     48 29 24             0.94        0.58    I     61 32         7
                                                 ALOGPS                1.02                I     41 30 29             1.01        0.68    I     51 34 15
 •   Large number of methods could not           S+logP                1.02                I     44 29 27             1.00        0.69    I     58 27 15
     perform better than AAM model               NC+NHET               1.04                II    38 30 32             1.04        0.88   III    42 32 26
                                                 MLOGP(S+)             1.05                II    40 29 31             1.05        1.17   III    32 26 41
 •   Best results are calculated using
                                                 XLOGP3                1.07                II    43 28 29             1.06        0.65    I     55 34 12
     Consensus logP model
                                                 MiLogP                1.10      27        II    41 28 30             1.09        0.67    I     60 26 14
 • log P = 1.46 + 0.11 (NC - NHET)               AB/LogP               1.12      24        II    39 29 33             1.11        0.88   III    45 28 27
   N=95 809, RMSE=1.04, R2=0.2                   ALOGP                 1.12                II    39 29 32             1.12        0.72    II    52 33 15
                                                 ALOGP98               1.12                II    40 28 32             1.10        0.73    II    52 31 17
                                                 OsirisP               1.13       6        II    39 28 33             1.12        0.85    II    43 33 24
                                                 AAM                   1.16                III   33 29 38             1.16        0.94   III    42 31 27

Different MlogP implementations                  CLOGP                 1.23                III   37 28 35             1.21        1.01   III    46 28 22
                                                 ACD/logP              1.28                III   35 27 38             1.28        0.87   III    46 34 21
demonstrate very different                       CSlogP                1.29      20        III   37 27 36             1.28        1.06   III    38 29 33
performances for both sets                       COSMOFrag             1.30 10883          III   32 27 40             1.30        1.06   III    29 31 40
                                                 QikProp               1.32     103        III   31 26 43             1.32        1.17   III    27 24 49
                                                 KowWIN                1.32      16        III   33 26 41             1.31        1.20   III    29 27 44
                                                 QLogP                 1.33      24        III   34 27 39             1.32        0.80    II    50 33 17
                                                 XLOGP2                1.80                III   15 17 68             1.80        0.94   III    39 31 29
     N.B! Do we really compare                   MLOGP(Dragon) 2.03                        III   34 24 42             2.03        0.90   III    45 30 25
     methods or their implementations?         1Nr of molecu les with ca lculations failures due to errors or crash of programs. All methods predicted all
                                               molecules for the Nycomed dataset. 2RMSE calculated after excluding of 769 zwitterionic compounds from the
                                               Pfizer dataset. 3Most molecules failed by COSMOFrag are zwitterions.



                                  Mannhold, R. et al, J. Pharm. Sci., 2009, 98(3), 861-893.
Optimization of a drug: we need to find a road in
multidimensional space
…and to arrive to a good result as
soon as possible!
We need to optimize our road: we can’t see mountains!
Examples of distances to models




Jaworska et al,
ATLA, 33, 445-
459, 2005.        Euclidian      Mahalanobis

Glende et al,
Mutation Res.,
2001.




                   City-block   Probability-density
Distances to a model in descriptor space




             Jaworska et al, ATLA, 33, 445-459, 2005.
Nearest neighbors in the input space

     x2
      4


      2


      0                   x


      -2


      -4

           -3   -2   -1   0   1   2    3
                                      x1
Nearest neighbors and activity
                         1         y
                        0.8


                        0.6


                        0.4


                        0.2


     y=exp(-(x1+x2)2)    0
                              -3       -2   -1   0   1   2   3
                                                             x    The nearest
     x=x1+x2 ! x2
                4
                                                                  neighbors in
                                                                  descriptor space
                                                                  are not the
                         2
                                                                  neighbors in
                         0                       x                property!

                         -2


                         -4

                              -3       -2   -1   0   1   2   3
                                                             x1
Ensemble methods




Hansen, L.K.; Salamon, P. IEEE Trans. Pattern. Anal. Mach. Learn., 1990, 12, 993.
Tetko, I. V.; Luik, A. I.; Poda, G. I. J. Med. Chem., 1993, 36, 811.
Tetko, I.V.; Livingstone, D. J.; Luik, A. I. Neural Network Studies. 1. Comparison of
Overfitting and Overtraining. J. Chem. Inf. Comput. Sci. 1995, 35(5), 826.
Who of them would be elected in 1904?




2000



        G.W. Bush        A. Gore
        republican       democrat
Elections results in 1904




 1904


         T. Roosevelt       A. Parker
         republican         democrat
President elections in the USA by states




The distribution of votes in states creates a typical fingerprint,
     which is invisible when we use just average values

          Democrats of 2000 == Republicans of 1904
Ensemble Based Distance
            HO
                                             " net 1 %
                               "12.3%
                                             $       '
                               $     '
                                             $ net 2 '
                               $ 4.6 '
                                             $ M '
logP=3.11                      $ M '
                                             $       '    Morphinan-3-ol, 17-methyl-
                                             $net 63'
                               $     '
                       N
                               $13.2'
                                             $net 64'
                                             #       &
                               $
                               #10.1&'
       HO
                                             " net 1 %
                               "13.7%
                                             $       '
                               $     '
                                             $ net 2 '
logP=3.48                      $ 4.8 '   !                       Levallorphan
                           !                 $ M '
                               $ M '
                                             $       '
                                             $net 63'
                   N           $     '
                               $15.8'        $       '
                               $12.0'        #net 64&
                               #     &


                                         !
                           !

                                                         CORREL -- correlation
                                                         between model vectors
                                                         for any two molecules
Nearest neighbors and activity
          A                                                        B
                1        y                                                 1        y

              0.8                                                      0.8


              0.6                                                      0.6


              0.4                                                      0.4


              0.2                                                      0.2


                0                                                          0
                    -3       -2   -1       0         1    2    3               -3       -2       -1       0    1   2   3
                                                              x                                                        x

x=x1+x2                                C       x2
                                                4


                                                2                                                             CORREL measure
                                                0                      x
                                                                                                              correctly detect the
                                                                                                              nearest neighbors!
                                               -2


                                               -4

                                                    -3   -2   -1       0            1        2        3
                                                                                                  x1
Local models: Instant learning of logP for Pt(II) molecules


                                     H2 N        Cl
                                            Pt
                      Cl             H2N         Cl
                               Cl                                N        Cl
 H2N         Cl                                                      Pt
                                                                 N         Cl
        Pt
 H 2N        Cl
                                      Cl
                           N
                                Pt               H2N        Cl
                           N          Cl         H2N
                                                       Pt
                                                            Cl       H2N         Cl
                                                                            Pt
                                                                      H2N        Cl
                  O
    N        O                      H3N                Cl
        Pt
    N        O                               Pt
                  O                 H3N                Cl




            Prediction of new classes of compounds can be extremely difficult as
        exemplified by an absence of correlations between predicted and experimental
                      values using developed by us ALOGPS program.


                                                                 Tetko et al, J. Inorg. Biochem, 2008, 102, 1424-37.
Local models: Instant learning by knowledge transfer




        The use of LIBRARY mode (local correction of the global model)
                dramatically (5 times!) decreased logP errors,




                        Tetko et al, J. Inorg. Biochem, 2008, 102, 1424-37.
Performance of algorithms for in-house datasets
                                       Pfizer set (N = 95 809)                      Nycomed set (N = 882)
                                                  % in error range    RMSE,                      % in error range
                        RMSE     Failed1   rank                                    RMSE   rank
                                                  <0.5   0.5-   >1   zwitterions                 <0.5   0.5-   >1
         Method                                           1          excluded2                           1

  Consensus log P       0.95                I     48 29 24             0.94        0.58    I     61 32         7
  ALOGPS                1.02                I     41 30 29             1.01        0.68    I     51 34 15
  S+logP                1.02                I     44 29 27             1.00        0.69    I     58 27 15
  NC+NHET               1.04                II    38 30 32             1.04        0.88   III    42 32 26
  MLOGP(S+)             1.05                II    40 29 31             1.05        1.17   III    32 26 41
  XLOGP3                1.07                II    43 28 29             1.06        0.65    I     55 34 12
  MiLogP                1.10      27        II    41 28 30             1.09        0.67    I     60 26 14
  AB/LogP               1.12      24        II    39 29 33             1.11        0.88   III    45 28 27
  ALOGP                 1.12                II    39 29 32             1.12        0.72    II    52 33 15
  ALOGP98               1.12                II    40 28 32             1.10        0.73    II    52 31 17
  OsirisP               1.13       6        II    39 28 33             1.12        0.85    II    43 33 24
  AAM                   1.16                III   33 29 38             1.16        0.94   III    42 31 27
  CLOGP                 1.23                III   37 28 35             1.21        1.01   III    46 28 22
  ACD/logP              1.28                III   35 27 38             1.28        0.87   III    46 34 21
  CSlogP                1.29      20        III   37 27 36             1.28        1.06   III    38 29 33
  COSMOFrag             1.30 10883          III   32 27 40             1.30        1.06   III    29 31 40
  QikProp               1.32     103        III   31 26 43             1.32        1.17   III    27 24 49
  KowWIN                1.32      16        III   33 26 41             1.31        1.20   III    29 27 44
  QLogP                 1.33      24        III   34 27 39             1.32        0.80    II    50 33 17
  XLOGP2                1.80                III   15 17 68             1.80        0.94   III    39 31 29
  MLOGP(Dragon) 2.03                        III   34 24 42             2.03        0.90   III    45 30 25

1Nr of molecu les with ca lculations failures due to errors or crash of programs. All methods predicted all
molecules for the Nycomed dataset. 2RMSE calculated after excluding of 769 zwitterionic compounds from the
Pfizer dataset. 3Most molecules failed by COSMOFrag are zwitterions.



      Mannhold, R. et al, J. Pharm. Sci., 2009, 98(3), 861-893.
Local models: Instant learning of in-house data
(Pfizer Inc.), N=95809

    ALOGPS Blind prediction                            ALOGPS LIBRARY




        RMSE=1.02                                     RMSE=0.59

              in less than 30 minutes of calculations on a notebook!

                      Tetko, Poda, Ostermann, Mannhold, QSAR Comb. Sci, 2009, in press.
REACH and QSAR (Quantitative Structure
        Activity Relationship) models

> 140,000 chemicals to be registered … is a lot!
It is expensive to measure all of them ($200,000 per compound), a lot of
    animal testing
QSAR models can be used to prioritize compounds
    •    Compound is predicted to be toxic
          • Biological testing will be done to prove/
            disprove the models
    •    Compound is predicted to be not toxic
          • tests can be avoided, saving money, animals
          • but ... only if we are confident in the predictions
    •    FP7 project CADASTER http://www.cadaster.eu has
         goals to develop a strategy for use of in silico methods
         in REACH
Estimation toxicity of T. pyriformis

Initial Dataset1,2
   n=983 molecules
       n=644 training set
       n=339 test set 1


Test set 2:
  n=110 molecules1,2


     The overall goal is to predict and to assess the reliability of predictions
     toxicity against T. pyriformis for chemicals directly from their structure.

           1Zhu   et al, J. Chem. Inf. Comput. Sci, 2008, 48(4), 766-784.
           2Schultz   et al, QSAR Comb Sci, 2007, 26(2), 238-254.
Overview of analyzed distances to models (DMs)


EUCLID                    k
                                                                   TANIMOTO
                         "d       j                                                                         "x    a,i   x b,i
               EUm=      j=1
                                      k is number of nearest          Tanimoto(a,b) =
                              k                                                         "x   a,i   x a,i   +" x
                                                                                                              b,i       x b,i # " x a,i x b,i
                                      neighbors, m index of
          EUCLID = EU m                                            xa,i and xb,i are fragment counts
                                      model
              !                                                !
LEVERAGE                                                           PLSEU (DModX)
!
            LEVERAGE=xT(XTX)-1x                                    Error in approximation (restoration) of the
                                                                   vector of input variables from the latent
                                                                   variables and PLS weights.

                        1              2
                                                                   CORREL
STD            STD =
                       N "1
                            #( yi " y)
                                                                   CORREL(a) =maxj CORREL(a,j)=R2(Yacalc,Yjcalc)
         yi is value calculated with model i and y is average Ya=(y ,…,y ) is vector of predictions of molecule i
     !                                                             1    N
value

                                             !
Mixture of Gaussian
Distributions (MGD)



Idea is to find a MGD,
which maximize
likelihood (probability)

! N(0,!2(ei))

 of the observed
distribution of errors
MGDs for the simulated datasets


A) No significant
   MGD was found
B) A MGD composed
   of 3 Gaussian
   distributions was
   found
Distance does not work: Neural Network model
Distance works: Neural Network model
Estimations based on training set errors




                Tetko et al, J Chem Inf Model, 2008, 48(9):1733-46.
Estimations of errors using MGD and 5-fold
cross-validation




                Tetko et al, J Chem Inf Model, 2008, 48(9):1733-46.
Prediction accuracy for training and two external sets




Estimated experimental
accuracy is about
SE = 0.38

HPV (High Production
Volume): 3182
EINECS (REACH): 48774




Challenge to predict toxicity organized in collaboration with European Neural Network Society
                                   at http://www.cadaster.eu

                               Tetko et al, J Chem Inf Model, 2008, 48(9):1733-46.
Challenges and solutions
                       new measurement
                              N



                                  O      O




       reliable predictions


                                                              O   N        O




                                                                           O        OH
                                                      N       O


                                                                  new series to predict




            Our methodology allows confident navigation in a defined molecular space.

!   It can be used to develop targeted (local) models covering specific series.

!   It can be used to reliably estimate which compounds can/can’t be reliably predicted.

!   It can be used to guide experimental design and to minimize costs of new measurements.
Acknowledgements


     My group                                                            Collaborators:

Mr I. Sushko                                                    Dr. C. Höfer (DMPKore)
Mr S. Novotarskyi                                               Dr. G. Poda (Pfizer)
Mr A.K. Pandey                                                  Dr. C. Ostermann (Nycomed)
Mr R. Körner                                                    Prof. R. Mannhold (Düsseldorf
Mr S. Brandmaier                                                University)
Dr M. Rupp                                                      Prof. A. Tropsha (NC, USA)
                                                                Prof. T. Oprea (New Mexico, USA)
                                                                + many other colleagues &
Visiting Scientists
                                                                       co-authors
Dr. V. Kovalishyn
Dr. V. Prokopenko
Prof. J. Emmersen                    Funding
                                               FP7 MC ITN ECO
               GO-Bio BMBF http://qspr.eu
                                               FP7 CADASTER http://www.cadaster.ue
            Germany-Ukraine grant UKR 08/006
                                               FP6 INTAS VCCLAB http://www.vcclab.org
                    DFG TE 380/1-1

Mais conteúdo relacionado

Destaque

Faith And Economics
Faith And EconomicsFaith And Economics
Faith And EconomicsRebecca G
 
Azure en entornos empresariales
Azure en entornos empresarialesAzure en entornos empresariales
Azure en entornos empresarialesAvanet
 
SharePoint Worst Practices - SPSRIC
SharePoint Worst Practices - SPSRICSharePoint Worst Practices - SPSRIC
SharePoint Worst Practices - SPSRICDan Usher
 
Future of ecommerce
Future of ecommerceFuture of ecommerce
Future of ecommerceSasmita Pati
 
你所不知道的健康檢查
你所不知道的健康檢查你所不知道的健康檢查
你所不知道的健康檢查honan4108
 
Redacción de textos Nicolas Arturo Vargas
Redacción de textos Nicolas Arturo VargasRedacción de textos Nicolas Arturo Vargas
Redacción de textos Nicolas Arturo Vargasnicolas1629
 
HXRefactored - Doesn't Your Mom Deserve Better
HXRefactored - Doesn't Your Mom Deserve BetterHXRefactored - Doesn't Your Mom Deserve Better
HXRefactored - Doesn't Your Mom Deserve BetterSanjay Khurana
 
collaborative inquiry
collaborative inquiry collaborative inquiry
collaborative inquiry yliulaoshi
 
2015 Ultimate Hiring Toolbox For Small & Medium Businesses
2015 Ultimate Hiring Toolbox For Small & Medium Businesses2015 Ultimate Hiring Toolbox For Small & Medium Businesses
2015 Ultimate Hiring Toolbox For Small & Medium BusinessesSage HR
 
Jel 2012 open education sophie2ze
Jel 2012 open education sophie2zeJel 2012 open education sophie2ze
Jel 2012 open education sophie2zeSophie TOUZÉ
 
9 2 business environment and business ideas
9 2 business environment and business ideas9 2 business environment and business ideas
9 2 business environment and business ideasbananaapple2
 
Cannes Lions 2016 Unwrapped
Cannes Lions 2016 UnwrappedCannes Lions 2016 Unwrapped
Cannes Lions 2016 UnwrappedCaratAUNZ
 

Destaque (17)

Faith And Economics
Faith And EconomicsFaith And Economics
Faith And Economics
 
Reto clínico joven con dolor abdominal intratable
Reto clínico joven con dolor abdominal intratableReto clínico joven con dolor abdominal intratable
Reto clínico joven con dolor abdominal intratable
 
Azure en entornos empresariales
Azure en entornos empresarialesAzure en entornos empresariales
Azure en entornos empresariales
 
SharePoint Worst Practices - SPSRIC
SharePoint Worst Practices - SPSRICSharePoint Worst Practices - SPSRIC
SharePoint Worst Practices - SPSRIC
 
ICCS9 2011 Talk
ICCS9 2011 TalkICCS9 2011 Talk
ICCS9 2011 Talk
 
Future of ecommerce
Future of ecommerceFuture of ecommerce
Future of ecommerce
 
你所不知道的健康檢查
你所不知道的健康檢查你所不知道的健康檢查
你所不知道的健康檢查
 
Redacción de textos Nicolas Arturo Vargas
Redacción de textos Nicolas Arturo VargasRedacción de textos Nicolas Arturo Vargas
Redacción de textos Nicolas Arturo Vargas
 
HXRefactored - Doesn't Your Mom Deserve Better
HXRefactored - Doesn't Your Mom Deserve BetterHXRefactored - Doesn't Your Mom Deserve Better
HXRefactored - Doesn't Your Mom Deserve Better
 
98 2016 da 0 a tre anni
98   2016   da 0 a tre anni98   2016   da 0 a tre anni
98 2016 da 0 a tre anni
 
E book La Crisis Silenciosa (1ª Parte)
E book La Crisis Silenciosa (1ª Parte)E book La Crisis Silenciosa (1ª Parte)
E book La Crisis Silenciosa (1ª Parte)
 
collaborative inquiry
collaborative inquiry collaborative inquiry
collaborative inquiry
 
2015 Ultimate Hiring Toolbox For Small & Medium Businesses
2015 Ultimate Hiring Toolbox For Small & Medium Businesses2015 Ultimate Hiring Toolbox For Small & Medium Businesses
2015 Ultimate Hiring Toolbox For Small & Medium Businesses
 
Dukane ipad products 2013
Dukane ipad products 2013Dukane ipad products 2013
Dukane ipad products 2013
 
Jel 2012 open education sophie2ze
Jel 2012 open education sophie2zeJel 2012 open education sophie2ze
Jel 2012 open education sophie2ze
 
9 2 business environment and business ideas
9 2 business environment and business ideas9 2 business environment and business ideas
9 2 business environment and business ideas
 
Cannes Lions 2016 Unwrapped
Cannes Lions 2016 UnwrappedCannes Lions 2016 Unwrapped
Cannes Lions 2016 Unwrapped
 

Semelhante a Developing Chemoinformatics Models

Cheminformatics toolkits: a personal perspective
Cheminformatics toolkits: a personal perspectiveCheminformatics toolkits: a personal perspective
Cheminformatics toolkits: a personal perspectiveNextMove Software
 
Chemoinformatics in Action
Chemoinformatics in ActionChemoinformatics in Action
Chemoinformatics in ActionSSA KPI
 
Sparse feature analysis for detection of clustered microcalcifications in mam...
Sparse feature analysis for detection of clustered microcalcifications in mam...Sparse feature analysis for detection of clustered microcalcifications in mam...
Sparse feature analysis for detection of clustered microcalcifications in mam...Wesley De Neve
 
Paper and pencil_cosmological_calculator
Paper and pencil_cosmological_calculatorPaper and pencil_cosmological_calculator
Paper and pencil_cosmological_calculatorSérgio Sacani
 
Open Science Data Repository - Dataledger
Open Science Data Repository - DataledgerOpen Science Data Repository - Dataledger
Open Science Data Repository - DataledgerAlexandru Korotcov
 
Molecular design: How to and how not to?
Molecular design:  How to and how not to?Molecular design:  How to and how not to?
Molecular design: How to and how not to?Peter Kenny
 
Cocomo ii estimation
Cocomo ii estimationCocomo ii estimation
Cocomo ii estimationjujin1810
 
Benchmark Calculations of Atomic Data for Modelling Applications
 Benchmark Calculations of Atomic Data for Modelling Applications Benchmark Calculations of Atomic Data for Modelling Applications
Benchmark Calculations of Atomic Data for Modelling ApplicationsAstroAtom
 
Jag Trasgo Helsinki091002
Jag Trasgo Helsinki091002Jag Trasgo Helsinki091002
Jag Trasgo Helsinki091002Miguel Morales
 
Faster, More Effective Flowgraph-based Malware Classification
Faster, More Effective Flowgraph-based Malware ClassificationFaster, More Effective Flowgraph-based Malware Classification
Faster, More Effective Flowgraph-based Malware ClassificationSilvio Cesare
 
IJERD (www.ijerd.com) International Journal of Engineering Research and Devel...
IJERD (www.ijerd.com) International Journal of Engineering Research and Devel...IJERD (www.ijerd.com) International Journal of Engineering Research and Devel...
IJERD (www.ijerd.com) International Journal of Engineering Research and Devel...IJERD Editor
 
EPA Summer 2013_Portable Pharmacokinetic Parameter Prediction Tool
EPA Summer 2013_Portable Pharmacokinetic Parameter Prediction ToolEPA Summer 2013_Portable Pharmacokinetic Parameter Prediction Tool
EPA Summer 2013_Portable Pharmacokinetic Parameter Prediction ToolEmerald Feng
 
Flexscore: Ensemble-based evaluation for protein Structure models
Flexscore: Ensemble-based evaluation for protein Structure modelsFlexscore: Ensemble-based evaluation for protein Structure models
Flexscore: Ensemble-based evaluation for protein Structure modelsPurdue University
 
Mining Version Histories to Guide Software Changes
Mining Version Histories to Guide Software ChangesMining Version Histories to Guide Software Changes
Mining Version Histories to Guide Software ChangesThomas Zimmermann
 
Concept of Limit of Detection (LOD)
Concept of Limit of Detection (LOD)Concept of Limit of Detection (LOD)
Concept of Limit of Detection (LOD)GH Yeoh
 
An Efficient Reactive Model for Resource Discovery in DHT-Based Peer-to-Peer ...
An Efficient Reactive Model for Resource Discovery in DHT-Based Peer-to-Peer ...An Efficient Reactive Model for Resource Discovery in DHT-Based Peer-to-Peer ...
An Efficient Reactive Model for Resource Discovery in DHT-Based Peer-to-Peer ...James Salter
 
Integrating R with the CDK: Enhanced Chemical Data Mining
Integrating R with the CDK: Enhanced Chemical Data MiningIntegrating R with the CDK: Enhanced Chemical Data Mining
Integrating R with the CDK: Enhanced Chemical Data MiningRajarshi Guha
 

Semelhante a Developing Chemoinformatics Models (20)

Cheminformatics toolkits: a personal perspective
Cheminformatics toolkits: a personal perspectiveCheminformatics toolkits: a personal perspective
Cheminformatics toolkits: a personal perspective
 
Wcre12c.ppt
Wcre12c.pptWcre12c.ppt
Wcre12c.ppt
 
Chemoinformatics in Action
Chemoinformatics in ActionChemoinformatics in Action
Chemoinformatics in Action
 
Sparse feature analysis for detection of clustered microcalcifications in mam...
Sparse feature analysis for detection of clustered microcalcifications in mam...Sparse feature analysis for detection of clustered microcalcifications in mam...
Sparse feature analysis for detection of clustered microcalcifications in mam...
 
Paper and pencil_cosmological_calculator
Paper and pencil_cosmological_calculatorPaper and pencil_cosmological_calculator
Paper and pencil_cosmological_calculator
 
Keynote HotSWUp 2012
Keynote HotSWUp 2012Keynote HotSWUp 2012
Keynote HotSWUp 2012
 
Open Science Data Repository - Dataledger
Open Science Data Repository - DataledgerOpen Science Data Repository - Dataledger
Open Science Data Repository - Dataledger
 
Molecular design: How to and how not to?
Molecular design:  How to and how not to?Molecular design:  How to and how not to?
Molecular design: How to and how not to?
 
Cocomo ii estimation
Cocomo ii estimationCocomo ii estimation
Cocomo ii estimation
 
9th ICCS Noordwijkerhout
9th ICCS Noordwijkerhout9th ICCS Noordwijkerhout
9th ICCS Noordwijkerhout
 
Benchmark Calculations of Atomic Data for Modelling Applications
 Benchmark Calculations of Atomic Data for Modelling Applications Benchmark Calculations of Atomic Data for Modelling Applications
Benchmark Calculations of Atomic Data for Modelling Applications
 
Jag Trasgo Helsinki091002
Jag Trasgo Helsinki091002Jag Trasgo Helsinki091002
Jag Trasgo Helsinki091002
 
Faster, More Effective Flowgraph-based Malware Classification
Faster, More Effective Flowgraph-based Malware ClassificationFaster, More Effective Flowgraph-based Malware Classification
Faster, More Effective Flowgraph-based Malware Classification
 
IJERD (www.ijerd.com) International Journal of Engineering Research and Devel...
IJERD (www.ijerd.com) International Journal of Engineering Research and Devel...IJERD (www.ijerd.com) International Journal of Engineering Research and Devel...
IJERD (www.ijerd.com) International Journal of Engineering Research and Devel...
 
EPA Summer 2013_Portable Pharmacokinetic Parameter Prediction Tool
EPA Summer 2013_Portable Pharmacokinetic Parameter Prediction ToolEPA Summer 2013_Portable Pharmacokinetic Parameter Prediction Tool
EPA Summer 2013_Portable Pharmacokinetic Parameter Prediction Tool
 
Flexscore: Ensemble-based evaluation for protein Structure models
Flexscore: Ensemble-based evaluation for protein Structure modelsFlexscore: Ensemble-based evaluation for protein Structure models
Flexscore: Ensemble-based evaluation for protein Structure models
 
Mining Version Histories to Guide Software Changes
Mining Version Histories to Guide Software ChangesMining Version Histories to Guide Software Changes
Mining Version Histories to Guide Software Changes
 
Concept of Limit of Detection (LOD)
Concept of Limit of Detection (LOD)Concept of Limit of Detection (LOD)
Concept of Limit of Detection (LOD)
 
An Efficient Reactive Model for Resource Discovery in DHT-Based Peer-to-Peer ...
An Efficient Reactive Model for Resource Discovery in DHT-Based Peer-to-Peer ...An Efficient Reactive Model for Resource Discovery in DHT-Based Peer-to-Peer ...
An Efficient Reactive Model for Resource Discovery in DHT-Based Peer-to-Peer ...
 
Integrating R with the CDK: Enhanced Chemical Data Mining
Integrating R with the CDK: Enhanced Chemical Data MiningIntegrating R with the CDK: Enhanced Chemical Data Mining
Integrating R with the CDK: Enhanced Chemical Data Mining
 

Mais de SSA KPI

Germany presentation
Germany presentationGermany presentation
Germany presentationSSA KPI
 
Grand challenges in energy
Grand challenges in energyGrand challenges in energy
Grand challenges in energySSA KPI
 
Engineering role in sustainability
Engineering role in sustainabilityEngineering role in sustainability
Engineering role in sustainabilitySSA KPI
 
Consensus and interaction on a long term strategy for sustainable development
Consensus and interaction on a long term strategy for sustainable developmentConsensus and interaction on a long term strategy for sustainable development
Consensus and interaction on a long term strategy for sustainable developmentSSA KPI
 
Competences in sustainability in engineering education
Competences in sustainability in engineering educationCompetences in sustainability in engineering education
Competences in sustainability in engineering educationSSA KPI
 
Introducatio SD for enginers
Introducatio SD for enginersIntroducatio SD for enginers
Introducatio SD for enginersSSA KPI
 
DAAD-10.11.2011
DAAD-10.11.2011DAAD-10.11.2011
DAAD-10.11.2011SSA KPI
 
Talking with money
Talking with moneyTalking with money
Talking with moneySSA KPI
 
'Green' startup investment
'Green' startup investment'Green' startup investment
'Green' startup investmentSSA KPI
 
From Huygens odd sympathy to the energy Huygens' extraction from the sea waves
From Huygens odd sympathy to the energy Huygens' extraction from the sea wavesFrom Huygens odd sympathy to the energy Huygens' extraction from the sea waves
From Huygens odd sympathy to the energy Huygens' extraction from the sea wavesSSA KPI
 
Dynamics of dice games
Dynamics of dice gamesDynamics of dice games
Dynamics of dice gamesSSA KPI
 
Energy Security Costs
Energy Security CostsEnergy Security Costs
Energy Security CostsSSA KPI
 
Naturally Occurring Radioactivity (NOR) in natural and anthropic environments
Naturally Occurring Radioactivity (NOR) in natural and anthropic environmentsNaturally Occurring Radioactivity (NOR) in natural and anthropic environments
Naturally Occurring Radioactivity (NOR) in natural and anthropic environmentsSSA KPI
 
Advanced energy technology for sustainable development. Part 5
Advanced energy technology for sustainable development. Part 5Advanced energy technology for sustainable development. Part 5
Advanced energy technology for sustainable development. Part 5SSA KPI
 
Advanced energy technology for sustainable development. Part 4
Advanced energy technology for sustainable development. Part 4Advanced energy technology for sustainable development. Part 4
Advanced energy technology for sustainable development. Part 4SSA KPI
 
Advanced energy technology for sustainable development. Part 3
Advanced energy technology for sustainable development. Part 3Advanced energy technology for sustainable development. Part 3
Advanced energy technology for sustainable development. Part 3SSA KPI
 
Advanced energy technology for sustainable development. Part 2
Advanced energy technology for sustainable development. Part 2Advanced energy technology for sustainable development. Part 2
Advanced energy technology for sustainable development. Part 2SSA KPI
 
Advanced energy technology for sustainable development. Part 1
Advanced energy technology for sustainable development. Part 1Advanced energy technology for sustainable development. Part 1
Advanced energy technology for sustainable development. Part 1SSA KPI
 
Fluorescent proteins in current biology
Fluorescent proteins in current biologyFluorescent proteins in current biology
Fluorescent proteins in current biologySSA KPI
 
Neurotransmitter systems of the brain and their functions
Neurotransmitter systems of the brain and their functionsNeurotransmitter systems of the brain and their functions
Neurotransmitter systems of the brain and their functionsSSA KPI
 

Mais de SSA KPI (20)

Germany presentation
Germany presentationGermany presentation
Germany presentation
 
Grand challenges in energy
Grand challenges in energyGrand challenges in energy
Grand challenges in energy
 
Engineering role in sustainability
Engineering role in sustainabilityEngineering role in sustainability
Engineering role in sustainability
 
Consensus and interaction on a long term strategy for sustainable development
Consensus and interaction on a long term strategy for sustainable developmentConsensus and interaction on a long term strategy for sustainable development
Consensus and interaction on a long term strategy for sustainable development
 
Competences in sustainability in engineering education
Competences in sustainability in engineering educationCompetences in sustainability in engineering education
Competences in sustainability in engineering education
 
Introducatio SD for enginers
Introducatio SD for enginersIntroducatio SD for enginers
Introducatio SD for enginers
 
DAAD-10.11.2011
DAAD-10.11.2011DAAD-10.11.2011
DAAD-10.11.2011
 
Talking with money
Talking with moneyTalking with money
Talking with money
 
'Green' startup investment
'Green' startup investment'Green' startup investment
'Green' startup investment
 
From Huygens odd sympathy to the energy Huygens' extraction from the sea waves
From Huygens odd sympathy to the energy Huygens' extraction from the sea wavesFrom Huygens odd sympathy to the energy Huygens' extraction from the sea waves
From Huygens odd sympathy to the energy Huygens' extraction from the sea waves
 
Dynamics of dice games
Dynamics of dice gamesDynamics of dice games
Dynamics of dice games
 
Energy Security Costs
Energy Security CostsEnergy Security Costs
Energy Security Costs
 
Naturally Occurring Radioactivity (NOR) in natural and anthropic environments
Naturally Occurring Radioactivity (NOR) in natural and anthropic environmentsNaturally Occurring Radioactivity (NOR) in natural and anthropic environments
Naturally Occurring Radioactivity (NOR) in natural and anthropic environments
 
Advanced energy technology for sustainable development. Part 5
Advanced energy technology for sustainable development. Part 5Advanced energy technology for sustainable development. Part 5
Advanced energy technology for sustainable development. Part 5
 
Advanced energy technology for sustainable development. Part 4
Advanced energy technology for sustainable development. Part 4Advanced energy technology for sustainable development. Part 4
Advanced energy technology for sustainable development. Part 4
 
Advanced energy technology for sustainable development. Part 3
Advanced energy technology for sustainable development. Part 3Advanced energy technology for sustainable development. Part 3
Advanced energy technology for sustainable development. Part 3
 
Advanced energy technology for sustainable development. Part 2
Advanced energy technology for sustainable development. Part 2Advanced energy technology for sustainable development. Part 2
Advanced energy technology for sustainable development. Part 2
 
Advanced energy technology for sustainable development. Part 1
Advanced energy technology for sustainable development. Part 1Advanced energy technology for sustainable development. Part 1
Advanced energy technology for sustainable development. Part 1
 
Fluorescent proteins in current biology
Fluorescent proteins in current biologyFluorescent proteins in current biology
Fluorescent proteins in current biology
 
Neurotransmitter systems of the brain and their functions
Neurotransmitter systems of the brain and their functionsNeurotransmitter systems of the brain and their functions
Neurotransmitter systems of the brain and their functions
 

Último

ACC 2024 Chronicles. Cardiology. Exam.pdf
ACC 2024 Chronicles. Cardiology. Exam.pdfACC 2024 Chronicles. Cardiology. Exam.pdf
ACC 2024 Chronicles. Cardiology. Exam.pdfSpandanaRallapalli
 
Difference Between Search & Browse Methods in Odoo 17
Difference Between Search & Browse Methods in Odoo 17Difference Between Search & Browse Methods in Odoo 17
Difference Between Search & Browse Methods in Odoo 17Celine George
 
DATA STRUCTURE AND ALGORITHM for beginners
DATA STRUCTURE AND ALGORITHM for beginnersDATA STRUCTURE AND ALGORITHM for beginners
DATA STRUCTURE AND ALGORITHM for beginnersSabitha Banu
 
Full Stack Web Development Course for Beginners
Full Stack Web Development Course  for BeginnersFull Stack Web Development Course  for Beginners
Full Stack Web Development Course for BeginnersSabitha Banu
 
Field Attribute Index Feature in Odoo 17
Field Attribute Index Feature in Odoo 17Field Attribute Index Feature in Odoo 17
Field Attribute Index Feature in Odoo 17Celine George
 
Choosing the Right CBSE School A Comprehensive Guide for Parents
Choosing the Right CBSE School A Comprehensive Guide for ParentsChoosing the Right CBSE School A Comprehensive Guide for Parents
Choosing the Right CBSE School A Comprehensive Guide for Parentsnavabharathschool99
 
USPS® Forced Meter Migration - How to Know if Your Postage Meter Will Soon be...
USPS® Forced Meter Migration - How to Know if Your Postage Meter Will Soon be...USPS® Forced Meter Migration - How to Know if Your Postage Meter Will Soon be...
USPS® Forced Meter Migration - How to Know if Your Postage Meter Will Soon be...Postal Advocate Inc.
 
Inclusivity Essentials_ Creating Accessible Websites for Nonprofits .pdf
Inclusivity Essentials_ Creating Accessible Websites for Nonprofits .pdfInclusivity Essentials_ Creating Accessible Websites for Nonprofits .pdf
Inclusivity Essentials_ Creating Accessible Websites for Nonprofits .pdfTechSoup
 
ANG SEKTOR NG agrikultura.pptx QUARTER 4
ANG SEKTOR NG agrikultura.pptx QUARTER 4ANG SEKTOR NG agrikultura.pptx QUARTER 4
ANG SEKTOR NG agrikultura.pptx QUARTER 4MiaBumagat1
 
Incoming and Outgoing Shipments in 3 STEPS Using Odoo 17
Incoming and Outgoing Shipments in 3 STEPS Using Odoo 17Incoming and Outgoing Shipments in 3 STEPS Using Odoo 17
Incoming and Outgoing Shipments in 3 STEPS Using Odoo 17Celine George
 
MULTIDISCIPLINRY NATURE OF THE ENVIRONMENTAL STUDIES.pptx
MULTIDISCIPLINRY NATURE OF THE ENVIRONMENTAL STUDIES.pptxMULTIDISCIPLINRY NATURE OF THE ENVIRONMENTAL STUDIES.pptx
MULTIDISCIPLINRY NATURE OF THE ENVIRONMENTAL STUDIES.pptxAnupkumar Sharma
 
Karra SKD Conference Presentation Revised.pptx
Karra SKD Conference Presentation Revised.pptxKarra SKD Conference Presentation Revised.pptx
Karra SKD Conference Presentation Revised.pptxAshokKarra1
 
GRADE 4 - SUMMATIVE TEST QUARTER 4 ALL SUBJECTS
GRADE 4 - SUMMATIVE TEST QUARTER 4 ALL SUBJECTSGRADE 4 - SUMMATIVE TEST QUARTER 4 ALL SUBJECTS
GRADE 4 - SUMMATIVE TEST QUARTER 4 ALL SUBJECTSJoshuaGantuangco2
 
Like-prefer-love -hate+verb+ing & silent letters & citizenship text.pdf
Like-prefer-love -hate+verb+ing & silent letters & citizenship text.pdfLike-prefer-love -hate+verb+ing & silent letters & citizenship text.pdf
Like-prefer-love -hate+verb+ing & silent letters & citizenship text.pdfMr Bounab Samir
 
HỌC TỐT TIẾNG ANH 11 THEO CHƯƠNG TRÌNH GLOBAL SUCCESS ĐÁP ÁN CHI TIẾT - CẢ NĂ...
HỌC TỐT TIẾNG ANH 11 THEO CHƯƠNG TRÌNH GLOBAL SUCCESS ĐÁP ÁN CHI TIẾT - CẢ NĂ...HỌC TỐT TIẾNG ANH 11 THEO CHƯƠNG TRÌNH GLOBAL SUCCESS ĐÁP ÁN CHI TIẾT - CẢ NĂ...
HỌC TỐT TIẾNG ANH 11 THEO CHƯƠNG TRÌNH GLOBAL SUCCESS ĐÁP ÁN CHI TIẾT - CẢ NĂ...Nguyen Thanh Tu Collection
 
ENGLISH6-Q4-W3.pptxqurter our high choom
ENGLISH6-Q4-W3.pptxqurter our high choomENGLISH6-Q4-W3.pptxqurter our high choom
ENGLISH6-Q4-W3.pptxqurter our high choomnelietumpap1
 
call girls in Kamla Market (DELHI) 🔝 >༒9953330565🔝 genuine Escort Service 🔝✔️✔️
call girls in Kamla Market (DELHI) 🔝 >༒9953330565🔝 genuine Escort Service 🔝✔️✔️call girls in Kamla Market (DELHI) 🔝 >༒9953330565🔝 genuine Escort Service 🔝✔️✔️
call girls in Kamla Market (DELHI) 🔝 >༒9953330565🔝 genuine Escort Service 🔝✔️✔️9953056974 Low Rate Call Girls In Saket, Delhi NCR
 
What is Model Inheritance in Odoo 17 ERP
What is Model Inheritance in Odoo 17 ERPWhat is Model Inheritance in Odoo 17 ERP
What is Model Inheritance in Odoo 17 ERPCeline George
 
Judging the Relevance and worth of ideas part 2.pptx
Judging the Relevance  and worth of ideas part 2.pptxJudging the Relevance  and worth of ideas part 2.pptx
Judging the Relevance and worth of ideas part 2.pptxSherlyMaeNeri
 

Último (20)

ACC 2024 Chronicles. Cardiology. Exam.pdf
ACC 2024 Chronicles. Cardiology. Exam.pdfACC 2024 Chronicles. Cardiology. Exam.pdf
ACC 2024 Chronicles. Cardiology. Exam.pdf
 
Difference Between Search & Browse Methods in Odoo 17
Difference Between Search & Browse Methods in Odoo 17Difference Between Search & Browse Methods in Odoo 17
Difference Between Search & Browse Methods in Odoo 17
 
DATA STRUCTURE AND ALGORITHM for beginners
DATA STRUCTURE AND ALGORITHM for beginnersDATA STRUCTURE AND ALGORITHM for beginners
DATA STRUCTURE AND ALGORITHM for beginners
 
Full Stack Web Development Course for Beginners
Full Stack Web Development Course  for BeginnersFull Stack Web Development Course  for Beginners
Full Stack Web Development Course for Beginners
 
Field Attribute Index Feature in Odoo 17
Field Attribute Index Feature in Odoo 17Field Attribute Index Feature in Odoo 17
Field Attribute Index Feature in Odoo 17
 
Choosing the Right CBSE School A Comprehensive Guide for Parents
Choosing the Right CBSE School A Comprehensive Guide for ParentsChoosing the Right CBSE School A Comprehensive Guide for Parents
Choosing the Right CBSE School A Comprehensive Guide for Parents
 
USPS® Forced Meter Migration - How to Know if Your Postage Meter Will Soon be...
USPS® Forced Meter Migration - How to Know if Your Postage Meter Will Soon be...USPS® Forced Meter Migration - How to Know if Your Postage Meter Will Soon be...
USPS® Forced Meter Migration - How to Know if Your Postage Meter Will Soon be...
 
Inclusivity Essentials_ Creating Accessible Websites for Nonprofits .pdf
Inclusivity Essentials_ Creating Accessible Websites for Nonprofits .pdfInclusivity Essentials_ Creating Accessible Websites for Nonprofits .pdf
Inclusivity Essentials_ Creating Accessible Websites for Nonprofits .pdf
 
ANG SEKTOR NG agrikultura.pptx QUARTER 4
ANG SEKTOR NG agrikultura.pptx QUARTER 4ANG SEKTOR NG agrikultura.pptx QUARTER 4
ANG SEKTOR NG agrikultura.pptx QUARTER 4
 
Incoming and Outgoing Shipments in 3 STEPS Using Odoo 17
Incoming and Outgoing Shipments in 3 STEPS Using Odoo 17Incoming and Outgoing Shipments in 3 STEPS Using Odoo 17
Incoming and Outgoing Shipments in 3 STEPS Using Odoo 17
 
MULTIDISCIPLINRY NATURE OF THE ENVIRONMENTAL STUDIES.pptx
MULTIDISCIPLINRY NATURE OF THE ENVIRONMENTAL STUDIES.pptxMULTIDISCIPLINRY NATURE OF THE ENVIRONMENTAL STUDIES.pptx
MULTIDISCIPLINRY NATURE OF THE ENVIRONMENTAL STUDIES.pptx
 
Karra SKD Conference Presentation Revised.pptx
Karra SKD Conference Presentation Revised.pptxKarra SKD Conference Presentation Revised.pptx
Karra SKD Conference Presentation Revised.pptx
 
GRADE 4 - SUMMATIVE TEST QUARTER 4 ALL SUBJECTS
GRADE 4 - SUMMATIVE TEST QUARTER 4 ALL SUBJECTSGRADE 4 - SUMMATIVE TEST QUARTER 4 ALL SUBJECTS
GRADE 4 - SUMMATIVE TEST QUARTER 4 ALL SUBJECTS
 
Like-prefer-love -hate+verb+ing & silent letters & citizenship text.pdf
Like-prefer-love -hate+verb+ing & silent letters & citizenship text.pdfLike-prefer-love -hate+verb+ing & silent letters & citizenship text.pdf
Like-prefer-love -hate+verb+ing & silent letters & citizenship text.pdf
 
HỌC TỐT TIẾNG ANH 11 THEO CHƯƠNG TRÌNH GLOBAL SUCCESS ĐÁP ÁN CHI TIẾT - CẢ NĂ...
HỌC TỐT TIẾNG ANH 11 THEO CHƯƠNG TRÌNH GLOBAL SUCCESS ĐÁP ÁN CHI TIẾT - CẢ NĂ...HỌC TỐT TIẾNG ANH 11 THEO CHƯƠNG TRÌNH GLOBAL SUCCESS ĐÁP ÁN CHI TIẾT - CẢ NĂ...
HỌC TỐT TIẾNG ANH 11 THEO CHƯƠNG TRÌNH GLOBAL SUCCESS ĐÁP ÁN CHI TIẾT - CẢ NĂ...
 
ENGLISH6-Q4-W3.pptxqurter our high choom
ENGLISH6-Q4-W3.pptxqurter our high choomENGLISH6-Q4-W3.pptxqurter our high choom
ENGLISH6-Q4-W3.pptxqurter our high choom
 
YOUVE_GOT_EMAIL_PRELIMS_EL_DORADO_2024.pptx
YOUVE_GOT_EMAIL_PRELIMS_EL_DORADO_2024.pptxYOUVE_GOT_EMAIL_PRELIMS_EL_DORADO_2024.pptx
YOUVE_GOT_EMAIL_PRELIMS_EL_DORADO_2024.pptx
 
call girls in Kamla Market (DELHI) 🔝 >༒9953330565🔝 genuine Escort Service 🔝✔️✔️
call girls in Kamla Market (DELHI) 🔝 >༒9953330565🔝 genuine Escort Service 🔝✔️✔️call girls in Kamla Market (DELHI) 🔝 >༒9953330565🔝 genuine Escort Service 🔝✔️✔️
call girls in Kamla Market (DELHI) 🔝 >༒9953330565🔝 genuine Escort Service 🔝✔️✔️
 
What is Model Inheritance in Odoo 17 ERP
What is Model Inheritance in Odoo 17 ERPWhat is Model Inheritance in Odoo 17 ERP
What is Model Inheritance in Odoo 17 ERP
 
Judging the Relevance and worth of ideas part 2.pptx
Judging the Relevance  and worth of ideas part 2.pptxJudging the Relevance  and worth of ideas part 2.pptx
Judging the Relevance and worth of ideas part 2.pptx
 

Developing Chemoinformatics Models

  • 1. Developing chemoinformatics models: One can’t embrace unembraceable Igor V. Tetko Helmholtz Zentrum München - German Research Center for Environmental Health (GmbH) Institute of Bioinformatics & Systems Biology Kyiv, 10 August 2009, Summer School
  • 2. Representation of Molecules Can be defined with calculated properties (logP, quantum- chemical parameters, etc.) HO "12.3% $ ' $ 4.6 ' Can be defined with a set of $ M ' structural descriptors $ 13.2' ' $ (topological 2D, 3D, etc.). N $10.1' # & Goal is to correlate descriptors with some properties. ! HO One of these sets of descriptors "13.7% $ ' could be used for determine an $ 4.8 ' applicability domain of a model. $ M ' N $ ' $15.8' $12.0' # & ! Distance to model:
  • 3. "One can not embrace the unembraceable.” Possible: 1060 - 10100 molecules theoretically exist ( > 1080 atoms in the Universe) Achievable: 1020 - 1024 can be synthesized now by companies (weight of the Moon is ca 1023 kg) Available: 2*107 molecules are on the market Measured: 102 - 104 molecules with data Kozma Prutkov Problem: To predict these properties of just molecules on the market we must extrapolate data from one to 1,000 - 100,000 molecules! OH O O N There is a need for methods which can estimate the accuracy of predictions!
  • 4. Models can fail due to chemical diversity of training & test sets (i.e. outside of applicability domain) Training set data used to develop a model Our model given the training set New data to be estimated Correct model
  • 5. Declining R&D productivity in the pharmaceutical industry Approved medicine http://www.frost.com/prod/servlet/market-insight-top.pag?docid=128394740
  • 6. Benchmarking data: logP prediction Public dataset: Industrial datasets: N=266 molecules1 N=95809 (Pfizer Inc.)2 • N=233 Star set (supported with N=882 (Nycomed GmbH)3 experimental values from CLOGP v5.0 program) 2logP and logD (pH 7.4) • N=43 Non-Star set (no experimental logP values in measurements CLOGP v5.0) 3logP measurements only 1Provided by A. Avdeef, “Absorption and drug development. Solubility, permeability and charge state”, ed. Hoboken, NJ: Wiley–Interscience, 2003. Mannhold, R. et al, J. Pharm. Sci., 2009, 98(3), 861-893.
  • 7. Performance of algorithms for the public dataset Star set (N = 223) Non-Star set (N = 43) Method % within error range % within error range RMSE rank <0.5 0.5-1 >1 RMSE rank <0.5 0.5-1 >1 AB/LogP 0.41 I 84 12 4 1.00 I 42 23 35 S+logP 0.45 I 76 22 3 0.87 I 40 35 26 ACD/logP 0.50 I 75 17 7 1.00 I 44 33 23 Consensus log P 0.50 I 74 18 8 0.80 I 47 28 26 CLOGP 0.52 II 74 20 6 0.91 I 47 28 26 VLOGP OPS 0.52 II 64 21 7 1.07 I 33 28 26 AAM = average logP used ALOGPS 0.53 II 71 23 6 0.82 I 42 30 28 as predicted value for all MiLogP 0.57 II 69 22 9 0.86 I 49 30 21 XLOGP 0.62 II 60 30 10 0.89 I 47 23 30 molecules R2=0 KowWIN 0.64 II 68 21 11 1.05 I 40 30 30 CSlogP 0.65 II 66 22 12 0.93 I 58 19 23 Bootstrap test: ALOGP (Dragon) 0.69 II 60 25 16 0.92 I 28 40 33 • rank I - similar to “best MolLogP 0.69 II 61 25 14 0.93 I 40 35 26 ALOGP98 0.70 II 61 26 13 1.00 I 30 37 33 model” OsirisP 0.71 II 59 26 16 0.94 I 42 26 33 • rank II -- better than AAM VLOGP 0.72 II 65 22 14 1.13 I 40 28 33 • rank III - similar to AAM TLOGP 0.74 II 67 16 13 1.12 I 30 37 30 ABSOLV 0.75 II 53 30 17 1.02 I 49 28 23 QikProp 0.77 II 53 30 17 1.24 II 40 26 35 QuantlogP 0.80 II 47 30 22 1.17 II 35 26 40 SLIPPER-2002 0.80 II 62 22 15 1.16 II 35 23 42 COSMOFrag 0.84 II 48 26 19 1.23 II 26 40 33 1Provided by A. Avdeef, XLOGP2 0.87 II 57 22 20 1.16 II 35 23 42 Absorption and drug QLOGP 0.96 II 48 26 25 1.42 II 21 26 53 VEGA 1.04 II 47 27 26 1.24 II 28 30 42 development. Solubility, CLIP 1.05 II 41 25 30 1.54 III 33 9 49 permeability and charge LSER 1.07 II 44 26 30 1.26 II 35 16 49 state, ed. Hoboken, NJ: MLOGP (Sim+) 1.26 II 38 30 33 1.56 III 26 28 47 Wiley–Interscience, NC+NHET 1.35 III 29 26 45 1.71 III 19 16 65 SPARC 1.36 III 45 22 32 1.70 III 28 21 49 2003. MLOGP(Dragon) 1.52 III 39 26 35 2.45 III 23 30 47 LSER UFZ 1.60 III 36 23 41 2.79 III 19 12 67 AAM 1.62 III 22 24 53 2.10 III 19 28 53 VLOGP-NOPS 1.76 III 1 1 7 1.39 III 7 0 7 HINT 1.80 III 34 22 44 2.72 III 30 5 65 GBLOGP 1.98 III 32 26 42 1.75 III 19 16 65
  • 8. Performance of algorithms for in-house datasets Benchmarking of logP Pfizer set (N = 95 809) Nycomed set (N = 882) methods for in-house Method RMSE Failed1 rank % in error range <0.5 0.5- 1 >1 RMSE, zwitterions excluded2 RMSE rank % in error range <0.5 0.5- 1 >1 data of Pfizer & Nycomed Consensus log P 0.95 I 48 29 24 0.94 0.58 I 61 32 7 ALOGPS 1.02 I 41 30 29 1.01 0.68 I 51 34 15 • Large number of methods could not S+logP 1.02 I 44 29 27 1.00 0.69 I 58 27 15 perform better than AAM model NC+NHET 1.04 II 38 30 32 1.04 0.88 III 42 32 26 MLOGP(S+) 1.05 II 40 29 31 1.05 1.17 III 32 26 41 • Best results are calculated using XLOGP3 1.07 II 43 28 29 1.06 0.65 I 55 34 12 Consensus logP model MiLogP 1.10 27 II 41 28 30 1.09 0.67 I 60 26 14 • log P = 1.46 + 0.11 (NC - NHET) AB/LogP 1.12 24 II 39 29 33 1.11 0.88 III 45 28 27 N=95 809, RMSE=1.04, R2=0.2 ALOGP 1.12 II 39 29 32 1.12 0.72 II 52 33 15 ALOGP98 1.12 II 40 28 32 1.10 0.73 II 52 31 17 OsirisP 1.13 6 II 39 28 33 1.12 0.85 II 43 33 24 AAM 1.16 III 33 29 38 1.16 0.94 III 42 31 27 Different MlogP implementations CLOGP 1.23 III 37 28 35 1.21 1.01 III 46 28 22 ACD/logP 1.28 III 35 27 38 1.28 0.87 III 46 34 21 demonstrate very different CSlogP 1.29 20 III 37 27 36 1.28 1.06 III 38 29 33 performances for both sets COSMOFrag 1.30 10883 III 32 27 40 1.30 1.06 III 29 31 40 QikProp 1.32 103 III 31 26 43 1.32 1.17 III 27 24 49 KowWIN 1.32 16 III 33 26 41 1.31 1.20 III 29 27 44 QLogP 1.33 24 III 34 27 39 1.32 0.80 II 50 33 17 XLOGP2 1.80 III 15 17 68 1.80 0.94 III 39 31 29 N.B! Do we really compare MLOGP(Dragon) 2.03 III 34 24 42 2.03 0.90 III 45 30 25 methods or their implementations? 1Nr of molecu les with ca lculations failures due to errors or crash of programs. All methods predicted all molecules for the Nycomed dataset. 2RMSE calculated after excluding of 769 zwitterionic compounds from the Pfizer dataset. 3Most molecules failed by COSMOFrag are zwitterions. Mannhold, R. et al, J. Pharm. Sci., 2009, 98(3), 861-893.
  • 9. Optimization of a drug: we need to find a road in multidimensional space
  • 10. …and to arrive to a good result as soon as possible!
  • 11. We need to optimize our road: we can’t see mountains!
  • 12. Examples of distances to models Jaworska et al, ATLA, 33, 445- 459, 2005. Euclidian Mahalanobis Glende et al, Mutation Res., 2001. City-block Probability-density
  • 13. Distances to a model in descriptor space Jaworska et al, ATLA, 33, 445-459, 2005.
  • 14. Nearest neighbors in the input space x2 4 2 0 x -2 -4 -3 -2 -1 0 1 2 3 x1
  • 15. Nearest neighbors and activity 1 y 0.8 0.6 0.4 0.2 y=exp(-(x1+x2)2) 0 -3 -2 -1 0 1 2 3 x The nearest x=x1+x2 ! x2 4 neighbors in descriptor space are not the 2 neighbors in 0 x property! -2 -4 -3 -2 -1 0 1 2 3 x1
  • 16. Ensemble methods Hansen, L.K.; Salamon, P. IEEE Trans. Pattern. Anal. Mach. Learn., 1990, 12, 993. Tetko, I. V.; Luik, A. I.; Poda, G. I. J. Med. Chem., 1993, 36, 811. Tetko, I.V.; Livingstone, D. J.; Luik, A. I. Neural Network Studies. 1. Comparison of Overfitting and Overtraining. J. Chem. Inf. Comput. Sci. 1995, 35(5), 826.
  • 17. Who of them would be elected in 1904? 2000 G.W. Bush A. Gore republican democrat
  • 18. Elections results in 1904 1904 T. Roosevelt A. Parker republican democrat
  • 19. President elections in the USA by states The distribution of votes in states creates a typical fingerprint, which is invisible when we use just average values Democrats of 2000 == Republicans of 1904
  • 20. Ensemble Based Distance HO " net 1 % "12.3% $ ' $ ' $ net 2 ' $ 4.6 ' $ M ' logP=3.11 $ M ' $ ' Morphinan-3-ol, 17-methyl- $net 63' $ ' N $13.2' $net 64' # & $ #10.1&' HO " net 1 % "13.7% $ ' $ ' $ net 2 ' logP=3.48 $ 4.8 ' ! Levallorphan ! $ M ' $ M ' $ ' $net 63' N $ ' $15.8' $ ' $12.0' #net 64& # & ! ! CORREL -- correlation between model vectors for any two molecules
  • 21. Nearest neighbors and activity A B 1 y 1 y 0.8 0.8 0.6 0.6 0.4 0.4 0.2 0.2 0 0 -3 -2 -1 0 1 2 3 -3 -2 -1 0 1 2 3 x x x=x1+x2 C x2 4 2 CORREL measure 0 x correctly detect the nearest neighbors! -2 -4 -3 -2 -1 0 1 2 3 x1
  • 22. Local models: Instant learning of logP for Pt(II) molecules H2 N Cl Pt Cl H2N Cl Cl N Cl H2N Cl Pt N Cl Pt H 2N Cl Cl N Pt H2N Cl N Cl H2N Pt Cl H2N Cl Pt H2N Cl O N O H3N Cl Pt N O Pt O H3N Cl Prediction of new classes of compounds can be extremely difficult as exemplified by an absence of correlations between predicted and experimental values using developed by us ALOGPS program. Tetko et al, J. Inorg. Biochem, 2008, 102, 1424-37.
  • 23. Local models: Instant learning by knowledge transfer The use of LIBRARY mode (local correction of the global model) dramatically (5 times!) decreased logP errors, Tetko et al, J. Inorg. Biochem, 2008, 102, 1424-37.
  • 24. Performance of algorithms for in-house datasets Pfizer set (N = 95 809) Nycomed set (N = 882) % in error range RMSE, % in error range RMSE Failed1 rank RMSE rank <0.5 0.5- >1 zwitterions <0.5 0.5- >1 Method 1 excluded2 1 Consensus log P 0.95 I 48 29 24 0.94 0.58 I 61 32 7 ALOGPS 1.02 I 41 30 29 1.01 0.68 I 51 34 15 S+logP 1.02 I 44 29 27 1.00 0.69 I 58 27 15 NC+NHET 1.04 II 38 30 32 1.04 0.88 III 42 32 26 MLOGP(S+) 1.05 II 40 29 31 1.05 1.17 III 32 26 41 XLOGP3 1.07 II 43 28 29 1.06 0.65 I 55 34 12 MiLogP 1.10 27 II 41 28 30 1.09 0.67 I 60 26 14 AB/LogP 1.12 24 II 39 29 33 1.11 0.88 III 45 28 27 ALOGP 1.12 II 39 29 32 1.12 0.72 II 52 33 15 ALOGP98 1.12 II 40 28 32 1.10 0.73 II 52 31 17 OsirisP 1.13 6 II 39 28 33 1.12 0.85 II 43 33 24 AAM 1.16 III 33 29 38 1.16 0.94 III 42 31 27 CLOGP 1.23 III 37 28 35 1.21 1.01 III 46 28 22 ACD/logP 1.28 III 35 27 38 1.28 0.87 III 46 34 21 CSlogP 1.29 20 III 37 27 36 1.28 1.06 III 38 29 33 COSMOFrag 1.30 10883 III 32 27 40 1.30 1.06 III 29 31 40 QikProp 1.32 103 III 31 26 43 1.32 1.17 III 27 24 49 KowWIN 1.32 16 III 33 26 41 1.31 1.20 III 29 27 44 QLogP 1.33 24 III 34 27 39 1.32 0.80 II 50 33 17 XLOGP2 1.80 III 15 17 68 1.80 0.94 III 39 31 29 MLOGP(Dragon) 2.03 III 34 24 42 2.03 0.90 III 45 30 25 1Nr of molecu les with ca lculations failures due to errors or crash of programs. All methods predicted all molecules for the Nycomed dataset. 2RMSE calculated after excluding of 769 zwitterionic compounds from the Pfizer dataset. 3Most molecules failed by COSMOFrag are zwitterions. Mannhold, R. et al, J. Pharm. Sci., 2009, 98(3), 861-893.
  • 25. Local models: Instant learning of in-house data (Pfizer Inc.), N=95809 ALOGPS Blind prediction ALOGPS LIBRARY RMSE=1.02 RMSE=0.59 in less than 30 minutes of calculations on a notebook! Tetko, Poda, Ostermann, Mannhold, QSAR Comb. Sci, 2009, in press.
  • 26. REACH and QSAR (Quantitative Structure Activity Relationship) models > 140,000 chemicals to be registered … is a lot! It is expensive to measure all of them ($200,000 per compound), a lot of animal testing QSAR models can be used to prioritize compounds • Compound is predicted to be toxic • Biological testing will be done to prove/ disprove the models • Compound is predicted to be not toxic • tests can be avoided, saving money, animals • but ... only if we are confident in the predictions • FP7 project CADASTER http://www.cadaster.eu has goals to develop a strategy for use of in silico methods in REACH
  • 27. Estimation toxicity of T. pyriformis Initial Dataset1,2 n=983 molecules n=644 training set n=339 test set 1 Test set 2: n=110 molecules1,2 The overall goal is to predict and to assess the reliability of predictions toxicity against T. pyriformis for chemicals directly from their structure. 1Zhu et al, J. Chem. Inf. Comput. Sci, 2008, 48(4), 766-784. 2Schultz et al, QSAR Comb Sci, 2007, 26(2), 238-254.
  • 28. Overview of analyzed distances to models (DMs) EUCLID k TANIMOTO "d j "x a,i x b,i EUm= j=1 k is number of nearest Tanimoto(a,b) = k "x a,i x a,i +" x b,i x b,i # " x a,i x b,i neighbors, m index of EUCLID = EU m xa,i and xb,i are fragment counts model ! ! LEVERAGE PLSEU (DModX) ! LEVERAGE=xT(XTX)-1x Error in approximation (restoration) of the vector of input variables from the latent variables and PLS weights. 1 2 CORREL STD STD = N "1 #( yi " y) CORREL(a) =maxj CORREL(a,j)=R2(Yacalc,Yjcalc) yi is value calculated with model i and y is average Ya=(y ,…,y ) is vector of predictions of molecule i ! 1 N value !
  • 29. Mixture of Gaussian Distributions (MGD) Idea is to find a MGD, which maximize likelihood (probability) ! N(0,!2(ei)) of the observed distribution of errors
  • 30. MGDs for the simulated datasets A) No significant MGD was found B) A MGD composed of 3 Gaussian distributions was found
  • 31. Distance does not work: Neural Network model
  • 32. Distance works: Neural Network model
  • 33. Estimations based on training set errors Tetko et al, J Chem Inf Model, 2008, 48(9):1733-46.
  • 34. Estimations of errors using MGD and 5-fold cross-validation Tetko et al, J Chem Inf Model, 2008, 48(9):1733-46.
  • 35. Prediction accuracy for training and two external sets Estimated experimental accuracy is about SE = 0.38 HPV (High Production Volume): 3182 EINECS (REACH): 48774 Challenge to predict toxicity organized in collaboration with European Neural Network Society at http://www.cadaster.eu Tetko et al, J Chem Inf Model, 2008, 48(9):1733-46.
  • 36. Challenges and solutions new measurement N O O reliable predictions O N O O OH N O new series to predict Our methodology allows confident navigation in a defined molecular space. ! It can be used to develop targeted (local) models covering specific series. ! It can be used to reliably estimate which compounds can/can’t be reliably predicted. ! It can be used to guide experimental design and to minimize costs of new measurements.
  • 37. Acknowledgements My group Collaborators: Mr I. Sushko Dr. C. Höfer (DMPKore) Mr S. Novotarskyi Dr. G. Poda (Pfizer) Mr A.K. Pandey Dr. C. Ostermann (Nycomed) Mr R. Körner Prof. R. Mannhold (Düsseldorf Mr S. Brandmaier University) Dr M. Rupp Prof. A. Tropsha (NC, USA) Prof. T. Oprea (New Mexico, USA) + many other colleagues & Visiting Scientists co-authors Dr. V. Kovalishyn Dr. V. Prokopenko Prof. J. Emmersen Funding FP7 MC ITN ECO GO-Bio BMBF http://qspr.eu FP7 CADASTER http://www.cadaster.ue Germany-Ukraine grant UKR 08/006 FP6 INTAS VCCLAB http://www.vcclab.org DFG TE 380/1-1