SlideShare uma empresa Scribd logo
1 de 11
Baixar para ler offline
Data analysis in QSAR

        Noel O’Boyle

  Dave Palmer, John Mitchell
Quantitative Structure-Activity
             Relationship (QSAR)
Also QSPR (Property)



                       Perceive
                                  Physical
          Structure    Predict
                                  property
                       Propose
Quantitative Structure-Activity
              Relationship (QSAR)
Also QSPR (Property)



                             Perceive
                                        Physical
          Structure          Predict
                                        property
                             Propose

    Descriptors

    Molecular weight
    No.of H-bonding groups
    Surface area

    MOE: 180 descriptors
Quantitative Structure-Activity
              Relationship (QSAR)
                                           Need to use an estimate of the error
 Optimise model on training set (2/3)      of prediction (CV or bootstrap, but not
     Test model on test set (1/3)          resubstitution)

                  or

Build model on training set (2/3 of 2/3)
   Optimise on test set (1/3 of 2/3)       Using the error of prediction
  Test model on validation set (1/3)
Quantitative Structure-Activity
                   Relationship (QSAR)
                                                         Need to use an estimate of the error
  Optimise model on training set (2/3)                   of prediction (CV or bootstrap, but not
      Test model on test set (1/3)                       resubstitution)

                         or

Build model on training set (2/3 of 2/3)
   Optimise on test set (1/3 of 2/3)                     Using the error of prediction
  Test model on validation set (1/3)


Testing must be carried out on a hold-out set from the same distribution
as the the set of molecules used for optimisation
=> don't optimise model on diverse set
Assessing Model Fit by Cross-Validation, JCICS, 2003, 43, 579.
Beware of q2, J.Mol.Graph.Model., 2002, 20, 269.
Feature selection




                  Number of descriptors →

Features = descriptors                2n different subsets
Parameter optimisation

– Models have parameters which should be tuned
   • Support vector machines
        – gamma, cost, epsilon
QSAR: Feature selection and parameter
             optimisation
• Aim: To find a robust method of simultaneously
  optimising the parameters and performing feature
  selection
• Prediction of solubility for drug-like molecules
   – David Palmer
   – Support vector machines
       • gamma, cost, epsilon
   – 127 descriptors
• Large search space
   – stochastic algorithm required
Ant Colony Optimisation (ACO)
•   Insipired by behaviour of ants foraging for food
•   Ants lay down pheromones, which influence the path taken by other
    ants. Meanwhile pheromones are evaporating.
•   Ants’ trails converge to shortest path between nest and food




•   Ant Colony Optimisation (ACO) – Marco Dorigo, PhD Thesis, 1992.
•   ACO for feature selection – Shen et al, JCIM, 2005, 45, 1024.
•   We have extended it to perform simultaneous parameter optimisation
Ant Colony Optimisation (ACO)

• population of ants (typically 50 to 100)
    – each ant represents a model – i.e. a subset of descriptors and
      values for the parameters

    – each ant has a fitness score, e.g. 10-fold cross validation rmse

• it is more likely that an ant will choose a particular
  descriptor/parameter value in the next iteration if
    – many ants have chosen it in this iteration (local search), or

    – many ants have chosen it in their best models to date (global
      search)

• for descriptors/parameter values that are not chosen in the
  current or best models, the probability that they will be chosen
  decreases (evaporation)
Does it work?

Mais conteúdo relacionado

Destaque

Making the most of a QM calculation
Making the most of a QM calculationMaking the most of a QM calculation
Making the most of a QM calculationbaoilleach
 
Hyperchem Ma, badbarcode en_1109_nocomment-final
Hyperchem Ma, badbarcode en_1109_nocomment-finalHyperchem Ma, badbarcode en_1109_nocomment-final
Hyperchem Ma, badbarcode en_1109_nocomment-finalPacSecJP
 
Quantum pharmacology. Basics
Quantum pharmacology. BasicsQuantum pharmacology. Basics
Quantum pharmacology. BasicsMobiliuz
 
Focus on Reading Structure Activity
Focus on Reading Structure Activity Focus on Reading Structure Activity
Focus on Reading Structure Activity Jesse Bluma
 
Computer Aided Drug Design QSAR Related Methods
Computer Aided Drug Design QSAR Related MethodsComputer Aided Drug Design QSAR Related Methods
Computer Aided Drug Design QSAR Related MethodsJahan B Ghasemi
 
2 d qsar model of dihydrofolate reductase (dhfr) inhibitors with activity in ...
2 d qsar model of dihydrofolate reductase (dhfr) inhibitors with activity in ...2 d qsar model of dihydrofolate reductase (dhfr) inhibitors with activity in ...
2 d qsar model of dihydrofolate reductase (dhfr) inhibitors with activity in ...Prasanthperceptron
 
Discovery Bus: UK QSAR meeting at GSK
Discovery Bus: UK QSAR meeting at GSKDiscovery Bus: UK QSAR meeting at GSK
Discovery Bus: UK QSAR meeting at GSKDavid Leahy
 
Automating Drug Design Nov 13th 2009 97
Automating Drug Design Nov 13th 2009 97Automating Drug Design Nov 13th 2009 97
Automating Drug Design Nov 13th 2009 97David Leahy
 
Qsar introduction beginners
Qsar introduction beginnersQsar introduction beginners
Qsar introduction beginnersAntonio Cassano
 
Introduction to OECD QSAR Toolbox
Introduction to OECD QSAR ToolboxIntroduction to OECD QSAR Toolbox
Introduction to OECD QSAR Toolboxguestcfca1eb1
 
QSAR Study on Antitubercular Drug Derivatives
QSAR Study on Antitubercular Drug DerivativesQSAR Study on Antitubercular Drug Derivatives
QSAR Study on Antitubercular Drug DerivativesLydia Yeshitla
 
Qsar studies of saponine analogues for anticancer activity by sagar alone
Qsar studies of saponine analogues for anticancer activity by sagar aloneQsar studies of saponine analogues for anticancer activity by sagar alone
Qsar studies of saponine analogues for anticancer activity by sagar alonesagar alone
 

Destaque (17)

Computational Chemistry Robots
Computational Chemistry RobotsComputational Chemistry Robots
Computational Chemistry Robots
 
Making the most of a QM calculation
Making the most of a QM calculationMaking the most of a QM calculation
Making the most of a QM calculation
 
Hyperchem Ma, badbarcode en_1109_nocomment-final
Hyperchem Ma, badbarcode en_1109_nocomment-finalHyperchem Ma, badbarcode en_1109_nocomment-final
Hyperchem Ma, badbarcode en_1109_nocomment-final
 
Quantum pharmacology. Basics
Quantum pharmacology. BasicsQuantum pharmacology. Basics
Quantum pharmacology. Basics
 
JAI
JAIJAI
JAI
 
Focus on Reading Structure Activity
Focus on Reading Structure Activity Focus on Reading Structure Activity
Focus on Reading Structure Activity
 
Computer Aided Drug Design QSAR Related Methods
Computer Aided Drug Design QSAR Related MethodsComputer Aided Drug Design QSAR Related Methods
Computer Aided Drug Design QSAR Related Methods
 
2 d qsar model of dihydrofolate reductase (dhfr) inhibitors with activity in ...
2 d qsar model of dihydrofolate reductase (dhfr) inhibitors with activity in ...2 d qsar model of dihydrofolate reductase (dhfr) inhibitors with activity in ...
2 d qsar model of dihydrofolate reductase (dhfr) inhibitors with activity in ...
 
Electronegativity
ElectronegativityElectronegativity
Electronegativity
 
Discovery Bus: UK QSAR meeting at GSK
Discovery Bus: UK QSAR meeting at GSKDiscovery Bus: UK QSAR meeting at GSK
Discovery Bus: UK QSAR meeting at GSK
 
Molecular design of life
Molecular design of lifeMolecular design of life
Molecular design of life
 
Automating Drug Design Nov 13th 2009 97
Automating Drug Design Nov 13th 2009 97Automating Drug Design Nov 13th 2009 97
Automating Drug Design Nov 13th 2009 97
 
Molecular design
Molecular designMolecular design
Molecular design
 
Qsar introduction beginners
Qsar introduction beginnersQsar introduction beginners
Qsar introduction beginners
 
Introduction to OECD QSAR Toolbox
Introduction to OECD QSAR ToolboxIntroduction to OECD QSAR Toolbox
Introduction to OECD QSAR Toolbox
 
QSAR Study on Antitubercular Drug Derivatives
QSAR Study on Antitubercular Drug DerivativesQSAR Study on Antitubercular Drug Derivatives
QSAR Study on Antitubercular Drug Derivatives
 
Qsar studies of saponine analogues for anticancer activity by sagar alone
Qsar studies of saponine analogues for anticancer activity by sagar aloneQsar studies of saponine analogues for anticancer activity by sagar alone
Qsar studies of saponine analogues for anticancer activity by sagar alone
 

Semelhante a Data Analysis in QSAR

IGARSS2011-I-Ling.ppt
IGARSS2011-I-Ling.pptIGARSS2011-I-Ling.ppt
IGARSS2011-I-Ling.pptgrssieee
 
BIM_2010_20_Bioinformatics_Project
BIM_2010_20_Bioinformatics_ProjectBIM_2010_20_Bioinformatics_Project
BIM_2010_20_Bioinformatics_ProjectSagar Nikam
 
SVM-PSO based Feature Selection for Improving Medical Diagnosis Reliability u...
SVM-PSO based Feature Selection for Improving Medical Diagnosis Reliability u...SVM-PSO based Feature Selection for Improving Medical Diagnosis Reliability u...
SVM-PSO based Feature Selection for Improving Medical Diagnosis Reliability u...cscpconf
 
Automated parameter optimization should be included in future 
defect predict...
Automated parameter optimization should be included in future 
defect predict...Automated parameter optimization should be included in future 
defect predict...
Automated parameter optimization should be included in future 
defect predict...Chakkrit (Kla) Tantithamthavorn
 
A comparison of particle swarm optimization and the genetic algorithm by Rani...
A comparison of particle swarm optimization and the genetic algorithm by Rani...A comparison of particle swarm optimization and the genetic algorithm by Rani...
A comparison of particle swarm optimization and the genetic algorithm by Rani...Pim Piepers
 
A comparison of three chromatographic retention time prediction models
A comparison of three chromatographic retention time prediction modelsA comparison of three chromatographic retention time prediction models
A comparison of three chromatographic retention time prediction modelsAndrew McEachran
 
Data Mining Protein Structures' Topological Properties to Enhance Contact Ma...
Data Mining Protein Structures' Topological Properties  to Enhance Contact Ma...Data Mining Protein Structures' Topological Properties  to Enhance Contact Ma...
Data Mining Protein Structures' Topological Properties to Enhance Contact Ma...jaumebp
 
A parsimonious SVM model selection criterion for classification of real-world ...
A parsimonious SVM model selection criterion for classification of real-world ...A parsimonious SVM model selection criterion for classification of real-world ...
A parsimonious SVM model selection criterion for classification of real-world ...o_almasi
 
An Automatic Clustering Technique for Optimal Clusters
An Automatic Clustering Technique for Optimal ClustersAn Automatic Clustering Technique for Optimal Clusters
An Automatic Clustering Technique for Optimal ClustersIJCSEA Journal
 
AI 바이오 (4일차).pdf
AI 바이오 (4일차).pdfAI 바이오 (4일차).pdf
AI 바이오 (4일차).pdfH K Yoon
 
structure analysis.ppt
structure analysis.pptstructure analysis.ppt
structure analysis.pptRaviJangid77
 
structure analysis.ppt
structure analysis.pptstructure analysis.ppt
structure analysis.pptMd Polash
 
structure analysis.ppt
structure analysis.pptstructure analysis.ppt
structure analysis.pptAditiDubey89
 
Improved Particle Swarm Optimization
Improved Particle Swarm OptimizationImproved Particle Swarm Optimization
Improved Particle Swarm Optimizationvane sanchez
 
an improver particle optmizacion plan de negocios
an improver particle optmizacion plan de negociosan improver particle optmizacion plan de negocios
an improver particle optmizacion plan de negociosCarlos Iza
 
Small Molecules and siRNA: Methods to Explore Bioactivity Data
Small Molecules and siRNA: Methods to Explore Bioactivity DataSmall Molecules and siRNA: Methods to Explore Bioactivity Data
Small Molecules and siRNA: Methods to Explore Bioactivity DataRajarshi Guha
 

Semelhante a Data Analysis in QSAR (20)

ADMET.pptx
ADMET.pptxADMET.pptx
ADMET.pptx
 
IGARSS2011-I-Ling.ppt
IGARSS2011-I-Ling.pptIGARSS2011-I-Ling.ppt
IGARSS2011-I-Ling.ppt
 
BIM_2010_20_Bioinformatics_Project
BIM_2010_20_Bioinformatics_ProjectBIM_2010_20_Bioinformatics_Project
BIM_2010_20_Bioinformatics_Project
 
SVM-PSO based Feature Selection for Improving Medical Diagnosis Reliability u...
SVM-PSO based Feature Selection for Improving Medical Diagnosis Reliability u...SVM-PSO based Feature Selection for Improving Medical Diagnosis Reliability u...
SVM-PSO based Feature Selection for Improving Medical Diagnosis Reliability u...
 
Automated parameter optimization should be included in future 
defect predict...
Automated parameter optimization should be included in future 
defect predict...Automated parameter optimization should be included in future 
defect predict...
Automated parameter optimization should be included in future 
defect predict...
 
A comparison of particle swarm optimization and the genetic algorithm by Rani...
A comparison of particle swarm optimization and the genetic algorithm by Rani...A comparison of particle swarm optimization and the genetic algorithm by Rani...
A comparison of particle swarm optimization and the genetic algorithm by Rani...
 
A comparison of three chromatographic retention time prediction models
A comparison of three chromatographic retention time prediction modelsA comparison of three chromatographic retention time prediction models
A comparison of three chromatographic retention time prediction models
 
Data Mining Protein Structures' Topological Properties to Enhance Contact Ma...
Data Mining Protein Structures' Topological Properties  to Enhance Contact Ma...Data Mining Protein Structures' Topological Properties  to Enhance Contact Ma...
Data Mining Protein Structures' Topological Properties to Enhance Contact Ma...
 
T180203125133
T180203125133T180203125133
T180203125133
 
A parsimonious SVM model selection criterion for classification of real-world ...
A parsimonious SVM model selection criterion for classification of real-world ...A parsimonious SVM model selection criterion for classification of real-world ...
A parsimonious SVM model selection criterion for classification of real-world ...
 
An Automatic Clustering Technique for Optimal Clusters
An Automatic Clustering Technique for Optimal ClustersAn Automatic Clustering Technique for Optimal Clusters
An Automatic Clustering Technique for Optimal Clusters
 
AI 바이오 (4일차).pdf
AI 바이오 (4일차).pdfAI 바이오 (4일차).pdf
AI 바이오 (4일차).pdf
 
structure analysis.ppt
structure analysis.pptstructure analysis.ppt
structure analysis.ppt
 
structure analysis.ppt
structure analysis.pptstructure analysis.ppt
structure analysis.ppt
 
structure analysis.ppt
structure analysis.pptstructure analysis.ppt
structure analysis.ppt
 
structure analysis.ppt
structure analysis.pptstructure analysis.ppt
structure analysis.ppt
 
Improved Particle Swarm Optimization
Improved Particle Swarm OptimizationImproved Particle Swarm Optimization
Improved Particle Swarm Optimization
 
an improver particle optmizacion plan de negocios
an improver particle optmizacion plan de negociosan improver particle optmizacion plan de negocios
an improver particle optmizacion plan de negocios
 
Small Molecules and siRNA: Methods to Explore Bioactivity Data
Small Molecules and siRNA: Methods to Explore Bioactivity DataSmall Molecules and siRNA: Methods to Explore Bioactivity Data
Small Molecules and siRNA: Methods to Explore Bioactivity Data
 
Cohen
CohenCohen
Cohen
 

Mais de baoilleach

We need to talk about Kekulization, Aromaticity and SMILES
We need to talk about Kekulization, Aromaticity and SMILESWe need to talk about Kekulization, Aromaticity and SMILES
We need to talk about Kekulization, Aromaticity and SMILESbaoilleach
 
Open Babel project overview
Open Babel project overviewOpen Babel project overview
Open Babel project overviewbaoilleach
 
So I have an SD File... What do I do next?
So I have an SD File... What do I do next?So I have an SD File... What do I do next?
So I have an SD File... What do I do next?baoilleach
 
Chemistrify the Web
Chemistrify the WebChemistrify the Web
Chemistrify the Webbaoilleach
 
Universal Smiles: Finally a canonical SMILES string
Universal Smiles: Finally a canonical SMILES stringUniversal Smiles: Finally a canonical SMILES string
Universal Smiles: Finally a canonical SMILES stringbaoilleach
 
What's New and Cooking in Open Babel 2.3.2
What's New and Cooking in Open Babel 2.3.2What's New and Cooking in Open Babel 2.3.2
What's New and Cooking in Open Babel 2.3.2baoilleach
 
Intro to Open Babel
Intro to Open BabelIntro to Open Babel
Intro to Open Babelbaoilleach
 
Protein-ligand docking
Protein-ligand dockingProtein-ligand docking
Protein-ligand dockingbaoilleach
 
Cheminformatics
CheminformaticsCheminformatics
Cheminformaticsbaoilleach
 
Large-scale computational design and selection of polymers for solar cells
Large-scale computational design and selection of polymers for solar cellsLarge-scale computational design and selection of polymers for solar cells
Large-scale computational design and selection of polymers for solar cellsbaoilleach
 
My Open Access papers
My Open Access papersMy Open Access papers
My Open Access papersbaoilleach
 
Improving the quality of chemical databases with community-developed tools (a...
Improving the quality of chemical databases with community-developed tools (a...Improving the quality of chemical databases with community-developed tools (a...
Improving the quality of chemical databases with community-developed tools (a...baoilleach
 
De novo design of molecular wires with optimal properties for solar energy co...
De novo design of molecular wires with optimal properties for solar energy co...De novo design of molecular wires with optimal properties for solar energy co...
De novo design of molecular wires with optimal properties for solar energy co...baoilleach
 
Cinfony - Bring cheminformatics toolkits into tune
Cinfony - Bring cheminformatics toolkits into tuneCinfony - Bring cheminformatics toolkits into tune
Cinfony - Bring cheminformatics toolkits into tunebaoilleach
 
Density functional theory calculations on Ruthenium polypyridyl complexes inc...
Density functional theory calculations on Ruthenium polypyridyl complexes inc...Density functional theory calculations on Ruthenium polypyridyl complexes inc...
Density functional theory calculations on Ruthenium polypyridyl complexes inc...baoilleach
 
Application of Density Functional Theory to Scanning Tunneling Microscopy
Application of Density Functional Theory to Scanning Tunneling MicroscopyApplication of Density Functional Theory to Scanning Tunneling Microscopy
Application of Density Functional Theory to Scanning Tunneling Microscopybaoilleach
 
Towards Practical Molecular Devices
Towards Practical Molecular DevicesTowards Practical Molecular Devices
Towards Practical Molecular Devicesbaoilleach
 
Why multiple scoring functions can improve docking performance - Testing hypo...
Why multiple scoring functions can improve docking performance - Testing hypo...Why multiple scoring functions can improve docking performance - Testing hypo...
Why multiple scoring functions can improve docking performance - Testing hypo...baoilleach
 
Why multiple scoring functions can improve docking performance - Testing hypo...
Why multiple scoring functions can improve docking performance - Testing hypo...Why multiple scoring functions can improve docking performance - Testing hypo...
Why multiple scoring functions can improve docking performance - Testing hypo...baoilleach
 
Improving enrichment rates
Improving enrichment ratesImproving enrichment rates
Improving enrichment ratesbaoilleach
 

Mais de baoilleach (20)

We need to talk about Kekulization, Aromaticity and SMILES
We need to talk about Kekulization, Aromaticity and SMILESWe need to talk about Kekulization, Aromaticity and SMILES
We need to talk about Kekulization, Aromaticity and SMILES
 
Open Babel project overview
Open Babel project overviewOpen Babel project overview
Open Babel project overview
 
So I have an SD File... What do I do next?
So I have an SD File... What do I do next?So I have an SD File... What do I do next?
So I have an SD File... What do I do next?
 
Chemistrify the Web
Chemistrify the WebChemistrify the Web
Chemistrify the Web
 
Universal Smiles: Finally a canonical SMILES string
Universal Smiles: Finally a canonical SMILES stringUniversal Smiles: Finally a canonical SMILES string
Universal Smiles: Finally a canonical SMILES string
 
What's New and Cooking in Open Babel 2.3.2
What's New and Cooking in Open Babel 2.3.2What's New and Cooking in Open Babel 2.3.2
What's New and Cooking in Open Babel 2.3.2
 
Intro to Open Babel
Intro to Open BabelIntro to Open Babel
Intro to Open Babel
 
Protein-ligand docking
Protein-ligand dockingProtein-ligand docking
Protein-ligand docking
 
Cheminformatics
CheminformaticsCheminformatics
Cheminformatics
 
Large-scale computational design and selection of polymers for solar cells
Large-scale computational design and selection of polymers for solar cellsLarge-scale computational design and selection of polymers for solar cells
Large-scale computational design and selection of polymers for solar cells
 
My Open Access papers
My Open Access papersMy Open Access papers
My Open Access papers
 
Improving the quality of chemical databases with community-developed tools (a...
Improving the quality of chemical databases with community-developed tools (a...Improving the quality of chemical databases with community-developed tools (a...
Improving the quality of chemical databases with community-developed tools (a...
 
De novo design of molecular wires with optimal properties for solar energy co...
De novo design of molecular wires with optimal properties for solar energy co...De novo design of molecular wires with optimal properties for solar energy co...
De novo design of molecular wires with optimal properties for solar energy co...
 
Cinfony - Bring cheminformatics toolkits into tune
Cinfony - Bring cheminformatics toolkits into tuneCinfony - Bring cheminformatics toolkits into tune
Cinfony - Bring cheminformatics toolkits into tune
 
Density functional theory calculations on Ruthenium polypyridyl complexes inc...
Density functional theory calculations on Ruthenium polypyridyl complexes inc...Density functional theory calculations on Ruthenium polypyridyl complexes inc...
Density functional theory calculations on Ruthenium polypyridyl complexes inc...
 
Application of Density Functional Theory to Scanning Tunneling Microscopy
Application of Density Functional Theory to Scanning Tunneling MicroscopyApplication of Density Functional Theory to Scanning Tunneling Microscopy
Application of Density Functional Theory to Scanning Tunneling Microscopy
 
Towards Practical Molecular Devices
Towards Practical Molecular DevicesTowards Practical Molecular Devices
Towards Practical Molecular Devices
 
Why multiple scoring functions can improve docking performance - Testing hypo...
Why multiple scoring functions can improve docking performance - Testing hypo...Why multiple scoring functions can improve docking performance - Testing hypo...
Why multiple scoring functions can improve docking performance - Testing hypo...
 
Why multiple scoring functions can improve docking performance - Testing hypo...
Why multiple scoring functions can improve docking performance - Testing hypo...Why multiple scoring functions can improve docking performance - Testing hypo...
Why multiple scoring functions can improve docking performance - Testing hypo...
 
Improving enrichment rates
Improving enrichment ratesImproving enrichment rates
Improving enrichment rates
 

Último

Inclusivity Essentials_ Creating Accessible Websites for Nonprofits .pdf
Inclusivity Essentials_ Creating Accessible Websites for Nonprofits .pdfInclusivity Essentials_ Creating Accessible Websites for Nonprofits .pdf
Inclusivity Essentials_ Creating Accessible Websites for Nonprofits .pdfTechSoup
 
ENG 5 Q4 WEEk 1 DAY 1 Restate sentences heard in one’s own words. Use appropr...
ENG 5 Q4 WEEk 1 DAY 1 Restate sentences heard in one’s own words. Use appropr...ENG 5 Q4 WEEk 1 DAY 1 Restate sentences heard in one’s own words. Use appropr...
ENG 5 Q4 WEEk 1 DAY 1 Restate sentences heard in one’s own words. Use appropr...JojoEDelaCruz
 
Activity 2-unit 2-update 2024. English translation
Activity 2-unit 2-update 2024. English translationActivity 2-unit 2-update 2024. English translation
Activity 2-unit 2-update 2024. English translationRosabel UA
 
Influencing policy (training slides from Fast Track Impact)
Influencing policy (training slides from Fast Track Impact)Influencing policy (training slides from Fast Track Impact)
Influencing policy (training slides from Fast Track Impact)Mark Reed
 
What is Model Inheritance in Odoo 17 ERP
What is Model Inheritance in Odoo 17 ERPWhat is Model Inheritance in Odoo 17 ERP
What is Model Inheritance in Odoo 17 ERPCeline George
 
Visit to a blind student's school🧑‍🦯🧑‍🦯(community medicine)
Visit to a blind student's school🧑‍🦯🧑‍🦯(community medicine)Visit to a blind student's school🧑‍🦯🧑‍🦯(community medicine)
Visit to a blind student's school🧑‍🦯🧑‍🦯(community medicine)lakshayb543
 
Music 9 - 4th quarter - Vocal Music of the Romantic Period.pptx
Music 9 - 4th quarter - Vocal Music of the Romantic Period.pptxMusic 9 - 4th quarter - Vocal Music of the Romantic Period.pptx
Music 9 - 4th quarter - Vocal Music of the Romantic Period.pptxleah joy valeriano
 
Grade 9 Quarter 4 Dll Grade 9 Quarter 4 DLL.pdf
Grade 9 Quarter 4 Dll Grade 9 Quarter 4 DLL.pdfGrade 9 Quarter 4 Dll Grade 9 Quarter 4 DLL.pdf
Grade 9 Quarter 4 Dll Grade 9 Quarter 4 DLL.pdfJemuel Francisco
 
GRADE 4 - SUMMATIVE TEST QUARTER 4 ALL SUBJECTS
GRADE 4 - SUMMATIVE TEST QUARTER 4 ALL SUBJECTSGRADE 4 - SUMMATIVE TEST QUARTER 4 ALL SUBJECTS
GRADE 4 - SUMMATIVE TEST QUARTER 4 ALL SUBJECTSJoshuaGantuangco2
 
4.16.24 Poverty and Precarity--Desmond.pptx
4.16.24 Poverty and Precarity--Desmond.pptx4.16.24 Poverty and Precarity--Desmond.pptx
4.16.24 Poverty and Precarity--Desmond.pptxmary850239
 
How to do quick user assign in kanban in Odoo 17 ERP
How to do quick user assign in kanban in Odoo 17 ERPHow to do quick user assign in kanban in Odoo 17 ERP
How to do quick user assign in kanban in Odoo 17 ERPCeline George
 
ANG SEKTOR NG agrikultura.pptx QUARTER 4
ANG SEKTOR NG agrikultura.pptx QUARTER 4ANG SEKTOR NG agrikultura.pptx QUARTER 4
ANG SEKTOR NG agrikultura.pptx QUARTER 4MiaBumagat1
 
Transaction Management in Database Management System
Transaction Management in Database Management SystemTransaction Management in Database Management System
Transaction Management in Database Management SystemChristalin Nelson
 
Virtual-Orientation-on-the-Administration-of-NATG12-NATG6-and-ELLNA.pdf
Virtual-Orientation-on-the-Administration-of-NATG12-NATG6-and-ELLNA.pdfVirtual-Orientation-on-the-Administration-of-NATG12-NATG6-and-ELLNA.pdf
Virtual-Orientation-on-the-Administration-of-NATG12-NATG6-and-ELLNA.pdfErwinPantujan2
 
Active Learning Strategies (in short ALS).pdf
Active Learning Strategies (in short ALS).pdfActive Learning Strategies (in short ALS).pdf
Active Learning Strategies (in short ALS).pdfPatidar M
 
Choosing the Right CBSE School A Comprehensive Guide for Parents
Choosing the Right CBSE School A Comprehensive Guide for ParentsChoosing the Right CBSE School A Comprehensive Guide for Parents
Choosing the Right CBSE School A Comprehensive Guide for Parentsnavabharathschool99
 
ISYU TUNGKOL SA SEKSWLADIDA (ISSUE ABOUT SEXUALITY
ISYU TUNGKOL SA SEKSWLADIDA (ISSUE ABOUT SEXUALITYISYU TUNGKOL SA SEKSWLADIDA (ISSUE ABOUT SEXUALITY
ISYU TUNGKOL SA SEKSWLADIDA (ISSUE ABOUT SEXUALITYKayeClaireEstoconing
 
ECONOMIC CONTEXT - PAPER 1 Q3: NEWSPAPERS.pptx
ECONOMIC CONTEXT - PAPER 1 Q3: NEWSPAPERS.pptxECONOMIC CONTEXT - PAPER 1 Q3: NEWSPAPERS.pptx
ECONOMIC CONTEXT - PAPER 1 Q3: NEWSPAPERS.pptxiammrhaywood
 

Último (20)

Inclusivity Essentials_ Creating Accessible Websites for Nonprofits .pdf
Inclusivity Essentials_ Creating Accessible Websites for Nonprofits .pdfInclusivity Essentials_ Creating Accessible Websites for Nonprofits .pdf
Inclusivity Essentials_ Creating Accessible Websites for Nonprofits .pdf
 
ENG 5 Q4 WEEk 1 DAY 1 Restate sentences heard in one’s own words. Use appropr...
ENG 5 Q4 WEEk 1 DAY 1 Restate sentences heard in one’s own words. Use appropr...ENG 5 Q4 WEEk 1 DAY 1 Restate sentences heard in one’s own words. Use appropr...
ENG 5 Q4 WEEk 1 DAY 1 Restate sentences heard in one’s own words. Use appropr...
 
Activity 2-unit 2-update 2024. English translation
Activity 2-unit 2-update 2024. English translationActivity 2-unit 2-update 2024. English translation
Activity 2-unit 2-update 2024. English translation
 
Influencing policy (training slides from Fast Track Impact)
Influencing policy (training slides from Fast Track Impact)Influencing policy (training slides from Fast Track Impact)
Influencing policy (training slides from Fast Track Impact)
 
What is Model Inheritance in Odoo 17 ERP
What is Model Inheritance in Odoo 17 ERPWhat is Model Inheritance in Odoo 17 ERP
What is Model Inheritance in Odoo 17 ERP
 
Visit to a blind student's school🧑‍🦯🧑‍🦯(community medicine)
Visit to a blind student's school🧑‍🦯🧑‍🦯(community medicine)Visit to a blind student's school🧑‍🦯🧑‍🦯(community medicine)
Visit to a blind student's school🧑‍🦯🧑‍🦯(community medicine)
 
Music 9 - 4th quarter - Vocal Music of the Romantic Period.pptx
Music 9 - 4th quarter - Vocal Music of the Romantic Period.pptxMusic 9 - 4th quarter - Vocal Music of the Romantic Period.pptx
Music 9 - 4th quarter - Vocal Music of the Romantic Period.pptx
 
Grade 9 Quarter 4 Dll Grade 9 Quarter 4 DLL.pdf
Grade 9 Quarter 4 Dll Grade 9 Quarter 4 DLL.pdfGrade 9 Quarter 4 Dll Grade 9 Quarter 4 DLL.pdf
Grade 9 Quarter 4 Dll Grade 9 Quarter 4 DLL.pdf
 
YOUVE GOT EMAIL_FINALS_EL_DORADO_2024.pptx
YOUVE GOT EMAIL_FINALS_EL_DORADO_2024.pptxYOUVE GOT EMAIL_FINALS_EL_DORADO_2024.pptx
YOUVE GOT EMAIL_FINALS_EL_DORADO_2024.pptx
 
GRADE 4 - SUMMATIVE TEST QUARTER 4 ALL SUBJECTS
GRADE 4 - SUMMATIVE TEST QUARTER 4 ALL SUBJECTSGRADE 4 - SUMMATIVE TEST QUARTER 4 ALL SUBJECTS
GRADE 4 - SUMMATIVE TEST QUARTER 4 ALL SUBJECTS
 
FINALS_OF_LEFT_ON_C'N_EL_DORADO_2024.pptx
FINALS_OF_LEFT_ON_C'N_EL_DORADO_2024.pptxFINALS_OF_LEFT_ON_C'N_EL_DORADO_2024.pptx
FINALS_OF_LEFT_ON_C'N_EL_DORADO_2024.pptx
 
4.16.24 Poverty and Precarity--Desmond.pptx
4.16.24 Poverty and Precarity--Desmond.pptx4.16.24 Poverty and Precarity--Desmond.pptx
4.16.24 Poverty and Precarity--Desmond.pptx
 
How to do quick user assign in kanban in Odoo 17 ERP
How to do quick user assign in kanban in Odoo 17 ERPHow to do quick user assign in kanban in Odoo 17 ERP
How to do quick user assign in kanban in Odoo 17 ERP
 
ANG SEKTOR NG agrikultura.pptx QUARTER 4
ANG SEKTOR NG agrikultura.pptx QUARTER 4ANG SEKTOR NG agrikultura.pptx QUARTER 4
ANG SEKTOR NG agrikultura.pptx QUARTER 4
 
Transaction Management in Database Management System
Transaction Management in Database Management SystemTransaction Management in Database Management System
Transaction Management in Database Management System
 
Virtual-Orientation-on-the-Administration-of-NATG12-NATG6-and-ELLNA.pdf
Virtual-Orientation-on-the-Administration-of-NATG12-NATG6-and-ELLNA.pdfVirtual-Orientation-on-the-Administration-of-NATG12-NATG6-and-ELLNA.pdf
Virtual-Orientation-on-the-Administration-of-NATG12-NATG6-and-ELLNA.pdf
 
Active Learning Strategies (in short ALS).pdf
Active Learning Strategies (in short ALS).pdfActive Learning Strategies (in short ALS).pdf
Active Learning Strategies (in short ALS).pdf
 
Choosing the Right CBSE School A Comprehensive Guide for Parents
Choosing the Right CBSE School A Comprehensive Guide for ParentsChoosing the Right CBSE School A Comprehensive Guide for Parents
Choosing the Right CBSE School A Comprehensive Guide for Parents
 
ISYU TUNGKOL SA SEKSWLADIDA (ISSUE ABOUT SEXUALITY
ISYU TUNGKOL SA SEKSWLADIDA (ISSUE ABOUT SEXUALITYISYU TUNGKOL SA SEKSWLADIDA (ISSUE ABOUT SEXUALITY
ISYU TUNGKOL SA SEKSWLADIDA (ISSUE ABOUT SEXUALITY
 
ECONOMIC CONTEXT - PAPER 1 Q3: NEWSPAPERS.pptx
ECONOMIC CONTEXT - PAPER 1 Q3: NEWSPAPERS.pptxECONOMIC CONTEXT - PAPER 1 Q3: NEWSPAPERS.pptx
ECONOMIC CONTEXT - PAPER 1 Q3: NEWSPAPERS.pptx
 

Data Analysis in QSAR

  • 1. Data analysis in QSAR Noel O’Boyle Dave Palmer, John Mitchell
  • 2. Quantitative Structure-Activity Relationship (QSAR) Also QSPR (Property) Perceive Physical Structure Predict property Propose
  • 3. Quantitative Structure-Activity Relationship (QSAR) Also QSPR (Property) Perceive Physical Structure Predict property Propose Descriptors Molecular weight No.of H-bonding groups Surface area MOE: 180 descriptors
  • 4. Quantitative Structure-Activity Relationship (QSAR) Need to use an estimate of the error Optimise model on training set (2/3) of prediction (CV or bootstrap, but not Test model on test set (1/3) resubstitution) or Build model on training set (2/3 of 2/3) Optimise on test set (1/3 of 2/3) Using the error of prediction Test model on validation set (1/3)
  • 5. Quantitative Structure-Activity Relationship (QSAR) Need to use an estimate of the error Optimise model on training set (2/3) of prediction (CV or bootstrap, but not Test model on test set (1/3) resubstitution) or Build model on training set (2/3 of 2/3) Optimise on test set (1/3 of 2/3) Using the error of prediction Test model on validation set (1/3) Testing must be carried out on a hold-out set from the same distribution as the the set of molecules used for optimisation => don't optimise model on diverse set Assessing Model Fit by Cross-Validation, JCICS, 2003, 43, 579. Beware of q2, J.Mol.Graph.Model., 2002, 20, 269.
  • 6. Feature selection Number of descriptors → Features = descriptors 2n different subsets
  • 7. Parameter optimisation – Models have parameters which should be tuned • Support vector machines – gamma, cost, epsilon
  • 8. QSAR: Feature selection and parameter optimisation • Aim: To find a robust method of simultaneously optimising the parameters and performing feature selection • Prediction of solubility for drug-like molecules – David Palmer – Support vector machines • gamma, cost, epsilon – 127 descriptors • Large search space – stochastic algorithm required
  • 9. Ant Colony Optimisation (ACO) • Insipired by behaviour of ants foraging for food • Ants lay down pheromones, which influence the path taken by other ants. Meanwhile pheromones are evaporating. • Ants’ trails converge to shortest path between nest and food • Ant Colony Optimisation (ACO) – Marco Dorigo, PhD Thesis, 1992. • ACO for feature selection – Shen et al, JCIM, 2005, 45, 1024. • We have extended it to perform simultaneous parameter optimisation
  • 10. Ant Colony Optimisation (ACO) • population of ants (typically 50 to 100) – each ant represents a model – i.e. a subset of descriptors and values for the parameters – each ant has a fitness score, e.g. 10-fold cross validation rmse • it is more likely that an ant will choose a particular descriptor/parameter value in the next iteration if – many ants have chosen it in this iteration (local search), or – many ants have chosen it in their best models to date (global search) • for descriptors/parameter values that are not chosen in the current or best models, the probability that they will be chosen decreases (evaporation)