# biosortia2prop.pptx

biosortia2prop algorithm


### biosortia2prop.pptx

1. MS2PROP (biosortia2prop): QED properties and Lipinski. https://github.com/patrickchirdon/biosortia Key finding: ms2prop had an R² of 0.73 on the independent test set across all the QED properties; we have an R² of 0.88 (we beat envedabio).
2. Additions to biosortia2prop in progress:
3. An open-source calculator would bring people to the Biosortia web site and would require people to cite you if they used it. We could also build additional ChEMBL models on request. Next goal: finish the calculator, screen your compounds using the PASS program, and find targets!
4. Methods
   - Lasso regression: least absolute shrinkage and selection operator (lasso) regression is a regularized version of linear regression. It adds a regularization term to the cost function using the ℓ1 norm of the weight vector. An important characteristic of lasso regression is that it tends to eliminate the weights of the least important features completely (it sets them to 0). Lasso regression therefore performs feature selection automatically and outputs a sparse model.
   - Elastic net: a middle ground between ridge regression and lasso regression. The regularization term is a simple mix of the ridge and lasso regularization terms, and you can control the mix ratio r. When r = 0, elastic net is equivalent to ridge regression, and when r = 1, it is equivalent to lasso regression. Ridge regression is a good default, but if you suspect that only a few features are actually useful, you should prefer lasso or elastic net, since they tend to reduce the weights of useless features to zero. In general, elastic net is preferred over lasso, since lasso may behave erratically when the number of features is greater than the number of training instances or when several features are strongly correlated.
   - How good is a regression? Statisticians have come up with a tool that is easy to understand, called R². Typically, R² is read as a percentage that ranges from 0% to 100%. The higher it is, the greater the explanatory power of the regression model (the smaller the share of unexplained squares, the better the model). https://medium.com/wwblog/evaluating-regression-models-using-rmse-and-r%C2%B2-42f77400efee
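
The regularization terms and the R² score described on the Methods slide can be sketched in a few lines of plain Python. This is a minimal illustration, not the deck's code: the function names are hypothetical, and the penalty uses the textbook form in which the mix ratio `r` weights the ℓ1 term and `(1 - r) / 2` weights the squared ℓ2 term. In scikit-learn's `ElasticNet`, the corresponding parameters are `l1_ratio` and `alpha`.

```python
def elastic_net_penalty(weights, alpha, r):
    """Elastic-net regularization term added to the cost function.

    r = 1 gives the lasso (l1) penalty, r = 0 gives the ridge (l2) penalty.
    """
    l1 = sum(abs(w) for w in weights)
    l2_squared = sum(w * w for w in weights)
    return r * alpha * l1 + (1 - r) / 2 * alpha * l2_squared


def soft_threshold(value, threshold):
    """Shrink a weight toward zero; weights within the threshold become exactly 0.

    This is the update that lets lasso drop unimportant features.
    """
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0


def r_squared(y_true, y_pred):
    """R^2 = 1 - (unexplained sum of squares / total sum of squares)."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


if __name__ == "__main__":
    weights = [0.5, -2.0, 0.0]  # illustrative values only
    print(elastic_net_penalty(weights, alpha=0.1, r=1.0))  # pure lasso penalty
    print(elastic_net_penalty(weights, alpha=0.1, r=0.0))  # pure ridge penalty
    print([soft_threshold(w, 0.6) for w in weights])       # small weight is zeroed
    print(r_squared([1.0, 2.0, 3.0], [1.1, 1.9, 3.2]))
```

A model that always predicts the mean of the targets scores an R² of 0 under this definition, and a perfect fit scores 1.
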
5. Scikit-learn models of QED test data
   - Solubility: lasso LARS, lasso, and elastic net each had an R² of 1 on test data.
   - Bioavailability: lasso LARS, lasso, and elastic net each had an R² of 1 on test data.
   - Fraction of sp3-hybridized carbons (a measure of selectivity of binding): lasso LARS had an R² of 1 on training data (the first model for lasso LARS to work); lasso and elastic net each had an R² of 1.
   - mGluR5 (autism target), hsp90a (neuroinflammatory target, Charcot-Marie-Tooth disease), calpain 1 (COVID-19 target), and the aphid mortality model: R² of 1 with the elastic, lasso, and lasso_lars models.
6. Lipinski druglikeness: sensitivity 93.7%, specificity 60%, and accuracy 87% on the Lipinski classification. QED is Lipinski plus aromatic ring count and med-chem rules (SMARTS).
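
To make the Lipinski classification and its three reported metrics concrete, here is a minimal sketch in plain Python. It assumes the molecular descriptors have already been computed by a cheminformatics toolkit. The function names are hypothetical, the "at most one violation" cutoff is a common convention rather than something the slides state, and the confusion-matrix counts are made up for illustration (they are not the deck's data).

```python
def lipinski_violations(mol_weight, logp, h_donors, h_acceptors):
    """Count violations of Lipinski's rule of five."""
    checks = [
        mol_weight > 500,   # molecular weight above 500 Da
        logp > 5,           # octanol-water partition coefficient above 5
        h_donors > 5,       # more than 5 hydrogen-bond donors
        h_acceptors > 10,   # more than 10 hydrogen-bond acceptors
    ]
    return sum(checks)


def is_druglike(mol_weight, logp, h_donors, h_acceptors, max_violations=1):
    """Assumed convention: druglike if at most `max_violations` rules are broken."""
    return lipinski_violations(mol_weight, logp, h_donors, h_acceptors) <= max_violations


def classification_metrics(tp, fp, tn, fn):
    """Return (sensitivity, specificity, accuracy) from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)   # fraction of true druglike compounds recovered
    specificity = tn / (tn + fp)   # fraction of non-druglike compounds rejected
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    return sensitivity, specificity, accuracy


if __name__ == "__main__":
    # Illustrative descriptor values for a small, polar molecule.
    print(is_druglike(mol_weight=180.2, logp=1.2, h_donors=1, h_acceptors=4))
    # Illustrative counts only.
    print(classification_metrics(tp=9, fp=2, tn=3, fn=1))
```
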
7. Test data excluded from training data. We trained a Keras neural net on the GNPS natural products mass spec database.
8. Test data excluded from training data
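
One way to keep test data out of the training set is to split by compound identifier, so that no structure in the test set contributes a spectrum to training. The slides do not say how the exclusion was done; the sketch below is an assumption, with a hypothetical `key` field standing in for a structure identifier such as an InChIKey.

```python
def split_without_overlap(records, test_keys):
    """Split records so that no test compound appears in the training set.

    `records` is a list of dicts with a "key" field (hypothetical name) that
    identifies the compound; `test_keys` lists the compounds held out for testing.
    """
    held_out = set(test_keys)
    train = [r for r in records if r["key"] not in held_out]
    test = [r for r in records if r["key"] in held_out]
    return train, test


if __name__ == "__main__":
    # Made-up records: two spectra of compound A, one each of B and C.
    spectra = [
        {"key": "A", "spectrum_id": 1},
        {"key": "A", "spectrum_id": 2},
        {"key": "B", "spectrum_id": 3},
        {"key": "C", "spectrum_id": 4},
    ]
    train, test = split_without_overlap(spectra, test_keys=["A"])
    print(len(train), len(test))
```

Splitting on the compound rather than on the individual spectrum matters when a compound has several spectra, because a random per-spectrum split would place the same structure on both sides.
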
9. For CNS compounds, moderately polar (PSA < 79 Å²) and relatively lipophilic (log P from +0.4 to +6.0) molecules have a high probability of accessing the CNS.
10. FDA test set (independent dataset). The Egan rule considers bioavailability good for compounds with 0 ≤ tPSA ≤ 132 Å² and −1 ≤ logP ≤ 6 [15]. For GI absorption: PSA lower than 142 Å² and log P between −2.3 and +6.8.
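
The threshold rules quoted on these slides (Egan bioavailability, GI absorption, and CNS access) are simple range checks, sketched below in plain Python. The cutoffs are taken from the slide text as written; the function names are hypothetical, and the PSA and logP values are assumed to be precomputed.

```python
def passes_egan(tpsa, logp):
    """Egan bioavailability window as quoted on the slide."""
    return 0 <= tpsa <= 132 and -1 <= logp <= 6


def likely_gi_absorbed(psa, logp):
    """GI absorption window: PSA below 142 A^2, log P from -2.3 to +6.8."""
    return psa < 142 and -2.3 <= logp <= 6.8


def likely_cns_penetrant(psa, logp):
    """CNS access window: PSA below 79 A^2, log P from +0.4 to +6.0."""
    return psa < 79 and 0.4 <= logp <= 6.0


if __name__ == "__main__":
    # Illustrative descriptor values, not taken from the deck.
    for psa, logp in [(60.0, 2.5), (90.0, 2.0), (150.0, 2.0)]:
        print(psa, logp,
              passes_egan(psa, logp),
              likely_gi_absorbed(psa, logp),
              likely_cns_penetrant(psa, logp))
```

Because the CNS window sits inside the GI window, a compound can pass the absorption check while failing the CNS check, which is what lets these rules separate CNS from non-CNS compounds.
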
11. Pesticides: Egan's rule holds for pesticides. EPA environmental toxicity calculator: https://www.epa.gov/chemical-research/toxicity-estimation-software-tool-test
12. Egan's rule can differentiate CNS compounds from non-CNS compounds. For CNS compounds, moderately polar (PSA < 79 Å²) and relatively lipophilic (log P from +0.4 to +6.0) molecules have a high probability of accessing the CNS.
13. Molecular Descriptors for Chemoinformatics, Todeschini and Consonni
14. Properties of pesticides: https://www.uky.edu/Ag/Entomology/PSEP/6environment.html
15. When trying to find a target, all you need is a pharmacophore. Since we know the structures of 140 compounds, we can find the pharmacophores.
   - Pharmacophore search: https://academic.oup.com/nar/article/45/W1/W356/3791213
   - Source of the 140 structures: https://hmdb.ca/metabolites?utf8=%E2%9C%93&quantified=1&blood=1&urine=1&saliva=1&csf=1&feces=1&sweat=1&breast_milk=1&bile=1&amniotic_fluid=1&other_fluids=1&microbial=1&filter=true
   - Malaria predictor: http://chembl.blogspot.com/2020/05/malaria-inhibitor-prediction-platform.html
   - Antiviral predictor: http://crdd.osdd.net/servers/avcpred/
16. Molecular Descriptors for Chemoinformatics (references)
   - Gustafson, DI (1989). Groundwater ubiquity score: a simple method for assessing pesticide leachability. Environ. Toxicol. Chem., 339-357.
   - Papa, E, Castiglioni, S, Gramatica, P, Nikolayenko, V, Kayumov, O and Calamari, D (2004). Screening the leaching tendency of pesticides applied in the Amu Darya Basin (Uzbekistan). Water Res., 38, 3485-3494.
   - Laskowski, DA, Goring, CAI, McCall, PJ and Swann, RL (1982). Terrestrial environment, in Environmental Analysis for Chemicals (ed. RA Conway), Van Nostrand Reinhold Company, New York, pp. 198-240.
   - Gramatica, P and Di Guardo, A (2002). Screening of pesticides for environmental partitioning tendency. Chemosphere, 47, 947-956.
   - Winget, P, Cramer, CJ and Truhlar, DG (2000). Prediction of soil sorption coefficients using a universal solvation model. Environ. Sci. Technol., 34, 4733-4740.
   - Andrews, PR, Craik, DJ and Martin, JL (1984). Functional group contributions to drug-receptor interactions. J. Med. Chem., 27, 1648-1657.
   - Muegge, I (2002). Pharmacophore features of potential drugs. Chem. Eur. J., 8, 1977-1981.
   - Muegge, I (2003). Selection criteria for drug-like compounds. Med. Res. Rev., 23, 302-321.
   - Muegge, I, Heald, SL and Brittelli, D (2001). Simple selection criteria for drug-like chemical matter. J. Med. Chem., 44, 1841-1846.
17. Compounds of interest
   - Ulvan: a plastic similar to PEI. It is useful for COVID-19 and for agricultural viruses.
     https://www.jpost.com/health-and-wellness/could-seaweed-save-humanity-from-covid-19-687775
     https://onlinelibrary.wiley.com/doi/epdf/10.1002/adma.202206367
     https://www.newscientist.com/article/2341170-battery-made-using-seaweed-still-works-after-charging-1000-times/?utm_medium=social&utm_campaign=echobox&utm_source=Facebook#Echobox=1665398953
     https://www.sciencedirect.com/science/article/pii/S2211926418308373
   - Alginate nanoparticles coated with antibodies for tumors (partner with Freenome?)
     https://www.science.org/doi/10.1126/science.abq6990?utm_campaign=SciMag&utm_medium=ownedSocial&utm_source=LinkedIn&cookieSet=1
     https://www.freenome.com/
     https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6432598/
   - Grätzel cells: https://www.researchgate.net/publication/256549423_Brown_seaweed_pigment_as_a_dye_source_for_photoelectrochemical_solar_cells
   - Omega-3
   - Sunscreen
18. Anti-inflammatories
   - https://www.newscientist.com/article/2355262-vagus-nerve-receptors-may-be-key-to-controlling-inflammation/
   - https://uantwerpen.vib.be/group/VincentTimmerman
   - We made an hsp90a model with an R² of 1.
   - Heat shock protein dysregulation is a key part of neuroinflammation in Charcot-Marie-Tooth disease.
   - We also have enough data to build a c-Myc inhibitor model: c-Myc is a cancer target and an anti-aging target (see old mouse become young mouse).
   - https://molecular-cancer.biomedcentral.com/articles/10.1186/s12943-020-01291-6
   - https://rupress.org/jcb/article/220/8/e202103090/212429/The-long-journey-to-bring-a-Myc-inhibitor-to-the
   - https://time.com/6246864/reverse-aging-scientists-discover-milestone/
   - https://www.nationalgeographic.com/magazine/science/article/zombie-cells-could-hold-the-secret-to-alzheimers-cure
19. Potential partnerships
   - https://www.calicolabs.com/ (anti-aging)
   - https://insilico.com/ (phase 0 clinical trial in less than a month)
   - https://www.envedabio.com/ (our algorithm beats envedabio's ms2prop algorithm)
   - https://www.harringtondiscovery.org/ (local, Cleveland University Hospitals, rare and orphan disease drug discovery)
   - https://www.energy.gov/osdbu/small-business-toolbox
   - https://www.trialspark.com/ (CRO for matching labs to clinical trials)
   - Timmerman lab in Belgium? (Charcot-Marie-Tooth disease lab)
   - https://oig.hhs.gov/oei/reports/oei-09-00-00380.pdf
20. Resources
   - https://www.rosettacommons.org/
   - https://openmolecules.org/datawarrior/
   - https://pubchem.ncbi.nlm.nih.gov//edit3/index.html
   - Databases:
     - GNPS library: https://ccms-ucsd.github.io/GNPSDocumentation/gnpslibraries/
     - https://ec.europa.eu/food/plant/pesticides/eu-pesticides-database/start/screen/active-substances
     - https://cbirt.net/meta-ai-releases-esm-metagenomic-atlas-a-repository-of-over-600-million-predicted-protein-structures/
     - https://comptox.epa.gov/genra/
     - https://www.metaboanalyst.ca/MetaboAnalyst/ModuleView.xhtml
     - https://ipb-halle.github.io/MetFrag/projects/metfragweb/
     - https://hmdb.ca/spectra/ms_ms/search
     - https://pubchem.ncbi.nlm.nih.gov/
     - https://www.ebi.ac.uk/chembl/
     - https://ochem.eu/home/show.do
     - https://alphafold.ebi.ac.uk/
     - https://www.rcsb.org/
     - https://cfmid.wishartlab.com/predict
     - https://massbank.eu/MassBank/Search
     - http://www.swissadme.ch