Using an online database of chemical compounds for the purpose of structure identification

US Environmental Protection Agency (EPA), Center for Computational Toxicology and Exposure

Online databases can be used for structure identification. The Royal Society of Chemistry provides access to an online database containing tens of millions of compounds, and this has been shown to be a very effective platform for the development of structure identification tools. In many cases a compound that is unknown to an investigator is already known in the chemical literature or a reference database, and these “known unknowns” are now commonly available through aggregated internet resources. These types of compounds in commercial, environmental, forensic, and natural product samples can be identified by searching these large aggregated databases, querying by either elemental composition or monoisotopic mass. Searching by elemental composition is the preferred approach, but it is often difficult to determine a unique elemental composition for compounds with molecular weights greater than 600 Da. In these cases, searching by the monoisotopic mass is advantageous. In either case, the search results can be refined by appropriate filtering to identify the compounds. We will report on integrated filtering and search approaches on our aggregated compound database for the purpose of structure identification and review our progress in using the platform for natural product dereplication.
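The two query types named in the abstract are linked by a simple calculation: an elemental composition determines a monoisotopic mass. The following is a minimal sketch of that calculation, not the database's own implementation. The function names and the small element table are assumptions made for illustration; the example formula C22H29N3O is that of Tinuvin 328, which appears later in the slides.

```python
import re

# Monoisotopic masses (Da) of the most abundant isotope of each element.
# Only a handful of elements are listed; extend the table as needed.
MONOISOTOPIC = {
    "C": 12.0,
    "H": 1.00782503207,
    "N": 14.0030740048,
    "O": 15.99491461956,
    "P": 30.97376163,
    "S": 31.97207100,
    "Cl": 34.96885268,
}


def parse_formula(formula):
    """Turn a flat formula such as 'C22H29N3O' into {'C': 22, 'H': 29, ...}."""
    if not re.fullmatch(r"(?:[A-Z][a-z]?\d*)+", formula):
        raise ValueError(f"cannot parse formula: {formula!r}")
    counts = {}
    for symbol, digits in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        if symbol not in MONOISOTOPIC:
            raise ValueError(f"element not in table: {symbol}")
        counts[symbol] = counts.get(symbol, 0) + (int(digits) if digits else 1)
    return counts


def monoisotopic_mass(formula):
    """Sum the monoisotopic masses of all atoms in the formula."""
    counts = parse_formula(formula)
    return sum(MONOISOTOPIC[symbol] * n for symbol, n in counts.items())


if __name__ == "__main__":
    print(f"{monoisotopic_mass('C22H29N3O'):.4f}")
```

The sketch handles only flat formulas without brackets, charges or isotope labels, which is enough to show why a formula query is more specific than a mass query: many different formulas can fall inside the same mass window.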

Using an online database of
chemical compounds for the
purpose of structure
identification
Antony Williams, Valery Tkachenko
and Alexey Pshenichnov
ACS San Francisco
August 2014

Free and Easy
• Everything I will show in terms of ChemSpider
is available for free online today
• To make it easy to “take notes” these slides
are already available at:
www.slideshare.net/AntonyWilliams/

Mass Spectrometry for
Structure ID
• Many applications of mass spectrometry involve the
identification of “knowns”
• Known structures, previously characterized,
previously identified and, increasingly, online
• Dereplication, identification of “other
manufacturers’” materials, metabolites and lipid
analysis can all be supported by existing
databases
• What large database could serve mass spec?

• ~32 million chemicals and growing
• Data sourced from >500 different sources
• Crowdsourced curation and annotation
• Ongoing deposition of data from our
journals and our collaborators
• Structure-centric hub for web-searching
• …and a really big dictionary!!!

For Mass Spectrometrists
• Valuable searches for Mass Spec would be:
• Search the database by mass or formula for
structure identification
• Search subsets of data – e.g. “metabolism”,
pesticides, etc.
• Link structure-based data across the internet
• Provide “programming interfaces” to integrate
• Does ChemSpider provide value to Mass
Spectrometrists?
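The searches listed above can be sketched against a small in-memory table. This is an illustration only and does not call ChemSpider or any of its programming interfaces. The `Record` class, the `source` labels and the function names are hypothetical; the formulas and monoisotopic masses of the three compounds are standard values.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Record:
    name: str
    formula: str
    mono_mass: float  # monoisotopic mass in Da
    source: str       # hypothetical data-source label


RECORDS = [
    Record("caffeine", "C8H10N4O2", 194.0804, "vendor"),
    Record("cholesterol", "C27H46O", 386.3549, "government"),
    Record("Tinuvin 328", "C22H29N3O", 351.2311, "vendor"),
]


def in_subset(record, sources):
    """True if no subset was requested or the record belongs to it."""
    return sources is None or record.source in sources


def search_by_formula(records, formula, sources=None):
    """Exact match on elemental composition, optionally within a subset."""
    return [r for r in records
            if r.formula == formula and in_subset(r, sources)]


def search_by_mass(records, mass, tol_da, sources=None):
    """Match on monoisotopic mass within +/- tol_da, optionally within a subset."""
    return [r for r in records
            if abs(r.mono_mass - mass) <= tol_da and in_subset(r, sources)]


if __name__ == "__main__":
    for hit in search_by_mass(RECORDS, 351.231, 0.002):
        print(hit.name, hit.formula)
```

Passing `sources={"vendor"}` restricts either search to one subset, which mirrors the idea of searching segregated data sources rather than the whole collection.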

Data Source Selection
• The >32 million chemicals include:
• Vendor collections
• Government databases
• Individual/Lab data
• Publication data
• All segregated, allowing for data source
selection

Mass Spec Analysis
Jim Little, Eastman Chemical

Improved Searches
Substructure Search with Mass Filter
352.239 +/- 0.0018
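The mass filter on this slide is an absolute window around a measured value. As a rough sketch of the filtering step alone (the substructure search itself is not shown), the code below converts the window to parts per million and keeps only candidate masses that fall inside it. The function names and the candidate masses are made up for illustration.

```python
def da_to_ppm(center, tol_da):
    """Express an absolute tolerance (Da) as parts per million of the center mass."""
    return tol_da / center * 1e6


def mass_window(center, tol_da):
    """Lower and upper bounds of the window center +/- tol_da."""
    return center - tol_da, center + tol_da


def filter_by_mass(masses, center, tol_da):
    """Keep only the masses that lie within center +/- tol_da."""
    return [m for m in masses if abs(m - center) <= tol_da]


if __name__ == "__main__":
    center, tol = 352.239, 0.0018
    low, high = mass_window(center, tol)
    print(f"window: {low:.4f} to {high:.4f} Da ({da_to_ppm(center, tol):.2f} ppm)")

    # Hypothetical masses of substructure-search hits.
    candidates = [352.2384, 352.2402, 352.2410, 352.2371]
    print(filter_by_mass(candidates, center, tol))
```

A window of 0.0018 Da at this mass corresponds to roughly 5 ppm, a tolerance in the range typically associated with accurate-mass instruments.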

Identification of “Known
Unknowns”
• “Known Unknowns” can be identified by
searching in ChemSpider
• Searching of “segregated” datasets can be
performed
• Datasets can be expanded for specific
projects – for example, natural products ID…

What about ID’ing
“Unknowns”?
• Bring together various spectroscopic
techniques for structure elucidation –
primarily NMR and Mass Spectrometry
• Work to identify substructural fragments
• Use Computer-Assisted Structure
Elucidation

• Index literature related to marine natural
products: 26K articles and growing
• Structure searchable database
• Data includes taxonomy, location and literature
• “Spectral features” generated algorithmically
• Utilize the spectral features for dereplication
• Initially NMR and MS

Web Services Open Up
Collaboration
• Agilent, Bruker, Waters and Thermo all using
or investigating our web-based services for
compound lookup
• Many academic sites integrating directly –
metabonomics, name lookup, mass-based
searching

Results of the ChemSpider Search
in the MarkerLynx Worksheet

Future Developments
• Enhanced support for Multiple Substructures
• Mass to formula conversion
• Expand data sources with MS focus
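Mass to formula conversion can be illustrated with a brute-force enumeration. This is a minimal sketch under stated assumptions, not the planned ChemSpider feature: it considers only C, H, N and O, caps nitrogen and oxygen counts with the hypothetical parameters `max_n` and `max_o`, and applies a simple upper bound on hydrogen count (H ≤ 2C + N + 2) as a plausibility heuristic.

```python
# Monoisotopic masses (Da) of the most abundant isotopes.
MASS = {"C": 12.0, "H": 1.00782503207, "N": 14.0030740048, "O": 15.99491461956}


def formula_string(c, h, n, o):
    """Write element counts in Hill order (C, H, then N, O), omitting zeros and ones."""
    parts = []
    for symbol, count in (("C", c), ("H", h), ("N", n), ("O", o)):
        if count:
            parts.append(symbol + (str(count) if count > 1 else ""))
    return "".join(parts)


def candidate_formulas(target_mass, ppm=5.0, max_n=5, max_o=5):
    """Enumerate CHNO formulas whose monoisotopic mass is within ppm of target_mass."""
    tol = target_mass * ppm / 1e6
    hits = []
    for c in range(int(target_mass // MASS["C"]) + 1):
        for n in range(max_n + 1):
            for o in range(max_o + 1):
                heavy = c * MASS["C"] + n * MASS["N"] + o * MASS["O"]
                remainder = target_mass - heavy
                if remainder < -tol:
                    break  # adding more oxygen only overshoots further
                # The remaining mass must be made up of hydrogen atoms.
                h = round(remainder / MASS["H"])
                if h < 0 or h > 2 * c + n + 2:
                    continue  # more hydrogens than a saturated molecule allows
                mass = heavy + h * MASS["H"]
                if abs(mass - target_mass) <= tol:
                    hits.append((formula_string(c, h, n, o), mass))
    # Best mass agreement first.
    return sorted(hits, key=lambda hit: abs(hit[1] - target_mass))


if __name__ == "__main__":
    for formula, mass in candidate_formulas(351.2311):
        print(f"{formula:12s} {mass:.4f}")
```

Even with only four elements, the number of candidates grows quickly with mass and with the allowed element set, which is consistent with the abstract's point that a unique elemental composition becomes hard to assign for heavier compounds.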

Acknowledgments
• RSC Cheminformatics Team
• James Little, Eastman Chemical Company
• Depositors of data – there are many!

Thank you
Email: williamsa@rsc.org
ORCID: 0000-0002-2668-4821
Twitter: @ChemConnector
Personal Blog: www.chemconnector.com
SLIDES: www.slideshare.net/AntonyWilliams

1. Using an online database of chemical compounds for the purpose of structure identification Antony Williams, Valery Tkachenko and Alexey Pshenichnov ACS San Francisco August 2014

2. Free and Easy • Everything I will show in terms of ChemSpider is available for free online today • To make it easy to “take notes” these slides are already available at: www.slideshare.net/AntonyWilliams/

3. Mass Spectrometry for Structure ID • Many applications of mass spectrometry involve the identification of “knowns” • Known structures, previously characterized, previously identified and, increasingly, online • Dereplication, identification of “other manufacturers’” materials, metabolites and lipid analysis can all be supported by existing databases • What large database could serve mass spec?

4. • ~32 million chemicals and growing • Data sourced from >500 different sources • Crowdsourced curation and annotation • Ongoing deposition of data from our journals and our collaborators • Structure-centric hub for web-searching • …and a really big dictionary!!!

5. ChemSpider

6. What will ChemSpider give us?

7. What will ChemSpider give us?

8. What will ChemSpider give us?

9. What will ChemSpider give us?

10. Spectra: e.g. Cholesterol

11. Spectra

12. For Mass Spectrometrists • Valuable searches for Mass Spec would be: • Search the database by mass or formula for structure identification • Search subsets of data – e.g. “metabolism”, pesticides, etc. • Link structure-based data across the internet • Provide “programming interfaces” to integrate • Does ChemSpider provide value to Mass Spectrometrists?

13. Pre-calculated data

14. Data Source Selection • The >32 million chemicals include: • Vendor collections • Government databases • Individual/Lab data • Publication data • All segregated, allowing for data source selection

15. Data Source Selection - Type

16. Data Source Selection - Individual

17. Mass Spec Analysis Jim Little, Eastman Chemical

18. ChemSpider Interface

19. 1287 Hits Ranked by Defect

20. 1287 Hits Ranked by # of References

21. Top Ranked Hit

22. Tinuvin 328

23. What can I find on ChemSpider?

24. What can I find?

25. What can I find?

26. Source and Purchase…

27. What can I find on ChemSpider?

28. External Calculation Engines

29. What can I find on ChemSpider?

30. and in the RSC Databases..

31. Linked to the Publisher

32. What can I find?

33. And out to Google Patents

34. And What About the Entire Web?

35. The InChI Identifier

36. InChIStrings Hash to InChIKeys

37. Searching Internet by Structure

38. Extended Study Sorting by references

39. Position sorted by references

40. Position 1 only

41. Searching by Monoisotopic Mass

42. Improved Searches Substructure Search with Mass Filter 352.239 +/- 0.0018

43. Identification of “Known Unknowns” • “Known Unknowns” can be identified by searching in ChemSpider • Searching of “segregated” datasets can be performed • Datasets can be expanded for specific projects – for example, natural products ID…

44. We Are Doomed I Tell You!!!

45.

46. http://www.pharma-sea.eu/

47. The PharmaSea Website

48. What about ID’ing “Unknowns”? • Bring together various spectroscopic techniques for structure elucidation – primarily NMR and Mass Spectrometry • Work to identify substructural fragments • Use Computer-Assisted Structure Elucidation

49. • Index literature related to marine natural products: 26K articles and growing • Structure searchable database • Data includes taxonomy, location and literature • “Spectral features” generated algorithmically • Utilize the spectral features for dereplication • Initially NMR and MS

50.

51. Web Services

52. Web Services Open Up Collaboration • Agilent, Bruker, Waters and Thermo all using or investigating our web-based services for compound lookup • Many academic sites integrating directly – metabonomics, name lookup, mass-based searching

53. Results of the ChemSpider Search in the MarkerLynx Worksheet

54. Hit Details in ChemSpider

55. Future Developments • Enhanced support for Multiple Substructures • Mass to formula conversion • Expand data sources with MS focus

56. Acknowledgments • RSC Cheminformatics Team • James Little, Eastman Chemical Company • Depositors of data – there are many!

57. Thank you Email: williamsa@rsc.org ORCID: 0000-0002-2668-4821 Twitter: @ChemConnector Personal Blog: www.chemconnector.com SLIDES: www.slideshare.net/AntonyWilliams

Editor's Notes

MarinLit is “article-centric”, not compound-centric. Compounds are only indexed when they are newly discovered, revised, or new to marine. All compound records link to the paper in which they were first mentioned. They are not linked to subsequent articles that describe them.