Applications of Computer Software for the Interpretation and Management of Mass Spectrometry Data in Pharmaceutical Science
Mark Bayliss and Antony Williams, Advanced Chemistry Development,
90 Adelaide Street West, Suite 702, Toronto, ON, M5H 3V9, Canada
Abstract
Within the last decade there has been a rapid growth in the adoption of
Mass Spectrometry (MS) as a routine and facile technique not just by a group of
expert level mass spectrometrists, but by a much more diverse group of non-MS
related disciplines. This shift continues to be fueled by a number of factors,
which can be broadly divided into instrumental technologies, the derived
high value of the technique, the cost per sample, the derived information
content, ease of use and software.
Advances in sensitivity, ruggedness, reliability, ease of integration with
High Performance Liquid Chromatography (HPLC), Gas Chromatography (GC)
and other separation techniques and the general ease of operation of MS
instrumentation can all be considered as enabling. Ultimately, the strongest
driver for the wide adoption of MS has been the clear value that the
technique brings to so many different businesses in terms of both sample
throughput and information content per sample. This expansion in the ability to
create data both in terms of volume and in data density per dataset can be
correlated directly with a backlog in the ability to extract, process, store and
report, and thereby create the resulting high information and knowledge content
which is sought. The data generated by the instruments in their various
guises are simply binary bits and bytes; information has to be extracted by
converting those data into information and knowledge. Software therefore
becomes an integral, critical and enabling part of the cycle of information
creation in support of compound development and chemical analysis.
Additional business drivers include the need to reduce development
timelines, a greater understanding of the chemical significance of a particular
development compound and return on investment. All these factors result in a
tremendous business effort focused around streamlined approaches that provide
scientists, managers, and executives the capability to readily obtain, or even
request, the necessary information.
Due to the heterogeneous instrumentation environment and resulting
distribution of data formats, it is challenging to bring together a single universally
applied interface for the data. Data in this sense refers to the different
spectroscopies and other analytical techniques that are commonly used in
support of chemical analysis. The ability to read in raw vendor formats and allow
integrated data-handling has been severely lacking. Efforts have been made to
define common data exchange formats such as JCAMP and NetCDF, and current
efforts using XML are being driven by the ASTM E13 committee. Third
party vendors [1] have also assumed the task of becoming the neutral party to
unify data handling and management. Such third party offerings have become a
crucial component in the effort to build a single corporate spectroscopic
database supporting all instrumentation, not limited to MS but inclusive of NMR,
IR, UV-Vis, Raman and HPLC as shown in Figure 1.
In this chapter we intend to present, review and discuss some of the
non-instrument related software systems that exist for qualitative data extraction and
structural elucidation. During this discussion we will examine the representation
of molecular structures associated with analytical data and the support systems
that are able to store, retrieve and report this information. It is not our intention to
review the archival systems that exist for the long term storage of the physical
datafiles and other associated electronic records. As part of this review we will
include a survey of the creation of commercial and laboratory specific reference
databases and associated searching algorithms. We will also discuss recent
efforts to introduce advanced processing and analysis algorithms to the hands of
the masses, specifically as an aid to data extraction and structure elucidation.
Broadly speaking, we can separate the points of discussion into tools for data
extraction, elucidation, storage, retrieval, reporting and information distribution.
Nine strategies consistently appear in MS-based methods for accelerated
development and have been discussed in detail by Lee [2]. The strategies are
standard methods, template structure identification, databases, screening,
integration, miniaturization, parallel processing, visualization and automation.
These strategies serve to define the attributes of the analytical methods being
applied. High-throughput sample-generating technologies such as biomolecular
screening and combinatorial chemistry can create many thousands of samples,
each requiring the application of one or more forms of analytical chemistry.
Nowadays, the ability to devise, construct, and refine sample-analysis methods,
either chromatographic or spectroscopic, has become as important as
the hardware itself. Today, the need to integrate appropriate method
development strategies with MS processing capabilities is a critical factor in the
modern industrial laboratory.
In chemical and pharmaceutical companies around the world, the
necessity to acquire and analyze analytical data for the abundance of samples is
a critical business requirement. As a result, open-access
laboratories containing highly roboticized instrumentation, such as OpenLynx
from Waters Corporation (formerly Micromass Ltd) [3], the 1100 Series High
Throughput LC/MS System from Agilent [4] and others, are now commonplace.
The careers of professional spectrometrists are now largely focused on the
implementation of optimal techniques to support the users of these laboratories
rather than the standard sample analysis of yesteryear. Decreasing costs and
reduced footprints for the instrumentation, as well as more intuitive software
interfaces for non-specialists and globalization of software platforms such as
Waters Micromass OpenLynx Global Server™ [5], allow the use of
spectroscopic and chromatographic techniques in an open-access laboratory
environment across organizations. Commonly, these laboratories are also likely
to provide NMR, MS, IR, UV-Vis and chromatographic instrumentation. As a
result of these laboratories both standard and hyphenated MS-based techniques
have entered the hands of the masses. It is clear that distinct differences still
exist between the applications of mass spectrometry made available to non-
specialists and those performed by the specialist.
In general, non-specialists are adopting MS instrumentation that
predominantly generates molecular-ion-only spectra with little or no fragmentation.
This is clearly revealed during visits to any of the number of laboratories that
now offer Open Access technologies that enable a chemist with no prior MS
knowledge or experience to submit samples for analysis in a totally automated
manner. In a small number of cases, this has been extended to the inclusion of
MS/MS fragmentation though this appears not to be the norm at this time.
Another example appears in applications that deal with combinatorial
plate analysis, where the data generated include a full high performance
liquid chromatography-MS (LC-MS) run. The ionizing technique is “soft” and
produces for each well in a plate both the parent ion and one or more
chromatographic traces [Total Ion Current (TIC), Extracted Ion Current (XIC),
Diode Array, Chemiluminescent Nitrogen Detector (CLND), Evaporative Light
Scattering Detection (ELSD) and others] to aid in the assay of materials in the
sample. Meanwhile, the traditional spectrometrist is generally more focused on
non-routine analyses which require greater levels of custom method
development, structural elucidation, and studies requiring the usage of accurate
mass LC/MS and LC/MS/MS.
Whether the application provides data for synthetic chemists or expert
spectrometrists, computer software is an essential factor in a successful
analysis. Whether it is the application of advanced chemometric algorithms for
noise-reduction, the association of structural fragments with mass spectral
features, or the management and databasing of the derived information,
computer software applications additional to those required for operation of the
instrument are a necessary and integral part of the analytical information
repertoire that exists for scientists industry-wide.
Extraction of data
Prior to any structural elucidation, the need for data extraction is of
paramount importance. The simplest form of extraction may merely be a case of
selecting a peak of interest in the LC/MS or GC/MS TIC and obtaining a
spectrum for that peak. Background subtraction further improves
the spectral quality by removing contributions from the solvent and any
background ions, thus making the identification of the molecular weight or
spectrally related ions clearer. The automation of background subtraction and
generation of a “cleaned” spectrum is very much the mainstay of all data
processing systems that exist in the marketplace. Of course this method
presupposes either that the elution times of the peaks are known or that the peaks
in the TIC are clearly visible. In the case of Open Access or combinatorial
studies, it is common practice to use additional analog detectors (UV,
ELSD or CLND) to define the retention time of the eluting peaks, which
can then be used to obtain a combined and background subtracted MS
spectrum. This technique certainly adds value when the analysis is not sample
limited and a strong peak exists in the analog detector(s) which can be used to
direct the extraction of the MS spectrum. Detector response does vary from one
detector to another; a compound that lacks a chromophore, for example, gives
little or no UV response. The use of more than one analog detector helps to
minimize this impact.
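The background subtraction step described above can be illustrated with a minimal sketch in Python. It assumes each scan is held as a dictionary mapping a nominal m/z to an intensity; the function name and data layout are hypothetical and do not correspond to any vendor's software.

```python
def background_subtract(peak_scans, background_scans):
    """Average the scans across a peak and subtract the averaged background.

    Each scan is a dict mapping a nominal m/z (int) to an intensity.
    Ions whose intensity does not exceed the background are dropped.
    """
    def average(scans):
        totals = {}
        for scan in scans:
            for mz, intensity in scan.items():
                totals[mz] = totals.get(mz, 0.0) + intensity
        return {mz: total / len(scans) for mz, total in totals.items()}

    peak = average(peak_scans)
    background = average(background_scans)
    cleaned = {}
    for mz, intensity in peak.items():
        residual = intensity - background.get(mz, 0.0)
        if residual > 0:
            cleaned[mz] = residual
    return cleaned
```

The scans passed as `peak_scans` would typically be those under the chromatographic peak, located either in the TIC or from the retention time given by an analog detector, and `background_scans` those taken just before or after it.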
In many cases where the focus of the MS analysis is the extraction of low intensity
components, as in impurity analysis and metabolite determinations, the
chromatographic peak of interest may be obscured by
high background levels resulting from solvents, buffers and other
non-sample-related contamination ions. In addition, in these cases
the concentration of the unknown peak(s) of interest in the sample may be so
low that there may be no response on the UV or other analog detector that can
define the position of the chromatographic peak. It is often the case that the
intensity of the contamination ions or those from the solvents and buffers far
exceeds those arising from the sample related ions and thus extraction by
retention time alone becomes less appealing. In the case of natural product
analysis and metabolism studies, the chromatographic peaks of interest may be
present with a multitude of other peaks that are related to the sample matrix and
thus unwanted. This of course further increases the complexity of the extraction
process. Differentiating sample related peaks from those resulting from the
matrix often requires extensive knowledge of both the samples of interest and
the matrix and thus these tasks are often performed by highly trained mass
spectrometrists with a detailed understanding of the sample and its chemistry. Of
course, if, for a particular sample, a significant knowledge base already exists, it
is possible to use this knowledge as a template for data extraction. This is done
by searching for masses within the dataset that differ by some delta mass (∆M)
from the compound of interest, such as a parent drug compound or synthesis
material. For example, it would be possible to extract mass chromatograms for
the mono-, di-, tri-, … hydroxylated forms of a starting drug structure by extracting
mass chromatograms for (Parent Mass + n × 16), where n is the number of added
oxygen atoms, and then identifying the
presence of chromatographic peaks within these extracted mass
chromatograms. This method is the route of choice in many of the vendor
software packages that exist for metabolite data extraction, and it offers
significant value in being able to extract only sample related
events that exist within the datasets of interest.
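The delta mass approach can be sketched as follows. This is an illustrative Python example only: the scan layout, the function names and the default tolerance are assumptions, and the monoisotopic mass of oxygen is used for the nominal "+16" shift.

```python
OXYGEN = 15.9949  # monoisotopic mass of oxygen, the nominal "+16" shift

def target_masses(parent_mz, max_additions=3, delta=OXYGEN):
    """Masses of the parent plus 1..max_additions delta-mass shifts."""
    return [parent_mz + n * delta for n in range(1, max_additions + 1)]

def extract_ion_chromatogram(scans, target_mz, tolerance=0.5):
    """Sum the intensity near target_mz in every scan.

    scans: list of (retention_time, [(mz, intensity), ...]) tuples.
    Returns a list of (retention_time, summed_intensity) points.
    """
    trace = []
    for retention_time, peaks in scans:
        total = sum(i for mz, i in peaks if abs(mz - target_mz) <= tolerance)
        trace.append((retention_time, total))
    return trace
```

Chromatographic peaks found in the traces returned for each target mass then become candidates for the mono-, di- and tri-hydroxylated forms. Other delta masses can be supplied through the `delta` argument.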
As an aid to data extraction, a number of chemometric algorithms have
been developed over the years to assist in the extraction of sample related
spectra and remove the interference of the background and matrix-based
effects. These algorithms by their very nature do not use any knowledge base
for the extraction process and can be beneficial in cases where the structure
has undergone significant rearrangement relative to the original
parent structure. Examples of such algorithms include Biller-Biemann [6] and
more recently CODA (COmponent Detection Algorithm) reported by Windig [7],
both of which have been integrated into various MS processing software
platforms over the years. Both of these algorithms are effective in removing the
noise resulting from chemical background and electronic noise that exists within
the data. This can be seen in Figure 2 where the trace at the top represents the
original TIC and the trace at the bottom represents the TIC following the
application of CODA. The output from the CODA approach can also be
visualized in the form of individual mass chromatograms. As a generic technique
CODA is most appropriate for the extraction of all peaks contained within the
sample data file, for example in an impurity analysis. In other cases, it may be
desirable to extract only the unique chromatographic peaks present in two or
more data sets. Windig et al. [8] also report on the application of CODA to two
or more datasets and the subsequent comparison of the output to determine only
those components that are unique, an approach referred to as COMPARELCMS. Figure 3
represents such a comparison using the COMPARELCMS process, where the
top trace represents a metabolized trace and the bottom one a control against
which the metabolized sample is compared. As is clearly visible, the top trace
contains a number of peaks that are unique and thus can be investigated further
as potential metabolite or impurity candidates. As in the case of the visualization
of the CODA output, COMPARELCMS can also be visualized as individually
selectable mass chromatograms. Once extracted, the difference in mass from
the starting parent compound can then be rationalized to either a simple
modification of the original structure, or some other more complex structural
rearrangement.
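The idea behind this kind of knowledge-free extraction can be sketched in a few lines of Python. The following is a simplified illustration in the spirit of the mass chromatographic quality index used by CODA, not the published implementation: each mass chromatogram is compared with a smoothed, standardized copy of itself, so that a smooth chromatographic peak scores high while spikes and noise score low. The window width and function name are assumptions.

```python
import math
import statistics

def mcq_index(chromatogram, window=3):
    """Similarity between a mass chromatogram and its smoothed, standardized form.

    chromatogram: list of intensities for one m/z across all scans.
    window: odd width of the moving-average smoothing window (assumed value).
    """
    n = len(chromatogram)
    norm = math.sqrt(sum(x * x for x in chromatogram))
    if norm == 0 or n <= window:
        return 0.0
    # Scale the raw chromatogram to unit length.
    scaled = [x / norm for x in chromatogram]
    # Moving-average smoothing, then mean-centre and divide by the std. dev.
    smoothed = [sum(chromatogram[i:i + window]) / window
                for i in range(n - window + 1)]
    mean = statistics.fmean(smoothed)
    spread = statistics.stdev(smoothed)
    if spread == 0:
        return 0.0
    standardized = [(x - mean) / spread for x in smoothed]
    # Align the raw trace with the centres of the smoothing windows.
    offset = window // 2
    trimmed = scaled[offset:offset + len(standardized)]
    return sum(a * b for a, b in zip(trimmed, standardized)) / math.sqrt(n - window)
```

Mass chromatograms whose index exceeds a chosen threshold would be retained and summed to give a reduced TIC. The threshold is a user setting and is not specified here.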
The isolation of the MS chromatographic peak and its associated mass
can be used in a number of ways: simply as an indicator of molecular weight, as
a means of calculating the empirical formula, or as a trigger for the
generation of tandem MS/MS or MS(n) data, either in an instrument driven MS to
MS/MS or MS(n) switching protocol or via an MS1 targeted method.
Structural Elucidation Using MS Data
The elucidation of chemical structure(s) covers an extremely wide arena
of processes. At its simplest level this may be the calculation of an empirical
formula from the accurately measured mass of an isotopically pure spectral peak. Whilst
calculation of empirical formula does not preclude the use of high resolution MS,
high resolution remains a critical requirement in the determination of spectral peak purity. The
necessity for high mass accuracy and high mass resolution may not be apparent
at first glance. High mass accuracy is the ability to determine the value of the
ionized mass to a significant number of decimal places as discussed below. High
mass resolution is the ability of an MS instrument to separate two or more
masses that have the same nominal value. It is also important to note that a high
mass accuracy instrument is unable to separate isomeric forms of the same
compound as the mass of each component is exactly the same.
A spectrally pure peak is an absolute requirement to ensure the correct
calculation of the center of gravity for the mass spectral peak under investigation
and that it is not biased by the presence of some spectral peak with similar
nominal mass. Such determinations of empirical formula thus require the
molecular weight to be determined to at least 3 decimal places so that
the number of candidate permutations of carbon, hydrogen, nitrogen, oxygen and so on
is minimized (Figure 4). The use of accurate mass determinations need
not be confined to just MS1 or molecular ion peaks. Rather, it has a much wider
applicability when used in conjunction with tandem MS spectral peaks [9]. This
has been found to assist greatly in the determination of structural fragments and
is being widely applied in the study of metabolites, degradants, natural products
and impurity elucidations.
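The effect of mass accuracy on the number of candidate formulae can be illustrated with a small enumeration. The Python sketch below is restricted to C, H, N and O, assumes a neutral monoisotopic mass, and uses a crude hydrogen ceiling in place of proper valence rules; the function name and the element limits are assumptions made for illustration.

```python
# Monoisotopic masses (Da) of the most abundant isotopes.
MASS = {"C": 12.000000, "H": 1.007825, "N": 14.003074, "O": 15.994915}

def candidate_formulae(target_mass, tolerance, max_n=6, max_o=6):
    """Return ((C, H, N, O), mass) pairs within `tolerance` Da of the target.

    Assumes a tolerance below 0.5 Da, so that at most one hydrogen count
    can match each combination of C, N and O.
    """
    hits = []
    for c in range(1, int(target_mass // MASS["C"]) + 1):
        for n in range(max_n + 1):
            for o in range(max_o + 1):
                heavy = c * MASS["C"] + n * MASS["N"] + o * MASS["O"]
                h = round((target_mass - heavy) / MASS["H"])
                # Crude ceiling for a saturated, acyclic CHNO compound.
                if h < 0 or h > 2 * c + n + 2:
                    continue
                mass = heavy + h * MASS["H"]
                if abs(mass - target_mass) <= tolerance:
                    hits.append(((c, h, n, o), mass))
    return hits
```

Calling the function for the same target mass with a tolerance of 0.05 Da and then 0.001 Da shows how the candidate list shrinks as the mass is determined to more decimal places.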
In the example of the fragmentation of tri-ethyl pirimiphos,
two potential fragmentation routes give rise to the nominal mass
m/z 152 (Figure 5). In the first suggested fragmentation route, cleavage occurs at
the oxygen in position number 11 attached to the phosphorus-sulfur moiety. The
charge is retained on this portion of the molecule to give a fragment ion with a
calculated accurate mass of m/z 152.006 (Figure 6a), a delta
mass of 80 mDa from the experimentally recorded mass of m/z 152.086. For
the other fragmentation possibility (Figure 6b), a mass
delta of only 3 mDa is observed between the calculated fragment mass and the
experimentally determined mass. As this example shows,
adjusting the mass accuracy of the fragmentation assignment process to
match that of the instrumentation used to acquire the MS data
reduces the number of false positive fragment possibilities that have to be
reviewed.
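The filtering step can be expressed as a short Python sketch. The function names are hypothetical. The value of m/z 152.006 for the first route and the observed m/z 152.086 are taken from the text; the value 152.083 used for the second route is an assumption, chosen only so that its error equals the quoted 3 mDa, since the text does not give the calculated mass or the sign of the deviation.

```python
def mass_error_mda(calculated_mz, observed_mz):
    """Absolute difference between calculated and observed m/z, in mDa."""
    return abs(observed_mz - calculated_mz) * 1000.0

def plausible_fragments(candidates, observed_mz, tolerance_mda):
    """Names of candidate fragments whose calculated m/z is within tolerance."""
    return [name for name, mz in candidates.items()
            if mass_error_mda(mz, observed_mz) <= tolerance_mda]

candidates = {
    "route_a": 152.006,  # calculated mass quoted in the text (Figure 6a)
    "route_b": 152.083,  # assumed value giving the quoted 3 mDa error
}
accepted = plausible_fragments(candidates, observed_mz=152.086, tolerance_mda=5.0)
```

With a tolerance of 5 mDa only the second route survives; with a tolerance of 100 mDa, typical of a nominal mass assignment, both would have to be reviewed.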
In the pharmaceutical industry, much of the MS-based elucidation
strategy is based on the premise that much of the parent drug structure will be
retained in the metabolites, impurities, or degradants [10]. The resulting
fragment ions associated with unique substructures of the parent compound are
thus also retained. Thus, the unique fragment ions contained in either full scan
or product ion mass spectra of the parent compound serve as the template for
identification. The template structure identification strategy has recently been
illustrated for the profiling of paclitaxel degradants [11].
MS vendors are adept at providing tools for data extraction, quantitation
and compound suggestions, but these often do not include proposed chemical
structures or fragments. The conversion of a spectrum to a structure in a de novo
sense, for example for natural products where no prior sample information exists,
remains an extremely difficult process when MS is used in isolation. In the
majority of cases the conversion of a spectrum to a structure even with all the
advances that have been made in the technology, still requires some starting
information about the sample that has to be used in conjunction with the mass
spectral information. Confirmation of structure by the verification of key mass
spectral ions present in the spectrum forms an extremely powerful technique for
structural analysis around a scaffold of prior information of the sample. The
complement of MS, NMR, other spectroscopies and anecdotal information has
been proven to be necessary for de-novo structural elucidation[12,13]. In these
cases MS provides accurate mass information and thus empirical formulae for
the complete structure and key fragments which can be used during the
elucidation process. Neutral loss analysis of tandem MS data and other
fragmentation techniques indicates the presence of particular structural
features, for example hydroxylation and phosphate moieties.
Additionally, isotopic patterns are highly characteristic, and thus diagnostic,
for structures which are chlorinated or brominated, or which contain sulfur or
some transition metal cations. The incorporation of
NMR data [1H NMR, 13C NMR, 2D NMR data and other relevant techniques]
allows complete atom-to-atom connectivity maps and thus a route to complete
structural identification. These structural elucidations are still typically
undertaken by expert level spectrometrists throughout the industry; however,
expert software systems such as ACD/Structure Elucidator from Advanced
Chemistry Development Inc. are now serving to dramatically reduce the time
and complexity of this process.
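The diagnostic value of isotopic information can be illustrated for chlorine and bromine. The Python sketch below compares the observed ratio of the M+2 and M peak intensities with the ratio expected from approximate natural isotope abundances. It is a simplification: contributions to M+2 from other elements (13C, 18O, 34S) are ignored, and the function names and tolerance are assumptions.

```python
# Approximate natural abundances of the (light, heavy) isotopes.
ABUNDANCE = {"Cl": (0.7578, 0.2422), "Br": (0.5069, 0.4931)}

def expected_m2_ratio(element, count=1):
    """Expected M+2 / M intensity ratio for `count` atoms of the element."""
    light, heavy = ABUNDANCE[element]
    # Binomial pattern: exactly one heavy atom versus all light atoms.
    return count * heavy / light

def suggest_halogens(m_intensity, m2_intensity, tolerance=0.05, max_count=2):
    """Halogen compositions whose expected ratio matches the observed one."""
    observed = m2_intensity / m_intensity
    matches = []
    for element in ABUNDANCE:
        for count in range(1, max_count + 1):
            if abs(observed - expected_m2_ratio(element, count)) <= tolerance:
                matches.append(f"{count} {element}")
    return matches
```

An M+2 peak at roughly one third of the intensity of M therefore points to a single chlorine, while M and M+2 peaks of nearly equal intensity point to a single bromine.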
Where a significant body of knowledge exists for the structure being
elucidated, for example in impurity analysis and metabolism studies, the
difference in mass between the starting compound and the unknown significantly
reduces the number of possibilities that have to be evaluated. In most cases
significant structural information is retained in the spectral information of the
unknown and thus techniques such as spectral correlation, discussed later, offer
advantages. In those cases where significant rearrangement or oxidative
cleavage may have occurred, the remaining part of the structure may be
significantly different from the parent drug. In these situations the fragment ions
are often significantly different from those of the parent compound and thus
spectral correlation approaches may not be as useful in the determination of
structural changes. In practice these types of structural analysis challenges
require evaluation by a spectrometrist and potentially other scientists with a
detailed understanding of the chemistries and possible enzymatic pathways that
are involved.
The method by which a spectrum is obtained can have a significant effect
on the way in which the structure can be elucidated. High energy ionization
techniques such as EI typically result in spectra containing extensive
fragmentation usually with little or no remaining molecular ion spectral
information. Fortunately, standardized instrumental ionization acquisition
conditions ensure that spectra are usually reproducible from instrument to
instrument. These standardized methods of acquisition thus ensure that spectra
can be easily stored in a spectral library and distributed to all groups who
require search access. Spectral databases are discussed later in this chapter.
Low energy ionization techniques such as electrospray and atmospheric
pressure chemical ionization on the other hand typically generate protonated or
deprotonated molecular ions with little or no fragmentation. Fragmentation can
be induced in a number of ways including source induced fragmentation,
fragmentation in a gas filled collision cell or via resonant fragmentation in ion
traps. These low energy spectra, unlike EI spectra, are not acquired under fixed
fragmentation conditions and as such the spectra can be very different. These
differences are further exacerbated when instrument-to-instrument,
vendor-to-vendor and instrument-type variations are taken into account [14].
Whether the spectrum has been obtained as a MS1 full scan experiment or via a
tandem MS/MS acquisition, structural assignment of the spectrum can still be
possible. In the case of the assignment of a full scan MS1 trace such as an EI
GC/MS spectrum, it is important to note that the assignment of the spectrum will be
dependent upon the isotope that is selected for the fragment assignment
procedure. This is clearly seen in the fragmentation of Temazepam
(Figure 7), in which 37Cl contributes a significant amount to the ion intensity of
the fragment ions. Note that the spectrum in this case is assigned using the 35Cl
isotope. In the majority of structural elucidations, however, it is usual
to isolate an individual isotope using the first stage mass filtering capabilities of
the MS instrumentation before collisionally induced dissociation (CID) in a
collision cell or ion trap. In this way the tandem MS spectrum is isotopically pure
and thus every fragment in the spectrum arises from the
selected isotope. The use of high resolution, at the stage of isolation of the MS1
mass of interest, can provide an additional level of confidence ensuring that the
tandem MS spectrum is isotopically pure. In those cases where low resolution
MS1 ion isolation is coupled with high resolution ion detection, the presence of
isobaric masses in the isolation MS1 spectrum can be detected and their
presence taken into account and minimized during the elucidation phases.
Detailed information is also obtained by the observation of sequential
neutral losses to determine the sequence of substructures or “molecular
connectivity” within the analyte [15]. This procedure is analogous to two-
dimensional NMR techniques used to sequentially connect substructures. This
approach has major benefits for those structural modifications whereby the
majority of the structural integrity is maintained. Of course, a familiar example of
molecular connectivity is the determination of the amino acid sequence of a
peptide. Specific neutral losses are indicative of certain amino acids, and the
sequence of these losses can be used to identify the peptide [16].
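The peptide case lends itself to a compact illustration. The Python sketch below reads a sequence from the mass differences between successive fragment ions, using monoisotopic residue masses for a handful of amino acids. The table is deliberately incomplete, the tolerance and function name are assumptions, and leucine and isoleucine cannot be distinguished by mass.

```python
# Monoisotopic residue masses (Da) for a few amino acids; not a full table.
RESIDUE_MASS = {
    "G": 57.02146,
    "A": 71.03711,
    "S": 87.03203,
    "P": 97.05276,
    "V": 99.06841,
    "T": 101.04768,
    "L/I": 113.08406,
}

def read_ladder(fragment_mzs, tolerance=0.01):
    """Translate successive mass losses in a fragment ladder into residues.

    fragment_mzs: m/z values of singly charged fragments from one ion series.
    Losses that match no residue in the table are reported as "?".
    """
    ordered = sorted(fragment_mzs, reverse=True)
    sequence = []
    for heavier, lighter in zip(ordered, ordered[1:]):
        loss = heavier - lighter
        match = next((residue for residue, mass in RESIDUE_MASS.items()
                      if abs(mass - loss) <= tolerance), "?")
        sequence.append(match)
    return sequence
```

The same pattern applies to non-peptide analytes if the residue table is replaced by a table of neutral losses characteristic of the chemistry being studied.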
Owens [17] reports a software based technique of spectral correlation or pattern
matching of MS/MS spectra and the determination of a similarity index as a
means of filtering out those tandem MS spectra which have low correlations with
respect to the parent drug MS/MS spectrum and are thus defined as
endogenous background peaks. Where a high similarity exists, this is indicative
that there are spectral elements that show a high degree of correlation to the
parent drug compound [18]. The subsequent auto-correlation between the
assigned parent drug spectrum and the unknown spectrum can then influence
the identification of the changes in the original parent drug structure and thus
the determination of potential structural modifications. When linked with high
mass accuracy data this technique may offer significant value in expediting the
generation of metabolite or impurity structures.
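The form of the similarity index used in [17] is not given here, so the sketch below substitutes a common stand-in, the cosine similarity between two spectra binned to nominal mass. It is written in Python; the function names and the threshold value are assumptions for illustration only.

```python
import math

def cosine_similarity(spectrum_a, spectrum_b):
    """Cosine similarity of two spectra given as {nominal m/z: intensity}."""
    shared = set(spectrum_a) & set(spectrum_b)
    dot = sum(spectrum_a[mz] * spectrum_b[mz] for mz in shared)
    norm_a = math.sqrt(sum(v * v for v in spectrum_a.values()))
    norm_b = math.sqrt(sum(v * v for v in spectrum_b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)

def drug_related(parent_spectrum, unknown_spectra, threshold=0.5):
    """Keep the unknown spectra that correlate with the parent drug spectrum.

    unknown_spectra: dict mapping a label to a {nominal m/z: intensity} spectrum.
    threshold: assumed cut-off; low-scoring spectra are treated as background.
    """
    return [label for label, spectrum in unknown_spectra.items()
            if cosine_similarity(parent_spectrum, spectrum) >= threshold]
```

Because a metabolite shifts some fragment masses while retaining others, a practical index would usually also compare the spectra after applying the parent-to-unknown mass difference; that refinement is omitted here.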
In the determination of chemical structure using either a manual approach
or via some software driven method or a combination of the two techniques, the
assignment of the spectral fragments remains a key part of the process. To date
the spectral analysis software systems that exist in the industry allow assignment
of the spectrum to a particular proposed structure using a rules based approach,
as the autoassignment example, Figure 7, shows. As with all rules based
approaches, it may not be possible to identify all spectral ions and thus the
intervention of a spectrometrist with a detailed knowledge of the chemistries
being investigated can result in a complete assignment of the fragments to a
proposed structure. Where the software assignment algorithms can provide
major benefit is in the assignment of the majority of spectral peaks when
predicted using the coded rule sets, thus significantly reducing the amount of
time that it takes to perform a series of spectral assignments. Often the
suggestion of a potential fragmentation process using the rules based approach
can act as a source of inspiration when trying to assign compounds that
fragment through more esoteric and undefined routes.
Where structural elucidation uses an underlying knowledge of the
samples and chemistries, fragmentation analysis of the parent drug substance
provides clear indications for structural modification within the structure as
discussed earlier. In cases where a number of potential changes have to be
considered, it may be necessary to validate a series of possible structures
against the spectrum. This may be achieved using
third party tools, in which rules based fragmentation is coupled
with a manual review of the results and, where appropriate, unpredicted
fragmentation routes may be added manually (Figure 8). This capability is
presently delivered by third party software tools [19]. In this example, following
the import of a mass spectrum, a chemical structure is attached using the
molecular structure editor integrated into the program. The lasso tool is used to
encircle a particular fragment, and if a spectral ion corresponding to the mass of
the selected structural fragment exists in the spectrum, the fragment is
highlighted and the assignment is added to the fragment assignment table. In
this way, an entire mass spectrum can be assigned and examined for
consistency with the hypothetical structure. If there is a mixture of components in
a single spectrum due to co-elution, then each component can be individually
assigned.
Structure as a Means of Communication
As a universal language of chemists, structure represents a clear and
concise way to communicate chemistries that form the nucleus of research
efforts. Whilst elucidating a final and complete structure is the
objective for any spectrometrist, in mass spectrometry it is commonly the case
that we are unable to arrive at a finalized structure. In addition, during the
process of structural elucidation, there may be a number of iterative versions of
what the structure may be before arriving at a finalized version. In these cases
the ability to represent structure in some incomplete format, such as a Markush
representation, provides a way of creating and storing a “work-in-progress”
structure (Figure 9). In this example the position of the chloro group can be
intuitively defined as 2, 3, 4, 5 or 6 on the phenyl ring. Whilst this representation
has significant benefits for those cases where all remaining positions in a phenyl
ring are possible points of attachment, in the case where the chloro group may
occupy only the meta and ortho positions the above
shorthand notation clearly has limitations. There have been extensions to the
notation of “generic” chemical structures over the years, including but not limited
to the usage of graphical overlay elements such as boxes (Figure 10) [20,21]
and polymer-like brackets (Figure 11). Whilst these representations of structure
do have value as a means of visualization within reports, they do not convey any
chemical knowledge that can be transformed into programmatically extractable
elements for use in software platforms. Whilst the needs of FDA
regulation 21 CFR Part 11 [22] are not generally applied in the drug discovery
phase of drug development, for example in metabolism identification, these
regulations have in reality set a precedent for the storage of electronic records
where feasible, especially in the latest modification to the FDA 21 CFR Part 11
regulations [23]. Whilst the implementation of 21 CFR Part 11 in the early phases
of drug discovery and development of metabolites, impurities and degradants
can be highly contentious, the need to communicate information in a variety of
electronic formats is very much becoming a requirement across all drug
discovery and development groups within the pharmaceutical industry.
Moreover, the reporting, storage, searching and distribution of electronic
information including structures throughout all industries are becoming more
commonplace. Therefore, any representations of incomplete structure that are
ambiguous create opportunities for miscommunication, resulting in time and thus
financial implications for the industry.
Metabolism groups in a number of the major pharmaceutical companies
have been highly instrumental in encouraging the development of more
advanced Markush structure representations which are
designed to show more clearly the sites of attachment of a particular
substituent or substituents. One such representation, in the form of a shaded Markush from
Advanced Chemistry Development Inc., denotes the point(s) of attachment
using user-definable color shading as shown in Figure 12. This approach has
been extended to more complex structures where the positions of attachment
are discontinuous as depicted in Figure 13. These visual representations are
also understood programmatically as atom-to-atom mappings allowing the
structures to be searched electronically and thus enabled as part of a larger
structurally enabled analytical data management system.
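To illustrate what it means for a Markush representation to be understood programmatically, the sketch below stores each variable substituent together with the atom indices at which it may attach, and enumerates every fully defined structure that the representation covers. This is a minimal data-structure sketch under stated assumptions, not the ACD/Labs format: the class and function names are hypothetical, atom indices stand in for a real connection table, at most one variable substituent is allowed per atom, and symmetry-equivalent positions are not merged.

```python
from dataclasses import dataclass
from itertools import product


@dataclass(frozen=True)
class VariableSubstituent:
    label: str              # e.g. "Cl" or "OH"
    candidate_atoms: tuple  # core atom indices where it may attach


def enumerate_markush(substituents):
    """Yield every fully defined attachment map {label: atom_index}.

    Combinations that place two substituents on the same atom are skipped.
    """
    pools = [s.candidate_atoms for s in substituents]
    for combo in product(*pools):
        if len(set(combo)) == len(combo):
            yield {s.label: atom for s, atom in zip(substituents, combo)}


def markush_covers(substituents, concrete):
    """True if a fully defined attachment map falls within the Markush."""
    return concrete in enumerate_markush(substituents)


# Chloro group anywhere on positions 2 to 6 of a phenyl ring.
chloro_any = VariableSubstituent("Cl", (2, 3, 4, 5, 6))
# A discontinuous set of positions, e.g. ortho and meta only.
chloro_ortho_meta = VariableSubstituent("Cl", (2, 3, 5, 6))
hydroxy_any = VariableSubstituent("OH", (2, 3, 4, 5, 6))
```

Because the attachment positions are explicit, a discontinuous set such as `(2, 3, 5, 6)` is no harder to store or search than a continuous one, which is the limitation of the simple shorthand that the shaded representation addresses.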
Linking structures with analytical data
Attaching a structure or a number of structures to an elucidated spectrum
or chromatogram represents a concise way of reporting our findings as analysts.
It is often the case that we cut and paste structures onto our data either in some
document editing system or potentially in a package designed for spectral
processing and reporting. Moreover, the attachment of structure to the analytical
data, with subsequent database storage or archival of the elucidated data, can
act as an important knowledge system. Searches of meta-data, structures or
data-related features, when coupled together, allow the extraction of compounds
and data that are able to provide key insights for current development needs. In
this case an analytical data archive does not describe an archive of raw data
files; instead it represents a repository of knowledge extracted from and
associated with the data. A data archive, by contrast, generally describes a repository of raw
data which are originally captured at the instrument, collected and deposited into
the archive without further analysis. Vendors of such file-based archive systems
include, for example, NuGenesis Technologies [24]. Databasing is discussed
further later in this chapter.
It is generally easier for a chemist or spectrometrist to remember and
draw structures from memory than it is to remember a series of spectral masses
or analytically determined parameters. Thus, the physical attachment of a
structure or series of structures to the data and their subsequent storage in an
appropriate database, as shown in Figure 14, represents a primary link between
what a chemist is able to remember and an ability to extract that information
quickly and easily from the database using, for example, a chemical structure
search. This allows data to be located quickly and effectively from amongst huge
volumes of data which are created annually within our organizations.
Structure Based Searching
The representation of chemical structures and their attachment to
analytical data in an electronic format is only made useful when linked with
appropriate search engine capabilities. As discussed earlier in the chapter,
structures range from complete structures to Markush representations and to
fragmental structure information. Additionally, stereochemistry and tautomerism
can all affect the performance of structure searching [25]. To date, structure
search engines are usually able to search using full structure, similar structures
and substructure components. In the case of the Markush structural
representation discussed earlier, the search engine has to allow for
Markush structure searching in a variety of ways over and above the standard
structure search capabilities. For example, if a Markush structure is the starting
point for a search then the search engine should be able to return hits containing
completely defined structures that contain modifications that are within the
region incorporated by the Markush inclusion positions, Figure 15. A similar
search performed using a substructural search of the same database returns as
expected a much greater number of structures as indicated in Figure 16. In
addition, if a search is made with a completely defined structure, then it should
be possible to return structures which are represented as Markush
representations containing the functional modifications contained within the
search structure. These capabilities are a part of the ACD/Labs analytical data
management system, ADMS, software suite which includes as a component the
support of MS data processing and database management.
Databasing and Analytical Data Management
An alternative approach to aid in the identification of an unknown is to
perform a spectrum or subspectrum search against a database of known
structures and associated spectra. A simple search based on just a few peaks
from the mass spectrum is possible. McLafferty developed two search
techniques: one based on the probability of certain ions (probability based
matching, PBM), and one based on a collection of chemical fragments associated
with certain fragmentation patterns [26].
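The simple few-peak search mentioned above can be sketched as a presence filter: a library spectrum is a hit if it contains every query peak within a mass tolerance. This is deliberately not PBM, which weights peaks by probability; the function name, the tolerance and the library contents below are assumptions made for illustration.

```python
def peak_search(query_mz, library, tol=0.5):
    """Return names of library spectra containing every query m/z within tol."""
    hits = []
    for name, peaks in library.items():
        if all(any(abs(mz - q) <= tol for mz in peaks) for q in query_mz):
            hits.append(name)
    return hits


# Hypothetical nominal-mass library: compound name -> list of peak m/z values.
library = {
    "compound A": [51, 77, 105, 122],
    "compound B": [43, 58, 71],
    "compound C": [77, 105, 182],
}

hits = peak_search([77, 105], library)
```

Adding further query peaks can only shrink the hit list, so even a handful of well-chosen ions narrows a large library quickly.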
Over the years collections of mass spectra have been assembled by
different groups. The National Institutes of Health (NIH) and Environmental
Protection Agency (EPA) standardized the data collection and analysis of the
data to ensure a high quality aggregation of tens of thousands of spectra. In
addition, Stenhagen, Abrahamsson, and McLafferty collected thousands of mass
spectra to form one of the standard MS electron ionization (EI) reference
databases available today. The standard computer readable collections are
those of the US Government, distributed by NIST, and the McLafferty collection
[27].
The categorization of processed information into databases is a powerful
approach for leveraging the advantages of high throughput analysis schemes.
The implementation of an electronic database storage system represents a
significant change from the way in which many organizations have historically
approached analytical data management. The transition to an electronic storage
system typically requires changes to business practices, requiring some level of
change management for the most effective conversion and implementation
strategies.
Additionally, the consistency of storage for data, structures and associated
alpha-numeric meta-data represents an important aspect that should be
considered during the implementation process. In structural terms, isomers, salt
structures, and tautomers all represent different structural forms that can have
an effect on the route that is adopted for searching. Text-based information,
including naming conventions for compounds, for example metabolite labels, can
be entered in many different ways. If the values are not entered into the
database tables consistently and in the same data fields, this will limit the
effectiveness of the implementation. In operation, simple business rules and
practices are able to alleviate this potential shortcoming.
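One such business rule can be enforced in software at the point of data entry. The sketch below normalizes free-text metabolite labels to a single convention before they are written to the database. The target convention ("M" followed by a number) and the accepted input variants are assumptions chosen for the example, not a standard.

```python
import re


def normalize_label(raw):
    """Map variants such as 'm-1', 'M 1' or 'Metabolite 01' to 'M1'."""
    text = raw.strip().lower()
    match = re.fullmatch(r"(?:metabolite|met|m)[\s\-_]*(\d+)", text)
    if match is None:
        # Reject anything that does not fit the convention rather than guess.
        raise ValueError(f"unrecognized metabolite label: {raw!r}")
    return f"M{int(match.group(1))}"


labels = [normalize_label(s) for s in ["m-1", "Metabolite 01", " M 12 ", "met_3"]]
```

Rejecting unrecognized labels, rather than storing them as entered, keeps the field searchable with a single exact-match query.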
A database represents an easily accessible knowledge management
system containing all structural elucidations that have occurred during the
elucidation process, and acts as a storage container for the wide array of textual and
numeric information that supports our analytical studies. Records
within the database(s) can be accessed flexibly through structure similarity,
structure and substructure searches, user field searches, and spectrum and
subspectrum searches. For example, the identification of
a metabolite structure may require only retention time and molecular weight
information from LC-MS analysis when compared to the metabolite structure
database compiled from previous studies [28]. A further benefit of databases is
the efficient extraction of information. Databases may be “mined” to detect
trends that may not otherwise be noticed. For example, this approach can be used to
reveal trends such as the metabolically active sites of a molecule and/or
substructures labile to degradative conditions. The extension of databases to
include a much wider array of data and information over and above the spectrum
allows searches to be done using a wider array of parameters. This method
provides an efficient mechanism to reduce the number of false positives. The
increasing adoption of high mass accuracy instrumentation represents an
exciting addition to the information content that can be stored within proprietary
and commercial databases.
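The retention time and molecular weight lookup described above can be sketched as a two-window filter over previously stored metabolite records. The record layout, tolerances and values below are hypothetical and serve only to show the shape of the query.

```python
def match_metabolites(rt_min, mw, records, rt_tol=0.2, mw_tol=0.5):
    """Return labels of stored metabolites matching both RT and MW windows."""
    return [
        r["label"]
        for r in records
        if abs(r["rt_min"] - rt_min) <= rt_tol and abs(r["mw"] - mw) <= mw_tol
    ]


# Hypothetical records from previous studies (illustrative values only).
records = [
    {"label": "M1", "rt_min": 12.4, "mw": 401.5},
    {"label": "M2", "rt_min": 14.1, "mw": 401.5},
    {"label": "M3", "rt_min": 12.5, "mw": 417.5},
]

candidates = match_metabolites(12.3, 401.4, records)
```

Requiring both parameters to agree is what removes the false positives: M2 shares the molecular weight and M3 nearly shares the retention time, but only M1 satisfies both windows.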
Once created, a database may be transferred to other laboratories and
facilities that are participating in a particular research activity. The resulting
databases can be distributed via standard server technologies or “web-enabled”
and made accessible via corporate intranets or public internets. Information is
coordinated within the database, and different scientists are able to effectively
pool and merge their information. When implemented early within the product
development cycle, valuable information for later stages in drug development
can be made available [29]. Therefore, this approach provides a comprehensive
method for information gathering whereby future projects are planned,
coordinated, and efficiently supported. In most cases, the information gathering
process is targeted towards the creation of either a single compound report,
some larger series of cross study reports or, in the case of a regulatory
submission, the creation of a compound dossier.
It is worth noting that database creation, modification, and use benefit
greatly from a standard, systematic method. This approach produces reliable
datasets that lend themselves to a highly consistent database format throughout
a project lifetime. While spectral databases can be purchased, these are
generally limited to nominal mass EI data. Since library searching techniques
are limited by the size and nature of the library relative to the particular problem
of the chemist, the creation of user databases is of high value to any
corporation. With today’s technologies allowing the use of low energy
ionization techniques and the generation of accurate mass data, proprietary
databases can certainly be of significantly higher value than commercial databases as they
represent a focused repository of chemistries appropriate to the organization.
The content contained within proprietary databases typically exceeds that
contained within commercial databases, which dramatically increases their value
to an organization. The searches of such databases can be defined according to
a series of options and multiple databases can be searched simultaneously. In
the case of the spectrum of Ovex, when this was searched against the NIST
replicates database, a similar spectrum for Ovex was returned with a similarity
index of 87.9% as shown in Figure 17.
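The text does not state how the similarity index is calculated. As an illustration of how two spectra can be reduced to a single score, the sketch below uses a plain cosine (normalized dot product) of peak intensities at nominal m/z; this is one common choice and is an assumption here, not necessarily the measure used by the NIST or ACD/Labs software. The spectra are hypothetical.

```python
import math


def cosine_similarity(spec_a, spec_b):
    """Cosine score between two spectra given as {nominal m/z: intensity}."""
    shared = set(spec_a) & set(spec_b)
    dot = sum(spec_a[mz] * spec_b[mz] for mz in shared)
    norm_a = math.sqrt(sum(v * v for v in spec_a.values()))
    norm_b = math.sqrt(sum(v * v for v in spec_b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty spectrum matches nothing
    return dot / (norm_a * norm_b)


# Hypothetical spectra; a score of 1.0 means identical relative intensities.
query = {75: 40.0, 111: 100.0, 175: 80.0}
reference = {75: 40.0, 111: 100.0, 175: 80.0}
unrelated = {43: 100.0, 58: 30.0}

score_same = cosine_similarity(query, reference)
score_none = cosine_similarity(query, unrelated)
```

Multiplying the score by 100 gives a percentage comparable in form to the match factor quoted above.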
Search efficiency is increased by imposing additional constraints. As an
example of a multi-step constrained search approach, a search of the NIST
database for a para-substituted benzene sulfonic acid fragment, as a starting
point, gives a total of almost 300 such spectra in the database. This subset of
spectra can then be searched according to variables such as molecular formula,
elemental composition based on elemental analysis, and substructural
components based on identified fragmentations (loss of Ph, CCl3, C(CH3)3 and
so on).
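The multi-step approach can be sketched as a chain of filters, each applied to the survivors of the previous one, with the hit count recorded after every step. The record fields, the placeholder formulas and the three example constraints below are assumptions made for illustration and do not reproduce the NIST search itself.

```python
def constrained_search(records, *constraints):
    """Apply each constraint in turn; return survivors and per-step hit counts."""
    counts = [len(records)]
    for keep in constraints:
        records = [r for r in records if keep(r)]
        counts.append(len(records))
    return records, counts


# Hypothetical library entries; names, formulas and losses are placeholders.
library = [
    {"name": "entry 1", "substructures": {"p-benzenesulfonic acid"},
     "formula": "C12H9ClO3S", "losses": {"Ph"}},
    {"name": "entry 2", "substructures": {"p-benzenesulfonic acid"},
     "formula": "C7H8O3S", "losses": {"CH3"}},
    {"name": "entry 3", "substructures": {"pyridine"},
     "formula": "C12H9ClO3S", "losses": {"Ph"}},
]

hits, counts = constrained_search(
    library,
    lambda r: "p-benzenesulfonic acid" in r["substructures"],  # substructure
    lambda r: r["formula"] == "C12H9ClO3S",                    # formula
    lambda r: "Ph" in r["losses"],                             # observed loss
)
```

Inspecting `counts` shows how much each constraint contributes, which helps decide the order in which to apply them.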
Often, when work is initiated on new project compounds, the use of a
complete spectral database is not possible (e.g. in drug discovery). When
information is stored within a comparative database, compounds of interest can
be effectively searched and identified for use in early to late stages of
development [30]. Database capabilities also permit the use of
substructure-based searches to identify compounds within a specific dataset or
library that contain a distinct substructural entity [31].
Distribution of Spectrometry Data to Chemists
New technology is delivered at almost every new analytical
instrumentation conference. As with standard computer platforms, the cost and
size of MS instrumentation with the same capability continue to drop, resulting
in the proliferation of open-access MS labs supporting chemists in both single
and multiple synthesis environments. Typically, the resulting data are
pre-processed by the generating instrument and provided to the chemist in a
hardcopy format or as a spectral image requiring a vendor-specific viewer.
Both of these scenarios prohibit or limit direct interaction with the spectral
data. While in some cases this is preferable since the data are locked from further
manipulation, as is necessary in a regulated environment, in a research
environment such barriers may limit further analysis. The expense of installing a
copy of vendor software on the desktop of every non-specialist accessing the
MS instrument often renders this level of distribution and flexibility
uneconomical. Additionally, the overhead in training and support needs for such
large distributions, especially in instrumentally heterogeneous environments,
may act as an additional limiting factor. In general, such an approach may be
overkill as most chemists simply want access to the final spectrum and/or
confirmation that the correct product was synthesized. In most cases a simple
determination of molecular weight may be sufficient for such needs.
Traditionally, vendor software provides sophisticated data reduction tools
but limited chemical structure association and reporting capability. An alternative
resolution to this problem is the installation of a third-party structure enabled
desktop processing solution for accessing the data directly over a computer
network, allowing the chemist to further manipulate the data and store the
resulting spectra in a database for further reference. Such an approach offers
additional capability since it is common for a facility to utilize a heterogeneous
mix of hardware platforms whereby spectra are generated. With the capability to
read multiple file formats in their raw binary format, the costs of operation and
the efforts to generate data portability may be significantly reduced.
Integrated Spectroscopic and Chemical Structure Databasing
Integration strategies often encompass separate events involving
instrumentation, methodology, and process. Conventional methods of analysis
involve multiple steps. For example, the identification of natural products
traditionally involves the scale-up of fermentation broths, solvent extraction,
liquid/liquid or column fractionation, chromatographic fraction collection, and
spectroscopic analysis of the individual components. The integration of these
bench-scale steps into dedicated systems provides unique and powerful
advantages for on-line, and perhaps real-time, analysis [31]. Arguably the most
significant bottleneck that exists in industry today is the integration of these
traditional analysis steps with MS processing and analysis.
Discovery chemists and the research and development environments
focus considerable effort on the resolution of components, with direct attention paid to
the actual chemical structures. As a result, for spectroscopic techniques such as
NMR, MS and IR, it is not uncommon to find filing cabinets full of spectra,
relevant scientific literature, and associated information, generally linked to the
chemical structures that gave rise to the spectra. Even though electronic
libraries of chemical structures and MS spectra exist, these libraries are usually
limited to EI data as discussed previously. It is possible to search experimental
MS data against these libraries with the intention to aid in the identification of
possible unknowns. These libraries are, however, not structure or substructure
searchable. The need for the electronic management of experimental
spectra with associated chemical structures is obvious. There
are two general forms of such databases. For spectral-centric solutions the
primary focus of the software is the desktop processing of spectroscopic data,
followed by the concomitant association with chemical structures. Commonly, a
particular facility has access to a structure databasing system from one of the
multiple vendors providing this type of solution. These structure databasing
systems provide a structure-centric solution whereby spectral records are
attached to the structure records in the database for viewing and further
processing.
Spectrometrists and chromatographers utilize a variety of technologies to
both separate and identify chemical structures. It is common in today’s analytical
environment to find teams assembled with skillsets to generate both optimal
separation and analysis solutions. Spectrometrists assign their spectra in
relation to chemical structures using parent ion mass or fragment ion mass
analysis in MS, nucleus-to-peak assignments in NMR and vibrational band
association with IR peaks, for example. Spectrometrists have used the standard
filing system of drawers full of spectra with an association of the file number with
some textual identifier in order to locate the detailed knowledge extracted from
the spectra at a later date. The general level of spectral management has been
limited to hand written notes in notebooks or sometimes text-searchable
databases pointing to associated spectra.
As explained earlier, tools are now available to allow spectra to be
databased in electronic format with associated chemical structures [1a]. In this
manner, the mass spectrometrist now has the opportunity to search the
database for related structures or substructures, or spectral features when
performing fresh analyses. When integrated with other spectral data the result is
a legacy database of multiple spectroscopy data, thereby building a foundation
for future analyses. The value residing in such tools is the time savings that
result for the analysis of related chemicals and the exchange of information
between different analytical laboratories within the same company. In theory,
such an approach should not be isolated to spectrometry; for chromatography,
tools now exist to allow the similar integration of chromatographic peaks and
chemical structures.
Resulting spectra with associated chemical structure(s) carry valuable
information for future analyses. Such resulting files can be stored on a
centralized server and thus become a powerful means for dissemination of the
mass spectrum-structure connectivity and fragment assignment information. This
general approach can be expanded to a World Wide Web intranet approach
whereby the spectra are posted as individual HTML pages with hyperlinked MS
files.
Software solutions available today allow each spectrum to be databased
with associated chemical structures, thereby offering significantly enhanced
capabilities over the common file systems used today in many laboratories. Due
to recent advances in database technology there is enhanced searching
capability over the standard filing cabinet system or a text-based databasing
system. It is possible to search the resulting databases by structure,
substructure, formula, molecular weight, chromatographic and spectroscopic
parameters or user data. User data includes the creation of user-definable
database fields with particular field labels including, for example, submitter,
project name and type of analysis, all of which become searchable fields.
Multiple databases can be searched at one time, thereby allowing different
databases to be constructed according to analysis type, project name, individual
user and so on. These multiple databases can also be distributed across
different departments, divisions or even an entire corporation, simply by using
the ability to point to databases located on mapped network drives.
Corporate-wide database capability engenders concern about the integrity of the
databases. This can be addressed by standard database security features.
Other than the spectrum parameters, the association of individual searchable
user data fields is invaluable, thereby allowing each spectrum in the database to
be associated with a project, a customer, an analyst or any other appropriate
information.
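A search over user-defined fields across several databases can be sketched with Python's built-in `sqlite3` module. The table layout, the field names (submitter, project, analysis) and the use of in-memory databases in place of files on mapped network drives are all assumptions made for this example; it is not a description of any vendor's schema.

```python
import sqlite3

USER_FIELDS = {"submitter", "project", "analysis"}


def make_db(rows):
    """Create an in-memory database holding one user-data record per spectrum."""
    con = sqlite3.connect(":memory:")
    con.execute(
        "CREATE TABLE spectra ("
        "id INTEGER PRIMARY KEY, submitter TEXT, project TEXT, analysis TEXT)"
    )
    con.executemany(
        "INSERT INTO spectra (submitter, project, analysis) VALUES (?, ?, ?)",
        rows,
    )
    return con


def search_all(databases, field, value):
    """Search every database for records whose user field equals value."""
    if field not in USER_FIELDS:
        # Field names cannot be bound as SQL parameters, so whitelist them.
        raise ValueError(f"unknown user field: {field!r}")
    hits = []
    for name, con in databases.items():
        query = f"SELECT id FROM spectra WHERE {field} = ? ORDER BY id"
        for (row_id,) in con.execute(query, (value,)):
            hits.append((name, row_id))
    return hits


# Two hypothetical departmental databases.
db_a = make_db([("jsmith", "P-001", "LC-MS"), ("akhan", "P-002", "GC-MS")])
db_b = make_db([("jsmith", "P-002", "LC-MS")])
hits = search_all({"metabolism": db_a, "impurities": db_b}, "project", "P-002")
```

Each hit carries the name of the database it came from, so results pooled from different departments remain traceable to their source.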
The value of the approach outlined here should be obvious as the ability
to integrate structural information with spectra into a database offers exciting
benefits to the spectrometrist and is an ideal solution for an environment where
multiple spectrometrists need to quickly determine assignments and identify
specific chemical structure classes. The additional benefit of this tool is that it
may also be fully integrated with similar toolsets allowing similar structure-
spectrum management for NMR, MS, UV-Vis, IR and Raman.
Conclusions and Future Prospects
Computer software technologies for the processing and analysis of MS data and
the management of the resulting knowledge are quickly emerging. While it is
almost impossible to define the long term future of MS data processing and
analysis, it is certain that MS systems will continue to become smaller and easier to
use, offering greater levels of automation and on-the-fly decision making. It is
also likely that an increasing amount of data will be acquired with even
higher mass resolution and higher mass accuracies and hence the tools
necessary to manage this data will need to be further developed. The synergistic
coupling of high mass accuracy MS data, MS fragment analysis, and integration
with other forms of spectroscopy, for example LC-NMR-MS, will provide still
further levels of structural detail. The tools which will be delivered in the future
will have to include additional developments in the area of highly automated
processing of thousands of datasets, advances in MS fragmentation and tools
for the creation and searching of accurate mass spectral databases. Such an
approach, when further integrated with spectral processing and databasing for
other techniques (NMR, IR, UV-Vis etc.) will provide a unifying tool for
spectroscopy management.
With further research into statistical and chemometric methods it is hoped
that further techniques will be developed for mass spectral identification.
However, MS, in any of the separate ionization techniques (EI, CI, Electrospray
(ESI), Atmospheric Pressure Chemical Ionization (APCI) and so forth), has
inherent limitations. Only in the presence of additional techniques, such as IR
and NMR, will structure elucidation and verification be more rigorous when
identifying the structure of unknown chemicals.
Figure 1: A multi-spectroscopic display of Alizarin. UV (top), IR (middle) and MS
(bottom) contained within a single display window. This ability allows unified
desktop viewing of data.
Figure 2: Example showing the reduction in chemical and electronic noise using
chemometric algorithms (CODA used for this example). Notice the high levels of
noise and background in the total ion chromatogram and the low relative
intensity of the chromatographically relevant peaks m/z 739 and 1460 mass
regions (Upper panel). The mass spectrum in the top window is for the scan at a
retention time of 17.8 minutes. Notice the low molecular weight components
around m/z 214. Notice the removal of the gradient background after application
of the CODA chemometrics algorithm and the significant decrease in noise level
in the Total Ion Chromatogram (Lower panel). The interface shown is for
ACD/MS Manager.
Figure 3: Following the process of COMPARELCMS, mass chromatograms that
are determined to be unique when a control sample is compared with a
metabolized sample are retained for further review.
Figure 4: An Isotope Pattern Calculator showing Nominal, Average and
Monoisotopic masses
Figure 5: The EI MS spectrum of ethyl pirimiphos [O-[2-(diethylamino)-6-
methylpyrimidin-4-yl] O,O-diethyl thiophosphate] showing two potential fragment
assignments for m/z 152.086, labelled Route 1 and Route 2.
Figure 6: A series of proposed fragment structures with nominal mass m/z 152
for the fragmentation of ethyl pirimiphos. (a) The structure on the left
corresponds to an accurate mass of m/z 152.006, which has a delta mass of 80
milli Da from the experimental data (Figure 5, Route 1). (b) The right hand
structure, with mass m/z 152.082, corresponds to a delta mass of 3 milli Da from
the experimental data (Figure 5, Route 2). (Display extracted from MS Fragmenter,
Advanced Chemistry Development Inc.)
Figure 7: The nominal electron ionization (EI) mass spectrum and structure for
temazepam. Note the significant contribution of the 37Cl isotope ion to the
spectrum, especially at m/z 273, the primary fragment ion, which can have a
significant impact on the fragment assignment of the spectrum.
Figure 8: The assignment of the N-Oxide buspirone MS/MS spectrum using the
“lasso tool” (left inset box, “Stage 1 – Lasso”; the result is the selected
fragment structure). The fragment table lists
assigned fragments. Moving the mouse cursor over the table highlights the
assigned molecular fragment on both the spectrum and the structure.
Figure 10: Incomplete structure representation using graphical elements such as
shaded boxes
Figure 11: Incomplete structure representation using polymer brackets
Figure 12: Suggested hydroxylation of a buspirone metabolite represented using
the shaded Markush structural representation.
Figure 13: Representation of a Markush structure for a discontinuous series of
attachment positions
Figure 14: Structure attachment of Theophylline to its associated EI Spectrum.
Note that the structure is understood at a programmatical level and thus can be
utilized directly in structurally enabled search engines.
Figure 15: The results of a complete structure search of the NIST98 [27]
database using a Markush structure where the positions of the hydroxylation and
chlorination are defined as any of the possible ring positions.
Figure 16: The results of a substructure search of the NIST98 [27] database using a
Markush structure where the positions of the hydroxylation and chlorination are
defined as any of the possible ring positions. Note that 945 possible
structural combinations are returned.
Figure 17: The most similar match (87.9% match factor – see bottom middle) for
the spectral search displays the spectrum of Ovex, from the catalogue of mass
spectra of pesticides from within the NIST replicates database. The structure of
Ovex is consistent with the suggested structure.
REFERENCES
1 Third party vendors providing software solutions for integrated spectroscopy
processing include a) Advanced Chemistry Development Inc., www.acdlabs.com
and b) Thermo,
http://www.thermo.com/eThermo/CDA/Products/Product_Detail/1,1075,22304-
134-X-1-1,00.html
2 Lee, M.S., Kerns, E.H. LC/MS Applications in Drug Development. Mass
Spectrom. Rev. 1999, 18, 187-279
3 Waters Corporation, MS Technologies Centre (Micromass UK Ltd.), Atlas Park
Simonsway, Manchester, M22 5PP, United Kingdom.
4 Agilent Technologies, 5301 Stephens Creek Boulevard, Santa Clara, CA,
95051, USA
5 OpenLynx Global Server™ is a registered trademark of Waters Corporation,
MS Technologies Centre (Micromass UK Ltd.), Atlas Park Simonsway,
Manchester, M22 5PP, United Kingdom.
6 J. E. Biller and K. Biemann, "Reconstructed Mass Spectra, A Novel Approach
For The Utilization Of Gas Chromatograph - Mass Spectrometer Data", Anal.
Letters, 1974, 7 (7), 515-528.
7 Windig, W., Payne, A., Nichols, W., A Noise and Background Reduction
Method for Component Detection in Liquid Chromatography/Mass Spectrometry,
Anal. Chem., 1996, 68, 3602-3606.
8 Comparelcms ref
9 Harland, G, Castro Perez, J., Pugh, J., Leandersson, C., Thompson, R., High
Mass Accuracy Measurements in W-optics using an Orthogonal Hybrid
Quadrupole Time Of Flight Mass Spectrometer for In-Vitro Metabolism Studies.
51st ASMS, Montreal, 2003, TPO 274.
10 Lee, M.S., Yost, R.A., Perchalski, R.J. Tandem Mass Spectrometry and Drug
Metabolism. Annu Rep Med Chem 1986, 21, 313-321.
11 Volk, K.J., Hill, S.E., Kerns, E.H., Lee, M.S. Profiling Degradants of Paclitaxel
Using Liquid Chromatography-Mass Spectrometry and Liquid Chromatography-
Tandem Mass Spectrometry Substructural Techniques. J. Chromatogr. B
Biomed. Sci. 1997, 696, 99-115.
12 Blinov K. A., Carlson D., Elyashberg M.E., Martin G.E.,
Martirosian E.R., Molodtsov, S., Williams, A.J. Computer-assisted structure
elucidation of natural products with limited 2D NMR data: application of
the StrucEluc system., Magn. Reson. Chem. 2003, 41, 359–372
13 Blinov K., Elyashberg M., Martirosian, E.R., Molodtsov, S.G., Williams A.J.,
Tackie, A.N., Maged, M., Sharaf, M.H., Schiff P.L., Crouch, R.C. Jr., Martin G.E.,
Hadden C.E., Guido, J.E., Mills, K.A., Quindolinocryptotackieine: The
Elucidation of a Novel Indoloquinoline Alkaloid Structure through the use of
Computer-Assisted Structure Elucidation and 2D-NMR, In Press.
14 Bristow, A.W.T., Nichols, W.F., Webb, K.S., Conway, B, "The evaluation of
the utility of electrospray in-source collisionally induced dissociation (in-source-
CID) spectral libraries", Rapid Communications in Mass Spectrometry, (2002),
16, 2374 - 2386
15 Lee, M.S., Klohr, S.E., Kerns, E.H., Volk, K.J., Leet, J.E., Schroeder, D.R.,
Rosenberg, I.E. The Coordinated Use of Tandem Mass Spectrometry and High
Resolution Mass Spectrometry for the Structure Elucidation of the Kedarcidin
Chromophore. J. Mass Spectrom. 1996, 31, 1253-1260.
16 Roepstorff, P., Fohlman, J. Proposal for a Common Nomenclature for
Sequence Ions in Mass Spectra of Peptides. Biomed. Mass Spectrom. 1984, 11,
601-602.
17 Owens K.G. Application of correlation analytical techniques to mass spectral
data. Applied Spectroscopy Reviews, 1992, 27, 1-49.
18 Gundersdorf, R.W., Fernandez-Metzler, C.L., King, R. C., Overcoming SRM
Blindness with the Linear Ion Trap., 51st ASMS, Montreal, 2003, WPH 146.
19 Advanced Chemistry Development Inc., Suite 600, 90 Adelaide Street West,
Toronto, ON, M5H 3V9, Canada.
20 Mike S. Lee, Wiley, 2002, LC/MS Applications in Drug Development, ISBN
0-471-40520-5.
21 Lam W., Ramanathan R., In Electrospray Ionization Source
Hydrogen/Deuterium Exchange LC-MS and LC-MS-MS for Characterization of
Metabolites., J. Am. Soc. Mass Spectrom., 13, 345 – 353, 2002.
22 21 CFR Part 11 Regulations,
www.fda.gov/ora/compliance_ref/part11/frs/background/11cfr-fr.htm
23 Guidance for Industry Part 11, Electronic Records; Electronic Signatures –
Scope and Application (Draft), February 2003,
www.fda.gov/cder/guidance/index.htm
24 NuGenesis Technologies Corporation, 1900 West Park Drive, Westborough,
MA, 01581, United States
25 Trepalin, S. V., Skorenko, A. V., Balakin K. V., Nasonov, A.F., Lang, S.A.,
Ivashchenko, A. A., Savchuk, N. P., Advanced Exact Structure Searching in
Large Databases of Chemical Compounds., J. Chem. Inf. Comput. Sci., 2003,
43, 852 – 860.
26 Pesyna G.M., Venkataraghavan R., Dayringer H.E. & McLafferty F.W.
Probability Based Matching System Using a Large Collection of Reference Mass
Spectra. Anal Chem., 1976, 48(9), 1362-1368.
27 The US Government MS database is available from NIST, Office of Standard
Reference Data, Washington DC, 20234. The McLafferty database is available
from John Wiley & Sons, Electronic Publishing Division, 605 Third Avenue, New
York, New York 10158.
28 Kerns, E.H., Rourick, R.A., Volk, K.J., Lee, M.S. Buspirone Metabolite
Structure Profile Using a Standard Liquid Chromatographic-Mass Spectrometric
Protocol. J. Chromatogr. B 1997, 698,133-145.
29 Kerns, E.H., Volk, K.J., Hill, S.E., Lee, M.S. Profiling Taxanes in Taxus
Extracts Using LC/MS and LC/MS/MS Techniques. J. Nat. Prod. 1994, 57, 1391-
1403.
30 Kerns, E.H., Volk, K.J., Hill, S.E., Lee, M.S. Profiling New Taxanes Using
LC/MS and LC/MS/MS Substructural Analysis Techniques. Rapid Commun.
Mass Spectrom. 1995, 9, 1539-1545.
31 Lee, M.S., Kerns, E.H., Hail, M.E., Liu, J., Volk, K.J. Recent Applications of
LC-MS Techniques for the Structure Identification of Drug Metabolites and
Related Compounds. LC-GC, 1997, 15, 542-558.