The Application of Empirical Methods of 13C NMR Chemical Shift Prediction as a
Filter for Determining Possible Relative Stereochemistry.


A short title:
The Application of Empirical NMR Prediction to Determine Stereochemistry

Mikhail E. Elyashberg+, Kirill A. Blinov+ and Antony J. Williams*.

+ Advanced Chemistry Development, Moscow Department, 6 Akademik Bakulev Street,
Moscow 117513, Russian Federation

ChemZoo Inc., 904 Tamaras Circle, Wake Forest, North Carolina 27587




Abstract:

The reliable determination of stereocenters contained within chemical structures usually

requires utilization of NMR data, chemical derivatization, molecular modeling, quantum-

mechanical calculations and, if available, X-ray analysis. In this article we show that the

number of stereoisomers which need to be thoroughly verified can be significantly

reduced by the application of NMR chemical shift calculation to the full stereoisomer set

of possibilities using a fragmental approach based on HOSE codes. The applicability of

this suggested method is illustrated using experimental data published for a series of

complex chemical structures.



Keywords:

NMR, 1H, 13C, chemical shift prediction, stereochemistry.

Introduction




A number of different methods of NMR chemical shift prediction have been applied

to the process of molecular structure elucidation and validation. Empirical methods are

attractive since they are fast and fully automatic. The fastest NMR spectral

calculations use an incremental approach, which offers a computational speed

of 6,000-10,000 chemical shifts per second on a normal desktop computer (circa 2007)

and an average chemical shift deviation for carbon NMR of 1.8 ppm[1,2].

Spectral prediction utilizing artificial neural networks provides similar speed and accuracy[1,2].

The third most popular empirical method is slower and is based on the

application of a database containing reference structures with assigned 13C or 1H

chemical shifts. The target and reference structures are described by means of HOSE

codes[3] and this allows prediction of the chemical shift of an atom from the target

structure using the chemical shifts of the reference structures as the basis. In the

ACD/NMR predictor[4], the prediction algorithms use a library containing 185,000

structures with NMR chemical shifts assigned to carbon and hydrogen atoms. If

information regarding the relative stereochemistry of a given atom ai and its environment

is known then these data are also coded into the reference structures. To predict the

chemical shift of an atom ai in the target structure its HOSE code is compared with the

codes of the corresponding atoms in reference structures. As a result of statistical

processing of the chemical shifts assigned to all “atom-twins” detected in the reference

structures, the chemical shift of an atom from the target structure is predicted. A strategy

based on combining all of the methods mentioned was suggested[5,6]. It allows selection of the

most probable structure from the output file of an expert system developed for molecular

structure elucidation.
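
The database lookup described above can be illustrated with a minimal Python sketch. It is not the algorithm of the ACD/NMR predictor: the simplified code strings (spheres separated by "/", not real HOSE syntax), the tiny reference table, the use of a plain mean as the "statistical processing" and the fallback to shorter codes when no atom-twin is found are all hypothetical choices made for illustration.

```python
from statistics import mean

# Hypothetical reference library: each simplified, HOSE-like code maps to the
# 13C shifts (ppm) assigned to matching atoms ("atom-twins") in reference structures.
REFERENCE_SHIFTS = {
    "CCO/CC,C/HH": [63.1, 64.0, 63.8],  # three spheres of environment
    "CCO/CC,C": [62.0, 65.5],           # two spheres
    "CCO": [60.2, 66.9, 70.1],          # one sphere
}


def predict_shift(code, library):
    """Return (predicted shift, number of spheres matched) for one atom.

    The code is shortened one sphere at a time until atom-twins are found;
    the prediction is the mean of their assigned shifts (an assumption).
    """
    spheres = code.split("/")
    while spheres:
        twins = library.get("/".join(spheres))
        if twins:
            return mean(twins), len(spheres)
        spheres.pop()  # drop the outermost sphere and retry
    raise KeyError(f"no reference atom matches {code!r}")
```

A match on fewer spheres describes a smaller part of the atom's environment, so in a sketch like this the returned sphere count can serve as a rough indicator of how much the prediction should be trusted.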




At the same time a series of articles has been published espousing the value of

ab initio quantum mechanical (QM) approaches for NMR chemical shift calculations (for

instance,[7-12]) and, most frequently, the GIAO option of the DFT method[13] has been

employed for the calculation of 1H and 13C chemical shifts. It was shown that DFT-based

methods can be applied for the selection of a preferable structural hypothesis by means of

comparing the predicted chemical shifts with those determined experimentally. This

approach was also an efficient tool for evaluating the different conformers of flexible

molecules as well as the elucidation of the most probable stereoisomers[13-17].

     In our previous report[18] we have shown that empirical methods of NMR chemical

shift prediction can be successfully used at the selection stage of structural hypotheses

which are verified further with application of molecular geometry optimization and QM

chemical shift prediction. In this regard we hypothesize that empirical methods can help

in preliminary selection of a set of the most probable stereoisomers for their subsequent

verification by additional experimental techniques and QM chemical shift prediction.

This may be possible since the stereocenters of structures included in the ACD/CNMR

database are encoded and stereochemistry is taken into account by the NMR chemical shift prediction

algorithms. The incremental and neural network based algorithms of chemical shift prediction

also use the stereochemistry information related to atoms included in 3- to 6-membered

rings[2]. It was of interest to determine whether this information can be useful for

stereochemistry determination.

     We have tested our hypothesis using a series of examples. We have used examples

from recent literature (2007-2008) for novel structures for which relative stereochemistry

was reported. These structures are known to be absent from the ACD/CNMR database.




The application of empirical methods of 13C NMR chemical shift prediction is shown to

allow the selection of a set of the most probable stereoisomers which, in the examples

studied, always includes the genuine stereoconfiguration.



RESULTS AND DISCUSSION.

     Fattorusso et al.[15] utilized DFT chemical shift computation to confirm the most

probable stereoisomer of artarborol, 1, a rare nor-caryophyllane derivative, isolated by

the authors[15] and structurally characterized by both 1D and 2D NMR spectroscopic

methods.

[Structure 1: artarborol, drawn with atom numbering]

       To select the most probable stereoisomer the authors[15] carried out a series of

investigations. Structure 1 contains five stereogenic carbons (numbered 1-5 on structure

1) with four of them at junctions between the 9-membered ring and the small rings,

while both cis- and trans- junctions of rings adjacent to the nine-membered core are

possible in natural caryophyllanes.

       A combination of 2D ROESY experiments with Mosher’s modified method[19]

was used to assess the absolute configuration of C-2 (R) and allowed the authors[15] to

reduce the total number of possible stereoisomers to the following four (Figure 1):



[Figure 1 image: candidate structures A, B, C and D, drawn with atom numbering]


Figure 1. The four candidate stereoisomer structures of artarborol.



Further selection was made by analyzing the scalar coupling constants and additional

spatial couplings across the entire molecule for which all candidate structures were

subjected to a conformational search. As a result, structures B and D were rejected at the

first step, structure C was then excluded and finally stereoconfiguration A was assigned

to artarborol. To support this stereochemical assignment each conformation of

stereoisomers A and C was fully optimized by the authors[15], and the NMR chemical

shifts were calculated using the GIAO option of the MPW1PW91/6-31G(d,p) DFT

method[20]. A Boltzmann-weighted average of the 13C NMR chemical shifts for all carbon

atoms in the low-energy conformers was calculated for each configuration, using the

ab initio standard free energies as weighting factors[21]. The total processing time for each

molecule was approximately 60 h (PC Pentium IV). A comparison of calculated chemical

shifts with those determined experimentally for structures A and C showed that

deviations were smaller for structure A thereby confirming the validity of the solution.
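
The Boltzmann weighting mentioned above can be sketched in a few lines of Python. This is a generic illustration, not the procedure used in ref. [15]: the function names, the energy unit (kcal/mol) and the default temperature of 298.15 K are assumptions, since the text states only that standard free energies served as weighting factors.

```python
import math

R_KCAL = 0.0019872  # gas constant in kcal/(mol*K)


def boltzmann_weights(free_energies_kcal, temperature_k=298.15):
    """Population weights of conformers from their free energies (kcal/mol)."""
    g_min = min(free_energies_kcal)  # shift energies to avoid tiny exponentials
    factors = [math.exp(-(g - g_min) / (R_KCAL * temperature_k))
               for g in free_energies_kcal]
    total = sum(factors)
    return [f / total for f in factors]


def weighted_shifts(conformer_shifts, free_energies_kcal, temperature_k=298.15):
    """Boltzmann-weighted average shift for each atom.

    conformer_shifts[i][j] is the calculated shift (ppm) of atom j in conformer i.
    """
    weights = boltzmann_weights(free_energies_kcal, temperature_k)
    n_atoms = len(conformer_shifts[0])
    return [sum(w * shifts[j] for w, shifts in zip(weights, conformer_shifts))
            for j in range(n_atoms)]
```

Conformers that lie more than a few kcal/mol above the minimum receive weights close to zero, which is why only the low-energy conformers need to be included in the average.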

          Selection of the most probable stereoisomer was attained as a result of a

comprehensive experimental and theoretical investigation of the compound and its

conceivable 3D models. We investigated what results would be obtained if the problem

were solved using 1D and 2D NMR spectra and the empirical chemical shift prediction

methods implemented in the expert system Structure Elucidator[5,6,22].

         To perform this analysis structure 1 was input into the system and all carbon and

hydrogen atoms were supplied with chemical shifts in accordance with the authors’

assignment. Then all 2^5 = 32 stereoisomers were generated by the program and depicted

using conventional designations for stereobonds. 1H and 13C chemical shifts were

calculated for the complete stereoisomer set using the fragment-based approach within

the Structure Elucidator program. In addition, 13C NMR chemical shifts were calculated

using both neural net (N) and incremental (I) approaches.
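
The count of 32 follows from two possible configurations at each of the five stereogenic carbons. A minimal sketch of such an enumeration is given below; it is not the generator used in Structure Elucidator, and the "R"/"S" strings are only abstract labels introduced here for illustration (the program itself depicts stereobonds). The sketch also assumes that the molecule has no symmetry that would make some assignments identical.

```python
from itertools import product


def enumerate_configurations(n_centers):
    """All 2**n assignments of an 'R' or 'S' label to n stereocenters."""
    return ["".join(labels) for labels in product("RS", repeat=n_centers)]


# Five stereogenic carbons give 2**5 = 32 label strings, from 'RRRRR' to 'SSSSS'.
configurations = enumerate_configurations(5)
```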

         The average deviations of the predicted chemical shifts relative to the

experimental shifts (dA = fragmental approach, dN = NN approach and dI = incremental

approach) were calculated for each of the 32 stereoisomers and all stereoisomers were ranked

in ascending order of the 13C deviation values. Since chemical shifts cannot distinguish

a stereoisomer from its mirror image, the ranked

stereoisomer set was reduced to a sequence of 16 stereoisomer pairs, each pair

having equal deviations. Figure 2 shows the first 8 out of 16 “unique” stereoisomers

ranked in ascending order of the average deviations calculated for the 13C NMR spectrum.

The remaining stereoisomers are characterized by average deviations dA(13C) falling

in the range between 2.49 and 2.90 ppm.
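
The ranking step can be sketched as follows. The text does not give the formula for the average deviation, so the mean absolute difference is assumed here; the function names and the "R"/"S" label strings used to pair each stereoisomer with its mirror image are likewise hypothetical and serve only to show how 32 stereoisomers reduce to 16 ranked entries.

```python
def mean_abs_deviation(predicted, experimental):
    """Average absolute difference (ppm) between paired shift lists."""
    if len(predicted) != len(experimental):
        raise ValueError("shift lists must have the same length")
    return sum(abs(p - e) for p, e in zip(predicted, experimental)) / len(predicted)


def mirror(config):
    """Label of the mirror image: every 'R' becomes 'S' and vice versa."""
    return config.translate(str.maketrans("RS", "SR"))


def rank_unique(predictions, experimental):
    """Rank stereoisomers by deviation, one entry per enantiomer pair.

    predictions maps a label such as 'RSRRS' to its list of predicted shifts.
    """
    best = {}
    for config, shifts in predictions.items():
        key = min(config, mirror(config))  # same key for both mirror images
        d = mean_abs_deviation(shifts, experimental)
        if key not in best or d < best[key]:
            best[key] = d
    # Ascending order: the smallest deviation (most probable) comes first.
    return sorted(best.items(), key=lambda item: item[1])
```

Because the two members of a pair have equal predicted shifts, keeping either one gives the same ranking; the smaller label string is used as the key only to make the choice deterministic.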

         Figure 2 shows that the correct stereoisomer was distinguished both by its 13C and

1H average deviations. Our experience in the field of computer-aided structure

elucidation has shown[22] that the dA(1H) deviation is a less reliable criterion compared

with dA(13C) and it is usually only used for additional confirmation of the most probable




structural isomer[5,6,22]. The difference between the deviations dA(13C) found for the

second and first ranked structures is not large (0.2 ppm), but this value is frequently

observed in the structure elucidation process when the “best structure” is selected[22]. It is

worth noting that in stereoisomers 3, 4, 6 and 9, atoms H-17 and H-19 are situated

on opposite sides of the macrocycle and are unlikely to be close enough in space to show

a ROESY coupling. Since the authors[15] made the final choice between structures A and

C on the basis of a comparison of differences between experimental and calculated 13C

chemical shifts of all carbon atoms we also compared these values (see Figure 3).

[Figure 2 image: structure drawings of the eight stereoisomers. Deviations in ppm; the dA values carry the annotation "v.11.01" in the original figure. Labels A, D and C mark the candidate structures of Figure 1.]

Rank  ID  Label  dA(13C)  dI(13C)  dN(13C)  dA(1H)
1     29  A      1.773    2.791    2.738    0.289
2     4   -      1.959    2.893    2.817    0.313
3     13  -      1.969    2.893    2.817    0.312
4     24  -      1.982    2.893    2.817    0.313
5     8   D      1.998    2.791    2.738    0.293
6     20  C      2.092    2.791    2.738    0.293
7     12  -      2.358    3.643    3.306    0.313
8     33  -      2.364    2.893    2.817    0.309




Figure 2. The first 8 out of 16 stereoisomers ranked in ascending order of the average

deviation dA (13C).




[Figure 3 plot: chemical shift difference (ppm, axis from -10 to 6) against atom number, with one series each for stereoisomers A and C]

Figure 3. A comparison of the 13C chemical shift deviations calculated for the carbon

atoms contained in stereoisomers A and C.

Figure 3 shows that the main difference between the chemical shifts calculated for

structures A and C is observed for atoms 6 and 7. For structure A the calculated values

are markedly closer to the experimental values. The maximum prediction errors are

shown for atoms 3 and 5 at the junction between the macrocycle and the 4-membered

ring. Stereoisomer ranking with dN(13C) and dI(13C) values in general supported the

priority of stereoisomers A-D: these fell into the first four stereoisomers for which all

dN(13C) values and all dI(13C) values proved to be equal (see Supporting Materials, Figure

1S).

           The approach described here looks attractive due to its simplicity and high speed:

the 13C and 1H chemical shift calculations for all 32 isomers took about 2 minutes on a

Pentium IV, 2.8 GHz processor compared to approximately 60 hours per molecule as reported by the

authors of the original paper. It could be useful for the preliminary assessment of a full

stereoisomer set and rejection of clearly improbable structures when the analyzed

molecule is relatively rigid. The reliability of such conclusions can be heuristically


evaluated by visual comparison of the reference structures used for chemical shift

prediction with the target structure. For instance, a series of structures containing the ring

framework of artarborol were shown by the program when examining the chemical shift

prediction protocol. It should be emphasized that the artarborol molecule (a new

compound) was absent from the library of structures included with the ACD/NMR

prediction program. Reference structure 2 is the most similar structure to the artarborol

structure under investigation:

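[Structure 2: reference structure drawn with its assigned 13C chemical shifts (ppm); the values shown on the drawing are 63.65, 27.60, 59.80, 16.90, 29.45, 40.10, 51.50, 24.45, 66.40, 43.75, 44.25, 39.50, 34.60, 21.55 and 29.85]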

We demonstrated that removing structure 2 from the database did not influence the

results: the deviation characteristic for the best stereoisomer was only slightly increased

from 1.773 to 1.799 ppm.

     The described approach was also applied to two new ketopelenolides 3 and 4 which

were separated and scrutinized by the same research group[23]. The stereochemistry

shown in structures 3 and 4 was determined by the authors[23] as a result of conformational

analysis and QM based 13C chemical shift calculation of the most probable stereoisomers.

The calculations were performed in groups of four for each structure (C1-C4 for structure

3 and D1-D4 for structure 4, see Figure 4). It has been shown that C1 corresponds to

stereoisomer 3 and D1 to stereoisomer 4.


[Structures 3 and 4: the two ketopelenolides]

[Figure 4 image: candidate stereoisomers C1, C2, C3 and C4, followed by a second row of four structures]


                        D1                                                     D2                                                 D3                                                D4


Figure 4. The most probable stereoisomers of structures 3 and 4 selected for detailed

theoretical analysis in the work[23].

       Structure Elucidator was used to generate all possible stereoisomers for structures 3
and 4 (in both cases N=64) and to perform NMR chemical shift calculations for all
stereoisomers using empirical methods. 13C chemical shift prediction using the
fragmental method placed stereoisomer C2 in first position in the ranked file and the
genuine stereoisomer C1 in second position, with a difference between their deviations of
only 0.01 ppm. Ranking the stereoisomers by dN(13C) values instead brought
stereoisomers C1-C4 to positions 1-4, with equal dN(13C) and dI(13C) values for all of
the stereoisomers (see Supporting Materials, Figure 2S). For structure 4 the stereoisomers
were ranked by dA(13C) values in the following order: D1 first, D2 second, D3 third and
D4 fifth (see Supporting Materials, Figure 3S). The correct stereoisomer was placed in first
position, and the other most probable stereoisomers selected in[23] were also singled out
by the program as deserving attention.
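
The size of the stereoisomer set can be illustrated with a minimal sketch. It assumes that every stereocenter is independent and can be labelled either R or S, and that the mirror image of a structure is obtained by flipping every label; under those assumptions N=64 corresponds to six stereocenters. The function names are hypothetical and this is not the generator used in Structure Elucidator. Because enantiomers give identical NMR spectra in an achiral medium, only one member of each mirror-image pair needs a chemical shift prediction.

```python
from itertools import product


def enumerate_labelings(n_centers):
    """Return every R/S assignment for n_centers independent stereocenters."""
    return list(product("RS", repeat=n_centers))


def mirror(labeling):
    """Mirror image under the simplifying assumption that every label flips."""
    return tuple("S" if label == "R" else "R" for label in labeling)


def distinct_up_to_mirror(labelings):
    """Keep one representative of each mirror-image pair."""
    seen = set()
    kept = []
    for labeling in labelings:
        if labeling in seen:
            continue
        kept.append(labeling)
        seen.add(labeling)
        seen.add(mirror(labeling))
    return kept


all_isomers = enumerate_labelings(6)          # six centers assumed, so 2**6 labelings
unique_isomers = distinct_up_to_mirror(all_isomers)
print(len(all_isomers), len(unique_isomers))  # prints "64 32"
```

Real molecules can have fewer distinct stereoisomers than this count suggests (for example when symmetry produces meso forms), so the sketch gives an upper bound rather than an exact number.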

       For a preliminary evaluation of the generality of the described approach we
repeated the work using the structures of natural products belonging to a number of
different classes, including steroids, alkaloids, terpenes and cembranoids. A set of such
structures whose relative stereochemistry was recently described in a series of
publications was chosen (see Table 1).




Table 1. Examples of structures for which sets of preferable stereoisomers were selected
using empirical methods of 13C NMR chemical shift prediction. The R and S
designations shown in the structures correspond to the stereochemistry at the particular
stereocenter.


Example   Structure                    Nds, number of    Sr, position of           Ref.
No.                                    stereoisomers     correct stereoisomer

 1        [drawing not reproduced]     1024               1                        [24]
 2        [drawing not reproduced]      256               1                        [25]
 3        [drawing not reproduced]       32               1                        [26]
 4        [drawing not reproduced]       32               1                        [27]
 5        [drawing not reproduced]       64               1                        [28]
 6        [drawing not reproduced]       32               3                        [29]
 7        [drawing not reproduced]      128               3                        [30]
 8        [drawing not reproduced]     2048               3                        [31]
 9        [drawing not reproduced]     1024               3                        [32]
10        [drawing not reproduced]      512               3                        [28]
11        [drawing not reproduced]       64               3                        [33]
12        [drawing not reproduced]       32               4                        [27]
13        [drawing not reproduced]      256               8                        [34]
14        [drawing not reproduced]      256              12                        [35]





All selected structures were supplied with assigned experimental 1H and 13C NMR
chemical shifts. Three similar structures borrowed from earlier publications (of 2003 and
2004) were temporarily removed from the database during our research. For each
molecule a full set of N possible stereoisomers was generated, and the 13C NMR chemical
shifts of the Nds differing stereoisomers (Nds = N/2, N = 2^n, where n is the number of
stereocenters) were calculated by all three mentioned algorithms. The stereoisomer file
was ranked in the same way as in the artarborol case, in ascending order of dA(13C)
values, and the position of the correct stereoisomer, as determined in the corresponding
article, was located in the ranked file. The result of each computational experiment was
characterized by an Sr value, where Sr is the number of stereoisomers for which the
deviation dA(13C) is less than or equal to the deviation calculated for the right
stereoisomer. For instance, Sr = 1 means that the right stereoisomer was ranked first in
the file with deviation dA1(13C), and dA1(13C) < dA2(13C), where dA2(13C) is the
deviation calculated for the stereoisomer ranked in second position. The notation Sr = 4
means that the correct stereoisomer is among the first four stereoisomers in the ranked
file.
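
The ranking and the Sr value described above can be sketched in a few lines. The sketch assumes that the deviation is a mean absolute difference between predicted and experimental shifts, which is an assumption made here for illustration; the stereoisomer names and deviation values are invented and do not come from Table 1. The file is sorted with the smallest deviation first, which is what Sr = 1 for the top-ranked stereoisomer implies.

```python
def mean_abs_deviation(predicted, experimental):
    """Average absolute difference (ppm) between predicted and experimental shifts."""
    if len(predicted) != len(experimental):
        raise ValueError("shift lists must have the same length")
    total = sum(abs(p - e) for p, e in zip(predicted, experimental))
    return total / len(experimental)


def rank_stereoisomers(deviations):
    """Return stereoisomer names sorted so the smallest deviation comes first."""
    return sorted(deviations, key=deviations.get)


def s_r(deviations, correct):
    """Count stereoisomers whose deviation is <= that of the correct one."""
    threshold = deviations[correct]
    return sum(1 for value in deviations.values() if value <= threshold)


# Invented deviations (ppm) for four hypothetical stereoisomers.
deviations = {"iso_a": 1.52, "iso_b": 1.49, "iso_c": 1.60, "iso_d": 1.49}

ranked = rank_stereoisomers(deviations)
print(ranked)                      # ['iso_b', 'iso_d', 'iso_a', 'iso_c']
print(s_r(deviations, "iso_a"))    # 3
print(s_r(deviations, "iso_b"))    # 2, because iso_d ties with it
```

Counting with "less than or equal" means that a tie with the correct stereoisomer increases Sr, so Sr = 1 is reached only when the correct stereoisomer has a strictly smaller deviation than every other candidate.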

       Table 1 shows that our suggested approach can indeed be used for selecting a set
of the most probable stereoisomers from all possible members of the family. Even for
rather complex structures the preferable stereoisomer was ranked early in the set.
Stereoisomer ranking using dN(13C) is not as effective as ranking using dA(13C), but
nevertheless the right stereoisomer most frequently fell within the first 8 ranked
stereoisomers. Consequently, the neural net approach can be used for preliminary ranking
of the stereoisomer file before spectrum prediction by the fragmental method, as is
common in the Structure Elucidator system[6]. When NOESY/ROESY data were available
from the corresponding articles, applying these data to the structures in the top sets
(Sr = 3-12) allowed us to conclude that the right stereoisomer is also the one preferred
algorithmically. Examples of several top-ranked sets of stereoisomers are presented in the
Supporting Materials.
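
The two-stage use of the predictors mentioned above can be sketched as follows. Here fast_score stands in for a quick deviation such as dN(13C) and accurate_score for a slower but more reliable one such as dA(13C); both are placeholder callables, and the function name, candidate names and numbers are all invented for illustration. The default shortlist size of 8 simply echoes the observation that the right stereoisomer most often fell within the first 8 ranked stereoisomers.

```python
def two_stage_rank(candidates, fast_score, accurate_score, keep=8):
    """Pre-rank with a cheap score, then re-rank only the shortlist with a costlier one."""
    shortlist = sorted(candidates, key=fast_score)[:keep]
    return sorted(shortlist, key=accurate_score)


# Invented deviations (ppm) for five hypothetical stereoisomers.
fast = {"s1": 2.1, "s2": 1.9, "s3": 2.5, "s4": 2.0, "s5": 3.0}
accurate = {"s1": 1.4, "s2": 1.6, "s3": 1.2, "s4": 1.5, "s5": 1.1}

result = two_stage_rank(list(fast), fast.get, accurate.get, keep=3)
print(result)  # ['s1', 's4', 's2']
```

In this invented example the candidates with the best accurate scores (s5 and s3) never reach the second stage because the fast score ranks them poorly. The shortlist size therefore trades computing time against the risk of discarding the right stereoisomer too early.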



Computational Details.

All calculations were performed using ACD/NMR Predictor version 11.00 on a personal
computer equipped with a 2.8 GHz Intel processor and 2 GB of RAM and running the
Windows 2000 operating system. All computer programs are an integral part of the
Structure Elucidator expert system. Other than supplying a set of structures,
stereoisomer generation and NMR chemical shift calculation require no intervention
from the chemist and are performed fully automatically.



Conclusions.

The possibility of applying empirical methods of 13C NMR chemical shift prediction to
select a set of the most probable stereoisomers related to a given chemical structure
has been shown for a series of examples. Application of this approach to the elucidation
of the preferred stereoisomer of artarborol has been considered in more detail. We
selected the most probable stereoisomer of artarborol using a simple and fast empirical
method of chemical shift prediction based on HOSE codes. We suggest that it is worth
employing this approach for the preliminary evaluation of all possible stereoisomers
generated by the expert system Structure Elucidator. We expect that this approach will
show general utility when the analyzed structure is relatively rigid and the reference
structures used for chemical shift prediction contain large common fragments with stereo
assignments. This approach can markedly reduce the number of stereoisomers that should
be thoroughly investigated on the basis of NOE correlations, coupling constant values
and quantum-mechanical calculations to finally establish the preferable stereoisomer. The
method can be enhanced by utilizing the methodology suggested in our work[36], and vice
versa: if the starting stereoisomer fed as input to the genetic algorithm for prediction is
close to the right one, the genetic algorithm will complete the calculations in a shorter
time.

        To continue to develop an optimal strategy and deduce further practical

recommendations it is necessary to investigate a larger set of diverse structures. In this

way we can further refine our methods of NMR chemical shift prediction and make them

more sensitive to relative stereochemistry. For this aim a statistically relevant collection

of material must be accumulated and generalized. This work is in progress, and results

will be presented in our next publication.



References

[1]     Blinov KA, Smurnyy YD, Elyashberg ME, Churanova TS, Kvasha M, Steinbeck C, Lefebvre BE, Williams AJ. J. Chem. Inf. Model. 2008; 48: 550.

[2]     Smurnyy YD, Blinov KA, Churanova TS, Elyashberg ME, Williams AJ. J. Chem. Inf. Model. 2008; 48: 128.

[3]     Bremser W. Anal. Chim. Act. Comp. Techn. Optimiz. 1978; 2: 355.

[4]     ACD/NMR Predictor v.11. Advanced Chemistry Development, Toronto, Canada.

[5]     Blinov KA, Carlson D, Elyashberg ME, Martin GE, Martirosian ER, Molodtsov SG, Williams AJ. Magn. Reson. Chem. 2003; 41: 359.

[6]     Elyashberg ME, Blinov KA, Molodtsov SG, Williams AJ, Martin GE. J. Chem. Inf. Comput. Sci. 2004; 44: 771.

[7]     Bagno A, Saielli G. Theor. Chem. Acc. 2007; 117: 603.

[8]     Balandina A, Kalinin A, Mamedov V, Figadere B, Latypov S. Magn. Reson. Chem. 2005; 43: 816.

[9]     Balandina A, Saifina D, Mamedov V, Latypov S. J. Mol. Struc. 2006; 791: 77.

[10]    Balandina AA, Mamedov VA, Khafizova EA, Latypov SK. Russ. Chem. Bull. 2006; 55: 2256.

[11]    Barone G, Gomez-Paloma L, Duca D, Silvestri A, Riccio R, Bifulco G. Chemistry 2002; 8: 3233.

[12]    Barone V, Cimino P, Crescenzi O, Pavone M. J. Mol. Struc. 2007; 811: 323.

[13]    Ditchfield R. Mol. Phys. 1974; 27: 789.

[14]    Bifulco G, Dambruoso P, Gomez-Paloma L, Riccio R. Chem. Rev. 2007; 107: 3744.

[15]    Fattorusso C, Stendardo E, Appendino G, Fattorusso E, Luciano P, Romano A, Taglialatela-Scafati O. Org. Lett. 2007; 9: 2377.

[16]    Sebag AB, Forsyth DA, Plante MA. J. Org. Chem. 2001; 66: 7967.

[17]    Sebag AB, Hanson RN, Forsyth DA, Lee CY. Magn. Reson. Chem. 2003; 41: 246.

[18]    Elyashberg ME, Blinov K, Williams AJ. Magn. Reson. Chem. (submitted article).

[19]    Ohtani I, Kusumi T, Kashman Y, Kakisawa H. J. Am. Chem. Soc. 1991; 113: 4092.

[20]    Adamo C, Barone V. J. Chem. Phys. 1998; 108: 664.

[21]    Barone G, Duca D, Silvestri A, Gomez-Paloma L, Riccio R, Bifulco G. Chemistry 2002; 8: 3240.

[22]    Elyashberg ME, Blinov KA, Williams AJ, Molodtsov SG, Martin GE. J. Chem. Inf. Model. 2006; 46: 1643.

[23]    Fattorusso E, Luciano P, Romano A, Taglialatela-Scafati O, Appendino G, Borriello M, Fattorusso E. J. Nat. Prod. 2008; 71 (web ASAP).

[24]    Thuong PT, Lee CH, Dao TT, Nguyen PH, Kim WG, Lee SJ, Oh WK. J. Nat. Prod. 2008; 71: 1775.

[25]    Lv F, Xu M, Deng Z, de Voogd NJ, van Soest RWM, Proksch P, Lin W. J. Nat. Prod. 2008; 71: 1738.

[26]    Breitmaier E, Voelter W. Carbon-13 NMR spectroscopy. VCH, Weinheim, 3rd Edition, 1987.

[27]    Lu Y, Huang CY, Lin Y-F, Wen Z-H, Su J-H, Kuo Y-H, Chiang MY, Sheu J-H. J. Nat. Prod. 2008; 71: 1754.

[28]    Shi Q-W, Sauriol F, Mamer O, Zamir LO. J. Nat. Prod. 2003; 66: 1480.

[29]    Ge HM, Huang B, Tan SH, Shi DH, Song YC, Tan RX. J. Nat. Prod. 2006; 69: 1800.

[30]    Zhang C-R, Yang S-P, Yue J-M. J. Nat. Prod. 2008; 71: 1663.

[31]    Castro A, Coll J, Tandrón YA, Pant AK, Mathela CS. J. Nat. Prod. 2008; 71: 1294.

[32]    Jang KH, Jeon J-E, Ryu S, Lee H-S, Oh K-B, Shin J. J. Nat. Prod. 2008; 71: 1701.

[33]    Devkota KP, Lenta BN, Wansi JD, Choudhary MI, Kisangau DP. J. Nat. Prod. 2008; 71: 1481.

[34]    Liaw C-C, Shen Y-C, Lin Y-S, Hwang T-L, Kuo Y-H, Khalil AT. J. Nat. Prod. 2008; 71: 1551.

[35]    Hunyadi A, Tóth G, Simon A, Mák M, Kele Z, Máthé I, Báthori M. J. Nat. Prod. 2007; 70: 412.

[36]    Smurnyy YD, Elyashberg ME, Blinov KA, Lefebvre B, Martin GE, Williams AJ. Tetrahedron 2005; 61: 9980.



Captions

Figure 1. The four candidate stereoisomer structures of artarborol.

Figure 2. The first 8 out of 16 stereoisomers ranked in ascending order of the average
deviation dC.

Figure 3. A comparison of the 13C chemical shift deviations calculated for the carbon
atoms contained in stereoisomers A and C.

Figure 4. The most probable stereoisomers of structures 3 and 4 selected for detailed
theoretical analysis in the work[23].




A predictive ligand based Bayesian model for human drug induced liver injury
 
RSC ChemSpider – Building An Internet Based Community For Chemists
RSC ChemSpider – Building An Internet Based Community For ChemistsRSC ChemSpider – Building An Internet Based Community For Chemists
RSC ChemSpider – Building An Internet Based Community For Chemists
 
Using Text-Mining and Crowdsourced Curation to Build a Structure Centric Comm...
Using Text-Mining and Crowdsourced Curation to Build a Structure Centric Comm...Using Text-Mining and Crowdsourced Curation to Build a Structure Centric Comm...
Using Text-Mining and Crowdsourced Curation to Build a Structure Centric Comm...
 
Enhancing Discoverability Across Royal Society Of Chemistry Content By Integr...
Enhancing Discoverability Across Royal Society Of Chemistry Content By Integr...Enhancing Discoverability Across Royal Society Of Chemistry Content By Integr...
Enhancing Discoverability Across Royal Society Of Chemistry Content By Integr...
 
ChemSpider – A Community Platform for Chemistry and Resources Supporting the ...
ChemSpider – A Community Platform for Chemistry and Resources Supporting the ...ChemSpider – A Community Platform for Chemistry and Resources Supporting the ...
ChemSpider – A Community Platform for Chemistry and Resources Supporting the ...
 
The Benefits to Chemical Vendors of Putting their data on ChemSpider
The Benefits to Chemical Vendors of Putting their data on ChemSpiderThe Benefits to Chemical Vendors of Putting their data on ChemSpider
The Benefits to Chemical Vendors of Putting their data on ChemSpider
 
Building a Community Resource of Open Spectral Data
Building a Community Resource of Open Spectral DataBuilding a Community Resource of Open Spectral Data
Building a Community Resource of Open Spectral Data
 
Connecting Chemists to the Internet Through ChemSpider
Connecting Chemists to the Internet Through ChemSpiderConnecting Chemists to the Internet Through ChemSpider
Connecting Chemists to the Internet Through ChemSpider
 
Navigating the Complex Web of Chemistry Using ChemSpider
Navigating the Complex Web of Chemistry Using ChemSpiderNavigating the Complex Web of Chemistry Using ChemSpider
Navigating the Complex Web of Chemistry Using ChemSpider
 
ChemSpider – A Platform to Gather, Host and Integrate Structure Based Data Ac...
ChemSpider – A Platform to Gather, Host and Integrate Structure Based Data Ac...ChemSpider – A Platform to Gather, Host and Integrate Structure Based Data Ac...
ChemSpider – A Platform to Gather, Host and Integrate Structure Based Data Ac...
 
Collaborative Computational Technologies for Biomedical Research: An Enabler ...
Collaborative Computational Technologies for Biomedical Research: An Enabler ...Collaborative Computational Technologies for Biomedical Research: An Enabler ...
Collaborative Computational Technologies for Biomedical Research: An Enabler ...
 
Accessing chemical health and safety data online using Royal Society of Chemi...
Accessing chemical health and safety data online using Royal Society of Chemi...Accessing chemical health and safety data online using Royal Society of Chemi...
Accessing chemical health and safety data online using Royal Society of Chemi...
 

Último

Histor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slideHistor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slide
vu2urc
 
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
?#DUbAI#??##{{(☎️+971_581248768%)**%*]'#abortion pills for sale in dubai@
 
Artificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and MythsArtificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and Myths
Joaquim Jorge
 

Último (20)

How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected Worker
 
Advantages of Hiring UIUX Design Service Providers for Your Business
Advantages of Hiring UIUX Design Service Providers for Your BusinessAdvantages of Hiring UIUX Design Service Providers for Your Business
Advantages of Hiring UIUX Design Service Providers for Your Business
 
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
 
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
 
2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...
 
Exploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone ProcessorsExploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone Processors
 
Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...
 
Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024
 
Histor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slideHistor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slide
 
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
 
A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)
 
Partners Life - Insurer Innovation Award 2024
Partners Life - Insurer Innovation Award 2024Partners Life - Insurer Innovation Award 2024
Partners Life - Insurer Innovation Award 2024
 
Strategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a FresherStrategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a Fresher
 
Scaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationScaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organization
 
AWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of TerraformAWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of Terraform
 
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
 
TrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data DiscoveryTrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
 
Artificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and MythsArtificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and Myths
 
A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?
 
Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...
 

The application of empirical methods of 13C NMR chemical shift prediction as a filter for determining possible relative stereochemistry

  • 1. The Application of Empirical Methods of 13C NMR Chemical Shift Prediction as a Filter for Determining Possible Relative Stereochemistry.
    A short title: The Application of Empirical NMR Prediction to Determine Stereochemistry
    Mikhail E. Elyashberg+, Kirill A. Blinov+ and Antony J. Williams*
    + Advanced Chemistry Development, Moscow Department, 6 Akademik Bakulev Street, Moscow 117513, Russian Federation
    * ChemZoo Inc., 904 Tamaras Circle, Wake Forest, North Carolina 27587
    Abstract: The reliable determination of stereocenters contained within chemical structures usually requires utilization of NMR data, chemical derivatization, molecular modeling, quantum-mechanical calculations and, if available, X-ray analysis. In this article we show that the number of stereoisomers which need to be thoroughly verified can be significantly reduced by the application of NMR chemical shift calculation to the full stereoisomer set of possibilities using a fragmental approach based on HOSE codes. The applicability of this suggested method is illustrated using experimental data published for a series of complex chemical structures.
    Keywords: NMR, 1H, 13C, chemical shift prediction, stereochemistry.
    Introduction
  • 2. A number of different methods of NMR chemical shift prediction have been applied to the process of molecular structure elucidation and validation. Empirical methods are attractive because they are fast and fully automatic. The fastest NMR spectrum calculations use an incremental approach, which offers a computational speed of 6,000-10,000 chemical shifts per second on a normal desktop computer (circa 2007) and provides an average chemical shift deviation for carbon NMR of 1.8 ppm[1,2]. Spectral prediction utilizing artificial neural networks provides similar speed and accuracy[1,2]. The third most popular empirical method is slower and is based on a database of reference structures with assigned 13C or 1H chemical shifts. The target and reference structures are described by means of HOSE codes[3], which allows the chemical shift of an atom in the target structure to be predicted from the chemical shifts of the reference structures. In the ACD/NMR predictor[4], the prediction algorithms use a library containing 185,000 structures with NMR chemical shifts assigned to carbon and hydrogen atoms. If the relative stereochemistry of a given atom ai and its environment is known, these data are also coded into the reference structures. To predict the chemical shift of an atom ai in the target structure, its HOSE code is compared with the codes of the corresponding atoms in the reference structures. The chemical shift of the target atom is then predicted by statistical processing of the chemical shifts assigned to all “atom-twins” detected in the reference structures. A strategy based on combining all of the methods mentioned has been suggested[5,6]. It allows selection of the most probable structure from the output file of an expert system developed for molecular structure elucidation.
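The atom-twin idea described above can be illustrated with a minimal Python sketch. It is not the ACD/NMR algorithm: the environment codes are treated as opaque, hypothetical strings (one per coordination sphere) rather than real HOSE codes, the "statistical processing" is assumed to be a plain average, and the fallback to fewer spheres when no full-depth twin exists is an assumption of this sketch.

```python
from statistics import mean


def build_library(reference_atoms):
    """Index reference shifts by every prefix of the sphere-by-sphere code.

    reference_atoms: iterable of (spheres, shift_ppm), where `spheres` is a
    tuple of hypothetical per-sphere code strings (innermost sphere first).
    """
    library = {}
    for spheres, shift in reference_atoms:
        for depth in range(1, len(spheres) + 1):
            library.setdefault(tuple(spheres[:depth]), []).append(shift)
    return library


def predict_shift(library, spheres):
    """Average the shifts of the atom-twins with the deepest matching code.

    Returns (predicted_shift, matched_depth); (None, 0) if nothing matches.
    """
    for depth in range(len(spheres), 0, -1):
        twins = library.get(tuple(spheres[:depth]))
        if twins:
            return mean(twins), depth
    return None, 0


# Toy reference data with made-up codes and shifts (illustration only).
reference = [
    (("C(CCO)", "(H,H)"), 70.0),
    (("C(CCO)", "(H,C)"), 72.0),
    (("C(CCO)", "(H,H)"), 71.0),
]
library = build_library(reference)
print(predict_shift(library, ("C(CCO)", "(H,H)")))  # (70.5, 2)
```

The matched depth gives a rough indication of how similar the reference environments are to the target atom; a prediction that falls back to a shallow depth deserves less confidence.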
  • 3. At the same time a series of articles has been published espousing the value of ab initio quantum mechanical (QM) approaches for NMR chemical shift calculations (for instance,[7-12]); most frequently, the GIAO option of the DFT method[13] has been employed for the calculation of 1H and 13C chemical shifts. It was shown that DFT-based methods can be applied to select a preferable structural hypothesis by comparing the predicted chemical shifts with those determined experimentally. This approach was also an efficient tool for evaluating the different conformers of flexible molecules as well as for elucidating the most probable stereoisomers[13-17]. In our previous report[18] we showed that empirical methods of NMR chemical shift prediction can be successfully used at the selection stage of structural hypotheses, which are then verified by molecular geometry optimization and QM chemical shift prediction. In this regard we hypothesize that empirical methods can help in the preliminary selection of a set of the most probable stereoisomers for subsequent verification by additional experimental techniques and QM chemical shift prediction. This may be possible since stereocenters are recorded for structures included in the ACD/CNMR database and stereochemistry is taken into account by the NMR chemical shift prediction algorithms. The incremental and neural-net-based algorithms of chemical shift prediction also use stereochemistry information for atoms in 3- to 6-membered rings[2]. We therefore wanted to know whether this information can be useful for stereochemistry determination. We tested our hypothesis on a series of examples taken from the recent literature (2007-8): novel structures for which relative stereochemistry was reported. These structures are deliberately absent from the ACD/CNMR database.
  • 4. The application of empirical methods of 13C NMR chemical shift prediction is shown to allow the selection of a set of the most probable stereoisomers that, in all examples examined, includes the genuine stereoconfiguration.
    RESULTS AND DISCUSSION.
    Fattorusso et al[15] utilized DFT chemical shift computation to confirm the most probable stereoisomer of artarborol, 1, a rare nor-caryophyllane derivative, which they isolated and structurally characterized by both 1D and 2D NMR spectroscopic methods.
    [Structure 1: artarborol, with atom numbering]
    To select the most probable stereoisomer the authors[15] carried out a series of investigations. Structure 1 contains five stereogenic carbons (numbered 1-5 on structure 1), four of them at junctions between the 9-membered ring and the small rings; both cis- and trans-junctions of rings adjacent to the nine-membered core are possible in natural caryophyllanes. A combination of 2D ROESY experiments with Mosher’s modified method[19] was used to assess the absolute configuration of C-2 (R) and allowed the authors[15] to reduce the total number of possible stereoisomers to the following four (Figure 1):
  • 5. [Figure 1: structure drawings of candidate stereoisomers A, B, C and D]
    Figure 1. The four candidate stereoisomer structures of artarborol.
    Further selection was made by analyzing the scalar coupling constants and additional spatial couplings across the entire molecule, for which all candidate structures were subjected to a conformational search. As a result, structures B and D were rejected at the first step, structure C was then excluded, and finally stereoconfiguration A was assigned to artarborol. To support this stereochemical assignment each conformation of stereoisomers A and C was fully optimized by the authors[15], and the NMR chemical shifts were calculated using the GIAO option of the MPW1PW91/6-31G(d,p) DFT method[20]. A Boltzmann-weighted average of the 13C NMR chemical shifts for all carbon atoms in the low-energy conformers was calculated for each configuration, using the ab initio standard free energies as weighting factors[21]. The total processing time for each molecule was approximately 60 h (PC Pentium IV). A comparison of calculated chemical shifts with those determined experimentally for structures A and C showed that deviations were smaller for structure A, thereby confirming the validity of the solution.
    Selection of the most probable stereoisomer was attained as a result of a comprehensive experimental and theoretical investigation of the compound and its conceivable 3D models. We investigated what results would be obtained if the problem
  • 6. is solved using 1D and 2D NMR spectra and the empirical chemical shift prediction methods implemented in the expert system Structure Elucidator[5,6,22]. To perform this analysis structure 1 was input into the system and all carbon and hydrogen atoms were supplied with chemical shifts in accordance with the authors’ assignment. Then all 2^5 = 32 stereoisomers were generated by the program and depicted using conventional designations for stereobonds. 1H and 13C chemical shifts were calculated for the complete stereoisomer set using the fragment-based approach within the Structure Elucidator program. In addition, 13C NMR chemical shifts were calculated using both the neural net (N) and incremental (I) approaches. The average deviations of the predicted chemical shifts relative to the experimental shifts (dA = fragmental approach, dN = NN approach and dI = incremental approach) were calculated for each of the 32 stereoisomers, and all stereoisomers were ranked in ascending order of the 13C deviation values. Since the chemical shifts of a stereoisomer and its mirror image are identical, the ranked stereoisomer set was finally represented as a sequence of 16 stereoisomer pairs, each pair having equal deviations. Figure 2 shows the first 8 out of 16 “unique” stereoisomers ranked in ascending order of the average deviations calculated for the 13C NMR spectrum. The remaining stereoisomers are characterized by 13C average deviations dA(13C) falling in the range between 2.49 and 2.90 ppm.
    Figure 2 shows that the correct stereoisomer was distinguished both by its 13C and 1H average deviations. Our experience in the field of computer-aided structure elucidation has shown[22] that the dA(1H) deviation is a less reliable criterion than dA(13C), and it is usually only used for additional confirmation of the most probable
  • 7. structural isomer[5,6,22]. The difference between the deviations dA(13C) found for the second and first ranked structures is not large (0.2 ppm), but this value is frequently observed in the structure elucidation process when the “best structure” is selected[22]. It is worth noting that in stereoisomers 3, 4, 6 and 9, atoms H-17 and H-19 are situated on opposite sides of the macrocycle and are unlikely to be close enough in space to show a ROESY coupling. Since the authors[15] made the final choice between structures A and C on the basis of a comparison of the differences between experimental and calculated 13C chemical shifts of all carbon atoms, we also compared these values (see Figure 3).
    [Figure 2: structure drawings omitted; the rank 1 structure carries the label A, and labels D and C appear among ranks 5-8. All values in ppm; dA values were calculated with v.11.01.]
    Rank   ID   dA(13C)   dI(13C)   dN(13C)   dA(1H)
    1      29   1.773     2.791     2.738     0.289
    2      4    1.959     2.893     2.817     0.313
    3      13   1.969     2.893     2.817     0.312
    4      24   1.982     2.893     2.817     0.313
    5      8    1.998     2.791     2.738     0.293
    6      20   2.092     2.791     2.738     0.293
    7      12   2.358     3.643     3.306     0.313
    8      33   2.364     2.893     2.817     0.309
    Figure 2. The first 8 out of 16 stereoisomers ranked in ascending order of the average deviation dA(13C).
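The ranking procedure described above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the Structure Elucidator implementation: the "average deviation" is assumed to be a mean absolute deviation, stereocenters are represented by hypothetical R/S label tuples, and a mirror image is assumed to be obtained by swapping every label (which ignores meso forms and other symmetry). The function names are invented for this example.

```python
from itertools import product


def mean_abs_deviation(predicted, experimental):
    """Average absolute difference (ppm) between predicted and experimental shifts."""
    if len(predicted) != len(experimental):
        raise ValueError("shift lists must have the same length")
    return sum(abs(p - e) for p, e in zip(predicted, experimental)) / len(experimental)


def unique_relative_configs(n_centers):
    """Enumerate R/S labelings, keeping one member of each mirror-image pair."""
    seen = set()
    unique = []
    for config in product("RS", repeat=n_centers):
        mirror = tuple("S" if c == "R" else "R" for c in config)
        if mirror not in seen:  # the mirror image has not been kept yet
            seen.add(config)
            unique.append(config)
    return unique


def rank_stereoisomers(predicted_by_label, experimental):
    """Return (deviation, label) pairs in ascending order of deviation."""
    scored = [
        (mean_abs_deviation(shifts, experimental), label)
        for label, shifts in predicted_by_label.items()
    ]
    return sorted(scored)


configs = unique_relative_configs(5)
print(len(configs))  # 16 relative configurations for five stereocenters
```

In practice each configuration would be passed to a shift predictor, and the resulting shift lists, keyed by a stereoisomer label, would be handed to `rank_stereoisomers` together with the experimental assignment.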
  • 8. [Figure 3: plot of chemical shift difference (ppm) versus atom number for stereoisomers A and C]
    Figure 3. A comparison of the 13C chemical shift deviations calculated for the carbon atoms contained in stereoisomers A and C.
    Figure 3 shows that the main difference between the chemical shifts calculated for structures A and C is observed for atoms 6 and 7. For structure A the calculated values are markedly closer to the experimental values. The maximum prediction errors are shown for atoms 3 and 5 at the junction between the macrocycle and the 4-membered ring. Stereoisomer ranking with dN(13C) and dI(13C) values in general supported the priority of stereoisomers A-D: these fell into the first four stereoisomers, for which all dN(13C) values and all dI(13C) values proved to be equal (see Supporting Materials, Figure 1S).
    The approach described here looks attractive due to its simplicity and high speed: the 13C and 1H chemical shift calculations for all 32 isomers took about 2 minutes on a Pentium IV, 2.8 GHz processor, compared to 60 hours per prediction as reported by the authors of the original paper. It could be useful for the preliminary assessment of a full stereoisomer set and rejection of deliberately improbable structures when the analyzed molecule is relatively rigid. The reliability of such conclusions can be heuristically
  • 9. evaluated by visual comparison of the reference structures used for chemical shift prediction with the target structure. For instance, a series of structures containing the ring framework of artarborol were shown by the program when examining the chemical shift prediction protocol. It should be emphasized that the artarborol molecule (a new compound) was absent from the library of structures included with the ACD/NMR prediction program. Reference structure 2 is the most similar structure to the artarborol structure under investigation:
    [Structure 2: reference structure with assigned 13C chemical shifts]
    We demonstrated that removing structure 2 from the database did not influence the results: the deviation characteristic for the best stereoisomer increased only slightly, from 1.773 to 1.799 ppm.
    The described approach was also applied to two new ketopelenolides, 3 and 4, which were separated and scrutinized by the same research group[23]. The stereochemistry shown in structures 3 and 4 was determined by the authors[23] as a result of conformational analysis and QM-based 13C chemical shift calculation of the most probable stereoisomers. The calculations were performed in groups of four for each structure (C1-C4 for structure 3 and D1-D4 for structure 4, see Figure 4). It has been shown that C1 corresponds to stereoisomer 3 and D1 to stereoisomer 4.
  • 10. [Figure 4: structure drawings of 3 and 4 and of candidate stereoisomers C1-C4 and D1-D4]
    Figure 4. The most probable stereoisomers of structures 3 and 4 selected for detailed theoretical analysis in ref.[23].
    Structure Elucidator was used to generate all possible stereoisomers for structures 3 and 4 (in both cases N=64) and to perform NMR chemical shift calculations for all
  • 11. stereoisomers using empirical methods. 13C chemical shift prediction using the fragmental method placed stereoisomer C2 in first position in the ranked file and the genuine stereoisomer C1 in second position, with a difference between deviations of 0.01 ppm. At the same time, ranking stereoisomers using dN(13C) values brought stereoisomers C1-C4 to positions 1-4, with equal dN(13C) and dI(13C) values for all of these stereoisomers (see Supporting Materials, Figure 2S). For structure 4 the stereoisomers were ranked by dA(13C) values in the following order: 1st, D1; 2nd, D2; 3rd, D3; 5th, D4 (see Supporting Materials, Figure 3S). The correct stereoisomer was placed in first position, and the other most probable stereoisomers selected in[23] were distinguished by the program as also deserving attention.
    For a preliminary evaluation of the generality of the described approach we repeated the work using the structures of natural products belonging to a number of different classes, i.e. steroids, alkaloids, terpenes, cembranoids, etc. A set of such structures whose relative stereochemistry was recently described in a series of publications was chosen (see Table 1).
  • 12. Table 1. Examples of structures for which sets of preferable stereoisomers were selected using empirical methods of 13C NMR chemical shift prediction. The R and S designations shown in the structures correspond to the stereochemistry at the particular stereocenter. [Structure drawings omitted.]
    Ref.   Example No.   Nds (number of stereoisomers)   Sr (position of correct stereoisomer)
    [24]   1             1024                            1
    [25]   2             256                             1
    [26]   3             32                              1
    [27]   4             32                              1
    [28]   5             64                              1
    [29]   6             32                              3
    [30]   7             128                             3
    [31]   8             2048                            3
    [32]   9             1024                            3
    [28]   10            512                             3
    [33]   11            64                              3
    [27]   12            32                              4
    [34]   13            256                             8
    [35]   14            256                             12
  • 15. All selected structures were supplied with assigned experimental 1H and 13C NMR chemical shifts. Three similar structures borrowed from earlier publications (of 2003 and 2004) were temporarily removed from the database during our research. For each molecule a full set of N possible stereoisomers was generated, and the 13C NMR chemical shifts of Nds differing stereoisomers (Nds = N/2, N = 2^n, where n is the number of stereocenters) were calculated by all three algorithms mentioned. The stereoisomer file was ranked in the same way as in the artarborol case, in ascending order of dA(13C) values, and the position of
  • 16. the correct stereoisomer, as determined in the corresponding article, was detected in the ranked file. The result of each computational experiment was characterized by an Sr value, where Sr is the number of stereoisomers for which the deviations dA(13C) are less than or equal to the deviation calculated for the correct stereoisomer. For instance, Sr = 1 means that the correct stereoisomer was ranked first in the file with deviation dA1(13C), and dA1(13C) < dA2(13C), where dA2(13C) is the deviation calculated for the stereoisomer ranked in second position. The notation Sr = 4 means that the correct stereoisomer is among the first four stereoisomers in the ranked file.
    Table 1 shows that our suggested approach can indeed be used for selecting a set of the most probable stereoisomers from all possible members of the family. Even for rather complex structures the preferable stereoisomer was ranked early in the set. Stereoisomer ranking using dN(13C) is not as effective as ranking using dA(13C), but even in this case the correct stereoisomer most frequently fell into the set of the first 8 ranked stereoisomers. Consequently, the neural net approach can be used for preliminary ranking of the stereoisomer file before spectrum prediction by the fragmental method, as is common in the Structure Elucidator system[6]. When NOESY/ROESY data were available from the corresponding articles, applying these data to the structures in the top sets (Sr = 3-12) allowed us to conclude that the correct stereoisomer is also the preferred one. Examples of several top-ranked sets of stereoisomers are presented in the Supporting Materials.
    Computational Details.
  • 17. All calculations were performed using ACD/NMR predictor Version 11.00. A personal computer equipped with a 2.8 GHz Intel processor and 2 GB of RAM, running the Windows 2000 operating system, was used. All computer programs are an integral part of the Structure Elucidator expert system. Other than supplying a set of structures, stereoisomer generation and NMR chemical shift calculation require no intervention from the chemist and are performed fully automatically.
    Conclusions.
    The possibility of applying empirical methods of 13C NMR chemical shift prediction for selection of a set of the most probable stereoisomers related to a given chemical structure has been shown for a series of examples. Application of this approach to the elucidation of the preferred stereoisomer of artarborol has been considered in more detail. We selected the most probable stereoisomer of artarborol using a simple and fast empirical method of chemical shift prediction based on HOSE codes. We suggest that it is worth employing this approach for the preliminary evaluation of all possible stereoisomers generated by the expert system Structure Elucidator. We expect that this approach will show general utility when the analyzed structure is relatively rigid and the reference structures used for chemical shift prediction contain large common fragments with stereo assignments. This approach can markedly reduce the number of stereoisomers that should be thoroughly investigated on the basis of NOE correlations, coupling constant values and quantum-mechanical calculations to finally establish the preferable stereoisomer. The method can be enhanced by utilizing the methodology suggested in our work[36] and vice versa: if a starting stereoisomer is fed as input to the genetic algorithm for prediction and is
  • 18. close to the right one, the genetic algorithm will complete the calculations in a shorter time. To continue to develop an optimal strategy and deduce further practical recommendations, it is necessary to investigate a larger set of diverse structures. In this way we can further refine our methods of NMR chemical shift prediction and make them more sensitive to relative stereochemistry. To this end, a statistically relevant collection of material must be accumulated and generalized. This work is in progress, and results will be presented in our next publication.
References
[1] Blinov KA, Smurnyy YD, Elyashberg ME, Churanova TS, Kvasha M, Steinbeck C, Lefebvre BE, Williams AJ. J. Chem. Inf. Model. 2008; 48: 550.
[2] Smurnyy YD, Blinov KA, Churanova TS, Elyashberg ME, Williams AJ. J. Chem. Inf. Model. 2008; 48: 128.
[3] Bremser W. Anal. Chim. Acta Comp. Techn. Optimiz. 1978; 2: 355.
[4] ACD/NMR Predictor v.11. Advanced Chemistry Development, Toronto, Canada.
[5] Blinov KA, Carlson D, Elyashberg ME, Martin GE, Martirosian ER, Molodtsov SG, Williams AJ. Magn. Reson. Chem. 2003; 41: 359.
[6] Elyashberg ME, Blinov KA, Molodtsov SG, Williams AJ, Martin GE. J. Chem. Inf. Comput. Sci. 2004; 44: 771.
[7] Bagno A, Saielli G. Theor. Chem. Acc. 2007; 117: 603.
[8] Balandina A, Kalinin A, Mamedov V, Figadere B, Latypov S. Magn. Reson. Chem. 2005; 43: 816.
  • 19. [9] Balandina A, Saifina D, Mamedov V, Latypov S. J. Mol. Struc. 2006; 791: 77.
[10] Balandina AA, Mamedov VA, Khafizova EA, Latypov SK. Russ. Chem. Bull. 2006; 55: 2256.
[11] Barone G, Gomez-Paloma L, Duca D, Silvestri A, Riccio R, Bifulco G. Chemistry 2002; 8: 3233.
[12] Barone V, Cimino P, Crescenzi O, Pavone M. J. Mol. Struc. 2007; 811: 323.
[13] Ditchfield R. Mol. Phys. 1974; 27: 789.
[14] Bifulco G, Dambruoso P, Gomez-Paloma L, Riccio R. Chem. Rev. 2007; 107: 3744.
[15] Fattorusso C, Stendardo E, Appendino G, Fattorusso E, Luciano P, Romano A, Taglialatela-Scafati O. Org. Lett. 2007; 9: 2377.
[16] Sebag AB, Forsyth DA, Plante MA. J. Org. Chem. 2001; 66: 7967.
[17] Sebag AB, Hanson RN, Forsyth DA, Lee CY. Magn. Reson. Chem. 2003; 41: 246.
[18] Elyashberg ME, Blinov K, Williams AW. Magn. Reson. Chem. (submitted article).
[19] Ohtani I, Kusumi T, Kashman Y, Kakisawa H. J. Am. Chem. Soc. 1991; 113: 4092.
[20] Adamo C, Barone V. J. Chem. Phys. 1998; 108: 664.
[21] Barone G, Duca D, Silvestri A, Gomez-Paloma L, Riccio R, Bifulco G. Chemistry 2002; 8: 3240.
[22] Elyashberg ME, Blinov KA, Williams AJ, Molodtsov SG, Martin GE. J. Chem. Inf. Model. 2006; 46: 1643.
  • 20. [23] Fattorusso E, Luciano P, Romano A, Taglialatela-Scafati O, Appendino G, Borriello M, Fattorusso E. J. Nat. Prod. 2008; 71 (web ASAP).
[24] Thuong PT, Lee CH, Dao TT, Nguyen PH, Kim WG, Lee SJ, Oh WK. J. Nat. Prod. 2008; 71: 1775.
[25] Lv F, Xu M, Deng Z, de Voogd NJ, van Soest RWM, Proksch P, Lin W. J. Nat. Prod. 2008; 71: 1738.
[26] Breitmaier E, Voelter W. Carbon-13 NMR Spectroscopy. VCH, Weinheim, 3rd Edition, 1987.
[27] Lu Y, Huang CY, Lin Y-F, Wen Z-H, Su J-H, Kuo Y-H, Chiang MY, Sheu J-H. J. Nat. Prod. 2008; 71: 1754.
[28] Shi Q-W, Sauriol F, Mamer O, Zamir LO. J. Nat. Prod. 2003; 66: 1480.
[29] Ge HM, Huang B, Tan SH, Shi DH, Song YC, Tan RX. J. Nat. Prod. 2006; 69: 1800.
[30] Zhang C-R, Yang S-P, Yue J-M. J. Nat. Prod. 2008; 71: 1663.
[31] Castro A, Coll J, Tandrón YA, Pant AK, Mathela CS. J. Nat. Prod. 2008; 71: 1294.
[32] Jang KH, Jeon J-E, Ryu S, Lee H-S, Oh K-B, Shin J. J. Nat. Prod. 2008; 71: 1701.
[33] Devkota KP, Lenta BN, Wansi JD, Choudhary MI, Kisangau DP. J. Nat. Prod. 2008; 71: 1481.
[34] Liaw C-C, Shen Y-C, Lin Y-S, Hwang T-L, Kuo Y-H, Khalil AT. J. Nat. Prod. 2008; 71: 1551.
  • 21. [35] Hunyadi A, Tóth G, Simon A, Mák M, Kele Z, Máthé I, Báthori M. J. Nat. Prod. 2007; 70: 412.
[36] Smurnyy YD, Elyashberg ME, Blinov KA, Lefebvre B, Martin GE, Williams AJ. Tetrahedron 2005; 61: 9980.
Captions
Figure 1. The four candidate stereoisomer structures of artarborol.
Figure 2. The first 8 out of 16 stereoisomers ranked in ascending order of the average deviation dC.
Figure 3. A comparison of the 13C chemical shift deviations calculated for the carbon atoms contained in stereoisomers A and C.
Figure 4. The most probable stereoisomers of structures 3 and 4 selected for detailed theoretical analysis in the work[23].