SlideShare uma empresa Scribd logo
1 de 80
Baixar para ler offline
O NTOLOGY-BASED
 C LASSIFICATION OF M OLECULES :
A L OGIC P ROGRAMMING A PPROACH


                Despoina Magka

  Department of Computer Science, University of Oxford


               November 30, 2012
B IOINFORMATICS AND S EMANTIC T ECHNOLOGIES
      Life sciences data deluge




1
B IOINFORMATICS AND S EMANTIC T ECHNOLOGIES
      Life sciences data deluge
      Hierarchical organisation of biochemical knowledge




1
B IOINFORMATICS AND S EMANTIC T ECHNOLOGIES
      Life sciences data deluge
      Hierarchical organisation of biochemical knowledge




1
B IOINFORMATICS AND S EMANTIC T ECHNOLOGIES
      Life sciences data deluge
      Hierarchical organisation of biochemical knowledge




1
B IOINFORMATICS AND S EMANTIC T ECHNOLOGIES
      Life sciences data deluge
      Hierarchical organisation of biochemical knowledge




      Fast, automatic and repeatable classification driven by
      Semantic technologies




1
B IOINFORMATICS AND S EMANTIC T ECHNOLOGIES
      Life sciences data deluge
      Hierarchical organisation of biochemical knowledge




      Fast, automatic and repeatable classification driven by
      Semantic technologies
      Web Ontology Language, a W3C standard family
      of logic-based formalisms



1
B IOINFORMATICS AND S EMANTIC T ECHNOLOGIES
      Life sciences data deluge
      Hierarchical organisation of biochemical knowledge




      Fast, automatic and repeatable classification driven by
      Semantic technologies
      Web Ontology Language, a W3C standard family
      of logic-based formalisms
      OWL bio- and chemo-ontologies widely adopted

1
T HE C H EBI O NTOLOGY

    OWL ontology Chemical Entities of Biological Interest




2
T HE C H EBI O NTOLOGY

    OWL ontology Chemical Entities of Biological Interest
    Dictionary of molecules with taxonomical information




2
T HE C H EBI O NTOLOGY

    OWL ontology Chemical Entities of Biological Interest
    Dictionary of molecules with taxonomical information




              caffeine is a cyclic molecule
2
T HE C H EBI O NTOLOGY

    OWL ontology Chemical Entities of Biological Interest
    Dictionary of molecules with taxonomical information




            serotonin is an organic molecule

2
T HE C H EBI O NTOLOGY

    OWL ontology Chemical Entities of Biological Interest
    Dictionary of molecules with taxonomical information




              ascorbic acid is a carboxylic ester




2
T HE C H EBI O NTOLOGY

    OWL ontology Chemical Entities of Biological Interest
    Dictionary of molecules with taxonomical information
    Pharmaceutical design and study of biological pathways




2
T HE C H EBI O NTOLOGY

    OWL ontology Chemical Entities of Biological Interest
    Dictionary of molecules with taxonomical information
    Pharmaceutical design and study of biological pathways




    ChEBI is manually incremented




2
T HE C H EBI O NTOLOGY

    OWL ontology Chemical Entities of Biological Interest
    Dictionary of molecules with taxonomical information
    Pharmaceutical design and study of biological pathways




    ChEBI is manually incremented
    Currently ~30,000 chemical entities, expands at 3,500/yr




2
T HE C H EBI O NTOLOGY

    OWL ontology Chemical Entities of Biological Interest
    Dictionary of molecules with taxonomical information
    Pharmaceutical design and study of biological pathways




    ChEBI is manually incremented
    Currently ~30,000 chemical entities, expands at 3,500/yr
    Existing chemical databases describe millions of molecules




2
T HE C H EBI O NTOLOGY

    OWL ontology Chemical Entities of Biological Interest
    Dictionary of molecules with taxonomical information
    Pharmaceutical design and study of biological pathways




    ChEBI is manually incremented
    Currently ~30,000 chemical entities, expands at 3,500/yr
    Existing chemical databases describe millions of molecules
    Speed up growth by automating chemical classification



2
E XPRESSIVITY L IMITATIONS OF OWL
    1   At least one tree-shaped model for each consistent OWL
        ontology    problematic representation of cycles




3
E XPRESSIVITY L IMITATIONS OF OWL
     1   At least one tree-shaped model for each consistent OWL
         ontology    problematic representation of cycles




    E XAMPLE


     C      C

     C      C




3
E XPRESSIVITY L IMITATIONS OF OWL
      1   At least one tree-shaped model for each consistent OWL
          ontology    problematic representation of cycles




     E XAMPLE
    Cyclobutane    ∃(= 4)hasAtom.(Carbon     ∃(= 2)hasBond.Carbon)

      C      C

      C      C




3
E XPRESSIVITY L IMITATIONS OF OWL
      1   At least one tree-shaped model for each consistent OWL
          ontology    problematic representation of cycles




     E XAMPLE
    Cyclobutane    ∃(= 4)hasAtom.(Carbon     ∃(= 2)hasBond.Carbon)

      C      C

      C      C




3
E XPRESSIVITY L IMITATIONS OF OWL
      1   At least one tree-shaped model for each consistent OWL
          ontology    problematic representation of cycles




     E XAMPLE
    Cyclobutane      ∃(= 4)hasAtom.(Carbon        ∃(= 2)hasBond.Carbon)

      C         C

      C         C


          OWL-based reasoning support
            1   Is cyclobutane a cyclic molecule? 


3
E XPRESSIVITY L IMITATIONS OF OWL
      1   At least one tree-shaped model for each consistent OWL
          ontology    problematic representation of cycles
      2   No minimality condition on the models    hard to axiomatise
          classes based on the absence of attributes

     E XAMPLE
    Cyclobutane      ∃(= 4)hasAtom.(Carbon        ∃(= 2)hasBond.Carbon)

      C         C

      C         C


          OWL-based reasoning support
            1   Is cyclobutane a cyclic molecule? 


3
E XPRESSIVITY L IMITATIONS OF OWL
      1   At least one tree-shaped model for each consistent OWL
          ontology    problematic representation of cycles
      2   No minimality condition on the models    hard to axiomatise
          classes based on the absence of attributes

     E XAMPLE
    Cyclobutane      ∃(= 4)hasAtom.(Carbon        ∃(= 2)hasBond.Carbon)
                                                  Oxygen
      C         C

      C         C


          OWL-based reasoning support
            1   Is cyclobutane a cyclic molecule? 


3
E XPRESSIVITY L IMITATIONS OF OWL
      1   At least one tree-shaped model for each consistent OWL
          ontology    problematic representation of cycles
      2   No minimality condition on the models    hard to axiomatise
          classes based on the absence of attributes

     E XAMPLE
    Cyclobutane      ∃(= 4)hasAtom.(Carbon        ∃(= 2)hasBond.Carbon)
                                                  Oxygen
      C         C

      C         C


          OWL-based reasoning support
            1   Is cyclobutane a cyclic molecule? 
            2   Is cyclobutane a hydrocarbon? 

3
E XPRESSIVITY L IMITATIONS OF OWL
      1   At least one tree-shaped model for each consistent OWL
          ontology    problematic representation of cycles
      2   No minimality condition on the models    hard to axiomatise
          classes based on the absence of attributes

     E XAMPLE
    Cyclobutane    ∃(= 4)hasAtom.(Carbon      ∃(= 2)hasBond.Carbon)
                                             Oxygen
      C      C

      C      C




3
E XPRESSIVITY L IMITATIONS OF OWL
      1   At least one tree-shaped model for each consistent OWL
          ontology    problematic representation of cycles
      2   No minimality condition on the models    hard to axiomatise
          classes based on the absence of attributes

     E XAMPLE
    Cyclobutane      ∃(= 4)hasAtom.(Carbon          ∃(= 2)hasBond.Carbon)
                                                    Oxygen
      C         C

      C         C

          Required reasoning support
            1   Is cyclobutane a cyclic molecule?
            2   Is cyclobutane a hydrocarbon?

3
E XPRESSIVITY L IMITATIONS OF OWL
      1   At least one tree-shaped model for each consistent OWL
          ontology    problematic representation of cycles
      2   No minimality condition on the models    hard to axiomatise
          classes based on the absence of attributes

     E XAMPLE
    Cyclobutane      ∃(= 4)hasAtom.(Carbon        ∃(= 2)hasBond.Carbon)
                                                  Oxygen
      C         C

      C         C

          Required reasoning support
            1   Is cyclobutane a cyclic molecule? 
            2   Is cyclobutane a hydrocarbon? 

3
R ESULTS OVERVIEW
    1   Expressive and decidable formalism for modelling complex
        objects: Description Graphs Logic Programs




4
R ESULTS OVERVIEW
    1   Expressive and decidable formalism for modelling complex
        objects: Description Graphs Logic Programs
    2   Modelling that spans a wide range of structure-dependent
        classes of molecules




4
R ESULTS OVERVIEW
    1   Expressive and decidable formalism for modelling complex
        objects: Description Graphs Logic Programs
    2   Modelling that spans a wide range of structure-dependent
        classes of molecules
    3   Implementation that draws upon DLV and performs
        structure-based classification with a significant speedup




4
R ESULTS OVERVIEW
    1   Expressive and decidable formalism for modelling complex
        objects: Description Graphs Logic Programs
    2   Modelling that spans a wide range of structure-dependent
        classes of molecules
    3   Implementation that draws upon DLV and performs
        structure-based classification with a significant speedup
    4   Evaluation over part of the manually curated ChEBI
        ontology revealed modelling errors




4
R ESULTS OVERVIEW
    1   Expressive and decidable formalism for modelling complex
        objects: Description Graphs Logic Programs
    2   Modelling that spans a wide range of structure-dependent
        classes of molecules
    3   Implementation that draws upon DLV and performs
        structure-based classification with a significant speedup
    4   Evaluation over part of the manually curated ChEBI
        ontology revealed modelling errors

          Language for representing biochemical structures with a
              favourable performance/expressivity trade-off




4
C LASSIFYING S TRUCTURED O BJECTS




5
C LASSIFYING S TRUCTURED O BJECTS

                   ascorbicAcid :   0
                                                 o
                                                 6
                                        o         c                o
                                                                                   o
                                        5   c    11        c       1       c
                       hasAtom                   h
                                                                                   2
                                            12            10               7
                        single                                 c       c
                                                 13
                        double                                 9       8


                                                      4    o                   3   o




5
C LASSIFYING S TRUCTURED O BJECTS

                                     ascorbicAcid :   0
                                                                   o
                                                                   6
                                                          o         c                o
                                                                                                     o
                                                          5   c    11        c       1       c
                                         hasAtom                   h
                                                                                                     2
                                                              12            10               7
                                          single                                 c       c
                                                                   13
                                          double                                 9       8


                                                                        4    o                   3   o




    ascorbicAcid(x) →hasAtom(x, f1 (x)) ∧ . . . ∧ hasAtom(x, f13 (x))
                       o(f1 (x)) ∧ . . . ∧ c(f7 (x)) ∧ . . . ∧
                       single(f1 (x), f7 (x)) ∧ double(f7 (x), f2 (x)) ∧ . . .




5
C LASSIFYING S TRUCTURED O BJECTS

                                        ascorbicAcid :   0
                                                                      o
                                                                      6
                                                             o         c                o
                                                                                                        o
                                                             5   c    11        c       1       c
                                            hasAtom                   h
                                                                                                        2
                                                                 12            10               7
                                             single                                 c       c
                                                                      13
                                             double                                 9       8


                                                                           4    o                   3   o




     ascorbicAcid(x) →hasAtom(x, f1 (x)) ∧ . . . ∧ hasAtom(x, f13 (x))
                          o(f1 (x)) ∧ . . . ∧ c(f7 (x)) ∧ . . . ∧
                          single(f1 (x), f7 (x)) ∧ double(f7 (x), f2 (x)) ∧ . . .
     hasAtom(x, y1 ) ∧ hasAtom(x, y2 ) ∧ y1 = y2 → polyatomicEntity(x)
     ∧5 hasAtom(x, yi ) ∧ c(y1 ) ∧ o(y2 ) ∧ o(y3 )∧
      i=1
               c(y4 ) ∧ horc(y5 ) ∧ double(y1 , y2 )∧
    single(y1 , y3 ) ∧ single(y3 , y4 ) ∧ single(y1 , y5 ) → carboxylicEster(x)
5
C LASSIFYING S TRUCTURED O BJECTS

                                      ascorbicAcid :   0
                                                                    o
                                                                    6
                                                           o         c                o
                                                                                                      o
                                                           5   c    11        c       1       c
                                          hasAtom                   h
                                                                                                      2
                                                               12            10               7
                                           single                                 c       c
                                                                    13
                                           double                                 9       8


                                                                         4    o                   3   o



    Input fact: ascorbicAcid(a)
    Stable model: ascorbicAcid(a), hasAtom(a, af ) for 1 ≤ i ≤ 13,
                                                           i
    o(af ) for 1 ≤ i ≤ 6, c(af ) for 7 ≤ i ≤ 12, h(af ), single(af , af ),
       i                      i                       13               8 3
    single(af , af ), single(af , af ) for i ∈ {5, 11}, single(af , af ),
              9 4               12 i                                11 6
    single(af , af ) for i ∈ {1, 9, 11, 13}, single(af , af ) for i ∈ {1, 8},
              10 i                                    7 i
    double(af , af ), double(af , af ), horc(af ) for 7 ≤ i ≤ 13,
               2 7                8 9            i
    polyatomicEntity(a), carboxylicEster(a), cyclic(a)


5
C LASSIFYING S TRUCTURED O BJECTS

                                      ascorbicAcid :   0
                                                                    o
                                                                    6
                                                           o         c                o
                                                                                                      o
                                                           5   c    11        c       1       c
                                          hasAtom                   h
                                                                                                      2
                                                               12            10               7
                                           single                                 c       c
                                                                    13
                                           double                                 9       8


                                                                         4    o                   3   o



    Input fact: ascorbicAcid(a)
    Stable model: ascorbicAcid(a), hasAtom(a, af ) for 1 ≤ i ≤ 13,
                                                           i
    o(af ) for 1 ≤ i ≤ 6, c(af ) for 7 ≤ i ≤ 12, h(af ), single(af , af ),
       i                      i                       13               8 3
    single(af , af ), single(af , af ) for i ∈ {5, 11}, single(af , af ),
              9 4               12 i                                11 6
    single(af , af ) for i ∈ {1, 9, 11, 13}, single(af , af ) for i ∈ {1, 8},
              10 i                                    7 i
    double(af , af ), double(af , af ), horc(af ) for 7 ≤ i ≤ 13,
               2 7                8 9            i
    polyatomicEntity(a), carboxylicEster(a), cyclic(a)
    Ascorbic acid is a cyclic polyatomic entity and a carboxylic ester

5
C HEMICAL C LASSES W E C OVERED
    1   Existence of subcomponents




6
C HEMICAL C LASSES W E C OVERED
    1   Existence of subcomponents
            Carbon molecules




6
C HEMICAL C LASSES W E C OVERED
    1   Existence of subcomponents
            Carbon molecules
            Carboxylic acids and carboxylic esters




6
C HEMICAL C LASSES W E C OVERED
    1   Existence of subcomponents
            Carbon molecules
            Carboxylic acids and carboxylic esters
            Ketones and aldehydes




6
C HEMICAL C LASSES W E C OVERED
    1   Existence of subcomponents
            Carbon molecules
            Carboxylic acids and carboxylic esters
            Ketones and aldehydes
    2   Exact cardinality of parts




6
C HEMICAL C LASSES W E C OVERED
    1   Existence of subcomponents
            Carbon molecules
            Carboxylic acids and carboxylic esters
            Ketones and aldehydes
    2   Exact cardinality of parts
            Exactly two carbons




6
C HEMICAL C LASSES W E C OVERED
    1   Existence of subcomponents
            Carbon molecules
            Carboxylic acids and carboxylic esters
            Ketones and aldehydes
    2   Exact cardinality of parts
            Exactly two carbons
            Dicarboxylic acid




6
C HEMICAL C LASSES W E C OVERED
    1   Existence of subcomponents
            Carbon molecules
            Carboxylic acids and carboxylic esters
            Ketones and aldehydes
    2   Exact cardinality of parts
            Exactly two carbons
            Dicarboxylic acid
    3   Exclusive composition




6
C HEMICAL C LASSES W E C OVERED
    1   Existence of subcomponents
            Carbon molecules
            Carboxylic acids and carboxylic esters
            Ketones and aldehydes
    2   Exact cardinality of parts
            Exactly two carbons
            Dicarboxylic acid
    3   Exclusive composition
            Inorganic molecules




6
C HEMICAL C LASSES W E C OVERED
    1   Existence of subcomponents
            Carbon molecules
            Carboxylic acids and carboxylic esters
            Ketones and aldehydes
    2   Exact cardinality of parts
            Exactly two carbons
            Dicarboxylic acid
    3   Exclusive composition
            Inorganic molecules
            Hydrocarbons




6
C HEMICAL C LASSES W E C OVERED
    1   Existence of subcomponents
            Carbon molecules
            Carboxylic acids and carboxylic esters
            Ketones and aldehydes
    2   Exact cardinality of parts
            Exactly two carbons
            Dicarboxylic acid
    3   Exclusive composition
            Inorganic molecules
            Hydrocarbons
            Saturated molecules




6
C HEMICAL C LASSES W E C OVERED
    1   Existence of subcomponents
            Carbon molecules
            Carboxylic acids and carboxylic esters
            Ketones and aldehydes
    2   Exact cardinality of parts
            Exactly two carbons
            Dicarboxylic acid
    3   Exclusive composition
            Inorganic molecules
            Hydrocarbons
            Saturated molecules
    4   Cyclicity-related classes




6
C HEMICAL C LASSES W E C OVERED
    1   Existence of subcomponents
            Carbon molecules
            Carboxylic acids and carboxylic esters
            Ketones and aldehydes
    2   Exact cardinality of parts
            Exactly two carbons
            Dicarboxylic acid
    3   Exclusive composition
            Inorganic molecules
            Hydrocarbons
            Saturated molecules
    4   Cyclicity-related classes
            Benzenes




6
C HEMICAL C LASSES W E C OVERED
    1   Existence of subcomponents
            Carbon molecules
            Carboxylic acids and carboxylic esters
            Ketones and aldehydes
    2   Exact cardinality of parts
            Exactly two carbons
            Dicarboxylic acid
    3   Exclusive composition
            Inorganic molecules
            Hydrocarbons
            Saturated molecules
    4   Cyclicity-related classes
            Benzenes
            Cyclic molecules




6
C HEMICAL C LASSES W E C OVERED
    1   Existence of subcomponents
            Carbon molecules
            Carboxylic acids and carboxylic esters
            Ketones and aldehydes
    2   Exact cardinality of parts
            Exactly two carbons
            Dicarboxylic acid
    3   Exclusive composition
            Inorganic molecules
            Hydrocarbons
            Saturated molecules
    4   Cyclicity-related classes
            Benzenes
            Cyclic molecules
            Alkanes




6
E MPIRICAL E VALUATION
    Draws upon DLV, a deductive databases engine




7
E MPIRICAL E VALUATION
    Draws upon DLV, a deductive databases engine
    Evaluation with data extracted from ChEBI




7
E MPIRICAL E VALUATION
    Draws upon DLV, a deductive databases engine
    Evaluation with data extracted from ChEBI
    500 molecules under 51 chemical classes in 40 secs




7
E MPIRICAL E VALUATION
    Draws upon DLV, a deductive databases engine
    Evaluation with data extracted from ChEBI
    500 molecules under 51 chemical classes in 40 secs
    Quicker than other approaches:




7
E MPIRICAL E VALUATION
    Draws upon DLV, a deductive databases engine
    Evaluation with data extracted from ChEBI
    500 molecules under 51 chemical classes in 40 secs
    Quicker than other approaches:
        [Hastings et al., 2010] 140 molecules in 4 hours
        [Magka et al., 2012] 70 molecules in 450 secs




7
E MPIRICAL E VALUATION
    Draws upon DLV, a deductive databases engine
    Evaluation with data extracted from ChEBI
    500 molecules under 51 chemical classes in 40 secs
    Quicker than other approaches:
        [Hastings et al., 2010] 140 molecules in 4 hours
        [Magka et al., 2012] 70 molecules in 450 secs
    Subsumptions exposed by our prototype:




7
E MPIRICAL E VALUATION
    Draws upon DLV, a deductive databases engine
    Evaluation with data extracted from ChEBI
    500 molecules under 51 chemical classes in 40 secs
    Quicker than other approaches:
        [Hastings et al., 2010] 140 molecules in 4 hours
        [Magka et al., 2012] 70 molecules in 450 secs
    Subsumptions exposed by our prototype:
        ascorbic acid is a polyatomic entity, a carboxylic ester and a
        cyclic molecule
        missing from the ChEBI OWL ontology




7
E MPIRICAL E VALUATION
    Draws upon DLV, a deductive databases engine
    Evaluation with data extracted from ChEBI
    500 molecules under 51 chemical classes in 40 secs
    Quicker than other approaches:
        [Hastings et al., 2010] 140 molecules in 4 hours
        [Magka et al., 2012] 70 molecules in 450 secs
    Subsumptions exposed by our prototype:
        ascorbic acid is a polyatomic entity, a carboxylic ester and a
        cyclic molecule
        missing from the ChEBI OWL ontology
    Contradictory subclass relation from ChEBI:




7
E MPIRICAL E VALUATION
    Draws upon DLV, a deductive databases engine
    Evaluation with data extracted from ChEBI
    500 molecules under 51 chemical classes in 40 secs
    Quicker than other approaches:
        [Hastings et al., 2010] 140 molecules in 4 hours
        [Magka et al., 2012] 70 molecules in 450 secs
    Subsumptions exposed by our prototype:
        ascorbic acid is a polyatomic entity, a carboxylic ester and a
        cyclic molecule
        missing from the ChEBI OWL ontology
    Contradictory subclass relation from ChEBI:
        Ascorbic acid is asserted to be a carboxylic acid (release 95)
        Not listed among the subsumptions derived by our prototype




7
C ONCLUSION AND F URTHER R ESEARCH
    Results
     1   Expressive and decidable formalism for complex objects




8
C ONCLUSION AND F URTHER R ESEARCH
    Results
     1   Expressive and decidable formalism for complex objects
     2   Wide range of structure-based classes




8
C ONCLUSION AND F URTHER R ESEARCH
    Results
     1   Expressive and decidable formalism for complex objects
     2   Wide range of structure-based classes
     3   DLV-based implementation exhibits a significant speedup




8
C ONCLUSION AND F URTHER R ESEARCH
    Results
     1   Expressive and decidable formalism for complex objects
     2   Wide range of structure-based classes
     3   DLV-based implementation exhibits a significant speedup
     4   Evaluation over ChEBI ontology revealed modelling errors




8
C ONCLUSION AND F URTHER R ESEARCH
    Results
     1   Expressive and decidable formalism for complex objects
     2   Wide range of structure-based classes
     3   DLV-based implementation exhibits a significant speedup
     4   Evaluation over ChEBI ontology revealed modelling errors
     Language for representing biochemical structures with a
         favourable performance/expressivity trade-off




8
C ONCLUSION AND F URTHER R ESEARCH
    Results
      1   Expressive and decidable formalism for complex objects
      2   Wide range of structure-based classes
      3   DLV-based implementation exhibits a significant speedup
      4   Evaluation over ChEBI ontology revealed modelling errors
      Language for representing biochemical structures with a
          favourable performance/expressivity trade-off

    Future directions
          SMILES-based surface syntax




8
C ONCLUSION AND F URTHER R ESEARCH
    Results
      1 Expressive and decidable formalism for complex objects
      2 Wide range of structure-based classes
      3 DLV-based implementation exhibits a significant speedup
      4 Evaluation over ChEBI ontology revealed modelling errors

      Language for representing biochemical structures with a
          favourable performance/expressivity trade-off

    Future directions
          SMILES-based surface syntax

           ∧5 hasAtom(x, yi ) ∧ c(y1 ) ∧ o(y2 ) ∧ o(y3 ) ∧ c(y4 )∧
            i=1
           double(y1 , y2 ) ∧ single(y1 , y3 ) ∧ single(y3 , y4 ) ∧ single(y1 , y5 )
           → carboxylicEster(x)




8
C ONCLUSION AND F URTHER R ESEARCH
    Results
      1 Expressive and decidable formalism for complex objects
      2 Wide range of structure-based classes
      3 DLV-based implementation exhibits a significant speedup
      4 Evaluation over ChEBI ontology revealed modelling errors

      Language for representing biochemical structures with a
          favourable performance/expressivity trade-off

    Future directions
          SMILES-based surface syntax

           define carboxylicEster
           some hasAtom SMILES(COC(= O)[∗])
           end.




8
C ONCLUSION AND F URTHER R ESEARCH
    Results
      1   Expressive and decidable formalism for complex objects
      2   Wide range of structure-based classes
      3   DLV-based implementation exhibits a significant speedup
      4   Evaluation over ChEBI ontology revealed modelling errors
      Language for representing biochemical structures with a
          favourable performance/expressivity trade-off

    Future directions
          SMILES-based surface syntax
          Detect subsumptions between classes




8
C ONCLUSION AND F URTHER R ESEARCH
    Results
      1   Expressive and decidable formalism for complex objects
      2   Wide range of structure-based classes
      3   DLV-based implementation exhibits a significant speedup
      4   Evaluation over ChEBI ontology revealed modelling errors
      Language for representing biochemical structures with a
          favourable performance/expressivity trade-off

    Future directions
          SMILES-based surface syntax
          Detect subsumptions between classes
          E.g., Carboxylic ester is an organic molecular entity




8
C ONCLUSION AND F URTHER R ESEARCH
    Results
      1   Expressive and decidable formalism for complex objects
      2   Wide range of structure-based classes
      3   DLV-based implementation exhibits a significant speedup
      4   Evaluation over ChEBI ontology revealed modelling errors
      Language for representing biochemical structures with a
          favourable performance/expressivity trade-off

    Future directions
          SMILES-based surface syntax
          Detect subsumptions between classes
          Extensions with numerical datatypes




8
C ONCLUSION AND F URTHER R ESEARCH
    Results
      1   Expressive and decidable formalism for complex objects
      2   Wide range of structure-based classes
      3   DLV-based implementation exhibits a significant speedup
      4   Evaluation over ChEBI ontology revealed modelling errors
      Language for representing biochemical structures with a
          favourable performance/expressivity trade-off

    Future directions
          SMILES-based surface syntax
          Detect subsumptions between classes
          Extensions with numerical datatypes
          E.g., Small molecules if they weigh less than 800 daltons




8
C ONCLUSION AND F URTHER R ESEARCH
    Results
      1   Expressive and decidable formalism for complex objects
      2   Wide range of structure-based classes
      3   DLV-based implementation exhibits a significant speedup
      4   Evaluation over ChEBI ontology revealed modelling errors
      Language for representing biochemical structures with a
          favourable performance/expressivity trade-off

    Future directions
          SMILES-based surface syntax
          Detect subsumptions between classes
          Extensions with numerical datatypes
          Classification of complex biological objects




8
C ONCLUSION AND F URTHER R ESEARCH
    Results
      1   Expressive and decidable formalism for complex objects
      2   Wide range of structure-based classes
      3   DLV-based implementation exhibits a significant speedup
      4   Evaluation over ChEBI ontology revealed modelling errors
      Language for representing biochemical structures with a
          favourable performance/expressivity trade-off

    Future directions
          SMILES-based surface syntax
          Detect subsumptions between classes
          Extensions with numerical datatypes
          Classification of complex biological objects
          Integration with Protégé, Bioclipse, JChemPaint,. . .




8
C ONCLUSION AND F URTHER R ESEARCH
    Results
      1   Expressive and decidable formalism for complex objects
      2   Wide range of structure-based classes
      3   DLV-based implementation exhibits a significant speedup
      4   Evaluation over ChEBI ontology revealed modelling errors
      Language for representing biochemical structures with a
          favourable performance/expressivity trade-off

    Future directions
          SMILES-based surface syntax
          Detect subsumptions between classes
          Extensions with numerical datatypes
          Classification of complex biological objects
          Integration with Protégé, Bioclipse, JChemPaint,. . .
          Mapping from our logic to RDF




8
C ONCLUSION AND F URTHER R ESEARCH
    Results
      1   Expressive and decidable formalism for complex objects
      2   Wide range of structure-based classes
      3   DLV-based implementation exhibits a significant speedup
      4   Evaluation over ChEBI ontology revealed modelling errors
      Language for representing biochemical structures with a
          favourable performance/expressivity trade-off

    Future directions
          SMILES-based surface syntax
          Detect subsumptions between classes
          Extensions with numerical datatypes
          Classification of complex biological objects
          Integration with Protégé, Bioclipse, JChemPaint,. . .
          Mapping from our logic to RDF
    Thank you! Questions?!?


8

Mais conteúdo relacionado

Destaque

Acyclicity Conditions and their Application to Query Answering in Description...
Acyclicity Conditions and their Application to Query Answering in Description...Acyclicity Conditions and their Application to Query Answering in Description...
Acyclicity Conditions and their Application to Query Answering in Description...
Despoina Magka
 
Ontology-based Classification and Faceted Search Interface for APIs
Ontology-based Classification and Faceted Search Interface for APIsOntology-based Classification and Faceted Search Interface for APIs
Ontology-based Classification and Faceted Search Interface for APIs
New York City College of Technology Computer Systems Technology Colloquium
 
Introduction to the Semantic Web
Introduction to the Semantic WebIntroduction to the Semantic Web
Introduction to the Semantic Web
Tomek Pluskiewicz
 
Ontology Powerpoint
Ontology PowerpointOntology Powerpoint
Ontology Powerpoint
ARH_Miller
 

Destaque (19)

Classifying Chemicals with Description Graphs and Logic Programming
Classifying Chemicals with Description Graphs and Logic ProgrammingClassifying Chemicals with Description Graphs and Logic Programming
Classifying Chemicals with Description Graphs and Logic Programming
 
Acyclicity Conditions and their Application to Query Answering in Description...
Acyclicity Conditions and their Application to Query Answering in Description...Acyclicity Conditions and their Application to Query Answering in Description...
Acyclicity Conditions and their Application to Query Answering in Description...
 
Tractable Extensions of the Description Logic EL with Numerical Datatypes
Tractable Extensions of the Description Logic EL with Numerical DatatypesTractable Extensions of the Description Logic EL with Numerical Datatypes
Tractable Extensions of the Description Logic EL with Numerical Datatypes
 
Computing Stable Models for Nonmonotonic Existential Rules
Computing Stable Models for Nonmonotonic Existential RulesComputing Stable Models for Nonmonotonic Existential Rules
Computing Stable Models for Nonmonotonic Existential Rules
 
thesis-despoina
thesis-despoinathesis-despoina
thesis-despoina
 
Ontology-based Classification and Faceted Search Interface for APIs
Ontology-based Classification and Faceted Search Interface for APIsOntology-based Classification and Faceted Search Interface for APIs
Ontology-based Classification and Faceted Search Interface for APIs
 
Data Integration at the Ontology Engineering Group
Data Integration at the Ontology Engineering GroupData Integration at the Ontology Engineering Group
Data Integration at the Ontology Engineering Group
 
Ontology For Data Integration
Ontology For Data IntegrationOntology For Data Integration
Ontology For Data Integration
 
Ontology-based Data Integration
Ontology-based Data IntegrationOntology-based Data Integration
Ontology-based Data Integration
 
Jarrar: Introduction to Ontology
Jarrar: Introduction to OntologyJarrar: Introduction to Ontology
Jarrar: Introduction to Ontology
 
Examples of Ontology Applications
Examples of Ontology ApplicationsExamples of Ontology Applications
Examples of Ontology Applications
 
Introduction to Ontology Concepts and Terminology
Introduction to Ontology Concepts and TerminologyIntroduction to Ontology Concepts and Terminology
Introduction to Ontology Concepts and Terminology
 
Ontologies in computer science and on the web
Ontologies in computer science and on the webOntologies in computer science and on the web
Ontologies in computer science and on the web
 
Introduction to the Semantic Web
Introduction to the Semantic WebIntroduction to the Semantic Web
Introduction to the Semantic Web
 
Introduction to the Semantic Web
Introduction to the Semantic WebIntroduction to the Semantic Web
Introduction to the Semantic Web
 
ontology based- data_integration.
ontology based- data_integration.ontology based- data_integration.
ontology based- data_integration.
 
Ontology Powerpoint
Ontology PowerpointOntology Powerpoint
Ontology Powerpoint
 
Ontology
OntologyOntology
Ontology
 
Functional Programming Patterns (BuildStuff '14)
Functional Programming Patterns (BuildStuff '14)Functional Programming Patterns (BuildStuff '14)
Functional Programming Patterns (BuildStuff '14)
 

Semelhante a Ontology-Based Classification of Molecules: a Logic Programming Approach

Organic-Chemistry Introduction Chembiooo
Organic-Chemistry Introduction ChembioooOrganic-Chemistry Introduction Chembiooo
Organic-Chemistry Introduction Chembiooo
rmayukilchmsu
 
Ionic bonds ok1294990488
Ionic bonds  ok1294990488Ionic bonds  ok1294990488
Ionic bonds ok1294990488
Navin Joshi
 
Stereochem2012ques.pptx
Stereochem2012ques.pptxStereochem2012ques.pptx
Stereochem2012ques.pptx
GETU2
 
12 chp14 lect 3
12 chp14 lect 312 chp14 lect 3
12 chp14 lect 3
subham1543
 
Chapter 26 skeleton notes
Chapter 26   skeleton notesChapter 26   skeleton notes
Chapter 26 skeleton notes
fingiie
 

Semelhante a Ontology-Based Classification of Molecules: a Logic Programming Approach (20)

Organic-Chemistry Introduction Chembiooo
Organic-Chemistry Introduction ChembioooOrganic-Chemistry Introduction Chembiooo
Organic-Chemistry Introduction Chembiooo
 
intro organic
intro organicintro organic
intro organic
 
Or Ganic Intro
Or Ganic IntroOr Ganic Intro
Or Ganic Intro
 
organic.doc
organic.docorganic.doc
organic.doc
 
Ch05. streochemistry
Ch05. streochemistryCh05. streochemistry
Ch05. streochemistry
 
basic_organic_chemistry_and_mechanisms_revision_from_m_wills_for_when_you_are...
basic_organic_chemistry_and_mechanisms_revision_from_m_wills_for_when_you_are...basic_organic_chemistry_and_mechanisms_revision_from_m_wills_for_when_you_are...
basic_organic_chemistry_and_mechanisms_revision_from_m_wills_for_when_you_are...
 
Introduction to organic chemisry
Introduction to organic chemisryIntroduction to organic chemisry
Introduction to organic chemisry
 
steroechemistry
steroechemistrysteroechemistry
steroechemistry
 
Ionic bonds ok1294990488
Ionic bonds  ok1294990488Ionic bonds  ok1294990488
Ionic bonds ok1294990488
 
Stereochem2012ques.pptx
Stereochem2012ques.pptxStereochem2012ques.pptx
Stereochem2012ques.pptx
 
12 chp14 lect 3
12 chp14 lect 312 chp14 lect 3
12 chp14 lect 3
 
Chemical bonding
Chemical bondingChemical bonding
Chemical bonding
 
stereochemistry
stereochemistrystereochemistry
stereochemistry
 
Chapter 26 skeleton notes
Chapter 26   skeleton notesChapter 26   skeleton notes
Chapter 26 skeleton notes
 
Neighbouring Group Participation.pptx
Neighbouring Group Participation.pptxNeighbouring Group Participation.pptx
Neighbouring Group Participation.pptx
 
Chapter3unsaturatedhydrocarbons 151111005305-lva1-app6891
Chapter3unsaturatedhydrocarbons 151111005305-lva1-app6891Chapter3unsaturatedhydrocarbons 151111005305-lva1-app6891
Chapter3unsaturatedhydrocarbons 151111005305-lva1-app6891
 
Chapter 3 Unsaturated Hydrocarbons
Chapter 3 Unsaturated HydrocarbonsChapter 3 Unsaturated Hydrocarbons
Chapter 3 Unsaturated Hydrocarbons
 
Isomerism .ppt
Isomerism .pptIsomerism .ppt
Isomerism .ppt
 
Organic Chemistry Introduction and Orbital Design
Organic Chemistry Introduction and Orbital DesignOrganic Chemistry Introduction and Orbital Design
Organic Chemistry Introduction and Orbital Design
 
Basic Concepts and Alkanes
Basic Concepts and AlkanesBasic Concepts and Alkanes
Basic Concepts and Alkanes
 

Último

Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers:  A Deep Dive into Serverless Spatial Data and FMECloud Frontiers:  A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
Safe Software
 
Why Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire businessWhy Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire business
panagenda
 
Artificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and MythsArtificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and Myths
Joaquim Jorge
 

Último (20)

Data Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonData Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt Robison
 
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers:  A Deep Dive into Serverless Spatial Data and FMECloud Frontiers:  A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
 
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
 
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
 
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
 
2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...
 
HTML Injection Attacks: Impact and Mitigation Strategies
HTML Injection Attacks: Impact and Mitigation StrategiesHTML Injection Attacks: Impact and Mitigation Strategies
HTML Injection Attacks: Impact and Mitigation Strategies
 
Automating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps ScriptAutomating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps Script
 
Why Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire businessWhy Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire business
 
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
 
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationFrom Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
 
Partners Life - Insurer Innovation Award 2024
Partners Life - Insurer Innovation Award 2024Partners Life - Insurer Innovation Award 2024
Partners Life - Insurer Innovation Award 2024
 
Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected Worker
 
Scaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationScaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organization
 
Understanding Discord NSFW Servers A Guide for Responsible Users.pdf
Understanding Discord NSFW Servers A Guide for Responsible Users.pdfUnderstanding Discord NSFW Servers A Guide for Responsible Users.pdf
Understanding Discord NSFW Servers A Guide for Responsible Users.pdf
 
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot TakeoffStrategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
 
Artificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and MythsArtificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and Myths
 
Top 10 Most Downloaded Games on Play Store in 2024
Top 10 Most Downloaded Games on Play Store in 2024Top 10 Most Downloaded Games on Play Store in 2024
Top 10 Most Downloaded Games on Play Store in 2024
 
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
 

Ontology-Based Classification of Molecules: a Logic Programming Approach

  • 1. O NTOLOGY-BASED C LASSIFICATION OF M OLECULES : A L OGIC P ROGRAMMING A PPROACH Despoina Magka Department of Computer Science, University of Oxford November 30, 2012
  • 2. B IOINFORMATICS AND S EMANTIC T ECHNOLOGIES Life sciences data deluge 1
  • 3. B IOINFORMATICS AND S EMANTIC T ECHNOLOGIES Life sciences data deluge Hierarchical organisation of biochemical knowledge 1
  • 4. B IOINFORMATICS AND S EMANTIC T ECHNOLOGIES Life sciences data deluge Hierarchical organisation of biochemical knowledge 1
  • 5. B IOINFORMATICS AND S EMANTIC T ECHNOLOGIES Life sciences data deluge Hierarchical organisation of biochemical knowledge 1
  • 6. B IOINFORMATICS AND S EMANTIC T ECHNOLOGIES Life sciences data deluge Hierarchical organisation of biochemical knowledge Fast, automatic and repeatable classification driven by Semantic technologies 1
  • 7. B IOINFORMATICS AND S EMANTIC T ECHNOLOGIES Life sciences data deluge Hierarchical organisation of biochemical knowledge Fast, automatic and repeatable classification driven by Semantic technologies Web Ontology Language, a W3C standard family of logic-based formalisms 1
  • 8. B IOINFORMATICS AND S EMANTIC T ECHNOLOGIES Life sciences data deluge Hierarchical organisation of biochemical knowledge Fast, automatic and repeatable classification driven by Semantic technologies Web Ontology Language, a W3C standard family of logic-based formalisms OWL bio- and chemo-ontologies widely adopted 1
  • 9. T HE C H EBI O NTOLOGY OWL ontology Chemical Entities of Biological Interest 2
  • 10. T HE C H EBI O NTOLOGY OWL ontology Chemical Entities of Biological Interest Dictionary of molecules with taxonomical information 2
  • 11. T HE C H EBI O NTOLOGY OWL ontology Chemical Entities of Biological Interest Dictionary of molecules with taxonomical information caffeine is a cyclic molecule 2
  • 12. T HE C H EBI O NTOLOGY OWL ontology Chemical Entities of Biological Interest Dictionary of molecules with taxonomical information serotonin is an organic molecule 2
  • 13. T HE C H EBI O NTOLOGY OWL ontology Chemical Entities of Biological Interest Dictionary of molecules with taxonomical information ascorbic acid is a carboxylic ester 2
  • 14. T HE C H EBI O NTOLOGY OWL ontology Chemical Entities of Biological Interest Dictionary of molecules with taxonomical information Pharmaceutical design and study of biological pathways 2
  • 15. T HE C H EBI O NTOLOGY OWL ontology Chemical Entities of Biological Interest Dictionary of molecules with taxonomical information Pharmaceutical design and study of biological pathways ChEBI is manually incremented 2
  • 16. T HE C H EBI O NTOLOGY OWL ontology Chemical Entities of Biological Interest Dictionary of molecules with taxonomical information Pharmaceutical design and study of biological pathways ChEBI is manually incremented Currently ~30,000 chemical entities, expands at 3,500/yr 2
  • 17. T HE C H EBI O NTOLOGY OWL ontology Chemical Entities of Biological Interest Dictionary of molecules with taxonomical information Pharmaceutical design and study of biological pathways ChEBI is manually incremented Currently ~30,000 chemical entities, expands at 3,500/yr Existing chemical databases describe millions of molecules 2
  • 18. T HE C H EBI O NTOLOGY OWL ontology Chemical Entities of Biological Interest Dictionary of molecules with taxonomical information Pharmaceutical design and study of biological pathways ChEBI is manually incremented Currently ~30,000 chemical entities, expands at 3,500/yr Existing chemical databases describe millions of molecules Speed up growth by automating chemical classification 2
  • 19. E XPRESSIVITY L IMITATIONS OF OWL 1 At least one tree-shaped model for each consistent OWL ontology problematic representation of cycles 3
  • 20. E XPRESSIVITY L IMITATIONS OF OWL 1 At least one tree-shaped model for each consistent OWL ontology problematic representation of cycles E XAMPLE C C C C 3
  • 21. E XPRESSIVITY L IMITATIONS OF OWL 1 At least one tree-shaped model for each consistent OWL ontology problematic representation of cycles E XAMPLE Cyclobutane ∃(= 4)hasAtom.(Carbon ∃(= 2)hasBond.Carbon) C C C C 3
  • 22. E XPRESSIVITY L IMITATIONS OF OWL 1 At least one tree-shaped model for each consistent OWL ontology problematic representation of cycles E XAMPLE Cyclobutane ∃(= 4)hasAtom.(Carbon ∃(= 2)hasBond.Carbon) C C C C 3
  • 23. E XPRESSIVITY L IMITATIONS OF OWL 1 At least one tree-shaped model for each consistent OWL ontology problematic representation of cycles E XAMPLE Cyclobutane ∃(= 4)hasAtom.(Carbon ∃(= 2)hasBond.Carbon) C C C C OWL-based reasoning support 1 Is cyclobutane a cyclic molecule? 3
  • 24. E XPRESSIVITY L IMITATIONS OF OWL 1 At least one tree-shaped model for each consistent OWL ontology problematic representation of cycles 2 No minimality condition on the models hard to axiomatise classes based on the absence of attributes E XAMPLE Cyclobutane ∃(= 4)hasAtom.(Carbon ∃(= 2)hasBond.Carbon) C C C C OWL-based reasoning support 1 Is cyclobutane a cyclic molecule? 3
  • 25. E XPRESSIVITY L IMITATIONS OF OWL 1 At least one tree-shaped model for each consistent OWL ontology problematic representation of cycles 2 No minimality condition on the models hard to axiomatise classes based on the absence of attributes E XAMPLE Cyclobutane ∃(= 4)hasAtom.(Carbon ∃(= 2)hasBond.Carbon) Oxygen C C C C OWL-based reasoning support 1 Is cyclobutane a cyclic molecule? 3
  • 26. E XPRESSIVITY L IMITATIONS OF OWL 1 At least one tree-shaped model for each consistent OWL ontology problematic representation of cycles 2 No minimality condition on the models hard to axiomatise classes based on the absence of attributes E XAMPLE Cyclobutane ∃(= 4)hasAtom.(Carbon ∃(= 2)hasBond.Carbon) Oxygen C C C C OWL-based reasoning support 1 Is cyclobutane a cyclic molecule? 2 Is cyclobutane a hydrocarbon? 3
  • 27. E XPRESSIVITY L IMITATIONS OF OWL 1 At least one tree-shaped model for each consistent OWL ontology problematic representation of cycles 2 No minimality condition on the models hard to axiomatise classes based on the absence of attributes E XAMPLE Cyclobutane ∃(= 4)hasAtom.(Carbon ∃(= 2)hasBond.Carbon) Oxygen C C C C 3
  • 28. E XPRESSIVITY L IMITATIONS OF OWL 1 At least one tree-shaped model for each consistent OWL ontology problematic representation of cycles 2 No minimality condition on the models hard to axiomatise classes based on the absence of attributes E XAMPLE Cyclobutane ∃(= 4)hasAtom.(Carbon ∃(= 2)hasBond.Carbon) Oxygen C C C C Required reasoning support 1 Is cyclobutane a cyclic molecule? 2 Is cyclobutane a hydrocarbon? 3
  • 29. E XPRESSIVITY L IMITATIONS OF OWL 1 At least one tree-shaped model for each consistent OWL ontology problematic representation of cycles 2 No minimality condition on the models hard to axiomatise classes based on the absence of attributes E XAMPLE Cyclobutane ∃(= 4)hasAtom.(Carbon ∃(= 2)hasBond.Carbon) Oxygen C C C C Required reasoning support 1 Is cyclobutane a cyclic molecule? 2 Is cyclobutane a hydrocarbon? 3
  • 30. R ESULTS OVERVIEW 1 Expressive and decidable formalism for modelling complex objects: Description Graphs Logic Programs 4
  • 31. R ESULTS OVERVIEW 1 Expressive and decidable formalism for modelling complex objects: Description Graphs Logic Programs 2 Modelling that spans a wide range of structure-dependent classes of molecules 4
  • 32. R ESULTS OVERVIEW 1 Expressive and decidable formalism for modelling complex objects: Description Graphs Logic Programs 2 Modelling that spans a wide range of structure-dependent classes of molecules 3 Implementation that draws upon DLV and performs structure-based classification with a significant speedup 4
  • 33. R ESULTS OVERVIEW 1 Expressive and decidable formalism for modelling complex objects: Description Graphs Logic Programs 2 Modelling that spans a wide range of structure-dependent classes of molecules 3 Implementation that draws upon DLV and performs structure-based classification with a significant speedup 4 Evaluation over part of the manually curated ChEBI ontology revealed modelling errors 4
  • 34. R ESULTS OVERVIEW 1 Expressive and decidable formalism for modelling complex objects: Description Graphs Logic Programs 2 Modelling that spans a wide range of structure-dependent classes of molecules 3 Implementation that draws upon DLV and performs structure-based classification with a significant speedup 4 Evaluation over part of the manually curated ChEBI ontology revealed modelling errors Language for representing biochemical structures with a favourable performance/expressivity trade-off 4
  • 35. C LASSIFYING S TRUCTURED O BJECTS 5
  • 36. C LASSIFYING S TRUCTURED O BJECTS ascorbicAcid : 0 o 6 o c o o 5 c 11 c 1 c hasAtom h 2 12 10 7 single c c 13 double 9 8 4 o 3 o 5
  • 37. C LASSIFYING S TRUCTURED O BJECTS ascorbicAcid : 0 o 6 o c o o 5 c 11 c 1 c hasAtom h 2 12 10 7 single c c 13 double 9 8 4 o 3 o ascorbicAcid(x) →hasAtom(x, f1 (x)) ∧ . . . ∧ hasAtom(x, f13 (x)) o(f1 (x)) ∧ . . . ∧ c(f7 (x)) ∧ . . . ∧ single(f1 (x), f7 (x)) ∧ double(f7 (x), f2 (x)) ∧ . . . 5
  • 38. C LASSIFYING S TRUCTURED O BJECTS ascorbicAcid : 0 o 6 o c o o 5 c 11 c 1 c hasAtom h 2 12 10 7 single c c 13 double 9 8 4 o 3 o ascorbicAcid(x) →hasAtom(x, f1 (x)) ∧ . . . ∧ hasAtom(x, f13 (x)) o(f1 (x)) ∧ . . . ∧ c(f7 (x)) ∧ . . . ∧ single(f1 (x), f7 (x)) ∧ double(f7 (x), f2 (x)) ∧ . . . hasAtom(x, y1 ) ∧ hasAtom(x, y2 ) ∧ y1 = y2 → polyatomicEntity(x) ∧5 hasAtom(x, yi ) ∧ c(y1 ) ∧ o(y2 ) ∧ o(y3 )∧ i=1 c(y4 ) ∧ horc(y5 ) ∧ double(y1 , y2 )∧ single(y1 , y3 ) ∧ single(y3 , y4 ) ∧ single(y1 , y5 ) → carboxylicEster(x) 5
  • 39. C LASSIFYING S TRUCTURED O BJECTS ascorbicAcid : 0 o 6 o c o o 5 c 11 c 1 c hasAtom h 2 12 10 7 single c c 13 double 9 8 4 o 3 o Input fact: ascorbicAcid(a) Stable model: ascorbicAcid(a), hasAtom(a, af ) for 1 ≤ i ≤ 13, i o(af ) for 1 ≤ i ≤ 6, c(af ) for 7 ≤ i ≤ 12, h(af ), single(af , af ), i i 13 8 3 single(af , af ), single(af , af ) for i ∈ {5, 11}, single(af , af ), 9 4 12 i 11 6 single(af , af ) for i ∈ {1, 9, 11, 13}, single(af , af ) for i ∈ {1, 8}, 10 i 7 i double(af , af ), double(af , af ), horc(af ) for 7 ≤ i ≤ 13, 2 7 8 9 i polyatomicEntity(a), carboxylicEster(a), cyclic(a) 5
  • 40. C LASSIFYING S TRUCTURED O BJECTS ascorbicAcid : 0 o 6 o c o o 5 c 11 c 1 c hasAtom h 2 12 10 7 single c c 13 double 9 8 4 o 3 o Input fact: ascorbicAcid(a) Stable model: ascorbicAcid(a), hasAtom(a, af ) for 1 ≤ i ≤ 13, i o(af ) for 1 ≤ i ≤ 6, c(af ) for 7 ≤ i ≤ 12, h(af ), single(af , af ), i i 13 8 3 single(af , af ), single(af , af ) for i ∈ {5, 11}, single(af , af ), 9 4 12 i 11 6 single(af , af ) for i ∈ {1, 9, 11, 13}, single(af , af ) for i ∈ {1, 8}, 10 i 7 i double(af , af ), double(af , af ), horc(af ) for 7 ≤ i ≤ 13, 2 7 8 9 i polyatomicEntity(a), carboxylicEster(a), cyclic(a) Ascorbic acid is a cyclic polyatomic entity and a carboxylic ester 5
  • 41. C HEMICAL C LASSES W E C OVERED 1 Existence of subcomponents 6
  • 42. C HEMICAL C LASSES W E C OVERED 1 Existence of subcomponents Carbon molecules 6
  • 43. C HEMICAL C LASSES W E C OVERED 1 Existence of subcomponents Carbon molecules Carboxylic acids and carboxylic esters 6
  • 44. C HEMICAL C LASSES W E C OVERED 1 Existence of subcomponents Carbon molecules Carboxylic acids and carboxylic esters Ketones and aldehydes 6
  • 45. C HEMICAL C LASSES W E C OVERED 1 Existence of subcomponents Carbon molecules Carboxylic acids and carboxylic esters Ketones and aldehydes 2 Exact cardinality of parts 6
  • 46. C HEMICAL C LASSES W E C OVERED 1 Existence of subcomponents Carbon molecules Carboxylic acids and carboxylic esters Ketones and aldehydes 2 Exact cardinality of parts Exactly two carbons 6
  • 47. C HEMICAL C LASSES W E C OVERED 1 Existence of subcomponents Carbon molecules Carboxylic acids and carboxylic esters Ketones and aldehydes 2 Exact cardinality of parts Exactly two carbons Dicarboxylic acid 6
  • 48. C HEMICAL C LASSES W E C OVERED 1 Existence of subcomponents Carbon molecules Carboxylic acids and carboxylic esters Ketones and aldehydes 2 Exact cardinality of parts Exactly two carbons Dicarboxylic acid 3 Exclusive composition 6
  • 49. C HEMICAL C LASSES W E C OVERED 1 Existence of subcomponents Carbon molecules Carboxylic acids and carboxylic esters Ketones and aldehydes 2 Exact cardinality of parts Exactly two carbons Dicarboxylic acid 3 Exclusive composition Inorganic molecules 6
  • 50. C HEMICAL C LASSES W E C OVERED 1 Existence of subcomponents Carbon molecules Carboxylic acids and carboxylic esters Ketones and aldehydes 2 Exact cardinality of parts Exactly two carbons Dicarboxylic acid 3 Exclusive composition Inorganic molecules Hydrocarbons 6
  • 51. C HEMICAL C LASSES W E C OVERED 1 Existence of subcomponents Carbon molecules Carboxylic acids and carboxylic esters Ketones and aldehydes 2 Exact cardinality of parts Exactly two carbons Dicarboxylic acid 3 Exclusive composition Inorganic molecules Hydrocarbons Saturated molecules 6
  • 52. C HEMICAL C LASSES W E C OVERED 1 Existence of subcomponents Carbon molecules Carboxylic acids and carboxylic esters Ketones and aldehydes 2 Exact cardinality of parts Exactly two carbons Dicarboxylic acid 3 Exclusive composition Inorganic molecules Hydrocarbons Saturated molecules 4 Cyclicity-related classes 6
  • 53. C HEMICAL C LASSES W E C OVERED 1 Existence of subcomponents Carbon molecules Carboxylic acids and carboxylic esters Ketones and aldehydes 2 Exact cardinality of parts Exactly two carbons Dicarboxylic acid 3 Exclusive composition Inorganic molecules Hydrocarbons Saturated molecules 4 Cyclicity-related classes Benzenes 6
  • 54. C HEMICAL C LASSES W E C OVERED 1 Existence of subcomponents Carbon molecules Carboxylic acids and carboxylic esters Ketones and aldehydes 2 Exact cardinality of parts Exactly two carbons Dicarboxylic acid 3 Exclusive composition Inorganic molecules Hydrocarbons Saturated molecules 4 Cyclicity-related classes Benzenes Cyclic molecules 6
  • 55. C HEMICAL C LASSES W E C OVERED 1 Existence of subcomponents Carbon molecules Carboxylic acids and carboxylic esters Ketones and aldehydes 2 Exact cardinality of parts Exactly two carbons Dicarboxylic acid 3 Exclusive composition Inorganic molecules Hydrocarbons Saturated molecules 4 Cyclicity-related classes Benzenes Cyclic molecules Alkanes 6
  • 56. E MPIRICAL E VALUATION Draws upon DLV, a deductive databases engine 7
  • 57. E MPIRICAL E VALUATION Draws upon DLV, a deductive databases engine Evaluation with data extracted from ChEBI 7
  • 58. E MPIRICAL E VALUATION Draws upon DLV, a deductive databases engine Evaluation with data extracted from ChEBI 500 molecules under 51 chemical classes in 40 secs 7
  • 59. E MPIRICAL E VALUATION Draws upon DLV, a deductive databases engine Evaluation with data extracted from ChEBI 500 molecules under 51 chemical classes in 40 secs Quicker than other approaches: 7
  • 60. E MPIRICAL E VALUATION Draws upon DLV, a deductive databases engine Evaluation with data extracted from ChEBI 500 molecules under 51 chemical classes in 40 secs Quicker than other approaches: [Hastings et al., 2010] 140 molecules in 4 hours [Magka et al., 2012] 70 molecules in 450 secs 7
  • 61. E MPIRICAL E VALUATION Draws upon DLV, a deductive databases engine Evaluation with data extracted from ChEBI 500 molecules under 51 chemical classes in 40 secs Quicker than other approaches: [Hastings et al., 2010] 140 molecules in 4 hours [Magka et al., 2012] 70 molecules in 450 secs Subsumptions exposed by our prototype: 7
  • 62. E MPIRICAL E VALUATION Draws upon DLV, a deductive databases engine Evaluation with data extracted from ChEBI 500 molecules under 51 chemical classes in 40 secs Quicker than other approaches: [Hastings et al., 2010] 140 molecules in 4 hours [Magka et al., 2012] 70 molecules in 450 secs Subsumptions exposed by our prototype: ascorbic acid is a polyatomic entity, a carboxylic ester and a cyclic molecule missing from the ChEBI OWL ontology 7
  • 63. E MPIRICAL E VALUATION Draws upon DLV, a deductive databases engine Evaluation with data extracted from ChEBI 500 molecules under 51 chemical classes in 40 secs Quicker than other approaches: [Hastings et al., 2010] 140 molecules in 4 hours [Magka et al., 2012] 70 molecules in 450 secs Subsumptions exposed by our prototype: ascorbic acid is a polyatomic entity, a carboxylic ester and a cyclic molecule missing from the ChEBI OWL ontology Contradictory subclass relation from ChEBI: 7
  • 64. E MPIRICAL E VALUATION Draws upon DLV, a deductive databases engine Evaluation with data extracted from ChEBI 500 molecules under 51 chemical classes in 40 secs Quicker than other approaches: [Hastings et al., 2010] 140 molecules in 4 hours [Magka et al., 2012] 70 molecules in 450 secs Subsumptions exposed by our prototype: ascorbic acid is a polyatomic entity, a carboxylic ester and a cyclic molecule missing from the ChEBI OWL ontology Contradictory subclass relation from ChEBI: Ascorbic acid is asserted to be a carboxylic acid (release 95) Not listed among the subsumptions derived by our prototype 7
  • 65. C ONCLUSION AND F URTHER R ESEARCH Results 1 Expressive and decidable formalism for complex objects 8
  • 66. C ONCLUSION AND F URTHER R ESEARCH Results 1 Expressive and decidable formalism for complex objects 2 Wide range of structure-based classes 8
  • 67. C ONCLUSION AND F URTHER R ESEARCH Results 1 Expressive and decidable formalism for complex objects 2 Wide range of structure-based classes 3 DLV-based implementation exhibits a significant speedup 8
  • 68. C ONCLUSION AND F URTHER R ESEARCH Results 1 Expressive and decidable formalism for complex objects 2 Wide range of structure-based classes 3 DLV-based implementation exhibits a significant speedup 4 Evaluation over ChEBI ontology revealed modelling errors 8
  • 69. C ONCLUSION AND F URTHER R ESEARCH Results 1 Expressive and decidable formalism for complex objects 2 Wide range of structure-based classes 3 DLV-based implementation exhibits a significant speedup 4 Evaluation over ChEBI ontology revealed modelling errors Language for representing biochemical structures with a favourable performance/expressivity trade-off 8
  • 70. C ONCLUSION AND F URTHER R ESEARCH Results 1 Expressive and decidable formalism for complex objects 2 Wide range of structure-based classes 3 DLV-based implementation exhibits a significant speedup 4 Evaluation over ChEBI ontology revealed modelling errors Language for representing biochemical structures with a favourable performance/expressivity trade-off Future directions SMILES-based surface syntax 8
  • 71. C ONCLUSION AND F URTHER R ESEARCH Results 1 Expressive and decidable formalism for complex objects 2 Wide range of structure-based classes 3 DLV-based implementation exhibits a significant speedup 4 Evaluation over ChEBI ontology revealed modelling errors Language for representing biochemical structures with a favourable performance/expressivity trade-off Future directions SMILES-based surface syntax ∧5 hasAtom(x, yi ) ∧ c(y1 ) ∧ o(y2 ) ∧ o(y3 ) ∧ c(y4 )∧ i=1 double(y1 , y2 ) ∧ single(y1 , y3 ) ∧ single(y3 , y4 ) ∧ single(y1 , y5 ) → carboxylicEster(x) 8
  • 72. C ONCLUSION AND F URTHER R ESEARCH Results 1 Expressive and decidable formalism for complex objects 2 Wide range of structure-based classes 3 DLV-based implementation exhibits a significant speedup 4 Evaluation over ChEBI ontology revealed modelling errors Language for representing biochemical structures with a favourable performance/expressivity trade-off Future directions SMILES-based surface syntax define carboxylicEster some hasAtom SMILES(COC(= O)[∗]) end. 8
  • 73. C ONCLUSION AND F URTHER R ESEARCH Results 1 Expressive and decidable formalism for complex objects 2 Wide range of structure-based classes 3 DLV-based implementation exhibits a significant speedup 4 Evaluation over ChEBI ontology revealed modelling errors Language for representing biochemical structures with a favourable performance/expressivity trade-off Future directions SMILES-based surface syntax Detect subsumptions between classes 8
  • 74. C ONCLUSION AND F URTHER R ESEARCH Results 1 Expressive and decidable formalism for complex objects 2 Wide range of structure-based classes 3 DLV-based implementation exhibits a significant speedup 4 Evaluation over ChEBI ontology revealed modelling errors Language for representing biochemical structures with a favourable performance/expressivity trade-off Future directions SMILES-based surface syntax Detect subsumptions between classes E.g., Carboxylic ester is an organic molecular entity 8
  • 75. C ONCLUSION AND F URTHER R ESEARCH Results 1 Expressive and decidable formalism for complex objects 2 Wide range of structure-based classes 3 DLV-based implementation exhibits a significant speedup 4 Evaluation over ChEBI ontology revealed modelling errors Language for representing biochemical structures with a favourable performance/expressivity trade-off Future directions SMILES-based surface syntax Detect subsumptions between classes Extensions with numerical datatypes 8
  • 76. C ONCLUSION AND F URTHER R ESEARCH Results 1 Expressive and decidable formalism for complex objects 2 Wide range of structure-based classes 3 DLV-based implementation exhibits a significant speedup 4 Evaluation over ChEBI ontology revealed modelling errors Language for representing biochemical structures with a favourable performance/expressivity trade-off Future directions SMILES-based surface syntax Detect subsumptions between classes Extensions with numerical datatypes E.g., Small molecules if they weigh less than 800 daltons 8
  • 77. C ONCLUSION AND F URTHER R ESEARCH Results 1 Expressive and decidable formalism for complex objects 2 Wide range of structure-based classes 3 DLV-based implementation exhibits a significant speedup 4 Evaluation over ChEBI ontology revealed modelling errors Language for representing biochemical structures with a favourable performance/expressivity trade-off Future directions SMILES-based surface syntax Detect subsumptions between classes Extensions with numerical datatypes Classification of complex biological objects 8
  • 78. C ONCLUSION AND F URTHER R ESEARCH Results 1 Expressive and decidable formalism for complex objects 2 Wide range of structure-based classes 3 DLV-based implementation exhibits a significant speedup 4 Evaluation over ChEBI ontology revealed modelling errors Language for representing biochemical structures with a favourable performance/expressivity trade-off Future directions SMILES-based surface syntax Detect subsumptions between classes Extensions with numerical datatypes Classification of complex biological objects Integration with Protégé, Bioclipse, JChemPaint,. . . 8
  • 79. C ONCLUSION AND F URTHER R ESEARCH Results 1 Expressive and decidable formalism for complex objects 2 Wide range of structure-based classes 3 DLV-based implementation exhibits a significant speedup 4 Evaluation over ChEBI ontology revealed modelling errors Language for representing biochemical structures with a favourable performance/expressivity trade-off Future directions SMILES-based surface syntax Detect subsumptions between classes Extensions with numerical datatypes Classification of complex biological objects Integration with Protégé, Bioclipse, JChemPaint,. . . Mapping from our logic to RDF 8
  • 80. C ONCLUSION AND F URTHER R ESEARCH Results 1 Expressive and decidable formalism for complex objects 2 Wide range of structure-based classes 3 DLV-based implementation exhibits a significant speedup 4 Evaluation over ChEBI ontology revealed modelling errors Language for representing biochemical structures with a favourable performance/expressivity trade-off Future directions SMILES-based surface syntax Detect subsumptions between classes Extensions with numerical datatypes Classification of complex biological objects Integration with Protégé, Bioclipse, JChemPaint,. . . Mapping from our logic to RDF Thank you! Questions?!? 8