Presented at the 2nd ChEBI User Group Workshop. Discusses some of the difficulties encountered in a project that aims to automatically classify chemicals in the ChEBI ontology based on their structures.
Automatic classification in ChEBI
1. Automatic classification, logical definitions. Janna Hastings, EBI Cheminformatics and Metabolism. 2nd ChEBI User Group Workshop, 24 June 2010.
3. ChEBI ontology (20.10.10). The ChEBI ontology contains a large asserted is-a hierarchy of chemical classes and compounds. Each chemical class is clearly defined in natural language.
17. Classification based on chemical structure. Ideally, the structure itself would be included in the ontology. Without the structure, all parts must be explicitly asserted (a combinatorial explosion for larger molecules). But the structure of complex molecules breaks the OWL tree-model requirement: it does not have a model in the shape of a tree.
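As a rough illustration of the tree-shape point, the sketch below checks whether a molecular graph is a tree (connected, with no rings). It is not an OWL reasoner and says nothing about OWL semantics beyond the shape of the graph; the function name `is_tree` and the atom/bond encoding are assumptions made for this example.

```python
def is_tree(n_atoms, bonds):
    """Return True if the graph with atoms 0..n_atoms-1 and the given bonds is a tree.

    Assumes n_atoms >= 1. A connected graph is a tree exactly when it has
    n_atoms - 1 edges, so ring-containing molecules always fail this check.
    """
    if len(bonds) != n_atoms - 1:
        return False
    adjacency = {i: [] for i in range(n_atoms)}
    for a, b in bonds:
        adjacency[a].append(b)
        adjacency[b].append(a)
    # Depth-first walk from atom 0 to confirm every atom is reachable.
    seen = {0}
    stack = [0]
    while stack:
        current = stack.pop()
        for neighbour in adjacency[current]:
            if neighbour not in seen:
                seen.add(neighbour)
                stack.append(neighbour)
    return len(seen) == n_atoms


# Heavy-atom skeletons only (hydrogens left implicit in this sketch).
ethanol_bonds = [(0, 1), (1, 2)]                                    # C-C-O chain
benzene_bonds = [(0, 1), (1, 2), (2, 3), (3, 4), (4, 5), (5, 0)]    # six-membered ring

print(is_tree(3, ethanol_bonds))
print(is_tree(6, benzene_bonds))
```

The chain passes the check and the ring does not, which is the kind of structure the slide identifies as problematic.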
25. Substructure search. Benzoic acid has this part (the carboxylic acid substructure), so: benzoic acid is a carboxylic acid. [Figure: structures of carboxylic acid and benzoic acid]
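The substructure argument can be sketched with a toy matcher. This is a minimal backtracking search written for illustration, not the cheminformatics toolkit used in the project; the graph encoding (element symbols plus bond triples, with only the hydroxyl hydrogen explicit) and the name `has_part` are assumptions of the example.

```python
# Toy molecular graphs: element symbols plus (atom_i, atom_j, bond_order) triples.
BENZOIC_ACID_ATOMS = ["C"] * 7 + ["O", "O", "H"]
BENZOIC_ACID_BONDS = [
    (0, 1, 2), (1, 2, 1), (2, 3, 2), (3, 4, 1), (4, 5, 2), (5, 0, 1),  # Kekulé ring
    (0, 6, 1),  # ring carbon to carboxy carbon
    (6, 7, 2),  # C=O
    (6, 8, 1),  # C-O
    (8, 9, 1),  # O-H
]
BENZENE_ATOMS = ["C"] * 6
BENZENE_BONDS = BENZOIC_ACID_BONDS[:6]

# Pattern for the carboxy group: C(=O)O-H.
CARBOXY_ATOMS = ["C", "O", "O", "H"]
CARBOXY_BONDS = [(0, 1, 2), (0, 2, 1), (2, 3, 1)]


def has_part(mol_atoms, mol_bonds, pat_atoms, pat_bonds):
    """Return True if the pattern graph occurs as a substructure of the molecule."""
    bond_order = {frozenset((a, b)): order for a, b, order in mol_bonds}

    def consistent(k, candidate, mapping):
        # Check every pattern bond between atom k and an already-mapped atom.
        for a, b, order in pat_bonds:
            if k in (a, b) and max(a, b) == k:
                other = a if b == k else b
                if bond_order.get(frozenset((candidate, mapping[other]))) != order:
                    return False
        return True

    def extend(mapping):
        k = len(mapping)  # index of the next pattern atom to place
        if k == len(pat_atoms):
            return True
        for candidate, element in enumerate(mol_atoms):
            if candidate in mapping or element != pat_atoms[k]:
                continue
            if consistent(k, candidate, mapping) and extend(mapping + [candidate]):
                return True
        return False

    return extend([])


# "benzoic acid has this part, so: is a carboxylic acid"
is_carboxylic_acid = has_part(
    BENZOIC_ACID_ATOMS, BENZOIC_ACID_BONDS, CARBOXY_ATOMS, CARBOXY_BONDS
)
print(is_carboxylic_acid)
```

A production system would rely on a cheminformatics library for aromaticity handling and efficient matching; the point here is only that class membership follows from finding the defining part.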
26. Goal. We extract features from the structural specifications of chemical compounds using standard cheminformatics techniques (CDK) and use these to automatically classify compounds into defined classes. Example from the slide diagram: compound 3β-hydroxy-4β-methyl-5α-cholest-7-ene-4α-carboxylic acid; features: has_part exactly 1 ( ‘carboxy group’ ); has_part some ( ‘cholesterol’ ); has_part only some ( ‘carbon atom’ or ‘oxygen atom’ or ‘hydrogen atom’ ); class: hydroxy monocarboxylic acid.
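Once features have been extracted, classification against defined classes reduces to checking cardinality restrictions. The sketch below assumes the part counts are already available (in the project they would come from a toolkit such as CDK); the class definitions, the `classify` function, and the requirement of a hydroxy group for "hydroxy monocarboxylic acid" are illustrative assumptions, not the actual ChEBI definitions.

```python
# Hypothetical class definitions: part name -> (min_count, max_count).
# max_count of None means "no upper bound", i.e. has_part some.
CLASS_DEFINITIONS = {
    "carboxylic acid": {"carboxy group": (1, None)},
    "monocarboxylic acid": {"carboxy group": (1, 1)},  # has_part exactly 1
    "hydroxy monocarboxylic acid": {
        "carboxy group": (1, 1),
        "hydroxy group": (1, None),
    },
}


def classify(part_counts):
    """Return every defined class whose restrictions the part counts satisfy."""
    matched = []
    for class_name, restrictions in CLASS_DEFINITIONS.items():
        satisfied = True
        for part, (minimum, maximum) in restrictions.items():
            count = part_counts.get(part, 0)
            if count < minimum or (maximum is not None and count > maximum):
                satisfied = False
                break
        if satisfied:
            matched.append(class_name)
    return matched


# Part counts entered by hand for the slide's example compound
# (one carboxy group, one hydroxy group, as its name indicates).
example_counts = {"carboxy group": 1, "hydroxy group": 1}
print(classify(example_counts))
```

Keeping the definitions as data rather than code mirrors the idea of logical definitions: adding a class means adding a restriction set, not writing a new rule.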
More notes from team discussion: prioritise standard InChIs; add batch (bulk) submissions; decide which additional properties we pre-calculate and make visible in ChEBI (team discussion needed; Lipinski?); make OWL improvements to ease SPARQL querying and improve the relationship patterns (not ALWAYS subclassof exists some). This ties into the SADI-fying of ChEBI and should also involve thinking of and testing out specific use cases for *doing stuff with* the exported OWL file. There is concern about a downgrade in quality caused by the increase in scale (quantity of compounds).