Semantic Web Approaches to Candidate Gene Identification

      Simon Twigger, Ph.D.
http://rgd.mcw.edu
Meet the client
Hypertensive
[Diagram: a QTL in the hypertensive rat, the genes (G G G) within the QTL region, and the trait: Hypertension]
Rat researchers ask...
• Has anyone done any expression studies using congenic rats?
• What tissue is this gene expressed in?
• Are any of these genes associated with my phenotype?
• What expression data is known for SD (aka SD/NHsd, Harlan Sprague Dawley, Sprague Dawley) rats?
• Has this gene been seen in the brain?
• What rat expression studies have been done on Mammary Cancer (aka breast neoplasms/breast cancer/cancer of the ...
Biological Data Warehouse
Really important piece of data...
NCBI GEO db
Data hidden in plain sight
NCBO Annotator




http://www.bioontology.org/wiki/index.php/Annotator_Web_service
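
The idea of marking up free text with ontology terms can be sketched with a small dictionary matcher. This is not the NCBO Annotator API: the vocabulary, the placeholder identifiers (such as `RS:EXAMPLE_SD`) and the function name `annotate` are all invented for illustration, and a real run would use term IDs from the Rat Strain Ontology and the Mouse Anatomy ontology.

```python
import re

# Hypothetical mini-vocabulary: synonym -> (placeholder term ID, preferred label).
VOCAB = {
    "sprague dawley": ("RS:EXAMPLE_SD", "SD"),
    "sd/nhsd": ("RS:EXAMPLE_SD", "SD"),
    "harlan sprague dawley": ("RS:EXAMPLE_SD", "SD"),
    "kidney": ("MA:EXAMPLE_KIDNEY", "kidney"),
    "brain": ("MA:EXAMPLE_BRAIN", "brain"),
}


def annotate(text):
    """Return sorted (start, end, term_id, label) tuples for vocabulary hits in text."""
    hits = []
    lowered = text.lower()
    # Try longer synonyms first so "harlan sprague dawley" wins over "sprague dawley".
    for synonym in sorted(VOCAB, key=len, reverse=True):
        pattern = r"(?<!\w)" + re.escape(synonym) + r"(?!\w)"
        for match in re.finditer(pattern, lowered):
            # Skip matches that sit inside a longer hit already recorded.
            if any(s <= match.start() and match.end() <= e for s, e, _, _ in hits):
                continue
            term_id, label = VOCAB[synonym]
            hits.append((match.start(), match.end(), term_id, label))
    return sorted(hits)


if __name__ == "__main__":
    for hit in annotate("Kidney RNA from male Harlan Sprague Dawley rats"):
        print(hit)
```

Because every synonym of a strain maps to the same term ID, a search for that ID finds records regardless of which spelling the submitter used.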
Parallel Annotation Workflow
  1. GEO Records
  2. Create Annotation Jobs & Queue Up
  3. RabbitMQ Q-Out -> 1..n Annot. Workers
  4. Workers: Index text at OBA, Parse Results
  5. Put results into queue for save (RabbitMQ Q-In)
  6. Results saved to GMiner database
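
The queue-and-workers pattern in this workflow can be sketched in a few lines. As an assumption for the sake of a self-contained example, Python's standard `queue` and `threading` modules stand in for RabbitMQ, and `fake_annotate` stands in for the remote annotation call; the record IDs are made up.

```python
import queue
import threading


def fake_annotate(text):
    """Stand-in for the remote annotation call: report which known words occur."""
    known = ("kidney", "brain", "sprague dawley")
    return [term for term in known if term in text.lower()]


def worker(q_out, q_in):
    """Take jobs from the outbound queue, annotate them, push results to the inbound queue."""
    while True:
        job = q_out.get()
        if job is None:  # sentinel: no more work for this worker
            break
        record_id, text = job
        q_in.put((record_id, fake_annotate(text)))


def run_pipeline(records, n_workers=3):
    """Queue one job per record, fan out to n_workers threads, collect results in a dict."""
    q_out, q_in = queue.Queue(), queue.Queue()
    threads = [threading.Thread(target=worker, args=(q_out, q_in)) for _ in range(n_workers)]
    for t in threads:
        t.start()
    for record_id, text in records.items():
        q_out.put((record_id, text))
    for _ in threads:
        q_out.put(None)  # one sentinel per worker
    for t in threads:
        t.join()
    saved = {}  # plays the role of the results database
    while not q_in.empty():
        record_id, terms = q_in.get()
        saved[record_id] = terms
    return saved


if __name__ == "__main__":
    demo = {
        "GSM_EXAMPLE_1": "Kidney tissue from Sprague Dawley rats",
        "GSM_EXAMPLE_2": "Whole brain, adult male",
    }
    print(run_pipeline(demo))
```

Separating the outbound and inbound queues means the number of workers can change without touching the code that creates jobs or saves results.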
Ontologies
gminer.mcw.edu
Curation of Results
Addition of new annotations




NCBO Ontology Widgets
http://www.bioontology.org/wiki/index.php/Ontology_Widgets
Browse/Review Results
Linking annotations to data
[Diagram: gene list (Tm2d1, RGD1306410, Svs4, Hbb, Scgb2a1, Alb) + sample annotations]

              Hbb   is_expressed_in rat kidney
              Tm2d1 is_expressed_in rat kidney

        Human (U133, U133v2), Mouse (430, U74, U95) and Rat (U34a/b/c, 230, 230v2)
        62,000 samples x ca. 25,000 genes/sample = 1.5B data points
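
A minimal sketch of how triples such as `Hbb is_expressed_in rat kidney` could be derived, assuming each sample carries a species/tissue annotation and each probeset has a present (P), marginal (M) or absent (A) call. The sample names, probeset names and data structures are hypothetical and do not reflect GMiner's actual schema.

```python
# All identifiers below are made-up examples, not real GEO or probeset IDs.
PROBESET_TO_GENE = {"ps_1": "Hbb", "ps_2": "Tm2d1", "ps_3": "Alb"}
SAMPLE_ANNOTATION = {"sample_A": ("rat", "kidney"), "sample_B": ("rat", "liver")}
DETECTION_CALLS = {
    # (sample, probeset) -> call: P = present, M = marginal, A = absent
    ("sample_A", "ps_1"): "P",
    ("sample_A", "ps_2"): "P",
    ("sample_A", "ps_3"): "A",
    ("sample_B", "ps_3"): "P",
}

# Rough size of the full data set, using the figures quoted on the slide.
ESTIMATED_DATA_POINTS = 62_000 * 25_000


def expression_triples(calls, probeset_to_gene, sample_annotation):
    """Return sorted (gene, 'is_expressed_in', 'species tissue') triples for present calls."""
    triples = set()
    for (sample, probeset), call in calls.items():
        if call != "P":
            continue  # only a present call counts as evidence of expression
        species, tissue = sample_annotation[sample]
        triples.add((probeset_to_gene[probeset], "is_expressed_in", f"{species} {tissue}"))
    return sorted(triples)


if __name__ == "__main__":
    for triple in expression_triples(DETECTION_CALLS, PROBESET_TO_GENE, SAMPLE_ANNOTATION):
        print(*triple)
    print(f"{ESTIMATED_DATA_POINTS:,} data points")
```

The product of the two slide figures is 1.55 billion, which matches the slide's rounded 1.5B.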
Probeset results on GMiner
Probeset 1395269_s_at for Gabrd - gamma-aminobutyric acid (GABA) A receptor, delta
[Chart: tissue distribution for rat Gabrd, shown alongside human (Hs) GABRD]
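
A tissue-distribution chart of this kind could be computed by tallying detection calls per tissue. The sketch below assumes the input is a flat list of (tissue, call) pairs for one probeset; the function name and the choice to report the fraction of present calls are illustrative assumptions, not a description of GMiner's code.

```python
from collections import Counter, defaultdict


def tissue_distribution(observations):
    """Summarise detection calls for one probeset.

    observations: iterable of (tissue, call) pairs with call in {'P', 'M', 'A'}.
    Returns {tissue: fraction of that tissue's samples with a present call}.
    """
    counts = defaultdict(Counter)
    for tissue, call in observations:
        counts[tissue][call] += 1
    return {
        tissue: calls["P"] / sum(calls.values())
        for tissue, calls in counts.items()
    }


if __name__ == "__main__":
    example = [
        ("brain", "P"), ("brain", "P"), ("brain", "A"), ("brain", "P"),
        ("kidney", "A"), ("kidney", "M"),
    ]
    print(tissue_distribution(example))
```

Marginal calls are counted in the denominator but not as present, which keeps the summary on the conservative side.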
RDF Data integration
• Probeset to MA
• Rat Genes & xrefs
• Probeset to RGD ID
• Mouse Anatomy Ontology
      -> Triple Store
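
The value of loading several sources into one store is that a single query can join across them. The sketch below uses a tiny in-memory class as a stand-in for a real triple store; the class, the `ex:` predicates and every identifier are hypothetical, chosen only to mirror the four sources named on the slide.

```python
class TinyTripleStore:
    """In-memory stand-in for a triple store; None acts as a wildcard in match()."""

    def __init__(self):
        self.triples = set()

    def load(self, triples):
        self.triples.update(triples)

    def match(self, s=None, p=None, o=None):
        return sorted(
            t for t in self.triples
            if (s is None or t[0] == s)
            and (p is None or t[1] == p)
            and (o is None or t[2] == o)
        )


# Four hypothetical sources, one per box on the slide.
PROBESET_TO_MA = [("probeset:ps_1", "ex:seen_in", "ma:kidney")]
RAT_GENES = [("rgd:GENE_1", "ex:symbol", "Hbb")]
PROBESET_TO_RGD = [("probeset:ps_1", "ex:measures", "rgd:GENE_1")]
MOUSE_ANATOMY = [("ma:kidney", "ex:part_of", "ma:renal_system")]


def build_store():
    """Load all four sources into one store."""
    store = TinyTripleStore()
    for source in (PROBESET_TO_MA, RAT_GENES, PROBESET_TO_RGD, MOUSE_ANATOMY):
        store.load(source)
    return store


def genes_seen_in(store, anatomy_term):
    """Join probeset->anatomy, probeset->gene and gene->symbol to list gene symbols."""
    symbols = set()
    for probeset, _, _ in store.match(p="ex:seen_in", o=anatomy_term):
        for _, _, gene in store.match(s=probeset, p="ex:measures"):
            for _, _, symbol in store.match(s=gene, p="ex:symbol"):
                symbols.add(symbol)
    return sorted(symbols)


if __name__ == "__main__":
    print(genes_seen_in(build_store(), "ma:kidney"))
```

In a real triple store the nested loops would be a single graph query, but the join logic is the same.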
QTL
Hypertensive
[Diagram: the genes (G G G) in the QTL are linked to Hypertension through Pathway (further genes G, G), Component, Function, Process, Anatomy (Kidney) and Str 1 != Str 2]
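
One simple way to turn such links into a prioritized list is to count how many independent kinds of evidence connect each gene in the QTL to the trait. The scoring rule, function name and gene names below are assumptions made for illustration; the slides do not specify a ranking method.

```python
def rank_candidates(qtl_genes, evidence):
    """Order genes in a QTL by how many evidence types link them to the trait.

    evidence maps gene -> set of evidence labels (e.g. 'pathway', 'anatomy').
    Ties are broken alphabetically so the order is deterministic.
    """
    scored = [(len(evidence.get(gene, set())), gene) for gene in qtl_genes]
    return [gene for _, gene in sorted(scored, key=lambda item: (-item[0], item[1]))]


if __name__ == "__main__":
    genes = ["GeneA", "GeneB", "GeneC"]  # hypothetical genes in one QTL region
    links = {
        "GeneB": {"pathway", "anatomy", "strain_difference"},
        "GeneA": {"go_process"},
    }
    print(rank_candidates(genes, links))
```

A count treats every evidence type as equally informative; weighting the types differently would be a natural refinement.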
Ongoing
• Work on improving term recognition
• Additional ontologies - Cell Type, Drugs,
  Phenotype, Disease
• RDFizing (what URIs to use?)
• Triple Store implementation
• Integrate Strain and tissue results into RGD
Acknowledgements
•   Joey Geiger - Development of GMiner

•   Jennifer Smith - Video creation, data curation

•   Rajni Nigam - Rat Strain Ontology

•   Clement Jonquet - NCBO Annotator tools

•   Mark Musen & NIH Roadmap Initiative - Our Funding!
Links
•   http://gminer.mcw.edu    Web application

•   http://github.com/mcwbbc/gminer    Gminer Code

•   http://github.com/simont/MCW-RDF     RDFizer code



                     Email: simont@mcw.edu
                     Twitter: @simon_t

Editor's Notes

  1. The Rat Genome Database is one of the main projects we have at MCW. It is the model organism database for the laboratory rat, Rattus norvegicus. We curate genes, strains, QTL, etc. and make extensive use of ontologies such as GO, pathway, rat strain, disease, phenotype.
  2-6. This is a typical use case for rat genomics: how to identify the causes of hypertension in a hypertensive rat? Quite often a QTL is measured, indicating a region on the chromosome that is statistically shown to be related to the trait in question. How to go from the genes in that region to the cause of the disease? Not an easy task - ‘then a miracle happens’.
  7-12. Rat biologists ask many questions related to gene expression and diseases; these are some examples of typical questions. Many of these questions are in areas covered by ontologies and would benefit from the additional searching flexibility that ontologies provide.
  13. Technical problem - lots of data being stored, hard to find it again. Government Warehouse image. Data is archived with good intentions but in doing so is often not easy to find again... If you can't find the data, it's not really much use.
  14. NCBI’s Gene Expression Omnibus has a lot of relevant data, either as text or raw data.
  15. Can we start to capture some of this information in an informatically tractable fashion using ontologies and the OBA tools at the National Center for Biomedical Ontology in an annotation pipeline? The red boxes highlight some concepts of interest - rat strains and tissues being used in this experiment. A human can read these and know what's going on, but what about a computer?
  16. Driving biological project - use NCBO Annotator web services to mark up the text in the GEO records using ontologies
  17. Take sections of text from GEO records, create annotation jobs, place in queue. Workers take the jobs off the queue and index for appropriate ontologies at NCBO. Results are placed on the Input queue for saving back to the database.
  18. We are currently using two ontologies: the rat strain ontology created at RGD and the Mouse Gross Anatomy Ontology created at JAX.
  19. GEO data is run through the pipeline and loaded into Gminer for curation and analysis
  20-21. Annotated results can be reviewed and verified; some annotations are missed, such as the Sprague Dawley link.
  22-24. New annotations can be added using the NCBO ontology widgets.
  25. Put the OBA system on an Amazon AMI so it can be instantiated at will. Allows users to run as many of these things as they want? Consider using a Virtual Machine?
  26-27. Initial results focusing on GEO rat datasets have provided a lot of great information and allowed us to create some handy navigational interfaces to the data, enabling queries that were not possible on any other site. Want to find expression data for the SS rat kidney? Click the terms and the datasets appear.
  28-30. Can we link from the annotations to the samples, down to the raw data in that sample and from there to the genes involved? Affy chips have the detection call, a fairly conservative present/absent call indicating if the probe set was observed in that particular sample.
  31-32. We can then relate the probesets to the genes to the ontology annotations to create triples such as this. If we do this for the Affy data in GEO for Rat, Mouse and Human we will have somewhere upwards of 1.5B data points to encode.
  33. For each probe we can look at the samples in which it was tested and see if it was present/absent/marginal and compile this data to get a feel for how often a gene was seen in a particular tissue/organ.
  34. This can be viewed as a chart of tissue distribution. When compared to similar results from GeneCards/Novartis BioGPS the results are quite comparable, indicating that this approach has some merit.
  35-38. Experimenting with exporting this data into RDF and integrating with related data and vocabularies in triple stores such as Sesame, AllegroGraph and Virtuoso. Early days, still climbing the learning curve with this!
  39-45. As we start to create these triples we can bridge the gap from the QTL and its genes to the disease, allowing the scientists to identify or prioritize candidate genes in their QTL regions (or gene lists) and save them (to some degree) from spending a lot of time manually searching databases online.
  46. Acknowledgements