CDAO-STORE:
A New Vision for Data Integration



Brandon Chisham, Trung Le, Enrico Pontelli,
          Tran Son, Ben Wright


               iEvoBio 2010
               Portland, OR
CDAO

    Comparative Data Analysis Ontology
    
        Provides semantics to the descriptions of data
        commonly found in the domain of phylogenetic
        inference.
    
        Enables the rigorous description of phylogenetic
        trees and associated character data matrices.
What We Did

    CDAO-Store
    
        A repository providing a rich set of APIs for
        querying phyloinformatics data.

    CDAO-Explorer
    
        A visualization tool for viewing data sets stored in
        the repository.
CDAO-Store Repository

    What's in it?
    
        TreeBASE dump dated January 2009
    
        Also allows importing CDAO-formatted files.
        −  To get your files into CDAO, we can translate
           NEXUS, PHYLIP, and MEGA into CDAO format.
    
        Files can be exported in RDF/XML using CDAO
        terms
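
The translation step above can be illustrated with a minimal sketch. It assumes a relaxed, sequential PHYLIP matrix (whitespace-separated taxon name and states) and emits simple subject/predicate/object triples. The `ex:` names are hypothetical placeholders, not actual CDAO terms, and this is not the translator used by CDAO-Store.

```python
from typing import Dict, List, Tuple

Triple = Tuple[str, str, str]


def parse_relaxed_phylip(text: str) -> Dict[str, str]:
    """Parse a relaxed, sequential PHYLIP matrix into {taxon: states}."""
    lines = [ln.strip() for ln in text.splitlines() if ln.strip()]
    # The header line gives the number of taxa and of characters.
    n_taxa, n_chars = (int(x) for x in lines[0].split())
    matrix: Dict[str, str] = {}
    for ln in lines[1:]:
        name, states = ln.split(None, 1)
        states = "".join(states.split())  # drop spacing inside the row
        if len(states) != n_chars:
            raise ValueError(f"{name}: expected {n_chars} states, got {len(states)}")
        matrix[name] = states
    if len(matrix) != n_taxa:
        raise ValueError(f"expected {n_taxa} taxa, got {len(matrix)}")
    return matrix


def matrix_to_triples(matrix: Dict[str, str]) -> List[Triple]:
    """Describe each taxon and each matrix cell as triples.

    All 'ex:' identifiers are illustrative placeholders (assumption).
    """
    triples: List[Triple] = []
    for taxon, states in matrix.items():
        triples.append((f"ex:{taxon}", "rdf:type", "ex:TaxonomicUnit"))
        for i, state in enumerate(states, start=1):
            cell = f"ex:{taxon}_c{i}"
            triples.append((cell, "ex:belongsToTaxon", f"ex:{taxon}"))
            triples.append((cell, "ex:hasState", state))
    return triples


if __name__ == "__main__":
    demo = parse_relaxed_phylip("2 4\nA ACGT\nB AC-T\n")
    for triple in matrix_to_triples(demo)[:3]:
        print(triple)
```

A real export would serialize such triples as RDF/XML with the ontology's own identifiers; the sketch stops at the triple level to stay self-contained.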
Querying CDAO-Store

    PhyloWS
    
        Retrieve data sets via name, tree identifier, taxon,
        or size.
    
        Supports computing the minimum spanning clade or
        the nearest common ancestor of a set of taxa.

    Web-Based
    
        Search for data sets by author or study
    
        View data sets online by tree, taxon, algorithm,
        method, or size.
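
The two tree computations named above can be sketched as follows. This is an illustrative in-memory version, not the CDAO-Store implementation: it assumes the tree is given as a hypothetical child-to-parent map and reads "minimum spanning clade" as the subtree rooted at the nearest common ancestor of the selected taxa.

```python
from typing import Dict, Iterable, List, Optional, Set

# node -> parent; the root maps to None (assumed representation)
ParentMap = Dict[str, Optional[str]]


def path_to_root(parents: ParentMap, node: str) -> List[str]:
    """Return the nodes from `node` up to the root, inclusive."""
    path = [node]
    while parents[path[-1]] is not None:
        path.append(parents[path[-1]])
    return path


def nearest_common_ancestor(parents: ParentMap, taxa: Iterable[str]) -> str:
    """Deepest node that lies on every taxon's path to the root."""
    paths = [path_to_root(parents, t) for t in taxa]
    if not paths:
        raise ValueError("at least one taxon is required")
    shared = set(paths[0]).intersection(*map(set, paths[1:]))
    # Walking upward from the first taxon, the first shared node is the deepest.
    return next(n for n in paths[0] if n in shared)


def minimum_spanning_clade(parents: ParentMap, taxa: Iterable[str]) -> Set[str]:
    """All nodes in the subtree rooted at the nearest common ancestor."""
    root = nearest_common_ancestor(parents, taxa)
    children: Dict[str, List[str]] = {}
    for node, parent in parents.items():
        if parent is not None:
            children.setdefault(parent, []).append(node)
    clade: Set[str] = set()
    stack = [root]
    while stack:
        node = stack.pop()
        clade.add(node)
        stack.extend(children.get(node, []))
    return clade


if __name__ == "__main__":
    # (((A,B)n2,C)n1,D)root
    tree = {"root": None, "n1": "root", "D": "root",
            "n2": "n1", "C": "n1", "A": "n2", "B": "n2"}
    print(nearest_common_ancestor(tree, ["A", "C"]))
    print(sorted(minimum_spanning_clade(tree, ["A", "C"])))
```

The parent map keeps the ancestor walk simple; a store backed by RDF would more likely answer the same question with a query over parent/child relations.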
Web-Based Queries
•   Landing page for
    web queries.
Trees Containing a Taxonomic Unit
•   Shows a list of trees
    matching the
    Taxonomic Unit
•   Has links to query
    these trees or view
    them graphically
Tree Query
•   Shows a listing of
    nodes in the tree.
•   Allows the user to select
    any set of them to
    find their minimum
    spanning clade or
    nearest common
    ancestor.
Searching by Author
•   Lists studies by a
    particular author.
Study Detail
•   Lists all authors,
    with links to their
    studies.
•   Abstract
•   Trees associated
    with the study.
•   Future: matrices. The
    data is available in
    the system but not yet
    exposed to the user.
Searching by Algorithm or Method
•    Can search by
     Algorithm or Method
•    As before, the listing
     shows the tree name
     and links to query
     the tree or view it.
Visualization with CDAO-Explorer

    CDAO-Explorer
    
        Tree Viewer
    
        Matrix Viewer
Tree Viewer

    Uses the Prefuse
    framework

    Two layouts: “Force
    Layout” and “Node
    Layout”

    Can search by
    node/edge name

    View details of nodes
    or edges

    Can save as JPG or
    PNG
Matrix Viewer

    Custom built

    Color-coded cells

    Extract or 'crop' parts
    of the Matrix for
    closer views

    Zoom in and out of
    the matrix

    Annotation support in
    development.
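
The crop and color-coding features listed above can be sketched in a few lines. This is an illustration only, not the viewer's code: it assumes the matrix is a mapping from taxon name to a string of character states, and the palette is a hypothetical choice.

```python
from typing import Dict, List, Sequence


def crop_matrix(matrix: Dict[str, str], taxa: Sequence[str],
                start: int, stop: int) -> Dict[str, str]:
    """Keep only the given taxa (rows) and character columns start..stop-1."""
    return {t: matrix[t][start:stop] for t in taxa}


# Hypothetical palette; states not listed here fall back to `default`.
PALETTE = {"A": "green", "C": "blue", "G": "orange", "T": "red"}


def color_cells(matrix: Dict[str, str],
                palette: Dict[str, str] = PALETTE,
                default: str = "gray") -> Dict[str, List[str]]:
    """Map every cell's state to a display color."""
    return {t: [palette.get(s, default) for s in states]
            for t, states in matrix.items()}


if __name__ == "__main__":
    full = {"A": "ACGT", "B": "AC-T", "C": "GGGT"}
    part = crop_matrix(full, ["A", "B"], 1, 3)
    print(part)
    print(color_cells(part))
```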
Conclusion

    The CDAO-Store tool set provides a robust
    foundation for a semantically aware phylogeny
    resource.

    The CDAO-Explorer portion of the store has
    achieved good baseline functionality and
    provides a set of useful features to advance the
    current state of visualization of large data sets
    in this field.
Future

    Annotations / MIAPA / OBI

    User-defined SPARQL Queries

    Better Tree / Matrix integration

    Ambiguous Name Resolution (at taxon, tree,
    and study levels)

    Integrating other stores besides TreeBASE
Questions?

    Find us at:
    
        http://www.cs.nmsu.edu/~cdaostore
    
        http://cdaotools.sourceforge.net
    
        http://www.twitter.com/cdaotools

    Funding for this project provided by:
    
        NSF CREST grant HRD-0420407
    
        NSF IGERT grant DGE-0504304

    Additional Support provided by:
    
        NESCent
    
        NMSU
    
        CDAO Development Team
