Modeling a microbial community and biodiversity assay with OBI and PCO: the gains of a modular approach
ICBO 2014, Houston, October 6-9
Philippe Rocca-Serra, Ramona Walls, Jacob Parnell, Rachel Gallery, Jie Zheng, Susanna Assunta Sansone and Alejandra Gonzalez-Beltran
Biodiversity in the News
• Grim headlines
• True for many vertebrate species
• Only now is humanity starting to build tools that enable true exploration of diversity
Exploring the world's biodiversity
• Game-changing progress in sequencing technology
– Illumina
– Oxford Nanopore MinION
http://dx.doi.org/10.5524/100102
Microbial Diversity
Biodiversity studies with molecular techniques
• Shotgun sequencing:
– Sequence as much as possible (detection is limited by the available sequencing depth: the rarer the species, the deeper the sequencing needs to be)
• Targeted sequencing:
– Relies on a ‘marker gene’ whose variability is used to estimate the distance between species
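The two ideas above (deeper sequencing for rarer species, and marker-gene variability as a distance) can be sketched numerically. This is only an illustration under simplifying assumptions that are not in the slides: reads are treated as independent draws in proportion to relative abundance, and the marker sequences are already aligned and of equal length. The function names `reads_needed` and `p_distance` are hypothetical.

```python
import math


def reads_needed(relative_abundance: float, detection_probability: float = 0.95) -> int:
    """Smallest read count N with P(at least one read from the species) >= target.

    Assumption (not from the slides): each read independently comes from the
    species with probability equal to its relative abundance, so
    P(detect) = 1 - (1 - abundance) ** N.
    """
    if not 0 < relative_abundance < 1:
        raise ValueError("relative_abundance must be strictly between 0 and 1")
    if not 0 < detection_probability < 1:
        raise ValueError("detection_probability must be strictly between 0 and 1")
    return math.ceil(
        math.log(1 - detection_probability) / math.log(1 - relative_abundance)
    )


def p_distance(seq_a: str, seq_b: str) -> float:
    """Proportion of differing positions between two aligned marker sequences."""
    if len(seq_a) != len(seq_b) or not seq_a:
        raise ValueError("sequences must be non-empty and of equal (aligned) length")
    mismatches = sum(a != b for a, b in zip(seq_a, seq_b))
    return mismatches / len(seq_a)


# A species at 1% abundance needs far fewer reads than one at 0.1%.
common = reads_needed(0.01)
rare = reads_needed(0.001)

# Two toy marker fragments differing at 2 of 8 positions.
distance = p_distance("ACGTACGT", "ACGTTCGA")
```

Real studies use more elaborate abundance and substitution models; the sketch only shows why required depth grows as abundance falls.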
‘Barcode’ as in Multiplexed Libraries
Genomic DNA isolated from an individual sample is:
- fragmented (shearing)
- ligated to a unique short DNA tag (i.e. the barcode)
- PCR-amplified and sequenced
- output as a single collection of reads, which can subsequently be sorted computationally using the short DNA tag (the deconvolution process)
Credits: http://rdp.cme.msu.edu/wiki/index.php/Pyrosequencing_Help
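The deconvolution step described on this slide can be sketched as follows. This is a minimal illustration, not the method of any particular pipeline: it assumes every barcode has the same length, sits at the very start of the read, and must match exactly. The function name `demultiplex`, the barcodes and the sample names are all hypothetical.

```python
from collections import defaultdict


def demultiplex(reads, barcode_to_sample):
    """Sort pooled reads into per-sample bins using the leading barcode tag.

    Assumptions (illustrative only): fixed-length barcodes at the start of
    each read, exact matching, no error correction.
    """
    lengths = {len(barcode) for barcode in barcode_to_sample}
    if len(lengths) != 1:
        raise ValueError("all barcodes must have the same length")
    tag_length = lengths.pop()

    bins = defaultdict(list)
    for read in reads:
        tag, insert = read[:tag_length], read[tag_length:]
        # Reads whose tag is not a known barcode are kept apart, not discarded.
        sample = barcode_to_sample.get(tag, "unassigned")
        bins[sample].append(insert)
    return dict(bins)


# Hypothetical barcodes and pooled reads from one sequencing run.
barcodes = {"ACGT": "soil_A", "TGCA": "soil_B"}
pooled_reads = ["ACGTGGGTTT", "TGCACCCAAA", "ACGTAAACCC", "NNNNGGGGGG"]
per_sample = demultiplex(pooled_reads, barcodes)
```

Here `per_sample` maps `soil_A` to two reads, `soil_B` to one, and leaves the read with the unknown tag under `unassigned`.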
‘Barcode’ as in Barcode of Life 
Credits: http://www.barcodeoflife.org
Ambiguous Language
• What is a barcode, and what is a barcoding experiment?
– Metaphors are impenetrable to computers.
– The representation needs to be made unambiguous.
– Barcoding can mean a technique for processing several samples in one run, i.e. another word for multiplexing.
– Barcoding can also mean creating a unique profile as a means of identifying types of living things.
Heaps of sequence data for sure... but
• What is their value in the absence of accompanying descriptors?
• Annotation is essential to ascertain identity and origin, sampling conditions and rationale
Helping Data Management
• MIxS guidelines checklist
• SRA XML schema, GenBank records…
• Tabular templates for data collection
• Wealth of RDF conversion tools
– R2RML, a W3C standard
• Even when using the same XML schema and the same guidelines, ambiguities persist
ISA templates for Microbial Diversity Studies
• Integrating the MIxS checklist in the ISA framework
• Mapping MIxS entities into the SRA XML schema
– Properties of sample
– Properties of sample processing
– Properties of resulting libraries
– Properties of data processing
Ambiguities: Barcoding
• Uniqueness of library, experiment and sample
• Use case: creation of libraries for Bacteria, Fungi and Eukaryota with specific marker genes (16S rRNA, ITS, COI)
• ISA conversion to ENA:
– 1 sample -> 3 libraries
• SRA/ENA submission:
– 3 libraries -> 3 samples
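The effect of the two conventions on sample counts can be shown with a small sketch. Everything here is hypothetical (the `Library` record, the identifiers, the function names); it only illustrates how one physical sample with three marker-gene libraries is counted once under a one-sample-to-many-libraries model and three times when each library is declared with its own sample.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Library:
    library_id: str
    source_sample: str  # the physical sample the DNA was extracted from
    marker_gene: str


# One hypothetical soil core, three targeted libraries (as in the use case).
libraries = [
    Library("lib1", "soil_core_1", "16S rRNA"),
    Library("lib2", "soil_core_1", "ITS"),
    Library("lib3", "soil_core_1", "COI"),
]


def count_samples_shared(libs):
    """1 sample -> n libraries: count distinct physical samples."""
    return len({lib.source_sample for lib in libs})


def count_samples_per_library(libs):
    """n libraries -> n samples: every library is reported as its own sample."""
    return len(libs)


shared_count = count_samples_shared(libraries)
per_library_count = count_samples_per_library(libraries)
```

With this toy input the first count is 1 and the second is 3, so an analysis reading only the second representation would overestimate the number of independent samples.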
Working with OBI, PCO, SO, ChEBI
Drawn using CmapTools: http://cmap.ihmc.us
OBI-PCO based representation
• ‘targeted gene survey’
– has_part some ‘library preparation’ (OBI_0000711)
• ‘polymerase chain reaction’ (OBI_0000415) is_part_of ‘library preparation’ (OBI_0000711)
• ‘polymerase chain reaction’ (OBI_0000415)
– has_specified_input some ‘forward pcr primer’ (OBI_0000722)
– has_specified_input some ‘reverse pcr primer’ (OBI_0001951)
– has_specified_input some ‘multiplexing sequence identifier’
– has_specified_input some ‘DNA extract’ (OBI_0001051)
• ‘library preparation’ (OBI_0000711) has_specified_output some ‘single fragment library’ (OBI_0000736)
• ‘library preparation’ (OBI_0000711) precedes ‘DNA sequencing’ (OBI_0000626)
• ‘library sequence deconvolution’ is_preceded_by ‘DNA sequencing’ (OBI_0000626)
• ‘library sequence deconvolution’ is_followed_by ‘sequence analysis data transformation’ (OBI_0200187)
• ‘sequence analysis data transformation’ (OBI_0200187) has_specified_output some ‘data item’ (IAO_0000027) and is about ‘population quality’ (PCO_0000003)
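The ordering statements on this slide can be checked mechanically. The sketch below is an illustration only, not an OWL reasoner and not the authors' tooling: it writes the three ordering statements as plain (subject, relation, object) tuples using the labels from the slide, normalises them to "precedes" edges, and walks the resulting chain. It assumes each step has at most one successor; the function names are hypothetical.

```python
# Ordering statements taken from the slide, written with term labels only.
statements = [
    ("library preparation", "precedes", "DNA sequencing"),
    ("library sequence deconvolution", "is_preceded_by", "DNA sequencing"),
    ("library sequence deconvolution", "is_followed_by",
     "sequence analysis data transformation"),
]


def to_precedes_edges(stmts):
    """Normalise the three relation spellings into earlier -> later edges."""
    edges = {}
    for subject, relation, obj in stmts:
        if relation in ("precedes", "is_followed_by"):
            edges[subject] = obj
        elif relation == "is_preceded_by":
            edges[obj] = subject
        else:
            raise ValueError(f"not an ordering relation: {relation}")
    return edges


def linear_order(edges):
    """Walk the chain from the only step that nothing else precedes."""
    starts = set(edges) - set(edges.values())
    if len(starts) != 1:
        raise ValueError("expected exactly one starting step")
    step = starts.pop()
    order = [step]
    while step in edges:
        step = edges[step]
        if step in order:
            raise ValueError("cycle in ordering statements")
        order.append(step)
    return order


workflow = linear_order(to_precedes_edges(statements))
```

For these statements `workflow` runs from library preparation through DNA sequencing and library sequence deconvolution to sequence analysis data transformation.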
Conclusions
• We have clarified the OWL representation of several assays commonly used in biodiversity studies.
• We have outlined good practice for serializing biodiversity experimental processes using the ISA, SRA and RDF formats.
• We have shown how synergies among OBO Foundry resources can greatly speed up the development of fit-for-purpose tabular data collection templates, which in turn help compliance with annotation standard guidelines.
Why does it matter?
• Correct assessment of sample size
• Assessing the independence of samples and sampling events
• Is it really possible to ascertain the identity of samples by relying solely on metadata?
• How can such uncertainties affect downstream analysis and meta-analysis?
Future directions
• Represent sample collection protocols and procedures as applied in biodiversity studies (field studies, “Marine macrofauna grab sampling method” and so forth)
• Clarify the reporting of actual results
• Continue working with PCO and related OBO Foundry efforts
Acknowledgements
• Dr. Ramona Walls (iPlant, University of Arizona)
• Prof. Paula Mabee (University of South Dakota)
• RCN: Phenotype Ontology Research Coordination Network, National Science Foundation (NSF-DEB-0956049), (2010 - 2015)
• Dr. Jie Zheng and OBI companions
• PCO coworkers and RCN workshop participants
• ISA Team
• You

Modeling a Microbial Community and Biodiversity Assay with OBI and PCO OBO Foundry Ontologies: The Interoperability Gains of a Modular Approach

  • 1. Modeling a microbial community and biodiversity assay with OBI and PCO: the gains of a modular approach ICBO2014, in Houston Oct 6-9 Philippe Rocca-Serra, Ramona Walls, Jacob Parnell, Rachel Gallery, Jie Zheng, Susanna Assunta Sansone and Alejandra Gonzalez-Beltran
  • 2. Biodiversity in the News • Grim headlines • True for many Vertebrates species • Mankind only now starts to build tools enabling true exploration of diversity
  • 3. Exploring the world's biodiversity • Game-changing progress in sequencing technology – Illumina – Oxford Nanopore MinION http://dx.doi.org/10.5524/100102
  • 5. Biodiversity studies with molecular techniques • Shotgun sequencing: – Sequencing as much as possible (probing is limited by sequencing depth available, the rarer the species, the deeper the sequencing needs to be) • Targeted sequencing: – Reliance on a ‘marker gene’ whose variability will be used to estimate distance between species
  • 6. ‘Barcode’ as in Multiplexed Libraries: genomic DNA isolated from an individual sample is - fragmented (shearing) - ligated to a unique short DNA tag (i.e. the so-called barcode) - PCR amplified and sequenced - output as a single collection of reads, which can subsequently be sorted by computational means using the short DNA tag – the deconvolution process Credits: http://rdp.cme.msu.edu/wiki/index.php/Pyrosequencing_Help
  • 7. ‘Barcode’ as in Barcode of Life Credits: http://www.barcodeoflife.org
  • 8. Ambiguous Language • What is a barcode or what is a barcoding experiment? – Metaphors are impenetrable to computers. – Need to make representation unambiguous – Barcoding, meaning a technique for processing more samples in one go -> another word for multiplexing – Barcoding, meaning the creation of a unique profile as a means to identify types of living things
  • 9. Heaps of sequence data for sure….but • What is the value in the absence of accompanying descriptors? • Essential annotation to ascertain identity and origin, sampling conditions and rationale
  • 10. Helping Data Management • MIXS Guidelines checklist • SRA XML schema, GenBank records… • Tabular Templates for Data Collection • Wealth of RDF conversion tools – R2RML W3C data standards • Even when using the same XML schema and the same guidelines, ambiguities persist
  • 11.
  • 12. ISA templates for Microbial Diversity Studies • Integrating MIXS checklist in the ISA framework • Mapping MIXS entities into SRA XML schema – Properties of sample – Properties of sample processing – Properties of resulting libraries – Properties of data processing
  • 13. Ambiguities: Barcoding • Library, Experiment, Sample uniqueness • Use Case: creation of libraries for Bacteria, Fungi, Eukaryota with specific markers (16S rRNA, ITS, COI) • ISA conversion to ENA: – 1 sample -> 3 libraries • SRA/ENA submission: – 3 libraries -> 3 samples
  • 14. Working with OBI, PCO,SO, CHEBI Drawn using CMAPtools: http://cmap.ihmc.us
  • 15. Working with OBI, PCO,SO, CHEBI Drawn using CMAPtools: http://cmap.ihmc.us
  • 16. OBI-PCO based representation • ‘targeted gene survey’ • has_part some ‘library preparation’ (OBI_0000711) • ‘polymerase chain reaction’ (OBI_0000415) is_part_of ‘library preparation’ (OBI_0000711) • ‘polymerase chain reaction’ (OBI_0000415) • has_specified_input some ‘forward pcr primer’ (OBI_0000722) • has_specified_input some ‘reverse pcr primer’ (OBI_0001951) • has_specified_input some ‘multiplexing sequence identifier’ • has_specified_input some ‘DNA extract’ (OBI_0001051) • ‘library preparation’ (OBI_0000711) has_specified_output some ‘single fragment library’ (OBI_0000736) • ‘library preparation’ (OBI_0000711) precedes ‘DNA sequencing’ (OBI_0000626) • ‘library sequence deconvolution’ is_preceded_by ‘DNA sequencing’ (OBI_0000626) • ‘library sequence deconvolution’ is_followed_by ‘sequence analysis data transformation’ (OBI_0200187) • ‘sequence analysis data transformation’ (OBI_0200187) has_specified_output some ‘data item’ (IAO_0000027) and is_about ‘population quality’ (PCO_0000003)
  • 17. Conclusions • We have clarified the OWL representation of several assays commonly used in biodiversity studies. • We have outlined good practice for serializing biodiversity experimental processes using the ISA, SRA and RDF formats. • We have shown how synergies between OBO Foundry resources can greatly benefit the fast development of fit-for-purpose tabular data collection templates, which greatly help compliance with annotation standard guidelines.
  • 18. Why does it matter? • Correct sample size assessment • Assessing independence of samples and sampling events. • Is it really possible to ascertain the identity of samples by relying solely on metadata? • How can such uncertainties affect downstream analysis / meta-analysis?
  • 19. Future directions • Sample collection protocols and procedures as applied in biodiversity studies (field studies, “Marine macrofauna grab sampling method” and so forth) • Clarify the reporting of actual results • Continue working with PCO and related OBO Foundry efforts.
  • 20. Acknowledgements • Dr. Ramona Walls (iPlant, Uni of Arizona) • Pr. Paula Mabee (Uni South Dakota) • RCN: Phenotype Ontology Research Coordination Network , National Science Foundation (NSF-DEB- 0956049), (2010 - 2015) • Dr. Jie Zheng and OBI companions • PCO coworkers and RCN workshop participants • ISA Team • You
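
The sample-counting ambiguity on slide 13 (1 sample -> 3 libraries versus 3 libraries -> 3 samples) can be illustrated with a minimal Python sketch. The record layout, the field names (`library`, `source_sample`, `target_marker`) and the sample name `soil_A` are hypothetical, chosen only for illustration; they are not taken from the ISA or SRA schemas.

```python
# Hypothetical records: one environmental sample amplified with three
# primer sets, giving three libraries (one per marker).
libraries = [
    {"library": "lib_16S", "source_sample": "soil_A", "target_marker": "16S rRNA"},
    {"library": "lib_ITS", "source_sample": "soil_A", "target_marker": "ITS"},
    {"library": "lib_COI", "source_sample": "soil_A", "target_marker": "COI"},
]


def count_samples_linked(libs):
    """Reading in which libraries keep a link to their shared source sample."""
    return len({lib["source_sample"] for lib in libs})


def count_samples_per_library(libs):
    """Reading in which every library is registered with its own sample record."""
    return len(libs)


print(count_samples_linked(libraries))       # distinct source samples
print(count_samples_per_library(libraries))  # one sample record per library
```

The two functions read the same three libraries yet report different sample sizes, which is the discrepancy the slide points at.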

Editor's Notes

  1. Biodiversity science is the field concerned with documenting the Earth’s life forms wherever they are found. For vertebrates and many macroscopic species, the outlook seems grim, as recent headlines show, exemplified here by a BBC News title from September 30th. This is all the more troubling because we are only starting to have the molecular tools needed to probe life's very diverse niches.
  2. We are all well aware of the advances in sequencing technologies, with Illumina instruments dominating the market. While those instruments are fast, they are still bulky, and competitors are working hard to develop much smaller alternatives, such as the USB-connected Oxford Nanopore MinION nanopore sequencer shown here, for which a first dataset has been published in BMC GigaScience.
  3. For a long time, scientists have been limited in their exploration by the ‘lens’ through which they were looking. Nowhere is this more evident than in microbiology, where only what could be grown in lab conditions could be characterized. The advent of fast, accurate sequencing techniques opened entirely new horizons for the exploration of life. Here are a few examples, from our happy scientists at the zoology department in Oxford, collecting new deep-sea samples, to colleagues monitoring extreme habitats such as mining waters. Other projects, such as Tara Oceans, retrace some of the sea routes followed by the 15th-century explorers in an attempt to provide a snapshot of marine biodiversity. Finally, biodiversity is within us too, as shown in this famous Nature article and by projects such as the American Gut.
  4. When it comes to biodiversity studies relying on sequencing techniques, there are in fact two main approaches: global or targeted. In the first case, one tries to sequence as much as possible, which means as deep as possible in order to capture the rarest (i.e. least abundant) species. But deep sequencing is expensive and requires long machine time, which can be an issue with a limited number of instruments and a vast number of samples to process. The other approach is much more parsimonious but only provides an indirect measure of biodiversity. The technique relies on identifying a genomic region specific to a taxonomic group, but variable enough to estimate the spread of taxa within that group. Such genomic regions are often coding genes inherited from a common ancestor; the mutations they have accumulated can be used as a proxy to estimate the distance between relatives. For Bacteria, the 16S rRNA gene is used; for Fungi, hyper-variable ITS regions are the prime tool; and the COI gene is often used for Eukaryotes.
  5. This brings the need to disambiguate two very distinct meanings (even though related in their metaphor) of the notion of ‘barcode’. You will remember that we mentioned instrument occupancy as a remaining bottleneck (along with reagent costs). Here, multiplexing techniques offer an extremely valuable way to increase throughput. Advances in the computational treatment of sequencing reads made it possible to devise library construction techniques that allow tagged samples to be pooled, so that one single reaction well can be used to produce signal. Since the genomic DNA of each sample has been tagged with a ‘multiplex identifier’ (MID), colloquially called a ‘barcode’, it is possible to apply a deconvolution protocol and group together all sequencing reads associated with a tag, and therefore with a sample. This is the first meaning of ‘barcode sequencing’ in the field.
  6. But ‘barcode’ is also encountered in the ‘Barcode of Life’ project. Here the aim is to define a true nucleic acid profile (if possible in a single gene region) that would uniquely identify a given species. This slide shows the overall workflow and ambition.
  7. (all on the slide)
  8. Fine, huge amounts of sequencing data are being generated but those will be of little value if contextual data is missing. The criticality of such annotation has been outlined in a NatBiotech paper from 2011 by Yilmaz et al., who published the MIXS/MIMARKS minimal information specification. This work was carried out under the Genomic Standards Consortium (GSC) initiative.
  9. The MIXS/MIMARKS checklist provides a framework detailing which metadata to collect, with specific requirements for specific sample types. It is meant to facilitate the exchange of data between centres collecting and archiving environmental samples. We will now show how these guidelines have been implemented by the ISA Team, which generated a set of configurations defining data collection templates.
  10. A quick introduction to the ISA tool suite, which supports data collection, persistence and conversion to a set of formats supported by public repositories. The ecosystem revolves around the ISA-TAB format and supports massively parallel datasets. The slide shows a gradient from left to right, from configuration (annotation guidelines) and curation tools to analysis and usage; people can choose the path that is most convenient for their use case. More recently, we became involved with publishers (NPG and BMC GigaScience).
  11. The main job consisted of two steps. Step 1 was to create the ISA configurations from the MIMARKS guidelines. This meant binning the metadata tags defined by GSC to the relevant ISA syntactic element. For instance, MIXS geo_loc (geographical location) has been mapped to the ISA Source Name element, while ‘collection device’ has been mapped as a Parameter Value associated with a ‘Protocol’. The screenshot shown here illustrates the ‘distribution’ of MIMARKS tags over an ISA workflow, showing only the annotation related to library preparation and data acquisition. Step 2 consisted of adjusting the ISA SRA converter and mapping the metadata into SRA schema objects. This is where we realized that the same information (MIMARKS) can be mapped in different ways to the same schema (SRA).
  12. The example we consider here is that of an environmental gene survey performed on the same sample but using 3 different sets of PCR primers to amplify genomic regions targeting 3 different taxonomic groups. Following the ISA templates, the conversion retains that feature, i.e. all libraries are recorded as derived from the same sample. However, other tools will create 3 distinct SRA samples, in which case the shared identity of the samples has to be assumed. This experience has been used to fully describe these types of assays in a BFO-based ontological framework, in order to ensure semantic accuracy and avoid such pitfalls. The following 2 slides present a graphical representation (in the form of a concept map) of the ‘targeted gene survey’ assay, exploiting the OBI assay design pattern and augmenting it to accommodate the specifics of the procedure.
  13. This shows the component corresponding to biomaterial sample collection and preservation.
  14. This shows the component corresponding to the biomaterial processing that generates sequencing libraries, preceding the data acquisition and treatment processes, which ultimately produce an information artifact about a population.
  15. The representation can therefore be exploited to convert ISA spreadsheets for this type of information and fully clarify the semantics of the tables. Such a mapping can be fed to the ISA RDF conversion module (LinkedISA) as a means to make biodiversity data more linked. This pattern is independent of the ISA-based representation, but the same representation can be used as a mapping template, thus providing a pattern for representing such data consistently.
  16. Conclusions: (all on the slide, really)
  17. In digging into the details of sequence-based biodiversity assays, we have identified a potential issue in existing representations that affects the ability to accurately assess true sample size. This may result in inconsistencies between the sample size declared in experimental reports and the sample size computed from deposited data. While remedial heuristics can be devised to compensate, they have a cost: those methods have to rely on computing distance metrics over vectors of metadata values to try to infer identity of origin. The key question is how this may influence downstream data analysis.
  18. This leads to a discussion of the future directions that PCO and OBI could look into. These range from capturing the specifics of the sampling procedures used in environmental and biodiversity studies, which rely on a number of protocols and guidelines (the “Marine macrofauna grab sampling method”, to give one example), to clarifying the actual measurements produced by such studies. Ideally, this work happens under the OBO Foundry: as people grow more familiar with its development conventions and practices, cross talk becomes more productive, with term dispatch and composition protocols becoming more refined and detailed. This also encourages cross-domain development and outreach to existing and sometimes overlapping efforts. OBI and IAO are currently outlining a plan for alignment, which is an encouraging signal for the community.
  19. A big thank you to Ramona Walls, Paula Mabee and the RCN Phenotype group for organizing and leading these twin events; to all the participants of the PCO meeting (Robert Guralnick, Pier Luigi Buttigieg and Adam, among others); to Jie Zheng and the OBI folks; and of course to all my colleagues of the ISA Team (Alejandra, Eamonn, Susanna and Milo), and to you for your attention.
  20. I must add a heartfelt acknowledgment, as the meeting meant swapping this (Oxford floods in February) for this (Arizona desert, February, same year). It was nice to be somewhere dry and in such great company.
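
The deconvolution step described in note 5 (grouping pooled reads by their multiplex identifier) can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the protocol used by the authors: it assumes the tag is a fixed-length prefix of each read and is matched exactly, and the function name `demultiplex`, the barcodes and the sample names are hypothetical.

```python
from collections import defaultdict


def demultiplex(reads, barcode_to_sample):
    """Group reads by their leading barcode and strip the barcode.

    Assumes all barcodes have the same length and sit at the start of
    each read; reads with an unknown prefix go to 'unassigned'.
    """
    barcode_len = len(next(iter(barcode_to_sample)))
    by_sample = defaultdict(list)
    for read in reads:
        tag, insert = read[:barcode_len], read[barcode_len:]
        sample = barcode_to_sample.get(tag, "unassigned")
        by_sample[sample].append(insert)
    return dict(by_sample)


# Hypothetical pooled reads from one sequencing run.
barcodes = {"ACGT": "sample_1", "TGCA": "sample_2"}
pooled_reads = ["ACGTTTGGA", "TGCAGGCCT", "ACGTCCAAT", "NNNNAAAA"]
sorted_reads = demultiplex(pooled_reads, barcodes)
```

Real demultiplexing tools typically also tolerate sequencing errors in the tag; exact matching is used here only to keep the sketch short.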