Modeling a Microbial Community and Biodiversity Assay with OBI and PCO OBO Foundry Ontologies: The Interoperability Gains of a Modular Approach

Modeling a microbial community
and biodiversity assay with OBI
and PCO: the gains of a modular
approach
ICBO2014, in Houston Oct 6-9
Philippe Rocca-Serra, Ramona Walls, Jacob Parnell, Rachel Gallery, Jie
Zheng, Susanna-Assunta Sansone and Alejandra Gonzalez-Beltran

Biodiversity in the
News
• Grim headlines
• True for many
vertebrate species
• Mankind is only now
starting to build tools
enabling true
exploration of diversity

Exploring the world's biodiversity
• Game-changing progress in sequencing
technology
– Illumina
– Oxford Nanopore MinION
http://dx.doi.org/10.5524/100102

Biodiversity studies with molecular
techniques
• Shotgun sequencing:
– Sequencing as much as possible (detection is
limited by the available sequencing depth: the
rarer the species, the deeper the sequencing
needs to be)
• Targeted sequencing:
– Reliance on a ‘marker gene’ whose variability
will be used to estimate distance between
species
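The depth requirement for shotgun sequencing can be made concrete with a small calculation. The sketch below is not from the talk: it assumes each read is an independent draw from the community, so a taxon with relative abundance p is seen at least once in N reads with probability 1 - (1 - p)^N. The function names are illustrative.

```python
import math


def detection_probability(relative_abundance: float, n_reads: int) -> float:
    """Chance that at least one of n_reads comes from the taxon.

    Assumes reads are independent draws from the community.
    """
    return 1.0 - (1.0 - relative_abundance) ** n_reads


def reads_needed(relative_abundance: float, confidence: float = 0.95) -> int:
    """Smallest read count giving at least `confidence` chance of detection."""
    # Solve 1 - (1 - p)^N >= confidence for N; log1p keeps precision for tiny p.
    return math.ceil(math.log(1.0 - confidence) / math.log1p(-relative_abundance))


# The rarer the taxon, the more reads are required.
for abundance in (1e-2, 1e-4, 1e-6):
    print(f"abundance {abundance:g}: about {reads_needed(abundance)} reads")
```

Real communities violate the independence assumption (amplification bias, genome size, uneven extraction), so these numbers are lower bounds at best.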

‘Barcode’ as in Multiplexed
Libraries
Genomic DNA isolated from an individual sample is:
-fragmented (shearing)
-ligated to a unique short DNA tag (i.e. the barcode)
-amplified by PCR and sequenced
-The output is a single collection of reads, which can subsequently be sorted
computationally by their short DNA tags (the deconvolution process)
Credits: http://rdp.cme.msu.edu/wiki/index.php/Pyrosequencing_Help
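The deconvolution step described above can be sketched in a few lines. This is a minimal illustration, not the pipeline used by the authors: it assumes barcodes of equal length located at the start of each read and exact matching only. The sample names and sequences are made up.

```python
from collections import defaultdict


def demultiplex(reads, barcode_to_sample):
    """Sort pooled reads into per-sample bins using their leading barcode.

    Assumes all barcodes have the same length, sit at the 5' end of the
    read, and must match exactly (no mismatch tolerance).
    """
    lengths = {len(barcode) for barcode in barcode_to_sample}
    if len(lengths) != 1:
        raise ValueError("this sketch expects barcodes of equal length")
    k = lengths.pop()

    bins = defaultdict(list)
    for read in reads:
        sample = barcode_to_sample.get(read[:k])
        if sample is None:
            bins["unassigned"].append(read)   # keep the read untouched
        else:
            bins[sample].append(read[k:])     # strip the barcode tag
    return dict(bins)


# Hypothetical barcodes and reads, for illustration only.
barcodes = {"ACGT": "soil_A", "TGCA": "soil_B"}
pooled = ["ACGTTTGACC", "TGCAGGATCC", "ACGTCCCGGG", "GGGGAAAATT"]
by_sample = demultiplex(pooled, barcodes)
```

Production tools usually tolerate one or two mismatches in the tag and handle dual indexing, which this sketch leaves out.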

‘Barcode’ as in Barcode of Life
Credits: http://www.barcodeoflife.org

Ambiguous Language
• What is a barcode or what is a barcoding
experiment?
– Metaphors are impenetrable to computers.
– Need to make representation unambiguous
– Barcoding, meaning a technique for
processing more samples in one go ->
another word for multiplexing
– Barcoding, meaning the creation of a unique
profile as a means to identify types of living
things

Heaps of sequence data for
sure… but
• What is the value in
the absence of
accompanying
descriptors?
• Essential annotation
to ascertain identity
and origin, sampling
conditions and
rationale

Helping Data Management
• MIxS guidelines checklist
• SRA XML schema, GenBank records…
• Tabular templates for data collection
• Wealth of RDF conversion tools
– R2RML (W3C standard)
• Even when using the same XML schema and the same
guidelines, ambiguities persist

ISA templates for Microbial
Diversity Studies
• Integrating the MIxS checklist in the ISA
framework
• Mapping MIxS entities into the SRA XML
schema
– Properties of sample
– Properties of sample processing
– Properties of resulting libraries
– Properties of data processing
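As a rough illustration of the sample-property part of this mapping, the sketch below writes checklist fields as TAG/VALUE pairs inside an SRA-style SAMPLE element. It is an assumption-laden fragment, not the ISA converter: the function name, the alias and the values are hypothetical, and a submittable record needs further elements (such as taxonomy) that are omitted here.

```python
import xml.etree.ElementTree as ET


def sample_to_sra_xml(alias, attributes):
    """Render sample properties as SRA-style SAMPLE_ATTRIBUTE TAG/VALUE pairs.

    Produces only a fragment of a sample record, for illustration.
    """
    sample_set = ET.Element("SAMPLE_SET")
    sample = ET.SubElement(sample_set, "SAMPLE", alias=alias)
    attrs = ET.SubElement(sample, "SAMPLE_ATTRIBUTES")
    for tag, value in attributes.items():
        attr = ET.SubElement(attrs, "SAMPLE_ATTRIBUTE")
        ET.SubElement(attr, "TAG").text = tag
        ET.SubElement(attr, "VALUE").text = str(value)
    return ET.tostring(sample_set, encoding="unicode")


# Hypothetical values keyed by MIxS-style field names.
sample_xml = sample_to_sra_xml(
    "soil_core_1",
    {
        "collection_date": "2014-06-01",
        "geo_loc_name": "USA: Arizona",
        "lat_lon": "32.23 N 110.95 W",
    },
)
print(sample_xml)
```

The same pattern would apply to the library and data processing properties, each targeting its own part of the schema.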

Ambiguities: Barcoding
• Uniqueness of Library, Experiment and Sample
• Use case: creation of libraries for
Bacteria, Fungi and Eukaryota using specific marker genes
(16S rRNA, ITS, COI)
• ISA conversion to ENA:
– 1 sample -> 3 libraries
• SRA/ENA submission:
– 3 libraries -> 3 samples
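The effect of the two layouts on sample counts can be shown with a toy example. This is a sketch under stated assumptions: the record structure, the identifiers and the `derived_from` field are hypothetical and do not correspond to actual ISA or SRA attribute names.

```python
# One physical sample, three marker-gene libraries (the ISA view).
isa_libraries = [
    {"library": "lib_16S", "marker": "16S rRNA", "sample": "soil_core_1"},
    {"library": "lib_ITS", "marker": "ITS", "sample": "soil_core_1"},
    {"library": "lib_COI", "marker": "COI", "sample": "soil_core_1"},
]


def to_one_sample_per_library(libraries):
    """Mimic a submission layout in which every library gets its own sample record."""
    return [
        dict(
            lib,
            sample=f"{lib['sample']}_{lib['library']}",
            derived_from=lib["sample"],  # hypothetical link back to the real sample
        )
        for lib in libraries
    ]


def count_samples(libraries, key="sample"):
    """Number of distinct sample identifiers seen under the given key."""
    return len({lib[key] for lib in libraries})


submitted = to_one_sample_per_library(isa_libraries)
print(count_samples(isa_libraries))               # ISA view
print(count_samples(submitted))                   # submission view
print(count_samples(submitted, "derived_from"))   # recoverable only if the link is kept
```

Without an explicit link back to the physical sample, a downstream consumer counting sample records would overestimate the number of independent samples.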

Working with OBI, PCO, SO, ChEBI
Drawn using CmapTools: http://cmap.ihmc.us

OBI-PCO based representation
• ‘targeted gene survey’
– has_part some ‘library preparation’ (OBI_0000711)
• ‘polymerase chain reaction’ (OBI_0000415) is_part_of ‘library preparation’ (OBI_0000711)
• ‘polymerase chain reaction’ (OBI_0000415)
– has_specified_input some ‘forward pcr primer’ (OBI_0000722)
– has_specified_input some ‘reverse pcr primer’ (OBI_0001951)
– has_specified_input some ‘multiplexing sequence identifier’
– has_specified_input some ‘DNA extract’ (OBI_0001051)
• ‘library preparation’ (OBI_0000711) has_specified_output some ‘single fragment library’ (OBI_0000736)
• ‘library preparation’ (OBI_0000711) precedes ‘DNA sequencing’ (OBI_0000626)
• ‘library sequence deconvolution’ is_preceded_by ‘DNA sequencing’ (OBI_0000626)
• ‘library sequence deconvolution’ is_followed_by ‘sequence analysis data transformation’ (OBI_0200187)
• ‘sequence analysis data transformation’ (OBI_0200187) has_specified_output some ‘data item’ (IAO_0000027) and is about ‘population quality’ (PCO_0000003)
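The ordering relations in this representation can be checked mechanically. The sketch below is a simplification, not the OWL model itself: it flattens the precedes / is_preceded_by / is_followed_by statements into plain (earlier, later) pairs, keeps only the identifiers that appear on the slide, and assumes the standard OBO PURL prefix. Terms listed without an identifier are left as `None`.

```python
from graphlib import TopologicalSorter  # Python 3.9+

OBO_PURL = "http://purl.obolibrary.org/obo/"

# Labels and identifiers as given on the slide.
TERMS = {
    "library preparation": "OBI_0000711",
    "polymerase chain reaction": "OBI_0000415",
    "DNA sequencing": "OBI_0000626",
    "sequence analysis data transformation": "OBI_0200187",
    "library sequence deconvolution": None,  # no identifier given on the slide
}

# Ordering statements rewritten as (earlier, later) pairs.
PRECEDES = [
    ("library preparation", "DNA sequencing"),
    ("DNA sequencing", "library sequence deconvolution"),
    ("library sequence deconvolution", "sequence analysis data transformation"),
]


def iri(label):
    """Expand a term label to its OBO PURL, or None when no identifier is known."""
    term_id = TERMS[label]
    return OBO_PURL + term_id if term_id else None


def process_order(precedes):
    """Return the processes so that each one follows everything that precedes it."""
    sorter = TopologicalSorter()
    for earlier, later in precedes:
        sorter.add(later, earlier)  # `earlier` must come before `later`
    return list(sorter.static_order())


for step in process_order(PRECEDES):
    print(step, iri(step))
```

A cycle in the ordering statements would make `static_order` raise `graphlib.CycleError`, which is a cheap consistency check before loading the axioms into a reasoner.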

Conclusions
• We have clarified the OWL representation of
several assays commonly used in biodiversity
studies.
• We have outlined good practice for serializing
biodiversity experimental processes using the ISA,
SRA and RDF formats.
• We have shown how synergies among
OBO Foundry resources can greatly benefit
the fast development of fit-for-purpose tabular data
collection templates, which in turn help compliance
with annotation standards and guidelines.

Why does it matter?
• Correct sample size assessment
• Assessing independence of samples and
sampling events.
• Is it really possible to ascertain the identity of
samples by relying solely on metadata?
• How can such uncertainties affect
downstream analysis / meta-analysis?

Future directions
• Sample Collection Protocols and
Procedures as applied in biodiversity
studies (field studies, “Marine macrofauna
grab sampling method” and so forth)
• Clarify the reporting of actual results
• Continue working with PCO and related
OBO Foundry efforts.

Acknowledgements
• Dr. Ramona Walls (iPlant, University of Arizona)
• Prof. Paula Mabee (University of South Dakota)
• RCN: Phenotype Ontology Research Coordination
Network, National Science Foundation (NSF-DEB-0956049),
(2010 - 2015)
• Dr. Jie Zheng and OBI companions
• PCO coworkers and RCN workshop participants
• ISA Team
• You
