SlideShare uma empresa Scribd logo
1 de 65
De-centralized but global:
Redesigning biodiversity data aggregation
for improved engagement and impact
Nico Franz, Ed Gilbert & Beckett Sterner
Biodiversity Knowledge Integration Center – Arizona State University
Keynote Session – June 10, 2019
3rd Annual Digital Data Conference – Yale University, New Haven, CO
Structure of presentation
0. Key message: Biodiversity informatics remains young and fresh; especially if
we aim
to incentivize experts/enthusiasts in publishing high-quality,
"data-intelligent" biodiversity data products.
Structure of presentation
0. Key message: Biodiversity informatics remains young and fresh; especially if
we aim
to incentivize experts/enthusiasts in publishing high-quality,
"data-intelligent" biodiversity data products.
1. "New at ASU" (and more generally): The NEON Biorepository Data Portal
–
a Darwin Core-based ecological monitoring and forecasting
resource.
Structure of presentation
0. Key message: Biodiversity informatics remains young and fresh; especially if
we aim
to incentivize experts/enthusiasts in publishing high-quality,
"data-intelligent" biodiversity data products.
1. "New at ASU" (and more generally): The NEON Biorepository Data Portal
–
a Darwin Core-based ecological monitoring and forecasting
resource.
2. Rethinking centralized biodiversity data aggregation: Diagnosis and
components
of a de-centralized complement.
Structure of presentation
0. Key message: Biodiversity informatics remains young and fresh; especially if
we aim
to incentivize experts/enthusiasts in publishing high-quality,
"data-intelligent" biodiversity data products.
1. "New at ASU" (and more generally): The NEON Biorepository Data Portal
–
a Darwin Core-based ecological monitoring and forecasting
resource.
2. Rethinking centralized biodiversity data aggregation: Diagnosis and
components
of a de-centralized complement.
3. Still our frontier: Taxonomic data intelligence for Darwin Core occurrences.
Introducing the NEON Biorepository Data Portal (1)
https://biorepo.neonscience.org
Introducing the NEON Biorepository Data Portal
Introducing the NEON Biorepository Data Portal
Introducing the NEON Biorepository Data Portal
Introducing the NEON Biorepository Data Portal
Introducing the NEON Biorepository Data Portal
More information at Tuesday's poster session!
• Using the portal
• Access to
samples
• New
developments
• Informatics
• Research
• Prospective
M.Sc./ Ph.D.
applicants and
postdocs
• Early and
broad-based
use is key!
A short demonstration
https://biorepo.neonscience.org
[after a short BioRepo portal demo (Culicidae / SRER Plants)] *
Mosquito species richness per site, 2016–2017. Change in species richness per site.
* Thanks to Dr. Kelsey Yule, NEON Biorepository @ ASU.
• NEON organismal sampling produces a unique, taxon-/region-/time-
constrained, change-focused data signal.
Rethinking centralized biodiversity data aggregation (2)
• https://doi.org/10.1093/database/bax10
0
• Critique of monolithic aggregation
designs.
Rethinking centralized biodiversity data aggregation (2)
• https://doi.org/10.1093/database/bax10
0
• Critique of monolithic aggregation
designs.
• Here we reformulate the issue in
relation to the direction of data flow in
hierarchical biodiversity data
aggregation networks.
Rethinking centralized biodiversity data aggregation (2)
• https://doi.org/10.1093/database/bax10
0
• Critique of monolithic aggregation
designs.
• Here we reformulate the issue in
relation to the direction of data flow in
hierarchical biodiversity data
aggregation networks.
• Then we outline a complementary
solution that is not centralized, yet
which can potentially reach global
coverage.
Rethinking centralized biodiversity data aggregation (2)
• https://doi.org/10.1093/database/bax10
0
• Critique of monolithic aggregation
designs.
• Here we reformulate the issue in
relation to the direction of data flow in
hierarchical biodiversity data
aggregation networks.
• Then we outline a complementary
solution that is not centralized, yet
which can potentially reach global
coverage.
• This solution is being developed at
ASU under the project name
Centralized biodiversity data aggregation
• Many collections remain offline.
Centralized biodiversity data aggregation
• Other, individual collections, may have a web presence.
Centralized biodiversity data aggregation
• Multi-collections software applications can bring institutional datasets on-line;
yet this alone may not suffice to publish DwC-Archive data according to
FAIR standards.
Centralized biodiversity data aggregation
• The Integrated Publishing Toolkit (IPT) allows individual or multi-collection
datasets to become discoverable as DwC-Archive packages to higher-level
aggregators.
Centralized biodiversity data aggregation
• There are also web-only, mid-level portal applications that support live
collection management and can publish "up" through the (IPT).
Centralized biodiversity data aggregation
• Symbiota portals also support "snapshot collections" – i.e., periodical,
manually triggered batch re-/uploads of static versions that are live-managed
elsewhere.
Centralized biodiversity data aggregation
• Symbiota portals also have a custom, fully built-in, IPT-analogous "Darwin
Core Archive Publishing" module.
Example of Symbiota's DwC-A Publishing module (SCAN: ASUCOB)
June 10, 2019
Centralized biodiversity data aggregation
• Multiple, community-themed portals – each with unique live/snapshot
collection profiles – can periodically receive reciprocal snapshot updates.
Highest-level aggregators typically only support collection snapshots!
This hierarchy sustains an imbalance in directional data flow:
Annotations on global datasets are hard to pull downwards.
Moreover, by the time we reach the top, most experts/enthusiasts
no longer feel at home (cf. Wenger 2000).
Wenger 2000: Communities of practice and social learning systems
Community Access  Engagement  Quality  Trust  Use &
Impact
Designing for strong data communities
De-centralized, but global
• Independent, themed portal communities maintain live-managed
collections.
De-centralized, but global
• Partial, relevant collection snapshot subsets are represented.
De-centralized, but global
• Even partial, relevant portal snapshot subsets are ingestible, with
provenance.
De-centralized, but global
• Some research-themed portals may only include partial collection snapshots.
De-centralized, but global
• Highly configurable portal-to-portal APIs negotiate the flow of data
between live and snapshot collection instances.
De-centralized, but global
• As API services are optimized, the distinction between live and snapshot
collection management increasingly falls away.
De-centralized, but global
• API service configurations include filtered, source-/sink-approval contingent
data pushes and pulls.
De-centralized, but global
• API service configurations allow filtered {collection, taxon, region, etc.),
source-/sink-approval contingent data pushes and pulls.
De-centralized, but global
• Portal-to-portal API configurations become the "substrate" upon which the
communities realize their "modes of belonging".
De-centralized, but global
• The de-centralized network is broadly extensible between closely (high
data flow) or remotely (low data flow) related communities.
De-centralized, but global
• On the basis of a shared API service culture, a de-centralized data portal
network can potentially grow to attain global coverage.
Designing for expert/enthusiast access
BioCache: Global access through custom research portal instances
• Researchers create and register "via single handshake" a new portal
instance.
• Research instances enable repeatable, global data queries and re-
/ingestion.
BioCache: Global access through custom research portal instances
• Valued-added data can return to all (live) source collections via filtered
pulls.
BioCache: Global access through custom research portal instances
Stay tuned, it's underway
Taxonomic data intelligence for Darwin Core occurrences (3)
• https://doi.org/10.3897/rio.2.e10610
• Engaging expert/enthusiast
communities  need for pluralism
and democracy for and among
taxonomic perspectives in
biodiversity data aggregation designs.
Alignment by Alan Weakley (http://herbarium.unc.edu/flora.htm)
Taxonomic data intelligence for Darwin Core occurrences (3)
• https://doi.org/10.3897/rio.2.e10610
• Engaging expert/enthusiast
communities  need for pluralism
and democracy for and among
taxonomic perspectives in
biodiversity data aggregation designs.
• Spatial reasoning tools (RCC–5) can
help attain consistent and
comprehensive taxonomic meaning
mappings  intelligence for data
integration across evolving or
conflicting views.
Taxonomic data intelligence for Darwin Core occurrences (3)
• https://doi.org/10.3897/rio.2.e10610
• Engaging expert/enthusiast
communities  need for pluralism
and democracy for and among
taxonomic perspectives in
biodiversity data aggregation designs.
• Spatial reasoning tools (RCC–5) can
help attain consistent and
comprehensive taxonomic meaning
mappings  intelligence for data
integration across evolving or
conflicting views.
• Biological inferences become
robust, in relation to "the taxonomic
variable".
Project remains pending, but look!
"Taxonomically intelligent data integration for a new Flora of Alaska"
• NSF DBI 1759964
• PIs Ickert-Bond & Webb
• Reconciling Hulten, FNA &
Pan-Arctic Flora
• See http://alaskaflora.org/
iNaturalist is mostly there already
https://www.inaturalist.org/taxon_framework_relationships
iNaturalist is mostly there already
https://www.inaturalist.org/taxon_framework_relationships
Recent addition: Concept alignment for phylogenomic trees
Franz, N.M. et al. 2019. Verbalizing phylogenomic conflict: […]. https://doi.org/10.1371/journal.pcbi.1006493
Reliable theories of multi-tree node congruence require expert
judgment
Franz, N.M. et al. 2019. Verbalizing phylogenomic conflict: […]. https://doi.org/10.1371/journal.pcbi.1006493
• "Synthesis"
OpenTree
vs. RCC–5
Hopeful conclusions
Key message: Biodiversity informatics remains young and fresh; especially if
we aim
to incentivize experts/enthusiasts in publishing high-quality,
"data-intelligent" biodiversity data products.
Hopeful conclusions
Key message: Biodiversity informatics remains young and fresh; especially if
we aim
to incentivize experts/enthusiasts in publishing high-quality,
"data-intelligent" biodiversity data products.
There are many grassroots or federally supported projects in this domain that
express
enthusiasm for, and confidence in, a strong future for new, high-
volume,
perpetually data-restructuring systematic research; driving our
evolving
views of relations between DwC datasets and biological knowledge.
Hopeful conclusions
Key message: Biodiversity informatics remains young and fresh; especially if
we aim
to incentivize experts/enthusiasts in publishing high-quality,
"data-intelligent" biodiversity data products.
There are many grassroots or federally supported projects in this domain that
express
enthusiasm for, and confidence in, a strong future for new, high-
volume,
perpetually data-restructuring systematic research; driving our
evolving
views of relations between DwC datasets and biological knowledge.
If you have the passion and stomach for that future, join us now!
Acknowledgments
• Gil Nelson, Alnycea Blackwell, Jillian Goodwin, and all other iDigBio and
Yale Peabody organizers and sponsors of #Digidata 2019.
• ASU Biocollections & NEON Biorepository team.
• Euler/X team: Bertram Ludäscher, Shizhuo Yu, Jessica Cheng.
• If you wish to read one paper on aligning taxonomic concepts:
https://doi.org/10.1093/sysbio/syw023

Mais conteúdo relacionado

Mais procurados

Cataloging Landscape Update: RDA and LC Working Group on the Future of Biblio...
Cataloging Landscape Update: RDA and LC Working Group on the Future of Biblio...Cataloging Landscape Update: RDA and LC Working Group on the Future of Biblio...
Cataloging Landscape Update: RDA and LC Working Group on the Future of Biblio...
kramsey
 

Mais procurados (20)

BioDBCore: Current Status and Next Developments
BioDBCore: Current Status and Next DevelopmentsBioDBCore: Current Status and Next Developments
BioDBCore: Current Status and Next Developments
 
uBio presentation to Species 2000 May 2004
uBio presentation to Species 2000 May 2004uBio presentation to Species 2000 May 2004
uBio presentation to Species 2000 May 2004
 
Web services and the Development of Semantic Applications
Web services and the Development of Semantic ApplicationsWeb services and the Development of Semantic Applications
Web services and the Development of Semantic Applications
 
COBWEB - infrastructure and platform for Environmental Crowd Sensing and Big ...
COBWEB - infrastructure and platform for Environmental Crowd Sensing and Big ...COBWEB - infrastructure and platform for Environmental Crowd Sensing and Big ...
COBWEB - infrastructure and platform for Environmental Crowd Sensing and Big ...
 
EMBL Australian Bioinformatics Resource AHM - Data Commons
EMBL Australian Bioinformatics Resource AHM   - Data CommonsEMBL Australian Bioinformatics Resource AHM   - Data Commons
EMBL Australian Bioinformatics Resource AHM - Data Commons
 
Ala dcig-webinar
Ala dcig-webinarAla dcig-webinar
Ala dcig-webinar
 
The Seven Deadly Sins of Bioinformatics
The Seven Deadly Sins of BioinformaticsThe Seven Deadly Sins of Bioinformatics
The Seven Deadly Sins of Bioinformatics
 
Cataloging Landscape Update: RDA and LC Working Group on the Future of Biblio...
Cataloging Landscape Update: RDA and LC Working Group on the Future of Biblio...Cataloging Landscape Update: RDA and LC Working Group on the Future of Biblio...
Cataloging Landscape Update: RDA and LC Working Group on the Future of Biblio...
 
Networking Repositories, Optimizing Impact: Georgia Knowledge Repository Meeting
Networking Repositories, Optimizing Impact: Georgia Knowledge Repository MeetingNetworking Repositories, Optimizing Impact: Georgia Knowledge Repository Meeting
Networking Repositories, Optimizing Impact: Georgia Knowledge Repository Meeting
 
Geospatial metadata and spatial data workshop: 19 June 2014
Geospatial metadata and spatial data workshop: 19 June 2014Geospatial metadata and spatial data workshop: 19 June 2014
Geospatial metadata and spatial data workshop: 19 June 2014
 
STI Summit 2011 - LS4 LS Khaos
STI Summit 2011 - LS4 LS KhaosSTI Summit 2011 - LS4 LS Khaos
STI Summit 2011 - LS4 LS Khaos
 
Carnegie seminar
Carnegie seminarCarnegie seminar
Carnegie seminar
 
Forging the Digital Roadmap: The Preservation, Curation and Stewardship Nexus
Forging the Digital Roadmap: The Preservation, Curation and Stewardship NexusForging the Digital Roadmap: The Preservation, Curation and Stewardship Nexus
Forging the Digital Roadmap: The Preservation, Curation and Stewardship Nexus
 
Introduction to FAIRDOM
Introduction to FAIRDOMIntroduction to FAIRDOM
Introduction to FAIRDOM
 
Introduction to the COBWEB Project, January 2013
Introduction to the COBWEB Project, January 2013Introduction to the COBWEB Project, January 2013
Introduction to the COBWEB Project, January 2013
 
Elixir at de.nbi meeting
Elixir at de.nbi meetingElixir at de.nbi meeting
Elixir at de.nbi meeting
 
An Oz Mammals Bioinformatics and Data Resource
An Oz Mammals Bioinformatics and Data ResourceAn Oz Mammals Bioinformatics and Data Resource
An Oz Mammals Bioinformatics and Data Resource
 
dkNET Poster Experimental Biology 2019
dkNET Poster Experimental Biology 2019dkNET Poster Experimental Biology 2019
dkNET Poster Experimental Biology 2019
 
Opportunities and Challenges for International Cooperation Around Big Data
Opportunities and Challenges for International Cooperation Around Big DataOpportunities and Challenges for International Cooperation Around Big Data
Opportunities and Challenges for International Cooperation Around Big Data
 
FAIR Data, Operations and Model management for Systems Biology and Systems Me...
FAIR Data, Operations and Model management for Systems Biology and Systems Me...FAIR Data, Operations and Model management for Systems Biology and Systems Me...
FAIR Data, Operations and Model management for Systems Biology and Systems Me...
 

Semelhante a De-centralized but global: Redesigning biodiversity data aggregation for improved engagement and impact

10th e concertation-brussels-06march2013-v2
10th e concertation-brussels-06march2013-v210th e concertation-brussels-06march2013-v2
10th e concertation-brussels-06march2013-v2
Alex Hardisty
 
Cross-Community User Requirements and the Biodiversity Heritage Library
Cross-Community User Requirements and the Biodiversity Heritage LibraryCross-Community User Requirements and the Biodiversity Heritage Library
Cross-Community User Requirements and the Biodiversity Heritage Library
Chris Freeland
 
Connecting the Dots: Linking Digitized Collections Across Metadata Silos
Connecting the Dots: Linking Digitized Collections Across Metadata SilosConnecting the Dots: Linking Digitized Collections Across Metadata Silos
Connecting the Dots: Linking Digitized Collections Across Metadata Silos
OCLC
 
Gbrds Summary Final July2009 (2)
Gbrds Summary Final July2009 (2)Gbrds Summary Final July2009 (2)
Gbrds Summary Final July2009 (2)
Vishwas Chavan
 

Semelhante a De-centralized but global: Redesigning biodiversity data aggregation for improved engagement and impact (20)

Using e-Infrastructures for Biodiversity Conservation
Using e-Infrastructures for Biodiversity ConservationUsing e-Infrastructures for Biodiversity Conservation
Using e-Infrastructures for Biodiversity Conservation
 
2 Discovery and Acquisition of Data1.pptx
2 Discovery and Acquisition of Data1.pptx2 Discovery and Acquisition of Data1.pptx
2 Discovery and Acquisition of Data1.pptx
 
NHM Data Portal: first steps toward the Graph-of-Life
NHM Data Portal: first steps toward the Graph-of-LifeNHM Data Portal: first steps toward the Graph-of-Life
NHM Data Portal: first steps toward the Graph-of-Life
 
NHM Data Portal: first steps toward the Graph-of-Life
NHM Data Portal: first steps toward the Graph-of-LifeNHM Data Portal: first steps toward the Graph-of-Life
NHM Data Portal: first steps toward the Graph-of-Life
 
AgriVIVO: A Global Ontology-Driven RDF Store Based on a Distributed Architect...
AgriVIVO: A Global Ontology-Driven RDF Store Based on a Distributed Architect...AgriVIVO: A Global Ontology-Driven RDF Store Based on a Distributed Architect...
AgriVIVO: A Global Ontology-Driven RDF Store Based on a Distributed Architect...
 
10th e concertation-brussels-06march2013-v2
10th e concertation-brussels-06march2013-v210th e concertation-brussels-06march2013-v2
10th e concertation-brussels-06march2013-v2
 
EcsiNeurosciences Information Framework (NIF): An example of community Cyberi...
EcsiNeurosciences Information Framework (NIF): An example of community Cyberi...EcsiNeurosciences Information Framework (NIF): An example of community Cyberi...
EcsiNeurosciences Information Framework (NIF): An example of community Cyberi...
 
Neurosciences Information Framework (NIF): An example of community Cyberinfra...
Neurosciences Information Framework (NIF): An example of community Cyberinfra...Neurosciences Information Framework (NIF): An example of community Cyberinfra...
Neurosciences Information Framework (NIF): An example of community Cyberinfra...
 
Mendeley Data: Enhancing Data Discovery, Sharing and Reuse
Mendeley Data: Enhancing Data Discovery, Sharing and ReuseMendeley Data: Enhancing Data Discovery, Sharing and Reuse
Mendeley Data: Enhancing Data Discovery, Sharing and Reuse
 
Knowledge Organization System (KOS) for biodiversity information resources, G...
Knowledge Organization System (KOS) for biodiversity information resources, G...Knowledge Organization System (KOS) for biodiversity information resources, G...
Knowledge Organization System (KOS) for biodiversity information resources, G...
 
The BlueBRIDGE approach to collaborative research
The BlueBRIDGE approach to collaborative researchThe BlueBRIDGE approach to collaborative research
The BlueBRIDGE approach to collaborative research
 
Scratchpads introductory presentation 45mins
Scratchpads introductory presentation   45minsScratchpads introductory presentation   45mins
Scratchpads introductory presentation 45mins
 
Creating Sustainable Communities in Open Data Resources: The eagle-i and VIVO...
Creating Sustainable Communities in Open Data Resources: The eagle-i and VIVO...Creating Sustainable Communities in Open Data Resources: The eagle-i and VIVO...
Creating Sustainable Communities in Open Data Resources: The eagle-i and VIVO...
 
Descriptive Standards and Applications in Memory Institutions
Descriptive Standards and Applications in Memory InstitutionsDescriptive Standards and Applications in Memory Institutions
Descriptive Standards and Applications in Memory Institutions
 
iMicrobe_ASLO_2015
iMicrobe_ASLO_2015iMicrobe_ASLO_2015
iMicrobe_ASLO_2015
 
Presentation on entrez as used in bioinformatics
Presentation on entrez as used in bioinformaticsPresentation on entrez as used in bioinformatics
Presentation on entrez as used in bioinformatics
 
Cross-Community User Requirements and the Biodiversity Heritage Library
Cross-Community User Requirements and the Biodiversity Heritage LibraryCross-Community User Requirements and the Biodiversity Heritage Library
Cross-Community User Requirements and the Biodiversity Heritage Library
 
Bridging Environmental Data Providers and SeaDataNet DIVA Service within a Co...
Bridging Environmental Data Providers and SeaDataNet DIVA Service within a Co...Bridging Environmental Data Providers and SeaDataNet DIVA Service within a Co...
Bridging Environmental Data Providers and SeaDataNet DIVA Service within a Co...
 
Connecting the Dots: Linking Digitized Collections Across Metadata Silos
Connecting the Dots: Linking Digitized Collections Across Metadata SilosConnecting the Dots: Linking Digitized Collections Across Metadata Silos
Connecting the Dots: Linking Digitized Collections Across Metadata Silos
 
Gbrds Summary Final July2009 (2)
Gbrds Summary Final July2009 (2)Gbrds Summary Final July2009 (2)
Gbrds Summary Final July2009 (2)
 

Mais de taxonbytes

Franz et al 2015 escjam 2015 logic resolution taxonomic variable
Franz et al 2015 escjam 2015 logic resolution taxonomic variableFranz et al 2015 escjam 2015 logic resolution taxonomic variable
Franz et al 2015 escjam 2015 logic resolution taxonomic variable
taxonbytes
 
Zhang Franz ESCJAM 2015 Exophthalmus Reclassification
Zhang Franz ESCJAM 2015 Exophthalmus ReclassificationZhang Franz ESCJAM 2015 Exophthalmus Reclassification
Zhang Franz ESCJAM 2015 Exophthalmus Reclassification
taxonbytes
 
Franz 2015 SPNHC Taxonomic concept resolution for voucher-based biodiversity ...
Franz 2015 SPNHC Taxonomic concept resolution for voucher-based biodiversity ...Franz 2015 SPNHC Taxonomic concept resolution for voucher-based biodiversity ...
Franz 2015 SPNHC Taxonomic concept resolution for voucher-based biodiversity ...
taxonbytes
 

Mais de taxonbytes (20)

Anzaldo franz 2017 ecn your daily weevil
Anzaldo franz 2017 ecn your daily weevilAnzaldo franz 2017 ecn your daily weevil
Anzaldo franz 2017 ecn your daily weevil
 
Franz et al 2017 ecn creating and publishing a symbiota based checklist version
Franz et al 2017 ecn creating and publishing a symbiota based checklist versionFranz et al 2017 ecn creating and publishing a symbiota based checklist version
Franz et al 2017 ecn creating and publishing a symbiota based checklist version
 
Franz 2017 sols cbs seminar the limits of synthesis for integrative biology
Franz 2017 sols cbs seminar the limits of synthesis for integrative biologyFranz 2017 sols cbs seminar the limits of synthesis for integrative biology
Franz 2017 sols cbs seminar the limits of synthesis for integrative biology
 
Franz 2017 uiuc cirss non unitary syntheses of systematic knowledge
Franz 2017 uiuc cirss non unitary syntheses of systematic knowledgeFranz 2017 uiuc cirss non unitary syntheses of systematic knowledge
Franz 2017 uiuc cirss non unitary syntheses of systematic knowledge
 
Franz et al tdwg 2016 new developments for libraries of life
Franz et al tdwg 2016 new developments for libraries of lifeFranz et al tdwg 2016 new developments for libraries of life
Franz et al tdwg 2016 new developments for libraries of life
 
Franz et al tdwg 2016 introducing lep net
Franz et al tdwg 2016 introducing lep netFranz et al tdwg 2016 introducing lep net
Franz et al tdwg 2016 introducing lep net
 
Franz et al TDWG 2016 Updates on multiple neotropical symbiota portals
Franz et al TDWG 2016 Updates on multiple neotropical symbiota portalsFranz et al TDWG 2016 Updates on multiple neotropical symbiota portals
Franz et al TDWG 2016 Updates on multiple neotropical symbiota portals
 
Franz sterner tdwg 2016 new power balance needed for trustworthy biodiversity...
Franz sterner tdwg 2016 new power balance needed for trustworthy biodiversity...Franz sterner tdwg 2016 new power balance needed for trustworthy biodiversity...
Franz sterner tdwg 2016 new power balance needed for trustworthy biodiversity...
 
Franz ludaescher tdwg 2016 an update on taxonomic concept reasoning
Franz ludaescher tdwg 2016 an update on taxonomic concept reasoningFranz ludaescher tdwg 2016 an update on taxonomic concept reasoning
Franz ludaescher tdwg 2016 an update on taxonomic concept reasoning
 
Franz Zhang et al Weevil Workshop 2016 Neotropical Entiminae Systematics evol...
Franz Zhang et al Weevil Workshop 2016 Neotropical Entiminae Systematics evol...Franz Zhang et al Weevil Workshop 2016 Neotropical Entiminae Systematics evol...
Franz Zhang et al Weevil Workshop 2016 Neotropical Entiminae Systematics evol...
 
Franz et al ice 2016 addressing the name meaning drift challenge in open ende...
Franz et al ice 2016 addressing the name meaning drift challenge in open ende...Franz et al ice 2016 addressing the name meaning drift challenge in open ende...
Franz et al ice 2016 addressing the name meaning drift challenge in open ende...
 
Zhang et al ecn 2016 building an accessible weevil tissue collection for geno...
Zhang et al ecn 2016 building an accessible weevil tissue collection for geno...Zhang et al ecn 2016 building an accessible weevil tissue collection for geno...
Zhang et al ecn 2016 building an accessible weevil tissue collection for geno...
 
Franz et al evol 2016 aligning multipe incongruent phylogenies with the euler...
Franz et al evol 2016 aligning multipe incongruent phylogenies with the euler...Franz et al evol 2016 aligning multipe incongruent phylogenies with the euler...
Franz et al evol 2016 aligning multipe incongruent phylogenies with the euler...
 
Zhang et al evol 2016 beyond otus phylogenetic identification of bacterial sy...
Zhang et al evol 2016 beyond otus phylogenetic identification of bacterial sy...Zhang et al evol 2016 beyond otus phylogenetic identification of bacterial sy...
Zhang et al evol 2016 beyond otus phylogenetic identification of bacterial sy...
 
Franz et al evol 2016 representing phylogeny as a logically tractable variable
Franz et al evol 2016 representing phylogeny as a logically tractable variable  Franz et al evol 2016 representing phylogeny as a logically tractable variable
Franz et al evol 2016 representing phylogeny as a logically tractable variable
 
Franz 2016 Phenotype RCN Representing Taxonomy and Phylogeny as Logically Tra...
Franz 2016 Phenotype RCN Representing Taxonomy and Phylogeny as Logically Tra...Franz 2016 Phenotype RCN Representing Taxonomy and Phylogeny as Logically Tra...
Franz 2016 Phenotype RCN Representing Taxonomy and Phylogeny as Logically Tra...
 
Franz et al 2015 escjam 2015 logic resolution taxonomic variable
Franz et al 2015 escjam 2015 logic resolution taxonomic variableFranz et al 2015 escjam 2015 logic resolution taxonomic variable
Franz et al 2015 escjam 2015 logic resolution taxonomic variable
 
Zhang Franz ESCJAM 2015 Exophthalmus Reclassification
Zhang Franz ESCJAM 2015 Exophthalmus ReclassificationZhang Franz ESCJAM 2015 Exophthalmus Reclassification
Zhang Franz ESCJAM 2015 Exophthalmus Reclassification
 
Franz cobb seltmann 2015 spnhc current state of arthropod biodiversity data
Franz cobb seltmann 2015 spnhc current state of arthropod biodiversity dataFranz cobb seltmann 2015 spnhc current state of arthropod biodiversity data
Franz cobb seltmann 2015 spnhc current state of arthropod biodiversity data
 
Franz 2015 SPNHC Taxonomic concept resolution for voucher-based biodiversity ...
Franz 2015 SPNHC Taxonomic concept resolution for voucher-based biodiversity ...Franz 2015 SPNHC Taxonomic concept resolution for voucher-based biodiversity ...
Franz 2015 SPNHC Taxonomic concept resolution for voucher-based biodiversity ...
 

Último

Module for Grade 9 for Asynchronous/Distance learning
Module for Grade 9 for Asynchronous/Distance learningModule for Grade 9 for Asynchronous/Distance learning
Module for Grade 9 for Asynchronous/Distance learning
levieagacer
 
THE ROLE OF BIOTECHNOLOGY IN THE ECONOMIC UPLIFT.pptx
THE ROLE OF BIOTECHNOLOGY IN THE ECONOMIC UPLIFT.pptxTHE ROLE OF BIOTECHNOLOGY IN THE ECONOMIC UPLIFT.pptx
THE ROLE OF BIOTECHNOLOGY IN THE ECONOMIC UPLIFT.pptx
ANSARKHAN96
 
Digital Dentistry.Digital Dentistryvv.pptx
Digital Dentistry.Digital Dentistryvv.pptxDigital Dentistry.Digital Dentistryvv.pptx
Digital Dentistry.Digital Dentistryvv.pptx
MohamedFarag457087
 
development of diagnostic enzyme assay to detect leuser virus
development of diagnostic enzyme assay to detect leuser virusdevelopment of diagnostic enzyme assay to detect leuser virus
development of diagnostic enzyme assay to detect leuser virus
NazaninKarimi6
 
The Mariana Trench remarkable geological features on Earth.pptx
The Mariana Trench remarkable geological features on Earth.pptxThe Mariana Trench remarkable geological features on Earth.pptx
The Mariana Trench remarkable geological features on Earth.pptx
seri bangash
 
PODOCARPUS...........................pptx
PODOCARPUS...........................pptxPODOCARPUS...........................pptx
PODOCARPUS...........................pptx
Cherry
 
Asymmetry in the atmosphere of the ultra-hot Jupiter WASP-76 b
Asymmetry in the atmosphere of the ultra-hot Jupiter WASP-76 bAsymmetry in the atmosphere of the ultra-hot Jupiter WASP-76 b
Asymmetry in the atmosphere of the ultra-hot Jupiter WASP-76 b
Sérgio Sacani
 

Último (20)

CURRENT SCENARIO OF POULTRY PRODUCTION IN INDIA
CURRENT SCENARIO OF POULTRY PRODUCTION IN INDIACURRENT SCENARIO OF POULTRY PRODUCTION IN INDIA
CURRENT SCENARIO OF POULTRY PRODUCTION IN INDIA
 
Module for Grade 9 for Asynchronous/Distance learning
Module for Grade 9 for Asynchronous/Distance learningModule for Grade 9 for Asynchronous/Distance learning
Module for Grade 9 for Asynchronous/Distance learning
 
Bhiwandi Bhiwandi ❤CALL GIRL 7870993772 ❤CALL GIRLS ESCORT SERVICE In Bhiwan...
Bhiwandi Bhiwandi ❤CALL GIRL 7870993772 ❤CALL GIRLS  ESCORT SERVICE In Bhiwan...Bhiwandi Bhiwandi ❤CALL GIRL 7870993772 ❤CALL GIRLS  ESCORT SERVICE In Bhiwan...
Bhiwandi Bhiwandi ❤CALL GIRL 7870993772 ❤CALL GIRLS ESCORT SERVICE In Bhiwan...
 
Plasmid: types, structure and functions.
Plasmid: types, structure and functions.Plasmid: types, structure and functions.
Plasmid: types, structure and functions.
 
FAIRSpectra - Enabling the FAIRification of Spectroscopy and Spectrometry
FAIRSpectra - Enabling the FAIRification of Spectroscopy and SpectrometryFAIRSpectra - Enabling the FAIRification of Spectroscopy and Spectrometry
FAIRSpectra - Enabling the FAIRification of Spectroscopy and Spectrometry
 
THE ROLE OF BIOTECHNOLOGY IN THE ECONOMIC UPLIFT.pptx
THE ROLE OF BIOTECHNOLOGY IN THE ECONOMIC UPLIFT.pptxTHE ROLE OF BIOTECHNOLOGY IN THE ECONOMIC UPLIFT.pptx
THE ROLE OF BIOTECHNOLOGY IN THE ECONOMIC UPLIFT.pptx
 
Cyanide resistant respiration pathway.pptx
Cyanide resistant respiration pathway.pptxCyanide resistant respiration pathway.pptx
Cyanide resistant respiration pathway.pptx
 
module for grade 9 for distance learning
module for grade 9 for distance learningmodule for grade 9 for distance learning
module for grade 9 for distance learning
 
Concept of gene and Complementation test.pdf
Concept of gene and Complementation test.pdfConcept of gene and Complementation test.pdf
Concept of gene and Complementation test.pdf
 
Digital Dentistry.Digital Dentistryvv.pptx
Digital Dentistry.Digital Dentistryvv.pptxDigital Dentistry.Digital Dentistryvv.pptx
Digital Dentistry.Digital Dentistryvv.pptx
 
development of diagnostic enzyme assay to detect leuser virus
development of diagnostic enzyme assay to detect leuser virusdevelopment of diagnostic enzyme assay to detect leuser virus
development of diagnostic enzyme assay to detect leuser virus
 
Gwalior ❤CALL GIRL 84099*07087 ❤CALL GIRLS IN Gwalior ESCORT SERVICE❤CALL GIRL
Gwalior ❤CALL GIRL 84099*07087 ❤CALL GIRLS IN Gwalior ESCORT SERVICE❤CALL GIRLGwalior ❤CALL GIRL 84099*07087 ❤CALL GIRLS IN Gwalior ESCORT SERVICE❤CALL GIRL
Gwalior ❤CALL GIRL 84099*07087 ❤CALL GIRLS IN Gwalior ESCORT SERVICE❤CALL GIRL
 
Clean In Place(CIP).pptx .
Clean In Place(CIP).pptx                 .Clean In Place(CIP).pptx                 .
Clean In Place(CIP).pptx .
 
FS P2 COMBO MSTA LAST PUSH past exam papers.
FS P2 COMBO MSTA LAST PUSH past exam papers.FS P2 COMBO MSTA LAST PUSH past exam papers.
FS P2 COMBO MSTA LAST PUSH past exam papers.
 
The Mariana Trench remarkable geological features on Earth.pptx
The Mariana Trench remarkable geological features on Earth.pptxThe Mariana Trench remarkable geological features on Earth.pptx
The Mariana Trench remarkable geological features on Earth.pptx
 
PODOCARPUS...........................pptx
PODOCARPUS...........................pptxPODOCARPUS...........................pptx
PODOCARPUS...........................pptx
 
Call Girls Ahmedabad +917728919243 call me Independent Escort Service
Call Girls Ahmedabad +917728919243 call me Independent Escort ServiceCall Girls Ahmedabad +917728919243 call me Independent Escort Service
Call Girls Ahmedabad +917728919243 call me Independent Escort Service
 
Genome sequencing,shotgun sequencing.pptx
Genome sequencing,shotgun sequencing.pptxGenome sequencing,shotgun sequencing.pptx
Genome sequencing,shotgun sequencing.pptx
 
Asymmetry in the atmosphere of the ultra-hot Jupiter WASP-76 b
Asymmetry in the atmosphere of the ultra-hot Jupiter WASP-76 bAsymmetry in the atmosphere of the ultra-hot Jupiter WASP-76 b
Asymmetry in the atmosphere of the ultra-hot Jupiter WASP-76 b
 
Site specific recombination and transposition.........pdf
Site specific recombination and transposition.........pdfSite specific recombination and transposition.........pdf
Site specific recombination and transposition.........pdf
 

De-centralized but global: Redesigning biodiversity data aggregation for improved engagement and impact

  • 1. De-centralized but global: Redesigning biodiversity data aggregation for improved engagement and impact Nico Franz, Ed Gilbert & Beckett Sterner Biodiversity Knowledge Integration Center – Arizona State University Keynote Session – June 10, 2019 3rd Annual Digital Data Conference – Yale University, New Haven, CO
  • 2. Structure of presentation 0. Key message: Biodiversity informatics remains young and fresh; especially if we aim to incentivize experts/enthusiasts in publishing high-quality, "data-intelligent" biodiversity data products.
  • 3. Structure of presentation 0. Key message: Biodiversity informatics remains young and fresh; especially if we aim to incentivize experts/enthusiasts in publishing high-quality, "data-intelligent" biodiversity data products. 1. "New at ASU" (and more generally): The NEON Biorepository Data Portal – a Darwin Core-based ecological monitoring and forecasting resource.
  • 4. Structure of presentation 0. Key message: Biodiversity informatics remains young and fresh; especially if we aim to incentivize experts/enthusiasts in publishing high-quality, "data-intelligent" biodiversity data products. 1. "New at ASU" (and more generally): The NEON Biorepository Data Portal – a Darwin Core-based ecological monitoring and forecasting resource. 2. Rethinking centralized biodiversity data aggregation: Diagnosis and components of a de-centralized complement.
  • 5. Structure of presentation 0. Key message: Biodiversity informatics remains young and fresh; especially if we aim to incentivize experts/enthusiasts in publishing high-quality, "data-intelligent" biodiversity data products. 1. "New at ASU" (and more generally): The NEON Biorepository Data Portal – a Darwin Core-based ecological monitoring and forecasting resource. 2. Rethinking centralized biodiversity data aggregation: Diagnosis and components of a de-centralized complement. 3. Still our frontier: Taxonomic data intelligence for Darwin Core occurrences.
  • 6. Introducing the NEON Biorepository Data Portal (1) https://biorepo.neonscience.org
  • 7. Introducing the NEON Biorepository Data Portal
  • 8. Introducing the NEON Biorepository Data Portal
  • 9. Introducing the NEON Biorepository Data Portal
  • 10. Introducing the NEON Biorepository Data Portal
  • 11. Introducing the NEON Biorepository Data Portal
  • 12. More information at Tuesday's poster session! • Using the portal • Access to samples • New developments • Informatics • Research • Prospective M.Sc./ Ph.D. applicants and postdocs • Early and broad-based use is key!
  • 14. [after a short BioRepo portal demo (Culicidae / SRER Plants)] * Mosquito species richness per site, 2016–2017. Change in species richness per site. * Thanks to Dr. Kelsey Yule, NEON Biorepository @ ASU. • NEON organismal sampling produces a unique, taxon-/region-/time- constrained, change-focused data signal.
  • 15. Rethinking centralized biodiversity data aggregation (2) • https://doi.org/10.1093/database/bax10 0 • Critique of monolithic aggregation designs.
  • 16.
  • 17. Rethinking centralized biodiversity data aggregation (2) • https://doi.org/10.1093/database/bax10 0 • Critique of monolithic aggregation designs. • Here we reformulate the issue in relation to the direction of data flow in hierarchical biodiversity data aggregation networks.
  • 18. Rethinking centralized biodiversity data aggregation (2) • https://doi.org/10.1093/database/bax10 0 • Critique of monolithic aggregation designs. • Here we reformulate the issue in relation to the direction of data flow in hierarchical biodiversity data aggregation networks. • Then we outline a complementary solution that is not centralized, yet which can potentially reach global coverage.
  • 19. Rethinking centralized biodiversity data aggregation (2) • https://doi.org/10.1093/database/bax10 0 • Critique of monolithic aggregation designs. • Here we reformulate the issue in relation to the direction of data flow in hierarchical biodiversity data aggregation networks. • Then we outline a complementary solution that is not centralized, yet which can potentially reach global coverage. • This solution is being developed at ASU under the project name
  • 20. Centralized biodiversity data aggregation • Many collections remain offline.
  • 21. Centralized biodiversity data aggregation • Other, individual collections, may have a web presence.
  • 22. Centralized biodiversity data aggregation • Multi-collections software applications can bring institutional datasets on-line; yet this alone may not suffice to publish DwC-Archive data according to FAIR standards.
  • 23. Centralized biodiversity data aggregation • The Integrated Publishing Toolkit (IPT) allows individual or multi-collection datasets to become discoverable as DwC-Archive packages to higher-level aggregators.
  • 24. Centralized biodiversity data aggregation • There are also web-only, mid-level portal applications that support live collection management and can publish "up" through the (IPT).
  • 25. Centralized biodiversity data aggregation • Symbiota portals also support "snapshot collections" – i.e., periodical, manually triggered batch re-/uploads of static versions that are live-managed elsewhere.
  • 26. Centralized biodiversity data aggregation • Symbiota portals also have a custom, fully built-in, IPT-analogous "Darwin Core Archive Publishing" module.
  • 27. Example of Symbiota's DwC-A Publishing module (SCAN: ASUCOB) June 10, 2019
  • 28. Centralized biodiversity data aggregation • Multiple, community-themed portals – each with unique live/snapshot collection profiles – can periodically receive reciprocal snapshot updates.
  • 29. Highest-level aggregators typically only support collection snapshots!
  • 30. This hierarchy sustains an imbalance in directional data flow: Annotations on global datasets are hard to pull downwards.
  • 31. Moreover, by the time we reach the top, most experts/enthusiasts no longer feel at home (cf. Wenger 2000).
  • 32. Wenger 2000: Communities of practice and social learning systems
  • 33. Community Access  Engagement  Quality  Trust  Use & Impact
  • 34. Designing for strong data communities
  • 35. De-centralized, but global • Independent, themed portal communities maintain live-managed collections.
  • 36. De-centralized, but global • Partial, relevant collection snapshot subsets are represented.
  • 37. De-centralized, but global • Even partial, relevant portal snapshot subsets are ingestible, with provenance.
  • 38. De-centralized, but global • Some research-themed portals may only include partial collection snapshots.
  • 39. De-centralized, but global • Highly configurable portal-to-portal APIs negotiate the flow of data between live and snapshot collection instances.
  • 40. De-centralized, but global • As API services are optimized, the distinction between live and snapshot collection management increasingly falls away.
  • 41. De-centralized, but global • API service configurations include filtered, source-/sink-approval contingent data pushes and pulls.
  • 42. De-centralized, but global • API service configurations allow filtered {collection, taxon, region, etc.), source-/sink-approval contingent data pushes and pulls.
  • 43. De-centralized, but global • Portal-to-portal API configurations become the "substrate" upon which the communities realize their "modes of belonging".
  • 44. De-centralized, but global • The de-centralized network is broadly extensible between closely (high data flow) or remotely (low data flow) related communities.
  • 45. De-centralized, but global • On the basis of a shared API service culture, a de-centralized data portal network can potentially grow to attain global coverage.
  • 47. BioCache: Global access through custom research portal instances • Researchers create and register "via single handshake" a new portal instance.
  • 48. • Research instances enable repeatable, global data queries and re- /ingestion. BioCache: Global access through custom research portal instances
  • 49. • Valued-added data can return to all (live) source collections via filtered pulls. BioCache: Global access through custom research portal instances
  • 50. Stay tuned, it's underway
  • 51. Taxonomic data intelligence for Darwin Core occurrences (3) • https://doi.org/10.3897/rio.2.e10610 • Engaging expert/enthusiast communities  need for pluralism and democracy for and among taxonomic perspectives in biodiversity data aggregation designs.
  • 52. Alignment by Alan Weakley (http://herbarium.unc.edu/flora.htm)
  • 53. Taxonomic data intelligence for Darwin Core occurrences (3) • https://doi.org/10.3897/rio.2.e10610 • Engaging expert/enthusiast communities  need for pluralism and democracy for and among taxonomic perspectives in biodiversity data aggregation designs. • Spatial reasoning tools (RCC–5) can help attain consistent and comprehensive taxonomic meaning mappings  intelligence for data integration across evolving or conflicting views.
  • 54. Taxonomic data intelligence for Darwin Core occurrences (3) • https://doi.org/10.3897/rio.2.e10610 • Engaging expert/enthusiast communities  need for pluralism and democracy for and among taxonomic perspectives in biodiversity data aggregation designs. • Spatial reasoning tools (RCC–5) can help attain consistent and comprehensive taxonomic meaning mappings  intelligence for data integration across evolving or conflicting views. • Biological inferences become robust, in relation to "the taxonomic variable".
  • 55.
  • 57. "Taxonomically intelligent data integration for a new Flora of Alaska" • NSF DBI 1759964 • PIs Ickert-Bond & Webb • Reconciling Hulten, FNA & Pan-Arctic Flora • See http://alaskaflora.org/
  • 58. iNaturalist is mostly there already https://www.inaturalist.org/taxon_framework_relationships
  • 59. iNaturalist is mostly there already https://www.inaturalist.org/taxon_framework_relationships
  • 60. Recent addition: Concept alignment for phylogenomic trees Franz, N.M. et al. 2019. Verbalizing phylogenomic conflict: […]. https://doi.org/10.1371/journal.pcbi.1006493
  • 61. Reliable theories of multi-tree node congruence require expert judgment Franz, N.M. et al. 2019. Verbalizing phylogenomic conflict: […]. https://doi.org/10.1371/journal.pcbi.1006493 • "Synthesis" OpenTree vs. RCC–5
  • 62. Hopeful conclusions Key message: Biodiversity informatics remains young and fresh; especially if we aim to incentivize experts/enthusiasts in publishing high-quality, "data-intelligent" biodiversity data products.
  • 63. Hopeful conclusions Key message: Biodiversity informatics remains young and fresh; especially if we aim to incentivize experts/enthusiasts in publishing high-quality, "data-intelligent" biodiversity data products. There are many grassroots or federally supported projects in this domain that express enthusiasm for, and confidence in, a strong future for new, high- volume, perpetually data-restructuring systematic research; driving our evolving views of relations between DwC datasets and biological knowledge.
  • 64. Hopeful conclusions Key message: Biodiversity informatics remains young and fresh; especially if we aim to incentivize experts/enthusiasts in publishing high-quality, "data-intelligent" biodiversity data products. There are many grassroots or federally supported projects in this domain that express enthusiasm for, and confidence in, a strong future for new, high- volume, perpetually data-restructuring systematic research; driving our evolving views of relations between DwC datasets and biological knowledge. If you have the passion and stomach for that future, join us now!
  • 65. Acknowledgments • Gil Nelson, Alnycea Blackwell, Jillian Goodwin, and all other iDigBio and Yale Peabody organizers and sponsors of #Digidata 2019. • ASU Biocollections & NEON Biorepository team. • Euler/X team: Bertram Ludäscher, Shizhuo Yu, Jessica Cheng. • If you wish to read one paper on aligning taxonomic concepts: https://doi.org/10.1093/sysbio/syw023