A new power balance is needed
for trustworthy biodiversity data
Please
@taxonbytes
Nico Franz¹ & Beckett W. Sterner¹
With contributions by Edward Gilbert¹, Andrew Johnston¹,
Guanyang Zhang¹, Bertram Ludäscher² & Alan Weakley³
¹ School of Life Sciences, Arizona State University
² iSchool, University of Illinois at Urbana-Champaign
³ Herbarium, University of North Carolina at Chapel Hill
TDWG 2016 – Biodiversity Information Standards
December 09, 2016 – Instituto Tecnológico de Costa Rica (#TDWG16)
@ http://www.slideshare.net/taxonbytes/franz-sterner-tdwg-2016-new-power-balance-needed-for-trustworthy-biodiversity-data
Largely derived from doi:10.3897/rio.2.e10610
Premise: We agree that there are significant data quality issues
Aggregated Australian millipede data 'taken to the cleaners'
Aggregators respond to the charges
But this leaves open the question(s):
Who (exactly) is responsible for
how much of each particular issue?
We seem to disagree on the question of responsibility assignment(s)
Source: Belbin et al. 2013. A specialist's audit […]: An 'aggregator's' perspective. doi:10.3897/zookeys.305.5438
Page 73
Often enough, aggregators respond by:
• Acknowledging the general issues and their relevance.
• Pointing to many issues that effectively reside "with the sources".
• Calling for more collaboration across all levels; as well as new tools and
annotation options that "motivate and empower" the research community.
Source: Belbin et al. 2013. A specialist's audit […]: An 'aggregator's' perspective. doi:10.3897/zookeys.305.5438
Page 74
Thesis: For taxonomy integration, this is both wrong and self-defeating
• Many aggregators are designed to impose a single taxonomic hierarchy –
one at a time – onto all taxonomically annotated records.
• By design, these "backbones" are rarely attributable to individual (expert)
authors, but instead are newly created systematic theories that only appear
at the system level.
• Data are aggregated accordingly; yet backbone-driven modifications may
newly disrupt the original integrity of submitted data packages.
• By deflecting on responsibilities, aggregators may cause additional self-harm.
Ultimately, the power balance – as presently built in – must shift to bring
experts back into the process of licensing succinct, trustworthy data packages.
Let's re-diagnose:
What happens in dynamic,
open systems?
Charly Lewisw, CC BY-SA 3.0
Taxonomic views of a frequently revised organismal lineage
• 9 schemata for the NA Cleistes/Cleistesiopsis complex (orchids, "pogonias")
• Vertical sections identify taxonomic concept regions
• Colors identify lineages of taxonomic names (epithets) in use
• There is no consensus! Five incongruent schemata are used concurrently
Source: Franz et al. 2016. Controlling the taxonomic variable: […]. RIO Journal. doi:10.3897/rio.2.e10610
Further diagnosis:
If incongruent taxonomies are endorsed
– locally, provisionally, and democratically –
then what is the impact for
aggregated biodiversity data?
Further diagnosis:
→ Taxonomy becomes a variable
that we need to represent,
and thereby control for
(at the system level)
Example: the Cleistes use case ("Controlling the taxonomic variable")
• Query: "Where do these orchid species occur?"
• Same set of 250 orchid specimens, according to 4 taxonomies.
[Figure panels: The 'consensus'; The 'bible'; The (formerly) federal 'standard'; The 'best', latest regional flora. Annotations: "Just bad"; Expert views are in conflict]
Impact:
Name-based aggregation has created
a novel synthesis that nobody believes in
Solution:
Instead of aggregating an artificial 'consensus', build translation services
Expert views are reconciled
Source: Franz et al. 2016. Controlling the taxonomic variable: […]. RIO Journal. doi:10.3897/rio.2.e10610
Challenges:
How can we redesign aggregation to yield
high-quality biodiversity data packages?
What does this mean for Darwin Core¹
and how we use this aggregation standard?
¹ Wieczorek et al. 2012. Darwin Core: an evolving […]. PLoS ONE 7(1): e29715. doi:10.1371/journal.pone.0029715
Preview of solution with eight steps
• DwC is insufficient, and part of the problem
# 1: Represent only taxonomic concept labels (TCLs)¹
• Syntax (TCL): taxonomic name [author, year, page] sec. source
¹ Multi-taxonomy input/alignment visualizations generated with Euler/X toolkit: https://github.com/EulerProject/EulerX
Cleistes divaricata
sec. Gregg & Catling 1993
Pogonia
sec. Brown & Wunderlin 1997
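The TCL syntax above can be sketched as a small data structure. This is only an illustration under stated assumptions: the class `TCL` and the helper `parse_tcl` are hypothetical names, not part of Darwin Core or the Euler/X toolkit, and the optional [author, year, page] part is simply kept inside the name string.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TCL:
    """A taxonomic concept label: a name as used 'sec.' (according to) a source."""
    name: str    # taxonomic name, optionally with author, year, page
    source: str  # the treatment that circumscribes the concept

    def __str__(self) -> str:
        return f"{self.name} sec. {self.source}"


def parse_tcl(label: str) -> TCL:
    """Split 'name sec. source' into its two parts; reject bare names."""
    name, sep, source = label.partition(" sec. ")
    if not sep or not name.strip() or not source.strip():
        raise ValueError(f"not a taxonomic concept label: {label!r}")
    return TCL(name.strip(), source.strip())


a = parse_tcl("Cleistes divaricata sec. Gregg & Catling 1993")
b = parse_tcl("Cleistes divaricata sec. Smith et al. 2004")

# Same name, different sources: two distinct concept labels.
same_name = a.name == b.name
same_concept_label = a == b
```

Comparing whole labels rather than names is the point of step 1: two records that share a name are not assumed to share a meaning unless they also share a source.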
# 1: DwC score keeping → TCLs are optional; < 1% realized?
• TCL ~ DwC: nameAccordingTo
• SCAN: 19,722 of nearly 9 million records have TCLs (0.2%)
• Because TCL use is not enforced, the standard is less big-data-ready
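A score like the one above could be computed roughly as follows. This is a sketch that assumes records are available as plain dictionaries keyed by Darwin Core term names; the function `tcl_coverage` and the sample records are made up for illustration, and only the term `nameAccordingTo` comes from the slide.

```python
def tcl_coverage(records):
    """Return the fraction of records with a non-empty nameAccordingTo value."""
    if not records:
        return 0.0
    with_tcl = sum(
        1 for r in records if (r.get("nameAccordingTo") or "").strip()
    )
    return with_tcl / len(records)


# Made-up records for illustration; only the DwC term names are real.
records = [
    {"scientificName": "Cleistes divaricata", "nameAccordingTo": "Smith et al. 2004"},
    {"scientificName": "Cleistes divaricata", "nameAccordingTo": ""},
    {"scientificName": "Pogonia"},
    {"scientificName": "Cleistes bifaria", "nameAccordingTo": None},
]
coverage = tcl_coverage(records)  # one of four records carries a source
```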
"Who authors GBIF's Backbone?"
https://storify.com/taxonbytes/who-authors-gbif-s-backbone
# 2: Represent each source coherently (Parent-Child relationships)
• Syntax (PC): TCL1 is a child/parent of TCL2 [where TCL1/2 = same source]
Cleistesiopsis bifaria sec. Pans. & de Barr. 2008
is a child of
Cleistesiopsis sec. Pans. & de Barr. 2008
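The same-source condition on parent-child relationships can be sketched as a simple check. The function `add_parent_child` and the dictionary layout are assumptions made for this example; the two labels are the ones shown above.

```python
def source_of(label):
    """Return the part of a TCL string after ' sec. '."""
    _, sep, source = label.partition(" sec. ")
    if not sep:
        raise ValueError(f"not a taxonomic concept label: {label!r}")
    return source


def add_parent_child(edges, child, parent):
    """Record 'child is a child of parent' only if both cite the same source."""
    if source_of(child) != source_of(parent):
        raise ValueError("parent and child must come from the same source")
    edges[child] = parent


edges = {}
add_parent_child(
    edges,
    "Cleistesiopsis bifaria sec. Pans. & de Barr. 2008",
    "Cleistesiopsis sec. Pans. & de Barr. 2008",
)
```

Rejecting cross-source edges keeps each hierarchy attributable to one treatment, which is what "representing each source coherently" asks for.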
# 2: DwC score keeping → Not (adequately) represented
• PC ~ DwC: genus, family, order (etc.; higherClassification)
• However, higher-level names in DwC are not modeled as TCLs
• Taxonomic coherence of sources cannot be preserved with DwC alone
DwC record with higherClassification
(BDJ)
# 3: Do not force a single hierarchy onto all tip-level TCLs
• Syntax (PC): Tip-level TCL1, TCL2, etc. [where TCL1/2 = different sources]
# 3: DwC score keeping → Optional; not (ever?) practiced
• No PC ~ DwC: infra-/specificEpithet only
• Typically, a single, 'unitary' higher-level classification is represented
• Combinations of algorithmic and social practices achieve the single hierarchy
"Who authors GBIF's Backbone?"
https://storify.com/taxonbytes/who-authors-gbif-s-backbone
# 4: Link TCLs via expert-provided RCC–5 articulations
• Syntax (RCC–5): TCL1 {==, >, <, ><, !} TCL2 [where TCL1/2 = diff. sources]
• RCC–5 = Region Connection Calculus
• 14 articulations provided by: http://tinyurl.com/Weakley-Flora-2015
Cleistes bifaria "Coastal Populations" sec. Smith et al. 2004
== (is congruent with)
Cleistesiopsis oricamporum sec. Brown & Pans. 2009
Source: Thau, D.M. 2010. Reasoning about taxonomies. Thesis, UC Davis. http://gradworks.proquest.com/3422778.pdf
Region Connection Calculus (semantics: set constraints)
== < > >< !
• Two regions N, M are either:
• congruent (N == M)
• properly inclusive (N < M)
• inversely properly inclusive (N > M)
• overlapping (N >< M)
• exclusive of each other (N ! M)
• RCC–5 articulations answer the query: "can we join regions N and M?"
• Taxonomies have multiple RCC–5 alignable components: nodes (parents,
children), node-associated traits, even node-anchoring specimens
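The set-constraint semantics of the five relations can be illustrated with Python sets. This is a minimal sketch that assumes each region is given extensionally as a non-empty set of members (for example specimen identifiers); the function name `rcc5` is hypothetical, and Euler/X itself reasons over logic constraints rather than over explicit member sets.

```python
def rcc5(n: set, m: set) -> str:
    """Return the RCC-5 relation between two non-empty regions given as sets."""
    if not n or not m:
        raise ValueError("regions must be non-empty")
    if n == m:
        return "=="  # congruent
    if n < m:
        return "<"   # N is properly included in M
    if n > m:
        return ">"   # N properly includes M
    if n & m:
        return "><"  # overlapping, neither includes the other
    return "!"       # exclusive of each other


relations = [
    rcc5({1, 2}, {1, 2}),
    rcc5({1}, {1, 2}),
    rcc5({1, 2}, {1}),
    rcc5({1, 2}, {2, 3}),
    rcc5({1}, {2}),
]
```

The five outcomes are mutually exclusive and exhaustive for non-empty regions, which is why exactly one symbol describes any pair.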
# 4: DwC score keeping → Not (adequately) represented
• RCC–5 ~ DwC: accepted(Scientific)Name(Usage), relationshipOfResource,
taxonomicStatus (etc.; nomenclatural relationships)
• Nomenclatural relationships are type-focused, not region-focused
• "Taxonomic Concept Schema" → yes! (however: http://www.tdwg.org/standards/117)
Source: Vane-Wright. 2003. Indifferent philosophy versus […]. Syst. Biodiv. 1: 3–11. doi:10.1017/S1477200003001063
Example:
Milkweed butterflies
Oscillating meanings of the epithet hyalites – 1911 to 2003
Phenotypic diversity
Type-anchored name identity relations
# 5: Identify occurrence records only to TCLs
Records:
EKY39235
MTSU003611
NCSC00040204
…
Records:
BOON8098
CLEMS0061133
WILLI39399
…
Records:
GMUF-0039355
IBE006808
USCH58399
…
Records:
CONV0006268
MDKY00006482
NCU00038930
…
Records:
BRYV0023582, BRYV0023584
KHD00032030, MISS0016604
MMNS000227, NCSC00040206
USMS_000002923, USMS_000002924
VSC0053223, VSC0065528
…
Records:
ARIZ393087
DBG39049
USCH51217
…
Records:
NCU00040710
USCH96248
VSC0053218
…
Records:
CLEMS0012881
FUGR0003293
GA023130
…
Records:
BOON8100
NCSC00040210
SJNM45487
…
Records:
GA023144
LSU00012494
MISS0016608
…
Records:
IBE006810, IND-0012374, MMNS000227
Records:
NY8654
• Syntax (ID): Occurrence / organism is identified to TCL
"CLEMS0012881"
is identified to
Cleistes divaricata sec. Smith et al. 2004
[additional ID metadata]
DwC record with Identification metadata
(BDJ)
# 5: DwC score keeping → ID metadata optional; > 50% realized
• ID ~ DwC: Identification, (date)identified(By), identificationReference
• SCAN: 4,715,277 of nearly 9 million records have ID metadata (52.5%)
• Enforcement… would still also require the use of TCLs
# 6: Generate comprehensive, consistent RCC–5 alignments
• Euler/X is a toolkit that infers logically consistent RCC–5 alignments
• Value-added: MIR – set of Maximally Informative Relations containing
the RCC–5 articulation for every possible TCL pair → scalability
Reasoner inference
# 7: Joining occurrence-to-TCL identifications & RCC–5 alignments
Records:
BOON8098, CLEMS0061133, CONV0006268, EKY39235
GMUF-0039355, IBE006808, IBE006810, IND-0012374
MDKY00006482, MMNS000227, MTSU003611, NCSC00040204
NCU00038930, NY8654, USCH58399, WILLI39399
…
Records:
ARIZ393087, BRYV0023582, BRYV0023584, DBG39049
KHD00032030, MISS0016604, MMNS00022, NCSC00040206
USMS_000002923, USMS_000002924, VSC0053223, VSC0065528
…
Records:
BOON8100, CLEMS0012881, FUGR0003293
GA023130, GA023144, LSU00012494
MISS0016608, NCSC00040210, NCU00040710
SJNM45487, USCH96248, VSC0053218
…
• Specimen integration is fully driven by TCL-to-TCL RCC–5 signals
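How TCL-to-TCL articulations could drive specimen integration is sketched below. Everything here is hypothetical: the record IDs, the labels ("Aus bus sec. Old 1990" and so on), the articulations, and the function `translate` are invented for illustration and do not reproduce the Cleistes alignment. The assumed rule is that a record moves with certainty only when its source concept is congruent with (==) or properly included in (<) the target concept; for > and >< the record is only a candidate.

```python
from collections import defaultdict

# Occurrence-to-TCL identifications (made-up record IDs and labels).
identifications = {
    "R1": "Aus bus sec. Old 1990",
    "R2": "Aus bus sec. Old 1990",
    "R3": "Aus cus sec. Old 1990",
    "R4": "Aus dus sec. Old 1990",
}

# RCC-5 articulations as (source TCL, relation, target TCL), also made up.
articulations = [
    ("Aus bus sec. Old 1990", "<", "Aus bus sec. New 2015"),
    ("Aus cus sec. Old 1990", "<", "Aus bus sec. New 2015"),
    ("Aus dus sec. Old 1990", ">", "Aus dus sec. New 2015"),
    ("Aus dus sec. Old 1990", ">", "Aus eus sec. New 2015"),
]


def translate(identifications, articulations):
    """Group records under target TCLs, separating certain from ambiguous."""
    by_source = defaultdict(list)
    for source, relation, target in articulations:
        by_source[source].append((relation, target))

    certain, ambiguous = defaultdict(set), defaultdict(set)
    for record, tcl in identifications.items():
        for relation, target in by_source.get(tcl, []):
            if relation in ("==", "<"):
                certain[target].add(record)    # source region lies inside target
            elif relation in (">", "><"):
                ambiguous[target].add(record)  # record may or may not belong
            # "!" means the regions are exclusive: the record never belongs
    return dict(certain), dict(ambiguous)


certain, ambiguous = translate(identifications, articulations)
```

In this toy case the two older concepts are lumped into one newer concept without loss, while the record identified to the split concept stays flagged until someone re-identifies it.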
[Figure panels: The 'consensus'; The 'bible'; The (formerly) federal 'standard'; The 'best', latest regional flora. Label: "Controlling the taxonomic variable"]
Impact:
"Please select your preference (A – D);
we can perform all translations"
Source: Franz et al. 2016. Controlling the taxonomic variable: […]. RIO Journal. doi:10.3897/rio.2.e10610
# 8: "Do you trust us now?" Aggregation as a translational service
• We can now respond to queries such as:
• "Show all specimens identified to the taxonomic name Cleistes divaricata"
• Returns many records → resolves incongruent lineage of name usages
• "Now show specimens with the TCL Cleistesiopsis divaricata sec. Weakley 2015"
• Returns record subset → resolving only one narrowly circumscribed concept
• "Now show specimens identified to the TCL Cleistes divaricata sec. RAB 1968,
yet translated into the more granular TCLs sec. Weakley 2015"
• Returns (again) many records, yet represents and contrasts two treatments,
as opposed to providing the ambiguous lineage view (above)
• "Show all specimens with ambiguous 2010/2015 TCL identifications…" (etc.)
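The difference between the first two query types can be sketched in a few lines. The record IDs and their assignments are made up; the TCL strings are taken from the slides, and `by_name` and `by_tcl` are hypothetical helper names rather than functions of any existing portal.

```python
# Made-up record IDs; TCL strings as they appear on the slides.
identifications = {
    "R1": "Cleistes divaricata sec. RAB 1968",
    "R2": "Cleistes divaricata sec. Smith et al. 2004",
    "R3": "Cleistes divaricata sec. Smith et al. 2004",
    "R4": "Cleistesiopsis divaricata sec. Weakley 2015",
}


def by_name(identifications, name):
    """Name-based query: ignores the source, so it mixes concept lineages."""
    return sorted(
        record
        for record, tcl in identifications.items()
        if tcl.partition(" sec. ")[0] == name
    )


def by_tcl(identifications, label):
    """Concept-based query: matches one name-plus-source label exactly."""
    return sorted(
        record for record, tcl in identifications.items() if tcl == label
    )


name_hits = by_name(identifications, "Cleistes divaricata")
tcl_hits = by_tcl(identifications, "Cleistesiopsis divaricata sec. Weakley 2015")
```

The third query type would additionally need the RCC–5 articulations between the two treatments, which this sketch leaves out.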
Conclusion – designing trusted biodiversity data services
• The Darwin Core standard for aggregating biodiversity data:
(1) Has under-utilized options for better representing taxonomic expertise
(2) Is part of a design paradigm that undermines the plurality of expertise
• Solutions are in development that realize data aggregation via translational
services – not as disenfranchising "backbones" – and without disrupting the
formation of expert-licensed, high-quality biodiversity data packages
• All of us – not just aggregators – "own" the responsibility of designing
systems where the plurality of taxonomic expertise is fairly accommodated
Acknowledgments & links to products
• Cleistes use case: Alan Weakley (UNC)
• Euler/X toolkit: Shizhuo Yu (UC Davis)
• Other data issues, discussions: Andrew Johnston, Guanyang Zhang
• NSF DEB–1155984, DBI–1342595 (PI Franz)
• NSF IIS–118088, DBI–1147273 (PI Ludäscher)
• Euler/X code @ https://github.com/EulerProject/EulerX
• Franz et al. 2016. Two influential primate classifications logically aligned.
Systematic Biology 65(4): 561–582.
Interested in exploring
multi-taxonomy and/or
-phylogeny alignments?
Please contact me.
nico.franz@asu.edu
@taxonbytes
https://biokic.asu.edu/

Zhang et al evol 2016 beyond otus phylogenetic identification of bacterial sy...
 
Zhang Franz ESCJAM 2015 Exophthalmus Reclassification
Zhang Franz ESCJAM 2015 Exophthalmus ReclassificationZhang Franz ESCJAM 2015 Exophthalmus Reclassification
Zhang Franz ESCJAM 2015 Exophthalmus Reclassification
 
Franz cobb seltmann 2015 spnhc current state of arthropod biodiversity data
Franz cobb seltmann 2015 spnhc current state of arthropod biodiversity dataFranz cobb seltmann 2015 spnhc current state of arthropod biodiversity data
Franz cobb seltmann 2015 spnhc current state of arthropod biodiversity data
 
Johnston ESA 2014 Trogloderus Sand Dune Speciation
Johnston ESA 2014 Trogloderus Sand Dune SpeciationJohnston ESA 2014 Trogloderus Sand Dune Speciation
Johnston ESA 2014 Trogloderus Sand Dune Speciation
 
Zhang Et Al ESA 2014 Ancient reverse colonization of Central America from the...
Zhang Et Al ESA 2014 Ancient reverse colonization of Central America from the...Zhang Et Al ESA 2014 Ancient reverse colonization of Central America from the...
Zhang Et Al ESA 2014 Ancient reverse colonization of Central America from the...
 
Franz 2014 ESA Aligning Insect Phylogenies Perelleschus and Other Cases
Franz 2014 ESA Aligning Insect Phylogenies Perelleschus and Other CasesFranz 2014 ESA Aligning Insect Phylogenies Perelleschus and Other Cases
Franz 2014 ESA Aligning Insect Phylogenies Perelleschus and Other Cases
 
Franz 2014 BIGCB Tracking Change across Classifications and Phylogenies
Franz 2014 BIGCB Tracking Change across Classifications and PhylogeniesFranz 2014 BIGCB Tracking Change across Classifications and Phylogenies
Franz 2014 BIGCB Tracking Change across Classifications and Phylogenies
 
Arizona State University Natural History Collections - Moving to Alameda (201...
Arizona State University Natural History Collections - Moving to Alameda (201...Arizona State University Natural History Collections - Moving to Alameda (201...
Arizona State University Natural History Collections - Moving to Alameda (201...
 
Cobb, Seltmann, Franz. 2014. The Current State of Arthropod Biodiversity Data...
Cobb, Seltmann, Franz. 2014. The Current State of Arthropod Biodiversity Data...Cobb, Seltmann, Franz. 2014. The Current State of Arthropod Biodiversity Data...
Cobb, Seltmann, Franz. 2014. The Current State of Arthropod Biodiversity Data...
 
Franz. 2014. Explaining taxonomy's legacy to computers – how and why?
Franz. 2014. Explaining taxonomy's legacy to computers – how and why?Franz. 2014. Explaining taxonomy's legacy to computers – how and why?
Franz. 2014. Explaining taxonomy's legacy to computers – how and why?
 
Ludäscher et al. 2014 - A Hybrid Diagnosis Approach Combining Black-Box and W...
Ludäscher et al. 2014 - A Hybrid Diagnosis Approach Combining Black-Box and W...Ludäscher et al. 2014 - A Hybrid Diagnosis Approach Combining Black-Box and W...
Ludäscher et al. 2014 - A Hybrid Diagnosis Approach Combining Black-Box and W...
 
Zhang & Franz 2014 - Integrating and Visualizing Taxonomic Concept Changes in...
Zhang & Franz 2014 - Integrating and Visualizing Taxonomic Concept Changes in...Zhang & Franz 2014 - Integrating and Visualizing Taxonomic Concept Changes in...
Zhang & Franz 2014 - Integrating and Visualizing Taxonomic Concept Changes in...
 
Franz. Anatomy of a Cladistic Analysis.
Franz. Anatomy of a Cladistic Analysis.Franz. Anatomy of a Cladistic Analysis.
Franz. Anatomy of a Cladistic Analysis.
 

Último

Pests of cotton_Borer_Pests_Binomics_Dr.UPR.pdf
Pests of cotton_Borer_Pests_Binomics_Dr.UPR.pdfPests of cotton_Borer_Pests_Binomics_Dr.UPR.pdf
Pests of cotton_Borer_Pests_Binomics_Dr.UPR.pdf
PirithiRaju
 
Module for Grade 9 for Asynchronous/Distance learning
Module for Grade 9 for Asynchronous/Distance learningModule for Grade 9 for Asynchronous/Distance learning
Module for Grade 9 for Asynchronous/Distance learning
levieagacer
 
development of diagnostic enzyme assay to detect leuser virus
development of diagnostic enzyme assay to detect leuser virusdevelopment of diagnostic enzyme assay to detect leuser virus
development of diagnostic enzyme assay to detect leuser virus
NazaninKarimi6
 

Último (20)

SAMASTIPUR CALL GIRL 7857803690 LOW PRICE ESCORT SERVICE
SAMASTIPUR CALL GIRL 7857803690  LOW PRICE  ESCORT SERVICESAMASTIPUR CALL GIRL 7857803690  LOW PRICE  ESCORT SERVICE
SAMASTIPUR CALL GIRL 7857803690 LOW PRICE ESCORT SERVICE
 
Justdial Call Girls In Indirapuram, Ghaziabad, 8800357707 Escorts Service
Justdial Call Girls In Indirapuram, Ghaziabad, 8800357707 Escorts ServiceJustdial Call Girls In Indirapuram, Ghaziabad, 8800357707 Escorts Service
Justdial Call Girls In Indirapuram, Ghaziabad, 8800357707 Escorts Service
 
Thyroid Physiology_Dr.E. Muralinath_ Associate Professor
Thyroid Physiology_Dr.E. Muralinath_ Associate ProfessorThyroid Physiology_Dr.E. Muralinath_ Associate Professor
Thyroid Physiology_Dr.E. Muralinath_ Associate Professor
 
PSYCHOSOCIAL NEEDS. in nursing II sem pptx
PSYCHOSOCIAL NEEDS. in nursing II sem pptxPSYCHOSOCIAL NEEDS. in nursing II sem pptx
PSYCHOSOCIAL NEEDS. in nursing II sem pptx
 
9999266834 Call Girls In Noida Sector 22 (Delhi) Call Girl Service
9999266834 Call Girls In Noida Sector 22 (Delhi) Call Girl Service9999266834 Call Girls In Noida Sector 22 (Delhi) Call Girl Service
9999266834 Call Girls In Noida Sector 22 (Delhi) Call Girl Service
 
Pests of cotton_Borer_Pests_Binomics_Dr.UPR.pdf
Pests of cotton_Borer_Pests_Binomics_Dr.UPR.pdfPests of cotton_Borer_Pests_Binomics_Dr.UPR.pdf
Pests of cotton_Borer_Pests_Binomics_Dr.UPR.pdf
 
FAIRSpectra - Enabling the FAIRification of Spectroscopy and Spectrometry
FAIRSpectra - Enabling the FAIRification of Spectroscopy and SpectrometryFAIRSpectra - Enabling the FAIRification of Spectroscopy and Spectrometry
FAIRSpectra - Enabling the FAIRification of Spectroscopy and Spectrometry
 
300003-World Science Day For Peace And Development.pptx
300003-World Science Day For Peace And Development.pptx300003-World Science Day For Peace And Development.pptx
300003-World Science Day For Peace And Development.pptx
 
GBSN - Microbiology (Unit 1)
GBSN - Microbiology (Unit 1)GBSN - Microbiology (Unit 1)
GBSN - Microbiology (Unit 1)
 
FAIRSpectra - Enabling the FAIRification of Analytical Science
FAIRSpectra - Enabling the FAIRification of Analytical ScienceFAIRSpectra - Enabling the FAIRification of Analytical Science
FAIRSpectra - Enabling the FAIRification of Analytical Science
 
COST ESTIMATION FOR A RESEARCH PROJECT.pptx
COST ESTIMATION FOR A RESEARCH PROJECT.pptxCOST ESTIMATION FOR A RESEARCH PROJECT.pptx
COST ESTIMATION FOR A RESEARCH PROJECT.pptx
 
Clean In Place(CIP).pptx .
Clean In Place(CIP).pptx                 .Clean In Place(CIP).pptx                 .
Clean In Place(CIP).pptx .
 
Forensic Biology & Its biological significance.pdf
Forensic Biology & Its biological significance.pdfForensic Biology & Its biological significance.pdf
Forensic Biology & Its biological significance.pdf
 
Human & Veterinary Respiratory Physilogy_DR.E.Muralinath_Associate Professor....
Human & Veterinary Respiratory Physilogy_DR.E.Muralinath_Associate Professor....Human & Veterinary Respiratory Physilogy_DR.E.Muralinath_Associate Professor....
Human & Veterinary Respiratory Physilogy_DR.E.Muralinath_Associate Professor....
 
Module for Grade 9 for Asynchronous/Distance learning
Module for Grade 9 for Asynchronous/Distance learningModule for Grade 9 for Asynchronous/Distance learning
Module for Grade 9 for Asynchronous/Distance learning
 
Introduction to Viruses
Introduction to VirusesIntroduction to Viruses
Introduction to Viruses
 
❤Jammu Kashmir Call Girls 8617697112 Personal Whatsapp Number 💦✅.
❤Jammu Kashmir Call Girls 8617697112 Personal Whatsapp Number 💦✅.❤Jammu Kashmir Call Girls 8617697112 Personal Whatsapp Number 💦✅.
❤Jammu Kashmir Call Girls 8617697112 Personal Whatsapp Number 💦✅.
 
development of diagnostic enzyme assay to detect leuser virus
development of diagnostic enzyme assay to detect leuser virusdevelopment of diagnostic enzyme assay to detect leuser virus
development of diagnostic enzyme assay to detect leuser virus
 
module for grade 9 for distance learning
module for grade 9 for distance learningmodule for grade 9 for distance learning
module for grade 9 for distance learning
 
Pulmonary drug delivery system M.pharm -2nd sem P'ceutics
Pulmonary drug delivery system M.pharm -2nd sem P'ceuticsPulmonary drug delivery system M.pharm -2nd sem P'ceutics
Pulmonary drug delivery system M.pharm -2nd sem P'ceutics
 

Franz sterner tdwg 2016 new power balance needed for trustworthy biodiversity data

  • 1. A new power balance is needed for trustworthy biodiversity data Please @taxonbytes Nico Franz1 & Beckett W. Sterner1 With contributions by Edward Gilbert1, Andrew Johnston1, Guanyang Zhang1, Bertram Ludäscher2 & Alan Weakley3 1 School of Life Sciences, Arizona State University 2 iSchool, University of Illinois at Urbana-Champaign 3 Herbarium, University of North Carolina at Chapel Hill TDWG 2016 – Biodiversity Information Standards December 09, 2016 – Instituto Tecnológico de Costa Rica (#TDWG16) @ http://www.slideshare.net/taxonbytes/franz-sterner-tdwg-2016-new-power-balance-needed-for-trustworthy-biodiversity-data
  • 2. Largely derived from doi:10.3897/rio.2.e10610
  • 3. Premise: We agree that there are significant data quality issues Aggregated Australian millipede data 'taken to the cleaners'
  • 4. Premise: We agree that there are significant data quality issues Aggregated Australian millipede data 'taken to the cleaners' Aggregators respond to the charges
  • 5. Premise: We agree that there are significant data quality issues Aggregated Australian millipede data 'taken to the cleaners' Aggregators respond to the charges But this leaves open the question(s): Who (exactly) is responsible for how much of each particular issue?
  • 6. We seem to disagree on the question of responsibility assignment(s) Source: Belbin et al. 2013. A specialist's audit […]: An 'aggregator's' perspective. doi:10.3897/zookeys.305.5438 Page 73
  • 7. Often enough, aggregators respond by: • Acknowledging the general issues and their relevance. • Pointing to many issues that effectively reside "with the sources". • Calling for more collaboration across all levels; as well as new tools and annotation options that "motivate and empower" the research community. Source: Belbin et al. 2013. A specialist's audit […]: An 'aggregator's' perspective. doi:10.3897/zookeys.305.5438 Page 74
  • 8. Thesis: For taxonomy integration, this is both wrong and self-defeating • Many aggregators are designed to impose a single taxonomic hierarchy – one at a time – onto all taxonomically annotated records.
  • 9. Thesis: For taxonomy integration, this is both wrong and self-defeating • Many aggregators are designed to impose a single taxonomic hierarchy – one at a time – onto all taxonomically annotated records. • By design, these "backbones" are rarely attributable to individual (expert) authors, but instead are newly created systematic theories that only appear at the system level.
  • 10. Thesis: For taxonomy integration, this is both wrong and self-defeating • Many aggregators are designed to impose a single taxonomic hierarchy – one at a time – onto all taxonomically annotated records. • By design, these "backbones" are rarely attributable to individual (expert) authors, but instead are newly created systematic theories that only appear at the system level. • Data are aggregated accordingly; yet backbone-driven modifications may newly disrupt the original integrity of submitted data packages.
  • 11. Thesis: For taxonomy integration, this is both wrong and self-defeating • Many aggregators are designed to impose a single taxonomic hierarchy – one at a time – onto all taxonomically annotated records. • By design, these "backbones" are rarely attributable to individual (expert) authors, but instead are newly created systematic theories that only appear at the system level. • Data are aggregated accordingly; yet backbone-driven modifications may newly disrupt the original integrity of submitted data packages. • By deflecting on responsibilities, aggregators may cause additional self-harm. Ultimately, the power balance – as presently built in – must shift to bring experts back into the process of licensing succinct, trustworthy data packages.
  • 12. Let's re-diagnose: What happens in dynamic, open systems? Charly Lewisw, CC BY-SA 3.0
  • 13. Taxonomic views of a frequently revised organismal lineage Source: Franz et al. 2016. Controlling the taxonomic variable: […]. RIO Journal. doi:10.3897/rio.2.e10610 • 9 schemata for the NA Cleistes/Cleistesiopsis complex (orchids, "pogonias")
  • 14. Snapshot of a more frequently revised organismal lineage • 9 schemata for the NA Cleistes/Cleistesiopsis complex (orchids, "pogonias") • Vertical sections identify taxonomic concept regions Source: Franz et al. 2016. Controlling the taxonomic variable: […]. RIO Journal. doi:10.3897/rio.2.e10610
  • 15. Snapshot of a more frequently revised organismal lineage • 9 schemata for the NA Cleistes/Cleistesiopsis complex (orchids, "pogonias") • Vertical sections identify taxonomic concept regions • Colors identify lineages of taxonomic names (epithets) in use Source: Franz et al. 2016. Controlling the taxonomic variable: […]. RIO Journal. doi:10.3897/rio.2.e10610
  • 16. Snapshot of a more frequently revised organismal lineage • 9 schemata for the NA Cleistes/Cleistesiopsis complex (orchids) • Vertical sections identify taxonomic concept regions • Colors identify lineages of taxonomic names (epithets) in use • There is no consensus! Five incongruent schemata are used concurrently Source: Franz et al. 2016. Controlling the taxonomic variable: […]. RIO Journal. doi:10.3897/rio.2.e10610
  • 17. Further diagnosis: If incongruent taxonomies are endorsed – locally, provisionally, and democratically – then what is the impact for aggregated biodiversity data?
  • 18. Further diagnosis: Taxonomy becomes a variable that we need to represent, and thereby control for (at the system level)
  • 19. The 'consensus' • Query: "Where do these orchid species occur?" • Same set of 250 orchid specimens, according to 4 taxonomies. "Controlling the taxonomic variable" Example: the Cleistes use case Source: Franz et al. 2016. Controlling the taxonomic variable: […]. RIO Journal. doi:10.3897/rio.2.e10610
  • 20. The 'consensus' The 'bible' "Controlling the taxonomic variable" • Query: "Where do these orchid species occur?" • Same set of 250 orchid specimens, according to 4 taxonomies. Example: the Cleistes use case Source: Franz et al. 2016. Controlling the taxonomic variable: […]. RIO Journal. doi:10.3897/rio.2.e10610
  • 21. The 'consensus' The 'bible' The (formerly) federal 'standard' "Controlling the taxonomic variable" Source: Franz et al. 2016. Controlling the taxonomic variable: […]. RIO Journal. doi:10.3897/rio.2.e10610
  • 22. The 'consensus' The 'bible' The (formerly) federal 'standard' The 'best', latest regional flora "Controlling the taxonomic variable" Source: Franz et al. 2016. Controlling the taxonomic variable: […]. RIO Journal. doi:10.3897/rio.2.e10610
  • 23. The 'consensus' The 'bible' The (formerly) federal 'standard' The 'best', latest regional flora "Controlling the taxonomic variable" Expert views are in conflict Source: Franz et al. 2016. Controlling the taxonomic variable: […]. RIO Journal. doi:10.3897/rio.2.e10610
  • 24. The 'consensus' The 'bible' The (formerly) federal 'standard' The 'best', latest regional flora "Controlling the taxonomic variable" Expert views are in conflict "Just bad" Source: Franz et al. 2016. Controlling the taxonomic variable: […]. RIO Journal. doi:10.3897/rio.2.e10610
  • 25. The 'consensus' The 'bible' The (formerly) federal 'standard' The 'best', latest regional flora Impact: Name-based aggregation has created a novel synthesis that nobody believes in "Controlling the taxonomic variable" "Just bad" Source: Franz et al. 2016. Controlling the taxonomic variable: […]. RIO Journal. doi:10.3897/rio.2.e10610
  • 26. The 'consensus' The 'bible' The (formerly) federal 'standard' The 'best', latest regional flora "Controlling the taxonomic variable" "Just bad" Expert views are in conflict Solution: Instead of aggregating an artificial 'consensus', … Source: Franz et al. 2016. Controlling the taxonomic variable: […]. RIO Journal. doi:10.3897/rio.2.e10610
  • 27. The 'consensus' The 'bible' The (formerly) federal 'standard' The 'best', latest regional flora "Controlling the taxonomic variable" "Just bad" Expert views are reconciled Solution: Instead of aggregating an artificial 'consensus', build translation services Source: Franz et al. 2016. Controlling the taxonomic variable: […]. RIO Journal. doi:10.3897/rio.2.e10610
  • 28. Challenges: How can we redesign aggregation to yield high-quality biodiversity data packages?
  • 29. Challenges: How can we redesign aggregation to yield high-quality biodiversity data packages? What does this mean for Darwin Core1 and how we use this aggregation standard? 1 Wieczorek et al. 2012. Darwin Core: an evolving […]. PLoS ONE 7(1): e29715. doi:10.1371/journal.pone.0029715
  • 30. Preview of solution with eight steps • DwC is insufficient, and part of the problem
  • 31. # 1: Represent only taxonomic concept labels (TCLs) 1 • Syntax (TCL): taxonomic name [author, year, page] sec. source 1 Multi-taxonomy input/alignment visualizations generated with Euler/X toolkit: https://github.com/EulerProject/EulerX Cleistes divaricata sec. Gregg & Catling 1993 Pogonia sec. Brown & Wunderlin 1997
  • 32. # 1: DwC score keeping → TCLs are optional; < 1% realized? • TCL ~ DwC: nameAccordingTo • SCAN: 19,722 of nearly 9 million records have TCLs (0.2%) • Lack of enforcement to use TCLs makes standard less big data-ready "Who authors GBIF's Backbone?" https://storify.com/taxonbytes/who-authors-gbif-s-backbone
  • 33. # 2: Represent each source coherently (Parent-Child relationships) • Syntax (PC): TCL1 is a child/parent of TCL2 [where TCL1/2 = same source] Cleistesiopsis bifaria sec. Pans. & de Barr. 2008 is a child of Cleistesiopsis sec. Pans. & de Barr. 2008
  • 34. # 2: DwC score keeping → Not (adequately) represented • PC ~ DwC: genus, family, order (etc.; higherClassification) • However, higher-level names in DwC are not modeled as TCLs • Taxonomic coherence of sources cannot be preserved with DwC alone DwC record with higherClassification (BDJ)
  • 35. # 3: Do not force a single hierarchy onto all tip-level TCLs • Syntax (PC): Tip-level TCL1 , TCL2 , etc. [where TCL1/2 = different sources]
  • 36. # 3: DwC score keeping → Optional; not (ever?) practiced • No PC ~ DwC: infra-/specificEpithet only • Typically, a single, 'unitary' higher-level classification is represented • Combinations of algorithmic and social practices achieve the single hierarchy "Who authors GBIF's Backbone?" https://storify.com/taxonbytes/who-authors-gbif-s-backbone
  • 37. # 4: Link TCLs via expert-provided RCC–5 articulations • Syntax (RCC–5): TCL1 {==, >, <, ><, !} TCL2 [where TCL1/2 = diff. sources] • RCC–5 = Region Connection Calculus • 14 articulations provided by: http://tinyurl.com/Weakley-Flora-2015 Cleistes bifaria "Coastal Populations" sec. Smith et al. 2004 == (is congruent with) Cleistesiopsis oricamporum sec. Brown & Pans. 2009 ==
  • 38. Source: Thau, D.M. 2010. Reasoning about taxonomies. Thesis, UC Davis. http://gradworks.proquest.com/3422778.pdf Region Connection Calculus (semantics: set constraints) == < > >< ! • Two regions N, M are either: • congruent (N == M) • properly inclusive (N < M) • inversely properly inclusive (N > M) • overlapping (N >< M) • exclusive of each other (N ! M) • RCC–5 articulations answer the query: "can we join regions N and M?" • Taxonomies have multiple RCC–5 alignable components: nodes (parents, children), node-associated traits, even node-anchoring specimens
  • 39. # 4: DwC score keeping → Not (adequately) represented • RCC–5 ~ DwC: accepted(Scientific)Name(Usage), relationshipOfResource, taxonomicStatus (etc.; nomenclatural relationships) • Nomenclatural relationships are type-focused, not region-focused • "Taxonomic Concept Schema" → yes! (however: http://www.tdwg.org/standards/117) Source: Vane-Wright. 2003. Indifferent philosophy versus […]. Syst. Biodiv. 1: 3–11. doi:10.1017/S1477200003001063 Example: Milkweed butterflies
  • 40. Oscillating meanings of the epithet hyalites – 1911 to 2003 Phenotypic diversity; type-anchored name identity relations Source: Vane-Wright. 2003. Indifferent philosophy versus […]. Syst. Biodiv. 1: 3–11. doi:10.1017/S1477200003001063
  • 41. # 5: Identify occurrence records only to TCLs Records: EKY39235 MTSU003611 NCSC00040204 … Records: BOON8098 CLEMS0061133 WILLI39399 … Records: GMUF-0039355 IBE006808 USCH58399 … Records: CONV0006268 MDKY00006482 NCU00038930 … Records: BRYV0023582, BRYV0023584 KHD00032030, MISS0016604 MMNS000227, NCSC00040206 USMS_000002923, USMS_000002924 VSC0053223, VSC0065528 … Records: ARIZ393087 DBG39049 USCH51217 … Records: NCU00040710 USCH96248 VSC0053218 … Records: CLEMS0012881 FUGR0003293 GA023130 … Records: BOON8100 NCSC00040210 SJNM45487 … Records: GA023144 LSU00012494 MISS0016608 … Records: IBE006810, IND-0012374, MMNS000227 Records: NY8654 • Syntax (ID): Occurrence / organism is identified to TCL "CLEMS0012881" is identified to Cleistes divaricata sec. Smith et al. 2004 [additional ID metadata]
  • 42. DwC record with Identification metadata (BDJ) # 5: DwC score keeping → ID metadata optional; > 50% realized • ID ~ DwC: Identification, (date)identified(By), identificationReference • SCAN: 4,715,277 of nearly 9 million records have ID metadata (52.5%) • Enforcement…still also require use of TCLs
  • 43. # 6: Generate comprehensive, consistent RCC–5 alignments • Euler/X is a toolkit that infers logically consistent RCC–5 alignments
  • 44. # 6: Generate comprehensive, consistent RCC–5 alignments • Value-added: MIR – set of Maximally Informative Relations containing the RCC–5 articulation for every possible TCL pair → scalability Reasoner inference
  • 45. # 7: Joining occurrence-to-TCL identifications & RCC–5 alignments Records: BOON8098, CLEMS0061133, CONV0006268, EKY39235 GMUF-0039355, IBE006808, IBE006810, IND-0012374 MDKY00006482, MMNS000227, MTSU003611, NCSC00040204 NCU00038930, NY8654, USCH58399, WILLI39399 … Records: ARIZ393087, BRYV0023582, BRYV0023584, DBG39049 KHD00032030, MISS0016604, MMNS00022, NCSC00040206 USMS_000002923, USMS_000002924, VSC0053223, VSC0065528 … Records: BOON8100, CLEMS0012881, FUGR0003293 GA023130, GA023144, LSU00012494 MISS0016608, NCSC00040210, NCU00040710 SJNM45487, USCH96248, VSC0053218 … • Specimen integration is fully driven by TCL-to-TCL RCC–5 signals
  • 46. The 'consensus' The 'bible' The (formerly) federal 'standard' The 'best', latest regional flora "Controlling the taxonomic variable" Impact: "Please select your preference (A – D); we can perform all translations" Source: Franz et al. 2016. Controlling the taxonomic variable: […]. RIO Journal. doi:10.3897/rio.2.e10610
  • 47. # 8: "Do you trust us now?" Aggregation as a translational service • We can now respond to queries such as: • "Show all specimens identified to the taxonomic name Cleistes divaricata" • Returns many records → resolves incongruent lineage of name usages
  • 48. # 8: "Do you trust us now?" Aggregation as a translational service • We can now respond to queries such as: • "Show all specimens identified to the taxonomic name Cleistes divaricata" • Returns many records → resolves incongruent lineage of name usages • "Now show specimens with the TCL Cleistesiopsis divaricata sec. Weakley 2015" • Returns record subset → resolving only one narrowly circumscribed concept
  • 49. # 8: "Do you trust us now?" Aggregation as a translational service • We can now respond to queries such as: • "Show all specimens identified to the taxonomic name Cleistes divaricata" • Returns many records → resolves incongruent lineage of name usages • "Now show specimens with the TCL Cleistesiopsis divaricata sec. Weakley 2015" • Returns record subset → resolving only one narrowly circumscribed concept • "Now show specimens identified to the TCL Cleistes divaricata sec. RAB 1968, yet translated into the more granular TCLs sec. Weakley 2015" • Returns (again) many records, yet represents and contrasts two treatments, as opposed to providing the ambiguous lineage view (above) • "Show all specimens with ambiguous 2010/2015 TCL identifications…" (etc.)
  • 50. Conclusion – designing trusted biodiversity data services • The Darwin Core standard for aggregating biodiversity data: (1) Has under-utilized options for better representing taxonomic expertise (2) Is part of a design paradigm that undermines the plurality of expertise
  • 51. Conclusion – designing trusted biodiversity data services • The Darwin Core standard for aggregating biodiversity data: (1) Has under-utilized options for better representing taxonomic expertise (2) Is part of a design paradigm that undermines the plurality of expertise • Solutions are in development that realize data aggregation via translational services – not as disenfranchising "backbones" – and without disrupting the formation of expert-licensed, high-quality biodiversity data packages
  • 52. Conclusion – designing trusted biodiversity data services • The Darwin Core standard for aggregating biodiversity data: (1) Has under-utilized options for better representing taxonomic expertise (2) Is part of a design paradigm that undermines the plurality of expertise • Solutions are in development that realize data aggregation via translational services – not as disenfranchising "backbones" – and without disrupting the formation of expert-licensed, high-quality biodiversity data packages • All of us – not just aggregators – "own" the responsibility of designing systems where the plurality of taxonomic expertise is fairly accommodated
  • 53. Acknowledgments & links to products • Cleistes use case: Alan Weakley (UNC) • Euler/X toolkit: Shizhuo Yu (UC Davis) • Other data issues, discussions: Andrew Johnston, Guanyang Zhang • NSF DEB–1155984, DBI–1342595 (PI Franz) • NSF IIS–118088, DBI–1147273 (PI Ludäscher) • Euler/X code @ https://github.com/EulerProject/EulerX • Franz et al. 2016. Two influential primate classifications logically aligned. Systematic Biology 65(4): 561–582.
  • 54. Interested in exploring multi-taxonomy and/or -phylogeny alignments? Please contact me. nico.franz@asu.edu @taxonbytes https://biokic.asu.edu/
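
Steps 1, 4, 5 and 7 of the transcript (taxonomic concept labels, RCC–5 articulations, occurrence-to-TCL identifications, and joining the two) can be illustrated with a minimal Python sketch. This is not the Euler/X toolkit and performs no reasoning; it only stores articulations that an expert has already provided. The names `TCL`, `Alignment`, `articulate`, `translate` and `records_for`, and the record identifier `EXAMPLE-0001`, are hypothetical and chosen for illustration. The single articulation used is the one quoted on slide 37.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class TCL:
    """Taxonomic concept label: a name as used according to ('sec.') one source."""
    name: str
    source: str

    def __str__(self) -> str:
        return f"{self.name} sec. {self.source}"


# The five RCC-5 symbols from the slides, each mapped to its inverse.
INVERSE = {"==": "==", "<": ">", ">": "<", "><": "><", "!": "!"}


class Alignment:
    """Stores expert-provided RCC-5 articulations between TCLs of different sources."""

    def __init__(self) -> None:
        self._rel: dict[tuple[TCL, TCL], str] = {}

    def articulate(self, a: TCL, rel: str, b: TCL) -> None:
        if rel not in INVERSE:
            raise ValueError(f"not an RCC-5 symbol: {rel!r}")
        if a.source == b.source:
            raise ValueError("articulations link TCLs from different sources")
        self._rel[(a, b)] = rel
        self._rel[(b, a)] = INVERSE[rel]  # store the inverse direction too

    def translate(self, tcl: TCL, target_source: str) -> dict[TCL, str]:
        """Return the TCLs of target_source that are not exclusive of tcl."""
        return {
            b: rel
            for (a, b), rel in self._rel.items()
            if a == tcl and b.source == target_source and rel != "!"
        }


def records_for(query: TCL, identifications: dict[str, TCL],
                alignment: Alignment) -> set[str]:
    """Records identified to the query TCL, or to a TCL congruent with or included in it."""
    hits = set()
    for record_id, tcl in identifications.items():
        if tcl == query:
            hits.add(record_id)
        elif tcl.source != query.source:
            rel = alignment.translate(tcl, query.source).get(query)
            if rel in ("==", "<"):  # overlap (><) or inverse inclusion (>) stays ambiguous
                hits.add(record_id)
    return hits


smith = TCL('Cleistes bifaria "Coastal Populations"', "Smith et al. 2004")
brown = TCL("Cleistesiopsis oricamporum", "Brown & Pans. 2009")

alignment = Alignment()
alignment.articulate(smith, "==", brown)  # articulation quoted on slide 37

# Hypothetical record identifier, used only for illustration.
identifications = {"EXAMPLE-0001": smith}

print(records_for(brown, identifications, alignment))
```

The design choice worth noting is that a record is returned for a query TCL only when its own TCL is congruent with or properly included in the query concept. Overlapping or more inclusive concepts are left out here, since deciding those cases needs more information than a single articulation provides.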

Editor's Notes

  1. The more one looks, the more complicated it gets. Notice also the node labeling, or lack thereof.
  37. The simple semantics of RCC-5 makes this a rather generic vocabulary for representing advances in phylogenetic knowledge. At the same time, the onus is on the phylogeneticists to apply the articulations in such a way that the desired query services are actually obtained.
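Note 37 refers to RCC-5 articulations between phylogenetic concepts. As a rough illustration only, the five RCC-5 relations can be sketched by treating each concept as a non-empty set of member labels. This set-based reading, the `RCC5` enum, the `rcc5` function, and the clade contents are all assumptions made for the example; in practice articulations are asserted by experts and are not necessarily derived from member lists.

```python
from enum import Enum


class RCC5(Enum):
    """The five RCC-5 relations between two regions (here: concepts)."""
    CONGRUENT = "=="            # same extension
    PROPER_PART = "<"           # first is properly included in second
    INVERSE_PROPER_PART = ">"   # first properly includes second
    OVERLAP = "><"              # shared members, each also has unique ones
    DISJOINT = "!"              # no shared members


def rcc5(a, b):
    """Return the RCC-5 relation of concept a to concept b.

    Hypothetical helper: a and b are non-empty collections of member labels.
    """
    a, b = frozenset(a), frozenset(b)
    if not a or not b:
        raise ValueError("RCC-5 is only defined here for non-empty concepts")
    if a == b:
        return RCC5.CONGRUENT
    if a < b:
        return RCC5.PROPER_PART
    if a > b:
        return RCC5.INVERSE_PROPER_PART
    if a & b:
        return RCC5.OVERLAP
    return RCC5.DISJOINT


# Hypothetical clades from two phylogenies over the same leaf labels.
earlier_clade = {"A", "B", "C"}
later_clade = {"B", "C", "D"}

print(rcc5(earlier_clade, later_clade).value)   # overlap
print(rcc5({"A", "B"}, earlier_clade).value)    # proper part
```

Exactly one of the five relations holds for any pair of non-empty sets, which is what makes the vocabulary generic. Which pairs of concepts get articulated, and with what evidence, remains a decision for the phylogeneticist.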