Remsen EOL Content Summit
1. EOL Content Summit, Barro Colorado Island, Panama. David Remsen, Senior Programme Officer, Global Biodiversity Information Facility (GBIF). January 2012.
12. SPECIES INFORMATION: Distribution, Species Descriptions!!, Classification, Synonymy, Bibliography, Specimens, Common Names, Images, Annotated Species Checklists, General Descriptions, Morphology, Behavior, Conservation, Diagnostic, Reproduction
Before I describe the challenges inherent to the index, I’d like to illustrate how primary biodiversity data has been used in various scientific and biodiversity policy-related contexts.
GBIF has a specific focus within biodiversity information in that our scope is restricted to the mobilisation, discovery, and use of primary biodiversity data. Primary biodiversity data are the digital text or multimedia data records that detail the instance of an organism – the ‘what, where, when, how and by whom’ of the organism’s occurrence and recording. One major class of primary biodiversity data is that derived from natural history collections.
A second class of primary biodiversity data originates with observations of species. Numerous observational data networks collect millions of species observations every year.
GBIF is a federated network composed of thousands of different primary biodiversity databases located all over the world.
GBIF has invested heavily in the development of Darwin Core Archive data publishing tools and supporting documentation.
Two things make all of these different databases part of the GBIF network. First, the data are made available on the Internet using a common set of communications protocols and data formats. Second, a registry, which lists all members of the network and the location of the data itself (often a URL), serves as a master network directory.
Lists of these resources are available via RESTful machine interfaces. Here is an example that lists all Darwin Core Archive checklist datasets as a JSON object.
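A request of this kind might be sketched as follows. This is a minimal illustration, not the interface shown on the slide: it assumes the present-day public GBIF API endpoint `https://api.gbif.org/v1/dataset/search` with a `type=CHECKLIST` filter, which may differ from the registry interface in use at the time of this talk. The sample payload, its keys, and its titles are invented for illustration, and no network call is made.

```python
import json
from urllib.parse import urlencode

# Assumption: the current public GBIF dataset search endpoint, which may
# not be the registry interface demonstrated in the original talk.
BASE_URL = "https://api.gbif.org/v1/dataset/search"


def checklist_query_url(limit=20, offset=0):
    """Build a URL that asks the registry for checklist datasets."""
    params = {"type": "CHECKLIST", "limit": limit, "offset": offset}
    return f"{BASE_URL}?{urlencode(params)}"


def dataset_titles(payload):
    """Extract (key, title) pairs from a JSON search response."""
    doc = json.loads(payload)
    return [(d["key"], d["title"]) for d in doc.get("results", [])]


# Hypothetical response body, shaped like a paged result list.
SAMPLE = json.dumps({
    "offset": 0,
    "limit": 2,
    "endOfRecords": False,
    "results": [
        {"key": "example-key-1", "title": "Example checklist A"},
        {"key": "example-key-2", "title": "Example checklist B"},
    ],
})

print(checklist_query_url(limit=2))
for key, title in dataset_titles(SAMPLE):
    print(key, title)
```

To query a live service, the URL could be fetched with any HTTP client and the response body passed to `dataset_titles`.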
The registry and communications protocols are utilised to poll each database in the network and retrieve an index of the biodiversity data records they contain. The index includes the key taxonomic, geospatial, and provenance elements of the data record. This allows the data to be visually represented, for instance, on a map of the Earth.
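As a rough sketch of what such an index entry and its map representation could look like, the example below models a record with one taxonomic, two geospatial, and one provenance field, then bins records into one-degree grid cells to get a density count per cell. The field names, the cell size, and the sample records are assumptions made for illustration; they are not the actual GBIF index schema.

```python
import math
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class IndexRecord:
    # Hypothetical field names, not the real GBIF index schema.
    scientific_name: str   # taxonomic element
    latitude: float        # geospatial element
    longitude: float       # geospatial element
    dataset_id: str        # provenance element


def cell_of(record, size=1.0):
    """Return the (lat, lon) grid cell that contains the record."""
    return (math.floor(record.latitude / size),
            math.floor(record.longitude / size))


def density(records, size=1.0):
    """Count records per grid cell, e.g. to shade cells on a map."""
    return Counter(cell_of(r, size) for r in records)


records = [
    IndexRecord("Puma concolor", 9.15, -79.85, "ds1"),
    IndexRecord("Puma concolor", 9.9, -79.1, "ds2"),
    IndexRecord("Ateles geoffroyi", 10.2, -84.0, "ds1"),
]

for cell, count in sorted(density(records).items()):
    print(cell, count)
```

Binning by grid cell keeps the map layer small regardless of how many records the index holds, since only one count per occupied cell needs to be drawn.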
The data in the index are made available through the GBIF data portal. A primary means by which data are accessed is via taxonomic organisation – either by searching for a taxon by keyword or by browsing through a taxonomic hierarchy.
Currently the GBIF index stands at over 310 million records from over 9000 different databases. Each record carries the name of the taxon, usually a species, with which it is associated. The total number of scientific names in this virtual dataset exceeds 6 million distinct text strings, far more than the number of known species. Correctly interpreting this list of names is a key requirement for effective use of the index.
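One reason name strings outnumber species is that the same name is written in many ways (with or without authorship, with different capitalisation). The sketch below groups such variants under a deliberately naive canonical form: the first two tokens of the string, treated as genus and species epithet. This is an illustrative assumption, not GBIF's name-interpretation method; it mishandles infraspecific names, hybrids, and higher taxa, and it cannot detect true synonyms, which need taxonomic knowledge rather than string cleaning.

```python
from collections import defaultdict


def canonical(name):
    """Naive canonical form: 'Genus epithet' from the first two tokens."""
    tokens = name.split()
    if len(tokens) < 2:
        return name.strip().capitalize()
    return f"{tokens[0].capitalize()} {tokens[1].lower()}"


def group_names(names):
    """Group raw name strings by their naive canonical form."""
    groups = defaultdict(set)
    for raw in names:
        groups[canonical(raw)].add(raw)
    return dict(groups)


names = [
    "Puma concolor (Linnaeus, 1771)",
    "Puma concolor",
    "puma concolor L.",
    "Felis concolor Linnaeus, 1771",  # a synonym; string cleaning cannot merge it
]

for canon, variants in sorted(group_names(names).items()):
    print(canon, len(variants))
```

Here four strings collapse to two canonical names, while the synonym remains separate.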