Bibliographic References in BHL

Coordination and routes for
cooperation across organizations,
projects and e-infrastructures
23rd of May 2013
William Ulate R., Missouri Botanical Garden

Questions to Answer
1. Type of content we discuss (e.g., occurrences, genes, behaviour,
morphology, etc.)
2. Sources of content (from where)
3. Formats of content (formats, standards)
4. Methods of gathering information (e.g., harvesting, ftp uploads,
protocols)
5. Methods of delivery of information (e.g., free searches, API, web
services, automated exports, linking mechanisms, etc.; provide links
to API and web services documentation)
6. Identifiers used (type, persistence, dereferencing, resolvability)
7. Present or forthcoming interoperability features with other
platforms
8. Constraints, needs and expectations to:
a) Suppliers of content, and
b) Users of content
9. What is needed for Bibliographic References?

The Biodiversity Heritage Library
www.biodiversitylibrary.org

Sharing
BHL shares data through:
APIs
Data Export
OpenURL
OAI-PMH
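
As a rough illustration of the OAI-PMH route, the sketch below builds a ListRecords request and reads record titles and the resumption token from a response. The ListRecords verb, the metadataPrefix and resumptionToken arguments, and the oai_dc format come from the OAI-PMH 2.0 protocol. The base URL and the sample response are placeholders made up for this example, not BHL's actual endpoint or data.

```python
from urllib.parse import urlencode
import xml.etree.ElementTree as ET

# Hypothetical endpoint: replace with the repository's real OAI-PMH base URL.
BASE_URL = "https://example.org/oai"

NS = {
    "oai": "http://www.openarchives.org/OAI/2.0/",
    "dc": "http://purl.org/dc/elements/1.1/",
}

def list_records_url(base_url, metadata_prefix="oai_dc", resumption_token=None):
    """Build a ListRecords request URL (OAI-PMH 2.0)."""
    if resumption_token:
        # In the protocol, resumptionToken is an exclusive argument:
        # it replaces metadataPrefix on follow-up requests.
        params = {"verb": "ListRecords", "resumptionToken": resumption_token}
    else:
        params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    return base_url + "?" + urlencode(params)

def parse_list_records(xml_text):
    """Return (titles, resumption_token) from a ListRecords response."""
    root = ET.fromstring(xml_text)
    titles = [el.text for el in root.iterfind(".//dc:title", NS)]
    token_el = root.find(".//oai:resumptionToken", NS)
    token = token_el.text if token_el is not None and token_el.text else None
    return titles, token

# Made-up response, trimmed to the elements the parser reads.
SAMPLE = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record><metadata>
      <oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
                 xmlns:dc="http://purl.org/dc/elements/1.1/">
        <dc:title>Species Plantarum</dc:title>
      </oai_dc:dc>
    </metadata></record>
    <resumptionToken>page-2</resumptionToken>
  </ListRecords>
</OAI-PMH>"""

titles, token = parse_list_records(SAMPLE)
print(list_records_url(BASE_URL))
print(titles, token)
```

A harvester would keep requesting with the returned token until the response carries no token.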

Open Data
• Downloads
– Simple tab-delimited exports of core data
– http://www.biodiversitylibrary.org/data/BHLExportSchema.pdf
• Data model
– DB schema as ERD
– http://bhl-bits.googlecode.com/files/20090930_BHLDataModel.pdf
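
A tab-delimited export can be read with a few lines of standard-library Python. This is only a sketch: the column names and rows below are invented for illustration (the real ones are defined in the export schema PDF linked above), and it assumes each file starts with a header line.

```python
import csv
import io

# Made-up sample rows; real column names are defined in BHLExportSchema.pdf.
SAMPLE_EXPORT = (
    "TitleID\tShortTitle\tStartYear\n"
    "1\tExample journal\t1879\n"
    "2\tExample monograph\t1753\n"
)

def read_export(handle):
    """Yield one dict per row of a tab-delimited export with a header line."""
    # QUOTE_NONE: treat quote characters in titles as ordinary text.
    reader = csv.DictReader(handle, delimiter="\t", quoting=csv.QUOTE_NONE)
    for row in reader:
        yield row

rows = list(read_export(io.StringIO(SAMPLE_EXPORT)))
print(len(rows), rows[0]["ShortTitle"])
```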

Services
• Names Service
– Return all occurrences of a name throughout BHL digitized corpus
• Documentation: http://bit.ly/2e6sg9
– Access to 100+ million name strings using TaxonFinder & NetiNeti
• 1.5 million unique names
– Algorithm to detect nomenclatural & taxonomic acts
• OpenURL
– Facilitate links to citations: protologues, articles, references
• Documentation: http://www.biodiversitylibrary.org/openurlhelp.aspx
– Useful to Nomenclators, Reference Systems
• IPNI
• Tropicos

Services: OpenURL
http://www.biodiversitylibrary.org/openurl?
pid=title:3934&volume=14&issue=&spage=301&date=1879
http://www.tropicos.org/Name/1200408
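
The resolver link above can be assembled from the article coordinates. The sketch below reproduces exactly that query string; the function name and its argument names are illustrative, and only the parameters visible in the example link (pid, volume, issue, spage, date) are used.

```python
from urllib.parse import urlencode

RESOLVER = "http://www.biodiversitylibrary.org/openurl"

def openurl_for(title_id, volume, start_page, year, issue=""):
    """Build a BHL OpenURL query from journal / volume / start page / year."""
    params = {
        "pid": f"title:{title_id}",
        "volume": volume,
        "issue": issue,       # left empty when unknown, as in the example link
        "spage": start_page,
        "date": year,
    }
    # safe=":" keeps the colon in "title:3934" unescaped.
    return RESOLVER + "?" + urlencode(params, safe=":")

print(openurl_for(3934, 14, 301, 1879))
```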

DOIs for Legacy Literature
• BHL member of CrossRef through Smithsonian
• Started assigning DOIs to BHL monographs
– Low hanging fruit: Easy, non-controversial
– 54,856 DOIs Approved to date
• Next, other publication types / articles?
– Process of automatically assigning CrossRef DOIs
to articles has a higher potential for collisions.

Article-level metadata
• Disambiguating and locating structural components
in the corpus
• Done by automated and crowdsourced means
– Thanks Rod Page! Welcome others!
• Greatly increases semantic value of the dataset
• Makes data addressable and thus linkable
• Chapter-level metadata
• Treatment-level metadata
• Part-level metadata

Genesis: “BHL Article Repository”
• Idea first introduced at TDWG 2008, Fremantle
(by BHL; many had discussed it for years)
• YouTube for biodiversity articles
• Needed (need) a way to access articles in BHL
– “BHL has no articles.”
– BHL has hundreds of thousands of articles but you
can’t search for them via author, article title search
– Can find via “article coordinates” using BHL’s UI &
OpenURL resolver: Journal / Volume / Start Page / Year

CiteBank
• Objectives
– Create a repository for community-vetted
taxonomic bibliographies.
– Ability to ingest, display, download, and index
articles so that the BHL can operate as an article
repository.
– Provide links to content published online through
other repositories.
• Launched on December 6th 2010
• 185,609 bibliographic records to date

Citations today: http://citebank.org

[Diagram labels: Specimen Databases, Commercial Aggregators, Software Tools, Open Access Digital Libraries, Indices, Nomenclators, Open Access Publishers, International Collaborative Projects]

Lessons Learned
• Biblio/Drupal data model is insufficient for the mass of data
envisioned for all biodiversity: too flat, and difficult to
expand in collaboration with the Biblio development
community
• Data providers want their content findable and
managed in the Biodiversity Heritage Library, not a
system alongside BHL
• Maintaining two platforms for biodiversity literature
threatens sustainability of the literature resources over
the longer term

What have we done?
• Articles
– Extended BHL data model to store article metadata
– Built process to harvest data from BioStor
• Created user interfaces for adding article metadata
and associated files
– Defined functional requirements as improvements to
Drupal-based CiteBank
– Defined process flow for adding article metadata and
associated files
– Implemented UI changes
• Changed BHL UI to accommodate article search
• Changed BHL UI to accommodate article display (TOC)

Requirements for a citation repository?
Admin. Interface
– IMPORT AND MAPPING TOOL
• Preview/Accept/Reject/Undo/Report on Import
• No standard schema, MODS or BibTeX
• Drag & drop GUI or mapped source and target field config.
– USER MANAGEMENT
• Self-Registration
• Admin. Approval & Deletion
• User Roles Assignment
– GLOBAL UPDATES
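
The "mapped source and target field config" idea can be sketched as a plain lookup table applied to incoming rows, with unmapped columns reported so an administrator can accept or reject the import. All field names here are hypothetical; the slides do not define a schema.

```python
# Hypothetical mapping from a provider's column names to repository fields.
FIELD_MAP = {
    "Author": "authors",
    "Article Title": "title",
    "Journal": "container_title",
    "Yr": "year",
}

def preview_import(source_rows, field_map):
    """Map each row to target fields; collect source fields with no mapping."""
    mapped, unmapped = [], set()
    for row in source_rows:
        record = {}
        for source_field, value in row.items():
            target = field_map.get(source_field)
            if target is None:
                unmapped.add(source_field)  # reported, not silently dropped
            else:
                record[target] = value.strip()
        mapped.append(record)
    return mapped, sorted(unmapped)

# Made-up provider row for the preview step.
rows = [{"Author": "Linné, Carl von", "Article Title": " Species Plantarum ",
         "Yr": "1753", "Shelfmark": "QK91"}]
records, report = preview_import(rows, FIELD_MAP)
print(records)
print("Unmapped source fields:", report)
```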

General User Interface
– IMPORT
• Upload/Preview/Accept/Reject/Undo/Report on Import
– CREATE CITATION
• By filling a Form, via BibTeX
– BROWSE
• Faceted: title, author, subject, year, contributor, my citations
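
Faceted browsing amounts to counting the values of a field and filtering on the ones a user selects. The sketch below shows that with in-memory records; the records and field names are made up, and a production system would more likely delegate this to a search index.

```python
from collections import Counter

# Made-up records; field names are illustrative only.
CITATIONS = [
    {"title": "A", "author": "Smith", "year": 1879, "subject": "Botany"},
    {"title": "B", "author": "Smith", "year": 1901, "subject": "Zoology"},
    {"title": "C", "author": "Jones", "year": 1879, "subject": "Botany"},
]

def facet_counts(citations, field):
    """Count how many citations carry each value of the given field."""
    return Counter(c[field] for c in citations if field in c)

def apply_facets(citations, **selected):
    """Keep citations matching every selected facet value."""
    return [c for c in citations
            if all(c.get(k) == v for k, v in selected.items())]

botany = apply_facets(CITATIONS, subject="Botany")
print(facet_counts(CITATIONS, "author"))
print(facet_counts(botany, "year"))
```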

• CITATION TYPES
– Journal Article, Book Chapter, Conference Proceedings,
Conference Paper, Thesis, Government Report, Note, etc.
• OAI HARVESTING
– Harvest and serve data through OAI-PMH
• SPECIFICATIONS FOR DATA PROVIDERS PAGE
• CONTRIBUTORS PAGE
– Recognize ALL contributions
• REPORTING
– Statistics Page by Citation and Publication type
– Recent/Latest Uploads

What are we doing?
• Integrate BHL’s Services with ZooBank, IPNI & IF
• Authoritative list of titles in common use for
nomenclatural acts (“TL3”)
• Harvest relevant content from Mendeley
• Integrate services and interfaces with the GNUB
data model
• Interoperate with citation parsing tools & services

Support citation reconciliation
L. Sp. Pl. 2: 971. 1753
Linneaus, C. Species Plantarum, vol. 2 p. 971. 1753
Linné, Carl von. Sp. Pl. Vol. 2 Page 971. 1753
Caroli Linnaei, Species Plantarum exhibentes plantas rite cognitas, ad genera
relatas, cum Differentis Specificis, Nominibus Trivialibus, Synonymis Selectis,
Locis Natalibus, secundum SYSTEMA SEXUALE digestas.. 2:971. 1753
Zea mays
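
One naive way to see that the four strings above point at the same place is to reduce each to its volume / page / year coordinates and compare the results. The regular expression below is a sketch tuned to these four variants only; it is not the method used by BHL or by any particular citation parser, and real reconciliation would also need to match titles and authors.

```python
import re

# Volume, then ":" / "p." / "page", then page, then a 4-digit year at the end.
COORDS = re.compile(
    r"(\d+)\s*(?::|p\.|page)\s*(\d+)\.\s*(\d{4})\s*$", re.IGNORECASE
)

def citation_key(citation):
    """Return (volume, page, year) or None if the pattern does not match."""
    match = COORDS.search(citation)
    if match is None:
        return None
    volume, page, year = (int(g) for g in match.groups())
    return volume, page, year

VARIANTS = [
    "L. Sp. Pl. 2: 971. 1753",
    "Linneaus, C. Species Plantarum, vol. 2 p. 971. 1753",
    "Linné, Carl von. Sp. Pl. Vol. 2 Page 971. 1753",
    "Caroli Linnaei, Species Plantarum exhibentes plantas rite cognitas, "
    "ad genera relatas, cum Differentis Specificis, Nominibus Trivialibus, "
    "Synonymis Selectis, Locis Natalibus, secundum SYSTEMA SEXUALE "
    "digestas.. 2:971. 1753",
]

keys = {citation_key(v) for v in VARIANTS}
print(keys)
```

All four variants collapse to a single key, which is the property a reconciliation service would rely on before linking the citation to a page in BHL.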

Questions to Answer
1. Type of content - Literature, Images, OCR Text and Bibliographic Citations
2. Sources of content - BHL, CiteBank (CB) & other Repositories
3. Formats of content - BibTeX, MODS, DC
4. Methods of gathering info - Harvesting, FTP Uploads
5. Methods of delivery of info - Free Searches, API, web services, exports, linking mechanisms
6. Identifiers used - CrossRef DOIs for Monographs
7. Interoperability with other platforms - ZooBank, IPNI, IF
8. Constraints, needs and expectations to suppliers of content and users of content

Thank you
pro-iBiosphere Meeting 3
Coordination and routes for cooperation across organizations, projects and e-infrastructures
Berlin, Germany
May 23rd, 2013
William.Ulate@mobot.org
Global BHL Project Manager
BHL Technical Director
Senior Project Manager
Missouri Botanical Garden
