This presentation was given by Doug Holland and Trish Rose-Sandler at the Missouri Library Association conference held in St Louis MO in Oct 2013. The Biodiversity Heritage Library (BHL) is a significant online literature and image repository. Content from this repository has inspired a range of users to re-contextualize the BHL data in new, previously unimagined roles, including: scientists creating visualizations of the publishing of species names; citizen scientists blogging about fascinating creatures; designers incorporating marine life into wedding invitations; artists creating collages of animal illustrations and nature photography; and home decorators adding punch and wit to the walls of their kids' bedrooms. Using the example of BHL and its open data principles, the presentation discusses what open data is and how libraries can expand the impact and reach of their collections through open data methods.
Breathing new life into old data - How opening your collection can spark imagination and inspire creative re-use
1. Breathing new life into old data: How opening your collection can spark imagination and inspire creative re-use
Doug Holland, Library Director, Missouri Botanical Garden
Trish Rose-Sandler, Data Project Coordinator, Missouri Botanical Garden
10/4/13 Missouri Library Association conference, St Louis MO
2. What is BHL?
• A consortium of natural history, botanical libraries and research institutions
• An open access digital repository for historic biodiversity literature
• An open data repository of taxonomic names and bibliographic information
3. Member Institutions
• Academy of Natural Sciences Library and Archives
• American Museum of Natural History Library
• California Academy of Sciences Library
• Cornell University Library
• The Field Museum Library
• Harvard University Botany Libraries
• Harvard University, Ernst Mayr Library of the Museum of Comparative Zoology
• Library of Congress
• Marine Biological Laboratory / Woods Hole Oceanographic Institution Library
• Missouri Botanical Garden Library
• Natural History Museum, London, Library & Archives
• The New York Botanical Garden
• Royal Botanic Gardens, Kew, Library & Archives
• Smithsonian Institution Libraries
• United States Geological Survey Libraries
5. What is Art of Life?
• Full title - The Art of Life: Data Mining and Crowdsourcing the Identification and Description of Natural History Illustrations from the Biodiversity Heritage Library (BHL)
• Grant given to Missouri Botanical Garden in St Louis
• Funded by National Endowment for the Humanities
• Runs May 2012-April 2014
6. What is open content?
“A piece of data or content is open if anyone is free to use, reuse, and redistribute it — subject only, at most, to the requirement to attribute and/or share-alike.”
7. What are some content types ripe for opening up?
Bibliographic records
Vocabularies
Content files
• Literature published before 1923 (public domain)
• Primary source materials
• manuscripts
• photographs
• maps
• artifacts
• audio and video recordings
• oral histories
• postcards
• posters
• Local history – Genealogical collections
8. Examples of open content – bibliographic records
9. Examples of open content – bibliographic records
10. Examples of open content – vocabularies
11. Examples of open content – vocabularies
12. Examples of open content – primary resources
13. Examples of open content – primary resources
14. Why is open content important to cultural heritage institutions?
NMC Horizon Report > 2012 Museum Edition: “it is now the mark—and social responsibility—of world-class institutions to develop and share free cultural and educational resources.”
OpenGLAM: an initiative to build a global cultural commons for everyone to use, access and enjoy.
Getty holds “the conviction that understanding art makes the world a better place, and sharing our digital resources is the natural extension of that belief.”
16. How to open your data
Bulk data
BHL provides metadata & content via
• Export files
• OAI-PMH
• APIs
• OpenURL
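As a rough illustration of the OAI-PMH option listed above, the sketch below builds a `ListRecords` request and pulls the record identifiers and resumption token out of a response. The verb, the `metadataPrefix` and `resumptionToken` arguments, and the XML namespace come from the OAI-PMH 2.0 specification. The base URL `https://example.org/oai` and the function names are placeholders invented for this example, not BHL's actual endpoint or API.

```python
from urllib.parse import urlencode
import xml.etree.ElementTree as ET

# XML namespace defined by the OAI-PMH 2.0 specification.
OAI_NS = {"oai": "http://www.openarchives.org/OAI/2.0/"}


def build_list_records_url(base_url, metadata_prefix="oai_dc", resumption_token=None):
    """Build a ListRecords request URL (base_url is a hypothetical placeholder)."""
    if resumption_token:
        # In OAI-PMH, resumptionToken is an exclusive argument:
        # it is sent with the verb only, without metadataPrefix.
        params = {"verb": "ListRecords", "resumptionToken": resumption_token}
    else:
        params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    return base_url + "?" + urlencode(params)


def parse_list_records(xml_text):
    """Return (list of record identifiers, resumption token or None)."""
    root = ET.fromstring(xml_text)
    identifiers = [
        el.text
        for el in root.findall(".//oai:record/oai:header/oai:identifier", OAI_NS)
    ]
    token_el = root.find(".//oai:resumptionToken", OAI_NS)
    # An empty or missing resumptionToken element means the list is complete.
    token = token_el.text if token_el is not None and token_el.text else None
    return identifiers, token
```

A harvester would fetch the first URL, parse the response, and keep requesting with the returned token until `parse_list_records` returns `None` for it.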
19. Copyright and Licensing
Make public domain content available, and make its copyright status clear (status = public domain).
For copyrighted content, if you are the copyright holder, dedicate it to the public domain or license it as openly as possible.
CC0
CC-BY
CC-BY-SA
CC-BY-NC
PDDL
ODC-By
ODC-ODbL
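To connect these licenses to the Open Knowledge Foundation definition used in this talk (content is open if reuse is subject, at most, to attribution and/or share-alike), here is a small illustrative sketch. The table of conditions is a simplified summary of how each license is commonly described, written for this example only. The names `LICENSE_CONDITIONS`, `ALLOWED_BY_OPEN_DEFINITION` and `is_open` are invented here.

```python
# Conditions each license places on reuse (simplified, for illustration only).
LICENSE_CONDITIONS = {
    "CC0": set(),
    "CC-BY": {"attribution"},
    "CC-BY-SA": {"attribution", "share-alike"},
    "CC-BY-NC": {"attribution", "non-commercial"},
    "PDDL": set(),
    "ODC-By": {"attribution"},
    "ODC-ODbL": {"attribution", "share-alike"},
}

# The only conditions the definition of "open" permits.
ALLOWED_BY_OPEN_DEFINITION = {"attribution", "share-alike"}


def is_open(license_name):
    """True if the license imposes nothing beyond attribution and/or share-alike."""
    return LICENSE_CONDITIONS[license_name] <= ALLOWED_BY_OPEN_DEFINITION
```

Under this reading, CC-BY-NC is the only license on the slide that falls outside the definition, because it restricts commercial reuse. It is still more open than "all rights reserved", which is why it can sit on a list of ways to license content as openly as possible.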
26. Artists/Graphic Designers
Nonprofit fund raising
Natural Histories: Extraordinary Rare Book Selections from the American Museum of Natural History
28. Citizen scientists/enthusiasts
http://www.thisoldhouse.com/toh/photos/0,,20612661_21187249,00.html
30. Artists/Graphic Designers
Henry Curtis-Williams
© All Rights Reserved
http://www.flickr.com/photos/67312941@N03/8504225098/in/photostream/
31. Artists/Graphic Designers
http://www.missmoss.co.za/2013/06/24/biodiversity-heritage-library/
35. Artists/Graphic Designers
Public Art in Denver Light Rail Stations
Created by artist Nancy O’Neil
36. Benefits of open content
• Fulfills public mission of libraries
• Promotes your collection to new audiences
• Stimulates creative reuse
• Increases discoverability
• Enables data enrichment
• Less taxing on staff resources
37. Further information about open content and libraries
• Open Bibliographic Data Guide http://obd.jisc.ac.uk/
• OpenGLAM http://openglam.org/
• LODLAM http://lodlam.net/
Editor's Notes
BHL is a consortium of natural history and botanical libraries and research institutions that have cooperated to digitize the books and journals in their collections. It is an open access digital repository for historic biodiversity literature. Most of our literature is in the public domain (published before 1923). It is also an open data repository of taxonomic names and bibliographic information. Not only are the pages of books digitized, but we run them through OCR software so that the species names can be identified. This allows taxonomists to identify the first publication of a name and track its changes over time. Notice the terms “open access” and “open data,” which drive the BHL mission. I will talk more about what these terms mean in a minute.
We serve our content both at the Internet Archive and at a specialized portal at biodiversitylibrary.org. We have over 117 thousand titles and 41 million pages of digitized text.
Art of Life is a project related to BHL. Within BHL texts there are over a million natural history illustrations that are not discoverable due to a lack of metadata. The Art of Life project, which is funded by the National Endowment for the Humanities, is an effort to address this problem by identifying which pages have images and crowdsourcing their description. I’m mentioning this project because it is the image content in BHL that has the widest appeal across disciplines and audiences.
Definition from the Open Knowledge Foundation: “A piece of data or content is open if anyone is free to use, reuse, and redistribute it — subject only, at most, to the requirement to attribute and/or share-alike.” Keep in mind that there are varying interpretations of “open” when it comes to data and content, and this definition is the most liberal version: no restrictions on reuse beyond attribution and share-alike. I also wanted to briefly mention “linked data” because the two concepts are often used together, as in “linked open data.” You may have heard the term LOD-LAM: Linked Open Data for Libraries, Archives, and Museums. Linked data is a method for providing your data and content on the web with persistent URIs instead of just text strings, so that computers can automatically relate and link it to other data on the Web. But for this talk we’ll focus only on open data.
In Aug 2012 OCLC began recommending the Open Data Commons Attribution License (ODC-BY) for member institutions that would like to release their library catalog data on the Web. In explaining their rationale for the decision they said: "Many libraries are now examining ways that they can make their bibliographic records available, for free, on the Internet, so that they can be reused and more fully integrated into the broader Web environment," said Jim Michalko, Vice President, OCLC Research Library Partnership, who has overseen OCLC's license policy discussions. "…the ODC-BY license provides a good way to share records that's consistent with the cooperative nature of OCLC cataloging."
Europeana, a portal that aggregates content from Europe's leading galleries, libraries, archives and museums, put its metadata under an open Creative Commons license in Sep 2011.
Many of the Library of Congress authorities and vocabularies are available as bulk downloads or individual URIs. All the data is considered public domain. Here’s a case where we have to redefine what we mean by “open”: open as long as the government is open ;)
I did visit Library of Congress later in the day and found it was back up!
The Library of Congress Prints and Photographs Division has provided access to thousands of its images online. Its rights statement says: “As a publicly supported institution the Library generally does not own rights to material in its collections. Therefore, it does not charge permission fees for use of such material and cannot give or deny permission to publish or otherwise distribute material in its collections.”
In May of 2012 The Walters Art Museum donated 19,000 images of artworks to Wikimedia Commons. The Walters’ collection includes ancient art, medieval art and manuscripts, decorative objects, Asian art and Old Master and 19th-century paintings.
Many institutions have held public domain content hostage for years, thinking that since they digitized the resource they owned the copyright for the digital representation and therefore should charge money for access to it. Many cultural institutions are now rethinking that logic. The NMC Horizon Report > 2012 Museum Edition identifies six emerging technology topics, as well as key trends and critical challenges. It identified augmented reality and open content as technologies that will be adopted into the mainstream within 2-3 years, and stated that “it is now the mark—and social responsibility—of world-class institutions to develop and share free cultural and educational resources.” OpenGLAM (Galleries, Libraries, Archives and Museums) is an initiative coordinated by the Open Knowledge Foundation that is committed to building a global cultural commons for everyone to use, access and enjoy. OpenGLAM helps cultural institutions open up their content and data through hands-on workshops, documentation and guidance, and it supports a network of open culture evangelists through its Working Group. The Getty recently launched its Open Content Program, which provides freely available digital images to which the Getty holds the copyright or which are in the public domain. In explaining the reason for the program they stated that the Getty was “founded on the conviction that understanding art makes the world a better place, and sharing our digital resources is the natural extension of that belief. This move is also an educational imperative. Artists, students, teachers, writers, and countless others rely on artwork images to learn, tell stories, exchange ideas, and feed their own creativity” (http://blogs.getty.edu/iris/open-content-an-idea-whose-time-has-come/). Many institutions have started opening data in small steps, such as releasing just their bibliographic records (you don’t need to open all your digitized data at once).
One-click downloads: make it easy for users to download your metadata or content files with the click of a button. BHL allows users to download an entire book or journal volume, select individual pages for a custom PDF, or download the bibliographic info as MODS, BibTeX, or EndNote.
Provide ways for users to take content in bulk and use it for data mining. You may also want to contribute the catalog records so they can be aggregated into other portals (such as Serials Solutions’ Summon Service). BHL provides its bulk data via exports, OAI-PMH, APIs, and OpenURL.
Push content to other portals: instead of expecting users to find you at your institutional website, you can push that content into environments where users already are. This is a way to provide data to new audiences you wouldn’t reach otherwise (BHL has shared thousands of its images with Flickr and Wikimedia Commons).
As well as with the Encyclopedia of Life
It’s important that you clearly state for your users the copyright status of your content, whether that status gives them permission to re-use the content, and how. Public domain content should be clearly labeled as such and should indicate that no permission is needed for re-use. Copyrighted content can be dedicated to the public domain via CC0 or PDDL, or licensed under any of a number of Creative Commons and Open Data Commons licenses.
BHL has a mix of public domain and copyrighted materials. About 83% of our collection is public domain because we digitize historic literature published before 1923. For the other 17% we have gotten permission from the rights holders to provide access, but reuse can only be for educational purposes. Here is our copyright page, which helps users understand the different copyright statuses (public domain, unknown, and in copyright) and what each allows them to do. Copyright caveat: BHL found out that a publishing company was selling digitized in-copyright works from BHL materials on Amazon.com. Upon investigation we discovered this was our fault, due to incorrect copyright info in our metadata. We corrected the metadata on our end and contacted the seller.
Here’s an article from This Old House where someone has taken illustrations from Botanicus.org (precursor to BHL) and matted them between glass for their kitchen.
This group likes to talk about and share things about the natural world. They get excited by our text and images, but particularly the images. They often include our images in their blog posts, pin them to their Pinterest boards, and feature them on their Tumblr sites. Some are crafters or home decorators who have used the illustrations to decorate their homes in creative ways.
Here is an article by journalist Hannah Walter reporting on the TEDxDeExtinction conference and the question of whether to resurrect extinct species from DNA. In the article she uses 3 images from BHL of the now extinct Ectopistes migratorius, better known as the passenger pigeon.
This is a blog called “Miss Moss” by a graphic designer in South Africa who likes to highlight the intersections of fashion, art, design, and photography. In this post she gave BHL content a great plug: she posted several illustrations from BHL texts and then combined BHL illustrations with street fashion.
It’s even more fun to read the comments on her images. Even the BHL Program Director participated in the comments. It’s a great way to connect with users of your content and let them know you’re pleased with their creative re-use. After all, this is the purpose of “open content” in the first place.
Here is a collage artist named Dawn Arsenaux, who goes by “Ms. Neaux Neaux” and has used BHL illustrations in quite a few of her Dadaist-style collages. If you look at the bottom of the image, she acknowledges BHL as the original source of the image, but she got it via another blog on Tumblr, so you can see how social media becomes a web of sharing that promotes your collection way beyond what your organization could do on its own. People often find out about us through these serendipitous pathways. One thing to note about “open content” that you put online: not everyone who reuses your content will acknowledge the original source. We’ve come across this regularly with BHL content, and initially our reaction was to get angry because we felt they were not sharing in the same spirit in which we had shared it.
• Fits in with the public mission of libraries – our mission is to “disseminate knowledge, play a role in our communities, enable innovation, and enhance the Web of knowledge” (from http://obd.jisc.ac.uk/wp-content/uploads/2010/11/Open-Bibliographic-Data-The-Use-Cases.pdf).
• Increases discoverability and promotes your collection to new audiences – by pushing your metadata and content to other locations on the Web we can meet the needs of users in their environments instead of expecting them to come find us, e.g. most humanities scholars would not have considered BHL’s text collection for finding natural history illustrations; BHL has copies of all of its bib records and content files available at the Internet Archive.
• Stimulates creative reuse – when content can be easily mined and repurposed, collections can be reinterpreted in new ways, e.g. BHL wants users to create new knowledge from our data, not just access the data.
• Enables data enrichment – allowing users to tag your content can add to its accessibility, e.g. BHL users tag our content in Flickr, and BHL users can create article PDFs in the BHL portal, where we ask them to add both the article title and author names. This information is then available for other users to search on.
• Less taxing on staff resources – staff don’t have to package up content and send it off; users can grab data themselves, e.g. National Geographic publications that have used BHL images.