Biodiversity Heritage Library:
Creating a digital library for access and research




                                      Joe Coleman
                                      BHL Digitisation manager
“The cultivation of natural science
cannot be efficiently carried on
without reference to an extensive
library”.
             C.R. Darwin
Digitisation


  • Digitisation for BHL about access

      Turning this…                     … into this.
Partners
Access


  • Web portal just one way into
    the BHL data
         •   Can search OCR text for
             species names
         •   Full text search (coming soon)

  • Users can develop own tools
    to access data
         •   CiteBank & BioStor two
             examples
         •   BHL data available using open
             standards and linked to semantic web
                                                    An elementary manual of New Zealand entomology
                                                    London,West, Newman & Co.,1892.
                                                    biodiversitylibrary.org/item/34950
Copyright



                                                      • Majority of works in Public Domain
                                                            •   Public Domain considered to be
                                                                 > 90 years from publication
                                                            •   What’s in the public domain
                                                                stays in the public domain.

                                                      • Permission sought by contributors
                                                        for in copyright material.
                                                            • Use of in copyright content
                                                              licensed
L'histoire naturelle des estranges poissons marins
A Paris :De l'imprimerie de Regnaud Chaudiere,1551.
biodiversitylibrary.org/page/4748789
Audience


 • Scientific Community
           •   Need authoritative reference texts
           •   Need to find data

  • General interest
           •   Want to see what’s interesting
               and attractive
           •   Need to know what to look for
  • Web developers
           •   Need a challenge!
           •   Need open data standards.
                                                    A monograph of the Trochilidæ, or family of humming-birds /.
                                                    London :Printed by Taylor and Francis ;1861 [i.e. 1849-1861].
                                                    biodiversitylibrary.org/page/34843253
Digitisation Workflow


 • Images
         •   Captured as RAW
         •   Converted to 8 bit uncompressed TIFF
         •   Processed TIFFs saved as
             archival copy
         •   Compressed JPEG 2000
             uploaded to IA
  • Metadata
         •   Bibliographic record as MODS
         •   Page metadata exported from
             Macaw as XML
                                             Ornithological miscellany V.1
                                             London :Trübner and Co., Bernard Quaritch, R.H. Porter,1876-1878
                                             biodiversitylibrary.org/item/108982
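
To illustrate the "Bibliographic record as MODS" step, here is a minimal sketch that builds a small MODS record with Python's standard library. The field values are taken from the caption above. Which elements to include (titleInfo and originInfo here) is an assumption made for illustration; this is not the record that Macaw or the BHL actually produces.

```python
import xml.etree.ElementTree as ET

MODS_NS = "http://www.loc.gov/mods/v3"  # MODS version 3 namespace


def build_mods(title, place, publisher, date_issued):
    """Return a minimal MODS element tree (illustrative subset of fields only)."""
    ET.register_namespace("mods", MODS_NS)

    def el(parent, tag, text=None):
        # Helper: create a namespaced child element with optional text.
        node = ET.SubElement(parent, f"{{{MODS_NS}}}{tag}")
        if text is not None:
            node.text = text
        return node

    root = ET.Element(f"{{{MODS_NS}}}mods")
    el(el(root, "titleInfo"), "title", title)
    origin = el(root, "originInfo")
    el(el(origin, "place"), "placeTerm", place)
    el(origin, "publisher", publisher)
    el(origin, "dateIssued", date_issued)
    return root


record = build_mods(
    title="Ornithological miscellany",
    place="London",
    publisher="Trübner and Co., Bernard Quaritch, R.H. Porter",
    date_issued="1876-1878",
)
mods_xml = ET.tostring(record, encoding="unicode")
print(mods_xml)
```

A real record would carry many more fields (identifiers, subjects, volume information); the sketch only shows the general shape of the XML.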
Digitisation workflow
Macaw


 • Developed by Smithsonian Libraries …
        •   Simple metadata creation for book
            and page items
        •   Upload directly to
            Internet Archive




 • … Hacked by MV
        •   Simplified workflow
        •   New styling
        •   Multiple contributor upload
Geographical distribution of publications
Thank you




            Joe Coleman
            jcoleman@museum.vic.gov.au
            http://bhl.ala.org.au

Editor's Notes

  1. Welcome slide
  2. Introduction. Every scientist stands on the shoulders of those who have gone before. Taxonomy and bioinformatics in particular are disciplines that involve as much time in the library as in the laboratory. The study of the species that make up the world’s biodiversity requires reference to a large body of biological literature, much of it spanning centuries of research. Libraries. Academic libraries house large collections of this literature, and natural history museums in particular hold very focused catalogues of literature pertaining to the scope of their collections. A terrific resource these may be, but they have one major drawback: the library collections are usually not located where the biologist would most like to be, out collecting in the field. Furthermore, as anyone familiar with Murphy’s law would agree, the one book we most desire when undertaking research is the one that is missing, on loan, or was deemed not to fit the acquisition policies devised by the management of the day.
  3. Introducing BHL. The Biodiversity Heritage Library was set up to solve some of these problems and began as a consortium of some of the heavyweight natural history museums, herbaria and academic libraries of the UK and North America. The goal of the Biodiversity Heritage Library has been to digitise as much of the literature relating to the biological sciences as possible and make it accessible online. Underlying this endeavour is the philosophy that the body of knowledge contained within makes up the legacy of human understanding about the world we live in, and as such should be freely available to everyone.
  4. Partners: With founding members such as the Smithsonian Institution, Natural History Museum of London, Kew Gardens and Woods Hole marine institute, a fair portion of the literature (at least from the Northern Hemisphere) has already been scanned. There are currently around 103 thousand volumes online, and the number is growing. The project aims to be truly global in its reach and it has since expanded to include collections from Continental Europe (in affiliation with the Europeana project), China, Brazil, Australasia and, most recently, a partnership in Africa. Each affiliated project contributes technical services, content or both. Internet Archive. Central to the success of the BHL has been a partnership with the Internet Archive. The Archive is a not-for-profit organisation tasked with just the small mission of providing universal online access to recorded knowledge. The Archive already hosts a vast amount of digitised literature, has the resources and expertise to host the content of the BHL, and has assisted with much of the scanning. Once a book has been uploaded, the Archive’s internal processes create the OCR text and derivatives for download or online viewing. In reality, the BHL comprises two distinct projects sharing the same objective of open access to knowledge: one project to digitise literature, and an online project to develop the systems that deliver the content to its audience.
  5. Access. With such a large amount of data online, the issue of access changes from a question of how to get hold of a resource to one of how to find the information I want and how to share it with others. For many users, the first contact with the BHL might be through the various portal sites, which reflect the regional contributions. In addition to standard title, subject and author searches, these search functions are tuned to the requirements of the local scientific community. For example, the search results for a particular region may place an emphasis on results showing species or publications from that area. Built in to the BHL is a link to uBio’s TaxonFinder resource, which scours the OCR text of the BHL for species names and can return results based on a plant or animal’s Latin name. A new feature under development by the Australian node’s partners at the CSIRO is a full text search of the entire collection. It uses an enhanced Lucene search to produce results very quickly from the complete text, and it’s scheduled to be available on the Australian site in beta form in the next couple of weeks. Further tools for the researcher are available through the CiteBank service, which generates customised bibliographies based on species citations and allows the user to build up a personalised library. The BioStor project is a service developed by a BHL user to search for article extents within journals and link back to them using OpenURLs. The BHL is committed to sharing data using open web standards. It’s important for the success of the project that the data held by the BHL is used widely and in creative ways. Book metadata is published via OAI-PMH, which returns records in either Dublin Core or MODS format. The books themselves can be referenced by DOI or persistent URL, and individual pages can also be accessed by OpenURLs.
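
As a rough sketch of the OAI-PMH access mentioned in the notes for slide 5, the snippet below builds a harvesting request URL. `verb=ListRecords` and the `oai_dc` (Dublin Core) prefix come from the OAI-PMH protocol itself; the `mods` prefix name and the base URL are assumptions for illustration, so check the BHL developer documentation for the real endpoint and supported prefixes.

```python
from urllib.parse import urlencode

# Hypothetical endpoint used for illustration only; not the real BHL address.
BASE_URL = "https://example.org/oai"

# "oai_dc" is the standard OAI-PMH Dublin Core prefix; "mods" is an assumed name.
SUPPORTED_PREFIXES = ("oai_dc", "mods")


def list_records_url(metadata_prefix="oai_dc", base_url=BASE_URL):
    """Build an OAI-PMH ListRecords request URL for the given metadata format."""
    if metadata_prefix not in SUPPORTED_PREFIXES:
        raise ValueError(f"unsupported metadata prefix: {metadata_prefix}")
    query = urlencode({"verb": "ListRecords", "metadataPrefix": metadata_prefix})
    return f"{base_url}?{query}"


print(list_records_url("mods"))
```

The function only constructs the URL; fetching it and paging through results with resumption tokens is left out to keep the sketch short.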
  6. Copyright. Before I go any further, I had better mention the C word. The BHL walks the copyright tightrope pretty carefully. It has to, because no library wants to be embroiled in a copyright suit with publishers and the BHL relies on the goodwill of its contributors to succeed. The ‘Heritage’ part of the Biodiversity Heritage Library indicates the historical nature of the library’s collection. Some of the digitised titles in the collection date back to the fifteenth century, and about 70 per cent is over a hundred years old and in the public domain. We get into murky territory with local differences in the extent of public domain, so as a precaution the default limit has been 90 years from date of publication. The remainder has been published copyright free, or the contributing institution has explicitly sought the permission of the copyright holder to put the material online. We want people to be able to access the combined literary resources of some of the world’s great natural history collections, and it is hoped that they will be able to put this resource to good use. To this end, a memorandum of understanding has been signed between all participants to the effect that all material currently in the public domain that has been made available by the Biodiversity Heritage Library remains in the public domain and no party shall claim intellectual property rights over the original or any derivative version. Documents currently in copyright for which permission has been granted for digital representation are made available under a Creative Commons Non-Commercial, Share Alike 3.0 licence. Certainly, the more recent publications there are online in the BHL, the more useful it can be to scientific research, so we’re actively engaged in negotiating with the copyright holders to gain permission to extend the holdings of some of our targeted content.
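
The default 90-year limit described in the notes for slide 6 amounts to a simple date comparison. The sketch below assumes the rule is applied on whole calendar years and that the current year is passed in explicitly; the function name is hypothetical, and the check is only the precautionary default from the talk, not a legal test.

```python
def is_default_public_domain(publication_year, current_year):
    """Apply the talk's default rule: more than 90 years since publication."""
    return current_year - publication_year > 90


# Publication years from the slide captions, checked against the year 2012.
print(is_default_public_domain(1892, 2012))  # An elementary manual of New Zealand entomology
print(is_default_public_domain(1551, 2012))  # L'histoire naturelle des estranges poissons marins
```

Titles that fail this check would need the explicit permission route the notes describe.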
  7. Audience. So who are the audience for the BHL? Without a doubt, the largest segment of users of the BHL is the scientific community; many researchers in the field of taxonomy depend on the BHL as a major resource. These people need access to authoritative bibliographies and first descriptions of species, and digital access saves an enormous amount of time and headaches. A second segment of the BHL’s audience are the people who appreciate the art of the scientific illustrations, and the bibliophiles for whom such a deep collection of historical books represents hours of fascination. Developers. Finally, another group we would like to encourage are the developers and metadata junkies who can creatively reuse and mash up the bibliographic data and the book content to develop their own technology projects. These are the people who add value to the collection, who surprise us by re-imagining the knowledge residing in the historical literature and presenting it in novel ways. To aid these people and those who wish to mine the dataset for research, the BHL has a published API which allows access to metadata and content using open web standards. Documentation can be found on the developer page on the BHL portals.
  8. BHL Australia. In this part of the world, the local branch of the BHL was set up to provide the literature service for the Atlas of Living Australia and is being coordinated from Melbourne by Museum Victoria. Since the middle of last year, we have put in place a local portal to the collection, developed software to facilitate our regional content contribution, and designed the workflow for digitisation of books from local libraries. The focus so far has been on accessing Australian collections, but I’m keen on expanding our scope to include literature from elsewhere in the region, especially New Zealand. Small scale scanning. Our scanning operation is intentionally small scale: thousands of titles have been scanned already by the big libraries, including many publications from Australia and New Zealand. We don’t have the resources to digitise large volumes of material, but we can target our operation to fill the gaps that the big guys have left. To direct our scanning effort, we’ve developed a website where we encourage our users within the scientific community to nominate titles and vote on the priority of our scanning list. We initially seeded our database with a bibliography obtained from the Australian government species registers and ordered the list according to number of citations. The initial list came in at over 7 thousand titles, but we’ve had to pull out quite a few duplicates. We allow users to vote with a simple ‘like’ type system to add a weighting to a title and move it up the list. Similarly, titles added by users carry a greater weight than those from the seed list, so hopefully our scanning list reflects something close to the preferences of our community of users. In general, we’re focussing on completing the runs of locally published serial titles that are represented in the BHL but have incomplete holdings. Then we’re targeting the small niche publications such as those put out by the amateur naturalist societies.
We are also hoping to digitise some of the beautiful rare books in our collections, which may be hard to obtain, have hand-coloured illustrations or are in some way unique.  We began digitising titles from Museum Victoria’s library using our Bookdrive Pro copy stand just before Christmas, but only really began in earnest in February when we put in place a volunteer programme to do the image capture and post-processing. In that time we’ve digitised about seventy volumes, and fifty of those have been uploaded to the BHL so far. This is tiny compared to what we have on our bid list, but we’re making headway.
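The ordering of the scanning list described above (seed titles ranked by citation count, ‘likes’ adding weight, user-nominated titles outweighing seeded ones) could be sketched roughly as follows. This is only an illustration: the `Title` class, the `priority` function and both weight constants are hypothetical, since the talk gives no actual figures or implementation details.

```python
from dataclasses import dataclass

# Hypothetical weights: the talk says only that likes move a title up the
# list and that user-added titles weigh more than seeded ones.
LIKE_WEIGHT = 1
USER_NOMINATION_BONUS = 10


@dataclass
class Title:
    name: str
    citations: int = 0            # citation count from the seed bibliography
    likes: int = 0                # 'like' votes from users
    user_nominated: bool = False  # True if a user added the title


def priority(title: Title) -> int:
    """Combine citations, likes and the nomination bonus into one score."""
    score = title.citations + LIKE_WEIGHT * title.likes
    if title.user_nominated:
        score += USER_NOMINATION_BONUS
    return score


def scanning_list(titles):
    """Return the titles ordered from highest to lowest priority."""
    return sorted(titles, key=priority, reverse=True)
```

With weights like these, a user-nominated title with no citations can still outrank a moderately cited seed title, which matches the intent of letting the community steer the list.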
  9. Digitisation at MV When I started out on the digitisation project, I didn’t realise just what a manual process it was going to be. The image capture is very hands-on and quite physical, but on a good day we can get through about 1000 pages in an hour. When we were selecting a digitisation platform, we chose the Bookdrive system because it allowed pages to be photographed flat without unbinding the volume. This is particularly important for our conservators if we are to digitise our rare books.  Once we’ve imaged a book, the files are batch processed out of Adobe Bridge from camera Raw to TIFF. Each file is then opened in Photoshop and individually cropped and straightened. I’ve evaluated a number of different solutions for the image processing, but from what I’ve experienced so far, the best results and the most efficient process have come from doing the post-processing in Photoshop. This stage takes by far the longest, so we have two workstations set up processing files from the one capture rig.
  10. Volunteers. Since the beginning of February most of the digitisation has been carried out by volunteers from the Museum’s volunteer programme. We have six people who have committed to the project until the end of July, and they have been operating the image capture system and carrying out the post-processing. When they started, they all had only basic computer skills and were a bit overwhelmed by the amount they had to learn. We provided training and, to begin with, supervised them closely until they became familiar with the process, but as their confidence in operating the machinery has grown, so has their output. Lately, on average we have been getting through about four books per volunteer a day, and we have two volunteers each day for three days a week.  They’re a terrific group of people from very different backgrounds, but all have become very passionate about their contribution to the BHL and are keen to continue with the project. Owing to the fairly physical and repetitive nature of the imaging, we have established a buddy system where they are paired up and swap jobs at regular intervals. Each pair is responsible for digitising their allocated books for the day, and they usually exceed their targets. I haven’t even had to bribe them, but to keep them interested, we have regular morning teas as well as special viewings of the rare books and collection areas of the Museum.  
  12. Future  So what of the future for the BHL? Unfortunately, in many parts of the world, money is getting hard to come by for further digitisation projects. Right now, the digitisation operations are winding down among most of the US contributors and in Europe. But this doesn’t mean that the collection will remain static. The Smithsonian is the exception to the rule, and I believe that they are continuing scanning at a cracking pace, while many of the recent contributions have been coming from China.  In the US, the technical team has just received a grant to develop crowdsourcing tools to correct OCR text and to identify and describe the illustrations from the collection, so there will still be plenty of development going on. The Australasian branch runs out of money toward the end of this year, and at the moment we are seeking new sources of funding. In spite of this, we are well positioned to continue our contribution. With the volunteer programme in place, and once the Macaw portal is set up, I am hopeful that we will continue digitising and uploading content even if other parts of the project are scaled back.
  13. Conclusion The world’s biosphere is changing at an unprecedented rate in human experience and yet new species are still being discovered. If scientists are to understand the life that exists on earth today they must have access to the documentary legacy of research that has gone before. The Biodiversity Heritage Library plays an important role in giving access to this literature for the benefit of science. The more complete the library, the greater use it will be to scientists in the future.