PhD Thesis Digitisation Project
Gavin Willshaw, Digital Curator, Library & University Collections
@gwillshaw
Project background
• 27,000 PhD theses dating from early 1600s to present day
• 10,000 already digitised / in digital format
• 2005: requirement for submission of digital thesis
• Several small-scale digitisation projects
The collection
• Largely standardised
• Yet, lots of diversity:
  • Latin / handwritten
  • Awkward foldouts
  • Varying size
  • Some theses damaged / dirty
  • Biological specimens…
Project aims
• Provide global, unhindered access to unique Edinburgh research
• Obtain equipment, software and expertise for future mass digitisation projects
• Digitise 17,000 PhD theses – online by end 2018
• Create basic MARC records for 4,000 uncatalogued theses
• Undertake conservation work on 2,000 damaged theses
Destructive scanning
• 10,000 theses scanned destructively in-house
• Boards and spines removed
• Pages fed through Kodak i4250 document scanner
Non-destructive scanning
• 3,000 unique theses scanned non-destructively in-house
• i2s Copibook Cobalt scanners (x2)
• Angle support allows for scanning items with tight bindings
• 4,000 unique theses outsourced
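The scanning streams above can be reconciled with the 17,000 target in a few lines of Python. This is a minimal sketch: the figures come from the slides, while the variable names, and the reading that the target equals the 27,000 total minus the 10,000 theses already in digital format, are assumptions made for illustration.

```python
# Figures quoted on the slides.
TOTAL_THESES = 27_000      # whole collection
ALREADY_DIGITAL = 10_000   # already digitised / in digital format
TARGET = 17_000            # theses to digitise by end 2018

# Planned scanning streams (labels are illustrative, not the project's own).
streams = {
    "destructive, in-house": 10_000,
    "non-destructive, in-house": 3_000,
    "outsourced": 4_000,
}

planned = sum(streams.values())
# Assumption: the target is simply what is left once existing digital copies are excluded.
remaining = TOTAL_THESES - ALREADY_DIGITAL

for name, count in streams.items():
    print(f"{name:<28}{count:>7,}  ({count / TARGET:.0%} of target)")
print(f"{'planned total':<28}{planned:>7,}")
print("matches target:", planned == TARGET == remaining)
```

On these figures the three streams add up to the stated target, so the breakdown and the aim are consistent with each other.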
LIMB Server batch processing software
• Deskew, sharpen, remove signatures / addresses, OCR, QA
• Output: keyword-searchable 300 DPI PDF
Copyright / Licensing
• Made available open access through Edinburgh Research Archive (ERA)
• However, copyright still held by authors, not UoE
• 2039 rule: all unpublished works (inc. PhDs) under copyright until 2039, even if author died centuries ago
• UoE has no right to openly license
• Low risk; take-down policy
Why this approach?
• Gain expertise in mass digitisation
• Obtain equipment / software at project end for future digitisation initiatives
• More control over fragile material / workflows
• Frees up 500 linear metres of shelf space
Timeline
Date     Activity
Feb 16   Funding confirmed
May 16   Equipment and staff in place – scanning work begins
Jun 16   First batch of digitised theses online
Nov 16   Conservation work begins
Mar 17   Procurement partner confirmed and outsourcing begins
Jul 17   Conservation work complete
May 18   All in-house scanning and processing complete
Dec 18   All outsourced theses returned
Dec 18   All theses available online
Progress to date
• 5,646 scanned in-house
  • 4,132 duplicate items
  • 1,514 unique items
• 4,898 processed in-house
• 4,434 online
• On track to have in-house element completed within timeframe
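The progress figures can be checked and put in proportion with a short calculation. This is a rough sketch only: the counts are those on the slide, but measuring them against an in-house target of 13,000 (the 10,000 destructive plus 3,000 non-destructive theses from the scanning slides) is an assumption, as are the variable and function names.

```python
# Figures from the "Progress to date" slide.
scanned = 5_646
duplicates = 4_132
uniques = 1_514
processed = 4_898
online = 4_434

# Assumption: the in-house target is the 10,000 destructive plus
# 3,000 non-destructive theses quoted on the scanning slides.
in_house_target = 10_000 + 3_000


def pct(part: int, whole: int) -> float:
    """Return part as a percentage of whole, rounded to one decimal place."""
    return round(100 * part / whole, 1)


# The duplicate and unique counts should add up to the scanned total.
split_is_consistent = duplicates + uniques == scanned

print("duplicate + unique matches scanned:", split_is_consistent)
print("scanned vs in-house target:", pct(scanned, in_house_target), "%")
print("processed vs scanned:", pct(processed, scanned), "%")
print("online vs processed:", pct(online, processed), "%")
```

The duplicate and unique counts sum exactly to the scanned total, which suggests the slide's breakdown is complete rather than a sample.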
Not just text…
Some notable authors
Beyond scanning
• Linking theses to Wikipedia
• Wikisource
• Looking to explore advanced research techniques (e.g. text mining / data visualisation)
Find out more
• libraryblogs.is.ed.ac.uk/phddigitisation
• era.lib.ed.ac.uk
• facebook.com/crc.edinburgh
• @CRC_EdUni
• @gwillshaw
Attributions
• Gordon Brown: By Copyright World Economic Forum (www.weforum.org), swiss-image.ch/Photo by Remy Steinegger [CC BY-SA 2.0 (http://creativecommons.org/licenses/by-sa/2.0)], via Wikimedia Commons
• Arthur Conan Doyle: By Arnold Genthe - PD image from http://www.sru.edu/depts/cisba/compsci/dailey/217students/sgm8660/Final/ They got it from: http://www.lib.utexas.edu/photodraw/portraits/, where the source was given as: Current History of the War v.I (December 1914 - March 1915). New York: New York Times Company., Public Domain, https://commons.wikimedia.org/w/index.php?curid=240887
• Alexander McCall Smith: By TimDuncan (Own work) [CC BY 3.0 (http://creativecommons.org/licenses/by/3.0)], via Wikimedia Commons
• Honor Fell: See page for author [CC BY 4.0 (http://creativecommons.org/licenses/by/4.0)], via Wikimedia Commons
• Isabel Emslie Hutton: By Post of Serbia (http://www.wnsstamps.post/en/stamps/RS060.15) [Public domain], via Wikimedia Commons
