Systems, processes & how we stop the wheels falling off
Digitisation Open Day, September 2013
Dave Thompson
Digital Curator, Wellcome Library
Digitisation – process overview
Plan project
Catalogue
Identify material
Identify resources
Plan process
Review as you go
Digitise/process
Deliver
Refine processes
Document/share
Document/share
Document/share
Funding, staff, equipment, IT,
storage, data management
planning
Open source player
Meanwhile, at the coal face…
[Diagram: Administrative metadata + Descriptive metadata + Digitised images + Ingestion into repository + Creation of METS = Access]
Thinking conceptually… OAIS
In OAIS speak this is a SIP (Submission Information Package): an aggregation of
an object & its metadata in a form that is acceptable to the repository, e.g.
JPEG2000 images and MARC XML.
The Open Archival Information System (OAIS) reference model is an ISO standard
(ISO 14721) that describes a conceptual model of an archive. It sets out the
activities of an archive & the processes involved in submission, storage & access.
Developed through the Consultative Committee for Space Data Systems (CCSDS),
of which NASA is a member, after NASA ‘lost’ space data through obsolescence.
Thinking conceptually… OAIS
In OAIS speak this is an AIP. This is the object & its metadata
stored in a repository.
OAIS talks of 3 information packages.
1. Submission Information Package (SIP) = what is ingested
2. Archival Information Package (AIP) = what is stored
3. Dissemination Information Package (DIP) = what is made available
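The three packages can be pictured as data moving through an archive. What follows is a minimal Python sketch only: OAIS is a conceptual model and prescribes no code, so the class names, the `ingest` and `disseminate` functions and the `file_count` field are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Set


@dataclass
class SIP:
    """Submission Information Package: what is ingested."""
    files: List[str]
    descriptive_metadata: Dict[str, str]


@dataclass
class AIP:
    """Archival Information Package: what is stored."""
    identifier: str
    files: List[str]
    descriptive_metadata: Dict[str, str]
    administrative_metadata: Dict[str, str] = field(default_factory=dict)


@dataclass
class DIP:
    """Dissemination Information Package: what is made available."""
    identifier: str
    files: List[str]
    descriptive_metadata: Dict[str, str]


def ingest(sip: SIP, identifier: str) -> AIP:
    # Assumption: the repository adds administrative metadata on ingest.
    amd = {"file_count": str(len(sip.files))}
    return AIP(identifier, list(sip.files), dict(sip.descriptive_metadata), amd)


def disseminate(aip: AIP, allowed_files: Set[str]) -> DIP:
    # Only the parts we are able to make available go into the DIP.
    files = [name for name in aip.files if name in allowed_files]
    return DIP(aip.identifier, files, dict(aip.descriptive_metadata))
```

The point of the sketch is that the DIP can be a subset of the AIP: what is stored and what is made available need not be the same thing.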
Thinking conceptually… OAIS
In OAIS speak this is a DIP. These are the parts of the object & its
metadata that we are able to make available.
As defined in the Digital Preservation Coalition (DPC) handbook, access is
assumed to mean continued, ongoing usability of a digital resource, retaining all
qualities of authenticity, accuracy and functionality deemed to be essential for
the purposes the digital material was created and/or acquired for.
Let's tackle the basics… processing
Administrative metadata (AMD): a technical description of the files.
Automatically created by Safety Deposit Box (SDB) on ingest
into our repository. Used by the player for display purposes.
Administrative metadata is typically created automatically. It could include:
• File size
• Image height & width
• File format
• Checksum
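To show what "created automatically" can mean in practice, here is a minimal Python sketch that derives some of these values from a file. It is not how SDB does it: the function name `describe_file` and the dictionary keys are hypothetical, the format is guessed naively from the file extension, and image height & width are left out because reading them would need an imaging library.

```python
import hashlib
from pathlib import Path
from typing import Dict, Union


def describe_file(path: Path) -> Dict[str, Union[str, int]]:
    """Collect basic technical metadata for one file."""
    data = path.read_bytes()
    return {
        "file_name": path.name,
        "file_size": len(data),  # size in bytes
        # Naive assumption: the extension tells us the format.
        "file_format": path.suffix.lstrip(".").lower(),
        "checksum_type": "SHA-256",
        "checksum": hashlib.sha256(data).hexdigest(),
    }
```

A checksum recorded at ingest can be recomputed later and compared, which is how a repository can notice that a stored file has changed.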
Let's tackle the basics… processing
Descriptive metadata (DMD): MARC, converted to MARC XML, which becomes MODS in
the METS. Material must be catalogued before we can store it &
make it available.
Descriptive metadata is typically human generated, AKA cataloguing
metadata. ISAD(G) for archival material, MARC for bibliographic material.
MODS = Metadata Object Description Schema.
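As an illustration of MARC XML becoming MODS, the Python sketch below copies one field only: the title in MARC 245 $a goes into a MODS titleInfo/title element. The function name is hypothetical and this is not the conversion we run in production; a real crosswalk maps many more fields.

```python
import xml.etree.ElementTree as ET

MARC_NS = "http://www.loc.gov/MARC21/slim"
MODS_NS = "http://www.loc.gov/mods/v3"


def marcxml_title_to_mods(marcxml: str) -> str:
    """Build a tiny MODS record holding the MARC 245 $a title."""
    record = ET.fromstring(marcxml)
    title = ""
    for datafield in record.iter(f"{{{MARC_NS}}}datafield"):
        if datafield.get("tag") != "245":
            continue
        for subfield in datafield.findall(f"{{{MARC_NS}}}subfield"):
            if subfield.get("code") == "a":
                title = (subfield.text or "").strip()
        break  # only the first 245 field is used

    ET.register_namespace("mods", MODS_NS)
    mods = ET.Element(f"{{{MODS_NS}}}mods")
    title_info = ET.SubElement(mods, f"{{{MODS_NS}}}titleInfo")
    ET.SubElement(title_info, f"{{{MODS_NS}}}title").text = title
    return ET.tostring(mods, encoding="unicode")
```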
Let's tackle the basics… processing
Safety Deposit Box (SDB): the place where we store digital stuff.
Ingest is automatically initiated by Goobi. SDB is a database that
associates objects with their DMD & AMD, and is the source for dissemination.
Digital Repositories offer a convenient infrastructure through which to store,
manage, re-use and curate digital materials. They are used by a variety of
communities, may carry out many different functions, and can take many
forms.
Let's tackle the basics… processing
METS holds metadata about structure & pagination created by
humans; the METS file itself is built automatically.
A Metadata Encoding & Transmission Standard (METS) file is an aggregated
collection of DMD & AMD (a file list with structure) that provides a mechanism
for managed access. A METS file allows metadata from different systems to
be combined into a portable format.
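To make "a file list with structure" concrete, here is a minimal Python sketch that builds a small METS document. It is an illustration under stated assumptions, not the file Goobi produces: the function name `build_mets`, the identifiers and the input dictionary keys are hypothetical, the technical details are simplified to attributes on each file entry, and the output is not validated against the METS schema.

```python
import xml.etree.ElementTree as ET
from typing import Dict, List

METS_NS = "http://www.loc.gov/METS/"
MODS_NS = "http://www.loc.gov/mods/v3"
XLINK_NS = "http://www.w3.org/1999/xlink"


def _mets(tag: str) -> str:
    return f"{{{METS_NS}}}{tag}"


def build_mets(title: str, pages: List[Dict]) -> str:
    """pages: dicts with 'href', 'size', 'checksum' and 'label' keys."""
    ET.register_namespace("mets", METS_NS)
    ET.register_namespace("mods", MODS_NS)
    ET.register_namespace("xlink", XLINK_NS)
    mets = ET.Element(_mets("mets"))

    # Descriptive metadata (DMD): MODS wrapped inside the METS.
    dmd = ET.SubElement(mets, _mets("dmdSec"), ID="DMD1")
    wrap = ET.SubElement(dmd, _mets("mdWrap"), MDTYPE="MODS")
    xml_data = ET.SubElement(wrap, _mets("xmlData"))
    mods = ET.SubElement(xml_data, f"{{{MODS_NS}}}mods")
    title_info = ET.SubElement(mods, f"{{{MODS_NS}}}titleInfo")
    ET.SubElement(title_info, f"{{{MODS_NS}}}title").text = title

    # File list, then the structure that puts the files in page order.
    file_sec = ET.SubElement(mets, _mets("fileSec"))
    group = ET.SubElement(file_sec, _mets("fileGrp"), USE="MASTER")
    struct = ET.SubElement(mets, _mets("structMap"), TYPE="PHYSICAL")
    book = ET.SubElement(struct, _mets("div"), TYPE="book", DMDID="DMD1")

    for order, page in enumerate(pages, start=1):
        file_id = f"FILE{order:04d}"
        entry = ET.SubElement(
            group, _mets("file"), ID=file_id, SIZE=str(page["size"]),
            CHECKSUM=page["checksum"], CHECKSUMTYPE="SHA-256",
        )
        ET.SubElement(entry, _mets("FLocat"),
                      {"LOCTYPE": "URL", f"{{{XLINK_NS}}}href": page["href"]})
        # ORDERLABEL carries the pagination as per the original, e.g. "i", "ii".
        div = ET.SubElement(book, _mets("div"), TYPE="page",
                            ORDER=str(order), ORDERLABEL=page["label"])
        ET.SubElement(div, _mets("fptr"), FILEID=file_id)

    return ET.tostring(mets, encoding="unicode")
```

The structure map is the human-made part (page order and labels); everything else here can be assembled by machine from the catalogue record and the files.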
The formats
• JPEG2000 is our master image format.
• We create dissemination images (JPEG) on the
fly.
• Also use PDF, MPEG2, MP3
The systems
• Goobi. Manages & tracks the production of
digitised content.
• SDB. Repository that stores digitised content
along with its DMD & AMD.
• Player. User interface to view digitised material.
How Goobi works – the basics
• Project based.
• Workflow driven.
• Users accept ‘tasks’.
• A user's role determines what projects they belong
to & what roles they have.
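The project, workflow and task ideas above can be sketched as a small data model. This is a hypothetical illustration in Python, not Goobi's own code or API: the classes `Task`, `User` and `Process` and the rule that only the first unfinished task is open are assumptions made for the example.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Set


@dataclass
class Task:
    name: str
    required_role: str
    done: bool = False


@dataclass
class User:
    name: str
    roles: Set[str]
    projects: Set[str]


@dataclass
class Process:
    project: str
    tasks: List[Task] = field(default_factory=list)

    def next_task(self) -> Optional[Task]:
        # Workflow driven: only the first unfinished task is open.
        return next((t for t in self.tasks if not t.done), None)

    def accept(self, user: User) -> Optional[Task]:
        """Return the open task if this user may accept it, else None."""
        task = self.next_task()
        if task is None or self.project not in user.projects:
            return None
        return task if task.required_role in user.roles else None
```

For example, with the tasks "Scan" (photographer) then "Edit METS" (editor), an editor is offered nothing until the scanning task has been marked done.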
How Goobi works – a workflow
How Goobi works – METS editing
Pagination as per original
Descriptive metadata
Structure
Lessons from Goobi
• Design your workflows in advance. But be flexible.
• Automate as much as possible: it saves time &
is more efficient.
• Document processes & procedures.
• Share what you learn.
How SDB works – the basics
• Workflow based; easily ‘talks’ to other systems.
• Content agnostic.
• Creates administrative metadata on ingest.
• Preservation orientated.
How SDB works
How SDB works – behind the scenes
• No public access to SDB.
• Little direct staff access to SDB content.
• High levels of automation of ingest, initiated by Goobi.
• Platform for dissemination mediated by the player.
Lessons from SDB
• Plan your systems integration, which system talks
to which, and how.
• Plan workflows & processes.
• Have a data management plan. Your eggs are in one basket.
• Plan what you’ll do when it all turns to custard.
How the player works – the basics
How the player works
• Makes HTTP request to SDB for content.
• Draws access conditions from METS file.
• Permitted actions drawn from METS.
• Draws DMD from live catalogue.
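As a rough picture of "permitted actions drawn from METS", the Python sketch below reads an access condition out of a METS file and looks up what a player might allow. It is a guess at the shape of the logic, not the player's real code: storing the condition in a MODS `accessCondition` element, the condition values and the `PERMITTED_ACTIONS` table are all assumptions.

```python
import xml.etree.ElementTree as ET
from typing import Dict, Set

MODS_NS = "http://www.loc.gov/mods/v3"

# Hypothetical mapping from access condition to permitted player actions.
PERMITTED_ACTIONS: Dict[str, Set[str]] = {
    "Open": {"view", "download"},
    "Requires registration": {"view"},
    "Closed": set(),
}


def permitted_actions(mets_xml: str) -> Set[str]:
    """Return the actions allowed by the access condition in a METS file."""
    root = ET.fromstring(mets_xml)
    node = root.find(f".//{{{MODS_NS}}}accessCondition")
    condition = (node.text or "").strip() if node is not None else ""
    # Unknown or missing conditions fall back to no access.
    return set(PERMITTED_ACTIONS.get(condition, set()))
```

Defaulting to no access when the condition is missing is a deliberate choice in this sketch: it is safer for a player to show nothing than to show restricted material by mistake.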
Summary
• Digitisation is an end-to-end process that brings
together objects & metadata.
• You have to think about the whole system to deliver
results. The process is one of combining metadata
from different systems.
• Document plans & document process.
• Be prepared to be flexible & to change as
necessary. But try to stick to the plan!
Further reading
• Wellcome Library – http://wellcomelibrary.org
• Metadata Encoding & Transmission Standard at the Library of Congress -
http://www.loc.gov/standards/mets/
• Reference Model for an Open Archival Information System (OAIS).
Magenta Book. Issue 2. June 2012 -
http://public.ccsds.org/publications/RefModel.aspx
• Tessella, Safety Deposit Box - http://www.tessella.com/tag/safety-deposit-box/
• Data management planning - http://www.dcc.ac.uk/resources/data-management-plans
• Repository Software Comparison: Building Digital Library Infrastructure at
LSE - http://www.ariadne.ac.uk/issue64/fay
Thank you
Questions now, questions later…?
Dave Thompson, Digital Curator
Wellcome Library
d.thompson@wellcome.ac.uk - #welldigi
http://wellcomelibrary.org/

Mais conteúdo relacionado

Mais procurados

Cloud computing and big data analytics
Cloud computing and big data analyticsCloud computing and big data analytics
Cloud computing and big data analyticshanish93
 
Campus Bridging with Globus Services
Campus Bridging with Globus ServicesCampus Bridging with Globus Services
Campus Bridging with Globus ServicesIan Foster
 
20090701 Climate Data Staging
20090701 Climate Data Staging20090701 Climate Data Staging
20090701 Climate Data StagingHenning Bergmeyer
 
BHL Global Infrastructure - Vision
BHL Global Infrastructure - VisionBHL Global Infrastructure - Vision
BHL Global Infrastructure - VisionChris Freeland
 
Big Data Course - BigData HUB
Big Data Course - BigData HUBBig Data Course - BigData HUB
Big Data Course - BigData HUBAhmed Salman
 
BigData HUB Workshop
BigData HUB WorkshopBigData HUB Workshop
BigData HUB WorkshopAhmed Salman
 
Big data processing using hadoop poster presentation
Big data processing using hadoop poster presentationBig data processing using hadoop poster presentation
Big data processing using hadoop poster presentationAmrut Patil
 
Cni research data_oxford_horstmann_jefferies
Cni research data_oxford_horstmann_jefferiesCni research data_oxford_horstmann_jefferies
Cni research data_oxford_horstmann_jefferiesBDLSS
 
Big Data Analytics 2014
Big Data Analytics 2014Big Data Analytics 2014
Big Data Analytics 2014Stratebi
 
An Introduction to Big Data, NoSQL and MongoDB
An Introduction to Big Data, NoSQL and MongoDBAn Introduction to Big Data, NoSQL and MongoDB
An Introduction to Big Data, NoSQL and MongoDBWilliam LaForest
 
Hadoop, SQL and NoSQL, No longer an either/or question
Hadoop, SQL and NoSQL, No longer an either/or questionHadoop, SQL and NoSQL, No longer an either/or question
Hadoop, SQL and NoSQL, No longer an either/or questionDataWorks Summit
 
Globus status and publication plans
Globus status and publication plansGlobus status and publication plans
Globus status and publication plansIan Foster
 
Foundations for the Future of Science
Foundations for the Future of ScienceFoundations for the Future of Science
Foundations for the Future of ScienceGlobus
 

Mais procurados (15)

Cloud computing and big data analytics
Cloud computing and big data analyticsCloud computing and big data analytics
Cloud computing and big data analytics
 
Big Data simplified
Big Data simplifiedBig Data simplified
Big Data simplified
 
Campus Bridging with Globus Services
Campus Bridging with Globus ServicesCampus Bridging with Globus Services
Campus Bridging with Globus Services
 
20090701 Climate Data Staging
20090701 Climate Data Staging20090701 Climate Data Staging
20090701 Climate Data Staging
 
BHL Global Infrastructure - Vision
BHL Global Infrastructure - VisionBHL Global Infrastructure - Vision
BHL Global Infrastructure - Vision
 
Big Data Course - BigData HUB
Big Data Course - BigData HUBBig Data Course - BigData HUB
Big Data Course - BigData HUB
 
BigData HUB Workshop
BigData HUB WorkshopBigData HUB Workshop
BigData HUB Workshop
 
Big data processing using hadoop poster presentation
Big data processing using hadoop poster presentationBig data processing using hadoop poster presentation
Big data processing using hadoop poster presentation
 
Cni research data_oxford_horstmann_jefferies
Cni research data_oxford_horstmann_jefferiesCni research data_oxford_horstmann_jefferies
Cni research data_oxford_horstmann_jefferies
 
Big Data Analytics 2014
Big Data Analytics 2014Big Data Analytics 2014
Big Data Analytics 2014
 
An Introduction to Big Data, NoSQL and MongoDB
An Introduction to Big Data, NoSQL and MongoDBAn Introduction to Big Data, NoSQL and MongoDB
An Introduction to Big Data, NoSQL and MongoDB
 
Hadoop, SQL and NoSQL, No longer an either/or question
Hadoop, SQL and NoSQL, No longer an either/or questionHadoop, SQL and NoSQL, No longer an either/or question
Hadoop, SQL and NoSQL, No longer an either/or question
 
Mongodb
MongodbMongodb
Mongodb
 
Globus status and publication plans
Globus status and publication plansGlobus status and publication plans
Globus status and publication plans
 
Foundations for the Future of Science
Foundations for the Future of ScienceFoundations for the Future of Science
Foundations for the Future of Science
 

Destaque

Copyright clearance for genetics books - a pilot project at the Wellcome Library
Copyright clearance for genetics books - a pilot project at the Wellcome LibraryCopyright clearance for genetics books - a pilot project at the Wellcome Library
Copyright clearance for genetics books - a pilot project at the Wellcome LibraryWellcome Library
 
Systems and Processes: making order out of chaos
Systems and Processes: making order out of chaosSystems and Processes: making order out of chaos
Systems and Processes: making order out of chaosWellcome Library
 
Webinar - Order out of Chaos: Avoiding the Migration Migraine
Webinar - Order out of Chaos: Avoiding the Migration MigraineWebinar - Order out of Chaos: Avoiding the Migration Migraine
Webinar - Order out of Chaos: Avoiding the Migration MigrainePeak Hosting
 
Doing Projects: 10 laws of digitisation
Doing Projects: 10 laws of digitisationDoing Projects: 10 laws of digitisation
Doing Projects: 10 laws of digitisationWellcome Library
 
How will history remember you…?
How will history remember you…?How will history remember you…?
How will history remember you…?Wellcome Library
 

Destaque (6)

Copyright clearance for genetics books - a pilot project at the Wellcome Library
Copyright clearance for genetics books - a pilot project at the Wellcome LibraryCopyright clearance for genetics books - a pilot project at the Wellcome Library
Copyright clearance for genetics books - a pilot project at the Wellcome Library
 
Systems and Processes: making order out of chaos
Systems and Processes: making order out of chaosSystems and Processes: making order out of chaos
Systems and Processes: making order out of chaos
 
Webinar - Order out of Chaos: Avoiding the Migration Migraine
Webinar - Order out of Chaos: Avoiding the Migration MigraineWebinar - Order out of Chaos: Avoiding the Migration Migraine
Webinar - Order out of Chaos: Avoiding the Migration Migraine
 
Doing Projects: 10 laws of digitisation
Doing Projects: 10 laws of digitisationDoing Projects: 10 laws of digitisation
Doing Projects: 10 laws of digitisation
 
How will history remember you…?
How will history remember you…?How will history remember you…?
How will history remember you…?
 
Making Order Out of Chaos
Making Order Out of ChaosMaking Order Out of Chaos
Making Order Out of Chaos
 

Semelhante a Systems, processes & how we stop the wheels falling off

Wt dnt digitisation_open_day_v9
Wt dnt digitisation_open_day_v9Wt dnt digitisation_open_day_v9
Wt dnt digitisation_open_day_v9Wellcome Library
 
Analytics with unified file and object
Analytics with unified file and object Analytics with unified file and object
Analytics with unified file and object Sandeep Patil
 
Data Management - Full Stack Deep Learning
Data Management - Full Stack Deep LearningData Management - Full Stack Deep Learning
Data Management - Full Stack Deep LearningSergey Karayev
 
From Business Intelligence to Big Data - hack/reduce Dec 2014
From Business Intelligence to Big Data - hack/reduce Dec 2014From Business Intelligence to Big Data - hack/reduce Dec 2014
From Business Intelligence to Big Data - hack/reduce Dec 2014Adam Ferrari
 
Overview of Big Data by Sunny
Overview of Big Data by SunnyOverview of Big Data by Sunny
Overview of Big Data by SunnyDignitasDigital1
 
Digital library technologies
Digital library technologies Digital library technologies
Digital library technologies Shriram Pandey
 
Research Data (and Software) Management at Imperial: (Everything you need to ...
Research Data (and Software) Management at Imperial: (Everything you need to ...Research Data (and Software) Management at Imperial: (Everything you need to ...
Research Data (and Software) Management at Imperial: (Everything you need to ...Sarah Anna Stewart
 
Using MongoDB + Hadoop Together
Using MongoDB + Hadoop TogetherUsing MongoDB + Hadoop Together
Using MongoDB + Hadoop TogetherMongoDB
 
Shaping the Role of a Data Lake in a Modern Data Fabric Architecture
Shaping the Role of a Data Lake in a Modern Data Fabric ArchitectureShaping the Role of a Data Lake in a Modern Data Fabric Architecture
Shaping the Role of a Data Lake in a Modern Data Fabric ArchitectureDenodo
 
METS Metadata for Complete Beginners
METS Metadata for Complete BeginnersMETS Metadata for Complete Beginners
METS Metadata for Complete Beginnersstuartayeates
 
Hadoop meets Agile! - An Agile Big Data Model
Hadoop meets Agile! - An Agile Big Data ModelHadoop meets Agile! - An Agile Big Data Model
Hadoop meets Agile! - An Agile Big Data ModelUwe Printz
 
Government GraphSummit: And Then There Were 15 Standards
Government GraphSummit: And Then There Were 15 StandardsGovernment GraphSummit: And Then There Were 15 Standards
Government GraphSummit: And Then There Were 15 StandardsNeo4j
 

Semelhante a Systems, processes & how we stop the wheels falling off (20)

Wt dnt digitisation_open_day_v9
Wt dnt digitisation_open_day_v9Wt dnt digitisation_open_day_v9
Wt dnt digitisation_open_day_v9
 
Analytics with unified file and object
Analytics with unified file and object Analytics with unified file and object
Analytics with unified file and object
 
Data Management - Full Stack Deep Learning
Data Management - Full Stack Deep LearningData Management - Full Stack Deep Learning
Data Management - Full Stack Deep Learning
 
From Business Intelligence to Big Data - hack/reduce Dec 2014
From Business Intelligence to Big Data - hack/reduce Dec 2014From Business Intelligence to Big Data - hack/reduce Dec 2014
From Business Intelligence to Big Data - hack/reduce Dec 2014
 
Overview of Big Data by Sunny
Overview of Big Data by SunnyOverview of Big Data by Sunny
Overview of Big Data by Sunny
 
Digital library technologies
Digital library technologies Digital library technologies
Digital library technologies
 
Research Data (and Software) Management at Imperial: (Everything you need to ...
Research Data (and Software) Management at Imperial: (Everything you need to ...Research Data (and Software) Management at Imperial: (Everything you need to ...
Research Data (and Software) Management at Imperial: (Everything you need to ...
 
unit 1 big data.pptx
unit 1 big data.pptxunit 1 big data.pptx
unit 1 big data.pptx
 
Using MongoDB + Hadoop Together
Using MongoDB + Hadoop TogetherUsing MongoDB + Hadoop Together
Using MongoDB + Hadoop Together
 
2015 05-07-mac
2015 05-07-mac2015 05-07-mac
2015 05-07-mac
 
Big Data and Hadoop
Big Data and HadoopBig Data and Hadoop
Big Data and Hadoop
 
Shaping the Role of a Data Lake in a Modern Data Fabric Architecture
Shaping the Role of a Data Lake in a Modern Data Fabric ArchitectureShaping the Role of a Data Lake in a Modern Data Fabric Architecture
Shaping the Role of a Data Lake in a Modern Data Fabric Architecture
 
METS Metadata for Complete Beginners
METS Metadata for Complete BeginnersMETS Metadata for Complete Beginners
METS Metadata for Complete Beginners
 
Dbms Useful PPT
Dbms Useful PPTDbms Useful PPT
Dbms Useful PPT
 
Hadoop meets Agile! - An Agile Big Data Model
Hadoop meets Agile! - An Agile Big Data ModelHadoop meets Agile! - An Agile Big Data Model
Hadoop meets Agile! - An Agile Big Data Model
 
Database system
Database systemDatabase system
Database system
 
Government GraphSummit: And Then There Were 15 Standards
Government GraphSummit: And Then There Were 15 StandardsGovernment GraphSummit: And Then There Were 15 Standards
Government GraphSummit: And Then There Were 15 Standards
 
Architecting a datalake
Architecting a datalakeArchitecting a datalake
Architecting a datalake
 
Data Privacy at Scale
Data Privacy at ScaleData Privacy at Scale
Data Privacy at Scale
 
Presentation 16 may keynote karin bredenberg
Presentation 16 may keynote karin bredenbergPresentation 16 may keynote karin bredenberg
Presentation 16 may keynote karin bredenberg
 

Mais de Wellcome Library

Wellcome Library Transcribing Recipes report
Wellcome Library Transcribing Recipes reportWellcome Library Transcribing Recipes report
Wellcome Library Transcribing Recipes reportWellcome Library
 
ProQuest Early European Books: Partner Perspective
ProQuest Early European Books: Partner PerspectiveProQuest Early European Books: Partner Perspective
ProQuest Early European Books: Partner PerspectiveWellcome Library
 
Creating an online resource for medical archives at the Wellcome Library
Creating an online resource for medical archives at the Wellcome LibraryCreating an online resource for medical archives at the Wellcome Library
Creating an online resource for medical archives at the Wellcome LibraryWellcome Library
 
Jpeg2000 at Wellcome Library
Jpeg2000 at Wellcome LibraryJpeg2000 at Wellcome Library
Jpeg2000 at Wellcome LibraryWellcome Library
 
Digitisation Projects at Wellcome Library
Digitisation Projects at Wellcome LibraryDigitisation Projects at Wellcome Library
Digitisation Projects at Wellcome LibraryWellcome Library
 
Conservation for Digitisation
Conservation for DigitisationConservation for Digitisation
Conservation for DigitisationWellcome Library
 
Copyright Clearance for Genetics Books, A pilot project at the Wellcome Library
Copyright Clearance for Genetics Books, A pilot project at the Wellcome LibraryCopyright Clearance for Genetics Books, A pilot project at the Wellcome Library
Copyright Clearance for Genetics Books, A pilot project at the Wellcome LibraryWellcome Library
 
Managing Large Scale Digitisation at the Wellcome Library
Managing Large Scale Digitisation at the Wellcome LibraryManaging Large Scale Digitisation at the Wellcome Library
Managing Large Scale Digitisation at the Wellcome LibraryWellcome Library
 
Upscaling digitisation at the Wellcome Library
Upscaling digitisation at the Wellcome LibraryUpscaling digitisation at the Wellcome Library
Upscaling digitisation at the Wellcome LibraryWellcome Library
 
Mandating Open Access - Wellcome Trust
Mandating Open Access - Wellcome TrustMandating Open Access - Wellcome Trust
Mandating Open Access - Wellcome TrustWellcome Library
 

Mais de Wellcome Library (11)

Wellcome Library Transcribing Recipes report
Wellcome Library Transcribing Recipes reportWellcome Library Transcribing Recipes report
Wellcome Library Transcribing Recipes report
 
ProQuest Early European Books: Partner Perspective
ProQuest Early European Books: Partner PerspectiveProQuest Early European Books: Partner Perspective
ProQuest Early European Books: Partner Perspective
 
Creating an online resource for medical archives at the Wellcome Library
Creating an online resource for medical archives at the Wellcome LibraryCreating an online resource for medical archives at the Wellcome Library
Creating an online resource for medical archives at the Wellcome Library
 
Jpeg2000 at Wellcome Library
Jpeg2000 at Wellcome LibraryJpeg2000 at Wellcome Library
Jpeg2000 at Wellcome Library
 
Digitisation Projects at Wellcome Library
Digitisation Projects at Wellcome LibraryDigitisation Projects at Wellcome Library
Digitisation Projects at Wellcome Library
 
Image Capture
Image CaptureImage Capture
Image Capture
 
Conservation for Digitisation
Conservation for DigitisationConservation for Digitisation
Conservation for Digitisation
 
Copyright Clearance for Genetics Books, A pilot project at the Wellcome Library
Copyright Clearance for Genetics Books, A pilot project at the Wellcome LibraryCopyright Clearance for Genetics Books, A pilot project at the Wellcome Library
Copyright Clearance for Genetics Books, A pilot project at the Wellcome Library
 
Managing Large Scale Digitisation at the Wellcome Library
Managing Large Scale Digitisation at the Wellcome LibraryManaging Large Scale Digitisation at the Wellcome Library
Managing Large Scale Digitisation at the Wellcome Library
 
Upscaling digitisation at the Wellcome Library
Upscaling digitisation at the Wellcome LibraryUpscaling digitisation at the Wellcome Library
Upscaling digitisation at the Wellcome Library
 
Mandating Open Access - Wellcome Trust
Mandating Open Access - Wellcome TrustMandating Open Access - Wellcome Trust
Mandating Open Access - Wellcome Trust
 

Último

Handwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed textsHandwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed textsMaria Levchenko
 
Presentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreterPresentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreternaman860154
 
Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024The Digital Insurer
 
Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...
Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...
Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...gurkirankumar98700
 
IAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsIAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsEnterprise Knowledge
 
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptxHampshireHUG
 
Partners Life - Insurer Innovation Award 2024
Partners Life - Insurer Innovation Award 2024Partners Life - Insurer Innovation Award 2024
Partners Life - Insurer Innovation Award 2024The Digital Insurer
 
2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...Martijn de Jong
 
Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024The Digital Insurer
 
Automating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps ScriptAutomating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps Scriptwesley chun
 
A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)Gabriella Davis
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerThousandEyes
 
Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024The Digital Insurer
 
Developing An App To Navigate The Roads of Brazil
Developing An App To Navigate The Roads of BrazilDeveloping An App To Navigate The Roads of Brazil
Developing An App To Navigate The Roads of BrazilV3cube
 
Slack Application Development 101 Slides
Slack Application Development 101 SlidesSlack Application Development 101 Slides
Slack Application Development 101 Slidespraypatel2
 
Factors to Consider When Choosing Accounts Payable Services Providers.pptx
Factors to Consider When Choosing Accounts Payable Services Providers.pptxFactors to Consider When Choosing Accounts Payable Services Providers.pptx
Factors to Consider When Choosing Accounts Payable Services Providers.pptxKatpro Technologies
 
Exploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone ProcessorsExploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone Processorsdebabhi2
 
Salesforce Community Group Quito, Salesforce 101
Salesforce Community Group Quito, Salesforce 101Salesforce Community Group Quito, Salesforce 101
Salesforce Community Group Quito, Salesforce 101Paola De la Torre
 
The Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptxThe Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptxMalak Abu Hammad
 
Unblocking The Main Thread Solving ANRs and Frozen Frames
Unblocking The Main Thread Solving ANRs and Frozen FramesUnblocking The Main Thread Solving ANRs and Frozen Frames
Unblocking The Main Thread Solving ANRs and Frozen FramesSinan KOZAK
 

Último (20)

Handwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed textsHandwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed texts
 
Presentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreterPresentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreter
 
Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024
 
Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...
Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...
Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...
 
IAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsIAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI Solutions
 
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
 
Partners Life - Insurer Innovation Award 2024
Systems, processes & how we stop the wheels falling off

  • 1. Systems, processes & how we stop the wheels falling off. Digitisation Open Day, September 2013. Dave Thompson, Digital Curator, Wellcome Library.
  • 2. Digitisation – process overview: Plan project; Catalogue; Identify material; Identify resources; Plan process; Review as you go; Digitise/process; Deliver; Refine processes; Document/share (shown at three stages). Also shown: funding, staff, equipment, IT, storage, data management planning; open source player.
  • 3. Meanwhile, at the coal face… [Process diagram: Administrative metadata + Descriptive metadata + Digitised images + Ingestion into repository + Creation of METS = Access]
  • 4. Thinking conceptually … OAIS. [Process diagram as on slide 3.] In OAIS speak this is a SIP: an aggregation of an object & its metadata in a form that is acceptable to the repository, e.g. JPEG2000 images and MARC XML. The Open Archival Information System (OAIS) Reference Model is an ISO standard that describes a conceptual model of an archive. It sets out the activities of an archive & the processes involved in submission, storage & access. Developed by NASA after they ‘lost’ space data through obsolescence.
  • 5. Thinking conceptually … OAIS. [Process diagram as on slide 3.] In OAIS speak this is an AIP: the object & its metadata stored in a repository. OAIS talks of 3 information packages. 1. Submission Information Package (SIP) = what is ingested. 2. Archival Information Package (AIP) = what is stored. 3. Dissemination Information Package (DIP) = what is made available.
  • 6. Thinking conceptually … OAIS. [Process diagram as on slide 3.] In OAIS speak this is a DIP: the parts of the object & its metadata that we are able to make available. As defined in the Digital Preservation Coalition (DPC) handbook, access is assumed to mean continued, ongoing usability of a digital resource, retaining all qualities of authenticity, accuracy and functionality deemed to be essential for the purposes the digital material was created and/or acquired for.
  • 7. Let's tackle the basics … processing. [Process diagram as on slide 3.] Administrative metadata (AMD): a technical description of the files. Created automatically by Safety Deposit Box (SDB) on ingest into our repository. Used by the player for display purposes. Administrative metadata is typically created automatically; it could include: • File size • Image height × width • File format • Checksum
  • 8. Let's tackle the basics … processing. [Process diagram as on slide 3.] Descriptive metadata (DMD): typically human generated, AKA cataloguing metadata. ISAD(G) for archival material, MARC for bibliographic material. Our DMD is MARC, converted to MARC XML; this becomes MODS (Metadata Object Description Schema) in the METS. Material must be catalogued before we can store it & make it available.
  • 9. Let's tackle the basics … processing. [Process diagram as on slide 3.] Safety Deposit Box (SDB): the place where we store digital stuff. Ingest is automatically initiated by Goobi. SDB is a database that associates objects with their DMD & AMD, and it is the source for dissemination. Digital repositories offer a convenient infrastructure through which to store, manage, re-use and curate digital materials. They are used by a variety of communities, may carry out many different functions, and can take many forms.
  • 10. Let's tackle the basics … processing. [Process diagram as on slide 3.] The METS holds metadata about structure & pagination, which is created by humans; the METS file itself is built automatically. A Metadata Encoding & Transmission Standard (METS) file is an aggregated collection of DMD & AMD (a file list with structure) that provides a mechanism for managed access. A METS file allows metadata from different systems to be combined into a portable format.
  • 11. The formats • JPEG2000 is our master image format. • We create dissemination images (JPEG) on the fly. • We also use PDF, MPEG-2 and MP3.
  • 12. The systems • Goobi. Manages & tracks the production of digitised content. • SDB. Repository that stores digitised content along with its DMD & AMD. • Player. User interface to view digitised material.
  • 13. How Goobi works – the basics • Project based. • Workflow driven. • Users accept ‘tasks’. • A user's role determines what projects they belong to & what roles they have.
  • 14. How Goobi works – a workflow
  • 15. How Goobi works – METS editing: pagination as per original; descriptive metadata; structure.
  • 16. Lessons from Goobi • Design your workflows in advance, but be flexible. • Automate as much as possible; it saves time & is more efficient. • Document processes & procedures. • Share what you learn.
  • 17. How SDB works – the basics • Workflow based; easily ‘talks’ to other systems. • Content agnostic. • Creates administrative metadata on ingest. • Preservation orientated.
  • 19. How SDB works – behind the scenes • No public access to SDB. • Little direct staff access to SDB content. • High levels of automation of ingest, initiated by Goobi. • Platform for dissemination, mediated by the player.
  • 20. Lessons from SDB • Plan your systems integration, which system talks to which, and how. • Plan workflows & processes. • Data management plan. Your eggs in one basket. • Plan what you’ll do when it all turns to custard.
  • 21. How the player works – the basics
  • 22. How the player works • Makes HTTP request to SDB for content. • Draws access conditions from METS file. • Permitted actions drawn from METS. • Draws DMD from live catalogue.
  • 23.
  • 24. Summary • Digitisation is an end-to-end process that brings together objects & metadata. • You have to think about the whole system to deliver results. The process is one of combining metadata from different systems. • Document plans & document process. • Be prepared to be flexible & to change as necessary. But try to stick to the plan!
  • 25. Further reading • Wellcome Library - http://wellcomelibrary.org • Metadata Encoding & Transmission Standard at the Library of Congress - http://www.loc.gov/standards/mets/ • Reference Model for an Open Archival Information System (OAIS). Magenta Book. Issue 2. June 2012 - http://public.ccsds.org/publications/RefModel.aspx • Tessella, Safety Deposit Box - http://www.tessella.com/tag/safety-deposit-box/ • Data management planning - http://www.dcc.ac.uk/resources/data-management-plans • Repository Software Comparison: Building Digital Library Infrastructure at LSE - http://www.ariadne.ac.uk/issue64/fay
  • 26. Thank you Questions now, questions later…? Dave Thompson, Digital Curator Wellcome Library d.thompson@wellcome.ac.uk - #welldigi http://wellcomelibrary.org/
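
As a rough illustration of slides 7 and 10, the sketch below gathers the kind of administrative metadata listed on slide 7 (file size, format, checksum) and wraps it in a minimal METS-style file list with a page-order structure map. It is a simplified Python example, not how SDB or Goobi actually work: the function names, the extension-to-MIME table, the FILE_0001-style identifiers and the choice of SHA-256 are all assumptions made for illustration. Image height and width are left out because reading them would need an image library.

```python
import hashlib
import tempfile
import xml.etree.ElementTree as ET
from pathlib import Path

METS_NS = "http://www.loc.gov/METS/"
XLINK_NS = "http://www.w3.org/1999/xlink"
ET.register_namespace("mets", METS_NS)
ET.register_namespace("xlink", XLINK_NS)

# Hypothetical lookup table; a real repository would identify formats from
# file content rather than trusting the extension.
MIME_BY_SUFFIX = {".jp2": "image/jp2", ".jpg": "image/jpeg", ".pdf": "application/pdf"}


def q(tag: str) -> str:
    """Return a namespace-qualified METS tag name for ElementTree."""
    return f"{{{METS_NS}}}{tag}"


def administrative_metadata(path: Path) -> dict:
    """Collect size, format and checksum for one file (assumed AMD fields)."""
    data = path.read_bytes()
    return {
        "name": path.name,
        "size": len(data),
        "mimetype": MIME_BY_SUFFIX.get(path.suffix.lower(), "application/octet-stream"),
        "sha256": hashlib.sha256(data).hexdigest(),
    }


def build_mets(files: list) -> ET.Element:
    """Build a minimal METS tree: a file list plus a page-order structure map."""
    mets = ET.Element(q("mets"))
    file_sec = ET.SubElement(mets, q("fileSec"))
    file_grp = ET.SubElement(file_sec, q("fileGrp"), USE="master")
    struct_map = ET.SubElement(mets, q("structMap"), TYPE="physical")
    item_div = ET.SubElement(struct_map, q("div"), TYPE="item")

    for order, path in enumerate(files, start=1):
        amd = administrative_metadata(path)
        file_id = f"FILE_{order:04d}"  # illustrative identifier scheme
        file_el = ET.SubElement(
            file_grp, q("file"),
            ID=file_id,
            MIMETYPE=amd["mimetype"],
            SIZE=str(amd["size"]),
            CHECKSUM=amd["sha256"],
            CHECKSUMTYPE="SHA-256",
        )
        # Point at the file by (relative) name.
        ET.SubElement(file_el, q("FLocat"),
                      {"LOCTYPE": "URL", f"{{{XLINK_NS}}}href": amd["name"]})
        # One page division per file, in the order the files were given.
        page = ET.SubElement(item_div, q("div"), TYPE="page", ORDER=str(order))
        ET.SubElement(page, q("fptr"), FILEID=file_id)
    return mets


if __name__ == "__main__":
    # Demo with throwaway placeholder files standing in for digitised pages.
    with tempfile.TemporaryDirectory() as tmp:
        pages = []
        for n in (1, 2):
            page_path = Path(tmp) / f"page{n}.jp2"
            page_path.write_bytes(b"placeholder image bytes %d" % n)
            pages.append(page_path)
        print(ET.tostring(build_mets(pages), encoding="unicode"))
```

Descriptive metadata (the MODS derived from MARC XML, slide 8) would sit in a separate section of the same METS file; it is omitted here to keep the example short.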
