SlideShare a Scribd company logo
1 of 12
Central Registry
                          for Digitized Objects:
                         Linking Production and
                          Bibliographic Control



Ralf Stockmann
Göttinger Digitization Center
As things are now
• Huge ventures in
  – Digitization
     •   Google
     •   Microsoft
     •   National programs
     •   Local centers
  – Accessibility
     •   World Digital Library
     •   European Digital Library
     •   National portals
     •   Google Book Search
As things are now
• We just face the dawn of mass digitization
  – Leaving behind the state of
    manufacturing
  – Entering industrialization
  – Scanning Robots
  – Accessible Full Text (OCR)
Lack of …
• Coordination in
  digitization activities
   – Who scans what
     where when
     in which quality
     and how will it
     be accessible
      • How is “quality” defined?
      • Do we agree on “what”?
Facing the Consequences
                                                Technical
                                                Improvements
                                                               Costs




                                 Waste of Ressources
Costs / Value




                                                               Additional
                                                               Benefit

                 Number of digitized items per volume
The Solution
• Central registry for digitized objects
• Focused on the production context (no user
  frontend)
• API driven
  – Application Programming Interface
  – Query / Ingest
  – Simple implementation into existing workflow-tools
• Batch mode (lists)
• Open Source / free service
• Matching on volume level
  – Score / probability
Implementation
                           Backend Services
                               EROMM / EDL / OCLC / …



                      Registry / Meta Data Store

                  Aggregator / Normalizer / Mapping

                                          API
                       Query

      Ingest                                      Ingest       Ingest




                      ? ? ?                                !      !      !
Present Collections             Running Project            Notice of Intent
Metadata Store
•   Bibliographic
     –   Title
     –   Author
     –   Date
     –   Place of publication      Matching / Score
     –   Number of Pages (?)       „what“
     –   Language
     –   Print / Format
     –   Edition
•   Technical
     –   Resolution
     –   Color depth
     –   File type / compression
•   Accessibility                  Additional Judging
     –   Institution               „who, where, which
     –   Persistent identifier     quality, how
     –   Rights                    accesible“
     –   URL
•   Status
     –   Digitized
     –   In Progress               Decisive Factor
     –   Intended (Timeline?)      „when“
     –   Requested?
Obstacles
• (open source) Tools for automated matching /
  scoring?
• Interface for manual comparison / decision making
• Multivolume works: low rate of uniformity (near
  50% of physical SUB stock before 1900)
• Unicode
• Transliteration tables
• Random bound books
• Reliable identifier
   – ISBN for old books?

• Anticipated rate of accuracy: 50 – 70 %
Appreciation of Values
• The goal is NOT to build a reliable database in terms of
  library standards
• But to prevent further waste of resources.
• If we manage to archive just 50% precision,
• We saved a min. 50% of founding!
Work Packages
• Define metadata model
• Set up database
• Implement mapping tools
• Define API calls
• Implement API
• Build some connectors to popular mass digitization workflow
  tools (e.g. “Goobi”)
• Establish ISBN workflow
• Harvest existing sources
• Start with a community of actual projects

• Get some (!) founding
• Estimated schedule plan: 6 months
Thank You
(stockmann@uni-goettingen.de)

More Related Content

Viewers also liked

Das materielle Objekt in der digitalen Welt
Das materielle Objekt in der digitalen WeltDas materielle Objekt in der digitalen Welt
Das materielle Objekt in der digitalen WeltRalf Stockmann
 
Deutsche Digitale Bibliothek - Vorstellung CeBit 2008
Deutsche Digitale Bibliothek - Vorstellung CeBit 2008Deutsche Digitale Bibliothek - Vorstellung CeBit 2008
Deutsche Digitale Bibliothek - Vorstellung CeBit 2008Ralf Stockmann
 
DFG Expertenworkshop - Workflow Volltextgenerierung über OCR
DFG Expertenworkshop - Workflow Volltextgenerierung über OCRDFG Expertenworkshop - Workflow Volltextgenerierung über OCR
DFG Expertenworkshop - Workflow Volltextgenerierung über OCRRalf Stockmann
 
eAqua und europeana4D - 2009
eAqua und europeana4D - 2009eAqua und europeana4D - 2009
eAqua und europeana4D - 2009Ralf Stockmann
 
Ist Langzeitarchivierung finanzierbar? Präsentation Akademie Sankelmark 2008
Ist Langzeitarchivierung finanzierbar? Präsentation Akademie Sankelmark 2008Ist Langzeitarchivierung finanzierbar? Präsentation Akademie Sankelmark 2008
Ist Langzeitarchivierung finanzierbar? Präsentation Akademie Sankelmark 2008Ralf Stockmann
 
Visualisierung bibliographischer Daten
Visualisierung bibliographischer DatenVisualisierung bibliographischer Daten
Visualisierung bibliographischer DatenRalf Stockmann
 
GUI-Mockups in der Softwareentwicklung
GUI-Mockups in der SoftwareentwicklungGUI-Mockups in der Softwareentwicklung
GUI-Mockups in der SoftwareentwicklungRalf Stockmann
 
maple , part2
maple , part2maple , part2
maple , part2ahamidp
 
Lansio Cysill Ar Lein12
Lansio Cysill Ar Lein12Lansio Cysill Ar Lein12
Lansio Cysill Ar Lein12canolfanbedwyr
 
Fireside Chats
Fireside ChatsFireside Chats
Fireside Chatscrissy3258
 
Lecture04- Use Case Diagrams
Lecture04- Use Case DiagramsLecture04- Use Case Diagrams
Lecture04- Use Case Diagramsartgreen
 
Out of comfort zone, into the adventure
Out of comfort zone, into the adventure Out of comfort zone, into the adventure
Out of comfort zone, into the adventure Adalberto Geradini
 
Il processo di cambiamento in un'Azienda Sanitaria
Il processo di cambiamento in un'Azienda SanitariaIl processo di cambiamento in un'Azienda Sanitaria
Il processo di cambiamento in un'Azienda SanitariaAdalberto Geradini
 
C'è un nuovo mondo del lavoro ?!
C'è un nuovo mondo del lavoro ?!C'è un nuovo mondo del lavoro ?!
C'è un nuovo mondo del lavoro ?!Adalberto Geradini
 
The Genocide In Rwanda
The Genocide In RwandaThe Genocide In Rwanda
The Genocide In Rwandabpersett
 
Perchè qualcuno dovrebbe darti un lavoro ?
Perchè qualcuno dovrebbe darti un lavoro ?Perchè qualcuno dovrebbe darti un lavoro ?
Perchè qualcuno dovrebbe darti un lavoro ?Adalberto Geradini
 
Gli s-vantaggi della relazione
Gli s-vantaggi della relazioneGli s-vantaggi della relazione
Gli s-vantaggi della relazioneAdalberto Geradini
 

Viewers also liked (20)

Das materielle Objekt in der digitalen Welt
Das materielle Objekt in der digitalen WeltDas materielle Objekt in der digitalen Welt
Das materielle Objekt in der digitalen Welt
 
Deutsche Digitale Bibliothek - Vorstellung CeBit 2008
Deutsche Digitale Bibliothek - Vorstellung CeBit 2008Deutsche Digitale Bibliothek - Vorstellung CeBit 2008
Deutsche Digitale Bibliothek - Vorstellung CeBit 2008
 
DFG Expertenworkshop - Workflow Volltextgenerierung über OCR
DFG Expertenworkshop - Workflow Volltextgenerierung über OCRDFG Expertenworkshop - Workflow Volltextgenerierung über OCR
DFG Expertenworkshop - Workflow Volltextgenerierung über OCR
 
eAqua und europeana4D - 2009
eAqua und europeana4D - 2009eAqua und europeana4D - 2009
eAqua und europeana4D - 2009
 
Ist Langzeitarchivierung finanzierbar? Präsentation Akademie Sankelmark 2008
Ist Langzeitarchivierung finanzierbar? Präsentation Akademie Sankelmark 2008Ist Langzeitarchivierung finanzierbar? Präsentation Akademie Sankelmark 2008
Ist Langzeitarchivierung finanzierbar? Präsentation Akademie Sankelmark 2008
 
Visualisierung bibliographischer Daten
Visualisierung bibliographischer DatenVisualisierung bibliographischer Daten
Visualisierung bibliographischer Daten
 
GUI-Mockups in der Softwareentwicklung
GUI-Mockups in der SoftwareentwicklungGUI-Mockups in der Softwareentwicklung
GUI-Mockups in der Softwareentwicklung
 
maple , part2
maple , part2maple , part2
maple , part2
 
Lansio Cysill Ar Lein12
Lansio Cysill Ar Lein12Lansio Cysill Ar Lein12
Lansio Cysill Ar Lein12
 
Fireside Chats
Fireside ChatsFireside Chats
Fireside Chats
 
Cyflwyniad Bloc
Cyflwyniad BlocCyflwyniad Bloc
Cyflwyniad Bloc
 
Lecture04- Use Case Diagrams
Lecture04- Use Case DiagramsLecture04- Use Case Diagrams
Lecture04- Use Case Diagrams
 
Out of comfort zone, into the adventure
Out of comfort zone, into the adventure Out of comfort zone, into the adventure
Out of comfort zone, into the adventure
 
Il processo di cambiamento in un'Azienda Sanitaria
Il processo di cambiamento in un'Azienda SanitariaIl processo di cambiamento in un'Azienda Sanitaria
Il processo di cambiamento in un'Azienda Sanitaria
 
Visioning the vision
Visioning the visionVisioning the vision
Visioning the vision
 
C'è un nuovo mondo del lavoro ?!
C'è un nuovo mondo del lavoro ?!C'è un nuovo mondo del lavoro ?!
C'è un nuovo mondo del lavoro ?!
 
The Genocide In Rwanda
The Genocide In RwandaThe Genocide In Rwanda
The Genocide In Rwanda
 
Perchè qualcuno dovrebbe darti un lavoro ?
Perchè qualcuno dovrebbe darti un lavoro ?Perchè qualcuno dovrebbe darti un lavoro ?
Perchè qualcuno dovrebbe darti un lavoro ?
 
Gli s-vantaggi della relazione
Gli s-vantaggi della relazioneGli s-vantaggi della relazione
Gli s-vantaggi della relazione
 
Be unique
Be unique Be unique
Be unique
 

Similar to Central Registry for Digitized Objects Links Production and Bibliographic Control

Workflows in the Virtual Observatory
Workflows in the Virtual ObservatoryWorkflows in the Virtual Observatory
Workflows in the Virtual ObservatoryJose Enrique Ruiz
 
Social Media, Cloud Computing, Machine Learning, Open Source, and Big Data An...
Social Media, Cloud Computing, Machine Learning, Open Source, and Big Data An...Social Media, Cloud Computing, Machine Learning, Open Source, and Big Data An...
Social Media, Cloud Computing, Machine Learning, Open Source, and Big Data An...Open Analytics
 
Open Data Summit Presentation by Joe Olsen
Open Data Summit Presentation by Joe OlsenOpen Data Summit Presentation by Joe Olsen
Open Data Summit Presentation by Joe OlsenChristopher Whitaker
 
Liferay & Big Data Dev Con 2014
Liferay & Big Data Dev Con 2014Liferay & Big Data Dev Con 2014
Liferay & Big Data Dev Con 2014Miguel Pastor
 
Wordware 2011: Lingoport i18n Planning & Static Analysis
Wordware 2011: Lingoport i18n Planning & Static AnalysisWordware 2011: Lingoport i18n Planning & Static Analysis
Wordware 2011: Lingoport i18n Planning & Static AnalysisLingoport (www.lingoport.com)
 
Designing and Implementing Search Solutions
Designing and Implementing Search SolutionsDesigning and Implementing Search Solutions
Designing and Implementing Search SolutionsFindwise
 
Caliber 2009 Tutorial Mgsree
Caliber 2009 Tutorial MgsreeCaliber 2009 Tutorial Mgsree
Caliber 2009 Tutorial Mgsreemgsree
 
HPCC Systems Engineering Summit: Community Use Case: Because Who Has Time for...
HPCC Systems Engineering Summit: Community Use Case: Because Who Has Time for...HPCC Systems Engineering Summit: Community Use Case: Because Who Has Time for...
HPCC Systems Engineering Summit: Community Use Case: Because Who Has Time for...HPCC Systems
 
Open Source Web Content Management Technologies for Libraries
Open Source Web Content Management Technologies for LibrariesOpen Source Web Content Management Technologies for Libraries
Open Source Web Content Management Technologies for LibrariesAnil Mishra
 
BI on Cloud Computing
BI on Cloud ComputingBI on Cloud Computing
BI on Cloud Computingtdwiindia
 
Docs as Part of the Product - Open Source Summit North America 2018
Docs as Part of the Product - Open Source Summit North America 2018Docs as Part of the Product - Open Source Summit North America 2018
Docs as Part of the Product - Open Source Summit North America 2018Den Delimarsky
 
Engineering patterns for implementing data science models on big data platforms
Engineering patterns for implementing data science models on big data platformsEngineering patterns for implementing data science models on big data platforms
Engineering patterns for implementing data science models on big data platformsHisham Arafat
 
Crossmedia Workflows
Crossmedia WorkflowsCrossmedia Workflows
Crossmedia WorkflowsDwight Kelly
 
Kuali OLE: A Look at our Software Deliverables Roadmap One Year On
Kuali OLE: A Look at our Software Deliverables Roadmap One Year OnKuali OLE: A Look at our Software Deliverables Roadmap One Year On
Kuali OLE: A Look at our Software Deliverables Roadmap One Year OnRobert H. McDonald
 
Digitization in theory and practice
Digitization in theory and practiceDigitization in theory and practice
Digitization in theory and practiceHelen Nneka Okpala
 
Katherine Kott Slides for DLF PM Group 2011
Katherine Kott Slides for DLF PM Group 2011Katherine Kott Slides for DLF PM Group 2011
Katherine Kott Slides for DLF PM Group 2011DLFCLIR
 
What Your Library Needs to Know About Kuali Open Library Environment (OLE) an...
What Your Library Needs to Know About Kuali Open Library Environment (OLE) an...What Your Library Needs to Know About Kuali Open Library Environment (OLE) an...
What Your Library Needs to Know About Kuali Open Library Environment (OLE) an...Robert H. McDonald
 

Similar to Central Registry for Digitized Objects Links Production and Bibliographic Control (20)

Workflows in the Virtual Observatory
Workflows in the Virtual ObservatoryWorkflows in the Virtual Observatory
Workflows in the Virtual Observatory
 
Social Media, Cloud Computing, Machine Learning, Open Source, and Big Data An...
Social Media, Cloud Computing, Machine Learning, Open Source, and Big Data An...Social Media, Cloud Computing, Machine Learning, Open Source, and Big Data An...
Social Media, Cloud Computing, Machine Learning, Open Source, and Big Data An...
 
Open Data Summit Presentation by Joe Olsen
Open Data Summit Presentation by Joe OlsenOpen Data Summit Presentation by Joe Olsen
Open Data Summit Presentation by Joe Olsen
 
Liferay & Big Data Dev Con 2014
Liferay & Big Data Dev Con 2014Liferay & Big Data Dev Con 2014
Liferay & Big Data Dev Con 2014
 
Wordware 2011: Lingoport i18n Planning & Static Analysis
Wordware 2011: Lingoport i18n Planning & Static AnalysisWordware 2011: Lingoport i18n Planning & Static Analysis
Wordware 2011: Lingoport i18n Planning & Static Analysis
 
Designing and Implementing Search Solutions
Designing and Implementing Search SolutionsDesigning and Implementing Search Solutions
Designing and Implementing Search Solutions
 
Caliber 2009 Tutorial Mgsree
Caliber 2009 Tutorial MgsreeCaliber 2009 Tutorial Mgsree
Caliber 2009 Tutorial Mgsree
 
Cassandra eu
Cassandra euCassandra eu
Cassandra eu
 
HPCC Systems Engineering Summit: Community Use Case: Because Who Has Time for...
HPCC Systems Engineering Summit: Community Use Case: Because Who Has Time for...HPCC Systems Engineering Summit: Community Use Case: Because Who Has Time for...
HPCC Systems Engineering Summit: Community Use Case: Because Who Has Time for...
 
Open Source Web Content Management Technologies for Libraries
Open Source Web Content Management Technologies for LibrariesOpen Source Web Content Management Technologies for Libraries
Open Source Web Content Management Technologies for Libraries
 
BI on Cloud Computing
BI on Cloud ComputingBI on Cloud Computing
BI on Cloud Computing
 
32 cc 3_a_l-drumheller
32 cc 3_a_l-drumheller32 cc 3_a_l-drumheller
32 cc 3_a_l-drumheller
 
Docs as Part of the Product - Open Source Summit North America 2018
Docs as Part of the Product - Open Source Summit North America 2018Docs as Part of the Product - Open Source Summit North America 2018
Docs as Part of the Product - Open Source Summit North America 2018
 
Engineering patterns for implementing data science models on big data platforms
Engineering patterns for implementing data science models on big data platformsEngineering patterns for implementing data science models on big data platforms
Engineering patterns for implementing data science models on big data platforms
 
Crossmedia Workflows
Crossmedia WorkflowsCrossmedia Workflows
Crossmedia Workflows
 
Kuali OLE: A Look at our Software Deliverables Roadmap One Year On
Kuali OLE: A Look at our Software Deliverables Roadmap One Year OnKuali OLE: A Look at our Software Deliverables Roadmap One Year On
Kuali OLE: A Look at our Software Deliverables Roadmap One Year On
 
E meyer lamp2012
E meyer lamp2012E meyer lamp2012
E meyer lamp2012
 
Digitization in theory and practice
Digitization in theory and practiceDigitization in theory and practice
Digitization in theory and practice
 
Katherine Kott Slides for DLF PM Group 2011
Katherine Kott Slides for DLF PM Group 2011Katherine Kott Slides for DLF PM Group 2011
Katherine Kott Slides for DLF PM Group 2011
 
What Your Library Needs to Know About Kuali Open Library Environment (OLE) an...
What Your Library Needs to Know About Kuali Open Library Environment (OLE) an...What Your Library Needs to Know About Kuali Open Library Environment (OLE) an...
What Your Library Needs to Know About Kuali Open Library Environment (OLE) an...
 

More from Ralf Stockmann

Freiräume schaffen - im Social Intranet
Freiräume schaffen - im Social IntranetFreiräume schaffen - im Social Intranet
Freiräume schaffen - im Social IntranetRalf Stockmann
 
Die Bibliothek als Wolkenfabrik - Cloud-Dienste als Plattformen für digitale ...
Die Bibliothek als Wolkenfabrik - Cloud-Dienste als Plattformen für digitale ...Die Bibliothek als Wolkenfabrik - Cloud-Dienste als Plattformen für digitale ...
Die Bibliothek als Wolkenfabrik - Cloud-Dienste als Plattformen für digitale ...Ralf Stockmann
 
Wie man vom Intranet aus die Welt verbessern kann
Wie man vom Intranet aus die Welt verbessern kannWie man vom Intranet aus die Welt verbessern kann
Wie man vom Intranet aus die Welt verbessern kannRalf Stockmann
 
Die Revolution vergisst ihre Kinder - Drei Szenarien, wie Bibliotheken in 15 ...
Die Revolution vergisst ihre Kinder - Drei Szenarien, wie Bibliotheken in 15 ...Die Revolution vergisst ihre Kinder - Drei Szenarien, wie Bibliotheken in 15 ...
Die Revolution vergisst ihre Kinder - Drei Szenarien, wie Bibliotheken in 15 ...Ralf Stockmann
 
Der Zauberlehrling 
war nicht als
 Anleitung gemeint
Der Zauberlehrling 
war nicht als
 Anleitung gemeintDer Zauberlehrling 
war nicht als
 Anleitung gemeint
Der Zauberlehrling 
war nicht als
 Anleitung gemeintRalf Stockmann
 
BibliothekarInnen gestalten digitale Wissensräume
BibliothekarInnen gestalten digitale WissensräumeBibliothekarInnen gestalten digitale Wissensräume
BibliothekarInnen gestalten digitale WissensräumeRalf Stockmann
 
Fit für die digitale Bibliothek? (2007)
Fit für die digitale Bibliothek? (2007)Fit für die digitale Bibliothek? (2007)
Fit für die digitale Bibliothek? (2007)Ralf Stockmann
 
Grundlagen Digitaler Mediengestaltung
Grundlagen Digitaler MediengestaltungGrundlagen Digitaler Mediengestaltung
Grundlagen Digitaler MediengestaltungRalf Stockmann
 
Was Wissenschaftler wirklich Wollen
Was Wissenschaftler wirklich WollenWas Wissenschaftler wirklich Wollen
Was Wissenschaftler wirklich WollenRalf Stockmann
 
Was tun mit den Ergebnissen der OCR?
Was tun mit den Ergebnissen der OCR?Was tun mit den Ergebnissen der OCR?
Was tun mit den Ergebnissen der OCR?Ralf Stockmann
 
Keynote Studip Zukunftsworkshop
Keynote Studip ZukunftsworkshopKeynote Studip Zukunftsworkshop
Keynote Studip ZukunftsworkshopRalf Stockmann
 
Visually Lossless Kompression für die Digitalisierung an Bibliotheken
Visually Lossless Kompression für die Digitalisierung an BibliothekenVisually Lossless Kompression für die Digitalisierung an Bibliotheken
Visually Lossless Kompression für die Digitalisierung an BibliothekenRalf Stockmann
 
Was wir HEUTE beachten müssen um die Wissenschaftler in 10 Jahren nicht zu en...
Was wir HEUTE beachten müssen um die Wissenschaftler in 10 Jahren nicht zu en...Was wir HEUTE beachten müssen um die Wissenschaftler in 10 Jahren nicht zu en...
Was wir HEUTE beachten müssen um die Wissenschaftler in 10 Jahren nicht zu en...Ralf Stockmann
 
Goobi Rollen Und Rechte
Goobi Rollen Und RechteGoobi Rollen Und Rechte
Goobi Rollen Und RechteRalf Stockmann
 
Persitent Identifier in Goobi
Persitent Identifier in GoobiPersitent Identifier in Goobi
Persitent Identifier in GoobiRalf Stockmann
 
Kooperative Angebote von GBV und GDZ im Bereich Digitalisierung
Kooperative Angebote von GBV und GDZ im Bereich DigitalisierungKooperative Angebote von GBV und GDZ im Bereich Digitalisierung
Kooperative Angebote von GBV und GDZ im Bereich DigitalisierungRalf Stockmann
 

More from Ralf Stockmann (17)

Freiräume schaffen - im Social Intranet
Freiräume schaffen - im Social IntranetFreiräume schaffen - im Social Intranet
Freiräume schaffen - im Social Intranet
 
Die Bibliothek als Wolkenfabrik - Cloud-Dienste als Plattformen für digitale ...
Die Bibliothek als Wolkenfabrik - Cloud-Dienste als Plattformen für digitale ...Die Bibliothek als Wolkenfabrik - Cloud-Dienste als Plattformen für digitale ...
Die Bibliothek als Wolkenfabrik - Cloud-Dienste als Plattformen für digitale ...
 
Wie man vom Intranet aus die Welt verbessern kann
Wie man vom Intranet aus die Welt verbessern kannWie man vom Intranet aus die Welt verbessern kann
Wie man vom Intranet aus die Welt verbessern kann
 
Die Revolution vergisst ihre Kinder - Drei Szenarien, wie Bibliotheken in 15 ...
Die Revolution vergisst ihre Kinder - Drei Szenarien, wie Bibliotheken in 15 ...Die Revolution vergisst ihre Kinder - Drei Szenarien, wie Bibliotheken in 15 ...
Die Revolution vergisst ihre Kinder - Drei Szenarien, wie Bibliotheken in 15 ...
 
Der Zauberlehrling 
war nicht als
 Anleitung gemeint
Der Zauberlehrling 
war nicht als
 Anleitung gemeintDer Zauberlehrling 
war nicht als
 Anleitung gemeint
Der Zauberlehrling 
war nicht als
 Anleitung gemeint
 
BibliothekarInnen gestalten digitale Wissensräume
BibliothekarInnen gestalten digitale WissensräumeBibliothekarInnen gestalten digitale Wissensräume
BibliothekarInnen gestalten digitale Wissensräume
 
Fit für die digitale Bibliothek? (2007)
Fit für die digitale Bibliothek? (2007)Fit für die digitale Bibliothek? (2007)
Fit für die digitale Bibliothek? (2007)
 
Grundlagen Digitaler Mediengestaltung
Grundlagen Digitaler MediengestaltungGrundlagen Digitaler Mediengestaltung
Grundlagen Digitaler Mediengestaltung
 
Was Wissenschaftler wirklich Wollen
Was Wissenschaftler wirklich WollenWas Wissenschaftler wirklich Wollen
Was Wissenschaftler wirklich Wollen
 
Was tun mit den Ergebnissen der OCR?
Was tun mit den Ergebnissen der OCR?Was tun mit den Ergebnissen der OCR?
Was tun mit den Ergebnissen der OCR?
 
Keynote Studip Zukunftsworkshop
Keynote Studip ZukunftsworkshopKeynote Studip Zukunftsworkshop
Keynote Studip Zukunftsworkshop
 
Zukunft der E Books
Zukunft der E BooksZukunft der E Books
Zukunft der E Books
 
Visually Lossless Kompression für die Digitalisierung an Bibliotheken
Visually Lossless Kompression für die Digitalisierung an BibliothekenVisually Lossless Kompression für die Digitalisierung an Bibliotheken
Visually Lossless Kompression für die Digitalisierung an Bibliotheken
 
Was wir HEUTE beachten müssen um die Wissenschaftler in 10 Jahren nicht zu en...
Was wir HEUTE beachten müssen um die Wissenschaftler in 10 Jahren nicht zu en...Was wir HEUTE beachten müssen um die Wissenschaftler in 10 Jahren nicht zu en...
Was wir HEUTE beachten müssen um die Wissenschaftler in 10 Jahren nicht zu en...
 
Goobi Rollen Und Rechte
Goobi Rollen Und RechteGoobi Rollen Und Rechte
Goobi Rollen Und Rechte
 
Persitent Identifier in Goobi
Persitent Identifier in GoobiPersitent Identifier in Goobi
Persitent Identifier in Goobi
 
Kooperative Angebote von GBV und GDZ im Bereich Digitalisierung
Kooperative Angebote von GBV und GDZ im Bereich DigitalisierungKooperative Angebote von GBV und GDZ im Bereich Digitalisierung
Kooperative Angebote von GBV und GDZ im Bereich Digitalisierung
 

Recently uploaded

Dev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio WebDev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio WebUiPathCommunity
 
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024BookNet Canada
 
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptxThe Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptxLoriGlavin3
 
Anypoint Exchange: It’s Not Just a Repo!
Anypoint Exchange: It’s Not Just a Repo!Anypoint Exchange: It’s Not Just a Repo!
Anypoint Exchange: It’s Not Just a Repo!Manik S Magar
 
What's New in Teams Calling, Meetings and Devices March 2024
What's New in Teams Calling, Meetings and Devices March 2024What's New in Teams Calling, Meetings and Devices March 2024
What's New in Teams Calling, Meetings and Devices March 2024Stephanie Beckett
 
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024BookNet Canada
 
A Journey Into the Emotions of Software Developers
A Journey Into the Emotions of Software DevelopersA Journey Into the Emotions of Software Developers
A Journey Into the Emotions of Software DevelopersNicole Novielli
 
Training state-of-the-art general text embedding
Training state-of-the-art general text embeddingTraining state-of-the-art general text embedding
Training state-of-the-art general text embeddingZilliz
 
Advanced Computer Architecture – An Introduction
Advanced Computer Architecture – An IntroductionAdvanced Computer Architecture – An Introduction
Advanced Computer Architecture – An IntroductionDilum Bandara
 
SALESFORCE EDUCATION CLOUD | FEXLE SERVICES
SALESFORCE EDUCATION CLOUD | FEXLE SERVICESSALESFORCE EDUCATION CLOUD | FEXLE SERVICES
SALESFORCE EDUCATION CLOUD | FEXLE SERVICESmohitsingh558521
 
Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 365Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 3652toLead Limited
 
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024BookNet Canada
 
What is Artificial Intelligence?????????
What is Artificial Intelligence?????????What is Artificial Intelligence?????????
What is Artificial Intelligence?????????blackmambaettijean
 
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)Mark Simos
 
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek SchlawackFwdays
 
Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024BookNet Canada
 
The State of Passkeys with FIDO Alliance.pptx
The State of Passkeys with FIDO Alliance.pptxThe State of Passkeys with FIDO Alliance.pptx
The State of Passkeys with FIDO Alliance.pptxLoriGlavin3
 
What is DBT - The Ultimate Data Build Tool.pdf
What is DBT - The Ultimate Data Build Tool.pdfWhat is DBT - The Ultimate Data Build Tool.pdf
What is DBT - The Ultimate Data Build Tool.pdfMounikaPolabathina
 
Rise of the Machines: Known As Drones...
Rise of the Machines: Known As Drones...Rise of the Machines: Known As Drones...
Rise of the Machines: Known As Drones...Rick Flair
 
How AI, OpenAI, and ChatGPT impact business and software.
How AI, OpenAI, and ChatGPT impact business and software.How AI, OpenAI, and ChatGPT impact business and software.
How AI, OpenAI, and ChatGPT impact business and software.Curtis Poe
 

Recently uploaded (20)

Dev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio WebDev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio Web
 
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
 
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptxThe Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
 
Anypoint Exchange: It’s Not Just a Repo!
Anypoint Exchange: It’s Not Just a Repo!Anypoint Exchange: It’s Not Just a Repo!
Anypoint Exchange: It’s Not Just a Repo!
 
What's New in Teams Calling, Meetings and Devices March 2024
What's New in Teams Calling, Meetings and Devices March 2024What's New in Teams Calling, Meetings and Devices March 2024
What's New in Teams Calling, Meetings and Devices March 2024
 
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
 
A Journey Into the Emotions of Software Developers
A Journey Into the Emotions of Software DevelopersA Journey Into the Emotions of Software Developers
A Journey Into the Emotions of Software Developers
 
Training state-of-the-art general text embedding
Training state-of-the-art general text embeddingTraining state-of-the-art general text embedding
Training state-of-the-art general text embedding
 
Advanced Computer Architecture – An Introduction
Advanced Computer Architecture – An IntroductionAdvanced Computer Architecture – An Introduction
Advanced Computer Architecture – An Introduction
 
SALESFORCE EDUCATION CLOUD | FEXLE SERVICES
SALESFORCE EDUCATION CLOUD | FEXLE SERVICESSALESFORCE EDUCATION CLOUD | FEXLE SERVICES
SALESFORCE EDUCATION CLOUD | FEXLE SERVICES
 
Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 365Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 365
 
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
 
What is Artificial Intelligence?????????
What is Artificial Intelligence?????????What is Artificial Intelligence?????????
What is Artificial Intelligence?????????
 
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
 
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
 
Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
 
The State of Passkeys with FIDO Alliance.pptx
The State of Passkeys with FIDO Alliance.pptxThe State of Passkeys with FIDO Alliance.pptx
The State of Passkeys with FIDO Alliance.pptx
 
What is DBT - The Ultimate Data Build Tool.pdf
What is DBT - The Ultimate Data Build Tool.pdfWhat is DBT - The Ultimate Data Build Tool.pdf
What is DBT - The Ultimate Data Build Tool.pdf
 
Rise of the Machines: Known As Drones...
Rise of the Machines: Known As Drones...Rise of the Machines: Known As Drones...
Rise of the Machines: Known As Drones...
 
How AI, OpenAI, and ChatGPT impact business and software.
How AI, OpenAI, and ChatGPT impact business and software.How AI, OpenAI, and ChatGPT impact business and software.
How AI, OpenAI, and ChatGPT impact business and software.
 

Central Registry for Digitized Objects Links Production and Bibliographic Control

  • 1. Central Registry for Digitized Objects: Linking Production and Bibliographic Control Ralf Stockmann Göttinger Digitization Center
  • 2. As things are now • Huge ventures in – Digitization • Google • Microsoft • National programs • Local centers – Accessibility • World Digital Library • European Digital Library • National portals • Google Book Search
  • 3. As things are now • We just face the dawn of mass digitization – Leaving behind the state of manufacturing – Entering industrialization – Scanning Robots – Accessible Full Text (OCR)
  • 4. Lack of … • Coordination in digitization activities – Who scans what where when in which quality and how will it be accessible • How is “quality” defined? • Do we agree on “what”?
  • 5. Facing the Consequences Technical Improvements Costs Waste of Ressources Costs / Value Additional Benefit Number of digitized items per volume
  • 6. The Solution • Central registry for digitized objects • Focused on the production context (no user frontend) • API driven – Application Programming Interface – Query / Ingest – Simple implementation into existing workflow-tools • Batch mode (lists) • Open Source / free service • Matching on volume level – Score / probability
  • 7. Implementation Backend Services EROMM / EDL / OCLC / … Registry / Meta Data Store Aggregator / Normalizer / Mapping API Query Ingest Ingest Ingest ? ? ? ! ! ! Present Collections Running Project Notice of Intent
  • 8. Metadata Store • Bibliographic – Title – Author – Date – Place of publication Matching / Score – Number of Pages (?) „what“ – Language – Print / Format – Edition • Technical – Resolution – Color depth – File type / compression • Accessibility Additional Judging – Institution „who, where, which – Persistent identifier quality, how – Rights accesible“ – URL • Status – Digitized – In Progress Decisive Factor – Intended (Timeline?) „when“ – Requested?
  • 9. Obstacles • (open source) Tools for automated matching / scoring? • Interface for manual comparison / decision making • Multivolume works: low rate of uniformity (near 50% of physical SUB stock before 1900) • Unicode • Transliteration tables • Random bound books • Reliable identifier – ISBN for old books? • Anticipated rate of accuracy: 50 – 70 %
  • 10. Appreciation of Values • The goal is NOT to build a reliable database in terms of library standards • But to prevent further waste of resources. • If we manage to archive just 50% precision, • We saved a min. 50% of founding!
  • 11. Work Packages • Define metadata model • Set up database • Implement mapping tools • Define API calls • Implement API • Build some connectors to popular mass digitization workflow tools (e.g. “Goobi”) • Establish ISBN workflow • Harvest existing sources • Start with a community of actual projects • Get some (!) founding • Estimated schedule plan: 6 months