Unified Digital Format Registry
a semantic registry for digital preservation



                                                 Digital Library Federation Forum
                                               Baltimore, October 31-November 2, 2011




           UDFR: A Semantic Registry for Format
               Representation Information


                                                      Lisa Dawn Colvin
                                                       Abhishek Salve
                                                      Stephen Abrams
                                                      UC Curation Center
                                                     California Digital Library




                                               Outline
       What
       Why
       How
       When




                                                  Why formats?
      “Format” is the dividing line between bits and
      information
        Syntax                                          Semantics
        ffd8ffe000104a46                                SOI
        4946000102010083                                APP0    JFIF 1.2
        00830000ffed0fb0                                APP13   IPTC
        50686f746f73686f                                APP2    ICC
        7020332e30003842                                DQT
        494d03e90a507269                                SOF0    183x512
        6e7420496e666f00                                DRI
        0000007800000000                                DHT
        0048004800000000                                SOS
        02f40240ffeeffee                                ECS0
        0306025203470528                                RST0
        03fc000200000048                                ECS1
        00480000000002d8                                RST1
        0228000100000064                                ECS2
        0000000100030...                                ...
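
The step from the left column to the right column can be sketched in a few lines of Python. This is a minimal illustration rather than a full JPEG parser: it assumes the standard JPEG marker layout (0xFF, a marker code, then a two-byte big-endian length that counts itself), and the function names are made up for this example. The sample input is the first 24 bytes of the hex dump on the slide.

```python
import struct

# Names for the non-ranged marker codes that appear on the slide.
MARKER_NAMES = {0xC0: "SOF0", 0xC4: "DHT", 0xD8: "SOI", 0xD9: "EOI",
                0xDA: "SOS", 0xDB: "DQT", 0xDD: "DRI"}


def marker_name(code):
    """Map a marker code byte to its conventional name."""
    if 0xE0 <= code <= 0xEF:
        return "APP%d" % (code - 0xE0)
    if 0xD0 <= code <= 0xD7:
        return "RST%d" % (code - 0xD0)
    return MARKER_NAMES.get(code, "0x%02X" % code)


def list_segments(data):
    """Return (name, payload_length) pairs for header segments, stopping at SOS."""
    if data[:2] != b"\xff\xd8":
        raise ValueError("not a JPEG stream: missing SOI marker")
    segments = [("SOI", 0)]
    pos = 2
    while pos + 4 <= len(data):
        if data[pos] != 0xFF:
            raise ValueError("expected a marker at offset %d" % pos)
        code = data[pos + 1]
        (length,) = struct.unpack(">H", data[pos + 2:pos + 4])
        segments.append((marker_name(code), length - 2))
        if code == 0xDA:  # entropy-coded data follows SOS
            break
        pos += 2 + length  # skip the marker and the whole segment
    return segments


def jfif_version(data):
    """Return (major, minor) from an APP0 JFIF segment right after SOI, else None."""
    if data[2:4] == b"\xff\xe0" and data[6:11] == b"JFIF\x00":
        return data[11], data[12]
    return None


# First 24 bytes of the dump shown on the slide.
sample = bytes.fromhex("ffd8ffe000104a46494600010201008300830000ffed0fb0")
print(list_segments(sample))
print(jfif_version(sample))
```

The same bytes that look like noise in the left column yield SOI, APP0 and APP13 segments once the format's rules are applied.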




                                               Why formats?
    There are many necessary preservation activities that
    can be usefully performed on bits qua bits
    But to preserve information you must act on
    formatted bits and know what those formats mean
            • Preservation of syntax and semantics




                         Unified Digital Format Registry
      “A reliable, publicly accessible, and sustainable
      knowledge base of file format representation
      information for use by the digital preservation
      community”
              • “Unification” of the function and holdings of PRONOM
                and GDFR
                     http://www.nationalarchives.gov.uk/PRONOM
                     http://gdfr.info/

              • Open source platform / GPL
              • Semantic wiki
              • Funded by the Library of Congress




                                               Timeline
       PRONOM – National Archives [UK], 2002
       http://www.nationalarchives.gov.uk/PRONOM

              “ready access to reliable technical information about the
              nature of electronic records”
       JHOVE – Harvard, 2003
       http://hul.harvard.edu/jhove

                “digital object validation and characterization”
       GDFR – Harvard/OCLC, 2006
       http://gdfr.info/

              “a distributed and replicated registry of format information
              populated and vetted by experts and enthusiasts world-
              wide”




                                               Timeline
      UDFR – Ad hoc stakeholder community, 2009
              • Resolve PRONOM IPR issues and develop a community-
                supported open source solution
              • Advance beyond legacy RDBMS and XML database
                technology
      UDFR – CDL, January 2011
      http://udfr.org/

                     “a semantic registry for digital preservation”
              • Stakeholder meeting, April 2011
              • Beta release, November 2011
              • Production release, January 2012




                                Representation information
      What you need to know about something in order to
      exploit that thing meaningfully [OAIS/ISO 14721]
      Information that lets you answer important
      preservation questions
              •      What format is it?
              •      What are its significant properties?
              •      Is it valid?
              •      Is it at risk?
              •      How can I render/play/read it?
              •      What can it be transformed into?
              •      And how?
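
The first question, "What format is it?", is the one most often answered mechanically. The sketch below is only an illustration, not how any particular registry or tool works: it checks a file's leading bytes against a few well-known magic numbers (an internal signature) and its extension against a lookup table (an external signature). The table contents and the function name are assumptions made for this example.

```python
import os

# Internal signatures: well-known leading byte sequences ("magic numbers").
INTERNAL_SIGNATURES = [
    ("JPEG", b"\xff\xd8\xff"),
    ("PNG", b"\x89PNG\r\n\x1a\n"),
    ("PDF", b"%PDF-"),
]

# External signatures: filename extensions (illustrative subset).
EXTERNAL_SIGNATURES = {".jpg": "JPEG", ".jpeg": "JPEG", ".png": "PNG", ".pdf": "PDF"}


def identify(filename, head):
    """Return (format_by_content, format_by_name); either may be None."""
    by_content = next(
        (name for name, magic in INTERNAL_SIGNATURES if head.startswith(magic)),
        None,
    )
    extension = os.path.splitext(filename)[1].lower()
    by_name = EXTERNAL_SIGNATURES.get(extension)
    return by_content, by_name


# A file named like a PDF whose bytes start like a JPEG: the two signals disagree.
print(identify("scan.pdf", bytes.fromhex("ffd8ffe000104a46")))
```

When the two answers disagree, the content-based one is usually the more trustworthy, which is one reason a registry records both kinds of signature.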




                                               Why semantic?
        Everyone wants to say something about everything
               • The semantic web lets anyone say anything about
                 anything
               • Understandable to both people and machines




                                                          Data modeling
      [Diagram: class and relationship model]
      Classes: Abstract Base, Controlled Vocabulary, Abstract Product,
      Abstract Format, Abstract Signature, Process, IPR, Agent, Holding,
      Digest, Software, Hardware, Media, Document, File, External Signature,
      Internal Signature, Assessment, Grammar, Character Encoding,
      File Format, Compression Algorithm
      Relationship labels: holder, dependency, creator, owner, maintainer,
      product, reference, embodies, ipr, specification, file, digest,
      input / output, signature, grammar, assessment




                                               Provenance
      “Trust, but verify”
              • Complete change history
                at the assertion level,
                including
                        – Who made the assertion, and when?
                        – Confidence based on personal and institutional
                          reputation
              • Imprimatur by technically knowledgeable
                reviewers
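
One way to picture assertion-level change history is an append-only log in which every statement carries who made it and when, and a correction is a new entry rather than an overwrite. The following is a minimal sketch under that assumption. The class names, field names and sample values are all hypothetical and are not taken from the UDFR code base.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import List, Optional


@dataclass(frozen=True)
class Assertion:
    """A single statement plus its provenance (all field names hypothetical)."""
    subject: str
    predicate: str
    value: str
    asserted_by: str
    asserted_at: datetime
    reviewed_by: Optional[str] = None  # set once a reviewer has signed off


class AssertionLog:
    """Append-only history: a correction adds an entry, it never overwrites."""

    def __init__(self) -> None:
        self._entries: List[Assertion] = []

    def add(self, assertion: Assertion) -> None:
        self._entries.append(assertion)

    def history(self, subject: str, predicate: str) -> List[Assertion]:
        return [e for e in self._entries
                if e.subject == subject and e.predicate == predicate]

    def current(self, subject: str, predicate: str) -> Optional[Assertion]:
        entries = self.history(subject, predicate)
        return entries[-1] if entries else None


log = AssertionLog()
log.add(Assertion("format/jpeg", "version", "1.2", "alice",
                  datetime(2011, 10, 1, tzinfo=timezone.utc)))
log.add(Assertion("format/jpeg", "version", "1.02", "bob",
                  datetime(2011, 10, 15, tzinfo=timezone.utc), reviewed_by="carol"))
print(len(log.history("format/jpeg", "version")), log.current("format/jpeg", "version").value)
```

Because earlier entries are never removed, a reader can see both the current value and the trail of who said what before it, which is what makes "trust, but verify" possible.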




                                                       Ontologies
                    Prefix           Namespace
                    udfrs            http://udfr.org/onto#
                    udfr             http://udfr.org/udfr/


                    dc               http://purl.org/dc/elements/1.1/
                    dcterms          http://purl.org/dc/terms/
                    foaf             http://xmlns.com/foaf/0.1/
                    owl              http://www.w3.org/2002/07/owl#
                    pronom           http://reference.data.gov.uk/technical-registry/
                    rdf              http://www.w3.org/1999/02/22-rdf-syntax-ns#
                    rdfs             http://www.w3.org/2000/01/rdf-schema#
                    skos             http://www.w3.org/2004/02/skos/core#
                    xsd              http://www.w3.org/2001/XMLSchema#
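
A prefix table like this is shorthand: a prefixed name such as rdfs:label stands for the namespace IRI with the local name appended. The sketch below shows that expansion and how the same table can generate a SPARQL prologue. It uses the standard FOAF and XML Schema namespace IRIs, and the helper names are illustrative only.

```python
# Prefix-to-namespace table from the slide.
PREFIXES = {
    "udfrs": "http://udfr.org/onto#",
    "udfr": "http://udfr.org/udfr/",
    "dc": "http://purl.org/dc/elements/1.1/",
    "dcterms": "http://purl.org/dc/terms/",
    "foaf": "http://xmlns.com/foaf/0.1/",
    "owl": "http://www.w3.org/2002/07/owl#",
    "pronom": "http://reference.data.gov.uk/technical-registry/",
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
    "rdfs": "http://www.w3.org/2000/01/rdf-schema#",
    "skos": "http://www.w3.org/2004/02/skos/core#",
    "xsd": "http://www.w3.org/2001/XMLSchema#",
}


def expand(prefixed_name, prefixes=PREFIXES):
    """Expand 'prefix:local' into a full IRI."""
    prefix, sep, local = prefixed_name.partition(":")
    if not sep or prefix not in prefixes:
        raise ValueError("unknown or missing prefix in %r" % prefixed_name)
    return prefixes[prefix] + local


def sparql_prologue(names, prefixes=PREFIXES):
    """Build PREFIX declarations for the given prefix names."""
    return "\n".join("PREFIX %s: <%s>" % (n, prefixes[n]) for n in names)


print(expand("rdfs:label"))
print(sparql_prologue(["rdf", "rdfs"]))
```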




                                               Technology stack
      Layers, top to bottom:

      HTTP / SPARQL
      JavaScript / CSS
      OntoWiki                  http://ontowiki.net/
      Erfurt / RDFAuthor        http://aksw.org/Projects/Erfurt
                                https://github.com/AKSW/RDFauthor
      Zend framework            http://www.zend.com/
      Virtuoso 4store           http://virtuoso.openlinksw.com/
      PHP                       http://www.php.net/
      RDF                       http://www.w3.org/RDF
      Apache httpd              http://httpd.apache.org/
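
At the top of the stack, a client reaches the data by sending a SPARQL query over HTTP. As a rough sketch of what that looks like, the code below builds a GET request URL in the style of the SPARQL Protocol, where the query text travels in a `query` parameter. The endpoint address is a placeholder and not the registry's actual endpoint. The query is a generic one that lists labelled resources, and no network call is made.

```python
from urllib.parse import parse_qs, urlencode, urlsplit

# Placeholder endpoint: an assumption for illustration only.
ENDPOINT = "http://example.org/sparql"

QUERY = """PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
SELECT ?resource ?label
WHERE { ?resource rdfs:label ?label }
LIMIT 10"""


def build_query_url(endpoint, query):
    """Return a GET URL carrying the query in the 'query' parameter."""
    return endpoint + "?" + urlencode({"query": query})


def query_from_url(url):
    """Recover the query text from such a URL (useful for logging or tests)."""
    return parse_qs(urlsplit(url).query)["query"][0]


url = build_query_url(ENDPOINT, QUERY)
print(url)
```

Fetching the URL with any HTTP client would then return the result set in whatever serialization the server offers.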




                                               Initial population
      Export from PRONOM
              • Working with TNA to identify appropriate subset
              • Transform to cross-walk modeling differences




                                               Licensing
        Code is available under GPLv3
        http://www.gnu.org/copyleft/gpl.html

             • Hosted on BitBucket
                    http://www.bitbucket.org/udfr


        Data is contributed and available under CC-BY
        http://creativecommons.org/licenses/by/3.0/

               • Consistent with UK open government license applicable
                 to PRONOM data
                       http://www.nationalarchives.gov.uk/doc/open-government-licence




                                               Demo




                                               Lessons learned
 • People with semantic experience are scarce
 • Too much time evaluating/prototyping potential
   technology choices
 • More difficulty than anticipated integrating disparate
   open source products
 • 0.x software is often numbered that for a reason
 • Feature lists aren’t (always)




                                               Lessons learned
 • Availability of a worldwide selection of products is a
   good thing (except when you don’t read German)
        – Excellent support from AKSW/Universität Leipzig

 • Modeling differences
        – RDF (non-)standards

 • VM deployment
        – Disparate IT organizations supporting dev/prod instances




                                               Next steps
 • Long-term governance and operational support
 • Technical maintenance and enhancement
 • Replication/synchronization
 • Building contributor and reviewer communities




                                               For more information
      UDFR
      http://udfr.org/
      http://bitbucket.org/udfr

      PRONOM
      http://www.nationalarchives.gov.uk/PRONOM

      GDFR
      http://gdfr.info/

      OntoWiki
      http://ontowiki.net/Projects/OntoWiki

      Virtuoso
      http://www.openlinksw.com/dataspace/dav/wiki/Main/VOSRDFWP

      Agile Knowledge and Semantic Web (AKSW), Universität Leipzig
      http://aksw.org/

      UC3
      http://www.cdlib.org/uc3
      uc3@ucop.edu
      Stephen Abrams, Lisa Colvin, Patricia Cruse, Scott Fisher,
      Erik Hetzner, Greg Janée, John Kunze, Margaret Low, David Loy,
      Mark Reyes, Abhishek Salve, Tracy Seneca, Joan Starr,
      Carly Strasser, Marisa Strong, Adrian Turner, Perry Willett

YOUVE_GOT_EMAIL_PRELIMS_EL_DORADO_2024.pptx
 
HỌC TỐT TIẾNG ANH 11 THEO CHƯƠNG TRÌNH GLOBAL SUCCESS ĐÁP ÁN CHI TIẾT - CẢ NĂ...
HỌC TỐT TIẾNG ANH 11 THEO CHƯƠNG TRÌNH GLOBAL SUCCESS ĐÁP ÁN CHI TIẾT - CẢ NĂ...HỌC TỐT TIẾNG ANH 11 THEO CHƯƠNG TRÌNH GLOBAL SUCCESS ĐÁP ÁN CHI TIẾT - CẢ NĂ...
HỌC TỐT TIẾNG ANH 11 THEO CHƯƠNG TRÌNH GLOBAL SUCCESS ĐÁP ÁN CHI TIẾT - CẢ NĂ...
 
THEORIES OF ORGANIZATION-PUBLIC ADMINISTRATION
THEORIES OF ORGANIZATION-PUBLIC ADMINISTRATIONTHEORIES OF ORGANIZATION-PUBLIC ADMINISTRATION
THEORIES OF ORGANIZATION-PUBLIC ADMINISTRATION
 
FINALS_OF_LEFT_ON_C'N_EL_DORADO_2024.pptx
FINALS_OF_LEFT_ON_C'N_EL_DORADO_2024.pptxFINALS_OF_LEFT_ON_C'N_EL_DORADO_2024.pptx
FINALS_OF_LEFT_ON_C'N_EL_DORADO_2024.pptx
 
ECONOMIC CONTEXT - PAPER 1 Q3: NEWSPAPERS.pptx
ECONOMIC CONTEXT - PAPER 1 Q3: NEWSPAPERS.pptxECONOMIC CONTEXT - PAPER 1 Q3: NEWSPAPERS.pptx
ECONOMIC CONTEXT - PAPER 1 Q3: NEWSPAPERS.pptx
 
4.16.24 21st Century Movements for Black Lives.pptx
4.16.24 21st Century Movements for Black Lives.pptx4.16.24 21st Century Movements for Black Lives.pptx
4.16.24 21st Century Movements for Black Lives.pptx
 
Student Profile Sample - We help schools to connect the data they have, with ...
Student Profile Sample - We help schools to connect the data they have, with ...Student Profile Sample - We help schools to connect the data they have, with ...
Student Profile Sample - We help schools to connect the data they have, with ...
 
Integumentary System SMP B. Pharm Sem I.ppt
Integumentary System SMP B. Pharm Sem I.pptIntegumentary System SMP B. Pharm Sem I.ppt
Integumentary System SMP B. Pharm Sem I.ppt
 
Karra SKD Conference Presentation Revised.pptx
Karra SKD Conference Presentation Revised.pptxKarra SKD Conference Presentation Revised.pptx
Karra SKD Conference Presentation Revised.pptx
 

DLF 2011: UDFR, a semantic registry for format representation information (v1)

  • 1. Unified Digital Format Registry: a semantic registry for digital preservation. Digital Library Federation Forum, Baltimore, October 31-November 2, 2011. UDFR: A Semantic Registry for Format Representation Information. Lisa Dawn Colvin, Abhishek Salve, Stephen Abrams. UC Curation Center, California Digital Library
  • 2. Outline • What • Why • How • When
  • 3. Why formats? “Format” is the dividing line between bits and information
      Syntax (bytes, hexadecimal): ffd8ffe000104a46 4946000102010083 00830000ffed0fb0 50686f746f73686f 7020332e30003842 494d03e90a507269 6e7420496e666f00 0000007800000000 0048004800000000 02f40240ffeeffee 0306025203470528 03fc000200000048 00480000000002d8 0228000100000064 0000000100030... ...
      Semantics (JPEG segments): SOI; APP0 (JFIF 1.2); APP13 (IPTC); APP2 (ICC); DQT; SOF0 (183x512); DRI; DHT; SOS; ECS0; RST0; ECS1; RST1; ECS2; ...
  • 4. Why formats? There are many necessary preservation activities that can be usefully performed on bits qua bits. But to preserve information you must act on formatted bits and know what those formats mean • Preservation of syntax and semantics
  • 5. Unified Digital Format Registry: “A reliable, publicly accessible, and sustainable knowledge base of file format representation information for use by the digital preservation community” • “Unification” of the function and holdings of PRONOM (http://www.nationalarchives.gov.uk/PRONOM) and GDFR (http://gdfr.info/) • Open source platform / GPL • Semantic wiki • Funded by the Library of Congress
  • 6. Timeline • PRONOM – National Archives [UK], 2002 (http://www.nationalarchives.gov.uk/PRONOM): “ready access to reliable technical information about the nature of electronic records” • JHOVE – Harvard, 2003 (http://hul.harvard.edu/jhove): “digital object validation and characterization” • GDFR – Harvard/OCLC, 2006 (http://gdfr.info/): “a distributed and replicated registry of format information populated and vetted by experts and enthusiasts worldwide”
  • 7. Timeline • UDFR – Ad hoc stakeholder community, 2009 – Resolve PRONOM IPR issues and develop a community-supported open source solution – Advance beyond legacy RDBMS and XML database technology • UDFR – CDL, January 2011 (http://udfr.org/): “a semantic registry for digital preservation” – Stakeholder meeting, April 2011 – Beta release, November 2011 – Production release, January 2012
  • 8. Representation information: What you need to know about something in order to exploit that thing meaningfully [OAIS/ISO 14721]. Information that lets you answer important preservation questions • What format is it? • What are its significant properties? • Is it valid? • Is it at risk? • How can I render/play/read it? • What can it be transformed into? • And how?
  • 9. Why semantic? Everyone wants to say something about everything • The semantic web lets anyone say anything about anything • Understandable to both people and machines
  • 10. Data modeling [Diagram: class model built on abstract base classes and a controlled vocabulary. Entity labels include Agent, IPR, Holding, Digest, Product, Process, Software, Hardware, Media, Document, File, File Format, Signature (internal and external), Character Encoding, Compression Algorithm, Assessment, and Grammar; relationship labels include holder, creator, owner, maintainer, dependency, reference, embodies, ipr, specification, file digest, input / output, signature, grammar, and assessment]
  • 11. Provenance: “Trust, but verify” • Complete change history at the assertion level, including – Who made the assertion, and when? – Confidence based on personal and institutional reputation • Imprimatur by technically knowledgeable reviewers
  • 12. Ontologies
      Prefix    Namespace
      udfrs     http://udfr.org/onto#
      udfr      http://udfr.org/udfr/
      dc        http://purl.org/dc/elements/1.1/
      dcterms   http://purl.org/dc/terms/
      foaf      http://xmlns.com/foaf/0.1/
      owl       http://www.w3.org/2002/07/owl#
      pronom    http://reference.data.gov.uk/technical-registry/
      rdf       http://www.w3.org/1999/02/22-rdf-syntax-ns#
      rdfs      http://www.w3.org/2000/01/rdf-schema#
      skos      http://www.w3.org/2004/02/skos/core#
      xsd       http://www.w3.org/2001/XMLSchema#
  • 13. Technology stack • HTTP / SPARQL • JavaScript / CSS • OntoWiki (http://ontowiki.net/) • Erfurt (http://aksw.org/Projects/Erfurt) / RDFauthor (https://github.com/AKSW/RDFauthor) • Zend framework (http://www.zend.com/) • Virtuoso (http://virtuoso.openlinksw.com/) • 4store • PHP (http://www.php.net/) • RDF (http://www.w3.org/RDF) • Apache httpd (http://httpd.apache.org/)
  • 14. Initial population: Export from PRONOM • Working with TNA to identify appropriate subset • Transform to cross-walk modeling differences
  • 15. Licensing: Code is available under GPLv3 (http://www.gnu.org/copyleft/gpl.html) • Hosted on BitBucket (http://www.bitbucket.org/udfr). Data is contributed and available under CC-BY (http://creativecommons.org/licenses/by/3.0/) • Consistent with UK open government license applicable to PRONOM data (http://www.nationalarchives.gov.uk/doc/open-government-licence)
  • 16. Demo
  • 17. Lessons learned • People with semantic experience are scarce • Too much time evaluating/prototyping potential technology choices • More difficulty than anticipated integrating disparate open source products • 0.x software is often numbered that way for a reason • Feature lists aren’t (always)
  • 18. Lessons learned • Availability of a worldwide selection of products is a good thing (except when you don’t read German) – Excellent support from AKSW/Universität Leipzig • Modeling differences – RDF (non-)standards • VM deployment – Disparate IT organizations supporting dev/prod instances
  • 19. Next steps • Long-term governance and operational support • Technical maintenance and enhancement • Replication/synchronization • Building contributor and reviewer communities
  • 20. For more information
      UDFR: http://udfr.org/ , http://bitbucket.org/udfr
      PRONOM: http://www.nationalarchives.gov.uk/PRONOM
      GDFR: http://gdfr.info/
      OntoWiki: http://ontowiki.net/Projects/OntoWiki
      Virtuoso: http://www.openlinksw.com/dataspace/dav/wiki/Main/VOSRDFWP
      Agile Knowledge and Semantic Web (AKSW), Universität Leipzig: http://aksw.org/
      UC3: http://www.cdlib.org/uc3 , uc3@ucop.edu
      Stephen Abrams, Lisa Colvin, Patricia Cruse, Scott Fisher, Erik Hetzner, Greg Janée, John Kunze, Margaret Low, David Loy, Mark Reyes, Abhishek Salve, Tracy Seneca, Joan Starr, Carly Strasser, Marisa Strong, Adrian Turner, Perry Willett
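
The "Why formats?" slide (slide 3) pairs raw bytes with the JPEG segment names they encode. As a rough illustration only, and not part of UDFR or of the original deck, the Python sketch below reads the first 16 bytes shown on that slide and recovers the first two labels. The function name `describe_jfif_header` and the small marker table are hypothetical, introduced just for this example; the marker codes themselves come from the JPEG and JFIF specifications.

```python
import struct

# Marker codes defined by the JPEG/JFIF specifications; only the few
# needed for this illustration are listed.
MARKER_NAMES = {0xFFD8: "SOI", 0xFFE0: "APP0", 0xFFED: "APP13"}


def describe_jfif_header(data: bytes) -> list[str]:
    """Label the first two segments of a JPEG byte stream.

    Hypothetical helper for illustration; it is not a full JPEG parser.
    """
    if len(data) < 13:
        raise ValueError("need at least 13 bytes")

    # Bytes 0-1: the Start Of Image marker (big-endian 0xFFD8).
    (first,) = struct.unpack_from(">H", data, 0)
    if first != 0xFFD8:
        raise ValueError("not a JPEG stream: missing SOI marker")
    labels = [MARKER_NAMES[first]]

    # Bytes 2-5: the next marker and its 16-bit segment length.
    marker, _length = struct.unpack_from(">HH", data, 2)
    name = MARKER_NAMES.get(marker, f"0x{marker:04X}")

    # An APP0 segment that starts with "JFIF\0" carries a two-byte version.
    if marker == 0xFFE0 and data[6:11] == b"JFIF\x00":
        major, minor = data[11], data[12]
        name = f"{name} JFIF {major}.{minor}"
    labels.append(name)
    return labels


# First two 8-byte rows of the hex column on slide 3.
sample = bytes.fromhex("ffd8ffe000104a46" "4946000102010083")
print(describe_jfif_header(sample))  # ['SOI', 'APP0 JFIF 1.2']
```

The same 16 bytes are opaque as bits and become "SOI, APP0 JFIF 1.2" only once the format's rules are applied, which is the kind of knowledge a format registry is meant to record.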

Editor's Notes

  1. Edward Burne-Jones (British, 1833-1898), The Days of Creation: the First Day, 1870-1876. Watercolor and gouache, 102.2×35.5 cm. Fogg Art Museum, Harvard University, 1943.454. Bequest of Grenville L. Winthrop
  2. Move from necessity to sufficiency. Syntax – http://www.flickr.com/photos/afeeld/4322852401; Philosophy dictionary definition – http://botox4thebrain.com
  3. Shaking hands – Chris-Håvard Berge, http://www.flickr.com/photos/chberge/4670939397
  4. JHOVE – JSTOR Electronic-Archiving Initiative; GDFR – Andrew W. Mellon Foundation; UDFR – LC/NDIIPP
  5. Ponder – Hobvias Sudoneighm, http://www.flickr.com/photos/striatic/2144933705
  6. The age of the democratization of expression. Shout! – Mark Wheadon, http://www.flickr.com/photos/mark_wheadon/2557902153; Robots! – Jere Keys, http://www.flickr.com/photos/tyreseus/527207577
  7. Gorbachev and Reagan – AFP/Getty Images, http://www.britannica.com/bps/media-view/121436/1/0/0
  8. Leaning Tower of Pisa – Stephen and Claire Farnsworth, http://www.flickr.com/photos/the_farnsworths/2623592483
  9. WAAAAAAY too many plugs – Isaac Lee, http://www.flickr.com/photos/ikelee/12680878; Checklist – http://www.flickr.com/photos/adesigna/4090782772
  10. Square peg in a round hole – http://www.flickr.com/photos/21664580@N04/2095574414; Tug of war – http://www.flickr.com/photos/toffehoff/244870161 / http://www.flickr.com/photos/toffehoff/244870160
  11. Legislature – Mike Refund, http://www.flickr.com/photos/deltamike/3358213826; Wrench – Ed Platt, http://www.flickr.com/photos/philentropist/176054470; Obama inauguration crowd – Brett Farmiloe, http://www.flickr.com/photos/pursuethepassion/3220803117