http://www.niso.org/news/events/2013/webinars/preservation




              NISO Webinar:
        Metadata for Preservation:
       A Digital Object's Best Friend

            February 13, 2013


Speakers: Rebecca Guenther, Amy Kirchhoff
Metadata for Preservation: A Digital
Object’s Best Friend
Introduction to Preservation Metadata



              Rebecca Squire Guenther
              Library of Congress, NDMSO and
              Consultant, meetyourdata.com
              rguenther52@gmail.com

              NISO Webinar, Feb. 13, 2013
Digital preservation: imperative and challenge

    More and more of scholarly and cultural record exists in digital
     form; steps must be taken to secure its long-term future

    Groups such as Digital Preservation Coalition, NDIIPP and National
     Digital Stewardship Alliance have made significant progress in
     raising awareness about digital preservation imperative

    Gradual shift in focus from articulating problem to solving it …
      •   Not so much “Why is digital preservation important” anymore;
          rather, “What must be done to achieve preservation objectives?”


    Many practical challenges in implementing reliable, sustainable
     digital preservation programs

    One key challenge: preservation metadata
Metadata and preservation metadata


METADATA
“Structured information that describes, explains, locates,
or otherwise makes it easier to retrieve, use, or manage an
information resource”

PRESERVATION METADATA
“Metadata that supports and documents the digital
preservation process”
Preservation metadata includes:

   Provenance:
     • Who has had custody/ownership of the digital object?

   Authenticity:
     • Is the digital object what it purports to be?

   Preservation Activity:
     • What has been done to preserve it?

   Technical Environment:
     • What is needed to render and use it?

   Rights Management:
     • What IPR must be observed?

 Makes digital objects self-documenting across time

[Figure: Content and Preservation Metadata; 10 years on, 50 years on, Forever!]
Basics of preservation metadata

 Digital preservation concentrates on well-designed formal
  systems based on digital library and trusted digital
  repository concepts
 Information about what needs to be preserved, and how, is
  part of any preservation system
 Since items aren’t on shelves, metadata is the only
  mechanism for actually keeping or finding anything
 3 concepts are important
   • Metadata about preservation of digital objects
   • Preservation of metadata itself to ensure that content
     and metadata are preserved
   • Use of metadata in a trusted digital repository
PREMIS Data Dictionary
 May 2005: Data Dictionary for Preservation
  Metadata: Final Report of the PREMIS Working
  Group
     • Version 2.0 (April 2008)
     • Version 2.1 (January 2011)
     • Version 2.2 (July 2012)
     • Version 3.0 expected 2013

 Includes:
     Data Dictionary               Context/assumptions
     Data model                    Usage examples
     Conformance                   XML schema to support implementation

   Data Dictionary:
     • Core set of implementable, broadly applicable preservation
       metadata semantic units, supported by guidelines and
       recommendations for management and use
What does PREMIS cover?
 Administrative metadata that supports the digital
  preservation process
 Provides information to help manage a resource
  for preservation purposes
   •   Technical characteristics
   •   Information about actions on an object
   •   Relationships (structural and derivative)
        • Structural: indicates how compound objects are put
          together
        • Derivative: results of common preservation actions
   • Rights metadata associated with preservation
 In OAIS terms:
   • Metadata as part of SIP, AIP or DIP
   • Fits into Preservation Description Information
     (Reference, Context, Provenance, Fixity)
What PREMIS is and is not
   What PREMIS is:
    •   Common data model for organizing/thinking about preservation
        metadata
    •   A checklist for core metadata in a repository
    •   Guidance for local implementations
    •   Standard for exchanging information packages between repositories

   What PREMIS is not:
    •   Out-of-the-box solution: need to instantiate as metadata elements in
        repository system
    •   All needed metadata: excludes business rules, format-specific
        technical metadata, descriptive metadata for access, non-core
        preservation metadata
    •   Lifecycle management of objects outside repository
    •   Rights management: limited to permissions regarding actions taken
        within repository
PREMIS Data Model


  Intellectual
    Entities
                        Rights
                      Statements




            Objects                Agents




                       Events
Intellectual Entities

 Set of content that is considered a single intellectual unit
  for purposes of management and description (e.g., a book, a
  photograph, a map, a database)
 May include other Intellectual Entities (e.g., a website that
  includes a web page)
 **Has one or more digital representations**
 Previously not fully described in PREMIS DD, but will be in
  scope in version 3.0

Examples:
 Rabbit Run by John Updike (a book)
 “Maggie at the beach” (a photograph)
 The Library of Congress Website (a website)
 The Library of Congress: American Memory Home page (a web page)
Objects

 Discrete unit of information in digital form
 **Objects are what repository actually preserves**
 Three types of Object:
   • FILE: named and ordered sequence of bytes that is known by
     an operating system
   • REPRESENTATION: set of files, including structural metadata,
     that, taken together, constitute a complete rendering of an
     Intellectual Entity
   • BITSTREAM: data within a file with properties relevant for
     preservation purposes (but needs additional structure or
     reformatting to be stand-alone file)
 Intellectual entity will become another level of object

Examples:
 chapter1.pdf (a file)
 chapter1.pdf + chapter2.pdf + chapter3.pdf (representation of
  a book w/3 chapters)
 TIFF file containing header and 2 images (2 bitstreams (images),
  each with own set of properties (semantic units): e.g.,
  identifiers, technical metadata, inhibitors, …)
Object Example: book in two versions

 Intellectual Entity: Da Vinci Code by Dan Brown
   • Representation 1: Page image version
       File 1: page1.tiff
       File 2: page2.tiff
       File N: pageN.tiff
       File N+1: METS.xml
   • Representation 2: ebook version
       File 1: book.lit
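
As a rough illustration, the book example can be written down as nested records. This is a minimal sketch, not a PREMIS serialization: the class names, the `build_book` helper, and the page count of 3 standing in for N are all assumptions made for the example.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class File:
    name: str  # a named, ordered sequence of bytes known to an operating system


@dataclass
class Representation:
    label: str
    files: List[File] = field(default_factory=list)


@dataclass
class IntellectualEntity:
    title: str
    representations: List[Representation] = field(default_factory=list)


def build_book(page_count: int) -> IntellectualEntity:
    """Build the two-representation book from the slide (hypothetical helper)."""
    # Representation 1: one TIFF per page plus a METS file as structural metadata.
    page_files = [File(f"page{n}.tiff") for n in range(1, page_count + 1)]
    page_images = Representation("Page image version", page_files + [File("METS.xml")])
    # Representation 2: a single ebook file.
    ebook = Representation("ebook version", [File("book.lit")])
    return IntellectualEntity("Da Vinci Code by Dan Brown", [page_images, ebook])


book = build_book(page_count=3)  # 3 is an arbitrary stand-in for N
for rep in book.representations:
    print(rep.label, [f.name for f in rep.files])
```

Each Representation is a complete rendering of the same Intellectual Entity, so the repository can lose or migrate one without affecting the identity of the other.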
Events

 An action that involves or impacts at least one Object or Agent
  associated with or known by the preservation repository
 Helps document digital provenance. Can track history of Object
  through the chain of Events that occur during the Object's
  lifecycle
 Determining which Events should be recorded, and at what level
  of granularity, is up to the repository

Examples:
 Validation Event: use JHOVE tool to verify that chapter1.pdf is
  a valid PDF file
 Ingest Event: transform an OAIS SIP into an AIP
 Migration Event: create a new version of an Object in an
  up-to-date format
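
The idea of tracking an Object's history through its chain of Events can be sketched in a few lines. This is an illustration only: the `Event` record, the `history` helper, the event type strings, and the timestamps are invented sample data, not anything prescribed by PREMIS.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import List


@dataclass
class Event:
    event_type: str
    when: datetime
    object_id: str
    outcome: str


def history(events: List[Event], object_id: str) -> List[Event]:
    """Return the Events recorded for one Object, oldest first."""
    return sorted((e for e in events if e.object_id == object_id), key=lambda e: e.when)


# Sample log, deliberately out of order (all values are made up).
log = [
    Event("migration", datetime(2013, 2, 13, 12, 0), "chapter1.pdf", "success"),
    Event("ingestion", datetime(2013, 2, 13, 9, 0), "chapter1.pdf", "success"),
    Event("validation", datetime(2013, 2, 13, 9, 5), "chapter1.pdf", "success"),
    Event("validation", datetime(2013, 2, 13, 9, 6), "chapter2.pdf", "success"),
]

for e in history(log, "chapter1.pdf"):
    print(e.when.isoformat(), e.event_type, e.outcome)
```

How many of these Events a repository actually records, and how finely, remains a local decision.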
Agents

 Person, organization, or software program/system associated
  with an Event or a Right (permission statement)
 Agents are associated only indirectly to Objects through
  Events or Rights
 Not defined in detail in PREMIS DD; not considered core
  preservation metadata beyond identification

Examples:
 Martha Anderson (a person)
 Library of Congress (an organization)
 Dark Archive in the Sunshine State implementation (a system)
 JHOVE version 1.0 (a software program)
Rights Statements

 An agreement with a rights holder that grants permission for
  the repository to undertake an action(s) associated with an
  Object(s) in the repository.
 Not a full rights expression language; focuses exclusively on
  permissions that take the form:
   • Agent X grants Permission Y to the repository in regard to
     Object Z.

Example:
 Priscilla Caplan grants FCLA digital repository permission to
  make three copies of metadata_fundamentals.pdf for preservation
  purposes.
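
The "Agent X grants Permission Y in regard to Object Z" pattern lends itself to a simple lookup. The sketch below is a hedged illustration: the `RightsStatement` record, the `is_permitted` helper, and the act strings `replicate` and `disseminate` are assumptions chosen for the example, and a real repository would also have to evaluate restrictions and terms of grant.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class RightsStatement:
    granting_agent: str    # Agent X
    act: str               # Permission Y
    object_id: str         # Object Z
    restriction: str = ""  # free-text limit on the permission


def is_permitted(statements: List[RightsStatement], act: str, object_id: str) -> bool:
    """True if some statement grants this act for this Object."""
    return any(s.act == act and s.object_id == object_id for s in statements)


statements = [
    RightsStatement(
        granting_agent="Priscilla Caplan",
        act="replicate",
        object_id="metadata_fundamentals.pdf",
        restriction="three copies, for preservation purposes",
    )
]

print(is_permitted(statements, "replicate", "metadata_fundamentals.pdf"))
print(is_permitted(statements, "disseminate", "metadata_fundamentals.pdf"))
```

Anything not explicitly granted is treated as not permitted here, which matches the narrow, permission-only scope described on the slide.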
Technical metadata pertaining to objects
 Object identifier
 Preservation level
 Significant characteristics
 Object characteristics
   • fixity
   • format
   • size
   • creating application
   • inhibitors
   • object characteristics extension
 Creating application
 Original name
 Storage
 Environment
   • software
   • hardware
 Digital signatures
 Relationships
 Linking event identifier
 Linking permission statement identifier
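
Of the object characteristics listed, fixity and size are the easiest to compute directly. The following is a minimal sketch using only the Python standard library. The function names and dictionary layout are assumptions for illustration; the keys `messageDigestAlgorithm` and `messageDigest` follow the fixity sub-units as I recall them from the Data Dictionary and should be checked against it.

```python
import hashlib
import os
import tempfile


def object_characteristics(path: str, algorithm: str = "sha256") -> dict:
    """Compute size and a message digest for one file."""
    digest = hashlib.new(algorithm)
    with open(path, "rb") as fh:
        # Read in chunks so large files do not have to fit in memory.
        for chunk in iter(lambda: fh.read(65536), b""):
            digest.update(chunk)
    return {
        "originalName": os.path.basename(path),
        "size": os.path.getsize(path),
        "fixity": {
            "messageDigestAlgorithm": algorithm,
            "messageDigest": digest.hexdigest(),
        },
    }


def fixity_ok(path: str, recorded: dict) -> bool:
    """Recompute the digest and compare it with the recorded one."""
    algorithm = recorded["fixity"]["messageDigestAlgorithm"]
    current = object_characteristics(path, algorithm)
    return current["fixity"]["messageDigest"] == recorded["fixity"]["messageDigest"]


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "chapter1.pdf")
    with open(path, "wb") as fh:
        fh.write(b"stand-in bytes, not a real PDF")
    recorded = object_characteristics(path)
    unchanged = fixity_ok(path, recorded)  # file untouched since recording
    with open(path, "ab") as fh:
        fh.write(b"!")                     # simulate corruption
    changed = fixity_ok(path, recorded)

print(recorded["size"], unchanged, changed)
```

A fixity check like this is typically recorded as an Event, so the repository keeps evidence of when the check ran and what it found.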
Semantic units pertaining to Events:
provenance and preservation activity
 Event identifier
 Event type (e.g., capture, creation, validation, migration,
  fixity check)
 Event dateTime
 Event detail
 Event outcome
 Event outcome detail
 Linking agent identifier
 Linking object identifier
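
To make the Event semantic units concrete, here is a hedged sketch that serializes one validation Event as XML with Python's standard `xml.etree.ElementTree`. The element names mirror the semantic units listed above. The namespace URI, the `local` identifier type, and all sample values are assumptions, and the output has not been validated against the PREMIS schema.

```python
import xml.etree.ElementTree as ET

# Assumed namespace URI for the PREMIS 2.x schema; confirm against the schema you validate with.
PREMIS_NS = "info:lc/xmlschema/premis-v2"
ET.register_namespace("premis", PREMIS_NS)


def child(parent: ET.Element, name: str, text: str = "") -> ET.Element:
    """Append a namespaced child element, optionally with text."""
    element = ET.SubElement(parent, f"{{{PREMIS_NS}}}{name}")
    if text:
        element.text = text
    return element


def build_event(event_id, event_type, date_time, detail, outcome, agent_id, object_id):
    """Assemble one Event element (hypothetical helper, sample structure only)."""
    event = ET.Element(f"{{{PREMIS_NS}}}event")
    identifier = child(event, "eventIdentifier")
    child(identifier, "eventIdentifierType", "local")
    child(identifier, "eventIdentifierValue", event_id)
    child(event, "eventType", event_type)
    child(event, "eventDateTime", date_time)
    child(event, "eventDetail", detail)
    outcome_info = child(event, "eventOutcomeInformation")
    child(outcome_info, "eventOutcome", outcome)
    agent = child(event, "linkingAgentIdentifier")
    child(agent, "linkingAgentIdentifierType", "local")
    child(agent, "linkingAgentIdentifierValue", agent_id)
    obj = child(event, "linkingObjectIdentifier")
    child(obj, "linkingObjectIdentifierType", "local")
    child(obj, "linkingObjectIdentifierValue", object_id)
    return event


event = build_event(
    event_id="event-0001",  # made-up identifier
    event_type="validation",
    date_time="2013-02-13T09:05:00",
    detail="Checked that chapter1.pdf is a valid PDF file",
    outcome="success",
    agent_id="JHOVE version 1.0",
    object_id="chapter1.pdf",
)
print(ET.tostring(event, encoding="unicode"))
```

The linking identifiers are what tie the Event to its Agent and Object; the Event itself carries no copy of either.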
Semantic units pertaining to Rights

 Rights Statement
   • Rights Statement Identifier
   • Rights Basis
   • Copyright Information
   • License Information
   • Statute Information
   • Other Rights Information
 Rights Granted
   • act
   • restriction
   • termOfGrant
   • rightsGranted
 Linking Object Identifier
 Linking Agent Identifier
 rightsExtension
Semantic units pertaining to Agents


   Agent Identifier
   Agent Name
   Agent Type
   Agent Note
   Agent Extension
   Linking Event Identifier
   Linking Rights Identifier
The State of PREMIS
 de facto standard for preservation metadata; in some
  countries mandated for cultural heritage repositories

 Was recognized by winning the Digital Preservation Award
  (2005) and was shortlisted for DPC Decennial award for
  outstanding contribution to digital preservation (2012)

 PREMIS implementations are appearing in many
  places, many contexts, many forms

 Experimentation has led to changes in the data dictionary
  and schema

 PREMIS Implementation Fairs: attempts to consolidate
  implementation experiences, issues, and best practices
Key features of PREMIS
 Developed through international consensus-making process
    Mobilized community to address shared need
    Shared solution to a shared need

 Implementation neutral
   • Makes no assumptions about technology
   • Can be flexibly adapted for use across all sorts of
     institutions, digital preservation contexts, repository systems
   • Allows for extensibility

 Supported by Maintenance Activity and Editorial
  Committee, under auspices of US Library of Congress
    PREMIS is sustained, maintained, and evolved

 Extensive outreach to implementer community
    Tutorials, guides, implementation fairs, PIG Forum
    “Support system” in place for PREMIS implementers
PREMIS Maintenance Activity
 Web site:
  •   Permanent Web presence, hosted by
       Library of Congress
  •   Central destination for PREMIS-related
       info, announcements, resources
  •   Home of the PREMIS Implementers’ Group (PIG)
      discussion list

 PREMIS Editorial Committee:
  •   Set directions/priorities for PREMIS development
  •   Coordinate future revisions of Data Dictionary and XML
      schema
  •   Promote implementation

            http://www.loc.gov/standards/premis/
Implementation resources
 Tools:
   • XML schema
   • PREMIS-in-METS toolbox <http://pim.fcla.edu>
   • Controlled vocabularies at http://id.loc.gov
   • RDF/OWL ontology for use as Linked Data
 Guidelines:
   • PREMIS conformance statement
   • PREMIS & METS guidelines
 Community Working groups on special topics
 Others:
   • Understanding PREMIS (available in multiple languages)
   • PIG Forum
   • Implementation Registry
   • Tools Registry
Some implementers …

    DAITSS (Florida): a preservation repository for the use of the
     libraries of the public universities of Florida.
    Ex Libris Rosetta: a commercial digital preservation system
     supporting acquisition, validation, ingest, storage, management,
     preservation and dissemination of different types of digital objects
    National Digital Newspaper Program
    Archivematica: comprehensive open-source digital preservation
     system
    National Archives of Sweden, National Archives of Scotland
    Carolina Digital Repository: repository for material in electronic
     formats produced by members of the University of North Carolina
     at Chapel Hill community.
    British Library electronic journal archiving project

    For more information see:
      • http://www.loc.gov/premis/premis-registry.html
Impact
 De facto international standard for preservation metadata
   • Part of permanent infrastructure supporting digital
     preservation
   • ISO standardization being considered

 Wide applicability means benefits from PREMIS extend to
  entire digital preservation community

 Ongoing work to revise/update Data Dictionary and create
  new supporting resources
   • PREMIS is a dynamic resource that continues to generate
     new sources of value to implementer community

 Stood the test of time:
   • Seven years after initial release, is now indispensable part
     of digital preservation implementations around the world
   • Not surpassed or replaced by other standard or resource
URLs, etc.

   PREMIS Maintenance Activity:
    http://www.loc.gov/standards/premis/

   PREMIS Data Dictionary for Preservation Metadata:
    http://www.loc.gov/standards/premis/v2/premis-2-2.pdf

   Understanding PREMIS:
    http://www.loc.gov/standards/premis/understanding-premis.pdf

   PREMIS Implementation Registry
    http://www.loc.gov/standards/premis/premis-registry.php

   PREMIS Implementers Group list
    http://listserv.loc.gov/listarch/pig.html
Metadata for Preservation
A digital object’s best friend

Implementation!
Amy Kirchhoff
Archive Service Product Manager
Standards
 framework for thinking
 interchange specification
[The PREMIS documentation has an] emphasis on the need to know
rather than the need to record or represent in any …
Content Type
Content Set(s)
Archival Unit(s)
Content Unit(s)
Functional Unit(s)
Storage Unit(s)
Intellectual
  Entities
                       Rights
                     Statements




           Objects                Agents




                      Events
Digital preservation is the series of management policies and
activities necessary to ensure the enduring usability,
authenticity, discoverability and accessibility of …
Dublin Core
DIDL (from MPEG-21)
METS
OAIS
…

Experience
1. Content model
2. Metadata elements
3. Registries
Identifiers
Books
Journals
Digitized Newspapers
Digitized Documents
Supplied Files
Archive Management Docum
Content Type
Content Set(s)
Archival Unit(s)
Content Unit(s)
Functional Unit(s)
Storage Unit(s)
Descriptive Metadata
Technical Metadata
Events Metadata
PMD
a thing of beauty
Semantic Units for Objects
1.1 objectIdentifier
1.2 objectCategory
1.3 preservationLevel
1.4 significantProperties
1.5 objectCharacteristics
1.6 originalName
1.7 storage
1.8 environment
1.9 signatureInformation
1.10 relationship
1.11 linkingEventIdentifier
1.12 linkingIntellectualEntityIdentifier
1.13 linkingRightsStatementIdentifier
Registries
Semantic Units for Events
2.1 eventIdentifier
2.2 eventType
2.3 eventDateTime
2.4 eventDetail
2.5 eventOutcomeInformation
2.6 linkingAgentIdentifier
2.7 linkingObjectIdentifier
Processing Record
Event Sets
Events
Some Portico Events
Edit Descriptive Metadata
Check Descriptive Metadata
Generate Descriptive Metadata
Ingest Into Archive
Create File
Generate Technical Metadata
Set Preservation Level
Generate Fixity
Portico Event Elements
Timestamp
Rationale
InputList
ArgList
Output
ToolWrapper
Tool Component List
Outcome
OutcomeDetailList
Semantic Units for Agents
3.1 agentIdentifier
3.2 agentName
3.3 agentType
3.4 agentNote
3.5 agentExtension
3.6 linkingEventIdentifier
3.7 linkingRightsStatementIdentifier
Semantic Units for Rights
4.1 rightsStatement
  4.1.1 rightsStatementIdentifier
  4.1.2 rightsBasis
  4.1.3 copyrightInformation
  4.1.4 licenseInformation
  4.1.5 statuteInformation
  4.1.6 otherRightsInformation
  4.1.7 rightsGranted
  4.1.8 linkingObjectIdentifier
  4.1.9 linkingAgentIdentifier
4.2 rightsExtension
Easy
For Portico
For the moment
THANK YOU.

Amy Kirchhoff
amy.kirchhoff@ithaka.org

http://www.portico.org
NISO Webinar:
Metadata for Preservation:
A Digital Object's Best Friend


Questions?
All questions will be posted with presenter answers on
the NISO website following the webinar:

http://www.niso.org/news/events/2013/webinars/preservation


                    NISO Webinar • February 13, 2013
THANK YOU
           Thank you for joining us today.
Please take a moment to fill out the brief online survey.

         We look forward to hearing from you!

Preservation Planning: Choosing a suitable digital preservation strategyGarethKnight
 
NCompass Live: Digital Preservation, Part 2: Storage and Protection
NCompass Live: Digital Preservation, Part 2: Storage and ProtectionNCompass Live: Digital Preservation, Part 2: Storage and Protection
NCompass Live: Digital Preservation, Part 2: Storage and ProtectionNebraska Library Commission
 
Digital library technologies
Digital library technologies Digital library technologies
Digital library technologies Shriram Pandey
 
Mets opening day - web based mets creation (2007)
Mets opening day - web based mets creation (2007)Mets opening day - web based mets creation (2007)
Mets opening day - web based mets creation (2007)Ralf Stockmann
 
Exploring Process Barriers to Release Public Sector Information in Local Gove...
Exploring Process Barriers to Release Public Sector Information in Local Gove...Exploring Process Barriers to Release Public Sector Information in Local Gove...
Exploring Process Barriers to Release Public Sector Information in Local Gove...Peter Conradie
 
Analytics with unified file and object
Analytics with unified file and object Analytics with unified file and object
Analytics with unified file and object Sandeep Patil
 
Digital library and metadata
Digital library and metadataDigital library and metadata
Digital library and metadataramncsi
 
Metadata and Tagging
Metadata and TaggingMetadata and Tagging
Metadata and Taggingpauloshea
 
Systems, processes & how we stop the wheels falling off
Systems, processes & how we stop the wheels falling offSystems, processes & how we stop the wheels falling off
Systems, processes & how we stop the wheels falling offWellcome Library
 
Presentation IS
Presentation ISPresentation IS
Presentation ISyanacoolen
 
Knowledge Engineering for TELDAP
Knowledge Engineering for TELDAPKnowledge Engineering for TELDAP
Knowledge Engineering for TELDAPAAT Taiwan
 
Towards FAIR Open Science with PID Kernel Information: RPID Testbed
Towards FAIR Open Science with PID Kernel Information: RPID TestbedTowards FAIR Open Science with PID Kernel Information: RPID Testbed
Towards FAIR Open Science with PID Kernel Information: RPID TestbedBeth Plale
 

Semelhante a NISO Webinar: Metadata for Preservation: A Digital Object's Best Friend (20)

Preservation Metadata
Preservation MetadataPreservation Metadata
Preservation Metadata
 
Metadata For Preservation Delos
Metadata For Preservation DelosMetadata For Preservation Delos
Metadata For Preservation Delos
 
Welcome to the CTDA
Welcome to the CTDAWelcome to the CTDA
Welcome to the CTDA
 
Preservation Planning: Choosing a suitable digital preservation strategy
Preservation Planning: Choosing a suitable digital preservation strategyPreservation Planning: Choosing a suitable digital preservation strategy
Preservation Planning: Choosing a suitable digital preservation strategy
 
NCompass Live: Digital Preservation, Part 2: Storage and Protection
NCompass Live: Digital Preservation, Part 2: Storage and ProtectionNCompass Live: Digital Preservation, Part 2: Storage and Protection
NCompass Live: Digital Preservation, Part 2: Storage and Protection
 
Saadallah vtls
Saadallah vtlsSaadallah vtls
Saadallah vtls
 
Digital library technologies
Digital library technologies Digital library technologies
Digital library technologies
 
Mets opening day - web based mets creation (2007)
Mets opening day - web based mets creation (2007)Mets opening day - web based mets creation (2007)
Mets opening day - web based mets creation (2007)
 
Exploring Process Barriers to Release Public Sector Information in Local Gove...
Exploring Process Barriers to Release Public Sector Information in Local Gove...Exploring Process Barriers to Release Public Sector Information in Local Gove...
Exploring Process Barriers to Release Public Sector Information in Local Gove...
 
Analytics with unified file and object
Analytics with unified file and object Analytics with unified file and object
Analytics with unified file and object
 
Digital library and metadata
Digital library and metadataDigital library and metadata
Digital library and metadata
 
Metadata and Tagging
Metadata and TaggingMetadata and Tagging
Metadata and Tagging
 
Systems, processes & how we stop the wheels falling off
Systems, processes & how we stop the wheels falling offSystems, processes & how we stop the wheels falling off
Systems, processes & how we stop the wheels falling off
 
Fedora
FedoraFedora
Fedora
 
UAEU_MDL_Slides_rev1.ppt
UAEU_MDL_Slides_rev1.pptUAEU_MDL_Slides_rev1.ppt
UAEU_MDL_Slides_rev1.ppt
 
Digital Library
Digital LibraryDigital Library
Digital Library
 
Presentation IS
Presentation ISPresentation IS
Presentation IS
 
Completepresentation
CompletepresentationCompletepresentation
Completepresentation
 
Knowledge Engineering for TELDAP
Knowledge Engineering for TELDAPKnowledge Engineering for TELDAP
Knowledge Engineering for TELDAP
 
Towards FAIR Open Science with PID Kernel Information: RPID Testbed
Towards FAIR Open Science with PID Kernel Information: RPID TestbedTowards FAIR Open Science with PID Kernel Information: RPID Testbed
Towards FAIR Open Science with PID Kernel Information: RPID Testbed
 

Mais de National Information Standards Organization (NISO)

Mais de National Information Standards Organization (NISO) (20)

Bazargan "NISO Webinar, Sustainability in Publishing"
Bazargan "NISO Webinar, Sustainability in Publishing"Bazargan "NISO Webinar, Sustainability in Publishing"
Bazargan "NISO Webinar, Sustainability in Publishing"
 
Rapple "Scholarly Communications and the Sustainable Development Goals"
Rapple "Scholarly Communications and the Sustainable Development Goals"Rapple "Scholarly Communications and the Sustainable Development Goals"
Rapple "Scholarly Communications and the Sustainable Development Goals"
 
Compton "NISO Webinar, Sustainability in Publishing"
Compton "NISO Webinar, Sustainability in Publishing"Compton "NISO Webinar, Sustainability in Publishing"
Compton "NISO Webinar, Sustainability in Publishing"
 
Mattingly "AI & Prompt Design: Large Language Models"
Mattingly "AI & Prompt Design: Large Language Models"Mattingly "AI & Prompt Design: Large Language Models"
Mattingly "AI & Prompt Design: Large Language Models"
 
Hazen, Morse, and Varnum "Spring 2024 ODI Conformance Statement Workshop for ...
Hazen, Morse, and Varnum "Spring 2024 ODI Conformance Statement Workshop for ...Hazen, Morse, and Varnum "Spring 2024 ODI Conformance Statement Workshop for ...
Hazen, Morse, and Varnum "Spring 2024 ODI Conformance Statement Workshop for ...
 
Mattingly "AI & Prompt Design" - Introduction to Machine Learning"
Mattingly "AI & Prompt Design" - Introduction to Machine Learning"Mattingly "AI & Prompt Design" - Introduction to Machine Learning"
Mattingly "AI & Prompt Design" - Introduction to Machine Learning"
 
Mattingly "Text and Data Mining: Building Data Driven Applications"
Mattingly "Text and Data Mining: Building Data Driven Applications"Mattingly "Text and Data Mining: Building Data Driven Applications"
Mattingly "Text and Data Mining: Building Data Driven Applications"
 
Mattingly "Text and Data Mining: Searching Vectors"
Mattingly "Text and Data Mining: Searching Vectors"Mattingly "Text and Data Mining: Searching Vectors"
Mattingly "Text and Data Mining: Searching Vectors"
 
Mattingly "Text Mining Techniques"
Mattingly "Text Mining Techniques"Mattingly "Text Mining Techniques"
Mattingly "Text Mining Techniques"
 
Mattingly "Text Processing for Library Data: Representing Text as Data"
Mattingly "Text Processing for Library Data: Representing Text as Data"Mattingly "Text Processing for Library Data: Representing Text as Data"
Mattingly "Text Processing for Library Data: Representing Text as Data"
 
Carpenter "Designing NISO's New Strategic Plan: 2023-2026"
Carpenter "Designing NISO's New Strategic Plan: 2023-2026"Carpenter "Designing NISO's New Strategic Plan: 2023-2026"
Carpenter "Designing NISO's New Strategic Plan: 2023-2026"
 
Ross and Clark "Strategic Planning"
Ross and Clark "Strategic Planning"Ross and Clark "Strategic Planning"
Ross and Clark "Strategic Planning"
 
Mattingly "Data Mining Techniques: Classification and Clustering"
Mattingly "Data Mining Techniques: Classification and Clustering"Mattingly "Data Mining Techniques: Classification and Clustering"
Mattingly "Data Mining Techniques: Classification and Clustering"
 
Straza "Global collaboration towards equitable and open science: UNESCO Recom...
Straza "Global collaboration towards equitable and open science: UNESCO Recom...Straza "Global collaboration towards equitable and open science: UNESCO Recom...
Straza "Global collaboration towards equitable and open science: UNESCO Recom...
 
Lippincott "Beyond access: Accelerating discovery and increasing trust throug...
Lippincott "Beyond access: Accelerating discovery and increasing trust throug...Lippincott "Beyond access: Accelerating discovery and increasing trust throug...
Lippincott "Beyond access: Accelerating discovery and increasing trust throug...
 
Kriegsman "Integrating Open and Equitable Research into Open Science"
Kriegsman "Integrating Open and Equitable Research into Open Science"Kriegsman "Integrating Open and Equitable Research into Open Science"
Kriegsman "Integrating Open and Equitable Research into Open Science"
 
Mattingly "Ethics and Cleaning Data"
Mattingly "Ethics and Cleaning Data"Mattingly "Ethics and Cleaning Data"
Mattingly "Ethics and Cleaning Data"
 
Mercado-Lara "Open & Equitable Program"
Mercado-Lara "Open & Equitable Program"Mercado-Lara "Open & Equitable Program"
Mercado-Lara "Open & Equitable Program"
 
Ratner "Enhancing Open Science: Assessing Tools & Charting Progress"
Ratner "Enhancing Open Science: Assessing Tools & Charting Progress"Ratner "Enhancing Open Science: Assessing Tools & Charting Progress"
Ratner "Enhancing Open Science: Assessing Tools & Charting Progress"
 
Pfeiffer "Enhancing Open Science: Assessing Tools & Charting Progress"
Pfeiffer "Enhancing Open Science: Assessing Tools & Charting Progress"Pfeiffer "Enhancing Open Science: Assessing Tools & Charting Progress"
Pfeiffer "Enhancing Open Science: Assessing Tools & Charting Progress"
 

NISO Webinar: Metadata for Preservation: A Digital Object's Best Friend

  • 1. http://www.niso.org/news/events/2013/webinars/preservation NISO Webinar: Metadata for Preservation: A Digital Object's Best Friend February 13, 2013 Speakers: Rebecca Guenther, Amy Kirchhoff
  • 2. Metadata for Preservation: A Digital Object’s Best Friend Introduction to Preservation Metadata Rebecca Squire Guenther Library of Congress, NDMSO and Consultant, meetyourdata.com rguenther52@gmail.com NISO Webinar, Feb. 13, 2013
  • 3. Digital preservation: imperative and challenge  More and more of scholarly and cultural record exists in digital form; steps must be taken to secure its long-term future  Groups such as Digital Preservation Coalition, NDIIPP and National Digital Stewardship Alliance have made significant progress in raising awareness about digital preservation imperative  Gradual shift in focus from articulating problem to solving it … • Not so much “Why is digital preservation important” anymore; rather, “What must be done to achieve preservation objectives?”  Many practical challenges in implementing reliable, sustainable digital preservation programs  One key challenge: preservation metadata
  • 4. Metadata and preservation metadata. METADATA: “Structured information that describes, explains, locates, or otherwise makes it easier to retrieve, use, or manage an information resource”. PRESERVATION METADATA: “Metadata that supports and documents the digital preservation process”.
  • 5. Preservation metadata includes: • Provenance: Who has had custody/ownership of the digital object? • Authenticity: Is the digital object what it purports to be? • Preservation Activity: What has been done to preserve it? • Technical Environment: What is needed to render and use it? • Rights Management: What IPR must be observed? Makes digital objects self-documenting across time. (Diagram: Preservation Metadata accompanies Content 10 years on, 50 years on, forever!)
  • 6. Basics of preservation metadata  Digital preservation concentrates on well-designed formal systems based on digital library and trusted digital repository concepts  Information about what needs to be preserved and how are part of any preservation system  Since items aren’t on shelves, metadata is the only mechanism for actually keeping or finding anything  3 concepts are important • Metadata about preservation of digital objects • Preservation of metadata itself to ensure that content and metadata is preserved • Use of metadata in a trusted digital repository
  • 8. PREMIS Data Dictionary  May 2005: Data Dictionary for Preservation Metadata: Final Report of the PREMIS Working Group • Version 2.0 (April 2008) • Version 2.1 (January 2011) • Version 2.2 (July 2012) • Version 3.0 expected 2013  Includes: Data Dictionary Context/assumptions Data model Usage examples Conformance XML schema to support implementation  Data Dictionary: • Core set of implementable, broadly applicable preservation metadata semantic units, supported by guidelines and recommendations for management and use
  • 9. What does PREMIS cover?  Administrative metadata that supports the digital preservation process  Provides information to help manage a resource for preservation purposes • Technical characteristics • Information about actions on an object • Relationships (structural and derivative) • Structural: indicates how compound objects are put together • Derivative: results of common preservation actions • Rights metadata associated with preservation  In OAIS terms: • Metadata as part of SIP, AIP or DIP • Fits into Preservation Description Information (Reference, Context, Provenance, Fixity)
  • 10. What PREMIS is and is not  What PREMIS is: • Common data model for organizing/thinking about preservation metadata • A checklist for core metadata in a repository • Guidance for local implementations • Standard for exchanging information packages between repositories  What PREMIS is not: • Out-of-the-box solution: need to instantiate as metadata elements in repository system • All needed metadata: excludes business rules, format-specific technical metadata, descriptive metadata for access, non-core preservation metadata • Lifecycle management of objects outside repository • Rights management: limited to permissions regarding actions taken within repository
  • 11. PREMIS Data Model Intellectual Entities Rights Statements Objects Agents Events
  • 12. Intellectual Entities: Set of content that is considered a single intellectual unit for purposes of management and description (e.g., a book, a photograph, a map, a database). May include other Intellectual Entities (e.g., a website that includes a web page). Has one or more digital representations. Previously not fully described in PREMIS DD, but will be in scope in version 3.0. Examples: Rabbit Run by John Updike (a book); “Maggie at the beach” (a photograph); The Library of Congress Website (a website); The Library of Congress: American Memory Home page (a web page).
  • 13. Objects: Discrete unit of information in digital form. Objects are what the repository actually preserves. Three types of Object: • FILE: named and ordered sequence of bytes that is known by an operating system • REPRESENTATION: set of files, including structural metadata, that, taken together, constitute a complete rendering of an Intellectual Entity • BITSTREAM: data within a file with properties relevant for preservation purposes (but needs additional structure or reformatting to be a stand-alone file). Examples: chapter1.pdf (a file); chapter1.pdf + chapter2.pdf + chapter3.pdf (representation of a book w/3 chapters); TIFF file containing header and 2 images (2 bitstreams (images), each with its own set of properties (semantic units): e.g., identifiers, technical metadata, inhibitors, …). Intellectual entity will become another level of object.
  • 14. Object Example: book in two versions. Intellectual Entity: Da Vinci Code by Dan Brown. Representation 1 (page image version): File 1: page1.tiff, File 2: page2.tiff, File N: pageN.tiff, File N+1: METS.xml. Representation 2 (ebook version): File 1: book.lit.
  • 15. Events: An action that involves or impacts at least one Object or Agent associated with or known by the preservation repository. Helps document digital provenance. Can track history of an Object through the chain of Events that occur during the Object's lifecycle. Determining which Events should be recorded, and at what level of granularity, is up to the repository. Examples: Validation Event: use JHOVE tool to verify that chapter1.pdf is a valid PDF file; Ingest Event: transform an OAIS SIP into an AIP; Migration Event: create a new version of an Object in an up-to-date format.
  • 16. Agents: Person, organization, or software program/system associated with an Event or a Right (permission statement). Agents are associated only indirectly to Objects through Events or Rights. Not defined in detail in PREMIS DD; not considered core preservation metadata beyond identification. Examples: Martha Anderson (a person); Library of Congress (an organization); Dark Archive in the Sunshine State implementation (a system); JHOVE version 1.0 (a software program).
  • 17. Rights Statements: An agreement with a rights holder that grants permission for the repository to undertake an action(s) associated with an Object(s) in the repository. Not a full rights expression language; focuses exclusively on permissions that take the form: Agent X grants Permission Y to the repository in regard to Object Z. Example: Priscilla Caplan grants FCLA digital repository permission to make three copies of metadata_fundamentals.pdf for preservation purposes.
  • 18. Technical metadata pertaining to objects: Object identifier; Preservation level; Significant characteristics; Object characteristics (fixity, format, size, creating application, inhibitors, object characteristics extension); Creating application; Original name; Storage; Environment (software, hardware); Digital signatures; Relationships; Linking event identifier; Linking permission statement identifier.
  • 19. Semantic units pertaining to Events: provenance and preservation activity  Event identifier  Event type (e.g. capture, creation, validation, migration, fixity check)  Event dateTime  Event detail  Event outcome  Event outcome detail  Linking agent identifier  Linking object identifier
  • 20. Semantic units pertaining to Rights. Rights Statement: Rights Statement Identifier; Rights Basis; Copyright Information; License Information; Statute Information; Other Rights Information. Rights Granted: act; restriction; termOfGrant; rightsGranted. Linking Object Identifier; Linking Agent Identifier; rightsExtension.
  • 21. Semantic units pertaining to Agents  Agent Identifier  Agent Name  Agent Type  Agent Note  Agent Extension  linking Event Identifier  Linking Rights Identifier
  • 22. The State of PREMIS  de facto standard for preservation metadata; in some countries mandated for cultural heritage repositories  Was recognized by winning the Digital Preservation Award (2005) and was shortlisted for DPC Decennial award for outstanding contribution to digital preservation (2012)  PREMIS implementations are appearing in many places, many contexts, many forms  Experimentation has led to changes in the data dictionary and schema  PREMIS Implementation fairs: attempts to consolidate implementation experiences, issues, best practices,
  • 23. Key features of PREMIS  Developed through international consensus-making process  Mobilized community to address shared need  Shared solution to a shared need  Implementation neutral • Makes no assumptions about technology • Can be flexibly adapted for use across all sorts of institutions, digital preservation contexts, repository systems • Allows for extensibility  Supported by Maintenance Activity and Editorial Committee, under auspices of US Library of Congress  PREMIS is sustained, maintained, and evolved  Extensive outreach to implementer community  Tutorials, guides, implementation fairs, PIG Forum  “Support system” in place for PREMIS implementers
  • 24. PREMIS Maintenance Activity  Web site: • Permanent Web presence, hosted by Library of Congress • Central destination for PREMIS-related info, announcements, resources • Home of the PREMIS Implementers’ Group (PIG) discussion list  PREMIS Editorial Committee: • Set directions/priorities for PREMIS development • Coordinate future revisions of Data Dictionary and XML schema • Promote implementation http://www.loc.gov/standards/premis/
  • 25. Implementation resources  Tools: • XML schema • PREMIS-in-METS toolbox <http://pim.fcla.edu> • Controlled vocabularies at http://id.loc.gov • RDF/OWL ontology for use as Linked Data  Guidelines: • PREMIS conformance statement • PREMIS & METS guidelines  Community Working groups on special topics  Others: • Understanding PREMIS (available in multiple languages) • PIG Forum • Implementation Registry • Tools Registry
  • 26. Some implementers … • DAITSS (Florida): a preservation repository for the use of the libraries of the public universities of Florida • Ex Libris Rosetta: a commercial digital preservation system supporting acquisition, validation, ingest, storage, management, preservation and dissemination of different types of digital objects • National Digital Newspaper Program • Archivematica: comprehensive open-source digital preservation system • National Archives of Sweden, National Archives of Scotland • Carolina Digital Repository: repository for material in electronic formats produced by members of the University of North Carolina at Chapel Hill community • British Library electronic journal archiving project • For more information see: http://www.loc.gov/premis/premis-registry.html
  • 27. Impact  De facto international standard for preservation metadata • Part of permanent infrastructure supporting digital preservation • ISO standardization being considered  Wide applicability means benefits from PREMIS extend to entire digital preservation community  Ongoing work to revise/update Data Dictionary and create new supporting resources • PREMIS is a dynamic resource that continues to generate new sources of value to implementer community  Stood the test of time: • Seven years after initial release, is now indispensable part of digital preservation implementations around the world • Not surpassed or replaced by other standard or resource
  • 28. URLs, etc. • PREMIS Maintenance Activity: http://www.loc.gov/standards/premis/ • PREMIS Data Dictionary for Preservation Metadata: http://www.loc.gov/standards/premis/v2/premis-2-2.pdf • Understanding PREMIS: http://www.loc.gov/standards/premis/understanding-premis.pdf • PREMIS Implementation Registry: http://www.loc.gov/standards/premis/premis-registry.php • PREMIS Implementers Group list: http://listserv.loc.gov/listarch/pig.html
  • 29. Metadata for Preservation A digital object’s best friend Implementation!
  • 35. Standards: framework for thinking
  • 36. Standards: framework for thinking; interchange specification
  • 37. [The PREMIS documentation has an] emphasis on the need to know rather than the need to record or represent in any
  • 39. Intellectual Entities Rights Statements Objects Agents Events
  • 40. Digital preservation is the series of management policies and activities necessary to ensure the enduring usability, authenticity, discoverability and accessibility of
  • 43. Dublin Core DIDL (from MPEG-21) METS
  • 44. Dublin Core DIDL (from MPEG-21) METS OAIS
  • 45. Dublin Core DIDL (from MPEG-21) METS OAIS …
  • 46. Dublin Core DIDL (from MPEG-21) METS OAIS … Experience
  • 47. 1. Content model 2. Metadata elements 3. Registries
  • 48. Intellectual Entities Rights Statements Objects Agents Events
  • 50. Intellectual Entities Rights Statements Objects Agents Events
  • 60. PMD
  • 61. PMD a thing of beauty
  • 67. Intellectual Entities Rights Statements Objects Agents Events
  • 68. Semantic Units for Objects: 1.1 objectIdentifier 1.2 objectCategory 1.3 preservationLevel 1.4 significantProperties 1.5 objectCharacteristics 1.6 originalName 1.7 storage 1.8 environment 1.9 signatureInformation 1.10 relationship 1.11 linkingEventIdentifier 1.12 linkingIntellectualEntityIdentifier 1.13 linkingRightsStatementIdentifier
  • 80. Intellectual Entities Rights Statements Objects Agents Events
  • 81. Semantic Units for Events 2.1 eventIdentifier 2.2 eventType 2.3 eventDateTime 2.4 eventDetail 2.5 eventOutcomeInformation 2.6 linkingAgentIdentifier 2.7 linkingObjectIdentifier
  • 86. Some Portico Events Edit Descriptive Metadata Check Descriptive Metadata Generate Descriptive Metadata Ingest Into Archive Create File Generate Technical Metadata Set Preservation Level Generate Fixity
  • 87. Portico Event Elements Timestamp Rationale InputList ArgList Output ToolWrapper Tool Component List Outcome OutcomeDetailList
  • 89. Intellectual Entities Rights Statements Objects Agents Events
  • 90. Semantic Units for Agents 3.1 agentIdentifier 3.2 agentName 3.3 agentType 3.4 agentNote 3.5 agentExtension 3.6 linkingEventIdentifier 3.7 linkingRightsStatementIdentifier
  • 93. Intellectual Entities Rights Statements Objects Agents Events
  • 94. Semantic Units for Rights: 4.1 rightsStatement 4.1.1 rightsStatementIdentifier 4.1.2 rightsBasis 4.1.3 copyrightInformation 4.1.4 licenseInformation 4.1.5 statuteInformation 4.1.6 otherRightsInformation 4.1.7 rightsGranted 4.1.8 linkingObjectIdentifier 4.1.9 linkingAgentIdentifier 4.2 rightsExtension
  • 95. Easy
  • 99. Intellectual Entities Rights Statements Objects Agents Events
  • 101. NISO Webinar: Metadata for Preservation: A Digital Object's Best Friend Questions? All questions will be posted with presenter answers on the NISO website following the webinar: http://www.niso.org/news/events/2013/webinars/preservation NISO Webinar • February 13, 2013
  • 102. THANK YOU Thank you for joining us today. Please take a moment to fill out the brief online survey. We look forward to hearing from you!
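
As an illustration of how the Objects, Events and Agents described in the slides link to one another, here is a minimal Python sketch. It is not PREMIS XML and not any repository's actual implementation: the class names, the snake_case field names (loosely modelled on the semantic units listed in slides 19, 21 and 68), the digest fields, the SHA-256 choice and the "pass"/"fail" outcome values are all assumptions made for this example. It records a file's fixity value, then logs a "fixity check" Event that links the Object to the Agent that performed the check.

```python
import hashlib
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass
class Agent:
    """Hypothetical stand-in for a PREMIS Agent (person, organization, software)."""
    agent_identifier: str
    agent_name: str
    agent_type: str


@dataclass
class Event:
    """Hypothetical stand-in for a PREMIS Event; links one Object and one Agent."""
    event_identifier: str
    event_type: str
    event_date_time: str
    event_outcome: str
    linking_agent_identifier: str
    linking_object_identifier: str


@dataclass
class FileObject:
    """Hypothetical stand-in for a PREMIS Object of type FILE."""
    object_identifier: str
    original_name: str
    size: int
    message_digest_algorithm: str
    message_digest: str
    linking_event_identifiers: list = field(default_factory=list)


def describe_file(object_id: str, name: str, content: bytes) -> FileObject:
    """Record basic object characteristics (size and fixity) for a file."""
    digest = hashlib.sha256(content).hexdigest()
    return FileObject(object_id, name, len(content), "SHA-256", digest)


def fixity_check(obj: FileObject, content: bytes, agent: Agent, event_id: str) -> Event:
    """Recompute the digest, compare it with the stored one, and log an Event."""
    digest = hashlib.sha256(content).hexdigest()
    outcome = "pass" if digest == obj.message_digest else "fail"
    event = Event(
        event_identifier=event_id,
        event_type="fixity check",
        event_date_time=datetime.now(timezone.utc).isoformat(),
        event_outcome=outcome,
        linking_agent_identifier=agent.agent_identifier,
        linking_object_identifier=obj.object_identifier,
    )
    # The Object keeps a link back to each Event that touched it.
    obj.linking_event_identifiers.append(event_id)
    return event
```

The Agent is tied to the Object only through the Event, which mirrors the statement in slide 16 that Agents are associated with Objects only indirectly.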

Editor's Notes

  1. The PREMIS-in-METS toolbox consists of 3 modules to help implementers: describe (generate PREMIS metadata), convert (between PREMIS and METS), and validate (ensure quality metadata). Controlled vocabularies increase the interoperability and consistency of metadata. The RDF/OWL ontology allows for interconnection among preservation repositories, facilitates querying the metadata, and incorporates preservation-specific controlled vocabularies. The available guidelines (the conformance statement and the guidelines for using PREMIS in METS) result in quality, consistent metadata. Community working groups on specific topics include the Ontology working group and the Environment working group (to amend the data model); they are open to the preservation community at large. The PREMIS Implementers' Group forum allows the preservation community to participate in PREMIS development and submit change requests to the EC. The Implementation Registry assists new implementers in planning their preservation systems. The Tools Registry gives implementers tools.
  2. PREMIS has had a significant impact on digital preservation activities. Its wide applicability has resulted in cost savings to institutions developing preservation repositories, because they have a standard that can be used by the entire preservation community. Ongoing work makes it a dynamic resource: it continues to generate new sources of value to the implementer community.
  3. Turn everything off. Make your sidebar completely empty and make sure your PC won’t shut off or down.
  4. Who am I????
  5. I am Mom to these 4 beautiful children … <click>
  6. More pertinently, though, I am the Archive Service Product Manager. I have an MA in Library Science. I have been with JSTOR and Portico forever – I started at JSTOR in 1996. I now focus on preservation at Portico and JSTOR. <click – to standards>
  7. Before we begin, I want to share my philosophy on standards. In my opinion, standards do two things really well … <click>
  8. They provide a framework for thinking about a topic and making a plan. Enter the wilderness with a map. <click>
  9. They are also quite valuable as interchange specifications between organizations, or even groups within a single organization. <click>
  10. Fortunately for me, the PREMIS folks seem to agree. PREMIS is about a way to think about preservation metadata. About the elements and units you need to consider. In my talk today, you aren’t going to see any PREMIS XML. You are going to see quite a lot about … <click>
  11. The Portico content model and an XML content wrapper that we call PMD or the Preservation Metadata file. It is a pretty direct reflection of our content model and we have at least one PMD file for every item we preserve. Many considerations went into the design of the Portico preservation metadata … <click>
  12. Another is our definition of preservation, which is on the screen. We spent quite a while developing this definition and it really helps us focus when making preservation decisions. <click> What is Digital Preservation? Digital preservation is the series of management policies and activities necessary to ensure the enduring usability, authenticity, discoverability and accessibility of content over the very long term. The key goals of digital preservation include: usability – the intellectual content of the item must remain usable via the delivery mechanism of current technology; authenticity – the provenance of the content must be proven and the content an authentic replica of the original; discoverability – the content must have logical bibliographic metadata so that the content can be found by end users through time; accessibility – the content must be available for use to the appropriate community.
  13. Any number of other standards influenced us, including …&lt;click&gt;
  14. DIDL is a content model. It is very flexible. We almost used it.&lt;click&gt;
  15. Our first preservation metadata file was METS based. We migrated to our new format a couple of years ago.&lt;click&gt;
  16. Of course …&lt;click&gt;
  17. And, no doubt many others that aren’t on the tip of my tongue at the moment.&lt;click&gt;
  18. When we redesigned our preservation metadata file a couple of years back, we also drew pretty extensively on our experience. You’ll see that reflected in some areas as we talk, for example in how we deal with events in our metadata file.&lt;click&gt;
  19. The PREMIS entities and semantic units can be found embedded in the Portico content model, our metadata elements, and also in a system of registries we implement. Registries are a way for us to track things.&lt;click&gt;
  20. A word about identifiers. PREMIS requires unique identifiers on every entity and semantic unit. At Portico we firmly believe in this philosophy, and you’ll see throughout the presentation many of the ways in which we use unique identifiers to link between elements of our content model.&lt;click&gt;
  21. We currently preserve a number of disparate things. They have many similarities, but they also have not insignificant differences.&lt;click&gt;
  22. One of our goals is to represent all these disparate content types in one content model and with one set of preservation metadata. We need to manage the archive and the preserved content uniformly. To put this another way, if we can’t manage these uniformly, my head may explode. So, one content model …&lt;click&gt;
  23. So the Portico content model is pretty heavily informed by DIDL. Containers contain other containers. Our model is limited to six levels. We have content types, such as e-books, e-journals, and digitized newspapers.&lt;click&gt;
  24. They contain one or more content sets. A content set is just a way for us to bag content together. For example … For the e-journal content type, our content set is the journal. For the e-book content type, our content set is the publisher. For the digitized newspaper content type, our content set is the collection.&lt;click&gt;
  25. Content sets contain one or more Archival Units. These are the units of preservation. For example … For e-journals, the archival unit is the article. For e-books, the archival unit is the book. For digitized newspapers, the archival unit is the issue.&lt;click&gt;
  26. Each archival unit may contain one or more content units. We’d use this technique if the publisher sent us an update to the full item.&lt;click&gt;
  27. Content units contain one or more functional units. A functional unit is an intellectual entity within the item. For example, the page images of an article are a functional unit. Each figure graphic is a functional unit.&lt;click&gt;
  28. Each functional unit can contain one or more storage units (which are essentially files). Say we receive a high res image, low res image, and thumbnail for a single figure graphic. That one figure graphic functional unit would contain three storage units.&lt;click&gt;
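The six levels described in these notes (content type, content set, archival unit, content unit, functional unit, storage unit) can be pictured as nested containers. What follows is only a minimal sketch, not Portico's implementation: every class name, field name, and file name is a hypothetical chosen for illustration.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class StorageUnit:
    """Essentially a single file (file names below are made up)."""
    file_name: str


@dataclass
class FunctionalUnit:
    """An intellectual entity within the item, such as one figure graphic."""
    label: str
    storage_units: List[StorageUnit] = field(default_factory=list)


@dataclass
class ContentUnit:
    """One delivery of the item; a publisher update would add another."""
    functional_units: List[FunctionalUnit] = field(default_factory=list)


@dataclass
class ArchivalUnit:
    """The unit of preservation: an article, a book, or an issue."""
    label: str
    content_units: List[ContentUnit] = field(default_factory=list)


@dataclass
class ContentSet:
    """A bag of archival units: a journal, a publisher, or a collection."""
    label: str
    archival_units: List[ArchivalUnit] = field(default_factory=list)


@dataclass
class ContentType:
    """Top level: e-journals, e-books, digitized newspapers."""
    label: str
    content_sets: List[ContentSet] = field(default_factory=list)


def count_storage_units(content_type: ContentType) -> int:
    """Walk every level of the hierarchy and count the files at the bottom."""
    return sum(
        len(fu.storage_units)
        for cs in content_type.content_sets
        for au in cs.archival_units
        for cu in au.content_units
        for fu in cu.functional_units
    )


# One figure graphic received as a high res image, a low res image, and a thumbnail.
figure = FunctionalUnit(
    "figure graphic",
    [
        StorageUnit("fig1_highres.tif"),
        StorageUnit("fig1_lowres.jpg"),
        StorageUnit("fig1_thumb.gif"),
    ],
)
article = ArchivalUnit("example article", [ContentUnit([figure])])
journal = ContentSet("example journal", [article])
ejournals = ContentType("e-journal", [journal])

print(count_storage_units(ejournals))
```

Because every content type uses the same six classes, one traversal such as `count_storage_units` works for articles, books, and newspaper issues alike, which is the point of having a single content model.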
  29. At any level of the Portico content model, we can apply these metadata, within which some of the PREMIS semantic units may be found.&lt;click&gt;
  30. This entire mess of information (content model and metadata) is recorded in the file we call the PMD, or Preservation Metadata file.&lt;click&gt;
  31. It is a thing of beauty. And, I’m not even an XML geek!&lt;click&gt;
  32. Our PMD files tightly match the content model. This is a snippet of the XML tree of a PMD file.&lt;click&gt;
  33. Archival units …&lt;click&gt;
  34. Contain content units …&lt;click&gt;
  35. Which contain functional units …
  36. Which contain storage units. The higher elements of our content model are encoded in the construction of the archive itself and as metadata attributes and elements within the PMD file.&lt;click&gt;
  37. Per PREMIS, objects often have the following information associated with them. This type of information is pretty deeply embedded in the Portico PMD.&lt;click&gt;
  38. For example, here is a snippet of information about a storage unit (or file).&lt;click&gt;
  39. Among other things, we have an ID for this storage unit.&lt;click&gt;
  40. And a preservation level.&lt;click&gt;
  41. Deeper into the storage unit, we have additional information, including:&lt;click&gt;
  42. The size of the file …&lt;click&gt;
  43. A basic format for the file …&lt;click&gt;
  44. A format status for the file…&lt;click&gt;
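The slides with the actual PMD snippets are not reproduced in these notes, so the following is only a rough sketch of how a nested file like this could carry the storage unit information just listed (ID, preservation level, size, basic format, format status). All element names, attribute names, and values are assumptions made for illustration; they are not the real PMD schema.

```python
import xml.etree.ElementTree as ET

# Hypothetical element and attribute names; the real PMD schema is not shown here.
archival_unit = ET.Element("ArchivalUnit", id="au-1")
content_unit = ET.SubElement(archival_unit, "ContentUnit", id="cu-1")
functional_unit = ET.SubElement(content_unit, "FunctionalUnit", id="fu-1")
storage_unit = ET.SubElement(
    functional_unit,
    "StorageUnit",
    id="su-1",
    preservationLevel="example-level",
)

# Deeper into the storage unit: size, basic format, and format status.
ET.SubElement(storage_unit, "Size").text = "20480"
ET.SubElement(storage_unit, "Format").text = "example-format"
ET.SubElement(storage_unit, "FormatStatus").text = "example-status"

# Serialize, then read the file back the way a consumer of the metadata would.
xml_text = ET.tostring(archival_unit, encoding="unicode")
parsed = ET.fromstring(xml_text)
found = parsed.find("./ContentUnit/FunctionalUnit/StorageUnit")

print(found.get("id"), found.findtext("Size"))
```

The nesting of the XML mirrors the nesting of the content model, so a path expression that names each level is enough to reach any storage unit.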
  45. I am going to touch very briefly on registries. At Portico, we use registries as a way to consolidate information. In this case, information about formats.&lt;click&gt;
  46. Here we have two files (this is an element within the storage unit element).&lt;click&gt;
  47. These files each have a very specific format name.&lt;click&gt;
  48. That name provides us with significant additional information, found in our format registry, including a description, the authority and maintenance agencies, the default file extension and more. While we do track PREMIS information on our objects, it is found in a number of different places, from embedded in the content model or PMD files to registries.&lt;click&gt;
  49. Per PREMIS, the important semantic units for Events are on the screen. Nothing too surprising.&lt;click&gt;
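The registry idea can be sketched as a simple lookup: the metadata file stores only a specific format name, and everything else about the format lives in one shared place. This is an illustrative sketch under assumptions; the record fields follow the list in the notes, but the field names, format name, and values are invented.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass(frozen=True)
class FormatRecord:
    """What the registry knows about one format (hypothetical field names)."""
    description: str
    authority: str
    maintenance_agency: str
    default_extension: str


# Made-up registry content, keyed by the specific format name stored per file.
FORMAT_REGISTRY: Dict[str, FormatRecord] = {
    "example-image-format": FormatRecord(
        description="An illustrative image format",
        authority="Example Authority",
        maintenance_agency="Example Agency",
        default_extension=".img",
    ),
}


def lookup_format(format_name: str) -> FormatRecord:
    """Resolve a format name from a metadata file to its registry record."""
    try:
        return FORMAT_REGISTRY[format_name]
    except KeyError:
        raise KeyError(f"format {format_name!r} is not in the registry") from None


record = lookup_format("example-image-format")
print(record.default_extension)
```

Keeping the description and agencies in the registry means they are written once rather than repeated in every metadata file that mentions the format.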
  50. That key information for events can be found in three different locations within the Portico preservation metadata file.&lt;click&gt;
  51. This is a processing record. Processing records are relatively new features for us. When Portico first started, we designed a very flexible system that would allow us to run different elements of our workflow on different machines. As we ramped up, it became clear that our administrative costs would be lower if we limited the number of machines we managed and that we could get much greater throughput running on a single, powerful machine. Originally, we had put all the information about the machine and tools into each event record. But, with experience under our belt, it became clear that we could streamline our metadata files by consolidating this information into Processing Records.&lt;click&gt;
  52. Here is a close-up on a processing record and a set of events that reference this processing record.&lt;click&gt;
  53. They are tied together through that unique processing record ID. And this relationship is telling the world that the events within this event set all occurred on the ConPrepLite system in July 2010.&lt;click&gt;
  54. Within our PMD file, events are grouped into Event Sets. These are just a set of events that happened at the same time, for the same conceptual purpose, and are associated with a single processing record. Some of the events we track are above. Nothing too exciting.&lt;click&gt;
  55. Another change we made was to unify the format of our events. Within Portico all events now contain only elements from the above list of possible elements. These were informed by PREMIS and you’ll see a number of similarities.&lt;click&gt;
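The link between an event set and its processing record can be sketched as a reference by ID: machine and tool details are stored once, and each event set points at them. This is a minimal sketch with hypothetical class, field, and event names; only the ConPrepLite system and the July 2010 date come from the notes.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class ProcessingRecord:
    """Where and when a batch of processing ran (hypothetical fields)."""
    record_id: str
    system: str
    date: str


@dataclass
class EventSet:
    """Events that happened together and share one processing record."""
    processing_record_id: str
    events: List[str] = field(default_factory=list)


def describe(event_set: EventSet, records: Dict[str, ProcessingRecord]) -> List[str]:
    """Resolve the shared processing record and attach it to each event."""
    record = records[event_set.processing_record_id]
    return [f"{event} on {record.system} ({record.date})" for event in event_set.events]


# Event names are placeholders; the list of events actually tracked is on the slide.
records = {"pr-001": ProcessingRecord("pr-001", "ConPrepLite", "2010-07")}
event_set = EventSet("pr-001", ["example-event-a", "example-event-b"])
lines = describe(event_set, records)

for line in lines:
    print(line)
```

Storing the system details once per processing record, rather than once per event, is what slims down the metadata file.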
  56. If you are going to walk away remembering one thing, remember that events (like descriptive and technical metadata) can live on any element within the content model.&lt;click&gt;
  57. These are the semantic units for agents. In general, rights holders are primary agents within a repository.&lt;click&gt;
  58. In addition, however, there are repository systems and people that might make changes to the content.&lt;click&gt;
  59. For example, within these three processing records are three agents that touched the content.&lt;click&gt;
  60. Per PREMIS, the important information to consider is on the screen.&lt;click&gt;
  61. &lt;chuckle&gt;&lt;click&gt;
  62. At the moment, Portico’s rights statements are relatively straightforward. All of the Portico agreements are currently similar, and thus we do not have a need to track a variety of different clauses and commitments. We have set up a system whereby content will not enter the Portico archive until we have a formal agreement in place and that agreement has been preserved in the archive.&lt;click&gt;
  63. As with many other PREMIS entities, rights entities are embedded within our PMD file.&lt;click&gt;
  64. Every archival unit must reference a specific agreement. That agreement has a unique ID and can be found in the archive.&lt;click&gt;
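The rule that content does not enter the archive without a preserved agreement can be sketched as a gate at ingest time. This is an assumption-laden illustration, not Portico's workflow: the function, the exception, and the identifiers are all hypothetical.

```python
from typing import Set


class MissingAgreementError(Exception):
    """Raised when an archival unit points at an agreement not in the archive."""


def check_agreement(
    archival_unit_id: str,
    agreement_id: str,
    preserved_agreements: Set[str],
) -> None:
    """Allow ingest only if the referenced agreement is already preserved."""
    if agreement_id not in preserved_agreements:
        raise MissingAgreementError(
            f"archival unit {archival_unit_id} references agreement "
            f"{agreement_id}, which is not preserved in the archive"
        )


# Made-up identifiers for illustration.
preserved = {"agr-001"}
check_agreement("au-1", "agr-001", preserved)  # passes silently

try:
    check_agreement("au-2", "agr-999", preserved)
except MissingAgreementError as error:
    print(error)
```

Because every archival unit carries the agreement's unique ID, the check is a single membership test.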
  65. Questions?&lt;Amy: stay on afterward.&gt;&lt;don’t click unless you need to address 2CUL or Xref questions&gt;