SlideShare a Scribd company logo
1 of 15
Download to read offline
Tackling File
                                             Characterization
                                             & Analysis with
                                              Archivematica
                                               Courtney C. Mumma
                                                CurateGear 2013
                                             Wednesday, January 9, 2013

Jetsons-robot-football-player1 - I, Foxbot
digital preservation consulting
open-source sofware for archives and libraries
`


                          monitor and control




                                                     web     MCP               micro-service
                                                    server   server          processing clients



                                                                 fileshare




                                                              digital curation
                                                                                                  AIP
                                                              micro-services           success

                                                watched
                                                directory
                                                                                         error    DIP
    web-based dashboard                                      python       FOSS
                                                             scripts       tools




                                                                         transfer of
                                                               or       digital objects
                                                    SIP                  & metadata
Preservation planning
●   A two-pronged approach:
    ●   Normalization on ingest
    ●   Preservation of the original file to support future
        strategies such as emulation
●   Normalization relies on format policies based on an
    analysis of the significant characteristics of file
    formats
        –   A format policy indicates the actions, tools and settings to
            apply to a file of a particular file format (e.g. normalization to
            preservation and/or access format)
Archivematica format policies
●   Criteria for selecting default formats:
       –   Non-proprietary
       –   Freely available specifications
       –   Widely used/endorsed by major repositories
       –   No compression/lossless video compression
       –   Tools available to write and render the format
●   Format policies will change as community
    standards, practices and tools evolve.
https://www.archivematica.org/wiki/Format_policies
Format confessional



●




    iPres 2012 Toronto CurateCamp
24-hour CurateCamp/OPF
worldwide file id hackathon

  ●   participants from
      every time zone
  ●   rapid, iterative testing
  ●   OpenFITS (github
      fork)
  ●   Open Planets Format
      Corpus (github)
Choose normalization tool
Select normalization type
Review normalization results
Format Policy Registry - FPR
●   To share and test format choices and tool
    commands for normalization
       –   ie “format policies”
●   To support the community-wide evolution of
    best practices
●   Hosted at archivematica.org/FPR with local
    dashboard customization and updates
Format Policy Registry - FPR
●   Allow users to choose:
    ●   normalization tool
    ●   preservation and/or access format
    ●   file properties (eg resolution, bitrate)
    ●   to use local tool/process outside of the system
●   Allow users to add new format policies
Format Policy Registry - FPR
Tackling File Characterization and Analysis in Archivematica

More Related Content

Viewers also liked

POWRR Tools: Lessons learned from an IMLS National Leadership Grant
POWRR Tools: Lessons learned from an IMLS National Leadership GrantPOWRR Tools: Lessons learned from an IMLS National Leadership Grant
POWRR Tools: Lessons learned from an IMLS National Leadership GrantLynne Thomas
 
Getting Started with AtoM and Archivematica for Digital Preservation and Access
Getting Started with AtoM and Archivematica for Digital Preservation and AccessGetting Started with AtoM and Archivematica for Digital Preservation and Access
Getting Started with AtoM and Archivematica for Digital Preservation and AccessArtefactual Systems - Archivematica
 
Processing at the University of Michigan Bentley Historical Library
Processing at the University of Michigan Bentley Historical LibraryProcessing at the University of Michigan Bentley Historical Library
Processing at the University of Michigan Bentley Historical Librarymikeum
 
Digital Preservation Best Practices: Lessons Learned From Across the Pond
Digital Preservation Best Practices: Lessons Learned From Across the PondDigital Preservation Best Practices: Lessons Learned From Across the Pond
Digital Preservation Best Practices: Lessons Learned From Across the PondBenoit Pauwels
 
HNI leeromgeving 02
HNI leeromgeving 02HNI leeromgeving 02
HNI leeromgeving 02fneggers
 
Mapping the Digital Preservation Wilderness: What you need to know
Mapping the Digital Preservation Wilderness:  What you need to knowMapping the Digital Preservation Wilderness:  What you need to know
Mapping the Digital Preservation Wilderness: What you need to knowJody DeRidder
 
Digital Preservation
Digital PreservationDigital Preservation
Digital PreservationMichael Day
 
Challenges & opportunities in the preservation of (digital) information: the ...
Challenges & opportunities in the preservation of (digital) information: the ...Challenges & opportunities in the preservation of (digital) information: the ...
Challenges & opportunities in the preservation of (digital) information: the ...LIBER Europe
 
HNI leeromgeving 05
HNI leeromgeving 05HNI leeromgeving 05
HNI leeromgeving 05fneggers
 
Cultural heritage collections in a web 2
Cultural heritage collections in a web 2Cultural heritage collections in a web 2
Cultural heritage collections in a web 2Lynne Thomas
 
Progress with FITS for analyzing video
Progress with FITS for analyzing videoProgress with FITS for analyzing video
Progress with FITS for analyzing videoprwheatley
 
CNZ2013 Keynote | Trust in Digital Preservation | Natalie Harrower
CNZ2013 Keynote | Trust in Digital Preservation | Natalie HarrowerCNZ2013 Keynote | Trust in Digital Preservation | Natalie Harrower
CNZ2013 Keynote | Trust in Digital Preservation | Natalie Harrowerdri_ireland
 

Viewers also liked (20)

POWRR Tools: Lessons learned from an IMLS National Leadership Grant
POWRR Tools: Lessons learned from an IMLS National Leadership GrantPOWRR Tools: Lessons learned from an IMLS National Leadership Grant
POWRR Tools: Lessons learned from an IMLS National Leadership Grant
 
Getting Started with AtoM and Archivematica for Digital Preservation and Access
Getting Started with AtoM and Archivematica for Digital Preservation and AccessGetting Started with AtoM and Archivematica for Digital Preservation and Access
Getting Started with AtoM and Archivematica for Digital Preservation and Access
 
Processing at the University of Michigan Bentley Historical Library
Processing at the University of Michigan Bentley Historical LibraryProcessing at the University of Michigan Bentley Historical Library
Processing at the University of Michigan Bentley Historical Library
 
A Foundational Framework for Digital Curation: The Sept Domain Model. Stephen...
A Foundational Framework for Digital Curation: The Sept Domain Model. Stephen...A Foundational Framework for Digital Curation: The Sept Domain Model. Stephen...
A Foundational Framework for Digital Curation: The Sept Domain Model. Stephen...
 
Digital Preservation Best Practices: Lessons Learned From Across the Pond
Digital Preservation Best Practices: Lessons Learned From Across the PondDigital Preservation Best Practices: Lessons Learned From Across the Pond
Digital Preservation Best Practices: Lessons Learned From Across the Pond
 
Your Digital Preservation Cookbook
Your Digital Preservation CookbookYour Digital Preservation Cookbook
Your Digital Preservation Cookbook
 
SHAREmodule2
SHAREmodule2SHAREmodule2
SHAREmodule2
 
Characterization of CDROMs for Emulation-based Access. Klaus Rechert, Thomas ...
Characterization of CDROMs for Emulation-based Access. Klaus Rechert, Thomas ...Characterization of CDROMs for Emulation-based Access. Klaus Rechert, Thomas ...
Characterization of CDROMs for Emulation-based Access. Klaus Rechert, Thomas ...
 
HNI leeromgeving 02
HNI leeromgeving 02HNI leeromgeving 02
HNI leeromgeving 02
 
Mapping the Digital Preservation Wilderness: What you need to know
Mapping the Digital Preservation Wilderness:  What you need to knowMapping the Digital Preservation Wilderness:  What you need to know
Mapping the Digital Preservation Wilderness: What you need to know
 
Digital Preservation
Digital PreservationDigital Preservation
Digital Preservation
 
Challenges & opportunities in the preservation of (digital) information: the ...
Challenges & opportunities in the preservation of (digital) information: the ...Challenges & opportunities in the preservation of (digital) information: the ...
Challenges & opportunities in the preservation of (digital) information: the ...
 
ENArC- international cooperation, current and past project activities - statu...
ENArC- international cooperation, current and past project activities - statu...ENArC- international cooperation, current and past project activities - statu...
ENArC- international cooperation, current and past project activities - statu...
 
SHAREmodule3
SHAREmodule3SHAREmodule3
SHAREmodule3
 
HNI leeromgeving 05
HNI leeromgeving 05HNI leeromgeving 05
HNI leeromgeving 05
 
Cultural heritage collections in a web 2
Cultural heritage collections in a web 2Cultural heritage collections in a web 2
Cultural heritage collections in a web 2
 
Progress with FITS for analyzing video
Progress with FITS for analyzing videoProgress with FITS for analyzing video
Progress with FITS for analyzing video
 
CNZ2013 Keynote | Trust in Digital Preservation | Natalie Harrower
CNZ2013 Keynote | Trust in Digital Preservation | Natalie HarrowerCNZ2013 Keynote | Trust in Digital Preservation | Natalie Harrower
CNZ2013 Keynote | Trust in Digital Preservation | Natalie Harrower
 
Electronic textbooks for studying of the archival sciences
Electronic textbooks for studying of the archival sciencesElectronic textbooks for studying of the archival sciences
Electronic textbooks for studying of the archival sciences
 
Adding MediaConch to Archivematica for mkv/ffv1 checking
Adding MediaConch to Archivematica for mkv/ffv1 checkingAdding MediaConch to Archivematica for mkv/ffv1 checking
Adding MediaConch to Archivematica for mkv/ffv1 checking
 

Similar to Tackling File Characterization and Analysis in Archivematica

fiwalk With Me: Building Emergent Pre-Ingest Workflows for Digital Archival R...
fiwalk With Me: Building Emergent Pre-Ingest Workflows for Digital Archival R...fiwalk With Me: Building Emergent Pre-Ingest Workflows for Digital Archival R...
fiwalk With Me: Building Emergent Pre-Ingest Workflows for Digital Archival R...Mark Matienzo
 
Tutorial: Best Practices for Building a Records-Management Deployment in Shar...
Tutorial: Best Practices for Building a Records-Management Deployment in Shar...Tutorial: Best Practices for Building a Records-Management Deployment in Shar...
Tutorial: Best Practices for Building a Records-Management Deployment in Shar...SPTechCon
 
IntrospeQt iCapture Connect for Alfresco
IntrospeQt iCapture Connect for AlfrescoIntrospeQt iCapture Connect for Alfresco
IntrospeQt iCapture Connect for AlfrescoSrikant Tallapragada
 
e-Services to Keep Your Digital Files Current
e-Services to Keep Your Digital Files Currente-Services to Keep Your Digital Files Current
e-Services to Keep Your Digital Files Currentpbajcsy
 
Leadership Symposium on Digital Media in Healthcare
Leadership Symposium on Digital Media in HealthcareLeadership Symposium on Digital Media in Healthcare
Leadership Symposium on Digital Media in Healthcaresetstanford
 
Archivematica and Local Authority Archive Services
Archivematica and Local Authority Archive ServicesArchivematica and Local Authority Archive Services
Archivematica and Local Authority Archive ServicesPaweł Jaskulski
 
Presentation arsip nov 2012 frans smit handout
Presentation arsip nov 2012 frans smit handoutPresentation arsip nov 2012 frans smit handout
Presentation arsip nov 2012 frans smit handoutGemeente Almere
 
Eudat user forum-london-11march2013-biovel-v3
Eudat user forum-london-11march2013-biovel-v3Eudat user forum-london-11march2013-biovel-v3
Eudat user forum-london-11march2013-biovel-v3Alex Hardisty
 
Exercises portfolio-Digital Curation Tools (IS40620)
Exercises portfolio-Digital Curation Tools (IS40620)Exercises portfolio-Digital Curation Tools (IS40620)
Exercises portfolio-Digital Curation Tools (IS40620)softwaresatish
 
A little simple explanation abut Digital imaging
A little simple explanation abut Digital imagingA little simple explanation abut Digital imaging
A little simple explanation abut Digital imagingaechaa93
 
Introduction of file based workflows 111004 vfinal
Introduction of file based workflows 111004 vfinalIntroduction of file based workflows 111004 vfinal
Introduction of file based workflows 111004 vfinalMarie Josée (MJ) Drouin
 
CASE-6 Structured Content Authoring and Publishing through Alfresco and Compo...
CASE-6 Structured Content Authoring and Publishing through Alfresco and Compo...CASE-6 Structured Content Authoring and Publishing through Alfresco and Compo...
CASE-6 Structured Content Authoring and Publishing through Alfresco and Compo...Alfresco Software
 
Q2 Briefing Presentation
Q2 Briefing PresentationQ2 Briefing Presentation
Q2 Briefing PresentationKurt Carlsen
 
Wewebu customer success story California Dept. of Public Health
Wewebu customer success story California Dept. of Public HealthWewebu customer success story California Dept. of Public Health
Wewebu customer success story California Dept. of Public HealthWeWebU Software AG
 
PRESENTATION: Tips and Tricks for Government Agencies to Push the Limits of P...
PRESENTATION: Tips and Tricks for Government Agencies to Push the Limits of P...PRESENTATION: Tips and Tricks for Government Agencies to Push the Limits of P...
PRESENTATION: Tips and Tricks for Government Agencies to Push the Limits of P...Adlib - The PDF Experts
 

Similar to Tackling File Characterization and Analysis in Archivematica (20)

fiwalk With Me: Building Emergent Pre-Ingest Workflows for Digital Archival R...
fiwalk With Me: Building Emergent Pre-Ingest Workflows for Digital Archival R...fiwalk With Me: Building Emergent Pre-Ingest Workflows for Digital Archival R...
fiwalk With Me: Building Emergent Pre-Ingest Workflows for Digital Archival R...
 
Tutorial: Best Practices for Building a Records-Management Deployment in Shar...
Tutorial: Best Practices for Building a Records-Management Deployment in Shar...Tutorial: Best Practices for Building a Records-Management Deployment in Shar...
Tutorial: Best Practices for Building a Records-Management Deployment in Shar...
 
IntrospeQt iCapture Connect for Alfresco
IntrospeQt iCapture Connect for AlfrescoIntrospeQt iCapture Connect for Alfresco
IntrospeQt iCapture Connect for Alfresco
 
e-Services to Keep Your Digital Files Current
e-Services to Keep Your Digital Files Currente-Services to Keep Your Digital Files Current
e-Services to Keep Your Digital Files Current
 
Leadership Symposium on Digital Media in Healthcare
Leadership Symposium on Digital Media in HealthcareLeadership Symposium on Digital Media in Healthcare
Leadership Symposium on Digital Media in Healthcare
 
Archivematica and Local Authority Archive Services
Archivematica and Local Authority Archive ServicesArchivematica and Local Authority Archive Services
Archivematica and Local Authority Archive Services
 
IT Governance Portals
IT Governance   PortalsIT Governance   Portals
IT Governance Portals
 
Presentation arsip nov 2012 frans smit handout
Presentation arsip nov 2012 frans smit handoutPresentation arsip nov 2012 frans smit handout
Presentation arsip nov 2012 frans smit handout
 
Resource space
Resource spaceResource space
Resource space
 
Eudat user forum-london-11march2013-biovel-v3
Eudat user forum-london-11march2013-biovel-v3Eudat user forum-london-11march2013-biovel-v3
Eudat user forum-london-11march2013-biovel-v3
 
Exercises portfolio-Digital Curation Tools (IS40620)
Exercises portfolio-Digital Curation Tools (IS40620)Exercises portfolio-Digital Curation Tools (IS40620)
Exercises portfolio-Digital Curation Tools (IS40620)
 
Alfresco Records Management
Alfresco Records ManagementAlfresco Records Management
Alfresco Records Management
 
A little simple explanation abut Digital imaging
A little simple explanation abut Digital imagingA little simple explanation abut Digital imaging
A little simple explanation abut Digital imaging
 
Introduction of file based workflows 111004 vfinal
Introduction of file based workflows 111004 vfinalIntroduction of file based workflows 111004 vfinal
Introduction of file based workflows 111004 vfinal
 
CASE-6 Structured Content Authoring and Publishing through Alfresco and Compo...
CASE-6 Structured Content Authoring and Publishing through Alfresco and Compo...CASE-6 Structured Content Authoring and Publishing through Alfresco and Compo...
CASE-6 Structured Content Authoring and Publishing through Alfresco and Compo...
 
Personal Digital Archiving 2015 - NYU - Workshop
Personal Digital Archiving 2015 - NYU - WorkshopPersonal Digital Archiving 2015 - NYU - Workshop
Personal Digital Archiving 2015 - NYU - Workshop
 
Q2 Briefing Presentation
Q2 Briefing PresentationQ2 Briefing Presentation
Q2 Briefing Presentation
 
SharePoint 2010: ECM-ready?
SharePoint 2010: ECM-ready?SharePoint 2010: ECM-ready?
SharePoint 2010: ECM-ready?
 
Wewebu customer success story California Dept. of Public Health
Wewebu customer success story California Dept. of Public HealthWewebu customer success story California Dept. of Public Health
Wewebu customer success story California Dept. of Public Health
 
PRESENTATION: Tips and Tricks for Government Agencies to Push the Limits of P...
PRESENTATION: Tips and Tricks for Government Agencies to Push the Limits of P...PRESENTATION: Tips and Tricks for Government Agencies to Push the Limits of P...
PRESENTATION: Tips and Tricks for Government Agencies to Push the Limits of P...
 

Tackling File Characterization and Analysis in Archivematica

  • 1. Tackling File Characterization & Analysis with Archivematica Courtney C. Mumma CurateGear 2013 Wednesday, January 9, 2013 Jetsons-robot-football-player1 - I, Foxbot
  • 2. digital preservation consulting open-source sofware for archives and libraries
  • 3. ` monitor and control web MCP micro-service server server processing clients fileshare digital curation AIP micro-services success watched directory error DIP web-based dashboard python FOSS scripts tools transfer of or digital objects SIP & metadata
  • 4. Preservation planning ● A two-pronged approach: ● Normalization on ingest ● Preservation of the original file to support future strategies such as emulation ● Normalization relies on format policies based on an analysis of the significant characteristics of file formats – A format policy indicates the actions, tools and settings to apply to a file of a particular file format (e.g. normalization to preservation and/or access format)
  • 5. Archivematica format policies ● Criteria for selecting default formats: – Non-proprietary – Freely available specifications – Widely used/endorsed by major repositories – No compression/lossless video compression – Tools available to write and render the format ● Format policies will change as community standards, practices and tools evolve.
  • 7. Format confessional ● iPres 2012 Toronto CurateCamp
  • 8. 24-hour CurateCamp/OPF worldwide file id hackathon ● participants from every time zone ● rapid, iterative testing ● OpenFITS (github fork) ● Open Planets Format Corpus (github)
  • 12. Format Policy Registry - FPR ● To share and test format choices and tool commands for normalization – ie “format policies” ● To support the community-wide evolution of best practices ● Hosted at archivematica.org/FPR with local dashboard customization and updates
  • 13. Format Policy Registry - FPR ● Allow users to choose: ● normalization tool ● preservation and/or access format ● file properties (eg resolution, bitrate) ● to use local tool/process outside of the system ● Allow users to add new format policies