Processing Born-Digital “Papers” @ Stanford Glynn Edwards, RBMS, Baton Rouge, LA - 2011
Collections in the late 1990s: Apple Computer Inc. records; Douglas Engelbart papers; Stephen Cabrinety collection. By 2000, over 7,000 items of legacy computer media received as part of hybrid collections. Now over 26,000 items recorded during accessioning process.
Tracking Computer Media (then)
First Digital Lives Research Conference: Personal Digital Archives for the 21st Century
Hardware: FRED (Forensic Recovery of Evidence Device, Digital Intelligence). Software: FTK suite (AccessData), EnCase
AIMS Born-Digital Collections: An Inter-Institutional Model for Stewardship. University of Virginia; Yale University; Hull University; Stanford University. Funded by the Andrew W. Mellon Foundation
Robert Creeley papers; Stephen Jay Gould papers; Keith Henson papers re: Project Xanadu; Peter Rutledge Koch papers
Stephen Jay Gould: Influential American paleontologist, evolutionary biologist and historian of science, Gould began his career at Harvard University in 1967 and worked until his death in 2002. 98 3 ½” floppy diskettes; 61 5 ¼” floppy diskettes; 4 sets of punch cards; 3 computer tapes
Dear Peter, Unfortunately we do not manufacture any motherboards now a days which can support the 5.25 floppy. The interface are different than 3.5 and they are becoming obsolete and are no longer available on the newer motherboards.
Capture: success & failure
Trial One – processing using Explorer
Trial Two – processing using FTK
http://accessdata.com/downloads/media/Recognized%20File%20Types%20FTK1%207-24-08.pdf
Viewing the files will NOT change last accessed dates
Easy to use user interface for creating “bookmarks” for hierarchical information (series, subseries)
Using “labels” on groups of files for descriptive metadata
Pattern & full-text searches – e.g. looking for restricted content such as credit cards, ss#, student grades, etc.
Output to XML
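The pattern searches listed above (credit cards, SS#) can be illustrated outside FTK with a few regular expressions. This is a minimal sketch, not FTK's own search feature: the function name `flag_restricted`, both patterns, and the sample text are assumptions made for illustration, and a real review would still need a person to confirm each hit.

```python
import re

# Hypothetical patterns: a US Social Security number written as 123-45-6789,
# and a run of 13-16 digits optionally separated by spaces or hyphens.
SSN_RE = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")
CARD_RE = re.compile(r"\b\d(?:[ -]?\d){12,15}\b")


def luhn_ok(digits: str) -> bool:
    """Return True if the digit string passes the Luhn checksum."""
    total = 0
    for i, ch in enumerate(reversed(digits)):
        d = int(ch)
        if i % 2 == 1:  # double every second digit, counting from the right
            d *= 2
            if d > 9:
                d -= 9
        total += d
    return total % 10 == 0


def flag_restricted(text: str) -> dict[str, list[str]]:
    """Collect candidate SSNs and Luhn-valid card numbers found in text."""
    cards = []
    for match in CARD_RE.finditer(text):
        digits = re.sub(r"\D", "", match.group())
        if luhn_ok(digits):  # drop digit runs that cannot be card numbers
            cards.append(digits)
    return {"ssn": SSN_RE.findall(text), "card": cards}


sample = "SSN 123-45-6789; card 4111 1111 1111 1111; order 1234 5678 9012 3456"
hits = flag_restricted(sample)
print(hits)
```

The Luhn check is there to cut false positives: the order number in the sample has sixteen digits but fails the checksum, so only the well-known test card number is reported.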
View Files in “obsolete” file format
Create Hierarchy using “bookmarks”
Administrative & descriptive metadata
Series 6: Gould’s Born-Digital Material
Series 6: Processing Note
Other collections, other issues
Robert Creeley’s original media, processed via FTK:
Identified 50K emails
Identified 8 files related to health records
Identified 69 files with SS#
More born-digital material received - May 2011 addenda:
7 computers
3 zip drives

Editor's Notes

  1. [INTRO] A little over two years ago, a few elements converged and a core group of us at Stanford began to get more serious about developing a viable method for processing the born-digital “papers” in our collections. Most of my talk is centered on our first trials, but I’d first like to describe the context and pressures at SUL that put us on this path …
  2. The major pressure was the growing quantity of legacy media in our “backlog” … With Stanford situated in Silicon Valley, it’s no big surprise that we have a lot of computer collections that contain old legacy media. Hence our acquisition in the late 1990s of the records of Apple Computer Inc., the papers of Douglas Engelbart and a really large collection of computer games and software. [images: mouse/Engelbart, box of Atari games/Cabrinety]
  3. Because of those very acquisitions, in 1997, the Manuscripts Division began tracking the incoming quantities – just an overall count – of legacy computer media contained in new accessions. By the end of the decade we had recorded over 7,000 “items”. Increasingly our b-d material comes from faculty, artists, writers, organizations. Today, we have over 26,000 items of legacy media recorded in our backlog. [Univ. Archives has ~700 listed]
  4. The other element was an event in February 2009. A staff member on our digital team (Michael Olson), who had previously worked in Manuscripts, attended the Digital Lives Project’s first conference at the British Library. Two things occurred: he heard about a study* done at the B.L. on data loss in legacy computer media (3% per year), and he saw that the B.L. was exploring the use of forensic tools for capturing data from media. Based on this, and coupled with the weight of our growing backlog of media, we decided on two courses of action. *Rory McLeod, “Risk Assessment; using a risk based approach to prioritise handheld digital information”, 2008
  5. First we purchased forensic hardware and software to enable us to capture and view legacy media and files. Hardware: FRED, from Digital Intelligence. Software: we purchased and tested both FTK and EnCase forensic software. This formed the nucleus of our digital lab … And yet, most forensic equipment is geared toward current/modern media. So, we searched eBay for old floppy disk drives to use with FRED.
  6. Next, we partnered with 3 other institutions (U. Va., Yale and Hull) - as part of the AIMS Project - funded by the Andrew W. Mellon Foundation. (AIMS Born-Digital Collections: An Inter-Institutional Model for Stewardship) The goals of the project were to process b-d material from 13 (mostly legacy) collections and to deliver the b-d material in some fashion by the end of the grant.
  7. Each repository hired a Digital Archivist. Peter Chan was hired at SUL in January 2010 and began actual work on imaging the disks for our 4 collections and trying out various methods for “processing” the data. We chose collections that contained different types of media and content. They are: Robert Creeley papers (poet – mostly email, some writing); Stephen Jay Gould papers (paleontologist, author – writing and some data sets); Peter Rutledge Koch papers (fine press printer – a mix of files: email, text, image, and design files like Adobe’s InDesign; this was the only collection with files transferred directly from a donor’s current computer); Xanadu Project records (early hypertext project – software program on 6 hard drives).
  8. It is now 1.5 years later and we have created a viable (although under constant development) workflow for accessioning and processing b-d materials using forensic tools. This is the more detailed workflow for collections that would be “fully” processed… We are also working on a minimal processing workflow and one that would fully “accession” the data – i.e. remove from physical storage media - and store for later processing. 
  9. One of the collections that has informed our development of b-d practice is that of Stephen Jay Gould, which contains both paper (analog) – over 500 linear feet – and 3 cartons of digital material: 98 3.5-inch floppy disks; 61 5.25-inch floppy disks; 4 sets of punch cards; 3 computer tapes. In total, over 550 linear feet have been rec’d in 8 accessions. His papers and the audio/video are being processed concurrently – by archivist Jenny Johnson - and will be done this August. This month, the processing team discovered another 5 cartons of punch cards in the 2008 accn (21 sets). [This recent find won’t be resolved by end of grant]
  10. Using the two different capture stations – FRED & the floppy/zip station – we created disk images of all the disks. 8 sets of punch cards were successfully read by our neighbors at the Computer History Museum; 1 set was unreadable, as it had no sorting key. We also began tracking our own loss statistics - “success or failure” of captures - in a spreadsheet, which we link to our accession records in Archivists’ Toolkit. The loss rate for floppies in Gould is 5%; loss in other collections was higher. Creeley: 6% loss: 1 out of 12 CDs unreadable; 3 out of 53 floppies unreadable. [1987-2004?] Xanadu: 4 of 6 hard drives inoperable – or 67% damaged. [PC’s report: There were mechanical or electrical problems with other drives (one didn't spin after it was powered up and one gave a "dong" sound after it was powered up). We are not sure what the problems with the remaining two drives are – they do spin after power up but we cannot access the data.] Cost to recover ~ $10,000 (2.5K / drive)
  11. To process the materials during our initial trial, we used Windows Explorer. Folders were created that mirrored “series” and “titles” in EAD, and files were moved from the original media folder into the appropriate place. This however changed data associated with the files – such as original path, etc. At this point, Peter Chan attended a week-long session on the use of forensic software at Digital Intelligence – focusing on FTK. While FTK is much more robust than we needed for archival work, he decided that many of its tools could easily be adapted for archival processing. We discovered that this practice mirrored work beginning at both the BL and Oxford.
  12. Technical metadata for the disk images are displayed here. They are arranged by floppy disk and display file format (where identifiable), file size, checksum, creation dates, etc. One can change the view to add additional columns, such as duplicate or primary file, etc.
  13. The embedded viewer in FTK – from the same company that does Quick View Plus – allows you to quickly see the contents of many of the files
  14. Here are two quick screen shots showing archival HIERARCHY using FTK’s “bookmark” feature. Series or Subseries can be added as metadata to individual or groups of files by highlighting or checking the boxes of the files in the lower panel.
  15. Description for the three different formats in Gould will be merged at the end of the summer or early fall – paper, audio/video and born-digital files – but the level of description will be different. Gould’s papers are processed to the folder level for most of the collection. The audio and video are listed at the item level to facilitate any future digitization. The born-digital material will have Series level description with notes about original media, capture and processing methods, lost/damaged media, and delivery methods.
  16. Here is a partial view of our working draft for processing notes for Gould b-d “series”
  17. We encountered different issues in our other “AIMS” collections - the main one I will mention is the Robert Creeley collection… His papers originally contained 53 floppies, 5 zip disks, and 3 CDs. Initially the computer media was segregated into a separate collection – but will need to be merged into the main collection record and finding aid in the fall. After processing the disk images with FTK, we: identified 50K emails; identified 8 files related to health records; identified 69 files with SS#. A recent addendum complicates the processing of Creeley’s born-digital material: rec’d in May 2011, it contains b-d media that will need to be processed – and may allow us to have a more complete set of emails, drafts, etc. It includes: 7 computers; 3 zip drives; 121 optical discs; 422 3.5-inch floppy diskettes; 1 Zip 250 USB Drive; 1 Olympus C-4000 Camedia Digital Camera & flash cards; 1 20-gigabyte iPod. We have yet to analyze the data in the new accession and compare it to the original data, but two issues cropped up: How to process and deliver multiple computers over a creator’s life cycle? Data was captured from various CDs and computers to create an overview of the b-d material before transfer to SUL – what got changed in the process?! [image from wikipedia taken by Elsa Dorman]
  18. In processing initial computer media, PC used folder titles on the disks as keywords for files
  19. Using Creeley’s initial text data, we have worked with two individuals – one working in the Digital Humanities (Elijah Meeks) – who took the header info from the 50K emails and created a network graph: header information from Robert Creeley’s 50,000+ emails emphasizing the connection between the poet and Gerard Malanga.
  20. To wrap up: Donors & users expect us to acquire, organize, preserve and provide access to b-d collections. Special Collections staff capture, appraise, arrange and describe b-d materials AND contribute to requirements for both access and delivery as well as arrangement and description tools. Our digital group will preserve in our preservation repository (SDR) and provide public access and invite participation – Hypatia (under development).
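The capture loss figures quoted in note 10 can be checked with a few lines of arithmetic. The sketch below is only an illustration of that bookkeeping, not the spreadsheet the team actually kept: the `CaptureTally` class and its field names are hypothetical, and the counts are the ones given in note 10 for Creeley and Xanadu (the Gould 5% figure is not recomputed because the notes do not give the number of failed floppies).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CaptureTally:
    """Hypothetical record of capture attempts for one group of media."""
    label: str
    attempted: int
    failed: int

    @property
    def loss_rate(self) -> float:
        # Fraction of media items whose capture failed
        return self.failed / self.attempted


# Counts taken from note 10: 1 of 12 CDs and 3 of 53 floppies unreadable
creeley = CaptureTally("Creeley (CDs + floppies)", attempted=12 + 53, failed=1 + 3)
# Counts taken from note 10: 4 of 6 hard drives inoperable
xanadu = CaptureTally("Xanadu hard drives", attempted=6, failed=4)

for tally in (creeley, xanadu):
    print(f"{tally.label}: {tally.failed}/{tally.attempted} = {tally.loss_rate:.0%}")

# Recovery estimate from note 10: about $2,500 per inoperable drive
recovery_cost = xanadu.failed * 2500
print(f"Estimated Xanadu recovery cost: ${recovery_cost:,}")
```

Rounded to whole percentages, these counts reproduce the 6% and 67% figures in the notes, and four drives at $2,500 each gives the quoted $10,000.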