A dry run of content I planned to present at an Australian Society of Archivists workshop on 21 October 2016.
This trial run took place at Archives New Zealand on 28 September 2016.
4. 2014-06-20: Play It Again Conference Report:
http://bit.ly/2d8Bnw0
(playitagain.org)
2014-11-25: The Reality of Digital Transfer:
http://bit.ly/2ctxocQ
(slideshare.net)
5. We (Archives NZ) have got quite far... But
there's still a lot more to do...
6. So let's remind ourselves: What is the point?
● Work in concert with agencies and their consultants.
● Generate better information and records management
● Cleaner transfers...
● Create a more open and transparent government where the digital record is concerned...
● DIA's line... Support New Zealanders to build strong communities by providing access to trusted information and knowledge.
7. And! Digital Preservation
● At this point in time, idiomatic methods of preservation are still forming...
● Whatever the future of archival custodianship...
● Or the future of digital preservation...
● Techniques need to be developed to support agencies with information and records management, and memory institutions with long-term custodianship.
● Don't fall into the processing trap...
8. What can we identify as important?
● Infrastructure/team, supported by the organisation
● Some things work, some don't; some change... be flexible.
● Work iteratively...
● Look at what you can do...
● Continue to develop... evidence, real use-cases
11. Policy...
● Has been a constant in my time here.
● Was a draw to me starting in NZ
● Sets the rules by which we can play...
● Literally, play: bend, don't break
● Achieved through careful stakeholder consultation and consideration of impact.
● Sign-off process at director level.
● Two favourite policies: checksum and pre-conditioning.
12. Team...
● We could always do with more people...
● But we recognise that we've been allowed more folk dedicated to this than some places.
● The team is supported in their decision making and their skills.
● Breakdown: Curious; driven; up-to-date; drive to 'solve' born-digital transfer; different but complementary skills... *passion*!
● (And opinionated! ;-) )
● It doesn't always look that way but there is a certain amount of leeway from IT support too...
13. Technology...?
Rosetta by Ex Libris is the long-term preservation system. It allows us to manage some quite complex bits 'n' pieces... but:
● Does not yet enable transfer from Agency-to-Archives (it supports)
● Is not a clearing house for records
● Spot preservation risks up-front
● Doesn't 'do' sentencing...
● Does not build ingest packages...
● Does not 'do' archival description...
● Does not contain every tool under the sun to handle all the file formats...
Machine Learning: http://nautil.us/blog/the-fundamental-limits-of-machine-learning
14. The processes we need are biased toward transfer and ingest...
Rosetta can only help so much...
Creation ||--- Life of a record, ~25 years ---|| Transfer ---------- Life of an archive, ~∞ ----------||
The other processes we will still need will be about (active) long-term custodianship...
Rosetta is still only beginning that journey...
15. The miscellany in this presentation...
A story about the tools that can help us...
● Technical Registries (of practice)
● DROID/Siegfried Analysis Report
● Fuzzy Hashes
18. With everything we need to do...
We cannot action it all at the same time...
19. Knowledge needs to remain alive and accessible, record it:
Source: https://commons.wikimedia.org/wiki/Category:Kanban#/media/File:Simple_Task_Kanban.jpg
22. DROID/Siegfried Analysis Report
● Example of changing needs and capability
● Initially a plain-text reporting tool
● Evolved into a 'team' tool...
● Evolving into an organisation's tool...
● Hopefully a community tool...
● Our first port of call for any transfer...
* Marriage of DROID and Siegfried: http://bit.ly/2ddS0IP
* A little bit more about the tool: http://bit.ly/2dii3jP
23. DROID/Siegfried Analysis Report
● Available to all the community (December 2013): http://bit.ly/2cB8gFY
● Maps DROID and Siegfried output to an SQLite database for querying power and speed.
● Aside from Python, ZERO dependencies: the user needs to be able to download it and go...
● Complete flexibility over output.
● TXT, HTML, Rogues, Heroes... Normalization via database layer: write your own!
● Normalization via database layer: abstracted for multiple ID tools
● The tools each do what they're supposed to do well; the dissection of output can be left to others.
* Marriage of DROID and Siegfried (OPF Blog): http://bit.ly/2ddS0IP
* A little bit more about the tool (OPF Blog): http://bit.ly/2dii3jP
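To give a feel for what "mapping identification output to SQLite" can buy you, here is a minimal sketch. It is not the tool's actual schema or code: the column names are assumed to follow a DROID-style CSV export, the sample rows are invented for illustration, and the helper names (`load_rows`, `format_counts`, `unidentified`) are hypothetical.

```python
import csv
import io
import sqlite3

# Hypothetical rows for illustration only; a real run would read a CSV
# export from an identification tool instead of this inline sample.
SAMPLE_CSV = """NAME,SIZE,EXT,PUID,FORMAT_NAME
report.pdf,20480,pdf,fmt/18,Acrobat PDF 1.4 - Portable Document Format
minutes.doc,10240,doc,fmt/40,Microsoft Word Document
scan.pdf,40960,pdf,fmt/18,Acrobat PDF 1.4 - Portable Document Format
notes.dat,512,dat,,
"""


def load_rows(csv_text):
    """Load identification results into an in-memory SQLite table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE files "
        "(name TEXT, size INTEGER, ext TEXT, puid TEXT, format_name TEXT)"
    )
    reader = csv.DictReader(io.StringIO(csv_text))
    conn.executemany(
        "INSERT INTO files VALUES (?, ?, ?, ?, ?)",
        [
            # Empty identifier columns become NULL so they are easy to query.
            (r["NAME"], int(r["SIZE"]), r["EXT"],
             r["PUID"] or None, r["FORMAT_NAME"] or None)
            for r in reader
        ],
    )
    return conn


def format_counts(conn):
    """Count identified files per format identifier, most frequent first."""
    return conn.execute(
        "SELECT puid, COUNT(*) FROM files WHERE puid IS NOT NULL "
        "GROUP BY puid ORDER BY COUNT(*) DESC, puid"
    ).fetchall()


def unidentified(conn):
    """List the files that received no format identification."""
    rows = conn.execute("SELECT name FROM files WHERE puid IS NULL ORDER BY name")
    return [row[0] for row in rows]


conn = load_rows(SAMPLE_CSV)
print(format_counts(conn))
print(unidentified(conn))
```

Once the rows are in a database, each new report (TXT, HTML, or a list of problem files) is just another query over the same table, which is roughly the flexibility the slide is pointing at.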
28. Benefits...
● Sets a baseline for a lingua franca... beginners and experts alike...
● Definitions contributed by our archivists!
● Easier on the eye
● Re-factored to be more flexible
● Give it a try! Let us know how it goes!
31. Checksums
● Looking to be unique
● De-duplication
● Fixity
● No connection between
● Security function
● Cannot reverse
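A small sketch of those checksum properties, using Python's standard `hashlib`. The file names and contents are invented for illustration, SHA-256 is my choice of algorithm rather than anything the slides specify, and the helper names are hypothetical.

```python
import hashlib


def checksum(data: bytes) -> str:
    """Return the SHA-256 digest of the data as a hex string."""
    return hashlib.sha256(data).hexdigest()


def fixity_ok(data: bytes, recorded: str) -> bool:
    """Fixity check: does the data still match the checksum recorded earlier?"""
    return checksum(data) == recorded


def find_duplicates(files: dict) -> list:
    """De-duplication: group file names that share exactly the same checksum."""
    groups = {}
    for name, data in files.items():
        groups.setdefault(checksum(data), []).append(name)
    return [names for names in groups.values() if len(names) > 1]


# Hypothetical files: one exact copy, and one copy with a single character changed.
files = {
    "a.txt": b"Minutes of the meeting, 28 September 2016.",
    "copy-of-a.txt": b"Minutes of the meeting, 28 September 2016.",
    "a-edited.txt": b"Minutes of the meeting, 29 September 2016.",
}

recorded = checksum(files["a.txt"])
print(find_duplicates(files))
print(fixity_ok(files["a.txt"], recorded))
print(checksum(files["a.txt"]))
print(checksum(files["a-edited.txt"]))
```

The exact copy is caught, but the edited file gets a digest that shows no visible relationship to the original, even though only one character differs. That is the "no connection between" property: good for fixity and security, no help for finding near-duplicates.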
32. But every file has a connection...
● Binary
● File Format
● Textual Content
● Embedded Content
● Template
● Author
● Like DNA, with many different strands to dissect...
● Fuzzy Hashing!
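To make the idea concrete, here is a deliberately simplified, toy piecewise hash. It is not ssdeep, TLSH, or any algorithm named in this talk: the chunking rule, the `window` and `modulus` parameters, and the use of `difflib` for comparison are all assumptions chosen to keep the sketch short. The point is only that hashing a file in content-defined pieces lets a small edit change a small part of the signature.

```python
import hashlib
import random
from difflib import SequenceMatcher

ALPHABET = "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/"


def _chunk_char(chunk: bytes) -> str:
    """Reduce one chunk to a single signature character."""
    return ALPHABET[hashlib.sha256(chunk).digest()[0] % len(ALPHABET)]


def toy_fuzzy_hash(data: bytes, window: int = 7, modulus: int = 16) -> str:
    """Cut the data wherever a rolling window sum hits a trigger value,
    then emit one character per chunk."""
    signature = []
    chunk_start = 0
    for i in range(len(data)):
        window_sum = sum(data[max(0, i - window + 1):i + 1])
        if window_sum % modulus == modulus - 1:
            signature.append(_chunk_char(data[chunk_start:i + 1]))
            chunk_start = i + 1
    if chunk_start < len(data):
        signature.append(_chunk_char(data[chunk_start:]))
    return "".join(signature)


def similarity(sig_a: str, sig_b: str) -> float:
    """Score two signatures between 0.0 (nothing shared) and 1.0 (identical)."""
    return SequenceMatcher(None, sig_a, sig_b, autojunk=False).ratio()


# Synthetic data standing in for files: an original, an edited copy, a stranger.
rng = random.Random(2016)
original = bytes(rng.randrange(256) for _ in range(4000))
edited = original[:2000] + b"a small insertion" + original[2000:]
unrelated = bytes(rng.randrange(256) for _ in range(4000))

score_edited = similarity(toy_fuzzy_hash(original), toy_fuzzy_hash(edited))
score_unrelated = similarity(toy_fuzzy_hash(original), toy_fuzzy_hash(unrelated))
print(score_edited, score_unrelated)
```

Because chunk boundaries depend on the content rather than on fixed offsets, the signature re-synchronises shortly after the inserted bytes, so the edited copy scores far closer to the original than the unrelated data does.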
35. And they look like...
● aad371039d588b43e02887f87e570f6d2b1a7f1da89667ef11227d9b3e706610d8e12d
● 0dc36013dd088b43e02983f87e534e6d2b1a7f1da88627ef11267d8b3e716610d9e16d
● Not that different from regular checksums!
● But help us to demonstrate a closer relationship between files...
● "The sum of the parts is greater than the whole."
~ Arist!otle
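A quick way to see the "closer relationship" in the two example digests above is to line them up and count the positions that agree. This is only a crude eyeball measure, not the comparison function of any real fuzzy hashing algorithm, and it assumes any line wrap in the slide is just layout, so each digest is treated as a single 70-character string.

```python
# The two example digests from the slide, each as one string.
DIGEST_A = "aad371039d588b43e02887f87e570f6d2b1a7f1da89667ef11227d9b3e706610d8e12d"
DIGEST_B = "0dc36013dd088b43e02983f87e534e6d2b1a7f1da88627ef11267d8b3e716610d9e16d"


def matching_positions(a: str, b: str) -> int:
    """Count the character positions at which two strings agree."""
    return sum(x == y for x, y in zip(a, b))


def difference_map(a: str, b: str) -> str:
    """Mark agreeing positions with '.' and differing positions with '^'."""
    return "".join("." if x == y else "^" for x, y in zip(a, b))


same = matching_positions(DIGEST_A, DIGEST_B)
print(DIGEST_A)
print(DIGEST_B)
print(difference_map(DIGEST_A, DIGEST_B))
print(f"{same} of {len(DIGEST_A)} positions agree")
```

Two ordinary checksums of related files would agree only by chance at a handful of positions. Here long runs line up, which is what lets these digests say something about how close two files are.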
40. How can we use this?
● Sentencing... while still teaching our machines, we can still close the net while looking at records manually...
● Discovery: Amazon-like results: You might also like this record!
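The "you might also like this record" idea can be sketched as a ranking over similarity scores. Everything here is hypothetical: the record identifiers, the signature strings, the `threshold` value, and the use of `difflib` as a stand-in for a real fuzzy hash comparison.

```python
from difflib import SequenceMatcher

# Hypothetical record identifiers mapped to made-up similarity signatures.
SIGNATURES = {
    "R1": "kQxT9mLpA3vZ",
    "R2": "kQxT9mLpB3vZ",
    "R3": "kQxT9mYYYYYY",
    "R4": "0000aaaa1111",
}


def similar_records(query_id, signatures, top_n=3, threshold=0.4):
    """Return up to top_n (record, score) pairs most similar to the query record."""
    query = signatures[query_id]
    scored = []
    for record_id, signature in signatures.items():
        if record_id == query_id:
            continue
        score = SequenceMatcher(None, query, signature, autojunk=False).ratio()
        if score >= threshold:
            scored.append((record_id, round(score, 2)))
    # Highest score first; ties keep their original order.
    return sorted(scored, key=lambda pair: pair[1], reverse=True)[:top_n]


suggestions = similar_records("R1", SIGNATURES)
print(suggestions)
```

The same ranking could support sentencing: an archivist who has appraised one record manually gets a shortlist of likely relatives to look at next, rather than the whole transfer.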
41. The experiment continues...
● Matches are relative to themselves...
● Algorithms make a difference...
● And perhaps, like genetics... some traits are more dominant than others...
● Consider working with content in different ways...
● Utilize format bias... normalize
● Separate content from structure and analyse?
● Keep trying things, but at minimum cost... (another agile concept: minimum viable product)
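One possible reading of "separate content from structure" is to strip a format's markup before comparing, so that two renditions of the same text are judged on their words rather than their wrappers. The sketch below is an assumption about what such an experiment might look like, using only the standard library; the sample documents and helper names are invented.

```python
from difflib import SequenceMatcher
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collect only the text nodes of an HTML document."""

    def __init__(self):
        super().__init__()
        self.parts = []

    def handle_data(self, data):
        self.parts.append(data)


def normalise(text: str) -> str:
    """Collapse whitespace and case so layout differences do not count."""
    return " ".join(text.split()).lower()


def html_to_text(markup: str) -> str:
    """Drop the tags and return the normalised textual content."""
    parser = TextExtractor()
    parser.feed(markup)
    parser.close()
    return normalise(" ".join(parser.parts))


def ratio(a: str, b: str) -> float:
    return SequenceMatcher(None, a, b, autojunk=False).ratio()


# Two hypothetical renditions of the same content.
html_version = "<h1>Transfer Report</h1>\n<p>All files were <b>identified</b> today</p>"
text_version = "Transfer report\n\nAll files were identified today"

raw_score = ratio(html_version, text_version)
content_score = ratio(html_to_text(html_version), normalise(text_version))
print(raw_score, content_score)
```

Compared as raw strings the two renditions look only partly alike; compared on normalised content they are identical. A real experiment would need a per-format extractor, which is where the cost question on the slide comes in.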
43. Conclusion: A bit more miscellany
● Keyword: Interim
● Our needs change constantly, and there's a lot to do...
● Don't suffer paralysis by analysis.
● Do a requirements analysis
● Look at what you can do (minimum viable product) and iterate...
44. Conclusion: A bit more miscellany
● Lots of hints to bits 'n' pieces I haven't been able to talk about:
● Role of the community... (They/We're here to help! Same problems!)
● Communication and sharing... (Do it!)
● Software development skills... (There are other ways to be involved)
What's the point? (OPF Blog): http://bit.ly/2ddXnaY
● Maybe also a seed for discussion.