A survey of web-based art resources with
findings applicable to FARL electronic records
collection development
Alison Rhonemus, LIS 698, Seminar and Practicum, Dr. Tula Giannini
Frick Art Reference Library
Deborah Kempe, Chief, Collections Management & Access
Web Survey and Collection Development
Coffee on the terrace
M-LEAD-TWO
Intern enterprises -
"collection assessments, digital resource surveys,
web archiving, provide support for important
consortial programs such as shared resources"
● Brooklyn Museum: Mark Daly, Ronnette Hope,
Project Manager: Emily Atwater
● NYARC Latin American Resources (MOMA):
Ralph Baylor
● FARL: Gretchen Nadasky, Alison Rhonemus
Frick Art Reference Library
In early 2011, the Frick Art Reference Library
and the Thomas J. Watson Library at The
Metropolitan Museum of Art completed a pilot
project to address coordinated collecting of
born-digital auction catalogs using ContentDM
and Archive-It.
The FARL web archiving program is situated in Collection Development.
Current plans for website capture include online auction catalogs and art web resources
cataloged by NYARC.
Fellow M-LEAD-TWO intern Gretchen Nadasky has just described online auction
catalogs.
My project focused on NYARC-cataloged websites.
Web Archiving
“The Internet Archive is already doing it.”
Actually, the IA is providing the tools for
other institutions to use in archiving.
ARCHIVE - IT
uses open source tools developed by the
Internet Archive
● Heritrix Web Crawler
● Wayback Interface
● WARC format, an ISO standard
Quality Assurance
the report and manual checks
Partner and WAYBACK interface
Evaluating seeds
• Password-protected sites: cannot be archived
• JavaScript: more complicated implementations can be
difficult to capture and display. Ongoing area of
development.
• Videos: difficulty with some proprietary formats
• Form- and database-driven content: may be archived
using a sitemap or other direct links to the content.
Robots.txt Blocks
By default, the crawler respects all robots.txt files. Check
post-crawl reports for blocked seeds or documents.
If your site is blocked:
a) Contact the site owner and ask if they will unblock it
b) Ask your Partner Specialist to turn on the “ignore robots”
feature in your account
Notes:
/ denotes single directory seed
subdomains.archive.org (add individually or expand seed)
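A robots.txt block can also be spotted before a crawl is run. The sketch below is not part of Archive-It. It uses Python's standard urllib.robotparser on robots.txt text that has already been fetched, and the user agent name "example-crawler" is a placeholder assumption, not the crawler's real identifier.

```python
from typing import List
from urllib.robotparser import RobotFileParser


def blocked_seeds(robots_txt: str, seeds: List[str],
                  user_agent: str = "example-crawler") -> List[str]:
    """Return the seeds that the given robots.txt text disallows.

    user_agent is a placeholder; substitute the name your crawler announces.
    """
    parser = RobotFileParser()
    # parse() takes the lines of an already-downloaded robots.txt file,
    # so no network access is needed here.
    parser.parse(robots_txt.splitlines())
    return [seed for seed in seeds if not parser.can_fetch(user_agent, seed)]


if __name__ == "__main__":
    robots = "User-agent: *\nDisallow: /private/\n"
    seeds = [
        "http://example.org/",
        "http://example.org/private/catalog.html",
    ]
    for seed in blocked_seeds(robots, seeds):
        print("blocked:", seed)
```

Seeds reported here are candidates for option a) or b) above. The post-crawl report remains the authoritative record of what was actually blocked.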
Site Survey Criteria
● html/flash/pdf
● images
● embedded material
● links
● directories and subdomains
● terms, rights statements and permissions
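One way to record these criteria consistently is a small structured record per surveyed site. The sketch below is hypothetical: the field names, the password_protected field, and the rules in red_flags are illustrative assumptions drawn loosely from the capture difficulties noted earlier, not the worksheet actually used in the survey.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Set


@dataclass
class SiteSurvey:
    """One surveyed resource; fields mirror the criteria listed above."""
    url: str
    formats: Set[str] = field(default_factory=set)        # e.g. {"html", "flash", "pdf"}
    has_images: bool = False
    embedded_material: List[str] = field(default_factory=list)  # e.g. ["video"]
    external_links: int = 0
    subdomains: List[str] = field(default_factory=list)
    rights_statement: Optional[str] = None
    password_protected: bool = False                       # hypothetical extra field


def red_flags(site: SiteSurvey) -> List[str]:
    """Collect illustrative warning signs for web archiving (assumed rules)."""
    flags = []
    if site.password_protected:
        flags.append("password protected")
    if "video" in site.embedded_material:
        flags.append("embedded video")
    if site.rights_statement is None:
        flags.append("no rights statement found")
    return flags


if __name__ == "__main__":
    site = SiteSurvey(url="http://example.org/", formats={"html", "flash"},
                      embedded_material=["video"])
    print(red_flags(site))
```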
Obvious ruse
More of the obvious
Sites created without the intention of
being archived are the sites in need of
archiving.
Survey Says
● 257 cataloged entries
● 168 resources are possible to capture
● 82 resources would require more research or
display definite red flags for web archiving.
● PDFs are available for at least some of the
content in 75 resources.
● Flash was an element in 23 resources
● 16 sites used HTML5
● 54 used a CMS like Drupal or WordPress
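The counts above are easier to compare as shares of the 257 cataloged entries. The short sketch below does only that arithmetic. The labels are paraphrases of the bullets, and the categories overlap, so the percentages are not expected to sum to 100.

```python
TOTAL_ENTRIES = 257

# Counts copied from the survey bullets above.
counts = {
    "possible to capture": 168,
    "more research or red flags": 82,
    "PDF available for some content": 75,
    "Flash element": 23,
    "HTML5": 16,
    "CMS such as Drupal or WordPress": 54,
}


def share(count: int, total: int = TOTAL_ENTRIES) -> float:
    """Percentage of cataloged entries, rounded to one decimal place."""
    return round(100 * count / total, 1)


shares = {label: share(n) for label, n in counts.items()}

if __name__ == "__main__":
    for label, pct in shares.items():
        print(f"{label}: {counts[label]} of {TOTAL_ENTRIES} ({pct}%)")
```

The first two categories together account for 250 of the 257 entries. The slides do not say how the remaining seven were classified.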
There were 3 cataloged resources no longer
available on the live web but viewable through
Internet Archive.
Another 2 defunct resources were not available
through Internet Archive.
The main page for one of these lost resources was
available as a snapshot in WAYBACK but the actual
cataloged resource was not available.
Change is Constant
Archive-It Updates:
● Heritrix 1 series to Heritrix 3 series
(February)
● Archive-It 4.8
(May)
Archive-It 4.8
Plans
● Upcoming grants
● Capture of NYARC institution websites
● Include Wayback interface links in
Arcade catalog records
● Continue to identify websites for
capture and implement capture
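Adding Wayback links to catalog records comes down to building a stable URL from the original address and a capture time. The sketch below uses the public Wayback Machine URL pattern (base, 14-digit timestamp, original URL). The function name is hypothetical, and an Archive-It collection would need its own base URL, which the slides do not give and which is passed in here as a parameter.

```python
from datetime import datetime
from typing import Optional

# Public Wayback Machine base; a partner collection would use a different
# base, supplied by the caller (assumption, not stated in the slides).
WAYBACK_BASE = "https://web.archive.org/web"


def wayback_link(url: str, captured: Optional[datetime] = None,
                 base: str = WAYBACK_BASE) -> str:
    """Build a Wayback link for a catalog record.

    With a capture time, the link points at that snapshot; without one,
    "*" asks Wayback to list all captures of the URL.
    """
    stamp = captured.strftime("%Y%m%d%H%M%S") if captured else "*"
    return f"{base}/{stamp}/{url}"


if __name__ == "__main__":
    print(wayback_link("http://www.frick.org/", datetime(2012, 5, 1, 12, 0, 0)))
    print(wayback_link("http://www.frick.org/"))
```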
Conclusions
○ Digital resources not prevalent enough to
reassign current staff
○ Website capture most costly in terms of staff time
○ Copyright continues to be an issue
○ Long term digital preservation needs yet to be
assessed
○ Capture of Frick Collection and NYARC sites will
pose a challenging test case
