SlideShare uma empresa Scribd logo
1 de 37
1
SAA Description Section
Portland, 26 July 2017
Best Practices for Descriptive
Metadata for Web Archiving
Recommendations of the OCLC Research Library Partnership
Web Archiving Metadata Working Group
Alexis Antracoli, Kate Bowers, Jackie Dooleyoc.lc/wam
2
THE PROBLEM
3
3
Absence of community best practices for descriptive
metadata was the most widely-shared web archiving challenge
identified in two surveys:
– OCLC Research Library Partnership (2015)
– Weber/Graham study of users of archived website (2016)
4
4
5
5
6
OCLC RESEARCH LIBRARY PARTNERSHIP
WEB ARCHIVING METADATA
WORKING GROUP
7
7
Objectives
• Recommend best practices for descriptive metadata for
archived websites that are community-neutral and that
address user needs (NOT a standard; displaces no standards)
• Identify the most relevant data elements and provide
guidance on formulating content
• Bridge bibliographic and archival practices
• Address the differing characteristics of item- and collection-
level descriptions
8
8
Potential user groups
• Practitioners who use standards but want guidance on formulating
content for description of archived single sites and collections
– Dublin Core (used by Archive-It) is a structure standard, not a content standard
– Some archivists want web-specific guidance to supplement DACS
– Those who want to export MARC records to a less granular data structure (such
as Archive-It)
– RDA (library content standard) focuses on transcription of live single sites, but
websites lack standard elements and are not amenable to transcription
• Libraries and archives using a digital asset management system (DAMS),
eschewing descriptive standards; brief records, scalable approach
• Individuals lacking metadata experience who build or contribute to web
archives
9
9
Feedback received
• How much? A lot!
• Mostly positive; three negative comments
–Positive: premise accepted; suggestions for improvement
–Negative: premise rejected, including because …
• Archival (two comments, including TS-DACS): recommendations are
too bibliographic, don’t conform to DACS
• Bibliographic (one comment): recommendations are too archival
LIT REVIEWS & OTHER RESEARCH
Bailey et al. Ben-David & Huurdeman Bernstein Bragg & Hanna Costa
Costa & Gomes Costa & Silva Cruz & Gomes Dougherty & Meyer Galligan
Gatenby Gibbons Goel Goethals Guenther Hartman et al. Hockx-Yu
Jackson Jones & Shankar Lavoie & Gartner Leetaru Mannheimer Masanès
Milligan Murray & Hsieh Neubert Niu O’Dell Peterson Phillips & Koerbin
Pregill Prom & Swain Ras & van Bussel Reynolds Riley & Crookston
Stirling et al. Sweetser Taylor Thomas et al. Thurman & O’Hanlon
Tillinghast Truman Weber & Graham Webster Wuet et al. Zhang et al.
Who uses web archives?
Digital humanists
Web scientists
Computer scientists
Data analysts
Journalists
Lawyers
Website owners
Website designers
Government employees
Genealogists
Patent applicants
Instructors
Students
Linguists
Sociologists
Political scientists
Historians
Anthropologists
Users need “provenance” metadata
• “The critical missing piece”
• Provide context
• Why was the content archived?
• Selection criteria
• Scope
Note: WAM focused solely on descriptive metadata, not technical or preservation metadata.
Metadata practitioner needs
• Archival and bibliographic approaches
• DACS, EAD, MARC, Dublin Core, RDA, MODS
• Data elements vary widely
• Same element name, multiple meanings
• Level of description
– Single site, collection of sites, seed URLs
• Scalability and limited resources
14
14
Best practices methodology
• Analyze metadata standards & institutional guidelines
– RDA (libraries), DACS (archives), Dublin Core (simplified)
• Evaluate existing metadata records “in the wild”
– ArchiveGrid, Archive-It, WorldCat
• Identify dilemmas specific to web archiving
• Incorporate findings from literature reviews
• Prepare data dictionary and report narrative
General finding: extreme inconsistencies in existing practice
15
WEB-SPECIFIC DILEMMAS
16
16
• Is the website creator/owner the … publisher? author? subject?
• Should the title be … transcribed verbatim from the head of the
site? Edited to clarify the nature/scope of the site? Append e.g. "web
archive” for a collection?
• Which dates are important/feasible other than capture dates?
Beginning/end of the site's existence? Date of the content?
Copyright?
• How should extent/size be expressed? 6.25 Gb? 300 websites?
1 archived website? 1 online resource?
• Is the host institution that harvests and manages the
archived content the repository? creator? publisher? selector?
17
17
• Is it important to clearly state that the resource is a website? If so,
where? In the title? description? extent statement? all of these?
• Does provenance refer to …the site owner? the repository that
harvests and hosts the site? ways in which the site evolved?
• Does appraisal mean …the reason the site warrants being archived?
a collection of sites named by the repository? the parts of the site that were
harvested?
• Which URLs should be included? Seed? access? landing page?
18
THE ARCHIVE-IT METADATA
INTERFACE
19
Archive-It, the most widely
used web archiving
platform, provides for
application of DC
metadata for collections
and seeds
20
Detail of the Collection-level metadata editing interface
21
Archive-It has support for non-DC fields
At the seed level, “grab title” can pull a title from HTML
22
Bulk options include downloading all seed metadata
23
RECOMMENDED BEST PRACTICES
24
24
WAM data elements
* = 9 of 14 element names/meanings match Dublin Core
Collector Extent Source of Description
Contributor * Genre/Form Subject *
Creator * Language * Title *
Date * Relation * URL
Description * Rights *
25
25
Data dictionary inclusion criteria
• Includes common elements used for identification and discovery of all
types of resource (e.g., Creator, Date, Subject, Title)
• Other elements must have clear applicability to archived websites (e.g.
Rights, Description, URL)
• Elements excluded that rarely (if ever) appear in guidelines and/or extant
metadata records and have no web-specific meaning (e.g. audience,
publisher, statement of responsibility)
26
26
Data element features
• Element name
• Definition
• Usage note
• Examples
• Crosswalk
27
27
Collector
Definition: The organization responsible for curation and stewardship of an
archived website or collection.
Use Collector for the organization that selects the web content for archiving, creates
metadata and performs other activities associated with “ownership” of a resource.
Stated another way, this is the organization that has taken responsibility for the archived
content, although the digital files are not necessarily stored and maintained by this
organization (collections harvested using Archive-It are a prominent example).
No equivalent in Dublin Core.
28
28
Collector: Lifecycle activities
Institutions involved in web archiving engage in a variety of activities during the lifecycle
of archiving web content. We identified four activities performed by the institution that
assumes responsibility for archiving web content:
• Selecting websites for archiving
• Harvesting the content of the designated seed URLs
• Creating and maintaining metadata to describe the content
• Making decisions about other aspects of collections management, including how
the harvested files will be preserved and how will access be provided.
29
29
Collector: Examples
Creator: Seattle (Wash.)
Title: City of Seattle Harvested Websites
Collector: Seattle Municipal Archives
-============
Title: Globalchange.gov
Contributor: U.S. Global Change Research Program
Collector: Federal Depository Library Program
============
Creator: Association for Research into Crimes Against Art
Title: ARCAblog : promoting the study and research of art crime and cultural
heritage protection
Collector: New York Art Resources Consortium
30
30
Collector: Crosswalks
Crosswalks
Dublin Core Contributor
EAD <repository>
MARC 524
852 subfield a
852 subfield b
MODS <location>
schema.org schema:OwnershipInfo
31
31
Source of description
Definition: Information about the gathering or creation of the metadata
itself, such as sources of data or the date on which source data was
obtained.
Source of Information is used to identify the source of all or some of the
metadata, particularly for descriptions of single sites. Basic aspects of a
website (creator name, title, etc.) may change significantly, but the
responsible institution is unlikely to have the resources to become aware of
changes, let alone update the metadata. Include the date on which the site
was examined and the location from which the information was taken.
No equivalent in Dublin Core.
32
32
Source of description: Examples
Description based on archived web page captured Sept. 22, 2016; title from
title screen (viewed Oct. 27, 2016)
Title from home page last updated June 21, 2012 (viewed June 22, 2012)
Title from home page (viewed on Oct. 11, 2007)
Title from HTML header (viewed Feb. 16, 2006)
33
33
Source of description: Crosswalks
34
FORTHCOMING PUBLICATIONS
35
35
Three simultaneous reports (autumn 2017)
● Best practices for descriptive metadata
○ With data dictionary
● User needs
○ With annotated bibliography
● Tools
○ With evaluation grids
Send comments by August 4th!
36
Q&A
37
SM
Jackie Dooley
Program Officer, OCLC Research
dooleyj@oclc.org
@minniedw
SAA Description Section
26 July 2017
©2016 OCLC. This work is licensed under a Creative Commons Attribution 4.0 International License. Suggested attribution:
“This work uses content from Developing Best Practices for Web Archiving Metadata to Meet User Needs
© OCLC, used under a Creative Commons Attribution 4.0 International License:
http://creativecommons.org/licenses/by/4.0/.”
For more information, please contact:
oc.lc/wam

Mais conteúdo relacionado

Mais procurados

The network reshapes the research library collection
The network reshapes the research library collectionThe network reshapes the research library collection
The network reshapes the research library collectionlisld
 
Full Spectrum Stewardship of the Scholarly Record by Brian E. C. Schottlaende...
Full Spectrum Stewardship of the Scholarly Record by Brian E. C. Schottlaende...Full Spectrum Stewardship of the Scholarly Record by Brian E. C. Schottlaende...
Full Spectrum Stewardship of the Scholarly Record by Brian E. C. Schottlaende...Charleston Conference
 
Open Context and Publishing to the Web of Data: Eric Kansa's LAWDI Presentation
Open Context and Publishing to the Web of Data: Eric Kansa's LAWDI PresentationOpen Context and Publishing to the Web of Data: Eric Kansa's LAWDI Presentation
Open Context and Publishing to the Web of Data: Eric Kansa's LAWDI Presentationekansa
 
The facilitated collection: collections and collecting in a network environment
The facilitated collection: collections and collecting in a network environmentThe facilitated collection: collections and collecting in a network environment
The facilitated collection: collections and collecting in a network environmentlisld
 
The library in the life of the user
The library in the life of the userThe library in the life of the user
The library in the life of the userlisld
 
Social metadata for libraries, archives and museums: Research findings from t...
Social metadata for libraries, archives and museums: Research findings from t...Social metadata for libraries, archives and museums: Research findings from t...
Social metadata for libraries, archives and museums: Research findings from t...Rose Holley
 
OCLC Research Update at ALA Chicago. June 26, 2017.
OCLC Research Update at ALA Chicago. June 26, 2017.OCLC Research Update at ALA Chicago. June 26, 2017.
OCLC Research Update at ALA Chicago. June 26, 2017.OCLC
 
Exploring a world of networked information built from free-text metadata
Exploring a world of networked information built from free-text metadataExploring a world of networked information built from free-text metadata
Exploring a world of networked information built from free-text metadataShenghui Wang
 
Next Generation Technical Services May 2009 Calhoun
Next Generation Technical Services May 2009 CalhounNext Generation Technical Services May 2009 Calhoun
Next Generation Technical Services May 2009 CalhounKaren S Calhoun
 
IASSIT Kansa Presentation
IASSIT Kansa PresentationIASSIT Kansa Presentation
IASSIT Kansa Presentationekansa
 
Towards collaboration at scale: Libraries, the social and the technical
Towards collaboration at scale:  Libraries, the social and the technicalTowards collaboration at scale:  Libraries, the social and the technical
Towards collaboration at scale: Libraries, the social and the technicallisld
 

Mais procurados (20)

The network reshapes the research library collection
The network reshapes the research library collectionThe network reshapes the research library collection
The network reshapes the research library collection
 
Full Spectrum Stewardship of the Scholarly Record by Brian E. C. Schottlaende...
Full Spectrum Stewardship of the Scholarly Record by Brian E. C. Schottlaende...Full Spectrum Stewardship of the Scholarly Record by Brian E. C. Schottlaende...
Full Spectrum Stewardship of the Scholarly Record by Brian E. C. Schottlaende...
 
Open Context and Publishing to the Web of Data: Eric Kansa's LAWDI Presentation
Open Context and Publishing to the Web of Data: Eric Kansa's LAWDI PresentationOpen Context and Publishing to the Web of Data: Eric Kansa's LAWDI Presentation
Open Context and Publishing to the Web of Data: Eric Kansa's LAWDI Presentation
 
Thompson 6-jun15-final
Thompson 6-jun15-finalThompson 6-jun15-final
Thompson 6-jun15-final
 
The facilitated collection: collections and collecting in a network environment
The facilitated collection: collections and collecting in a network environmentThe facilitated collection: collections and collecting in a network environment
The facilitated collection: collections and collecting in a network environment
 
Lauruhn-5-jun15
Lauruhn-5-jun15Lauruhn-5-jun15
Lauruhn-5-jun15
 
The library in the life of the user
The library in the life of the userThe library in the life of the user
The library in the life of the user
 
NISO/DCMI September 25 Webinar: Implementing Linked Data in Developing Countr...
NISO/DCMI September 25 Webinar: Implementing Linked Data in Developing Countr...NISO/DCMI September 25 Webinar: Implementing Linked Data in Developing Countr...
NISO/DCMI September 25 Webinar: Implementing Linked Data in Developing Countr...
 
April 8 NISO Webinar: Experimenting with BIBFRAME: Reports from Early Adopters
April 8 NISO Webinar: Experimenting with BIBFRAME: Reports from Early AdoptersApril 8 NISO Webinar: Experimenting with BIBFRAME: Reports from Early Adopters
April 8 NISO Webinar: Experimenting with BIBFRAME: Reports from Early Adopters
 
McDanold-1-jun15
McDanold-1-jun15McDanold-1-jun15
McDanold-1-jun15
 
Social metadata for libraries, archives and museums: Research findings from t...
Social metadata for libraries, archives and museums: Research findings from t...Social metadata for libraries, archives and museums: Research findings from t...
Social metadata for libraries, archives and museums: Research findings from t...
 
OCLC Research Update at ALA Chicago. June 26, 2017.
OCLC Research Update at ALA Chicago. June 26, 2017.OCLC Research Update at ALA Chicago. June 26, 2017.
OCLC Research Update at ALA Chicago. June 26, 2017.
 
NISO Webinar: The Future of Integrated Library Systems PART 2: User Interaction
NISO Webinar: The Future of Integrated Library Systems PART 2: User InteractionNISO Webinar: The Future of Integrated Library Systems PART 2: User Interaction
NISO Webinar: The Future of Integrated Library Systems PART 2: User Interaction
 
Exploring a world of networked information built from free-text metadata
Exploring a world of networked information built from free-text metadataExploring a world of networked information built from free-text metadata
Exploring a world of networked information built from free-text metadata
 
Next Generation Technical Services May 2009 Calhoun
Next Generation Technical Services May 2009 CalhounNext Generation Technical Services May 2009 Calhoun
Next Generation Technical Services May 2009 Calhoun
 
Wacker-4-june15
Wacker-4-june15Wacker-4-june15
Wacker-4-june15
 
IASSIT Kansa Presentation
IASSIT Kansa PresentationIASSIT Kansa Presentation
IASSIT Kansa Presentation
 
Towards collaboration at scale: Libraries, the social and the technical
Towards collaboration at scale:  Libraries, the social and the technicalTowards collaboration at scale:  Libraries, the social and the technical
Towards collaboration at scale: Libraries, the social and the technical
 
Connecting the Dots: Constellations in the Linked Data Universe
Connecting the Dots: Constellations in the Linked Data UniverseConnecting the Dots: Constellations in the Linked Data Universe
Connecting the Dots: Constellations in the Linked Data Universe
 
Lawless-3-jun15
Lawless-3-jun15Lawless-3-jun15
Lawless-3-jun15
 

Semelhante a Best Practices for Web Archiving Metadata

NISO access related projects (presented at the Charleston conference 2016)
NISO access related projects (presented at the Charleston conference 2016)NISO access related projects (presented at the Charleston conference 2016)
NISO access related projects (presented at the Charleston conference 2016)Christine Stohn
 
Crossref LIVE Indonesia: An Introduction to Crossref, CRLIVE-ID 13 July 2021
Crossref LIVE Indonesia: An Introduction to Crossref, CRLIVE-ID 13 July 2021Crossref LIVE Indonesia: An Introduction to Crossref, CRLIVE-ID 13 July 2021
Crossref LIVE Indonesia: An Introduction to Crossref, CRLIVE-ID 13 July 2021Crossref
 
Resource discovery and information sharing: reaching the 2.0 turn
Resource discovery and information sharing: reaching the 2.0 turnResource discovery and information sharing: reaching the 2.0 turn
Resource discovery and information sharing: reaching the 2.0 turnBonaria Biancu
 
Cooperative Cataloging Projects: Managing Them for Best Results
Cooperative Cataloging Projects: Managing Them for Best ResultsCooperative Cataloging Projects: Managing Them for Best Results
Cooperative Cataloging Projects: Managing Them for Best ResultsNASIG
 
Research Data Publishing
Research Data PublishingResearch Data Publishing
Research Data PublishingBrian Hole
 
Collab365 - We Need to Talk: How to Converse with Regular People About Managi...
Collab365 - We Need to Talk: How to Converse with Regular People About Managi...Collab365 - We Need to Talk: How to Converse with Regular People About Managi...
Collab365 - We Need to Talk: How to Converse with Regular People About Managi...Jonathan Ralton
 
Summary of Trends in Cataloging
Summary of Trends in CatalogingSummary of Trends in Cataloging
Summary of Trends in CatalogingWilliam Worford
 
Slides anu talkwebarchivingaug2012
Slides anu talkwebarchivingaug2012Slides anu talkwebarchivingaug2012
Slides anu talkwebarchivingaug2012Roxanne Missingham
 
On building a search interface discovery system
On building a search interface discovery systemOn building a search interface discovery system
On building a search interface discovery systemDenis Shestakov
 
Linked data for Libraries
Linked data for LibrariesLinked data for Libraries
Linked data for Librariesrobin fay
 
The commitment of arabic sites in the field of libraries and information that...
The commitment of arabic sites in the field of libraries and information that...The commitment of arabic sites in the field of libraries and information that...
The commitment of arabic sites in the field of libraries and information that...Alexander Decker
 
ECS19 - Marc Anderson - Managing Content Types in the Modern World
ECS19 - Marc Anderson - Managing Content Types in the Modern WorldECS19 - Marc Anderson - Managing Content Types in the Modern World
ECS19 - Marc Anderson - Managing Content Types in the Modern WorldEuropean Collaboration Summit
 
ECS2019 - Managing Content Types in the Modern World
ECS2019 - Managing Content Types in the Modern WorldECS2019 - Managing Content Types in the Modern World
ECS2019 - Managing Content Types in the Modern WorldMarc D Anderson
 
Wikis, Rubrics and Views: An Integrated Approach to Improving Documentation
Wikis, Rubrics and Views: An Integrated Approach to Improving DocumentationWikis, Rubrics and Views: An Integrated Approach to Improving Documentation
Wikis, Rubrics and Views: An Integrated Approach to Improving DocumentationTed Habermann
 
The Web of data and web data commons
The Web of data and web data commonsThe Web of data and web data commons
The Web of data and web data commonsJesse Wang
 
Content Registration, Crossref ALJEBI, Indonesia
Content Registration, Crossref ALJEBI, IndonesiaContent Registration, Crossref ALJEBI, Indonesia
Content Registration, Crossref ALJEBI, IndonesiaCrossref
 
METADATA: A PRACTICE AND ITS SERVICES TOWARDS DIGITAL ENVIRONMENT
METADATA: A PRACTICE AND ITS SERVICES TOWARDS DIGITAL ENVIRONMENTMETADATA: A PRACTICE AND ITS SERVICES TOWARDS DIGITAL ENVIRONMENT
METADATA: A PRACTICE AND ITS SERVICES TOWARDS DIGITAL ENVIRONMENTVikas Bhushan
 
Internationalising South African Scholarly Journals
Internationalising South African Scholarly Journals Internationalising South African Scholarly Journals
Internationalising South African Scholarly Journals KidsintheCloud
 
Next Generation Repositories
Next Generation RepositoriesNext Generation Repositories
Next Generation Repositoriesukcorr
 

Semelhante a Best Practices for Web Archiving Metadata (20)

NISO access related projects (presented at the Charleston conference 2016)
NISO access related projects (presented at the Charleston conference 2016)NISO access related projects (presented at the Charleston conference 2016)
NISO access related projects (presented at the Charleston conference 2016)
 
Crossref LIVE Indonesia: An Introduction to Crossref, CRLIVE-ID 13 July 2021
Crossref LIVE Indonesia: An Introduction to Crossref, CRLIVE-ID 13 July 2021Crossref LIVE Indonesia: An Introduction to Crossref, CRLIVE-ID 13 July 2021
Crossref LIVE Indonesia: An Introduction to Crossref, CRLIVE-ID 13 July 2021
 
Resource discovery and information sharing: reaching the 2.0 turn
Resource discovery and information sharing: reaching the 2.0 turnResource discovery and information sharing: reaching the 2.0 turn
Resource discovery and information sharing: reaching the 2.0 turn
 
Cooperative Cataloging Projects: Managing Them for Best Results
Cooperative Cataloging Projects: Managing Them for Best ResultsCooperative Cataloging Projects: Managing Them for Best Results
Cooperative Cataloging Projects: Managing Them for Best Results
 
Research Data Publishing
Research Data PublishingResearch Data Publishing
Research Data Publishing
 
Collab365 - We Need to Talk: How to Converse with Regular People About Managi...
Collab365 - We Need to Talk: How to Converse with Regular People About Managi...Collab365 - We Need to Talk: How to Converse with Regular People About Managi...
Collab365 - We Need to Talk: How to Converse with Regular People About Managi...
 
Summary of Trends in Cataloging
Summary of Trends in CatalogingSummary of Trends in Cataloging
Summary of Trends in Cataloging
 
Slides anu talkwebarchivingaug2012
Slides anu talkwebarchivingaug2012Slides anu talkwebarchivingaug2012
Slides anu talkwebarchivingaug2012
 
On building a search interface discovery system
On building a search interface discovery systemOn building a search interface discovery system
On building a search interface discovery system
 
Linked data for Libraries
Linked data for LibrariesLinked data for Libraries
Linked data for Libraries
 
The commitment of arabic sites in the field of libraries and information that...
The commitment of arabic sites in the field of libraries and information that...The commitment of arabic sites in the field of libraries and information that...
The commitment of arabic sites in the field of libraries and information that...
 
ECS19 - Marc Anderson - Managing Content Types in the Modern World
ECS19 - Marc Anderson - Managing Content Types in the Modern WorldECS19 - Marc Anderson - Managing Content Types in the Modern World
ECS19 - Marc Anderson - Managing Content Types in the Modern World
 
ECS2019 - Managing Content Types in the Modern World
ECS2019 - Managing Content Types in the Modern WorldECS2019 - Managing Content Types in the Modern World
ECS2019 - Managing Content Types in the Modern World
 
Wikis, Rubrics and Views: An Integrated Approach to Improving Documentation
Wikis, Rubrics and Views: An Integrated Approach to Improving DocumentationWikis, Rubrics and Views: An Integrated Approach to Improving Documentation
Wikis, Rubrics and Views: An Integrated Approach to Improving Documentation
 
Demo nov 2011
Demo nov 2011Demo nov 2011
Demo nov 2011
 
The Web of data and web data commons
The Web of data and web data commonsThe Web of data and web data commons
The Web of data and web data commons
 
Content Registration, Crossref ALJEBI, Indonesia
Content Registration, Crossref ALJEBI, IndonesiaContent Registration, Crossref ALJEBI, Indonesia
Content Registration, Crossref ALJEBI, Indonesia
 
METADATA: A PRACTICE AND ITS SERVICES TOWARDS DIGITAL ENVIRONMENT
METADATA: A PRACTICE AND ITS SERVICES TOWARDS DIGITAL ENVIRONMENTMETADATA: A PRACTICE AND ITS SERVICES TOWARDS DIGITAL ENVIRONMENT
METADATA: A PRACTICE AND ITS SERVICES TOWARDS DIGITAL ENVIRONMENT
 
Internationalising South African Scholarly Journals
Internationalising South African Scholarly Journals Internationalising South African Scholarly Journals
Internationalising South African Scholarly Journals
 
Next Generation Repositories
Next Generation RepositoriesNext Generation Repositories
Next Generation Repositories
 

Mais de OCLC

Communicating library impact beyond library walls: Findings from an action-or...
Communicating library impact beyond library walls: Findings from an action-or...Communicating library impact beyond library walls: Findings from an action-or...
Communicating library impact beyond library walls: Findings from an action-or...OCLC
 
"You can just tell whether a website looks reliable or not." People's modes o...
"You can just tell whether a website looks reliable or not." People's modes o..."You can just tell whether a website looks reliable or not." People's modes o...
"You can just tell whether a website looks reliable or not." People's modes o...OCLC
 
Factors influencing research data management programs.
Factors influencing research data management programs.Factors influencing research data management programs.
Factors influencing research data management programs.OCLC
 
Teaching research methods in LIS programs: Approaches, formats, and innovativ...
Teaching research methods in LIS programs: Approaches, formats, and innovativ...Teaching research methods in LIS programs: Approaches, formats, and innovativ...
Teaching research methods in LIS programs: Approaches, formats, and innovativ...OCLC
 
OCLC ALISE Library & Information Science Research Grant Program
OCLC ALISE Library & Information Science Research Grant ProgramOCLC ALISE Library & Information Science Research Grant Program
OCLC ALISE Library & Information Science Research Grant ProgramOCLC
 
Investing in library users and potential users: The Many Faces of Digital Vi...
 Investing in library users and potential users: The Many Faces of Digital Vi... Investing in library users and potential users: The Many Faces of Digital Vi...
Investing in library users and potential users: The Many Faces of Digital Vi...OCLC
 
Academic library impact: Improving practice and essential areas to research
Academic library impact: Improving practice and essential areas to researchAcademic library impact: Improving practice and essential areas to research
Academic library impact: Improving practice and essential areas to researchOCLC
 
Studying information behavior: The Many Faces of Digital Visitors and Residents
Studying information behavior: The Many Faces of Digital Visitors and ResidentsStudying information behavior: The Many Faces of Digital Visitors and Residents
Studying information behavior: The Many Faces of Digital Visitors and ResidentsOCLC
 
Online engagement and information literacy: The Many Face of Digital Visitors...
Online engagement and information literacy: The Many Face of Digital Visitors...Online engagement and information literacy: The Many Face of Digital Visitors...
Online engagement and information literacy: The Many Face of Digital Visitors...OCLC
 
People's mode of online engagement: The Many Faces of Digital Visitors and R...
 People's mode of online engagement: The Many Faces of Digital Visitors and R... People's mode of online engagement: The Many Faces of Digital Visitors and R...
People's mode of online engagement: The Many Faces of Digital Visitors and R...OCLC
 
Applying research methods: Investigating the Many Faces of Digital Visitors &...
Applying research methods: Investigating the Many Faces of Digital Visitors &...Applying research methods: Investigating the Many Faces of Digital Visitors &...
Applying research methods: Investigating the Many Faces of Digital Visitors &...OCLC
 
OCLC RLP @ RLUK
OCLC RLP @ RLUKOCLC RLP @ RLUK
OCLC RLP @ RLUKOCLC
 
Using Qualitative Methods for Library Evaluation: An Interactive Workshop
Using Qualitative Methods for Library Evaluation: An Interactive WorkshopUsing Qualitative Methods for Library Evaluation: An Interactive Workshop
Using Qualitative Methods for Library Evaluation: An Interactive WorkshopOCLC
 
Visitors and Residents: The Hows and Whys of Engagement with Technology
Visitors and Residents: The Hows and Whys of Engagement with TechnologyVisitors and Residents: The Hows and Whys of Engagement with Technology
Visitors and Residents: The Hows and Whys of Engagement with TechnologyOCLC
 
Action-Oriented Research Agenda on Library Contributions to Student Learning ...
Action-Oriented Research Agenda on Library Contributions to Student Learning ...Action-Oriented Research Agenda on Library Contributions to Student Learning ...
Action-Oriented Research Agenda on Library Contributions to Student Learning ...OCLC
 
Visitors and Residents: Interactive Mapping Exercise Workshop
Visitors and Residents: Interactive Mapping Exercise WorkshopVisitors and Residents: Interactive Mapping Exercise Workshop
Visitors and Residents: Interactive Mapping Exercise WorkshopOCLC
 
The Library in the Life of the User
The Library in the Life of the UserThe Library in the Life of the User
The Library in the Life of the UserOCLC
 
Where are We Going and What Do We Do Next? Demonstrating the Value of Academi...
Where are We Going and What Do We Do Next? Demonstrating the Value of Academi...Where are We Going and What Do We Do Next? Demonstrating the Value of Academi...
Where are We Going and What Do We Do Next? Demonstrating the Value of Academi...OCLC
 
Changing Tack: A Future-Focused ACRL Research Agenda
Changing Tack: A Future-Focused ACRL Research AgendaChanging Tack: A Future-Focused ACRL Research Agenda
Changing Tack: A Future-Focused ACRL Research AgendaOCLC
 
Qualitative Research Methods in LIS
Qualitative Research Methods in LISQualitative Research Methods in LIS
Qualitative Research Methods in LISOCLC
 

Mais de OCLC (20)

Communicating library impact beyond library walls: Findings from an action-or...
Communicating library impact beyond library walls: Findings from an action-or...Communicating library impact beyond library walls: Findings from an action-or...
Communicating library impact beyond library walls: Findings from an action-or...
 
"You can just tell whether a website looks reliable or not." People's modes o...
"You can just tell whether a website looks reliable or not." People's modes o..."You can just tell whether a website looks reliable or not." People's modes o...
"You can just tell whether a website looks reliable or not." People's modes o...
 
Factors influencing research data management programs.
Factors influencing research data management programs.Factors influencing research data management programs.
Factors influencing research data management programs.
 
Teaching research methods in LIS programs: Approaches, formats, and innovativ...
Teaching research methods in LIS programs: Approaches, formats, and innovativ...Teaching research methods in LIS programs: Approaches, formats, and innovativ...
Teaching research methods in LIS programs: Approaches, formats, and innovativ...
 
OCLC ALISE Library & Information Science Research Grant Program
OCLC ALISE Library & Information Science Research Grant ProgramOCLC ALISE Library & Information Science Research Grant Program
OCLC ALISE Library & Information Science Research Grant Program
 
Investing in library users and potential users: The Many Faces of Digital Vi...
 Investing in library users and potential users: The Many Faces of Digital Vi... Investing in library users and potential users: The Many Faces of Digital Vi...
Investing in library users and potential users: The Many Faces of Digital Vi...
 
Academic library impact: Improving practice and essential areas to research
Academic library impact: Improving practice and essential areas to researchAcademic library impact: Improving practice and essential areas to research
Academic library impact: Improving practice and essential areas to research
 
Studying information behavior: The Many Faces of Digital Visitors and Residents
Studying information behavior: The Many Faces of Digital Visitors and ResidentsStudying information behavior: The Many Faces of Digital Visitors and Residents
Studying information behavior: The Many Faces of Digital Visitors and Residents
 
Online engagement and information literacy: The Many Face of Digital Visitors...
Online engagement and information literacy: The Many Face of Digital Visitors...Online engagement and information literacy: The Many Face of Digital Visitors...
Online engagement and information literacy: The Many Face of Digital Visitors...
 
People's mode of online engagement: The Many Faces of Digital Visitors and R...
 People's mode of online engagement: The Many Faces of Digital Visitors and R... People's mode of online engagement: The Many Faces of Digital Visitors and R...
People's mode of online engagement: The Many Faces of Digital Visitors and R...
 
Applying research methods: Investigating the Many Faces of Digital Visitors &...
Applying research methods: Investigating the Many Faces of Digital Visitors &...Applying research methods: Investigating the Many Faces of Digital Visitors &...
Applying research methods: Investigating the Many Faces of Digital Visitors &...
 
OCLC RLP @ RLUK
OCLC RLP @ RLUKOCLC RLP @ RLUK
OCLC RLP @ RLUK
 
Using Qualitative Methods for Library Evaluation: An Interactive Workshop
Using Qualitative Methods for Library Evaluation: An Interactive WorkshopUsing Qualitative Methods for Library Evaluation: An Interactive Workshop
Using Qualitative Methods for Library Evaluation: An Interactive Workshop
 
Visitors and Residents: The Hows and Whys of Engagement with Technology
Visitors and Residents: The Hows and Whys of Engagement with TechnologyVisitors and Residents: The Hows and Whys of Engagement with Technology
Visitors and Residents: The Hows and Whys of Engagement with Technology
 
Action-Oriented Research Agenda on Library Contributions to Student Learning ...
Action-Oriented Research Agenda on Library Contributions to Student Learning ...Action-Oriented Research Agenda on Library Contributions to Student Learning ...
Action-Oriented Research Agenda on Library Contributions to Student Learning ...
 
Visitors and Residents: Interactive Mapping Exercise Workshop
Visitors and Residents: Interactive Mapping Exercise WorkshopVisitors and Residents: Interactive Mapping Exercise Workshop
Visitors and Residents: Interactive Mapping Exercise Workshop
 
The Library in the Life of the User
The Library in the Life of the UserThe Library in the Life of the User
The Library in the Life of the User
 
Where are We Going and What Do We Do Next? Demonstrating the Value of Academi...
Where are We Going and What Do We Do Next? Demonstrating the Value of Academi...Where are We Going and What Do We Do Next? Demonstrating the Value of Academi...
Where are We Going and What Do We Do Next? Demonstrating the Value of Academi...
 
Changing Tack: A Future-Focused ACRL Research Agenda
Changing Tack: A Future-Focused ACRL Research AgendaChanging Tack: A Future-Focused ACRL Research Agenda
Changing Tack: A Future-Focused ACRL Research Agenda
 
Qualitative Research Methods in LIS
Qualitative Research Methods in LISQualitative Research Methods in LIS
Qualitative Research Methods in LIS
 

Último

Roles & Responsibilities in Pharmacovigilance
Roles & Responsibilities in PharmacovigilanceRoles & Responsibilities in Pharmacovigilance
Roles & Responsibilities in PharmacovigilanceSamikshaHamane
 
How to Add Barcode on PDF Report in Odoo 17
How to Add Barcode on PDF Report in Odoo 17How to Add Barcode on PDF Report in Odoo 17
How to Add Barcode on PDF Report in Odoo 17Celine George
 
INTRODUCTION TO CATHOLIC CHRISTOLOGY.pptx
INTRODUCTION TO CATHOLIC CHRISTOLOGY.pptxINTRODUCTION TO CATHOLIC CHRISTOLOGY.pptx
INTRODUCTION TO CATHOLIC CHRISTOLOGY.pptxHumphrey A Beña
 
ANG SEKTOR NG agrikultura.pptx QUARTER 4
ANG SEKTOR NG agrikultura.pptx QUARTER 4ANG SEKTOR NG agrikultura.pptx QUARTER 4
ANG SEKTOR NG agrikultura.pptx QUARTER 4MiaBumagat1
 
Judging the Relevance and worth of ideas part 2.pptx
Judging the Relevance  and worth of ideas part 2.pptxJudging the Relevance  and worth of ideas part 2.pptx
Judging the Relevance and worth of ideas part 2.pptxSherlyMaeNeri
 
Keynote by Prof. Wurzer at Nordex about IP-design
Keynote by Prof. Wurzer at Nordex about IP-designKeynote by Prof. Wurzer at Nordex about IP-design
Keynote by Prof. Wurzer at Nordex about IP-designMIPLM
 
How to do quick user assign in kanban in Odoo 17 ERP
How to do quick user assign in kanban in Odoo 17 ERPHow to do quick user assign in kanban in Odoo 17 ERP
How to do quick user assign in kanban in Odoo 17 ERPCeline George
 
Proudly South Africa powerpoint Thorisha.pptx
Proudly South Africa powerpoint Thorisha.pptxProudly South Africa powerpoint Thorisha.pptx
Proudly South Africa powerpoint Thorisha.pptxthorishapillay1
 
Karra SKD Conference Presentation Revised.pptx
Karra SKD Conference Presentation Revised.pptxKarra SKD Conference Presentation Revised.pptx
Karra SKD Conference Presentation Revised.pptxAshokKarra1
 
Inclusivity Essentials_ Creating Accessible Websites for Nonprofits .pdf
Inclusivity Essentials_ Creating Accessible Websites for Nonprofits .pdfInclusivity Essentials_ Creating Accessible Websites for Nonprofits .pdf
Inclusivity Essentials_ Creating Accessible Websites for Nonprofits .pdfTechSoup
 
GRADE 4 - SUMMATIVE TEST QUARTER 4 ALL SUBJECTS
GRADE 4 - SUMMATIVE TEST QUARTER 4 ALL SUBJECTSGRADE 4 - SUMMATIVE TEST QUARTER 4 ALL SUBJECTS
GRADE 4 - SUMMATIVE TEST QUARTER 4 ALL SUBJECTSJoshuaGantuangco2
 
DATA STRUCTURE AND ALGORITHM for beginners
DATA STRUCTURE AND ALGORITHM for beginnersDATA STRUCTURE AND ALGORITHM for beginners
DATA STRUCTURE AND ALGORITHM for beginnersSabitha Banu
 
Full Stack Web Development Course for Beginners
Full Stack Web Development Course  for BeginnersFull Stack Web Development Course  for Beginners
Full Stack Web Development Course for BeginnersSabitha Banu
 
Q4 English4 Week3 PPT Melcnmg-based.pptx
Q4 English4 Week3 PPT Melcnmg-based.pptxQ4 English4 Week3 PPT Melcnmg-based.pptx
Q4 English4 Week3 PPT Melcnmg-based.pptxnelietumpap1
 
ECONOMIC CONTEXT - LONG FORM TV DRAMA - PPT
ECONOMIC CONTEXT - LONG FORM TV DRAMA - PPTECONOMIC CONTEXT - LONG FORM TV DRAMA - PPT
ECONOMIC CONTEXT - LONG FORM TV DRAMA - PPTiammrhaywood
 
4.18.24 Movement Legacies, Reflection, and Review.pptx
4.18.24 Movement Legacies, Reflection, and Review.pptx4.18.24 Movement Legacies, Reflection, and Review.pptx
4.18.24 Movement Legacies, Reflection, and Review.pptxmary850239
 
ISYU TUNGKOL SA SEKSWLADIDA (ISSUE ABOUT SEXUALITY
ISYU TUNGKOL SA SEKSWLADIDA (ISSUE ABOUT SEXUALITYISYU TUNGKOL SA SEKSWLADIDA (ISSUE ABOUT SEXUALITY
ISYU TUNGKOL SA SEKSWLADIDA (ISSUE ABOUT SEXUALITYKayeClaireEstoconing
 
ACC 2024 Chronicles. Cardiology. Exam.pdf
ACC 2024 Chronicles. Cardiology. Exam.pdfACC 2024 Chronicles. Cardiology. Exam.pdf
ACC 2024 Chronicles. Cardiology. Exam.pdfSpandanaRallapalli
 

Último (20)

Roles & Responsibilities in Pharmacovigilance
Roles & Responsibilities in PharmacovigilanceRoles & Responsibilities in Pharmacovigilance
Roles & Responsibilities in Pharmacovigilance
 
How to Add Barcode on PDF Report in Odoo 17
How to Add Barcode on PDF Report in Odoo 17How to Add Barcode on PDF Report in Odoo 17
How to Add Barcode on PDF Report in Odoo 17
 
INTRODUCTION TO CATHOLIC CHRISTOLOGY.pptx
INTRODUCTION TO CATHOLIC CHRISTOLOGY.pptxINTRODUCTION TO CATHOLIC CHRISTOLOGY.pptx
INTRODUCTION TO CATHOLIC CHRISTOLOGY.pptx
 
FINALS_OF_LEFT_ON_C'N_EL_DORADO_2024.pptx
FINALS_OF_LEFT_ON_C'N_EL_DORADO_2024.pptxFINALS_OF_LEFT_ON_C'N_EL_DORADO_2024.pptx
FINALS_OF_LEFT_ON_C'N_EL_DORADO_2024.pptx
 
ANG SEKTOR NG agrikultura.pptx QUARTER 4
ANG SEKTOR NG agrikultura.pptx QUARTER 4ANG SEKTOR NG agrikultura.pptx QUARTER 4
ANG SEKTOR NG agrikultura.pptx QUARTER 4
 
Judging the Relevance and worth of ideas part 2.pptx
Judging the Relevance  and worth of ideas part 2.pptxJudging the Relevance  and worth of ideas part 2.pptx
Judging the Relevance and worth of ideas part 2.pptx
 
Keynote by Prof. Wurzer at Nordex about IP-design
Keynote by Prof. Wurzer at Nordex about IP-designKeynote by Prof. Wurzer at Nordex about IP-design
Keynote by Prof. Wurzer at Nordex about IP-design
 
How to do quick user assign in kanban in Odoo 17 ERP
How to do quick user assign in kanban in Odoo 17 ERPHow to do quick user assign in kanban in Odoo 17 ERP
How to do quick user assign in kanban in Odoo 17 ERP
 
Proudly South Africa powerpoint Thorisha.pptx
Proudly South Africa powerpoint Thorisha.pptxProudly South Africa powerpoint Thorisha.pptx
Proudly South Africa powerpoint Thorisha.pptx
 
Karra SKD Conference Presentation Revised.pptx
Karra SKD Conference Presentation Revised.pptxKarra SKD Conference Presentation Revised.pptx
Karra SKD Conference Presentation Revised.pptx
 
Inclusivity Essentials_ Creating Accessible Websites for Nonprofits .pdf
Inclusivity Essentials_ Creating Accessible Websites for Nonprofits .pdfInclusivity Essentials_ Creating Accessible Websites for Nonprofits .pdf
Inclusivity Essentials_ Creating Accessible Websites for Nonprofits .pdf
 
GRADE 4 - SUMMATIVE TEST QUARTER 4 ALL SUBJECTS
GRADE 4 - SUMMATIVE TEST QUARTER 4 ALL SUBJECTSGRADE 4 - SUMMATIVE TEST QUARTER 4 ALL SUBJECTS
GRADE 4 - SUMMATIVE TEST QUARTER 4 ALL SUBJECTS
 
DATA STRUCTURE AND ALGORITHM for beginners
DATA STRUCTURE AND ALGORITHM for beginnersDATA STRUCTURE AND ALGORITHM for beginners
DATA STRUCTURE AND ALGORITHM for beginners
 
Full Stack Web Development Course for Beginners
Full Stack Web Development Course  for BeginnersFull Stack Web Development Course  for Beginners
Full Stack Web Development Course for Beginners
 
Q4 English4 Week3 PPT Melcnmg-based.pptx
Q4 English4 Week3 PPT Melcnmg-based.pptxQ4 English4 Week3 PPT Melcnmg-based.pptx
Q4 English4 Week3 PPT Melcnmg-based.pptx
 
ECONOMIC CONTEXT - LONG FORM TV DRAMA - PPT
ECONOMIC CONTEXT - LONG FORM TV DRAMA - PPTECONOMIC CONTEXT - LONG FORM TV DRAMA - PPT
ECONOMIC CONTEXT - LONG FORM TV DRAMA - PPT
 
4.18.24 Movement Legacies, Reflection, and Review.pptx
4.18.24 Movement Legacies, Reflection, and Review.pptx4.18.24 Movement Legacies, Reflection, and Review.pptx
4.18.24 Movement Legacies, Reflection, and Review.pptx
 
ISYU TUNGKOL SA SEKSWLADIDA (ISSUE ABOUT SEXUALITY
ISYU TUNGKOL SA SEKSWLADIDA (ISSUE ABOUT SEXUALITYISYU TUNGKOL SA SEKSWLADIDA (ISSUE ABOUT SEXUALITY
ISYU TUNGKOL SA SEKSWLADIDA (ISSUE ABOUT SEXUALITY
 
ACC 2024 Chronicles. Cardiology. Exam.pdf
ACC 2024 Chronicles. Cardiology. Exam.pdfACC 2024 Chronicles. Cardiology. Exam.pdf
ACC 2024 Chronicles. Cardiology. Exam.pdf
 
Raw materials used in Herbal Cosmetics.pptx
Raw materials used in Herbal Cosmetics.pptxRaw materials used in Herbal Cosmetics.pptx
Raw materials used in Herbal Cosmetics.pptx
 

Best Practices for Web Archiving Metadata

  • 1. 1 SAA Description Section Portland, 26 July 2017 Best Practices for Descriptive Metadata for Web Archiving Recommendations of the OCLC Research Library Partnership Web Archiving Metadata Working Group Alexis Antracoli, Kate Bowers, Jackie Dooleyoc.lc/wam
  • 3. 3 3 Absence of community best practices for descriptive metadata was the most widely-shared web archiving challenge identified in two surveys: – OCLC Research Library Partnership (2015) – Weber/Graham study of users of archived website (2016)
  • 4. 4 4
  • 5. 5 5
  • 6. 6 OCLC RESEARCH LIBRARY PARTNERSHIP WEB ARCHIVING METADATA WORKING GROUP
  • 7. 7 7 Objectives • Recommend best practices for descriptive metadata for archived websites that are community-neutral and that address user needs (NOT a standard; displaces no standards) • Identify the most relevant data elements and provide guidance on formulating content • Bridge bibliographic and archival practices • Address the differing characteristics of item- and collection- level descriptions
  • 8. 8 8 Potential user groups • Practitioners who use standards but want guidance on formulating content for description of archived single sites and collections – Dublin Core (used by Archive-It) is a structure standard, not a content standard – Some archivists want web-specific guidance to supplement DACS – Those who want to export MARC records to a less granular data structure (such as Archive-It) – RDA (library content standard) focuses on transcription of live single sites, but websites lack standard elements and are not amenable to transcription • Libraries and archives using a digital asset management system (DAMS), eschewing descriptive standards; brief records, scalable approach • Individuals lacking metadata experience who build or contribute to web archives
  • 9. 9 9 Feedback received • How much? A lot! • Mostly positive; three negative comments –Positive: premise accepted; suggestions for improvement –Negative: premise rejected, including because … • Archival (two comments, including TS-DACS): recommendations are too bibliographic, don’t conform to DACS • Bibliographic (one comment): recommendations are too archival
  • 10. LIT REVIEWS & OTHER RESEARCH Bailey et al. Ben-David & Huurdeman Bernstein Bragg & Hanna Costa Costa & Gomes Costa & Silva Cruz & Gomes Dougherty & Meyer Galligan Gatenby Gibbons Goel Goethals Guenther Hartman et al. Hockx-Yu Jackson Jones & Shankar Lavoie & Gartner Leetaru Mannheimer Masanès Milligan Murray & Hsieh Neubert Niu O’Dell Peterson Phillips & Koerbin Pregill Prom & Swain Ras & van Bussel Reynolds Riley & Crookston Stirling et al. Sweetser Taylor Thomas et al. Thurman & O’Hanlon Tillinghast Truman Weber & Graham Webster Wuet et al. Zhang et al.
  • 11. Who uses web archives? Digital humanists Web scientists Computer scientists Data analysts Journalists Lawyers Website owners Website designers Government employees Genealogists Patent applicants Instructors Students Linguists Sociologists Political scientists Historians Anthropologists
  • 12. Users need “provenance” metadata • “The critical missing piece” • Provide context • Why was the content archived? • Selection criteria • Scope Note: WAM focused solely on descriptive metadata, not technical or preservation metadata.
  • 13. Metadata practitioner needs • Archival and bibliographic approaches • DACS, EAD, MARC, Dublin Core, RDA, MODS • Data elements vary widely • Same element name, multiple meanings • Level of description – Single site, collection of sites, seed URLs • Scalability and limited resources
  • 14. 14 14 Best practices methodology • Analyze metadata standards & institutional guidelines – RDA (libraries), DACS (archives), Dublin Core (simplified) • Evaluate existing metadata records “in the wild” – ArchiveGrid, Archive-It, WorldCat • Identify dilemmas specific to web archiving • Incorporate findings from literature reviews • Prepare data dictionary and report narrative General finding: extreme inconsistencies in existing practice
  • 16. 16 16 • Is the website creator/owner the … publisher? author? subject? • Should the title be … transcribed verbatim from the head of the site? Edited to clarify the nature/scope of the site? Append e.g. "web archive” for a collection? • Which dates are important/feasible other than capture dates? Beginning/end of the site's existence? Date of the content? Copyright? • How should extent/size be expressed? 6.25 Gb? 300 websites? 1 archived website? 1 online resource? • Is the host institution that harvests and manages the archived content the repository? creator? publisher? selector?
  • 17. 17 17 • Is it important to clearly state that the resource is a website? If so, where? In the title? description? extent statement? all of these? • Does provenance refer to …the site owner? the repository that harvests and hosts the site? ways in which the site evolved? • Does appraisal mean …the reason the site warrants being archived? a collection of sites named by the repository? the parts of the site that were harvested? • Which URLs should be included? Seed? access? landing page?
  • 19. 19 Archive-It, the most widely used web archiving platform, provides for application of DC metadata for collections and seeds
  • 20. 20 Detail of the Collection-level metadata editing interface
  • 21. 21 Archive-It has support for non-DC fields At the seed level, “grab title” can pull a title from HTML
  • 22. 22 Bulk options include downloading all seed metadata
  • 24. 24 24 WAM data elements * = 9 of 14 element names/meanings match Dublin Core Collector Extent Source of Description Contributor * Genre/Form Subject * Creator * Language * Title * Date * Relation * URL Description * Rights *
  • 25. 25 25 Data dictionary inclusion criteria • Includes common elements used for identification and discovery of all types of resource (e.g., Creator, Date, Subject, Title) • Other elements must have clear applicability to archived websites (e.g. Rights, Description, URL) • Elements excluded that rarely (if ever) appear in guidelines and/or extant metadata records and have no web-specific meaning (e.g. audience, publisher, statement of responsibility)
  • 26. 26 26 Data element features • Element name • Definition • Usage note • Examples • Crosswalk
  • 27. 27 27 Collector Definition: The organization responsible for curation and stewardship of an archived website or collection. Use Collector for the organization that selects the web content for archiving, creates metadata and performs other activities associated with “ownership” of a resource. Stated another way, this is the organization that has taken responsibility for the archived content, although the digital files are not necessarily stored and maintained by this organization (collections harvested using Archive-It are a prominent example). No equivalent in Dublin Core.
  • 28. 28 28 Collector: Lifecycle activities Institutions involved in web archiving engage in a variety of activities during the lifecycle of archiving web content. We identified four activities performed by the institution that assumes responsibility for archiving web content: • Selecting websites for archiving • Harvesting the content of the designated seed URLs • Creating and maintaining metadata to describe the content • Making decisions about other aspects of collections management, including how the harvested files will be preserved and how will access be provided.
  • 29. 29 29 Collector: Examples Creator: Seattle (Wash.) Title: City of Seattle Harvested Websites Collector: Seattle Municipal Archives -============ Title: Globalchange.gov Contributor: U.S. Global Change Research Program Collector: Federal Depository Library Program ============ Creator: Association for Research into Crimes Against Art Title: ARCAblog : promoting the study and research of art crime and cultural heritage protection Collector: New York Art Resources Consortium
  • 30. 30 30 Collector: Crosswalks Crosswalks Dublin Core Contributor EAD <repository> MARC 524 852 subfield a 852 subfield b MODS <location> schema.org schema:OwnershipInfo
  • 31. 31 31 Source of description Definition: Information about the gathering or creation of the metadata itself, such as sources of data or the date on which source data was obtained. Source of Information is used to identify the source of all or some of the metadata, particularly for descriptions of single sites. Basic aspects of a website (creator name, title, etc.) may change significantly, but the responsible institution is unlikely to have the resources to become aware of changes, let alone update the metadata. Include the date on which the site was examined and the location from which the information was taken. No equivalent in Dublin Core.
  • 32. 32 32 Source of description: Examples Description based on archived web page captured Sept. 22, 2016; title from title screen (viewed Oct. 27, 2016) Title from home page last updated June 21, 2012 (viewed June 22, 2012) Title from home page (viewed on Oct. 11, 2007) Title from HTML header (viewed Feb. 16, 2006)
  • 35. 35 35 Three simultaneous reports (autumn 2017) ● Best practices for descriptive metadata ○ With data dictionary ● User needs ○ With annotated bibliography ● Tools ○ With evaluation grids Send comments by August 4th!
  • 37. 37 SM Jackie Dooley Program Officer, OCLC Research dooleyj@oclc.org @minniedw SAA Description Section 26 July 2017 ©2016 OCLC. This work is licensed under a Creative Commons Attribution 4.0 International License. Suggested attribution: “This work uses content from Developing Best Practices for Web Archiving Metadata to Meet User Needs © OCLC, used under a Creative Commons Attribution 4.0 International License: http://creativecommons.org/licenses/by/4.0/.” For more information, please contact: oc.lc/wam