A Snapshot of the U.S. Web Archiving Landscape through the 2013 NDSA Survey Report
1. A Snapshot of the U.S. Web Archiving Landscape through the 2013 NDSA Survey Report
Archive-It Partner Meeting
November 18, 2014
“Barrage balloon manufacture...” by Alfred T. Palmer under public domain
Content Working Group
Nicholas Taylor (@nullhandle)
Web Archiving Service Manager
Stanford University Libraries
2. NDSA Web Archiving Survey Working Group
• Jefferson Bailey, Internet Archive / Archive-It
• Kristine Hanna, Internet Archive / Archive-It
• Cathy Hartman, University of North Texas
• Edward McCain, University of Missouri
• Abbie Grotke, Library of Congress
• Christie Moffatt, National Library of Medicine
• Nicholas Taylor, Stanford University
3. NDSA Web Archiving survey background
2011
• 78 respondents
• program info
• tools and services
• access
• policies
2013
• 92 respondents
• program info
• staff time, metrics, skills, content concerns
• tools and services
• access and discovery
• new discovery options
• policies
• embargo, social media, robots.txt, resources
6. universities still make up most programs
[Pie charts: respondents by organization type, 2011 and 2013. Categories: College or University, Archive, State Gov, Fed Gov, Commercial, Public Library, Museum, Other.]
• College or University: 47% (2011), 52% (2013)
7. Archive-It and SAA top group affiliations
group 2011 2013
8% 7%
31% 33%
45%
72% 71%
8. most programs are fractionally staffed
less than 25% FTE
25% FTE
1 to 3 FTE
1 FTE
40-50% FTE
3.5 to 15 FTE
9. web/archiving tech savviness are key skills
[Bar chart. Y-axis: percentage of organizations. Bar values: 39%, 37%, 24%, 21%, 21%, 10%, 6%, 6%.]
10. data volume and archive use are key metrics
Percentage of organizations:
• Volume: 53%
• Usage: 47%
• Cost: 22%
• Quality: 20%
• Buy-in: 8%
• Loss: 4%
• Policy: 4%
11. Maturity and Progress
“Apple Mouse Evolution” by raneko under CC BY 2.0
12. programs have matured slightly since 2011
• Active: 64% (2011), 72% (2013)
• Testing: 16% (2011), 14% (2013)
• Planning: 17% (2011), 9% (2013)
• No longer collecting: 4% (2011), 2% (2013)
13. strong perceptions of progress since 2011
• Significant progress: 40%
• Some progress: 36%
• About the same: 20%
• Slightly worse off: 2%
• Much worse off: 2%
14. many new programs since 2011
Number of organizations, by year:
1996: 1, 1997: 0, 1998: 3, 1999: 0, 2000: 2, 2001: 1,
2002: 2, 2003: 0, 2004: 2, 2005: 3, 2006: 8, 2007: 6,
2008: 5, 2009: 4, 2010: 6, 2011: 7, 2012: 12, 2013: 19
15. two-thirds of them now use Archive-It
[Stacked bar chart: number of organizations per year, 1996 to 2013, with Archive-It partners as of 2013 shown as a separate series.]
16. Archiving Focus
“Ant Farm Media Van v.08 (Time Capsule) in Bellewether at Southern Exposure” by Steve Rhodes under CC BY-NC-SA 2.0
17. more programs are only self-archiving
• Archive other sites only: 31% (2011), 15% (2013)
• Archive both: 49% (2011), 48% (2013)
• Archive own site only: 20% (2011), 37% (2013)
18. concern about social media, databases, video
Number of organizations:
• Social Media: 69
• Databases: 65
• Video: 64
• Interactive Media: 49
• Audio: 40
• Blogs: 32
• Art: 16
19. untapped interest in collaboration
[Bar chart: collaboration, 2011 and 2013. Categories: Yes, No, Not yet, but interested, Don't know.]
• 2013: Yes 17%, No 47%, Not yet, but interested 33%, Don't know 2%
• 2011: Yes 21%, No 72%, one further bar at 7%
20. Tools and Services
“Photocopier” by Joriel "Joz" Jimenez under CC BY-NC-ND 2.0
21. web archiving as a service still most popular
• External: 60% (2011), 63% (2013)
• In-house: 25% (2011), 20% (2013)
• Both: 14% (2011), 16% (2013)
22. data not transferred from service provider
• Transferred: 19% (2011), 20% (2013)
• Haven't transferred: 81% (2011), 80% (2013)
23. increased use of tools supporting W/ARC
• Supports W/ARC: 24% (2011), 38% (2013)
• Doesn't support W/ARC: 76% (2011), 62% (2013)
25. Archiving Policies
“Handle With Care” by ServInt under CC BY-NC-ND 2.0
26. most don’t notify or seek permission
[Grouped bar chart. Categories: Capture, Provide restricted access, Provide public access. Series: No action, Notify, Request permission. Bar values as extracted: 42, 42, 45, 17, 7, 11, 14, 13, 15.]
27. more conditional handling of robots.txt
[Grouped bar chart, 2011 and 2013. Categories: Always respect robots.txt, Sometimes/conditionally respect robots.txt, Never respect robots.txt, Don't know. Bar values as extracted: 22%, 21%, 38%, 33%, 8%, 55%, 8%, 16%.]
28. social media archiving policies are uncommon
• Has social media archiving policy: 24%
• Lacks social media archiving policy: 76%
29. policies based on community practices
Percentage of organizations:
• Other organizations: 54%
• ARL Code of Best Practices: 40%
• Section 108 Study Group: 25%
• Counsel or service provider: 11%
• Oakland Archive Policy: 5%
• Statute: 5%
• Don't know: 7%
30. Landscape Summary
“Mt Baldy from Box Springs Mountain wi Theodolite” by signal mirror under CC BY 2.0
31. profile of the average survey respondent
• university archive
• started in last three years
• Archive-It user
• ¼ FTE web-savvy archivist
• concerned w/ content capture, cost, and use
• broad level of description
• ambivalent about collaboration
“Container” by Glyn Lowe under CC BY 2.0
32. maturity and convergence
• maturity
  • 75% cite some or significant progress since 2011
  • 38% started programs since 2011
  • 8% more programs in active status since 2011
• convergence
  • 79% using external service providers
  • 81% devoting ½ FTE or less to web archiving
  • 67% rely on community practices for policy-making
  • 13% more using Wayback since 2011
33. challenges and opportunities
• challenges
  • 53% concerned about data volume growth
  • 47% concerned about fostering access
  • more than 73% concerned about content capture
• opportunities
  • 33% interested but not yet involved in collaborations
  • 76% lack social media archiving policies
  • less than 23% of archived materials are described
34. implications and questions
• implications
  • web archiving not (yet) a top institutional priority
  • demand for ongoing Archive-It technical investment
  • U.S. web archiving landscape is changing quickly
• questions
  • how to build institutional support?
  • collaboration with whom and on what?
  • what’s not being archived?
  • how well are we curating what we do archive?
35. Nicholas Taylor (@nullhandle)
“Thank You” by vistamommy under CC BY 2.0