1. Image: Luc Viatour / www.Lucnix.be
Unpacking Archival Silences
A short history of Web archives research
Anat Ben-David, University of Amsterdam, February 2013
Monday, February 18, 13
2. What are Web Archives For?
6. 1. Preservation of (national) digital cultural heritage
- .."web resources which are collected with the aim of their long-term preservation". (Czech Web archive)
- "The Archive's mission is gathering and long-term preservation of Internet publications as part of the Croatian national heritage" (Croatian Web archive)
- "..these websites were carefully selected to be part of the nation's documentary heritage". (Singapore Web Archive)
10. 2. Responding to a preservation risk
- .."the present generation may be considered as a forgotten dark age by future generations if we neglect to select and preserve digital resources at country level" (South Korea Web archive)
- .."These days, documents are increasingly being published only digitally. If we do not preserve the information, part of our heritage will be lost forever" (Swedish Web archive)
- .."Responding to the challenge of a potential ‘digital black hole’ the UK Web Archive is there to safeguard as many of these websites as practical." (UK Web Archive)
14. 3. Viewing past versions of a Website
- .."You can see archived websites in their original version. Our service will help you search efficiently and quickly for an important publication in the flood of information on the Internet" (Japan Web archive)
- .."The collection also provides a visual history of how websites change over time" (New Zealand Web archive)
- .."Warning - The current version of the site may no longer be available" (Latvian Web Archive)
18. 4. and.. also for research
- .."This makes the web an important source for future researchers, not only for studies of the development of the web but certainly for research on society today" (Dutch Web archive)
- .."All materials are archived and available for use by researchers and others who need them in their studies - now and in the future". (Finland Web archive)
- .."Web history can provide a tremendous base for time-based analysis of the content, the topology including emerging communities and topics, trends analysis etc. as well as an invaluable source of information for the future" (European Archive)
23. “Archival Silences” (?)
- “Web archives will be the digital equivalent of the dusty archive, often well-curated and maintained, but hardly used” -- (Meyer et al., 2011, p. 7)
- “One must ask, in the world of Internet research, why do Web archives appear to be second class citizens?” -- (Meyer et al., 2011, p. 9)
- “Web archiving infrastructure receives scholarly and non-scholarly attention; the archived materials – the primary source material – gain less notice” -- (Rogers 2013, p. 85)
- “There is a growing gulf in web archiving between the researchers who want to use web artifacts to study in their field and the information professional who serve information needs” -- (Dougherty & Heuvel 2010, p. 6)
Image source: http://static.guim.co.uk/sys-images/Books/Pix/pictures/2009/10/16/1255686935351/Dusty-bookshelf-001.jpg
28. A short history of Web archives
- 1996-1998 Web archive as a Web index
- 1999- Web archives as special collections
- 2000- The national turn in Web archiving
- 2005- Emerging Web archiving theory
29. 1. Web Archive as a Web Index
- 1996- the Internet Archive and the Wayback Machine
- Crawlers as the ultimate collection-makers of the Web
- Navigational tool - together with the Alexa Toolbar, providing a solution to accessing broken links
- Organizational tool - borrowing from Library Science and Scientometrics
- Web archive as a digital library
Image: http://www.wired.com/images_blogs/threatlevel/images/2008/05/07/brewster_kahle_630x.jpg
32. 2. Web Archives as Special Collections
• Foot and Schneider 1999 - “Web Sphere Analysis”
• Collections of elections, natural disasters and “transitions” continue to dominate the field
• Content and hyperlink analysis
34. 3. The national turn in Web archiving
Web archiving at a national scale raises new questions and challenges:
- What is a national Web? How to define national cultural heritage on the Web?
- Scale: full domain harvest (e-depot) or curation?
- Selection criteria and policy
- Infrastructure, Formats, Accessibility
- How is a web archive different from other digital collections maintained by national libraries?
Web archives as institutions
41. 4. Emerging Web Archiving Theory
Some distinctions:
- Web archives as tools for research / as an object of study
- Web History / Digital History
- Website / Website in its archived environment
- Digitized objects / Digital Objects / “Re-born digital objects” (Brügger 2012)
47. Types of Web Historiography enabled
Rogers (2013):
- Single site historiography
- Collection making
- Link analysis, while attempting to figure out what is missing
- Evolution of digital objects (such as source code, cookies or tracking devices)
48. Single website history - Capture history of website, and playback as screencast documentary (time-lapsed photography)
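A minimal sketch of how the frames for such a time-lapse could be listed, assuming the public Wayback Machine URL pattern `https://web.archive.org/web/<timestamp>/<url>`, where a request for a given timestamp is generally served by the capture closest to it. The helper names (`monthly_timestamps`, `snapshot_urls`), the example site and the date range are illustrative assumptions, not part of the talk; fetching and screenshotting each URL is left out.

```python
from datetime import date

# Assumed public replay prefix of the Internet Archive's Wayback Machine.
WAYBACK_PREFIX = "https://web.archive.org/web"


def monthly_timestamps(start: date, end: date) -> list[str]:
    """Return one 14-digit timestamp (YYYYMMDDhhmmss) per month, inclusive."""
    stamps = []
    year, month = start.year, start.month
    while (year, month) <= (end.year, end.month):
        stamps.append(f"{year:04d}{month:02d}01000000")
        month += 1
        if month > 12:
            year, month = year + 1, 1
    return stamps


def snapshot_urls(site: str, start: date, end: date) -> list[str]:
    """Build one replay URL per month for the given site (hypothetical helper)."""
    return [f"{WAYBACK_PREFIX}/{ts}/{site}" for ts in monthly_timestamps(start, end)]


# Example: four monthly "frames" of a homepage, to be captured one by one.
frames = snapshot_urls("http://www.google.com/", date(2007, 11, 1), date(2008, 2, 1))
for url in frames:
    print(url)
```

Each URL stands for one frame; a screencast would load them in order and record the result.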
49. "Google and the politics of tabs" by Govcom.org, Amsterdam, 2008.
50. Collection making. Build collections from the archive (e.g., Dutch extremist sites by NRC Handelsblad)
53. Ghostery detecting trackers on an archived frontpage of the New York Times from 16 October 2006 in the Internet Archive.
Helmond (2013): Number of trackers per year on the New York Times frontpage. Green: ad, orange: tracker, blue: analytics, pink: widget. Categorization provided by Ghostery.
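The idea of counting trackers on an archived page can be sketched roughly as follows. This is not Ghostery's method or database: `KNOWN_TRACKERS` is a tiny hypothetical lookup table, and the sample HTML is made up. Archived pages often have their resource URLs rewritten by the archive on replay, which this sketch does not handle.

```python
from collections import Counter
from html.parser import HTMLParser
from urllib.parse import urlparse

# Hypothetical host -> category table, for illustration only.
KNOWN_TRACKERS = {
    "www.google-analytics.com": "analytics",
    "ad.doubleclick.net": "ad",
}


class ResourceHostParser(HTMLParser):
    """Collect the hosts of externally loaded scripts, images and iframes."""

    def __init__(self):
        super().__init__()
        self.hosts = []

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "img", "iframe"):
            src = dict(attrs).get("src")
            if src:
                host = urlparse(src).netloc
                if host:  # relative URLs have no host and are skipped
                    self.hosts.append(host)


def count_trackers(html: str) -> Counter:
    """Count known trackers per category in one HTML document."""
    parser = ResourceHostParser()
    parser.feed(html)
    return Counter(KNOWN_TRACKERS[h] for h in parser.hosts if h in KNOWN_TRACKERS)


sample = """
<html><head>
<script src="http://www.google-analytics.com/urchin.js"></script>
<script src="/js/local.js"></script>
</head><body>
<img src="http://ad.doubleclick.net/ad/pixel.gif">
</body></html>
"""
counts = count_trackers(sample)
print(dict(counts))
```

Running the same count over one capture per year would give a per-year series comparable in shape to the chart described on the slide.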
58. Types of Web Historiography precluded
- (Most) Web archives are not searchable
- (Most) Web archives are not accessible online
- Cross-collection comparison is difficult
- Wayback machine “jump cuts through time” (Rogers, 2013)
59. WebART project
Web Archive Retrieval Tools
Jaap Kamps, Richard Rogers, Arjen de Vries,
Paul Doorenbosch, René Voorburg, Victor-Jan Vos
Anat Ben-David, Hugo Huurdeman, Thaer Sammar
http://webarchiving.nl
61. THE INTERFACE
http://178.228.147.61:8080/
62. “DICTATORS” FREQUENCY OVER TIME
[Chart: new articles about “dictators” over time. Series: Mubarek, Assad, Putin, Kim Jung Il, Fidel Castro, Raul Castro. Y-axis: 0 to 800; x-axis: 17/05/2011 to 16/04/2013.]
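A frequency-over-time series of this kind can be sketched as a simple count of articles per month that mention each name. The records, the name list and the `monthly_mentions` helper below are hypothetical; how the WebART interface actually computes its counts is not stated in the slides.

```python
from collections import Counter, defaultdict
from datetime import date

# Hypothetical archived articles: (publication date, text).
articles = [
    (date(2011, 5, 17), "Assad faces protests"),
    (date(2011, 5, 20), "Putin and Assad meet"),
    (date(2011, 8, 25), "Putin announces candidacy"),
]
names = ["Assad", "Putin"]


def monthly_mentions(articles, names):
    """Count, per (year, month), how many articles mention each name."""
    counts = defaultdict(Counter)
    for published, text in articles:
        bucket = (published.year, published.month)
        lowered = text.lower()
        for name in names:
            # Case-insensitive substring match; a real tool would tokenize.
            if name.lower() in lowered:
                counts[bucket][name] += 1
    return counts


series = monthly_mentions(articles, names)
for bucket in sorted(series):
    print(bucket, dict(series[bucket]))
```

Plotting each name's counts against the sorted month buckets gives one line per name.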