Herbert Van de Sompel
VOGIN-IP, Amsterdam, the Netherlands, March 9 2017
Herbert Van de Sompel
LANL & DANS
@hvdsomp
En toen was er niets meer … (And then there was nothing left …)
The  Web
The  Web  Evolves
Yet,  the  Web  Exists  in  a  Perpetual  Now
Traces of the Past Web Exist
• Content Management Systems
• Web Archives
• Transactional archives
• Search engine caches
• …
But  Past  and  Current  Web(s)  are  Parallel  Universes
The  Memento  Protocol  Integrates  the  Current  and  Past  Web
http://mementoweb.org/guide/rfc/
Original  Resource  and  Mementos
Bridge  from  Present  to  Past
Bridge  from  Past  to  Present
Memento: Access Versions via the Original URI and a Datetime
Select Date: Today, Mar 9 1999, Feb 8 1999
Bibliotheca Alexandrina Web Archive
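The access pattern named on this slide can be sketched as follows. In the Memento protocol (RFC 7089), a client asks a TimeGate for a version of the original URI close to a desired datetime by sending an Accept-Datetime header. The sketch only builds such a request and does not send it. The TimeGate base URL is an assumption (the public Memento aggregator), and build_timegate_request is a hypothetical helper name, not part of any library.

```python
from datetime import datetime, timezone
from email.utils import format_datetime
from urllib.request import Request

# Assumption: base URL of a Memento TimeGate (here the public aggregator).
TIMEGATE_BASE = "http://timetravel.mementoweb.org/timegate/"


def build_timegate_request(original_uri: str, when: datetime) -> Request:
    """Build (but do not send) a datetime-negotiation request for original_uri."""
    # Accept-Datetime uses the HTTP date format and is expressed in GMT.
    accept_datetime = format_datetime(when.astimezone(timezone.utc), usegmt=True)
    return Request(
        TIMEGATE_BASE + original_uri,
        headers={"Accept-Datetime": accept_datetime},
        method="HEAD",
    )


request = build_timegate_request(
    "http://www.vogin.nl/",
    datetime(1999, 3, 9, tzinfo=timezone.utc),
)
print(request.full_url)
# urllib stores header names capitalized, hence "Accept-datetime" here.
print(request.get_header("Accept-datetime"))
```

A TimeGate that supports the protocol would typically answer with a redirect to the Memento closest to the requested datetime.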
vogin.nl in  1999
http://web.archive.bibalex.org/web/19990208021257/http://www.vogin.nl/
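The snapshot URL above carries both the capture datetime and the original URI. A minimal sketch of pulling the two apart, assuming the `/web/<14 digits>/<original URI>` layout seen in the archive URLs on these slides and assuming the timestamp is in UTC; split_snapshot_url is a hypothetical helper name.

```python
import re
from datetime import datetime, timezone

# Layout assumed from the slide URLs: <archive>/web/<YYYYMMDDhhmmss>/<original URI>
SNAPSHOT_PATTERN = re.compile(
    r"^(?P<archive>https?://[^/]+)/web/(?P<stamp>\d{14})/(?P<original>.+)$"
)


def split_snapshot_url(snapshot_url: str):
    """Return (original URI, capture datetime) for a snapshot URL."""
    match = SNAPSHOT_PATTERN.match(snapshot_url)
    if match is None:
        raise ValueError("not a recognised snapshot URL: " + snapshot_url)
    # Assumption: the 14-digit timestamp is expressed in UTC.
    captured = datetime.strptime(match["stamp"], "%Y%m%d%H%M%S").replace(
        tzinfo=timezone.utc
    )
    return match["original"], captured


original, captured = split_snapshot_url(
    "http://web.archive.bibalex.org/web/19990208021257/http://www.vogin.nl/"
)
print(original, captured.isoformat())
```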
Memento  for  Chrome
http://bit.ly/memento-for-chome
Hyperlinks
Eric Sieverts (2017) https://vogin-ip-lezing.net/2017/01/17/linkrot-linkroest-en-webarchieven/
Hyperlinks  in  Theory
Hyperlinks  in  Reality
Link  Rot
http://404-resto.com/typo3temp/pics/7580ea80fa.jpg
Hyperlinks  in  Reality
Content  Drift
http://icecube.wisc.edu/  on  May  8  2009  (left)  and  August  27  2009  (right)
http://dl00.org  in  2000,  2004,  2005,  2008
No  Content  Drift
http://www.ifa.hawaii.edu/~cowie/k_table.html on  June  9  1997  (left)  and  March  2016  (right)
The  Web,  All  Hyperlinks  Subject  to  Link  Rot,  Content  Drift
The  Web,  All  Hyperlinks  Subject  to  Reference  Rot
• Reference rot hinders our ability to follow links as they were intended when they were put in place:
o Link rot: a link stops working altogether
o Content drift: the linked content changes over time and may eventually no longer be representative of the content that was originally linked
Creating  Pockets  of  Persistence
• How to maintain the integrity of links?
• This challenge exists for the entire web. Some communities with well-managed collections care about addressing it because they consider it a Quality of Service issue:
o Scholarly communication
o Cultural heritage
o Legal publications
o Government communication
o Journalism
o Wikipedia
o …
• What can these communities do to create Pockets of Persistence?
A  Managed  Collection  Desires  Reliable  Outlinks
Links  to  another  Managed  Collection
Links  to  Web  at  Large  Resources
Exploring  Link  Rot  &  Content  Drift
Preamble 2 - Hiberlink Study of Reference Rot in STM Articles
PMC articles published 1997-2012        PMC
Total                                   479,194
With links to articles                  240,857
With links to web-at-large resources    156,160

Links                                   PMC
To articles                             744,678
To web-at-large resources               480,853
Number of Articles & Links - PMC
Martin  Klein,  Herbert  Van  de  Sompel,  et  al.  (2014)  Scholarly  context  not  found.  In:  PLOS  ONE
https://doi.org/10.1371/journal.pone.0115253
Links to Articles & to Web At Large Resources - PMC
Martin  Klein,  Herbert  Van  de  Sompel,  et  al.  (2014)  Scholarly  context  not  found.  In:  PLOS  ONE
https://doi.org/10.1371/journal.pone.0115253
Exploring  Link  Rot  &  Content  Drift
Link Rot Occurs when B Moves to C
Introduce  PID(B)
Link to PID(B); HTTP Redirect from PID(B) to B
When  B  moves  to  C:  HTTP  Redirect  from  PID(B)  to  C
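The PID mechanism on these slides can be sketched as a lookup table that answers with an HTTP redirect. This is a toy model under stated assumptions: PidResolver, the "pid:B" identifier, and the example.org and example.net URLs are all hypothetical, and a real resolver (such as a DOI resolver) is of course a networked service.

```python
from typing import Dict, Optional, Tuple


class PidResolver:
    """Toy resolver: maps a persistent identifier to the resource's current URL."""

    def __init__(self) -> None:
        self._locations: Dict[str, str] = {}

    def register(self, pid: str, location: str) -> None:
        # Called when the PID is minted and again whenever the resource moves.
        self._locations[pid] = location

    def resolve(self, pid: str) -> Tuple[int, Optional[str]]:
        # Answer like an HTTP server would: a redirect to the current
        # location, or 404 when the PID is unknown.
        location = self._locations.get(pid)
        if location is None:
            return 404, None
        return 302, location


resolver = PidResolver()
resolver.register("pid:B", "http://example.org/B")   # PID(B) redirects to B
before_move = resolver.resolve("pid:B")
resolver.register("pid:B", "http://example.net/C")   # B moves to C
after_move = resolver.resolve("pid:B")
print(before_move, after_move)
```

Links that target the PID keep working after the move because only the table entry changes; links that target B directly do not benefit.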
Herbert Van de Sompel, Martin Klein, and Shawn Jones (2016) Persistent URIs Must Be Used to Be Persistent. In: WWW2016. http://arxiv.org/abs/1602.09102
Core  assumption  in  the  PID  solution:  
PIDs  will  be  used  to  establish  links.
But  are  they?
A Disconcerting Observation
• When classifying links extracted from PMC as linking to articles, we assumed that filtering on http://dx.doi.org/* would do the trick
• But we found a lot of e.g. http://link.springer.com/article/*
• For example:
o http://link.springer.com/article/10.1007%2Fs00799-014-0108-0
• Instead of:
o http://dx.doi.org/10.1007/s00799-014-0108-0
• We used CrossRef’s Reverse Domain Lookup to classify these extracted links as linking to articles
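The two URL forms above point at the same article, which a simple pattern match can show. Note that this is a different and much cruder technique than the CrossRef Reverse Domain Lookup used in the study: it only works when the DOI happens to be embedded in the publisher URL. The regular expression and the helper names extract_doi and to_doi_link are assumptions made for illustration.

```python
import re
from typing import Optional
from urllib.parse import unquote, urlsplit

# Loose pattern for a DOI embedded in a URL path (assumption, not a full DOI grammar).
DOI_PATTERN = re.compile(r"10\.\d{4,9}/\S+")


def extract_doi(url: str) -> Optional[str]:
    """Return the DOI embedded in the URL path, if one can be recognised."""
    # Publisher URLs often percent-encode the slash in the DOI (%2F).
    path = unquote(urlsplit(url).path)
    match = DOI_PATTERN.search(path)
    return match.group(0) if match else None


def to_doi_link(url: str) -> Optional[str]:
    """Rewrite a URL that embeds a DOI into the persistent DOI form."""
    doi = extract_doi(url)
    return "http://dx.doi.org/" + doi if doi else None


publisher_url = "http://link.springer.com/article/10.1007%2Fs00799-014-0108-0"
print(to_doi_link(publisher_url))
```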
URI References - PMC
Herbert Van de Sompel, Martin Klein, and Shawn Jones (2016) Persistent URIs Must Be Used to Be Persistent. In: WWW2016. http://arxiv.org/abs/1602.09102
A Proposal to Get PIDs Used: Signposting
Cartoon by Patrick Hochstenbach
http://signposting.org
Signposting:  HTTP  Link with  identifier Relation  Type
http://signposting.org/identifier/
Signposting:  Use  HTTP  Link  with  identifier Relation  Type
curl -I http://www.dlib.org/dlib/november15/vandesompel/11vandesompel.html
HTTP/1.1 200 OK
Date: Wed, 26 Oct 2016 12:36:37 GMT
Server: Apache/2.2.15 (CentOS)
Last-Modified: Thu, 19 Nov 2015 14:50:19 GMT
ETag: "205a5e-f5ef-524e5e0ab80c0"
Accept-Ranges: bytes
Content-Length: 62959
Content-Type: text/html; charset=UTF-8
Link: <http://doi.org/10.1045/november2015-vandesompel>; rel="identifier"
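A client that wants to use the PID can read it from the Link header shown in the response above. The following is a minimal sketch of that step, assuming a simple header without commas or semicolons inside quoted parameter values; links_with_rel is a hypothetical helper, and a production client would want a complete parser for the Link header syntax.

```python
import re
from typing import List

# One link-value: <target> followed by zero or more ";param" parts.
LINK_VALUE = re.compile(r"<(?P<target>[^>]*)>\s*(?P<params>(?:;\s*[^,;]+)*)")


def links_with_rel(link_header: str, rel: str) -> List[str]:
    """Return the targets of all links in the header that carry the given rel."""
    targets = []
    for match in LINK_VALUE.finditer(link_header):
        for param in match["params"].split(";"):
            name, _, value = param.strip().partition("=")
            # A rel value may list several space-separated relation types.
            if name.lower() == "rel" and rel in value.strip('"').split():
                targets.append(match["target"])
    return targets


header = '<http://doi.org/10.1045/november2015-vandesompel> ; rel="identifier"'
print(links_with_rel(header, "identifier"))
```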
PID Alternative - When B Moves to C: HTTP Redirect from B to C
• Custodian of C needs to hold on to domain of B
• Custodian of C needs to establish redirection patterns, often rather simple rules
• No problem with establishing links to PID(B); the URI in the browser address bar (initially B, later C) is just fine
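The "rather simple rules" mentioned above might look like the pattern-based rewrite below. This is only a sketch: the old and new URL layouts, the example.org hosts, and the helper name redirect_for are hypothetical, and in practice such rules usually live in the web server configuration rather than in application code.

```python
import re
from typing import Optional, Tuple

# Hypothetical rule: every old paper page maps onto an article page at the new home.
REDIRECT_RULES = [
    (
        re.compile(r"^http://old\.example\.org/papers/(\d+)\.html$"),
        r"http://new.example.org/articles/\1",
    ),
]


def redirect_for(requested_url: str) -> Optional[Tuple[int, str]]:
    """Return (status, target) for a URL at the old location, or None if no rule applies."""
    for pattern, replacement in REDIRECT_RULES:
        if pattern.match(requested_url):
            # 301 signals that the move from B to C is permanent.
            return 301, pattern.sub(replacement, requested_url)
    return None


print(redirect_for("http://old.example.org/papers/42.html"))
```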
Exploring  Link  Rot  &  Content  Drift
Content  Drift  Occurs  when  B  Changes  over  Time
• Was not really considered an issue because:
o the objects that receive PIDs were typically static, e.g. scientific papers
o when a (substantially) new version of an object is published, a new PID is assigned
• But:
o PID links (typically) lead to landing pages, not the identified objects
o landing pages are increasingly rich, aggregating comments, discussion, and annotations; they do change over time
Custodian  of  B  Takes  Snapshots  of  B  as  it  Evolves  over  Time
Custodian  of  B  Ensures  Snapshots  of  B  as  it  Evolves  over  Time
• This does not happen for PID-identified objects, as far as I know
• Version Control Systems (e.g. Wikipedia) hold on to all versions; snapshots are local
• Proactive archiving solutions for web servers that create snapshots when e.g. new content is published/visited or at regular intervals:
o on-demand archiving of a web server, cf. archiefweb.eu, archive-it.org
o self-archiving web server, cf. SiteStory
• How to access the snapshots of B? Memento!
SiteStory Transactional  Archive  &  Memento
https://mementoweb.github.io/SiteStory/
SiteStory,  Wikipedia,  Web  Archive,  Memento  in  Action
http://lanlsource.lanl.gov/hello
Exploring  Link  Rot  &  Content  Drift
Scholarly  Context  Not  Found
Martin  Klein,  Herbert  Van  de  Sompel,  et  al.  (2014)  Scholarly  context  not  found.  In:  PLOS  ONE
https://doi.org/10.1371/journal.pone.0115253
Link Rot - PMC
Martin  Klein,  Herbert  Van  de  Sompel,  et  al.  (2014)  Scholarly  context  not  found.  In:  PLOS  ONE
https://doi.org/10.1371/journal.pone.0115253
Exploring  Link  Rot  &  Content  Drift
Scholarly  Context  Adrift
Shawn Jones, Herbert Van de Sompel, et al. (2016) Scholarly context adrift. In: PLOS ONE
https://doi.org/10.1371/journal.pone.0167475
How  to  Assess  Content  Drift?
Shawn Jones, Herbert Van de Sompel, et al. (2016) Scholarly context adrift. In: PLOS ONE
https://doi.org/10.1371/journal.pone.0167475
Step  1:  Find  Pre/Post  Mementos
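Step 1 can be sketched as a search over the capture datetimes of a URI's Mementos. The sketch assumes that the Pre Memento is the latest capture at or before a reference date (for example, the publication date of the referencing article) and that the Post Memento is the earliest capture after it; the study's exact selection criteria may differ. pre_post_mementos is a hypothetical helper name.

```python
from bisect import bisect_right
from datetime import datetime
from typing import List, Optional, Tuple


def pre_post_mementos(
    memento_datetimes: List[datetime], reference: datetime
) -> Tuple[Optional[datetime], Optional[datetime]]:
    """Return the closest capture at or before, and the closest after, the reference date."""
    ordered = sorted(memento_datetimes)
    # Number of captures taken at or before the reference date.
    count_before = bisect_right(ordered, reference)
    pre = ordered[count_before - 1] if count_before > 0 else None
    post = ordered[count_before] if count_before < len(ordered) else None
    return pre, post


captures = [datetime(2010, 1, 1), datetime(2009, 5, 8), datetime(2009, 8, 27)]
pre, post = pre_post_mementos(captures, datetime(2009, 6, 1))
print(pre, post)
```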
Step  2:  Select  Representative  Mementos
Text  Similarity  Measures
• Compute aggregate text similarity scores (values between 0 and 100) for:
o Simhash
o Jaccard
o Sørensen-Dice
o Cosine
• If the aggregate score is 100, we decide that the Pre/Post Mementos are representative
• We find 313K URI references with representative Mementos
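Three of the measures listed above are easy to sketch on word tokens, scaled to the 0 to 100 range used on the slide. Several choices here are assumptions: the tokenisation, the omission of Simhash (left out for brevity), and the reading of "aggregate score is 100" as "every measure scores 100". The function names are hypothetical.

```python
import math
import re
from collections import Counter
from typing import List


def tokens(text: str) -> List[str]:
    # Assumption: lower-cased word tokens are a reasonable unit of comparison.
    return re.findall(r"\w+", text.lower())


def jaccard(a: str, b: str) -> float:
    sa, sb = set(tokens(a)), set(tokens(b))
    return 100.0 if not (sa | sb) else 100.0 * len(sa & sb) / len(sa | sb)


def sorensen_dice(a: str, b: str) -> float:
    sa, sb = set(tokens(a)), set(tokens(b))
    return 100.0 if not (sa or sb) else 100.0 * 2 * len(sa & sb) / (len(sa) + len(sb))


def cosine(a: str, b: str) -> float:
    ca, cb = Counter(tokens(a)), Counter(tokens(b))
    dot = sum(ca[t] * cb[t] for t in ca)
    norm = math.sqrt(sum(v * v for v in ca.values()) * sum(v * v for v in cb.values()))
    return 100.0 * dot / norm if norm else 0.0


def is_representative(pre_text: str, post_text: str) -> bool:
    """Treat the Pre/Post pair as representative only if every measure reaches 100."""
    scores = [f(pre_text, post_text) for f in (jaccard, sorensen_dice, cosine)]
    return all(math.isclose(score, 100.0) for score in scores)


print(jaccard("a b c", "a b d"), is_representative("same text", "Same text"))
```

Requiring a perfect score on the Pre/Post pair means the content did not change around the reference date, so either Memento can stand in for what the author saw.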
URI References without Representative Mementos - PMC
Shawn Jones, Herbert Van de Sompel, et al. (2016) Scholarly context adrift. In: PLOS ONE
https://doi.org/10.1371/journal.pone.0167475
Step  3:  Dereference  Live  Web  Version  of  URI
Step 4: Representative Memento vs. Live Version
Content Drift - PMC
Shawn Jones, Herbert Van de Sompel, et al. (2016) Scholarly context adrift. In: PLOS ONE
https://doi.org/10.1371/journal.pone.0167475
Exploring  Link  Rot  &  Content  Drift
Uncertainty  Regarding  the  Future  of  B  when  A  Links  to  It
Custodian  of  A  Takes  a  Snapshot  of  B  when  Linking  to  It
Taking a Snapshot of B: Automation is Key
• Web archive APIs for on-demand archiving
o perma.cc, Internet Archive, archive.is, webcitation
• Amber for WordPress & Drupal archives resources linked in a page
o http://amberlink.org/
• Hiberlink’s experimental Zotero extension archives bookmarked URLs
o http://hiberlink.org/zotero.html
• Hiberlink’s experimental HiberActive archives all URLs referenced in a newly submitted paper
o https://www.slideshare.net/martinklein0815/hiberactive
Linking  to  Snapshot  of  B  =  Potentially  Creating  a  Rotten  Link
• Existing practice for linking to snapshots:
<a href="URL of snapshot of B">
• Problems with existing practice:
o Impossible to visit the original URI, if desired
o Requires the permanent existence/uptime of the archive that holds the snapshot
- One link rot problem is replaced by another
http://robustlinks.mementoweb.org/about/
Permanent  Existence/Uptime  of  Archives?  
Capture  of  http://webcitation.org dated  July  17  2013
https://archive.today/eAETp
Permanent  Existence/Uptime  of  Archives?
Remnant  of  discontinued  web  archive  http://mummify.it captured  on  February  14  2014
https://web.archive.org/web/20140214233752/https://www.mummify.it/
Permanent  Existence/Uptime  of  Archives?
http://www.themoscowtimes.com/news/article/russia-bans-wayback-machine-internet-archive-over-islamic-state-video/510074.html
Permanent  Existence/Uptime  of  Archives?
http://web.archive.org/web/20121101043952/http://vogin.nl on  March  6  2017  at  15:59  CET
Link  to  Snapshot  of  B  and  Decorate  the  Link
• Desired practice for linking to captures is to decorate the link so it provides a variety of options:
<a href="URL of snapshot of B"
   data-originalurl="B"
   data-versiondate="datetime of snapshot of B">
• Supports:
o Revisiting the original URL
o Finding snapshots in any web archive (original URL)
o Finding a temporally appropriate snapshot in any web archive (original URL & snapshot datetime)
o Automatically accessing a temporally appropriate snapshot in any web archive (Memento, original URL & snapshot datetime)
http://robustlinks.mementoweb.org/spec/
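The decoration above can be produced and read back mechanically. The sketch below generates a decorated anchor and then extracts the attributes again, as a script that makes the decorations actionable would need to. The attribute names come from the slide; the helper names, the link text, and the use of an ISO 8601 date string for data-versiondate are assumptions.

```python
from html import escape
from html.parser import HTMLParser
from typing import Dict, List


def robust_link(snapshot_url: str, original_url: str, version_date: str, text: str) -> str:
    """Render an anchor to a snapshot, decorated with original URL and version date."""
    return '<a href="{}" data-originalurl="{}" data-versiondate="{}">{}</a>'.format(
        escape(snapshot_url, quote=True),
        escape(original_url, quote=True),
        escape(version_date, quote=True),  # assumed to be an ISO 8601 date string
        escape(text),
    )


class RobustLinkParser(HTMLParser):
    """Collect the attributes of every anchor that carries the decoration."""

    def __init__(self) -> None:
        super().__init__()
        self.links: List[Dict[str, str]] = []

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        if tag == "a" and "data-originalurl" in attributes:
            self.links.append(attributes)


html_fragment = robust_link(
    "http://web.archive.bibalex.org/web/19990208021257/http://www.vogin.nl/",
    "http://www.vogin.nl/",
    "1999-02-08",
    "vogin.nl in 1999",
)
parser = RobustLinkParser()
parser.feed(html_fragment)
print(parser.links[0]["data-originalurl"], parser.links[0]["data-versiondate"])
```

With both attributes available, a client can fall back to the original URL or ask any Memento-compliant archive for a snapshot near the version date when the linked archive is unavailable.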
Robust  Links:  Link  Decoration  in  Action
Van de Sompel, H. & Nelson, M.L. (2015) Reminiscing about 15 years of interoperability efforts. In: D-Lib Magazine. https://doi.org/10.1045/november2015-vandesompel
JavaScript makes the link decorations actionable

 
Ballia Escorts Service Girl ^ 9332606886, WhatsApp Anytime Ballia
Ballia Escorts Service Girl ^ 9332606886, WhatsApp Anytime BalliaBallia Escorts Service Girl ^ 9332606886, WhatsApp Anytime Ballia
Ballia Escorts Service Girl ^ 9332606886, WhatsApp Anytime Ballia
 
一比一原版(Offer)康考迪亚大学毕业证学位证靠谱定制
一比一原版(Offer)康考迪亚大学毕业证学位证靠谱定制一比一原版(Offer)康考迪亚大学毕业证学位证靠谱定制
一比一原版(Offer)康考迪亚大学毕业证学位证靠谱定制
 
Real Men Wear Diapers T Shirts sweatshirt
Real Men Wear Diapers T Shirts sweatshirtReal Men Wear Diapers T Shirts sweatshirt
Real Men Wear Diapers T Shirts sweatshirt
 
Trump Diapers Over Dems t shirts Sweatshirt
Trump Diapers Over Dems t shirts SweatshirtTrump Diapers Over Dems t shirts Sweatshirt
Trump Diapers Over Dems t shirts Sweatshirt
 
Nagercoil Escorts Service Girl ^ 9332606886, WhatsApp Anytime Nagercoil
Nagercoil Escorts Service Girl ^ 9332606886, WhatsApp Anytime NagercoilNagercoil Escorts Service Girl ^ 9332606886, WhatsApp Anytime Nagercoil
Nagercoil Escorts Service Girl ^ 9332606886, WhatsApp Anytime Nagercoil
 
Indian Escort in Abu DHabi 0508644382 Abu Dhabi Escorts
Indian Escort in Abu DHabi 0508644382 Abu Dhabi EscortsIndian Escort in Abu DHabi 0508644382 Abu Dhabi Escorts
Indian Escort in Abu DHabi 0508644382 Abu Dhabi Escorts
 
20240507 QFM013 Machine Intelligence Reading List April 2024.pdf
20240507 QFM013 Machine Intelligence Reading List April 2024.pdf20240507 QFM013 Machine Intelligence Reading List April 2024.pdf
20240507 QFM013 Machine Intelligence Reading List April 2024.pdf
 
Russian Call girls in Abu Dhabi 0508644382 Abu Dhabi Call girls
Russian Call girls in Abu Dhabi 0508644382 Abu Dhabi Call girlsRussian Call girls in Abu Dhabi 0508644382 Abu Dhabi Call girls
Russian Call girls in Abu Dhabi 0508644382 Abu Dhabi Call girls
 

En toen was er niets meer ....

  • 1. Herbert Van de Sompel, VOGIN-IP, Amsterdam, the Netherlands, March 9, 2017. Herbert Van de Sompel, LANL & DANS, @hvdsomp. En toen was er niets meer … (And then there was nothing left …)
  • 2. The Web
  • 3. The Web Evolves
  • 4. Yet, the Web Exists in a Perpetual Now
  • 5. Traces of the Past Web Exist
      o Content Management Systems
      o Web Archives
      o Transactional archives
      o Search engine caches
      o …
  • 6. But Past and Current Web(s) are Parallel Universes
  • 7. The Memento Protocol Integrates the Current and Past Web. http://mementoweb.org/guide/rfc/
  • 8. Original Resource and Mementos
  • 9. Bridge from Present to Past
  • 10. Bridge from Present to Past
  • 11. Bridge from Past to Present
  • 12. Memento: Access Versions via the Original URI and a Datetime. Screenshot labels: Today; Select Date; Mar 9 1999; Feb 8 1999; Bibliotheca Alexandrina Web Archive
  • 13. vogin.nl in 1999. http://web.archive.bibalex.org/web/19990208021257/http://www.vogin.nl/
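The idea on slides 12 and 13, reaching a past version through the original URI plus a datetime, can be sketched in a few lines. The sketch below makes no network request. It only splits the capture URI shown on slide 13 into its datetime and original URI, and builds the `Accept-Datetime` request header that the Memento protocol (RFC 7089) uses for datetime negotiation. The function names are illustrative, and treating the 14-digit timestamp as UTC is an assumption.

```python
import re
from datetime import datetime, timezone
from email.utils import format_datetime


def split_memento_uri(uri_m):
    """Split a Wayback-style capture URI into (capture datetime, original URI)."""
    match = re.search(r"/web/(\d{14})/(.+)$", uri_m)
    if match is None:
        raise ValueError("not a Wayback-style capture URI")
    # Assumption: the 14-digit timestamp is expressed in UTC.
    captured = datetime.strptime(match.group(1), "%Y%m%d%H%M%S")
    return captured.replace(tzinfo=timezone.utc), match.group(2)


def accept_datetime_header(when):
    """Build the Accept-Datetime request header (HTTP-date format, GMT)."""
    as_utc = when.astimezone(timezone.utc)
    return {"Accept-Datetime": format_datetime(as_utc, usegmt=True)}


captured, original = split_memento_uri(
    "http://web.archive.bibalex.org/web/19990208021257/http://www.vogin.nl/"
)
headers = accept_datetime_header(captured)
print(original)
print(headers["Accept-Datetime"])
```

A client would send these headers with a request for the original URI (or for a TimeGate for it) and follow the response to the temporally closest capture.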
  • 14. Memento for Chrome. http://bit.ly/memento-for-chome
  • 15. Hyperlinks. Eric Sieverts (2017) https://vogin-ip-lezing.net/2017/01/17/linkrot-linkroest-en-webarchieven/
  • 16. Hyperlinks in Theory
  • 17. Hyperlinks in Reality
  • 18. Hyperlinks in Reality
  • 19. Link Rot
  • 20. Link Rot. http://404-resto.com/typo3temp/pics/7580ea80fa.jpg
  • 21. Hyperlinks in Reality
  • 22. Content Drift
  • 23. Content Drift
  • 24. Content Drift. http://icecube.wisc.edu/ on May 8 2009 (left) and August 27 2009 (right)
  • 25. Content Drift. http://dl00.org in 2000, 2004, 2005, 2008
  • 26. No Content Drift. http://www.ifa.hawaii.edu/~cowie/k_table.html on June 9 1997 (left) and March 2016 (right)
  • 27. The Web, All Hyperlinks Subject to Link Rot, Content Drift
  • 28. The Web, All Hyperlinks Subject to Reference Rot
      • Reference Rot hinders our ability to follow links as they were intended when they were put in place:
          o Link rot: A link stops working altogether
          o Content drift: The linked content changes over time and may eventually no longer be representative of the content that was originally linked
  • 29. Creating Pockets of Persistence
      • How to maintain the integrity of links?
      • This challenge exists for the entire web. Some communities with well-managed collections care about addressing it because they consider it a Quality of Service issue:
          o Scholarly communication
          o Cultural heritage
          o Legal publications
          o Government communication
          o Journalism
          o Wikipedia
          o …
      • What can these communities do to create Pockets of Persistence?
  • 30. A Managed Collection Desires Reliable Outlinks
  • 31. Links to another Managed Collection
  • 32. Links to Web at Large Resources
  • 33. Exploring Link Rot & Content Drift
  • 34. Preamble 2 - Hiberlink Study of Reference Rot in STM Articles. PMC articles published 1997-2012:
      • Articles (PMC): total 479,194; with links to articles 240,857; with links to web-at-large resources 156,160
      • Links (PMC): to articles 744,678; to web-at-large resources 480,853
  • 35. Number of Articles & Links - PMC. Martin Klein, Herbert Van de Sompel, et al. (2014) Scholarly context not found. In: PLOS ONE https://doi.org/10.1371/journal.pone.0115253
  • 36. Links to Articles & to Web At Large Resources - PMC. Martin Klein, Herbert Van de Sompel, et al. (2014) Scholarly context not found. In: PLOS ONE https://doi.org/10.1371/journal.pone.0115253
  • 37. Exploring Link Rot & Content Drift
  • 38. Link Rot Occurs when B Moves to C
  • 39. Introduce PID(B)
  • 40. Link to PID(B); HTTP Redirect from PID(B) to B
  • 41. When B Moves to C: HTTP Redirect from PID(B) to C
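The PID mechanism of slides 39 to 41 can be illustrated with a toy resolver: links point at PID(B), and the resolver answers with a redirect to wherever the resource currently lives. This is a minimal sketch, not how any particular PID system is implemented. The class name, the `pid:B` identifier, the `example.org` URLs and the choice of status code 302 are all assumptions made for illustration.

```python
class PidResolver:
    """Toy resolver that maps a persistent identifier to a current location."""

    def __init__(self):
        self._locations = {}

    def register(self, pid, location):
        """Record, or update, the current location of the identified resource."""
        self._locations[pid] = location

    def resolve(self, pid):
        """Return an (HTTP status, Location) pair, mimicking an HTTP redirect."""
        if pid not in self._locations:
            return 404, None
        return 302, self._locations[pid]


resolver = PidResolver()
resolver.register("pid:B", "http://example.org/B")
before_move = resolver.resolve("pid:B")

# B moves to C: only the resolver entry changes, links to the PID stay valid.
resolver.register("pid:B", "http://example.org/C")
after_move = resolver.resolve("pid:B")
print(before_move, after_move)
```

The point of the design is that the link itself never changes; only the custodian's mapping does.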
  • 42. Core assumption in the PID solution: PIDs will be used to establish links. But are they? Herbert Van de Sompel, Martin Klein, and Shawn Jones (2016) Persistent URIs Must Be Used to Be Persistent. In: WWW2016. http://arxiv.org/1602.09102
  • 43. A Disconcerting Observation
      • When classifying links extracted from PMC as linking to articles, we assumed that filtering on http://dx.doi.org/* would do the trick
      • But we found a lot of e.g. http://link.springer.com/article/*
      • For example:
          o http://link.springer.com/article/10.1007%2Fs00799-014-018-0
      • Instead of:
          o http://dx.doi.org/10.1007/s00799-014-0108-0
      • We used CrossRef's Reverse Domain Lookup to classify these extracted links as linking to articles
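The classification problem on slide 43 can be illustrated with a simple heuristic. Note that this is not the method used in the study, which relied on CrossRef's Reverse Domain Lookup. The sketch below instead looks for a DOI-shaped string in the URL path after percent-decoding, which is an assumption that only works for publisher URLs that embed the DOI. The function names and category labels are hypothetical, and the Springer URL in the example is the DOI from the slide's dx.doi.org example, percent-encoded.

```python
import re
from urllib.parse import unquote, urlsplit

# Rough DOI shape: "10.", a registrant code, a slash, then a suffix.
DOI_PATTERN = re.compile(r"10\.\d{4,9}/\S+")


def extract_doi(url):
    """Return a DOI embedded in the URL path, or None when none is found."""
    path = unquote(urlsplit(url).path)
    match = DOI_PATTERN.search(path)
    return match.group(0) if match else None


def classify(url):
    """Label a link as a PID link, a publisher article URL, or web at large."""
    host = urlsplit(url).netloc.lower()
    if host in ("dx.doi.org", "doi.org"):
        return "article (PID link)"
    if extract_doi(url) is not None:
        return "article (publisher URL)"
    return "web at large"


publisher_url = "http://link.springer.com/article/10.1007%2Fs00799-014-0108-0"
print(extract_doi(publisher_url))
print(classify(publisher_url))
print(classify("http://dx.doi.org/10.1007/s00799-014-0108-0"))
print(classify("http://icecube.wisc.edu/"))
```

Publisher URLs that do not carry the DOI in their path would be misclassified as web at large, which is why a lookup service is the more reliable route.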
  • 44. URI References - PMC. Herbert Van de Sompel, Martin Klein, and Shawn Jones (2016) Persistent URIs Must Be Used to Be Persistent. In: WWW2016. http://arxiv.org/1602.09102
  • 45. A Proposal to Get PIDs Used: Signposting. http://signposting.org (Cartoon by Patrick Hochstenbach)
  • 46. Signposting: HTTP Link with identifier Relation Type. http://signposting.org/identifier/
  • 47. Signposting: HTTP Link with identifier Relation Type. http://signposting.org/identifier/
  • 48. Signposting: Use HTTP Link with identifier Relation Type
        curl -I http://www.dlib.org/dlib/november15/vandesompel/11vandesompel.html
        HTTP/1.1 200 OK
        Date: Wed, 26 Oct 2016 12:36:37 GMT
        Server: Apache/2.2.15 (CentOS)
        Last-Modified: Thu, 19 Nov 2015 14:50:19 GMT
        ETag: "205a5e-f5ef-524e5e0ab80c0"
        Accept-Ranges: bytes
        Content-Length: 62959
        Content-Type: text/html; charset=UTF-8
        Link: <http://doi.org/10.1045/november2015-vandesompel> ; rel="identifier"
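A client that wants to honour the Signposting hint on slide 48 has to pull the `identifier` target out of the `Link` response header. The following is a simplified sketch of that step, assuming a header shaped like the one in the curl output. The function name is illustrative, and the parsing is deliberately naive: it does not handle commas or semicolons inside quoted parameter values.

```python
import re


def links_with_rel(link_header, rel):
    """Return target URIs from a Link header whose rel value includes `rel`."""
    targets = []
    # Each link value is "<uri>" followed by zero or more ";param" parts.
    for match in re.finditer(r"<([^>]*)>\s*((?:;\s*[^;,]+)*)", link_header):
        uri, params = match.group(1), match.group(2)
        rel_match = re.search(r'\brel\s*=\s*"?([^";]+)"?', params)
        # A rel attribute may list several space-separated relation types.
        if rel_match and rel in rel_match.group(1).split():
            targets.append(uri)
    return targets


header = '<http://doi.org/10.1045/november2015-vandesompel> ; rel="identifier"'
print(links_with_rel(header, "identifier"))
```

With the target in hand, a reference manager or bookmarking tool could store the persistent identifier instead of the address-bar URL.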
  • 49. PID Alternative - When B Moves to C: HTTP Redirect from B to C
  • 50. PID Alternative - When B Moves to C: HTTP Redirect from B to C
      • Custodian of C needs to hold on to domain of B
      • Custodian of C needs to establish redirection patterns, often rather simple rules
      • No problem with establishing links to PID(B); the URI in the browser address bar (initially B, later C) is just fine
  • 51. Exploring Link Rot & Content Drift
  • 52. Content Drift Occurs when B Changes over Time
  • 53. Content Drift Occurs when B Changes over Time
      • Was not really considered an issue because:
          o the objects that receive PIDs were typically static, e.g. scientific papers
          o when a (substantially) new version of an object is published, a new PID is assigned
      • But:
          o PID links (typically) lead to landing pages, not the identified objects
          o landing pages are increasingly rich and aggregate comments, discussion, annotations; they do change over time
  • 54. Content Drift Occurs when B Changes over Time
  • 55. Custodian of B Takes Snapshots of B as it Evolves over Time
  • 56. Custodian of B Ensures Snapshots of B as it Evolves over Time
      • This does not happen for PID-identified objects, AFAIK
      • Version Control Systems (e.g. Wikipedia) hold on to all versions; snapshots are local
      • Pro-active archiving solutions for web servers that create snapshots when e.g. new content is published/visited or at regular intervals:
          o on-demand archiving of a web server, cf. archiefweb.eu, archive-it.org
          o self-archiving web server, cf. SiteStory
      • How to access the snapshots of B? Memento!
  • 57. SiteStory Transactional Archive & Memento. https://mementoweb.github.io/SiteStory/
  • 58. SiteStory, Wikipedia, Web Archive, Memento in Action. http://lanlsource.lanl.gov/hello
  • 59. Exploring Link Rot & Content Drift
  • 60. Scholarly Context Not Found. Martin Klein, Herbert Van de Sompel, et al. (2014) Scholarly context not found. In: PLOS ONE https://doi.org/10.1371/journal.pone.0115253
  • 61. Link Rot - PMC. Martin Klein, Herbert Van de Sompel, et al. (2014) Scholarly context not found. In: PLOS ONE https://doi.org/10.1371/journal.pone.0115253
  • 62. Exploring Link Rot & Content Drift
  • 63. Scholarly Context Adrift. Shawn Jones, Herbert Van de Sompel, et al. (2016) Scholarly context adrift. In: PLOS ONE https://doi.org/10.1371/journal.pone.0167475
  • 64. How to Assess Content Drift? Shawn Jones, Herbert Van de Sompel, et al. (2016) Scholarly context adrift. In: PLOS ONE https://doi.org/10.1371/journal.pone.0167475
  • 65. Step 1: Find Pre/Post Mementos
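Step 1 can be sketched as a selection over the capture datetimes known for a referenced URI. The slide gives only the step's title, so the reading used here is an assumption: the "Pre" Memento is taken to be the latest capture at or before a reference date (for instance the publication date of the citing article) and the "Post" Memento the earliest capture after it. The function name and the reference date are hypothetical; the two capture dates are the ones shown for icecube.wisc.edu on slide 24.

```python
from datetime import datetime


def pre_post_mementos(capture_datetimes, reference_date):
    """Return (latest capture at or before, earliest capture after) a date.

    Either element is None when no capture exists on that side.
    """
    ordered = sorted(capture_datetimes)
    pre = [d for d in ordered if d <= reference_date]
    post = [d for d in ordered if d > reference_date]
    return (pre[-1] if pre else None, post[0] if post else None)


captures = [datetime(2009, 8, 27), datetime(2009, 5, 8)]
reference = datetime(2009, 6, 1)  # hypothetical reference date
pre, post = pre_post_mementos(captures, reference)
print(pre, post)
```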
  • 66. Step 2: Select Representative Mementos
  • 67. Text Similarity Measures
      • Compute aggregate text similarity scores (values between 0 and 100) for:
          o Simhash
          o Jaccard
          o Sørensen-Dice
          o Cosine
      • If the aggregate score is 100, we decide that the Pre/Post Mementos are representative
      • We find 313K URI references with representative Mementos
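Three of the measures named on slide 67 are easy to sketch on whitespace-separated tokens, each scaled to the 0 to 100 range. This is an illustration under stated assumptions rather than the study's implementation: Simhash is omitted, the tokenisation is naive, and "aggregate score is 100" is interpreted here as every computed measure reaching 100. All function names are illustrative.

```python
import math
from collections import Counter


def tokens(text):
    """Lower-case the text and split it on whitespace."""
    return text.lower().split()


def jaccard(a, b):
    """Jaccard similarity of the two token sets, scaled to 0..100."""
    sa, sb = set(tokens(a)), set(tokens(b))
    if not sa and not sb:
        return 100.0
    return 100.0 * len(sa & sb) / len(sa | sb)


def sorensen_dice(a, b):
    """Sørensen-Dice coefficient of the two token sets, scaled to 0..100."""
    sa, sb = set(tokens(a)), set(tokens(b))
    if not sa and not sb:
        return 100.0
    return 100.0 * 2 * len(sa & sb) / (len(sa) + len(sb))


def cosine(a, b):
    """Cosine similarity of the term-frequency vectors, scaled to 0..100."""
    ca, cb = Counter(tokens(a)), Counter(tokens(b))
    dot = sum(ca[t] * cb[t] for t in ca)
    norm = math.sqrt(sum(v * v for v in ca.values()) * sum(v * v for v in cb.values()))
    return 100.0 * dot / norm if norm else 0.0


def representative(pre_text, post_text):
    """Assumed reading of the slide: all measures must score 100."""
    measures = (jaccard, sorensen_dice, cosine)
    return all(m(pre_text, post_text) >= 100.0 - 1e-9 for m in measures)


print(jaccard("a b c", "a b d"))
print(representative("The Web evolves", "the web  evolves"))
```

Requiring a perfect score is a strict test: it accepts a Pre/Post pair only when the text did not change between the two captures.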
  • 68. URI References without Representative Mementos - PMC. Shawn Jones, Herbert Van de Sompel, et al. (2016) Scholarly context adrift. In: PLOS ONE https://doi.org/10.1371/journal.pone.0167475
  • 69. Step 3: Dereference Live Web Version of URI
  • 70. Step 4: Representative Memento vs. Live Version
  • 71. Content Drift - PMC. Shawn Jones, Herbert Van de Sompel, et al. (2016) Scholarly context adrift. In: PLOS ONE https://doi.org/10.1371/journal.pone.0167475
  • 72. Exploring Link Rot & Content Drift
  • 73. Uncertainty Regarding the Future of B when A Links to It
  • 74. Custodian of A Takes a Snapshot of B when Linking to It
  • 75. Taking a Snapshot of B: Automation is Key
      • Web archive APIs for on-demand archiving
          o perma.cc, Internet Archive, archive.is, webcitation
      • Amber for WordPress & Drupal archives resources linked in a page
          o http://amberlink.org/
      • Hiberlink's experimental Zotero extension archives bookmarked URLs
          o http://hiberlink.org/zotero.html
      • Hiberlink's experimental HiberActive archives all URLs referenced in a newly submitted paper
          o https://www.slideshare.net/martinklein0815/hiberactive
  • 76. Linking to Snapshot of B = Potentially Creating a Rotten Link
      • Existing practice for linking to snapshots: <a href="URL of snapshot of B">
      • Problems with existing practice:
          o Impossible to visit the original URI, if desired
          o Requires the permanent existence/uptime of the archive that holds the snapshot: one link rot problem replaced by another
      • http://robustlinks.mementoweb.org/about/
  • 77. Permanent Existence/Uptime of Archives? Capture of http://webcitation.org dated July 17 2013. https://archive.today/eAETp
  • 78. Permanent Existence/Uptime of Archives? Remnant of discontinued web archive http://mummify.it captured on February 14 2014. https://web.archive.org/web/20140214233752/https://www.mummify.it/
  • 79. Permanent Existence/Uptime of Archives? http://www.themoscowtimes.com/news/article/russia-bans-wayback-machine-internet-archive-over-islamic-state-video/510074.html
  • 80. Permanent Existence/Uptime of Archives? http://web.archive.org/web/20121101043952/http://vogin.nl on March 6 2017 at 15:59 CET
  • 81. Link to Snapshot of B and Decorate the Link
      • Desired practice for linking to captures is to decorate the link so it provides a variety of options: <a href="URL of snapshot of B" data-originalurl="B" data-versiondate="datetime of snapshot of B">
      • Supports:
          o Revisiting the original URL
          o Finding snapshots in any web archive (original URL)
          o Finding a temporally appropriate snapshot in any web archive (original URL & snapshot datetime)
          o Automatically accessing a temporally appropriate snapshot in any web archive (Memento, original URL & snapshot datetime)
      • http://robustlinks.mementoweb.org/spec/
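The link decoration on slide 81 is straightforward to generate when a page is rendered. The helper below is a minimal sketch, with a hypothetical function name, that emits an anchor carrying the snapshot URL in `href` plus the `data-originalurl` and `data-versiondate` attributes named on the slide. The example reuses the capture URL from slide 80; writing the version date as an ISO 8601 date derived from that URL's timestamp is an assumption.

```python
from html import escape


def robust_link(snapshot_url, original_url, version_date, text):
    """Render an anchor decorated with the original URL and snapshot date."""
    return '<a href="{}" data-originalurl="{}" data-versiondate="{}">{}</a>'.format(
        escape(snapshot_url, quote=True),
        escape(original_url, quote=True),
        escape(version_date, quote=True),
        escape(text),
    )


anchor = robust_link(
    "http://web.archive.org/web/20121101043952/http://vogin.nl",
    "http://vogin.nl",
    "2012-11-01",  # assumed ISO 8601 date, taken from the capture timestamp
    "VOGIN in 2012",
)
print(anchor)
```

Because the original URL and the date travel with the link, a reader or script can still look for a capture in another archive if the one in `href` disappears.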
  • 82. Robust Links: Link Decoration in Action. JavaScript makes the link decorations actionable. Van de Sompel, H. & Nelson, M.L. (2015) Reminiscing about 15 years of interoperability efforts. In: D-Lib Magazine. https://doi.org/10.1045/november2015-vandesompel
  • 83. Herbert Van de Sompel, LANL & DANS, @hvdsomp. En toen was er niets meer …