TimeMaps: Metadata for Memento


                                  Herbert Van de Sompel
                                   Robert Sanderson
                                    Michael L. Nelson
                                   Lyudmila Balakireva
                                     Scott Ainsworth
                                     Harihar Shankar


                              http://www.mementoweb.org/


                                 Memento is partially funded by the
                                      Library of Congress




       TimeMaps: Metadata for Memento
   GSLIS Metadata Group, UIUC, 14th July 2010
Memento wants to make Navigating the Web’s Past Easy


•    Problem Statement

•    Memento Solution
       •  Navigation not Search
        •    API for Web Archives

•    Memento Ontology for TimeMaps



                       http://www.mementoweb.org/
             http://groups.google.com/group/memento-dev
Web Resources have Different Representations over Time




Thankfully Archived Representations Exist




3 Issues with Current Access to Archives

1.  Access is via a new URI, unknown to the user.

2.  People do not like to search for archived resources, and there is no
    automated method

3.  Navigation in the past is inconsistent:
      1.  Stuck in single, necessarily incomplete archive
      2.  Or if not rewritten, URIs lead back to the present




            Comment on Popular Science article:     http://bit.ly/bWr5gP


1. Representations Archived at a Different URI




Sep 11 2001, 20:36:10 UTC
http://web.archive.org/web/20010911203610/http://www.cnn.com/
archived resource for http://cnn.com

Dec 20 2001, 4:51:00 UTC
http://en.wikipedia.org/w/index.php?title=September_11_attacks&oldid=282333
archived resource for http://en.wikipedia.org/wiki/September_11_attacks
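
The first archived URI above carries its capture time as a 14-digit path segment. A minimal sketch of splitting such a URI apart follows; it assumes the segment is YYYYMMDDhhmmss in UTC, as the slide's pairing of URI and caption suggests, and the function name `split_archive_uri` is hypothetical.

```python
import re
from datetime import datetime, timezone

# Assumption: web.archive.org URIs look like /web/<14 digits>/<original URI>,
# with the digits read as YYYYMMDDhhmmss in UTC.
_ARCHIVE_URI = re.compile(r"^http://web\.archive\.org/web/(\d{14})/(.+)$")


def split_archive_uri(uri):
    """Return (capture datetime, original URI) for a Wayback-style URI."""
    match = _ARCHIVE_URI.match(uri)
    if match is None:
        raise ValueError(f"not a recognised archive URI: {uri}")
    stamp, original = match.groups()
    captured = datetime.strptime(stamp, "%Y%m%d%H%M%S").replace(tzinfo=timezone.utc)
    return captured, original


captured, original = split_archive_uri(
    "http://web.archive.org/web/20010911203610/http://www.cnn.com/"
)
print(captured.isoformat(), original)
```

The point of the slide is that a user who only knows http://cnn.com cannot guess this URI; the sketch only shows what is encoded in it once it is known.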

2. Searching is Cumbersome




http://web.archive.org/web/*/http://cnn.com/

http://en.wikipedia.org/w/index.php?title=September_11_attacks&action=history


3. Inconsistent Navigation (Archives Incomplete)




    SPACE




Sep 11 2001, 20:36:10 UTC
http://web.archive.org/web/20010911203610/http://www.cnn.com/
archived resource for http://cnn.com

Sep 11 2001, 21:38:55 UTC
http://web.archive.org/web/20010911213855/www.cnn.com/TECH/space/


3. Inconsistent Navigation (Can't Stay in Past)




                                  Pentagon




Dec 20 2001, 4:51:00 UTC
http://en.wikipedia.org/w/index.php?title=September_11_attacks&oldid=282333
archived resource for http://en.wikipedia.org/wiki/September_11_attacks

current
http://en.wikipedia.org/wiki/The_Pentagon


Past and Current Web are Not Integrated




The Web without a Time Dimension




Archived versions of a resource must be accessed through a different URI than its current version

The Web with Time Dimension added by Memento




Memento uses the URI of the current version to access archived versions, but qualifies it
          with a datetime, and magically arrives at the correct location.

The Memento Solution



There are two components to the Memento Solution:

•    Component 1: Navigation to an archived resource
     via its original resource, by leveraging content
     negotiation.

•    Component 2: A discovery API for archives that
     enables retrieving a list of all archived versions of a
     resource for a given URI.



Content Negotiation in Time

•    Many systems support content negotiation for file format
      o  Your client by default asks for HTML and gets HTML

      o  But it could get PDF via the same URI



•    Memento proposes a new dimension for content negotiation: Time
      o  Your client by default asks for the current time, and gets it

      o  But it could get an older version via the same URI



•    Can be accomplished with only one new HTTP header in each
     direction:

      o    Accept-Datetime             Request for a particular timestamp
      o    Content-Datetime            The returned content’s timestamp

      o    These mirror the existing header pairs for format (Accept / Content-Type) and language (Accept-Language / Content-Language)
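
A minimal sketch of the two headers in use, assuming the values are ordinary HTTP dates in GMT as in the appendix examples. The helper names `accept_datetime_headers` and `content_datetime` are hypothetical, and nothing is sent over the network.

```python
from datetime import datetime, timezone
from email.utils import format_datetime, parsedate_to_datetime


def accept_datetime_headers(when):
    """Build request headers asking for a resource as it was at `when`.

    `when` should be a timezone-aware datetime; it is converted to GMT.
    """
    stamp = format_datetime(when.astimezone(timezone.utc), usegmt=True)
    return {"Accept-Datetime": stamp}


def content_datetime(response_headers):
    """Read the Content-Datetime response header, or None if it is absent."""
    value = response_headers.get("Content-Datetime")
    return parsedate_to_datetime(value) if value else None


headers = accept_datetime_headers(datetime(2001, 9, 11, 20, 35, tzinfo=timezone.utc))
print(headers["Accept-Datetime"])  # Tue, 11 Sep 2001 20:35:00 GMT

returned = content_datetime({"Content-Datetime": "Tue, 11 Sep 2001 20:36:10 GMT"})
print(returned.isoformat())
```

The request names a wished-for time; the response states the time of what was actually returned, which will usually differ.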

Link Headers                  Conneg with TimeGate to Mementos

Original Resource             TimeGate                      Mementos

[Diagram, built up over several slides: the current www.cnn.com is the Original Resource; the Mementos are web.archive.org captures dated Apr 10 2001, 21:39:30 UTC; Aug 15 2004, 08:45:27 UTC; and Aug 15 2007, 19:21:58 UTC. Link headers connect the Original Resource to the TimeGate, and content negotiation (conneg) with the TimeGate leads to the Mementos. A final slide repeats the diagram for wikipedia.org.]

The Memento Solution




•    Component 2: A discovery API for archives that
     allows requesting a list of all archived versions held
     for a resource with a given URI.



Why an API?

•    Mementos for any given resource are distributed across archives.
     (What? Not just the Internet Archive?!)

•    In order to get a correct perspective of available Mementos, different
     archives need to be consulted.

•    Can be done by distributed search (slow), or by consulting an aggregator.

•    Aggregator and other services need machine readable description of
     archives' holdings to select appropriate Memento for request
        •  Closest in time
        •  Most reliable representation
        •  Fastest responding
        •  (etc)
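
A minimal sketch of the "closest in time" criterion, using the archive and datetime pairs shown on the slides that follow. The times are assumed to be UTC, and `closest_memento` is a hypothetical helper, not part of any Memento API.

```python
from datetime import datetime, timezone

# (archive, capture time) pairs from the slides; UTC is an assumption.
HOLDINGS = [
    ("WebCitation", datetime(2009, 5, 13, 12, 28, 39, tzinfo=timezone.utc)),
    ("Archive-It", datetime(2009, 5, 14, 1, 18, 11, tzinfo=timezone.utc)),
    ("BL Archive", datetime(2009, 5, 14, 7, 12, 45, tzinfo=timezone.utc)),
    ("Dracos", datetime(2009, 5, 14, 13, 0, 0, tzinfo=timezone.utc)),
    ("TNA", datetime(2009, 5, 14, 18, 21, 32, tzinfo=timezone.utc)),
]


def closest_memento(holdings, requested):
    """Pick the holding whose capture time is nearest to `requested`."""
    return min(holdings, key=lambda item: abs(item[1] - requested))


requested = datetime(2009, 5, 14, 6, 0, tzinfo=timezone.utc)
archive, captured = closest_memento(HOLDINGS, requested)
print(archive, captured.isoformat())
```

An aggregator could weigh the other criteria (reliability, response speed) in the same `key` function, but it can only do so if the archives describe their holdings in a machine-readable way.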



WebCitation    13 May 2009 12:28:39
Archive-It     14 May 2009 01:18:11
BL Archive     14 May 2009 07:12:45
Dracos         14 May 2009 13:00:00
TNA            14 May 2009 18:21:32

And no Internet Archive…

TimeMaps
•  At most basic: List of URIs of Mementos and their times
•  Expressed as Linked Data; a profile of OAI ORE Resource Maps
•  Link header from TimeGate and Memento




Basic ORE Model

Aggregation (Aggr) is a set of web resources (R-1 to R-3), described in RDF or
Atom by a Resource Map (ReM).




TimeBundles

Resources of Interest in Memento:
   •  Original Resource
   •  TimeGate
   •  Mementos




TimeGates

•  Period(s) that the TimeGate covers
•  Which resource is it a TimeGate for
•  Uses mem:TimeSpan, since a TimeGate can cover multiple distinct periods




Mementos

•  Time Period: valid for or observed over, number of observations
•  Metadata: size, format, etc (will come back to the "etc")
•  Which resource it is a Memento for
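
The properties listed for TimeGates and Mementos can be sketched as plain data structures. This is an illustration only: the class and field names below are hypothetical and are not terms from the Memento ontology; the sample values are taken from the appendix HTTP examples.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import List, Optional, Tuple


@dataclass
class Memento:
    uri: str
    memento_for: str                  # URI of the Original Resource
    observed_from: datetime           # start of the period it was observed over
    observed_until: datetime          # end of that period
    observations: int = 1             # number of observations in the period
    size: Optional[int] = None        # bytes
    media_type: Optional[str] = None  # format


@dataclass
class TimeGate:
    uri: str
    timegate_for: str                 # URI of the Original Resource
    # Several (start, end) pairs, since coverage may be discontinuous.
    periods: List[Tuple[datetime, datetime]] = field(default_factory=list)

    def covers(self, when):
        """True if `when` falls inside any period this TimeGate covers."""
        return any(start <= when <= end for start, end in self.periods)


gate = TimeGate(
    uri="http://web.archive.org/web/timegate/http://cnn.com",
    timegate_for="http://cnn.com/",
    periods=[(datetime(2000, 9, 15, 11, 28, 26, tzinfo=timezone.utc),
              datetime(2008, 7, 8, 9, 34, 33, tzinfo=timezone.utc))],
)
seen = datetime(2001, 9, 11, 20, 36, 10, tzinfo=timezone.utc)
memento = Memento(
    uri="http://web.archive.org/web/20010911203610/http://www.cnn.com",
    memento_for="http://cnn.com/",
    observed_from=seen,
    observed_until=seen,
    size=23364,
    media_type="text/html",
)
print(gate.covers(memento.observed_from))
```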




Serializations

•  RDF/XML
    •  Good for XML parsers

•  Turtle, N3 and related
    •  Good for graph parsers

•  RDFa
    •  Good for web browsers

•  Atom
    •  Good for alerting, feed readers etc (but still embeds RDF)

•  New: Link Header format
    •  Good for real-time applications
    •  Smaller file size (just the facts, ma'am)
    •  Easy to implement with existing link header parsers
    •  Servers need to produce this format anyway, so it offers a non-RDF way out
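
A minimal sketch of what a Link-header-style TimeMap serialization could look like, modelled on the Link headers in the appendix. The slides only show the `first-memento`, `last-memento`, `prev-memento` and `next-memento` relations; the plain `memento` relation used here is an assumption borrowed from the later Memento specification (RFC 7089), and `timemap_to_link_format` is a hypothetical helper.

```python
from datetime import datetime, timezone
from email.utils import format_datetime


def timemap_to_link_format(original, mementos):
    """Serialise (uri, datetime) pairs as comma-separated link values."""
    links = [f'<{original}>; rel="original"']
    for uri, when in sorted(mementos, key=lambda pair: pair[1]):
        stamp = format_datetime(when.astimezone(timezone.utc), usegmt=True)
        # rel="memento" is an assumption; see the note above.
        links.append(f'<{uri}>; rel="memento"; datetime="{stamp}"')
    return ",\n".join(links)


timemap = timemap_to_link_format(
    "http://cnn.com/",
    [
        ("http://web.archive.org/web/20080708093433/http://www.cnn.com",
         datetime(2008, 7, 8, 9, 34, 33, tzinfo=timezone.utc)),
        ("http://web.archive.org/web/20000915112826/http://www.cnn.com",
         datetime(2000, 9, 15, 11, 28, 26, tzinfo=timezone.utc)),
    ],
)
print(timemap)
```

Each entry is one line of plain text, which is what makes this format small and easy to produce without an RDF toolchain.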

Use Case: Aggregator using TimeMaps




Metadata Discussion Points

1.  What metadata is necessary to determine the most appropriate copy?

       •    Distance to requested time most important
       •    Quality of representation?
       •    Usage statistics for Original Resource? For Memento?
       •    User tagging of Memento for quality?
       •    Archive response speed?
       •    Need to know more information from user preferences?


2.  What other metadata is useful and available?

       •    Crawling archives have limited information
       •    CMS systems have much more
       •    User tags, comments, annotations
       •    Semantic information about content, eg title, author, subject
       •    Distribution of changes over time




Metadata Discussion Points

3.  What metadata is necessary for inter-archive synchronization?

       •    Deduplication information: digests, request headers
       •    "Significant Change" factors
       •    Crawler settings: respect no-cache, robots.txt etc
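
As one illustration of digest-based deduplication, the sketch below groups captures whose bodies are byte-identical. The choice of SHA-256, the `group_by_digest` helper, and the example.org URIs are all assumptions made for the example; the slides only name "digests" as candidate metadata.

```python
import hashlib
from collections import defaultdict


def group_by_digest(captures):
    """Group (memento URI, body bytes) pairs by the SHA-256 digest of the body."""
    groups = defaultdict(list)
    for uri, body in captures:
        groups[hashlib.sha256(body).hexdigest()].append(uri)
    return dict(groups)


# Hypothetical captures held by two different archives.
captures = [
    ("http://archive-a.example.org/20090514011811/page", b"<html>same</html>"),
    ("http://archive-b.example.org/20090514071245/page", b"<html>same</html>"),
    ("http://archive-b.example.org/20090514130000/page", b"<html>changed</html>"),
]
duplicates = [uris for uris in group_by_digest(captures).values() if len(uris) > 1]
print(duplicates)
```

Byte-identical digests miss pages that differ only in trivial ways (timestamps, ads), which is where the "Significant Change" factors on the slide come in.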


4.  What metadata can be generated by other services?

       •    Open World Model: Anyone can say anything about anything
       •    Technical metadata easy (MIX for images, etc)
       •    Time Series Analysis interesting (techtales.org)
       •    Machine Learning based approaches?




Thank You 

Rob Sanderson:
    •  azaroth42@gmail.com
    •  rsanderson@lanl.gov

This presentation:
    •    http://www.slideshare.net/azaroth42/xxx

Memento:
   •   http://www.mementoweb.org/
   •   http://groups.google.com/group/memento-dev

MementoFox:
   •   https://addons.mozilla.com/en-US/firefox/addon/100298
         aka: http://bit.ly/memfox



           Memento Enables Navigating the Past Web

Discussion Questions

1.  What metadata is necessary to determine the most appropriate copy?


2.  What other metadata is useful and available?


3.  What metadata is necessary for inter-archive synchronization?


4.  What metadata can be generated by other services?




Appendix: Memento HTTP Flow


    R = Original Resource, G = TimeGate, M = Memento, B = TimeBundle

    HEAD R, (Accept-Datetime)
        Link: G

    GET G, Accept-Datetime
        302 to M, Vary, TCN, Link: R, B, M

    GET M, (Accept-Datetime)
        200, Content-Datetime, Link: R, B, M

Memento HTTP Flow

                        HEAD R, (Accept-Datetime)


HEAD http://cnn.com/ HTTP/1.1
Host: cnn.com
Accept-Datetime: Tue, 11 Sep 2001 20:35:00 GMT
Connection: close

Memento HTTP Flow

                        Link: G (success response from URI-R)

HTTP/1.1 200 OK
Date: Thu, 21 Jan 2010 00:02:12 GMT
Server: Apache
Link: <http://web.archive.org/web/timegate/http://cnn.com>; rel="timegate"
Content-Length: 255
Connection: close
Content-Type: text/html; charset=iso-8859-1
Memento HTTP Flow


                          GET G, Accept-Datetime


GET http://web.archive.org/web/timegate/http://cnn.com HTTP/1.1
Host: web.archive.org
Accept-Datetime: Tue, 11 Sep 2001 20:35:00 GMT
Connection: close
Memento HTTP Flow


                        302M, Vary, LinkR,B,M

HTTP/1.1 302 Found
Date: Thu, 21 Jan 2010 00:06:50 GMT
Server: Apache
TCN: choice
Vary: negotiate, accept-datetime
Location: http://web.archive.org/web/20010911203610/http://www.cnn.com
Link: <http://cnn.com/>; rel="original",
  <http://web.archive.org/web/timebundle/http://cnn.com/>; rel="timebundle",
  <http://web.archive.org/web/20000915112826/http://www.cnn.com>; rel="first-memento";
  datetime="Fri, 15 Sep 2000 11:28:26 GMT",
  <http://web.archive.org/web/20080708093433/http://www.cnn.com>; rel="last-memento";
  datetime="Tue, 08 Jul 2008 09:34:33 GMT",
  <http://web.archive.org/web/20010911203610/http://www.cnn.com>; rel="prev-memento";
  datetime="Tue, 11 Sep 2001 20:30:51 GMT",
  <http://web.archive.org/web/20010911203610/http://www.cnn.com>; rel="next-memento";
  datetime="Tue, 11 Sep 2001 20:47:33 GMT"
Content-Length: 0
Connection: close
Content-Type: text/plain; charset=UTF-8
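
A client has to pull the individual links back out of a Link header like the one above. A minimal parsing sketch follows, assuming straight ASCII quotes and a single relation type per link; `parse_link_header` is a hypothetical helper, and the sample value is a shortened version of the header on this slide.

```python
import re

# One link-value is <uri> followed by ;name=value parameters. Quoted values
# may contain commas (HTTP dates do), so a plain split(",") is not enough.
_LINK = re.compile(r'<([^>]*)>((?:\s*;\s*[\w-]+\s*=\s*(?:"[^"]*"|[^;,\s]+))*)')
_PARAM = re.compile(r';\s*([\w-]+)\s*=\s*(?:"([^"]*)"|([^;,\s]+))')


def parse_link_header(value):
    """Return {rel: {"uri": ..., other parameters}} for each link-value."""
    links = {}
    for uri, raw_params in _LINK.findall(value):
        params = {name: quoted or bare
                  for name, quoted, bare in _PARAM.findall(raw_params)}
        rel = params.pop("rel", None)
        if rel is not None:
            links[rel] = {"uri": uri, **params}
    return links


header = (
    '<http://cnn.com/>; rel="original", '
    '<http://web.archive.org/web/timebundle/http://cnn.com/>; rel="timebundle", '
    '<http://web.archive.org/web/20080708093433/http://www.cnn.com>; '
    'rel="last-memento"; datetime="Tue, 08 Jul 2008 09:34:33 GMT"'
)
links = parse_link_header(header)
print(links["last-memento"]["datetime"])
```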
Memento HTTP Flow


                          GET M, Accept-Datetime

GET http://web.archive.org/web/20010911203610/http://www.cnn.com HTTP/1.1
Host: web.archive.org
Accept-Datetime: Tue, 11 Sep 2001 20:35:00 GMT
Connection: close
Memento HTTP Flow


                  200, Content-Datetime, LinkR,B,M

HTTP/1.1 200 OK
Server: Apache-Coyote/1.1
X-Archive-Orig-Accept-Ranges: bytes
…
Content-Type: text/html;charset=utf-8
Content-Length: 23364
Date: Thu, 21 Jan 2010 00:09:40 GMT
Content-Datetime: Tue, 11 Sep 2001 20:36:10 GMT
Link: <http://cnn.com/>; rel="original",
  <http://web.archive.org/web/timebundle/http://cnn.com/>; rel="timebundle",
  <http://web.archive.org/web/20000915112826/http://www.cnn.com>; rel="first-memento";
  datetime="Fri, 15 Sep 2000 11:28:26 GMT",
  <http://web.archive.org/web/20080708093433/http://www.cnn.com>; rel="last-memento";
  datetime="Tue, 08 Jul 2008 09:34:33 GMT",
  <http://web.archive.org/web/20010911203610/http://www.cnn.com>; rel="prev-memento";
  datetime="Tue, 11 Sep 2001 20:30:51 GMT",
  <http://web.archive.org/web/20010911203610/http://www.cnn.com>; rel="next-memento";
  datetime="Tue, 11 Sep 2001 20:47:33 GMT"
Connection: close
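
The three request/response pairs of the flow can be tied together in a short client sketch. It is written against a hypothetical `send(method, uri, headers)` transport returning `(status, headers, body)`, so it runs here against canned responses copied from the slides rather than the live web; `resolve_memento` and `fake_send` are illustrative names, not an existing API.

```python
import re

_TIMEGATE = re.compile(r'<([^>]+)>\s*;\s*rel="timegate"')


def resolve_memento(original_uri, accept_datetime, send):
    """Follow R -> G -> M and return (memento URI, Content-Datetime, body)."""
    request_headers = {"Accept-Datetime": accept_datetime}

    # Step 1: HEAD R. The Link header of the response names the TimeGate G.
    _, headers, _ = send("HEAD", original_uri, request_headers)
    match = _TIMEGATE.search(headers.get("Link", ""))
    if match is None:
        raise LookupError("original resource advertises no TimeGate")
    timegate_uri = match.group(1)

    # Step 2: GET G with Accept-Datetime. Expect a 302 pointing at Memento M.
    status, headers, _ = send("GET", timegate_uri, request_headers)
    if status != 302 or "Location" not in headers:
        raise LookupError(f"TimeGate answered {status} without a redirect")
    memento_uri = headers["Location"]

    # Step 3: GET M. Content-Datetime says which moment was actually returned.
    _, headers, body = send("GET", memento_uri, request_headers)
    return memento_uri, headers.get("Content-Datetime"), body


def fake_send(method, uri, headers):
    """Stand-in transport with responses abridged from the slides."""
    canned = {
        "http://cnn.com/": (200, {
            "Link": '<http://web.archive.org/web/timegate/http://cnn.com>; rel="timegate"',
        }, b""),
        "http://web.archive.org/web/timegate/http://cnn.com": (302, {
            "Location": "http://web.archive.org/web/20010911203610/http://www.cnn.com",
        }, b""),
        "http://web.archive.org/web/20010911203610/http://www.cnn.com": (200, {
            "Content-Datetime": "Tue, 11 Sep 2001 20:36:10 GMT",
        }, b"<html>...</html>"),
    }
    return canned[uri]


memento_uri, returned_at, _ = resolve_memento(
    "http://cnn.com/", "Tue, 11 Sep 2001 20:35:00 GMT", fake_send
)
print(memento_uri, returned_at)
```

Injecting the transport keeps the sketch independent of any particular HTTP library; a real client would also need to handle original resources that send no `timegate` link at all.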

More Related Content

Viewers also liked

Transcending Silos: Shared Canvas Data Model for Digital Facsimiles
Transcending Silos: Shared Canvas Data Model for Digital FacsimilesTranscending Silos: Shared Canvas Data Model for Digital Facsimiles
Transcending Silos: Shared Canvas Data Model for Digital FacsimilesRobert Sanderson
 
SharedCanvas: A Collaborative Model for Medieval Manuscript Layout Dissemina...
SharedCanvas: A Collaborative Model for Medieval Manuscript Layout Dissemina...SharedCanvas: A Collaborative Model for Medieval Manuscript Layout Dissemina...
SharedCanvas: A Collaborative Model for Medieval Manuscript Layout Dissemina...Robert Sanderson
 
Erika Pricyla Cerino HernáNdez
Erika Pricyla Cerino HernáNdezErika Pricyla Cerino HernáNdez
Erika Pricyla Cerino HernáNdezguest1cc234
 
Dit Heb Je Nog Nooit Gezien
Dit Heb Je Nog Nooit GezienDit Heb Je Nog Nooit Gezien
Dit Heb Je Nog Nooit Gezienguest6964ce
 
NLLC 2011: Memento, Open Annotation, SharedCanvas
NLLC 2011: Memento, Open Annotation, SharedCanvasNLLC 2011: Memento, Open Annotation, SharedCanvas
NLLC 2011: Memento, Open Annotation, SharedCanvasRobert Sanderson
 
W3C Open Annotation: Status and Use Cases
W3C Open Annotation: Status and Use CasesW3C Open Annotation: Status and Use Cases
W3C Open Annotation: Status and Use CasesRobert Sanderson
 
NISO Annotation Meeting (San Francisco)
NISO Annotation Meeting (San Francisco)NISO Annotation Meeting (San Francisco)
NISO Annotation Meeting (San Francisco)Robert Sanderson
 
Making Web Annotations Persistent over Time
Making Web Annotations Persistent over TimeMaking Web Annotations Persistent over Time
Making Web Annotations Persistent over TimeRobert Sanderson
 
W3C Web Annotation WG Update (I Annotate 2016)
W3C Web Annotation WG Update (I Annotate 2016)W3C Web Annotation WG Update (I Annotate 2016)
W3C Web Annotation WG Update (I Annotate 2016)Robert Sanderson
 
IIIF Overview for Linked Data Exhibitions
IIIF Overview for Linked Data ExhibitionsIIIF Overview for Linked Data Exhibitions
IIIF Overview for Linked Data ExhibitionsRobert Sanderson
 
Annotating Scholarly Works - the W3C Open Annotation Model
Annotating Scholarly Works - the W3C Open Annotation ModelAnnotating Scholarly Works - the W3C Open Annotation Model
Annotating Scholarly Works - the W3C Open Annotation ModelRobert Sanderson
 
Linked Data Snowball, or Why We Need Reconciliation
Linked Data Snowball, or Why We Need ReconciliationLinked Data Snowball, or Why We Need Reconciliation
Linked Data Snowball, or Why We Need ReconciliationRobert Sanderson
 
Community Challenges for Practical Linked Open Data - Linked Pasts keynote
Community Challenges for Practical Linked Open Data - Linked Pasts keynoteCommunity Challenges for Practical Linked Open Data - Linked Pasts keynote
Community Challenges for Practical Linked Open Data - Linked Pasts keynoteRobert Sanderson
 

Viewers also liked (20)

Transcending Silos: Shared Canvas Data Model for Digital Facsimiles
Transcending Silos: Shared Canvas Data Model for Digital FacsimilesTranscending Silos: Shared Canvas Data Model for Digital Facsimiles
Transcending Silos: Shared Canvas Data Model for Digital Facsimiles
 
SharedCanvas: A Collaborative Model for Medieval Manuscript Layout Dissemina...
SharedCanvas: A Collaborative Model for Medieval Manuscript Layout Dissemina...SharedCanvas: A Collaborative Model for Medieval Manuscript Layout Dissemina...
SharedCanvas: A Collaborative Model for Medieval Manuscript Layout Dissemina...
 
Niso Annotation Webinar
Niso Annotation WebinarNiso Annotation Webinar
Niso Annotation Webinar
 
Erika Pricyla Cerino HernáNdez
Erika Pricyla Cerino HernáNdezErika Pricyla Cerino HernáNdez
Erika Pricyla Cerino HernáNdez
 
Dit Heb Je Nog Nooit Gezien
Dit Heb Je Nog Nooit GezienDit Heb Je Nog Nooit Gezien
Dit Heb Je Nog Nooit Gezien
 
NLLC 2011: Memento, Open Annotation, SharedCanvas
NLLC 2011: Memento, Open Annotation, SharedCanvasNLLC 2011: Memento, Open Annotation, SharedCanvas
NLLC 2011: Memento, Open Annotation, SharedCanvas
 
Python Web Interaction
Python Web InteractionPython Web Interaction
Python Web Interaction
 
W3C Open Annotation: Status and Use Cases
W3C Open Annotation: Status and Use CasesW3C Open Annotation: Status and Use Cases
W3C Open Annotation: Status and Use Cases
 
NISO Annotation Meeting (San Francisco)
NISO Annotation Meeting (San Francisco)NISO Annotation Meeting (San Francisco)
NISO Annotation Meeting (San Francisco)
 
Making Web Annotations Persistent over Time
Making Web Annotations Persistent over TimeMaking Web Annotations Persistent over Time
Making Web Annotations Persistent over Time
 
W3C Web Annotation WG Update (I Annotate 2016)
W3C Web Annotation WG Update (I Annotate 2016)W3C Web Annotation WG Update (I Annotate 2016)
W3C Web Annotation WG Update (I Annotate 2016)
 
IIIF Presentation API
IIIF Presentation API IIIF Presentation API
IIIF Presentation API
 
IIIF Overview for Linked Data Exhibitions
IIIF Overview for Linked Data ExhibitionsIIIF Overview for Linked Data Exhibitions
IIIF Overview for Linked Data Exhibitions
 
Annotating Scholarly Works - the W3C Open Annotation Model
Annotating Scholarly Works - the W3C Open Annotation ModelAnnotating Scholarly Works - the W3C Open Annotation Model
Annotating Scholarly Works - the W3C Open Annotation Model
 
Hemoptysis jack
Hemoptysis jackHemoptysis jack
Hemoptysis jack
 
Lactate by jack.
Lactate by jack.Lactate by jack.
Lactate by jack.
 
Pneumothorax ..jack
Pneumothorax ..jackPneumothorax ..jack
Pneumothorax ..jack
 
Linked Data Snowball, or Why We Need Reconciliation
Linked Data Snowball, or Why We Need ReconciliationLinked Data Snowball, or Why We Need Reconciliation
Linked Data Snowball, or Why We Need Reconciliation
 
Sepsis 3
Sepsis 3 Sepsis 3
Sepsis 3
 
Community Challenges for Practical Linked Open Data - Linked Pasts keynote
Community Challenges for Practical Linked Open Data - Linked Pasts keynoteCommunity Challenges for Practical Linked Open Data - Linked Pasts keynote
Community Challenges for Practical Linked Open Data - Linked Pasts keynote
 

More from Robert Sanderson

LUX - Cross Collections Cultural Heritage at Yale
LUX - Cross Collections Cultural Heritage at YaleLUX - Cross Collections Cultural Heritage at Yale
LUX - Cross Collections Cultural Heritage at YaleRobert Sanderson
 
Zoom as a Paradigm for Linked Open Usable Data
Zoom as a Paradigm for Linked Open Usable DataZoom as a Paradigm for Linked Open Usable Data
Zoom as a Paradigm for Linked Open Usable DataRobert Sanderson
 
Provenance and Uncertainty in Linked Art
Provenance and Uncertainty in Linked ArtProvenance and Uncertainty in Linked Art
Provenance and Uncertainty in Linked ArtRobert Sanderson
 
Data is our Product: Thoughts on LOD Sustainability
Data is our Product: Thoughts on LOD SustainabilityData is our Product: Thoughts on LOD Sustainability
Data is our Product: Thoughts on LOD SustainabilityRobert Sanderson
 
A Perspective on Wikidata: Ecosystems, Trust, and Usability
A Perspective on Wikidata: Ecosystems, Trust, and UsabilityA Perspective on Wikidata: Ecosystems, Trust, and Usability
A Perspective on Wikidata: Ecosystems, Trust, and UsabilityRobert Sanderson
 
Linked Art: Sustainable Cultural Knowledge through Linked Open Usable Data
Linked Art: Sustainable Cultural Knowledge through Linked Open Usable DataLinked Art: Sustainable Cultural Knowledge through Linked Open Usable Data
Linked Art: Sustainable Cultural Knowledge through Linked Open Usable DataRobert Sanderson
 
Illusions of Grandeur: Trust and Belief in Cultural Heritage Linked Open Data
Illusions of Grandeur: Trust and Belief in Cultural Heritage Linked Open DataIllusions of Grandeur: Trust and Belief in Cultural Heritage Linked Open Data
Illusions of Grandeur: Trust and Belief in Cultural Heritage Linked Open DataRobert Sanderson
 
Structural Metadata in RDF (IS575)
Structural Metadata in RDF (IS575)Structural Metadata in RDF (IS575)
Structural Metadata in RDF (IS575)Robert Sanderson
 
Sanderson CNI 2020 Keynote - Cultural Heritage Research Data Ecosystem
Sanderson CNI 2020 Keynote - Cultural Heritage Research Data EcosystemSanderson CNI 2020 Keynote - Cultural Heritage Research Data Ecosystem
Sanderson CNI 2020 Keynote - Cultural Heritage Research Data EcosystemRobert Sanderson
 
Tiers of Abstraction and Audience in Cultural Heritage Data Modeling
Tiers of Abstraction and Audience in Cultural Heritage Data ModelingTiers of Abstraction and Audience in Cultural Heritage Data Modeling
Tiers of Abstraction and Audience in Cultural Heritage Data ModelingRobert Sanderson
 
The Importance of being LOUD
The Importance of being LOUDThe Importance of being LOUD
The Importance of being LOUDRobert Sanderson
 
Introduction to Linked Art Model
Introduction to Linked Art ModelIntroduction to Linked Art Model
Introduction to Linked Art ModelRobert Sanderson
 
Standards and Communities: Connected People, Consistent Data, Usable Applicat...
Standards and Communities: Connected People, Consistent Data, Usable Applicat...Standards and Communities: Connected People, Consistent Data, Usable Applicat...
Standards and Communities: Connected People, Consistent Data, Usable Applicat...Robert Sanderson
 
Strong Opinions, Weakly Held
Strong Opinions, Weakly HeldStrong Opinions, Weakly Held
Strong Opinions, Weakly HeldRobert Sanderson
 
IIIF Discovery Walkthrough
IIIF Discovery WalkthroughIIIF Discovery Walkthrough
IIIF Discovery WalkthroughRobert Sanderson
 
Linked Art: An Art Museum Profile for CIDOC-CRM
Linked Art: An Art Museum Profile for CIDOC-CRMLinked Art: An Art Museum Profile for CIDOC-CRM
Linked Art: An Art Museum Profile for CIDOC-CRMRobert Sanderson
 
Euromed2018 Keynote: Usability over Completeness, Community over Committee
Euromed2018 Keynote: Usability over Completeness, Community over CommitteeEuromed2018 Keynote: Usability over Completeness, Community over Committee
Euromed2018 Keynote: Usability over Completeness, Community over CommitteeRobert Sanderson
 
Linked Art - Our Linked Open Usable Data Model
Linked Art - Our Linked Open Usable Data ModelLinked Art - Our Linked Open Usable Data Model
Linked Art - Our Linked Open Usable Data ModelRobert Sanderson
 
EuropeanaTech Keynote: Shout it out LOUD
EuropeanaTech Keynote: Shout it out LOUDEuropeanaTech Keynote: Shout it out LOUD
EuropeanaTech Keynote: Shout it out LOUDRobert Sanderson
 



TimeMaps: Metadata for Memento

  • 1. TimeMaps: Metadata for Memento
      Herbert Van de Sompel, Robert Sanderson, Michael L. Nelson, Lyudmila Balakireva, Scott Ainsworth, Harihar Shankar
      http://www.mementoweb.org/
      Memento is partially funded by the Library of Congress
      GSLIS Metadata Group, UIUC, 14th July 2010
  • 2. Memento wants to make Navigating the Web’s Past Easy
      •  Problem Statement
      •  Memento Solution
          •  Navigation not Search
          •  API for Web Archives
      •  Memento Ontology for TimeMaps
      http://www.mementoweb.org/
      http://groups.google.com/group/memento-dev
  • 3. Web Resources have Different Representations over Time
  • 4. Thankfully Archived Representations Exist
  • 5. 3 Issues with Current Access to Archives
      1.  Access is via a new URI, unknown to the user.
      2.  People do not like to search for archived resources, and there is no automated method.
      3.  Navigation in the past is inconsistent:
          1.  Stuck in a single, necessarily incomplete archive
          2.  Or if not rewritten, URIs lead back to the present
      Comment on Popular Science article: http://bit.ly/bWr5gP
  • 6. 1. Representations Archived at a Different URI
      •  Sep 11 2001, 20:36:10 UTC: http://web.archive.org/web/20010911203610/http://www.cnn.com/ (archived resource for http://cnn.com)
      •  Dec 20 2001, 4:51:00 UTC: http://en.wikipedia.org/w/index.php?title=September_11_attacks&oldid=282333 (archived resource for http://en.wikipedia.org/wiki/September_11_attacks)
  • 7. 2. Searching is Cumbersome
      •  http://web.archive.org/web/*/http://cnn.com/
      •  http://en.wikipedia.org/w/index.php?title=September_11_attacks&action=history
  • 8. 3. Inconsistent Navigation (Archives Incomplete)
      •  Sep 11 2001, 20:36:10 UTC: http://web.archive.org/web/20010911203610/http://www.cnn.com/ (archived resource for http://cnn.com)
      •  SPACE link, Sep 11 2001, 21:38:55 UTC: http://web.archive.org/web/20010911213855/www.cnn.com/TECH/space/
  • 9. 3. Inconsistent Navigation (Can't Stay in Past)
      •  Dec 20 2001, 4:51:00 UTC: http://en.wikipedia.org/w/index.php?title=September_11_attacks&oldid=282333 (archived resource for http://en.wikipedia.org/wiki/September_11_attacks)
      •  Pentagon link, current: http://en.wikipedia.org/wiki/The_Pentagon
  • 10. Past and Current Web are Not Integrated
  • 11. The Web without a Time Dimension
      Different URIs are needed to access the archived versions of a resource and its current version.
  • 12. The Web with Time Dimension added by Memento
      With Memento, use the URI of the current version to access archived versions, but qualify it with a datetime, and magically arrive at the correct location.
  • 13. The Memento Solution
      There are two components to the Memento Solution:
      •  Component 1: Navigation to an archived resource via its original resource, by leveraging content negotiation.
      •  Component 2: A discovery API for archives that enables retrieving a list of all archived versions of a resource for a given URI.
  • 14. Content Negotiation in Time
      •  Many systems support content negotiation for file format
          o  Your client by default asks for HTML and gets HTML
          o  But it could get PDF via the same URI
      •  Memento proposes a new dimension for content negotiation: Time
          o  Your client by default asks for the current time, and gets it
          o  But it could get an older version via the same URI
      •  Can be accomplished with only one new HTTP header in each direction:
          o  Accept-Datetime: request for a particular timestamp
          o  Content-Datetime: the returned content’s timestamp
          o  These exactly mirror existing headers for Format, Language, etc.
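A minimal sketch of the request side of datetime negotiation, assuming the header name Accept-Datetime and the HTTP date format shown in the appendix slides. The helper name `accept_datetime_headers` is hypothetical; only Python standard-library calls are used, and no request is actually sent.

```python
from datetime import datetime, timezone
from email.utils import format_datetime


def accept_datetime_headers(when: datetime) -> dict:
    """Return request headers asking for the state of a resource at `when`."""
    # HTTP dates are expressed in GMT, so convert to UTC first.
    # Note: a naive datetime would be treated as local time by astimezone().
    when_utc = when.astimezone(timezone.utc)
    return {"Accept-Datetime": format_datetime(when_utc, usegmt=True)}


headers = accept_datetime_headers(
    datetime(2001, 9, 11, 20, 35, 0, tzinfo=timezone.utc)
)
print(headers["Accept-Datetime"])
```

The resulting dictionary can be passed as the `headers` argument of whatever HTTP client is in use; a client that omits the header simply gets the current version, as the slide describes.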
  • 15. Diagram labels: Apr 10 2001, 21:39:30 UTC; Aug 15 2004, 08:45:27 UTC; Aug 15 2007, 19:21:58 UTC; current. Hosts: www.cnn.com, web.archive.org
  • 16. Original Resource; Mementos
      Diagram labels: Apr 10 2001, 21:39:30 UTC; Aug 15 2004, 08:45:27 UTC; Aug 15 2007, 19:21:58 UTC; current. Hosts: www.cnn.com, web.archive.org
  • 17. Original Resource; ?; Mementos
      Diagram labels: Apr 10 2001, 21:39:30 UTC; Aug 15 2004, 08:45:27 UTC; Aug 15 2007, 19:21:58 UTC; current. Hosts: www.cnn.com, web.archive.org
  • 18. Original Resource; TimeGate; Mementos
      Diagram labels: Apr 10 2001, 21:39:30 UTC; Aug 15 2004, 08:45:27 UTC; Aug 15 2007, 19:21:58 UTC; current. Hosts: www.cnn.com, web.archive.org
  • 19. Conneg with TimeGate to Mementos
      Original Resource; TimeGate; Mementos
      Diagram labels: Apr 10 2001, 21:39:30 UTC; Aug 15 2004, 08:45:27 UTC; Aug 15 2007, 19:21:58 UTC; current. Hosts: www.cnn.com, web.archive.org
  • 20. Link Headers; Conneg with TimeGate to Mementos
      Original Resource; TimeGate; Mementos
      Diagram labels: Apr 10 2001, 21:39:30 UTC; Aug 15 2004, 08:45:27 UTC; Aug 15 2007, 19:21:58 UTC; current. Hosts: www.cnn.com, web.archive.org
  • 21. Link Headers; Conneg with TimeGate to Mementos
      Original Resource; TimeGate; Mementos
      Host: wikipedia.org
  • 22. The Web with Time Dimension added by Memento
  • 23. The Memento Solution
      •  Component 2: A discovery API for archives that allows requesting a list of all archived versions held for a resource with a given URI.
  • 24. Why an API?
      •  Mementos for any given resource are distributed across archives. (What? Not just the Internet Archive?!)
      •  To get a correct picture of the available Mementos, different archives need to be consulted.
      •  This can be done by distributed search (slow), or by consulting an aggregator.
      •  Aggregators and other services need a machine-readable description of archives' holdings to select the appropriate Memento for a request:
          •  Closest in time
          •  Most reliable representation
          •  Fastest responding
          •  (etc)
  • 25. WebCitation: 13 May 2009 12:28:39
  • 26. WebCitation: 13 May 2009 12:28:39; Archive-It: 14 May 2009 01:18:11
  • 27. WebCitation: 13 May 2009 12:28:39; Archive-It: 14 May 2009 01:18:11; BL Archive: 14 May 2009 07:12:45
  • 28. WebCitation: 13 May 2009 12:28:39; Archive-It: 14 May 2009 01:18:11; BL Archive: 14 May 2009 07:12:45; Dracos: 14 May 2009 13:00:00
  • 29. WebCitation: 13 May 2009 12:28:39; Archive-It: 14 May 2009 01:18:11; BL Archive: 14 May 2009 07:12:45; Dracos: 14 May 2009 13:00:00; TNA: 14 May 2009 18:21:32
      And no Internet Archive…
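The "closest in time" criterion from the "Why an API?" slide can be sketched against the five captures listed above. This is only an illustration of one selection rule: the function name `closest_in_time`, the tuple layout, and the assumption that all times share one timezone are not from the slides.

```python
from datetime import datetime

# (archive name, capture time) pairs as listed on the slides.
# Assumption: all times are in the same timezone.
CAPTURES = [
    ("WebCitation", datetime(2009, 5, 13, 12, 28, 39)),
    ("Archive-It", datetime(2009, 5, 14, 1, 18, 11)),
    ("BL Archive", datetime(2009, 5, 14, 7, 12, 45)),
    ("Dracos", datetime(2009, 5, 14, 13, 0, 0)),
    ("TNA", datetime(2009, 5, 14, 18, 21, 32)),
]


def closest_in_time(captures, requested):
    """Return the (archive, time) pair whose time is nearest to `requested`."""
    return min(captures, key=lambda capture: abs(capture[1] - requested))


# A request for 14 May 2009 02:00:00 is nearest to the Archive-It capture.
print(closest_in_time(CAPTURES, datetime(2009, 5, 14, 2, 0, 0)))
```

An aggregator would presumably weigh this against the other criteria on the slide (reliability, response speed), which this sketch ignores.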
  • 30. TimeMaps
      •  At most basic: List of URIs of Mementos and their times
      •  Expressed as Linked Data; a profile of OAI ORE Resource Maps
      •  Link header from TimeGate and Memento
  • 31. Basic ORE Model
      An Aggregation (Aggr) is a set of web resources (R-1 to R-3), described in RDF or Atom by a Resource Map (ReM).
  • 32. TimeBundles
      Resources of Interest in Memento:
      •  Original Resource
      •  TimeGate
      •  Mementos
  • 33. TimeGates
      •  Period(s) that the TimeGate covers
      •  Which resource it is a TimeGate for
      •  mem:TimeSpan, as a TimeGate can cover multiple distinct periods
  • 34. Mementos
      •  Time Period: valid for or observed over, number of observations
      •  Metadata: size, format, etc (will come back to the "etc")
      •  Which resource it is a Memento for
  • 35. Serializations
      •  RDF/XML
          •  Good for XML parsers
      •  Turtle, N3 and related
          •  Good for graph parsers
      •  RDFa
          •  Good for web browsers
      •  Atom
          •  Good for alerting, feed readers etc (but still embeds RDF)
      •  New: Link Header format
          •  Good for real-time applications
          •  Smaller file size (just the facts, ma'am)
          •  Easy to implement with existing link header parsers
          •  Servers need to produce format anyway, so non-RDF way out
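As a rough illustration of the Link Header serialization, the sketch below writes a list of Memento URIs and times in the same `<uri>; rel="..."; datetime="..."` shape used by the Link headers in the appendix slides. The relation name `memento` and the function name `to_link_format` are assumptions made for this example; the slides only show `original`, `timebundle`, and the first/last/prev/next relations.

```python
from datetime import datetime, timezone
from email.utils import format_datetime


def to_link_format(original, mementos):
    """Serialise an original URI plus (memento URI, UTC datetime) pairs."""
    entries = ['<%s>; rel="original"' % original]
    for uri, when in mementos:
        # rel="memento" is an assumed relation name, not taken from the slides.
        entries.append(
            '<%s>; rel="memento"; datetime="%s"'
            % (uri, format_datetime(when, usegmt=True))
        )
    return ",\n".join(entries)


timemap = to_link_format(
    "http://cnn.com/",
    [
        (
            "http://web.archive.org/web/20010911203610/http://www.cnn.com",
            datetime(2001, 9, 11, 20, 36, 10, tzinfo=timezone.utc),
        ),
        (
            "http://web.archive.org/web/20080708093433/http://www.cnn.com",
            datetime(2008, 7, 8, 9, 34, 33, tzinfo=timezone.utc),
        ),
    ],
)
print(timemap)
```

The output carries only URIs, relations, and times, which is the "just the facts" trade-off the slide contrasts with the RDF-based serializations.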
  • 36. Use Case: Aggregator using TimeMaps
  • 37. Link Headers; Conneg with TimeGate to Mementos
      Original Resource; TimeGate; Mementos
  • 38. Metadata Discussion Points
      1.  What metadata is necessary to determine the most appropriate copy?
          •  Distance to requested time most important
          •  Quality of representation?
          •  Usage statistics for Original Resource? For Memento?
          •  User tagging of Memento for quality?
          •  Archive response speed?
          •  Need to know more information from user preferences?
      2.  What other metadata is useful and available?
          •  Crawling archives have limited information
          •  CMS systems have much more
          •  User tags, comments, annotations
          •  Semantic information about content, e.g. title, author, subject
          •  Distribution of changes over time
  • 39. Metadata Discussion Points
      3.  What metadata is necessary for inter-archive synchronization?
          •  Deduplication information: digests, request headers
          •  "Significant Change" factors
          •  Crawler settings: respect no-cache, robots.txt etc
      4.  What metadata can be generated by other services?
          •  Open World Model: Anyone can say anything about anything
          •  Technical metadata easy (MIX for images, etc)
          •  Time Series Analysis interesting (techtales.org)
          •  Machine Learning based approaches?
  • 40. Thank You
      Rob Sanderson:
      •  azaroth42@gmail.com
      •  rsanderson@lanl.gov
      This presentation:
      •  http://www.slideshare.net/azaroth42/xxx
      Memento:
      •  http://www.mementoweb.org/
      •  http://groups.google.com/group/memento-dev
      MementoFox:
      •  https://addons.mozilla.com/en-US/firefox/addon/100298 aka: http://bit.ly/memfox
      Memento Enables Navigating the Past Web
  • 41. Discussion Questions
      1.  What metadata is necessary to determine the most appropriate copy?
      2.  What other metadata is useful and available?
      3.  What metadata is necessary for inter-archive synchronization?
      4.  What metadata can be generated by other services?
  • 42. Appendix: Memento HTTP Flow
      HEAD R, (Accept-Datetime)  →  Link G
      GET G, Accept-Datetime  →  302 M, Vary, TCN, Link R,B,M
      GET M, (Accept-Datetime)  →  200, Content-Datetime, Link R,B,M
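The three request/response pairs of the flow can be sketched as a single function. To keep the example runnable without network access, the HTTP client is passed in as `fetch(method, uri, headers)`, a hypothetical callable returning `(status, response_headers)`; `resolve_memento` and `canned_fetch` are likewise names invented here, and the canned responses replay values from the appendix slides.

```python
import re

# Finds the TimeGate URI (G) in a Link header value.
TIMEGATE_RE = re.compile(r'<([^>]*)>\s*;\s*rel="timegate"')


def resolve_memento(fetch, uri_r, accept_datetime):
    """Follow R -> G -> M and return (memento URI, Content-Datetime)."""
    request_headers = {"Accept-Datetime": accept_datetime}

    # Step 1: HEAD R, read the TimeGate URI from the Link header.
    _, headers = fetch("HEAD", uri_r, request_headers)
    match = TIMEGATE_RE.search(headers.get("Link", ""))
    if match is None:
        raise LookupError("no timegate link advertised for " + uri_r)
    uri_g = match.group(1)

    # Step 2: GET G with Accept-Datetime, expect a 302 pointing at M.
    status, headers = fetch("GET", uri_g, request_headers)
    if status != 302:
        raise LookupError("TimeGate answered %d instead of 302" % status)
    uri_m = headers["Location"]

    # Step 3: GET M, the 200 response carries Content-Datetime.
    status, headers = fetch("GET", uri_m, request_headers)
    if status != 200:
        raise LookupError("Memento answered %d instead of 200" % status)
    return uri_m, headers.get("Content-Datetime")


def canned_fetch(method, uri, headers):
    """Stand-in for a real HTTP client, replaying values from the slides."""
    if method == "HEAD":
        return 200, {
            "Link": '<http://web.archive.org/web/timegate/http://cnn.com>; rel="timegate"'
        }
    if "timegate" in uri:
        return 302, {
            "Location": "http://web.archive.org/web/20010911203610/http://www.cnn.com"
        }
    return 200, {"Content-Datetime": "Tue, 11 Sep 2001 20:36:10 GMT"}


print(resolve_memento(canned_fetch, "http://cnn.com/", "Tue, 11 Sep 2001 20:35:00 GMT"))
```

Injecting the client keeps the protocol logic separate from transport details; a real client would also need to handle archives that advertise no TimeGate, which here simply raises `LookupError`.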
  • 43. Memento HTTP Flow
      HEAD R, (Accept-Datetime)  →  Link G
      GET G, Accept-Datetime  →  302 M, Vary, TCN, Link R,B,M
      GET M, (Accept-Datetime)  →  200, Content-Datetime, Link R,B,M
  • 44. Memento HTTP Flow (URI-R): HEAD R, (Accept-Datetime)
      HEAD http://cnn.com/ HTTP/1.1
      Host: cnn.com
      Accept-Datetime: Tue, 11 Sep 2001 20:35:00 GMT
      Connection: close
  • 45. Memento HTTP Flow
      HEAD R, (Accept-Datetime)  →  Link G
      GET G, Accept-Datetime  →  302 M, Vary, TCN, Link R,B,M
      GET M, (Accept-Datetime)  →  200, Content-Datetime, Link R,B,M
  • 46. Memento HTTP Flow (URI-R): Success, Link G
      HTTP/1.1 200 OK
      Date: Thu, 21 Jan 2010 00:02:12 GMT
      Server: Apache
      Link: <http://web.archive.org/web/timegate/http://cnn.com>; rel="timegate"
      Content-Length: 255
      Connection: close
      Content-Type: text/html; charset=iso-8859-1
  • 47. Memento HTTP Flow
      HEAD R, (Accept-Datetime)  →  Link G
      GET G, Accept-Datetime  →  302 M, Vary, TCN, Link R,B,M
      GET M, (Accept-Datetime)  →  200, Content-Datetime, Link R,B,M
  • 48. Memento HTTP Flow: GET G, Accept-Datetime
      GET http://web.archive.org/web/timegate/http://cnn.com HTTP/1.1
      Host: web.archive.org
      Accept-Datetime: Tue, 11 Sep 2001 20:35:00 GMT
      Connection: close
  • 49. Memento HTTP Flow
      HEAD R, Accept-Datetime  →  Link G
      GET G, Accept-Datetime  →  302 M, Vary, TCN, Link R,B,M
      GET M, Accept-Datetime  →  200, Content-Datetime, Link R,B,M
  • 50. Memento HTTP Flow: 302 M, Vary, Link R,B,M
      HTTP/1.1 302 Found
      Date: Thu, 21 Jan 2010 00:06:50 GMT
      Server: Apache
      TCN: choice
      Vary: negotiate, accept-datetime
      Location: http://web.archive.org/web/20010911203610/http://www.cnn.com
      Link: <http://cnn.com/>; rel="original",
        <http://web.archive.org/web/timebundle/http://cnn.com/>; rel="timebundle",
        <http://web.archive.org/web/20000915112826/http://www.cnn.com>; rel="first-memento"; datetime="Fri, 15 Sep 2000 11:28:26 GMT",
        <http://web.archive.org/web/20080708093433/http://www.cnn.com>; rel="last-memento"; datetime="Tue, 08 Jul 2008 09:34:33 GMT",
        <http://web.archive.org/web/20010911203610/http://www.cnn.com>; rel="prev-memento"; datetime="Tue, 11 Sep 2001 20:30:51 GMT",
        <http://web.archive.org/web/20010911203610/http://www.cnn.com>; rel="next-memento"; datetime="Tue, 11 Sep 2001 20:47:33 GMT"
      Content-Length: 0
      Connection: close
      Content-Type: text/plain; charset=UTF-8
  • 51. Memento HTTP Flow
      HEAD R, (Accept-Datetime)  →  Link G
      GET G, Accept-Datetime  →  302 M, Vary, TCN, Link R,B,M
      GET M, (Accept-Datetime)  →  200, Content-Datetime, Link R,B,M
  • 52. Memento HTTP Flow: GET M, Accept-Datetime
      GET http://web.archive.org/web/20010911203610/http://www.cnn.com HTTP/1.1
      Host: web.archive.org
      Accept-Datetime: Tue, 11 Sep 2001 20:35:00 GMT
      Connection: close
  • 53. Memento HTTP Flow
      HEAD R, (Accept-Datetime)  →  Link G
      GET G, Accept-Datetime  →  302 M, Vary, TCN, Link R,B,M
      GET M, (Accept-Datetime)  →  200, Content-Datetime, Link R,B,M
  • 54. Memento HTTP Flow: 200, Content-Datetime, Link R,B,M
      HTTP/1.1 200 OK
      Server: Apache-Coyote/1.1
      X-Archive-Orig-Accept-Ranges: bytes
      …
      Content-Type: text/html;charset=utf-8
      Content-Length: 23364
      Date: Thu, 21 Jan 2010 00:09:40 GMT
      Content-Datetime: Tue, 11 Sep 2001 20:36:10 GMT
      Link: <http://cnn.com/>; rel="original",
        <http://web.archive.org/web/timebundle/http://cnn.com/>; rel="timebundle",
        <http://web.archive.org/web/20000915112826/http://www.cnn.com>; rel="first-memento"; datetime="Fri, 15 Sep 2000 11:28:26 GMT",
        <http://web.archive.org/web/20080708093433/http://www.cnn.com>; rel="last-memento"; datetime="Tue, 08 Jul 2008 09:34:33 GMT",
        <http://web.archive.org/web/20010911203610/http://www.cnn.com>; rel="prev-memento"; datetime="Tue, 11 Sep 2001 20:30:51 GMT",
        <http://web.archive.org/web/20010911203610/http://www.cnn.com>; rel="next-memento"; datetime="Tue, 11 Sep 2001 20:47:33 GMT"
      Connection: close
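A client that wants to navigate from this response needs to pull the relations and datetimes out of the Link header. The sketch below is one way to do that with regular expressions; it assumes every parameter is written with plain ASCII double quotes and that each relation appears once. The function name `parse_link_header` is hypothetical, and the sample value uses two entries from the response above.

```python
import re
from email.utils import parsedate_to_datetime

# One link: <uri> followed by any number of ; name="value" parameters.
# Commas inside quoted datetime values are safe because values are matched whole.
LINK_RE = re.compile(r'<([^>]*)>((?:\s*;\s*[\w-]+\s*=\s*"[^"]*")*)')
PARAM_RE = re.compile(r'([\w-]+)\s*=\s*"([^"]*)"')


def parse_link_header(value):
    """Map each rel value to (target URI, parsed datetime or None)."""
    links = {}
    for uri, params in LINK_RE.findall(value):
        attrs = dict(PARAM_RE.findall(params))
        when = attrs.get("datetime")
        links[attrs.get("rel")] = (
            uri,
            parsedate_to_datetime(when) if when else None,
        )
    return links


sample = (
    '<http://cnn.com/>; rel="original", '
    '<http://web.archive.org/web/20080708093433/http://www.cnn.com>; '
    'rel="last-memento"; datetime="Tue, 08 Jul 2008 09:34:33 GMT"'
)
for rel, (uri, when) in parse_link_header(sample).items():
    print(rel, uri, when)
```

Splitting the header on commas would break on the datetime values, which is why the sketch matches whole `<uri>; params` groups instead.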