Friday, November 8, 2013

2013-11-08: Proposals for Tighter Integration of the Past and Current Web

The Memento Team is soliciting feedback on two white papers that address related proposals for more tightly integrating the past and current web.

The first is "Thoughts on Referencing, Linking, Reference Rot", which is inspired by the hiberlink project.  This paper proposes making temporal semantics part of the HTML <a> element, via "versiondate" and "versionurl" attributes that respectively include the datetime the link was created and optionally a link to an archived version of the page (in case the live web version becomes 404, goes off topic, etc.).  The idea is that "versiondate" can be used as a Memento-Datetime value by a client, and "versionurl" can be used to record a URI-M value.  This approach is inspired by the Wikipedia Citation Template, which has many metadata fields, including "accessdate" and "archiveurl".  For example, in the article about the band "Coil", one of the links to the source material is broken, but the Citation Template has values for both "accessdate" and "archiveurl":



Unfortunately, when this is transformed into HTML the semantics are lost or relegated to microformats:



A (simple) version with machine-actionable links suitable for the Memento Chrome extension or Zotero could have looked like this in the past, ready to activate when the link eventually went 404:



The second paper, "Memento Capabilities for Wikipedia", describes the added value that Memento can bring to Wikipedia and other MediaWiki platforms.  One part is enriching their external links with the recommendations from our first paper (described above); the other is native Memento support for wikis.
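To make the link-enrichment idea more concrete, here is a minimal Python sketch of how a tool might emit a link carrying the proposed "versiondate" and "versionurl" attributes. The helper name `versioned_link` is hypothetical and not taken from either paper, and the URL and date values are borrowed from the example in the comments below; treat it as an illustration under those assumptions, not as the extension's actual output.

```python
from html import escape
from typing import Optional


def versioned_link(href: str, text: str, versiondate: str,
                   versionurl: Optional[str] = None) -> str:
    """Render an <a> element with the proposed temporal attributes.

    `versioned_link` is a hypothetical helper used only for illustration.
    versiondate: when the link was created (date granularity is flexible).
    versionurl:  optional URI of an archived copy of the linked page.
    """
    attrs = [("href", href), ("versiondate", versiondate)]
    if versionurl is not None:
        attrs.append(("versionurl", versionurl))
    # Escape attribute values so URLs containing quotes or '&' stay valid HTML.
    rendered = " ".join(
        f'{name}="{escape(value, quote=True)}"' for name, value in attrs
    )
    return f"<a {rendered}>{escape(text)}</a>"


link = versioned_link(
    href="http://liarsociety.tripod.com/blog/index.blog?from=20041130",
    text="my link text",
    versiondate="2007-02-12",
    versionurl=("http://web.archive.org/web/20080206210600/"
                "http://liarsociety.tripod.com/blog/index.blog?from=20041130"),
)
print(link)
```

Because "versionurl" is optional, a link can still record when it was created even if no archived copy is known yet; a Memento-aware client could then use "versiondate" alone to look for a suitable archived version.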

Native Memento support is possible via a new Memento Extension for MediaWiki servers that we announced for testing and feedback on the wikitech-l list.  This new extension is the result of a significant re-engineering effort guided by feedback from Wikipedia experts on a previous version.  When installed, this extension allows clients to access the "history" portion of wikis in the same manner as they access web archives.  For example, if you wanted to visit the Coil article as it existed on February 2, 2007, instead of wading through the many pages of the article's history, your client would use the Memento protocol to access a prior version with the "Accept-Datetime" request header:



and the server would eventually redirect you to:



In a future blog post we will describe how a Memento-enabled wiki can be used to avoid spoilers on fan wikis (e.g., the A Song of Ice and Fire wiki) by setting the Accept-Datetime to be right before an episode or book is released.
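The Accept-Datetime negotiation described above can be sketched in a few lines of Python. This sketch only builds the request header and sends nothing over the network; the helper name `accept_datetime_headers` is hypothetical, and midnight UTC is an assumption, since the example above names only the day.

```python
from datetime import datetime, timezone
from email.utils import format_datetime


def accept_datetime_headers(when: datetime) -> dict:
    """Return request headers asking a Memento-aware server for `when`.

    Hypothetical helper for illustration; it is not part of the extension.
    """
    # Accept-Datetime uses the HTTP date format, expressed in GMT.
    when_utc = when.astimezone(timezone.utc)
    return {"Accept-Datetime": format_datetime(when_utc, usegmt=True)}


# February 2, 2007 (midnight UTC is assumed; only the day is given above).
headers = accept_datetime_headers(datetime(2007, 2, 2, tzinfo=timezone.utc))
print(headers["Accept-Datetime"])  # Fri, 02 Feb 2007 00:00:00 GMT
```

A client would send these headers with an ordinary GET for the article and follow the resulting redirect; the same call with a datetime just before a release date would give the spoiler-free view mentioned above.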

We've provided only a summary of the two papers, and we invite everyone to review them and send us feedback (here, Twitter, email, etc.).

--Michael & Herbert

2 comments:

  1. This seems like a good area to use HTML5 data attributes: the example above could be written in standard HTML right now like this:

    <a href="http://liarsociety.tripod.com/blog/index.blog?from=20041130"
    data-memento-version-datetime="2007-02-12T00:00:00Z"
    data-memento-version-url="http://web.archive.org/web/20080206210600/http://liarsociety.tripod.com/blog/index.blog?from=20041130">my link text</a>

    (Note also that I promoted the simple date to a full ISO 8601 datetime, which allows references to content that changed within a single calendar day, e.g. a future “Dewey beats Truman” on cnn.com.)

    I'm also not sure version is the most appropriate term as it implies more knowledge of the content authoring process than actually exists. Perhaps something more generic like "access" to avoid giving the impression that anything more sophisticated is happening?

  2. Hi Chris -- thanks for your feedback. I agree that "data-" attributes are probably the way to go. I also get your point about version vs. access.

    We had envisioned flexible granularity in the dates to discourage people from padding them out with zeros, which implies a level of precision that's not really there, especially since some of these attributes will be written by humans or applied retroactively. Memento is defined with second-level granularity (via HTTP), but we've found that this is not always achievable.

    thanks again for the feedback.

    --Michael
