
Showing posts with the label Memento

2013-12-18: Avoiding Spoilers with the Memento Mediawiki Extension

From Modern Family to The Girl with the Dragon Tattoo, fans have created a flood of wikis devoted to their favorite television, book, and movie series. These wikis let fans settle disputes and encourage discussion. Combined with the growing habit of experiencing fiction long after its initial release, they have also given rise to another cultural phenomenon: spoilers. A fan-based resource is wonderful for those who are current with their reading or watching, but fraught with disaster for those who have not caught up yet and still want to experience the great reveals. Memento can help here. Above is a video showing how the Memento Chrome Extension from Los Alamos National Laboratory (LANL) can be used to avoid spoilers while browsing for information on Downton Abbey. This wiki is of particular interest because the TV show is released in the United Kingdom long before it is released in other countries. The wiki ha

2013-12-13: Hiberlink Presentation at CNI Fall 2013

Herbert and Martin attended the recent Fall 2013 CNI meeting in Washington DC, where they gave an update about the Hiberlink Project (joint with the University of Edinburgh), which is about preserving the referential integrity of the scholarly record. In other words, we link to the general web in our technical publications (and not just to other scholarly material), and of course the links rot over time. But the scholarly publication environment does give us several hooks to help us access web archives to uncover the correct material. As always, there are many slides, but they are worth the time to study. Of particular importance are slides 8-18, which help differentiate Hiberlink from other projects, and slides 66-99, which walk through a demonstration of how the "Missing Link" concepts (along with the Memento for Chrome extension) can be used to address the problem of link rot. In particular, absent specific versiondate attributes on a link, such as: <a vers

2013-11-28: Replaying the SOPA Protest

In an attempt to limit online piracy and theft of intellectual property, the U.S. Government proposed the Stop Online Piracy Act (SOPA). This act was widely unpopular. On January 18th, 2012, many prominent websites (e.g., XKCD) organized a world-wide blackout of their websites in protest of SOPA. While the attempted passing of SOPA may end up being a mere footnote in history, the overwhelming protest in response is significant. The event is historically important and should be preserved in our Web archives. However, some methods of implementing the protest (such as JavaScript and Ajax) made the resulting representations unarchivable by archival services at the time. As a case study, we will examine the Washington, D.C. Craigslist site and the English Wikipedia page. All screenshots of the live protests were taken during the protest on January 18th, 2012. The screenshots of the mementos were taken on November 27th, 2013. Screenshot of the live Craigslist SOPA Protest fro

2013-11-21: The Conservative Party Speeches and Why We Need Multiple Web Archives

Tweet from Michael L. Nelson (@phonedude_mln), November 13, 2013: ".@Conservatives put speeches in Streisand's house: http://t.co/6aRiOsHwxO @UKWebArchive: http://t.co/BGD3tYavEx via @lljohnston @hhockx"

Circulating on the web last week was the story of the UK's Conservative Party (aka the "Tories") removing speeches from their website (see Note 1 below). Not only did they remove the speeches from their website, but via their robots.txt file they also blocked the Internet Archive from serving its archived versions of the pages (see Note 2 below for a discussion of robots.txt, as well as for an update about availability in the Internet Archive). But even though the Internet Archive allows site owners to redact pages from its archive, mementos of the pages likely exist in other archives. Yes, the Internet Archive was the first web archive and is still by far the largest with 240B+ pages, but the many other web archives, in aggregate, also provide good coverage

2013-11-08: Proposals for Tighter Integration of the Past and Current Web

The Memento Team is soliciting feedback on two white papers that address related proposals for more tightly integrating the past and current web. The first is "Thoughts on Referencing, Linking, Reference Rot", which is inspired by the Hiberlink project. This paper proposes making temporal semantics part of the HTML <a> element, via "versiondate" and "versionurl" attributes that respectively hold the datetime the link was created and, optionally, a link to an archived version of the page (in case the live web version becomes a 404, goes off topic, etc.). The idea is that "versiondate" can be used as a Memento-Datetime value by a client, and "versionurl" can be used to record a URI-M value. This approach is inspired by the Wikipedia Citation Template, which has many metadata fields, including "accessdate" and "archiveurl". For example, in the article about the band "Coil", one of the links t
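To make the "versiondate" and "versionurl" idea concrete, here is a minimal sketch of how a client might pull those attributes out of a page. It is an illustration under stated assumptions, not the white paper's reference code: the helper names, the date format in the sample, and the archive URL in the sample are all made up for the example.

```python
from html.parser import HTMLParser


class VersionedLinkParser(HTMLParser):
    """Collects href, versiondate, and versionurl from every <a> element."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        attr_map = dict(attrs)
        if "href" not in attr_map:
            return
        self.links.append({
            "href": attr_map["href"],
            # Both attributes are optional; None means "not supplied".
            "versiondate": attr_map.get("versiondate"),
            "versionurl": attr_map.get("versionurl"),
        })


def extract_versioned_links(html):
    """Hypothetical helper: return one dict per link found in the HTML."""
    parser = VersionedLinkParser()
    parser.feed(html)
    parser.close()
    return parser.links


# Sample markup; the date format and archive URL are assumptions.
SAMPLE = (
    '<p><a href="http://example.com/" versiondate="2013-11-08" '
    'versionurl="http://archive.example.org/2013/http://example.com/">then</a> '
    '<a href="http://example.com/now">now</a></p>'
)

for link in extract_versioned_links(SAMPLE):
    print(link)
```

A client could then treat a non-empty "versiondate" as the datetime to ask an archive for, and fall back to "versionurl" when the live link is broken.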

2013-10-14: Right-Click to the Past -- Memento for Chrome

Last week LANL released Memento for Chrome, an extension that adds Memento capability to Chrome browsers. It represents such a leap in capability and speed that the prior MementoFox (Memento for Firefox) add-on should be considered deprecated. It's not just a Firefox vs. Chrome thing either; Memento for Chrome features a subtle change in how it interacts with the past and present. MementoFox had a toggle switch for present vs. Time Travel mode that would trap and modify all outbound requests, from the current page and all subsequent pages until turned off, to go from the form of http://example.com/index.html to http://mementoproxy.lanl.gov/aggr/timegate/http://example.com/index.html. This involved some complicated logic to determine when you were getting a memento (i.e., archived web entity) vs. something from the live web. When you factored in native Memento archives vs. proxied Memento archives, things could get hairy (see the 2011 Code4Lib paper for a (dat
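The request rewriting described above can be sketched in a few lines. This is only an illustration of the URL transformation, not MementoFox's actual code: the function names are hypothetical, and the check for already-rewritten URLs is an assumption about one small piece of the "complicated logic" mentioned.

```python
# TimeGate prefix taken from the example URL in the post.
TIMEGATE_PREFIX = "http://mementoproxy.lanl.gov/aggr/timegate/"


def is_timegate_url(url, prefix=TIMEGATE_PREFIX):
    """Return True if the URL already points at the TimeGate."""
    return url.startswith(prefix)


def to_timegate_url(url, prefix=TIMEGATE_PREFIX):
    """Prepend the TimeGate prefix to a live-web URL.

    URLs that are already rewritten are returned unchanged, so a
    request is never wrapped twice.
    """
    if is_timegate_url(url, prefix):
        return url
    return prefix + url


print(to_timegate_url("http://example.com/index.html"))
```

Applying the function twice gives the same result as applying it once, which is the property a "trap every outbound request" mode needs in order to avoid rewriting its own rewritten requests.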

2013-10-04: TPDL 2013 Trip Report

I attended the 2013 Theory and Practice of Digital Libraries (TPDL) Conference on September 22-26 in Valletta, Malta. Although I've had papers at several prior TPDL conferences (known as ECDL before 2011), I think this is the first one I've personally attended since ECDL 2005 in Austria. Normally I prefer to send students to present their papers, but this year we had five full papers accepted, so I could not afford to send all the students and I went in their stead. An unfortunate side effect of having so many papers is that, between preparation and my own presentations, I was unable to see as much of the conference as I would have liked. The conference began with Herbert Van de Sompel and me giving a tutorial about ResourceSync. Attendees registered for all tutorials and were free to attend whichever one they preferred. We had as many as ten people in ours at one point, but more importantly we had some key people present who will be implementing Resource

2013-09-06: Wolfram Data Summit 2013 Trip Report

I was fortunate enough to be invited to present at the 2013 Wolfram Data Summit in Washington DC, September 5-6, 2013. My talk was about the future of web archiving, but the focus of the data summit was "big data". As such, a variety of disciplines were represented at the summit, since the unifying factor was the scale of the data. Logistics dictated that I missed several of the presentations, but many of the ones I did attend were very engaging. The slides will be posted at the Wolfram site later, but I'll provide some short summaries below (2013-11-26 edit: the presentations are now available). First was Greg Newby presenting about Project Gutenberg, the long-running collection of free ebooks. His focus was on PG as a portable collection, which is subtly different from universal access from different interfaces (even if the interface is just Google). The focus was more on PG as a collection to be explored and on personalized services to be built on it. Du