Showing posts from 2013

2013-12-19: 404 - Your interview has been depublished

In early November 2013 I gave an invited presentation at the EcoCom conference (picture left) and at the Spreeforum, an informal gathering of researchers to facilitate knowledge exchange and foster collaborations. EcoCom was organized by Prof. Dr. Michael Herzog and his SPiRIT team, and the Spreeforum was hosted by Prof. Dr. Jürgen Sieck, who leads the INKA research group. Both events were supported by the Alcatel-Lucent Stiftung for Communications research. In my talks I gave a high-level overview of the state of the art in web archiving, outlined the benefits of the Memento protocol, pointed at issues and challenges web archives face today, and gave a demonstration of the Memento for Chrome extension.

Following the talk at the Spreeforum I was asked to give an interview for the German radio station Inforadio (you may think of it as Germany's NPR). The piece was aired on Monday, November 18th at 7.30am CET. As I had left Germany already I was not able to listen to it live but was ha…

2013-12-18: Avoiding Spoilers with the Memento Mediawiki Extension

From Modern Family to The Girl with the Dragon Tattoo, fans have created a flood of wikis based on their favorite television, book, and movie series. This dedication to fiction has allowed fans to settle disputes and encourage discussion using these resources. These resources, coupled with the rise in experiencing fiction long after it is initially released, have given rise to another cultural phenomenon: spoilers. Using a fan-based resource is wonderful for those who are current with their reading/watching, but is fraught with disaster for those who want to experience the great reveals and have not caught up yet. Memento can help here. Above is a video showing how the Memento Chrome Extension from Los Alamos National Laboratory (LANL) can be used to avoid spoilers while browsing for information on Downton Abbey. This wiki is of particular interest because the TV show is released in the United Kingdom long before it is released in other countries. The wiki has a nice sign wa…

2013-12-13: Hiberlink Presentation at CNI Fall 2013

Herbert and Martin attended the recent Fall 2013 CNI meeting in Washington DC, where they gave an update about the Hiberlink Project (joint with the University of Edinburgh), which is about preserving the referential integrity of the scholarly record. In other words, we link to the general web in our technical publications (and not just other scholarly material) and of course the links rot over time.  But the scholarly publication environment does give us several hooks to help us access web archives to uncover the correct material.

As always, there are many slides, but they are worth the time to study.  Of particular importance are slides 8-18, which help differentiate Hiberlink from other projects, and slides 66-99, which walk through a demonstration of how the "Missing Link" concepts (along with the Memento for Chrome extension) can be used to address the problem of link rot.  In particular, absent specific versiondate attributes on a link, such as:

<a versiondate=&quo…

2013-11-28: Replaying the SOPA Protest

In an attempt to limit online piracy and theft of intellectual property, the U.S. Government proposed the Stop Online Piracy Act (SOPA). This act was widely unpopular. On January 18th, 2012, many prominent websites (e.g., XKCD) organized a world-wide blackout of their websites in protest of SOPA.

While the attempted passing of SOPA may end up being a mere footnote in history, the overwhelming protest in response is significant. The protest is an important event and should be preserved in our Web archives. However, some methods of implementing the protest (such as JavaScript and Ajax) made the resulting representations unarchivable by archival services at the time. As a case study, we will examine the Washington, D.C. Craigslist site and the English Wikipedia page. All screenshots of the live protests were taken during the protest on January 18th, 2012. The screenshots of the mementos were taken on November 27th, 2013.

Craigslist put up a blackout page that would only provide acc…

2013-11-21: 2013 Southeast Women in Computing Conference (SEWIC)

Last weekend (Nov 14-17), I was honored to give a keynote at the Southeast Women in Computing Conference (SEWIC), located at the beautiful Lake Guntersville State Park in north Alabama.  The conference was organized by Martha Kosa and Ambareen Siraj (Tennessee Tech University), and Jennifer Whitlow (Georgia Tech).

Videos from the keynotes and pictures from the weekend will soon be posted on the conference website.  (UPDATE 1/24/14: Flickr photostream and links to keynote videos added.)

The 220+ attendees included faculty, graduate students, undergraduates, and some high school students (and even some men!).

On Friday night, Tracy Camp from the Colorado School of Mines presented the first keynote, "What I Know Now... That I Wish I Knew Then".  It was a great kickoff to the conference and provided a wealth of information on (1) the importance of mentoring, networking, and persevering, (2) tips on negotiating and time management, and (3) advice on dealing with the Imposto…

2013-11-21: The Conservative Party Speeches and Why We Need Multiple Web Archives

.@Conservatives put speeches in Streisand's house: via @lljohnston @hhockx
— Michael L. Nelson (@phonedude_mln) November 13, 2013

Circulating the web last week was the story of the UK's Conservative Party (aka the "Tories") removing speeches from their website (see Note 1 below).  Not only did they remove the speeches from their website, but via their robots.txt file they also blocked the Internet Archive from serving its archived versions of the pages (see Note 2 below for a discussion of robots.txt, as well as for an update about availability in the Internet Archive).  But even though the Internet Archive allows site owners to redact pages from its archive, mementos of the pages likely exist in other archives.  Yes, the Internet Archive was the first web archive and is still by far the largest with 240B+ pages, but the many other web archives, in aggregate, also provide good coverage (see our 2013 …

2013-11-19: REST, HATEOAS, and Follow Your Nose

This post is hardly timely, but I wanted to gather together some resources that I have been using for REST (Representational State Transfer) and HATEOAS (Hypermedia as the Engine of Application State).  It seems like everyone claims to be RESTful, but mentioning HATEOAS is frequently met with silence.  Of course, these terms come from Roy Fielding's PhD dissertation, but I won't claim that it is very readable (it is not the nature of dissertations to be readable...).  Fortunately he's provided more readable blog posts about REST and HATEOAS. At the risk of aggressively over-simplifying things, REST = "URIs are nouns, not verbs" and HATEOAS = "follow your nose".

"Follow your nose" simply means that when a client dereferences a URI, the entity that is returned is responsible for providing a set of links that allows the user agent to transition to the next state.  This is standard procedure in HTML: you follow links to guide you through an online t…
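
The "follow your nose" idea above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not anything taken from the post: the `RESOURCES` table, the URIs, the link relation names, and the functions `dereference` and `follow_your_nose` are all hypothetical, and a plain dictionary lookup stands in for an HTTP GET.

```python
# Hypothetical in-memory "service": each URI maps to a representation
# that carries its own links (link relation -> target URI).
RESOURCES = {
    "/": {"title": "Start", "links": {"tutorial": "/tutorial/1"}},
    "/tutorial/1": {"title": "Step 1", "links": {"next": "/tutorial/2"}},
    "/tutorial/2": {
        "title": "Step 2",
        "links": {"prev": "/tutorial/1", "next": "/tutorial/3"},
    },
    "/tutorial/3": {"title": "Step 3", "links": {"prev": "/tutorial/2"}},
}


def dereference(uri):
    """Stand-in for an HTTP GET: return the representation found at uri."""
    return RESOURCES[uri]


def follow_your_nose(entry_uri, rels):
    """Start at entry_uri and follow each link relation in rels in turn.

    The client never constructs a URI itself; it only uses links offered
    by the representation it has just received.
    """
    uri = entry_uri
    visited = [uri]
    for rel in rels:
        links = dereference(uri)["links"]
        if rel not in links:
            raise KeyError(f"no '{rel}' link offered at {uri}")
        uri = links[rel]
        visited.append(uri)
    return visited


path = follow_your_nose("/", ["tutorial", "next", "next"])
print(path)
```

The client only needs to know the entry URI and the meaning of the link relations; every other URI is discovered from the responses, so the hypothetical service could rename `/tutorial/2` without breaking it.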

2013-11-13: 2013 Archive-It Partner Meeting Trip Report

On November 12, I attended the 2013 Archive-It Partner Meeting in Salt Lake City, Utah, our research group's second year of attendance (see 2012 Trip Report). The meeting started off casually at 9am with breakfast and registration. Once everyone was settled, Kristine Hanna, the Director of Archiving Services at Internet Archive, introduced the members of her team who were present at the meeting. Kristine acknowledged the fire at Internet Archive last week and the extent of the damage. "It did burn to the ground but thankfully, nobody was injured." She reminded the crowd of partners to review Archive-It's storage and preservation policy and mentioned the redundancies in place, including a soon-to-be mirror at our very own ODU. Kristine then mentioned news of a new partnership with Reed Technologies to jointly market and sell Archive-It (@archiveitorg). She reassured the audience that nothing would change beyond having more resources for them to accomplish their goals. Kristine t…