Posts

Showing posts from 2013

2013-12-19: 404 - Your interview has been depublished

In early November 2013 I gave an invited presentation at the EcoCom conference (picture left) and at the Spreeforum, an informal gathering of researchers to facilitate knowledge exchange and foster collaborations. EcoCom was organized by Prof. Dr. Michael Herzog and his SPiRIT team, and the Spreeforum was hosted by Prof. Dr. Jürgen Sieck, who leads the INKA research group. Both events were supported by the Alcatel-Lucent Stiftung for Communications research. In my talks I gave a high-level overview of the state of the art in web archiving, outlined the benefits of the Memento protocol, pointed at issues and challenges web archives face today, and gave a demonstration of the Memento for Chrome extension. Following the talk at the Spreeforum I was asked to give an interview for the German radio station Inforadio (you may think of it as Germany's NPR). The piece was aired on Monday, November 18th at 7:30am CET. As I had left Germany already I was not able to ...

2013-12-18: Avoiding Spoilers with the Memento Mediawiki Extension

From Modern Family to The Girl with the Dragon Tattoo, fans have created a flood of wikis devoted to their favorite television, book, and movie series. This dedication to fiction has allowed fans to settle disputes and encourage discussion using these resources. These resources, coupled with the rise in experiencing fiction long after it is initially released, have given rise to another cultural phenomenon: spoilers. Using a fan-based resource is wonderful for those who are current with their reading or watching, but it is fraught with disaster for those who want to experience the great reveals and have not caught up yet. Memento can help here. Above is a video showing how the Memento Chrome Extension from Los Alamos National Laboratory (LANL) can be used to avoid spoilers while browsing for information on Downton Abbey. This wiki is of particular interest because the TV show is released in the United Kingdom long before it is released in ...

2013-12-13: Hiberlink Presentation at CNI Fall 2013

Herbert and Martin attended the recent Fall 2013 CNI meeting in Washington DC, where they gave an update about the Hiberlink Project (joint with the University of Edinburgh), which is about preserving the referential integrity of the scholarly record. In other words, we link to the general web in our technical publications (and not just to other scholarly material), and of course the links rot over time. But the scholarly publication environment does give us several hooks to help us access web archives to uncover the correct material. As always, there are many slides, but they are worth the time to study. Of particular importance are slides 8-18, which help differentiate Hiberlink from other projects, and slides 66-99, which walk through a demonstration of how the "Missing Link" concepts (along with the Memento for Chrome extension) can be used to address the problem of link rot. In particular, absent specific versiondate attributes on a link, such as: <a ...

2013-11-28: Replaying the SOPA Protest

In an attempt to limit online piracy and theft of intellectual property, the U.S. Government proposed the Stop Online Piracy Act (SOPA). This act was widely unpopular. On January 18th, 2012, many prominent websites (e.g., XKCD) organized a world-wide blackout of their websites in protest of SOPA. While the attempted passing of SOPA may end up being a mere footnote in history, the overwhelming protest in response is significant. This event is an important observance and should be archived in our Web archives. However, some methods of implementing the protest (such as JavaScript and Ajax) made the resulting representations unarchivable by archival services at the time. As a case study, we will examine the Washington, D.C. Craigslist site and the English Wikipedia page. All screenshots of the live protests were taken during the protest on January 18th, 2012. The screenshots of the mementos were taken on November 27th, 2013. Screenshot of the live Craigslist SOPA Protest fro...

2013-11-21: 2013 Southeast Women in Computing Conference (SEWIC)

Last weekend (Nov 14-17), I was honored to give a keynote at the Southeast Women in Computing Conference (SEWIC), located at the beautiful Lake Guntersville State Park in north Alabama. The conference was organized by Martha Kosa and Ambareen Siraj (Tennessee Tech University), and Jennifer Whitlow (Georgia Tech). Videos from the keynotes and pictures from the weekend will soon be posted on the conference website. (UPDATE 1/24/14: Flickr photostream and links to keynote videos added.) The 220+ attendees included faculty, graduate students, undergraduates, and even some high school students (and even some men!). On Friday night, Tracy Camp from the Colorado School of Mines presented the first keynote, "What I Know Now... That I Wish I Knew Then". It was a great kickoff to the conference and provided a wealth of information on (1) the importance of mentoring, networking, and persevering, (2) tips...

2013-11-21: The Conservative Party Speeches and Why We Need Multiple Web Archives

".@Conservatives put speeches in Streisand's house: http://t.co/6aRiOsHwxO @UKWebArchive: http://t.co/BGD3tYavEx via @lljohnston @hhockx" (Michael L. Nelson, @phonedude_mln, November 13, 2013)

Circulating the web last week was the story of the UK's Conservative Party (aka the "Tories") removing speeches from their website (see Note 1 below). Not only did they remove the speeches from their website, but via their robots.txt file they also blocked the Internet Archive from serving its archived versions of the pages (see Note 2 below for a discussion of robots.txt, as well as for an update about availability in the Internet Archive). But even though the Internet Archive allows site owners to redact pages from its archive, mementos of the pages likely exist in other archives. Yes, the Internet Archive was the first web archive and is still by far the largest with 240B+ pages, but the many other web archives, in aggregate, also provide good coverage...
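
To make the robots.txt mechanism mentioned above concrete, here is a minimal sketch using Python's standard `urllib.robotparser`. The rules and the `example.org` URL are hypothetical stand-ins, not the Conservative Party's actual file, and `ia_archiver` is assumed here to be the user agent name that the Internet Archive honored at the time.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt rules (an assumption for illustration only):
# they exclude the crawler named "ia_archiver" from the whole site
# and say nothing about any other crawler.
HYPOTHETICAL_ROBOTS_TXT = """\
User-agent: ia_archiver
Disallow: /
"""

# Placeholder URL, not a real page from the story.
SPEECH_URL = "http://www.example.org/speeches/2010/some-speech.html"


def is_allowed(user_agent: str, url: str, robots_txt: str) -> bool:
    """Return True if robots_txt permits user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    for agent in ("ia_archiver", "SomeOtherArchiveBot"):
        verdict = "allowed" if is_allowed(agent, SPEECH_URL, HYPOTHETICAL_ROBOTS_TXT) else "blocked"
        print(f"{agent}: {verdict}")
```

A rule like this only names one crawler, so an archive that crawls under a different user agent (the hypothetical `SomeOtherArchiveBot` above) is unaffected, which is one reason copies may survive elsewhere.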