Monday, October 27, 2014

2014-10-27: 404/File Not Found: Link Rot, Legal Citation and Projects to Preserve Precedent

Herbert and I attended the "404/File Not Found: Link Rot, Legal Citation and Projects to Preserve Precedent" workshop at the Georgetown Law Library on Friday, October 24, 2014.  Although the origins of this workshop are many, catalysts for it probably include the recent Liebler & Liebert study about link rot in Supreme Court opinions, the paper by Zittrain, Albert, and Lessig about the problem of link rot in the scholarly and legal record, and the resulting popular media coverage (e.g., NPR and the NYT).

The speakers were naturally drawn from the legal community at large, but some notable exceptions included David Walls from the GPO, Jefferson Bailey from the Internet Archive, and Herbert Van de Sompel from LANL. The event was streamed and recorded, and videos and slides will be available from the Georgetown site soon, so I will only hit the highlights below.

After a welcome from Michelle Wu, the director of the Georgetown Law Library, the workshop started with an excellent keynote from the always entertaining Jonathan Zittrain, called "Cites and Sites: A Call To Arms".  The theme of the talk centered around "Core Purpose of .edu", which he broke down into:
  1. Cultivation of Scholarly Skills
  2. Access to the world's information
  3. Freely disseminating what we know
  4. Contributing actively and fiercely to the development of free information platforms

For each bullet he gave numerous anecdotes and examples: some innovative, and some humorous and/or sad.  For the last point he mentioned Memento and timed-release crypto.

Next up was a panel with David Walls (GPO), Karen Eltis (University of Ottawa), and Ed Walters (Fastcase).  David mentioned the Federal Depository Library Program Web Archive.  Karen talked about the web giving us "Permanence where we don't want it and transience where we require longevity" (I tweeted about our TPDL 2011 paper that showed that for music videos on YouTube, individual URIs die all the time but the content just shows up elsewhere).  Ed generated a buzz in the audience when he announced that in rendering their pages they ignore the links because of the problem of link rot.  (Panel notes from Aaron Kirschenfeld.)

The next panel had Raizel Liebler (Yale), author of another legal link rot study mentioned above and an author of one of the useful handouts about links in the 2013-2014 Supreme Court documents.  Rod Wittenberg (Reed Tech) talked about the findings of the Chesapeake Digital Preservation Group and gave a data dump about link rot in Lexis-Nexis and the resulting commercial impact (wait for the slides).  (Panel notes from Aaron Kirschenfeld.)

After lunch, Roger Skalbeck (Georgetown) gave a web master's take on the problem, talking about best practices, URL rewriting, and other topics -- as well as coining the wonderful phrase "link rot deniers".  During this talk I also tweeted TimBL's classic 1998 resource "Cool URIs Don't Change". 

Next was Jefferson Bailey (IA) and Herbert.  Jefferson talked about web archiving, the IA, and won approval from the audience for his references to Lionel Hutz and HTTP status dogs.  Herbert's talk was entitled "Creating Pockets of Persistence", and covered a variety of topics, obviously including Memento and Hiberlink.

The point is to examine web archiving activities with an eye to the goal of making access to the past web:
  1. Persistent
  2. Precise
  3. Seamless
Even though this was a gathering of legal scholars, the point was to focus on technologies and approaches that are useful across all interested communities.  He also gave examples from our "Thoughts on Referencing, Linking, Reference Rot" (aka "missing link") document, which was also included in the list of handouts.  The point of this effort is to enhance existing links (with archived versions, mirror versions, etc.), but not at the expense of removing the link to the original URI and the intended datetime of the link.  See our previous blog post on this paper and a similar one for Wikipedia.
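
To make the idea of enhancing a link without replacing it a little more concrete, here is a minimal sketch in Python. It builds an HTML anchor that keeps the original URI in href and carries the archived copy and the intended datetime alongside it. The function name decorate_link, the attribute names data-versionurl and data-versiondate, the date format, and the example URIs are all assumptions made for illustration; this post does not spell out a syntax, so treat this as one possible shape and not as the definitive one.

```python
from datetime import datetime, timezone
from html import escape


def decorate_link(original_uri, archived_uri, link_datetime, text):
    """Return an HTML anchor that keeps the original URI as href and adds
    the archived copy and the intended link datetime as extra attributes.

    The attribute names used here are illustrative assumptions.
    """
    # Normalize the (timezone-aware) datetime to UTC and keep only the date.
    version_date = link_datetime.astimezone(timezone.utc).strftime("%Y-%m-%d")
    return (
        f'<a href="{escape(original_uri, quote=True)}" '
        f'data-versionurl="{escape(archived_uri, quote=True)}" '
        f'data-versiondate="{version_date}">{escape(text)}</a>'
    )


# Hypothetical URIs, used only to show the output shape.
example = decorate_link(
    "http://example.com/page",
    "http://archive.example.org/20141024/http://example.com/page",
    datetime(2014, 10, 24, tzinfo=timezone.utc),
    "the cited page",
)
print(example)
```

The design choice worth noticing is that the original URI stays where readers and crawlers expect it, so the archived version supplements the link and never replaces it.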

The closing session was Leah Prescott (Georgetown; subbing for Carolyn Cox), Kim Dulin (Harvard), and E. Dana Neacşu (Columbia).  Leah talked some more about the Chesapeake Digital Preservation Group and how their model of placing materials in a repository doesn't completely map to the model of web archiving (note: this actually has fascinating implications for Memento that are beyond the scope of this post).  Kim gave an overview of Harvard's archive, and Dana gave an overview of a prior archiving project at Columbia.  Note that recently received a Mellon Foundation grant (via Columbia) to add Memento capability.

Thanks to Leah Prescott and everyone else who organized this event.  It was an engaging, relevant, and timely workshop.  Herbert and I met several possible collaborators with whom we will be following up.


-- Michael
