Tuesday, April 12, 2011

2011-04-13: Implementing Time Travel for the Web

Recent trends in digital libraries are towards integration with the architecture of the World Wide Web. The award-winning Memento Project proposes extending HTTP to provide protocol-level access to mementos (archived previous states) of web resources. Using content negotiation and other protocol operations, rather than archive-specific methods, Memento provides the digital library and preservation community with a standardized method to navigate between the original resource and its mementos.
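
To make the content negotiation idea concrete, here is a minimal sketch of how a client might prepare a datetime-negotiation request. It assumes the TimeGate is reached by appending the original URL to a prefix; that prefix (`timegate.example.org`) and the helper name `build_memento_request` are hypothetical and not taken from the article. The request is only built, not sent.

```python
from datetime import datetime, timezone
from email.utils import format_datetime
from urllib.request import Request

# Hypothetical TimeGate prefix, used only for illustration.
TIMEGATE_PREFIX = "http://timegate.example.org/timegate/"


def build_memento_request(original_url: str, when: datetime) -> Request:
    """Build (but do not send) a request for the state of original_url near `when`."""
    # Accept-Datetime carries an HTTP-date, so convert to UTC first.
    # A naive datetime is interpreted as local time by astimezone().
    when_utc = when.astimezone(timezone.utc)
    headers = {"Accept-Datetime": format_datetime(when_utc, usegmt=True)}
    return Request(TIMEGATE_PREFIX + original_url, headers=headers)


req = build_memento_request(
    "http://example.com/", datetime(2011, 4, 12, tzinfo=timezone.utc)
)
# urllib stores header names capitalized, hence "Accept-datetime" here.
print(req.full_url)
print(req.get_header("Accept-datetime"))
```

A server that supports Memento would answer such a request by pointing the client at the memento closest to the requested datetime. The client itself needs nothing archive-specific beyond the extra header.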

Memento Client State Chart

The ODU Web Sciences and Digital Libraries Research Group has partnered with the LANL Research Library to create Memento and develop prototype Memento-compliant client and server implementations. A variety of Memento clients have been created, tested, and co-evolved along with the Memento protocol. There are now a Firefox extension, an Internet Explorer browser helper object, and a WebKit-based Android browser. The design and technical solutions identified during the development of these clients will be of interest to those considering implementation of a Memento-based platform, especially on the client side. The interactions are also important for building conformant server-side systems.

MementoFox Screenshot

The full article can be found at:

Robert Sanderson, Harihar Shankar, Scott Ainsworth, Frank McCown, and Sam Adams. Implementing Time Travel for the Web. code{4}lib Journal, Issue 13, 2011-04-11. http://journal.code4lib.org/articles/4979.

-- Scott G. Ainsworth

Friday, April 8, 2011

2011-04-08: Radiation Map of Japan

The devastation wrought by the 11 March earthquake in Japan and the depth of the human misery left in the wake of the massive tsunami have left many people awestruck. The quake itself was enormous, and many people have had a hard time comprehending just how big it was. Sites like Japan Quake Map help convey the magnitude of this event. As a result of the earthquake and tsunami, the nuclear reactor at Fukushima Dai-ichi was severely damaged and has been leaking radiation. The radiation readings have been made available by WIDE and Japan's Nuclear Safety Division.

The idea was to use R to create an informative map of Japan showing the radiation levels of the different prefectures. Python was used to import the data from both web sites and insert it into a MySQL database. The format of both pages, understandably, changed often, so the Python script needed frequent tweaking. Sometimes it was easier to copy and paste the data into a spreadsheet, export it as a CSV file, and import that into the database.

For the map, the shapefiles included in the R distribution were not working out, so shapefiles for Japan from Harvard Asia Studies were used. Combined with the plotPolys() command, these shapefiles produced a higher-quality map than the standard shapefiles.

The readings for most prefectures were rather reliable; in Fukushima and Miyagi, however, the readings were sporadic. Miyagi was hardest hit by the tsunami and most of the area was destroyed. It appears that most of its readings came from mobile units, and there are gaps in the coverage. If no readings were available for a given day, they were estimated from the surrounding readings, both spatially and temporally. In Fukushima, which is the location of the reactor, many monitoring sites were set up, but they seemed to come and go over time. For the purposes of this map, the sites located between 20 km and 30 km from the reactor were averaged together to give a reading for the Fukushima prefecture.

Using R, the average daily radiation for each of the 47 prefectures was calculated. The maximum and minimum values were used to create a color gradient for the map. Most of the readings were low, with only one or two high readings. This did not lend itself well to a smooth color gradient, so the log of the values was used to create the color gradient instead.
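
The log-scaled gradient can be sketched as follows. This is an illustration in Python rather than the R code used for the map, and it rests on several assumptions: the function name `log_scaled_colors`, the two endpoint colors, and the sample readings are all made up for the example and are not the values behind the actual maps.

```python
import math


def log_scaled_colors(readings, low_rgb=(255, 255, 178), high_rgb=(189, 0, 38)):
    """Map positive readings to hex colors, interpolating on a log scale."""
    logs = {name: math.log(value) for name, value in readings.items()}
    lo, hi = min(logs.values()), max(logs.values())
    span = (hi - lo) or 1.0  # avoid division by zero if all readings are equal
    colors = {}
    for name, log_value in logs.items():
        t = (log_value - lo) / span  # 0.0 at the minimum, 1.0 at the maximum
        rgb = [round(a + t * (b - a)) for a, b in zip(low_rgb, high_rgb)]
        colors[name] = "#{:02X}{:02X}{:02X}".format(*rgb)
    return colors


# Made-up readings: one value is far larger than the others.
sample = {"A": 0.05, "B": 0.5, "C": 5.0}
colors = log_scaled_colors(sample)
print(colors)
```

On a linear scale the middle reading would sit almost at the low end of the gradient; on the log scale it lands halfway between the two extremes. For a legend, the original values can be recovered from the logs with `math.exp`.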

R code for Map

The radiation level values were each assigned a hex color value and then merged into a vector that matched the prefecture names in the shapefile. Nested for loops are usually a bad idea, especially in R, so suggestions for a more elegant solution are welcome. plotPolys() takes care of drawing the map; the only things left were to invert the logarithm to get the real values back and to add a legend.

All of the maps from 16 March to 4 April were combined into an AVI file using:
mencoder mf://*.png -mf fps=1:type=png -ovc lavc -lavcopts vcodec=mpeg4 -oac copy -o output.avi

Shapefiles from China Historical GIS Project, "Tokugawa Japan GIS, Demo Version." Feb 2004

-- Greg Szalkowski

2011-04-07: MITRE Records Expo Trip Report

I have just returned from MITRE's Records Expo on MITRE's campus in McLean, VA. The Records Expo is designed to raise awareness of the archival responsibilities of employees within MITRE, and also to inform our sponsors about the archives and records management work we're doing. I was invited to present some of the research being done in digital preservation at ODU and MITRE. (George Despres and I have recently received funding to perform digital preservation research on the digital objects living within the corporate intranet. Our research was explained at the Expo.)

We set up booths in the MITRE 2 building, equipped with big-screen TVs showing slide shows about other archival and records management systems being pioneered at MITRE (the slides are For Office Use Only, and cannot be shared in this blog). Several MITRE employees attended and listened to presentations given by the archives team and the records management teams at MITRE, as well as by George and me. A former Lockheed Martin employee who had worked on the NARA records management system was also in attendance.

I spoke to all of the attendees about Memento and how it will be used to allow users to browse archives within MITRE's intranet as part of George's and my research. Most of the attendees had some experience with the Wayback Machine and a cursory knowledge of web archiving, but most weren't familiar with Memento. All were extremely interested in hearing more about the research, and some have already been in touch with me requesting additional information.

It was helpful and informative to meet with other professionals working in "The Real World" and attempting to solve the same problems being researched in academia. There were also some additional approaches to archival and records management problems, such as using the Cloud as a repository, and archiving corporate social networking content.

-- Justin F. Brunelle