Posts

2010-07-27: NDIIPP Partners Meeting, IETF 78

On July 20-22, I was at the NDIIPP Partners Meeting in Arlington VA, along with Martin Klein and Michele Weigle. The Library of Congress has not yet uploaded a public summary of the meeting, but there were a number of interesting additions to previous NDIIPP Partners Meetings (edit: the meeting slides are now available). First, there were keynotes from both the Librarian of Congress, James Billington, and the Archivist of the United States, David Ferriero. There was also a ceremony to commemorate the charter members (which include ODU CS) of the National Digital Stewardship Alliance (NDSA). I don't think the NDSA has a canonical web site yet, so the iPRES 2009 paper by Anderson, Gallinger & Potter is probably the best available description (edit: LC has announced an NDSA web site). There was a theme of exploring the questions about "why we should care about digital preservation". The Library of Congress debuted this video, now available on th...

2010-07-17: Microsoft Research Faculty Summit 2010

On July 12-13, 2010 I was at the Microsoft Research Faculty Summit 2010 in Redmond WA. The agenda was exciting, and this was one of the few conferences where I've had real difficulty choosing which of the parallel sessions to attend. The first keynote was about Kinect for Xbox 360. The demos were very impressive and I had no idea that motion capture was ready for the home market. Check out the trailer at the MS site. The next session I attended was about the "Bing Dialog Model". I must confess that I'm unconvinced of how different Bing is from Google. Here's a side-by-side comparison of each search engine on the query "Michael Nelson": They seem nearly identical to me: the tri-panel layout (controls on left, content in center, ads on right), the link layout/colors (blue title, black summary, green URI), interspersed images, tabs at the top, etc. The extended summary Bing gives you when you mouse over a link region is nice, and some of th...

2010-07-15: AMS Cloud Physics and Atmospheric Radiation 2010

I presented a poster at the 2010 13th Conference on Cloud Physics / 13th Conference on Atmospheric Radiation in Portland, Oregon, June 28 - July 2. This was my first atmospheric science meeting in the 2 years since taking off from NASA to attend full-time graduate studies at Old Dominion University. It was good to be back and catch up on old and new atmospheric sciences research being conducted by my colleagues and others. This conference takes place every 4 years. There were approximately 300 scientists from around the world in attendance, of which 60-70 were from NASA Langley. This was one of our important conferences to showcase our latest cloud and radiation results and products. The Clouds and the Earth's Radiant Energy System (CERES) group was well represented. It seemed like everyone at Langley who works on CERES was there. I saw many familiar faces and met several new CERES folks. My paper was entitled Alternative Method for Data Fusion of NASA CERES and A-TRAIN ...

2010-07-06: Travel Report for Hypertext and JCDL 2010

As mentioned earlier, I had two papers accepted at HT and JCDL. In June it was time to travel to the conferences and represent the Old Dominion University colors. HT 2010 took place in Toronto, Canada from June 13th-16th and was hosted by the University of Toronto. The acceptance rate of 37% was slightly higher than last year but the number of registered attendees seemed comparable. I was glad to be able to give the very first presentation since it probably secured the largest audience of the entire conference. My slides are available through Slideshare. The paper itself, titled "Is This a Good Title", can be obtained through the ACM Digital Library and its content was covered in my earlier post. My personal highlight of the conference was the keynote by Andrew Dillon. He argued that research on Hypertext today is shaped too much by the Internet and its (inter-)linked natu...

2010-07-05: Foo Camp 2010

I attended the 2010 Foo Camp in Sebastopol CA, June 25-27. For those who are unfamiliar, Foo Camp is an invitation-only "unconference" -- which is basically a conference that consists entirely of birds-of-a-feather sessions as well as the impromptu hallway and dinner conversations that make conferences useful. There were approximately 250 people there and by my estimation they were mostly young (25-35) entrepreneurs (current and former). There was a smattering of others as well: artists, writers, professors, VCs, etc. The best way I can describe Foo Camp is as a combination of Burning Man (culture of participation), SIGGRAPH (culture of demonstration), and a country club (culture of capitalism). Geeks aren't really known for being extroverted, but the format of Foo Camp pretty much requires meeting new people and interacting with people outside of your existing circle of colleagues. I was surprised at how approachable most people were. Formulating the sche...

2010-06-23: Hypertext 2010; We laughed, we cried, we danced on air.

Hypertext 2010 (13-16 June 2010) has come and gone, but the memories linger. Overview: Martin Klein and I presented our respective papers. He will be detailing his experience and his paper Is This a Good Title? at Hypertext 2010 when he returns from JCDL 2010. My paper Analysis of Graphs for Digital Preservation Suitability and its associated PowerPoint presentation are available. The paper and the presentation were given at Hypertext 2010 in Toronto, Ontario, Canada. A complete Hypertext program is available here. Day zero, 12 June: Mary (my wife) and I got to Toronto late Saturday. We were four and a half hours late out of Norfolk because of weather problems in Chicago. Fortunately Mary made alternative reservations out of Dulles to Toronto as soon as we thought we were going to miss our connection. Pays to pay attention and to have alternative plans. Martin (the sly dog) chose to travel 13 June on a direct flight from Richmond, VA to Toronto. Day one, 13...

2010-05-21: Travel Report for LDOW, WWW, DOE, OAC

I've just finished up a pretty busy four-week stretch that involved one workshop, one conference, one proposal review panel, the space shuttle, a working group meeting and the end of the spring semester. In the last week of April I went to Raleigh NC for the Linked Data on the Web Workshop (LDOW 2010) and the World Wide Web Conference (WWW 2010). I drove down to Raleigh Monday evening after giving the last lecture (on Memento) in my CS 751/851 class. In addition to myself, from the WS-DL team Scott Ainsworth and Jeff Shipman were able to attend the pre-conference workshops WS-REST 2010 and LDOW 2010, but they both had to return to work after that and missed the WWW conference itself. WS-DL alumnus Frank McCown was able to attend WWW and it was good catching up with him. From the Memento team, Herbert & Rob were there for the entire week as well. We had a Memento paper at LDOW: Herbert Van de Sompel, Robert Sanderson, Michael L. Nelson, Lyudmila L. Bala...

2010-05-11: How Good is Your Graph? A Question Posed to Hypertext 2010

Usually the first response to a question like that is: Huh, what kind of a question is that and why should I care? Here is a short answer to the caring part (the rest of why this is important is at the end): a good graph can keep data safe even after the person who created the data is gone. The most common interpretation of "graph" is some sort of X-Y plot that shows how one value is affected by another. But in the context of this question, a graph is a system made up of edges and vertices (think of edges as HTML hypertext links and vertices as pages; then WWW sites become a graph). Now that we have a graph, the next part of the puzzle is: what does "good" mean and how do you measure it? That is at the heart of my paper "Analysis of Graphs for Digital Preservation Suitability" that I will be presenting at Hypertext 2010. I look at different types of graphs that are characterized by (on average) how many edges connect a vertex to its ne...
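To make the pages-as-vertices, links-as-edges idea concrete, here is a minimal sketch that builds a small graph and computes the average number of edges per vertex. It is only an illustration, not the method from the paper: the page names are made up, and treating links as undirected is an assumption.

```python
from collections import defaultdict

def build_graph(links):
    """Build an adjacency map from (page, page) link pairs.

    Assumption: links are treated as undirected edges.
    """
    graph = defaultdict(set)
    for src, dst in links:
        graph[src].add(dst)
        graph[dst].add(src)
    return graph

def average_degree(graph):
    """Average number of edges touching a vertex."""
    if not graph:
        return 0.0
    return sum(len(neighbors) for neighbors in graph.values()) / len(graph)

# Hypothetical pages and links, for illustration only.
links = [
    ("a.html", "b.html"),
    ("b.html", "c.html"),
    ("a.html", "c.html"),
    ("c.html", "d.html"),
]
graph = build_graph(links)
print(average_degree(graph))
```

Here four pages share four links, so each vertex touches two edges on average.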

2010-04-22: Papers landed at Hypertext and JCDL 2010

Not without pride, I see two of my papers accepted at the upcoming conferences ACM Hypertext (HT) and ACM/IEEE Joint Conference on Digital Libraries (JCDL). The paper "Evaluating Methods to Rediscover Missing Web Pages from the Web Infrastructure" will be published at JCDL. It is co-authored with my advisor Dr. Michael L. Nelson. As part of my ongoing dissertation work we are investigating methods to rediscover missing web pages with the help of the web infrastructure (search engines, their caches, the Internet Archive, etc.) in real time, meaning while the user is browsing the web. This paper evaluates the performance of four of these methods: the title of the web page, its lexical signature (LS) representing the most salient terms of its content, its tags obtained from delicious.com and its neighborhood lexical signature (NHLS), a LS based on content of pages that link to the centroid page. We generate a corpus of web pages by randomly sampling from the Open Directo...
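As a rough illustration of what "most salient terms" can mean, the sketch below picks the top-k terms of one page using TF-IDF weighting. This is an assumption made for the example: the excerpt does not say how the lexical signature is computed, and the tiny corpus and the function name are hypothetical.

```python
import math
from collections import Counter

def lexical_signature(docs, target, k=3):
    """Return the k highest-scoring terms of docs[target].

    Assumption: score = raw term count * log(N / document frequency).
    Ties are broken alphabetically so the result is deterministic.
    """
    n_docs = len(docs)
    doc_freq = Counter()
    for tokens in docs.values():
        doc_freq.update(set(tokens))
    term_counts = Counter(docs[target])
    scores = {
        term: count * math.log(n_docs / doc_freq[term])
        for term, count in term_counts.items()
    }
    ranked = sorted(scores, key=lambda term: (-scores[term], term))
    return ranked[:k]

# Hypothetical three-page corpus, for illustration only.
docs = {
    "target": "memento time travel web archive memento".split(),
    "other1": "web search engine cache".split(),
    "other2": "web archive crawl".split(),
}
print(lexical_signature(docs, "target"))
```

Terms that appear on every page (here "web") score zero, so they never enter the signature.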

2010-04-20: The Web's Missing Dimension: Time

Herbert just completed an interview about Memento entitled "The Web's Missing Dimension: Time" with Jon Udell for the podcast series "Interviews With Innovators". It is pretty long at ~44 minutes, but the discussion is thoughtful and gets at some higher-level topics about time and the Web -- topics not generally covered in prior presentations that often focus on mechanics and detailed explanation. Well worth the listen even if you know a good deal about Memento. -- Michael

2010-03-19: MementoFox Add-on Released

There have been a number of developments in the Memento project. Perhaps the most interesting is the release of the MementoFox Mozilla Add-on. Shown to the left is MementoFox installed in Firefox 3.6. I went to cnn.com, then turned on MementoFox by clicking the green "(M)" logo near the top left. I used the slider bar to select a date of 2010-02-22 (red text box), some magic happened, and then I was presented with an archived version of cnn.com in the WebCite archive with an actual date of 2010-02-23, one day later than what I requested (green text box). Entering a new date in the red text box or using the slider bar will cause MementoFox to find the closest archived copy of cnn.com, possibly in an archive other than WebCite. Everyone is encouraged to go to the Memento Demos page, install MementoFox and walk through some other time traveling scenarios detailed there. It is actually quite a lot of fun to play with. Feedback is welcome on the memento-dev gr...
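The "closest archived copy" idea can be sketched in a few lines. This is not MementoFox's actual code: only the requested date (2010-02-22) and the WebCite copy dated 2010-02-23 come from the post, while the other archive names and dates are invented for illustration.

```python
from datetime import date

def closest_copy(requested, copies):
    """Pick the (archive, date) pair whose date is nearest to the requested date."""
    return min(copies, key=lambda copy: abs((copy[1] - requested).days))

# Hypothetical list of archived copies of one page.
copies = [
    ("archive-a", date(2010, 1, 30)),   # made-up entry
    ("webcite", date(2010, 2, 23)),     # the copy mentioned in the post
    ("archive-b", date(2010, 3, 5)),    # made-up entry
]
best = closest_copy(date(2010, 2, 22), copies)
print(best)
```

With these inputs the WebCite copy wins because it is one day from the requested date, matching the behavior described in the post.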

2010-02-17: Using Web Page Titles to Rediscover Lost Web Pages

The object of my project was to determine whether a web page's title could be used to find the resource within the Yahoo search engine's caches. Lost pages for this project are pages that return a 404. A 404 response code is an error message indicating that the client was able to communicate with the server but the server could not find what was requested. There are a multitude of reasons why a page or an entire web site may disappear. These pages may reside only in the caches of search engines or web archives, or may have just moved from one URI to another. In the context of this experiment, titles are denoted by the TITLE element within a web page. There can only be one title in a web page. The title may not contain anchors, highlighting, or paragraph marks. What would be most desirable for this experiment would be to take all URIs as our collection set. Regrettably, using the entire web as our test set is unrealistic. Capturing a representative sample set of web-sites...

2010-02-11: Memento and OAC at the CNI Fall 2009 Membership Meeting

Herbert, Rob and I were at the Coalition for Networked Information Fall 2009 Membership Meeting in Washington DC, December 14-15, 2009. The CNI meetings are always good and this one was no exception. We gave a presentation about Memento (direct link on vimeo): Memento: Time Travel for the Web from CNI Video Editor on Vimeo. Note that this presentation was based on the initial version of Memento first presented in November 2009, not the slightly updated version from February 2010. While we were there, we were also interviewed by Gerry Bayne of EDUCAUSE. Here's an embedded version of the interview: Also at CNI Fall 2009, Rob gave a presentation about the Open Annotation Collaboration (OAC), of which I am on the technical committee. Rob's presentation is also available: Interoperable Annotation: Perspectives from the Open Annotation Collaboration from CNI Video Editor on Vimeo. We also did a short interview about OAC with EDUCAUSE: Rob...

2010-02-08: Memento Meeting, San Francisco, Feb 2-3 2010

The entire Memento team went to San Francisco, CA February 2-3, 2010 to meet with representatives from the Internet Archive, California Digital Library, Microsoft Research, Library of Congress, LOCKSS and WebBase. The full attendee list and agenda are available at the Memento site, including six detailed presentations. Based on the excellent feedback from the representatives, we ended up with two significant changes in our approach. The first change is simply moving the URI of the original resource (URI-R) from the Alternates: response header to a separate Link: header. The information returned from the TimeGate (URI-G) and Memento (URI-M) is the same; it has just moved from one header to another. The second change represents a larger change from the previous model. Instead of URI-R redirecting (302 response code) to URI-G when it sees an Accept-Datetime header, URI-R always returns one or more Link: response headers pointing to one or more TimeGates (whether or not the ...
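To illustrate the second change, the sketch below shows how a client might pull TimeGate URIs out of a Link: response header returned by URI-R. The excerpt does not give the header syntax, so the rel="timegate" value, the example URIs and the simplified parser (which expects rel to be the first parameter after each URI) are all assumptions.

```python
import re

# Simplified pattern: <uri>; rel="..."  (assumes rel is the first parameter)
LINK_RE = re.compile(r'<([^>]*)>\s*;\s*rel="([^"]*)"')

def timegates(link_header):
    """Return the URIs whose rel value includes "timegate" (assumed relation name)."""
    return [
        uri
        for uri, rel in LINK_RE.findall(link_header)
        if "timegate" in rel.split()
    ]

# Hypothetical Link header as URI-R might return it.
header = (
    '<http://timegate.example.org/http://example.com/>; rel="timegate", '
    '<http://other.example.net/tg/http://example.com/>; rel="timegate"'
)
print(timegates(header))
```

A client would then send its Accept-Datetime request to one of the returned TimeGates rather than relying on a redirect from URI-R.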

2010-02-06: Superbowl XLIV

Regardless of which team you are rooting for, this is going to be a good football game. Both teams have explosive offenses captained by quarterbacks who are destined to be inducted into the Hall of Fame. Peyton Manning is one cool character and if he can figure out the Saints' defense the Colts are going to pull away and not look back. The Colts have been consistently good all season and they have a good chance of continuing that trend on Sunday. If the Colts have a weakness, it is their running game. Both offensively and defensively the Colts' run game has performed below the league average. The Saints, with Drew Brees, have the league's best offense without question. They have more yards per attempt and fewer interceptions than the Colts. They can pass and run the ball very well and if they want to win they had better use it to their advantage. The Saints' handicap is their defense. They are below the league average and the Saints' secondary against Peyton makes me shudder. That being...