Posts

2009-08-21: CS 751/851 "Introduction to Digital Libraries" Postponed Until Spring 2010

CS 751/851 "Introduction to Digital Libraries" has been postponed from Fall 2009 to Spring 2010. I apologize to those who had planned to take the class this Fall. --Michael

2009-07-30: Position Paper Published in Educause Review

The July/August 2009 issue of Educause Review has a position paper of mine entitled "Data Driven Science: A New Paradigm?" This invited paper is essentially a cleaned-up version of my position paper at the 2007 NSF/JISC Workshop on Data-Driven Science and Scholarship held in Arizona, April 17-19, 2007. Prior to the workshop, we were each assigned a topic on which to write a short position paper. My topic was to address the question of whether "data-driven science is becoming a new scientific paradigm – ranking with theory, experimentation, and computational science?" You can judge my response by the original paper's more cheeky title of "I Don't Know and I Don't Care". My argument can be summed up as "we've always had data-driven science at whatever was the largest feasible scale; it just happens that the scale is now very large." Scale is important; in fact, some days I might argue that scale is all there is. But part...

2009-07-17: Technical Report "Evaluating Methods to Rediscover Missing Web Pages from the Web Infrastructure"

This week I uploaded the technical report, co-authored with Michael L. Nelson, to the e-print service arxiv.org. The underlying idea of this research is to use the web infrastructure (search engines, their caches, the Internet Archive, etc.) to rediscover missing web pages, that is, pages that return the 404 "Page not Found" error. We apply various methods to generate search engine queries based on the content of the web page and user-created annotations about the page. We then compare the retrieval performance of all methods and introduce a framework to combine such methods to achieve the optimal retrieval performance. The applied methods are: 5- and 7-term lexical signatures of the page; the title of the page; tags users annotated the page with on delicious.com; and 5- and 7-term lexical signatures of the page neighborhood (up to 50 pages linking to the missing page). We query the big three search engines (Google, Yahoo and MSN Live) with the outcome of all methods and analyze t...
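As a rough illustration of the lexical signatures mentioned above, the sketch below picks the k highest-scoring terms of a page using a simple TF-IDF weighting. It is not the report's implementation: the function names, the tokenizer, and the use of a small local list of texts as the source of document frequencies are all assumptions made for this example.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric terms (hypothetical tokenizer)."""
    return re.findall(r"[a-z0-9]+", text.lower())


def lexical_signature(page_text, corpus_texts, k=5):
    """Return up to k terms of page_text with the highest TF-IDF score.

    corpus_texts is an assumed stand-in for whatever collection supplies
    document frequencies; the page itself counts as one extra document.
    """
    tf = Counter(tokenize(page_text))
    doc_term_sets = [set(tokenize(t)) for t in corpus_texts]
    n_docs = len(corpus_texts) + 1  # corpus plus the page itself

    scores = {}
    for term, count in tf.items():
        # Document frequency: the page plus every corpus text containing the term.
        df = 1 + sum(1 for terms in doc_term_sets if term in terms)
        scores[term] = count * math.log(n_docs / df)

    # Sort by descending score; break ties alphabetically for stable output.
    ranked = sorted(scores, key=lambda t: (-scores[t], t))
    return ranked[:k]


if __name__ == "__main__":
    page = "digital library preservation digital archive"
    corpus = ["the library is open", "archive of the web"]
    # A 2-term signature; the terms would be joined into one search engine query.
    print(" ".join(lexical_signature(page, corpus, k=2)))
```

Calling the function with k=5 or k=7 would give the two signature lengths named in the post; terms that also occur in many other texts score low and drop out of the signature.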

2009-07-16: The July issue of D-Lib Magazine has JCDL and InDP reports.

The July/August 2009 issue of D-Lib Magazine has just published reports for the 2009 ACM/IEEE JCDL (written by me) and InDP (written by Frank and his co-organizers), as well as several other reports for JCDL workshops and other conferences (such as Open Repositories 2009). Whereas my previous entry about JCDL & InDP was focused on our group's experiences, these reports give a broader summary of the events. --Michael

2009-07-07: Hypertext 2009

From June 30th through July 1st I attended Hypertext 2009 (HT 2009) in Torino, Italy. The conference saw a 70% increase in submissions (117 total) compared to last year but, because the number of accepted papers (26 long and 11 short) and posters increased at a similar rate, it maintained last year's acceptance rate of roughly 32%. HT 2009 also had a record 150 registered attendees. I presented our paper titled "Comparing the Performance of US College Football Teams in the Web and on the Field" (DOI), which was joint work with Olena Hunsicker under the supervision of Michael L. Nelson. The paper describes an extensive study on the correlation of expert rankings of real world entities and search engine rankings of their representative resources on the web. We published a poster, "Correlation of Music Charts and Search Engine Rankings" (DOI), with the resu...

2009-06-29: NDIIPP Partners Meeting

On June 24-26 I attended the 2009 NDIIPP Partners Meeting in Washington DC. Although it has grown from the early years, I believe this year's attendance of 150 people is similar to last year's. Clay Shirky, author of "Here Comes Everybody", gave the keynote on Wednesday morning. Hopefully the Library of Congress will post a video of the keynote soon. If not, take a look at some of his other presentations -- you will find them enjoyable and informative. On Thursday morning I presented a summary of Martin's PhD research, the tangible product of which will be a Firefox extension called "Synchronicity". The presentation was very well received and there is a lot of interest in the extension. There were several interesting breakout sessions, but the real news was on Friday when Martha Anderson (LC) introduced the upcoming National Digital Stewardship Alliance...

2009-06-22: Back From JCDL 2009

We had a good showing at the 2009 ACM/IEEE Joint Conference on Digital Libraries (JCDL) in Austin, TX last week. In total, we had 1 full paper, 3 short papers, 2 posters, 1 workshop paper and 1 doctoral consortium paper. JCDL is the flagship conference in our field and we always make a point to send as many people as possible. Chuck Cartledge (left) presented "A Framework for Digital Object Self-Preservation" at the doctoral consortium. He also presented the related short paper "Unsupervised Creation of Small World Networks for the Preservation of Digital Objects". Chuck is planning to have his doctoral candidacy exam sometime in the early fall. Michael presented the full paper "Using Timed-Release Cryptography to Mitigate The Preservation Risk of Embargo Periods". This paper was based on Rabia Haq's MS Thesis, which she defended in the fall of 2008. Michael also co-organized the doctoral consortium and convinced WS-DL alumna Joan Smith ...