Posts

2013-07-26: ACM/IEEE Joint Conference on Digital Libraries (JCDL) 2013

The Old Dominion University Web Science and Digital Libraries (WSDL) research group was well represented at the JCDL 2013 conference – Digital Libraries at the Crossroads. We arrived in Indianapolis, Indiana on Sunday night. While Hany SalahEldeen and I took time on Monday to ready our presentations, Scott Ainsworth and Yasmin AlNoamany presented at the Doctoral Consortium. Scott presented his research on improving temporal drift in the archives, and Yasmin presented her work on creating a story from mementos. Their presentations (and the Doctoral Consortium) are discussed in more detail in their blog post.
Day 1
After opening remarks from J. Stephen Downie and Robert H. McDonald, Clifford Lynch gave the opening keynote of the conference, entitled "Building Social Scale Information Infrastructure: Challenges of Coherence, Interoperability and Priority." Lynch posed a series of questions that are influencing the research areas in th

2013-07-22: JCDL 2013 Doctoral Consortium

The JCDL 2013 Doctoral Consortium is a workshop for Ph.D. students from all over the world who are in the early phases of their dissertation work. Students present their thesis and research plan, and a panel of prominent professors and experienced practitioners in the field of Digital Libraries provides feedback in a constructive atmosphere. Yasmin AlNoamany and Scott Ainsworth had the privilege of presenting papers at this year's Doctoral Consortium.
Scott Ainsworth, Michael Nelson, & Yasmin AlNoamany
User Interaction
The first session focused on user interaction and was chaired by George Buchanan. The session began with Erik Choi presenting his work on understanding the motivations behind the questions users ask in Internet Q&A forums. Prior work in this area has focused on the use and content of Q&A forums; Erik's work focuses on why users ask questions: their motivations, their expectations, and the relationship between the two. Yasmin AlNoamany pres

2013-07-15: Temporal Intention Relevancy Model (TIRM) Data Set

On the third anniversary of the Haiti earthquake, President Barack Obama held a press conference and discussed the need to keep helping the Haitian community and to invest more in rebuilding the economy. A user who was watching the press conference tweeted about it on the 14th of January and provided a link to the streamed news. A couple of days later I read this tweet and clicked on the link, and instead of seeing anything related to the press conference, Haiti, or President Obama, I got a stream feed of the Mercedes-Benz Superdome in New Orleans in preparation for the 2013 Super Bowl. It is worth mentioning that at the time of writing this post the tweet above had already been deleted, showing that social posts don't persist over time, as we discussed in our earlier post. This scenario illustrates the problem we are trying to detect, model, and solve. The inconsistency between what is intended at the time of sharing and what the reader sees at the time of c

2013-07-15: Wayback Machine Upgrades Memento Support

Just over a week ago, the Internet Archive upgraded their support for Memento in the Wayback Machine. The Wayback Machine has had native Memento support for about 2.5 years, but they've recently implemented a number of changes, and the Wayback Machine and version 08 of the Memento Internet Draft are now synchronized. The changes will be mostly unseen by casual users, but developers will appreciate that they should make things even simpler. Perhaps even more importantly, these changes have been reflected in the open source version of the Wayback Machine, so the numerous sites that are running this software (for example, see the IIPC member list) should enjoy native Memento support upon their next upgrade. The first and most significant change is that there is now just a single URI prefix for mementos (URI-M). Previously, the URI-M discovered through the Wayback Machine's UI was different from the URI-M discovered through the Memento interface (e.g., us

2013-07-10: WARCreate and WAIL: WARC, Wayback and Heritrix Made Easy

As the Web Science and Digital Libraries Research Group, we regularly interact with end users as well as developers who are interested in digital preservation. One of our goals is to help make web preservation accessible to regular users instead of just power users. As computer scientists, this frequently means creating software. A few digital preservation software packages that were created by WS-DLers include:
Warrick - a utility for reconstructing (or recovering) a website using various archives and caches
Synchronicity - a Firefox extension that supports the user in rediscovering missing web pages
mcurl - a command-line Memento client
and two that are dear to my heart:
WARCreate - a Google Chrome extension that allows you to create WARC files from any webpage
Web Archiving Integration Layer (WAIL) - a re-packaged Wayback and Heritrix that aims to be "One-Click User Instigated Preservatio

2013-07-09: Archive.is Supports Memento

(2014-04-16 edit: Two days ago, archive.is started 301 redirecting to archive.today. Otherwise, all the existing links should look and function as they did before.)
There's a lot to like about Archive.is, a recent entry in the page-at-a-time personal web archiving space: the simple search/upload interface, the bookmarklet for easily pushing pages into the archive while reading, the thumbnails (and full-sized images) of captured pages, how it handles JavaScript, etc. But now there is an additional reason: Archive.is natively supports Memento and is now included in the Memento aggregators at LANL and ODU. Archive.is is similar to WebCite in that it archives a single page when a user requests that it be archived. This is different from crawlers at, for example, the Internet Archive and Archive-It, which crawl the web all the time, archiving pages as they go along. These archives represent different, complementary strategies for crawling the web: Archive.is, WebCite: s

2013-06-18: NTRS, Memento, and Handles

In a previous post I covered the shutdown of the NASA Technical Report Server, which has since come back online in a reduced capacity. In this post we examine some of the peculiarities of the current state of NTRS, particularly with respect to Handles and Memento. Earlier this week I needed to access an old NASA report of mine, ironically enough about NTRS, from 1996:
Richard C. Tuey, Mary Collins, Pamela Caswell, Bob Haynes, Michael L. Nelson, Jeanne Holm, Lynn Buquo, Annette Tingle, Bill Cooper and Roy Stiltner, NASAwide Electronic Publishing System-Prototype STI Electronic Document Distribution: Stage-4 Evaluation Report, NASA TM-104630 (parts 1 and 2), May 1996.
It is not a particularly enjoyable report; it is the kind of lengthy, multi-authored, sanitized, bureaucratic-engineering report that people write but don't read (a "better" summary can be found in AIAA-95-0964). I probably have a PDF of the report somewhere in my files, but instead I pulle