Posts

2013-05-25: Game Walkthroughs As A Metaphor for Web Preservation

Do you remember playing the Atari 400/800 game "Star Raiders"? Probably not, but for me it pretty much defined my existence in middle school: the obvious Star Wars inspiration, the stereo sound, the (for the time) complex game play, the 3D(-ish) first-person orientation -- this was all ground-breaking stuff for 1979. It, along with games like "Eastern Front (1941)", inspired me at a young age to become a video game developer, an inspiration which did not survive my undergraduate graphics course. I could encourage you to (re)experience the game by pointing you to the ROM image for the game, as well as an appropriate emulator (I used "Atari800MacX"), but without the venerable Atari joystick (the same one used in the more famous 2600 system), it just doesn't feel the same to me. And although the original instructions have been scanned, the game play is complex enough that unlike most games of the era, you can't immediately understa

2013-05-21: An Update About Archiving Tweets

Today I encountered this article about a UK driver bragging on Twitter about hitting a cyclist. Rather than extend an already lengthy post about archiving tweets from two weeks ago, I am giving this example its own post. Summary: a woman hit a bicyclist participating in a race (the cyclist apparently was not seriously injured) and then bragged about it on Twitter. The cyclist was apparently not going to report the event, but her bragging changed his mind and he contacted the police: "@emmaway20 we have had tweets ref an RTC with a bike. We suggest you report it at a police station ASAP if not done already & then dm us" (Norwich Police, @NorwichPoliceUK, May 19, 2013). The driver deleted her Twitter account, but the offending evidence has already been archived -- not just by concerned citizens making copies (check the thread in the Tweet above), but Topsy has archived the evidence as well. Interestingly, unlike the Twitpic examples in the previous post,

2013-05-13: Temporal Web Workshop 2013 Trip Report

On May 13, Hany SalahEldeen and I attended the third Temporal Web Analytics Workshop, co-located with WWW 2013 in Rio de Janeiro, Brazil. Marc Spaniol, from the Max Planck Institute for Informatics, Germany, welcomed the audience in the opening note of the workshop. He emphasized the workshop's goal of building a community of interest in the temporal web. Omar Alonso, from Microsoft Silicon Valley, was the keynote speaker, with a presentation entitled “Stuff happens continuously: exploring Web contents with temporal information”. Omar divided his presentation into three parts: Time in document collection, Social data, and Exploring the web using time. In the Time in document collection part, Omar introduced the temporal dimension of documents. He defined the characteristics of temporal information by first asking “What is Time?”. Time may be expressed in a normalized format or a hierarchical format. Time has 4 types: times; duration; sets, which may be explicit (i.e., May 2,

2013-05-09: HTTP Mailbox - Asynchronous RESTful Communication

We often encounter web services that take a very long time to respond to our HTTP requests. In the case of an eventual network failure, we are forced to issue the same HTTP request again. We frequently consume web services that do not support REST. If they did, we could utilize the full range of HTTP methods while retaining the functionality of our application, even when the external API we utilize in our application changes. We sometimes wish to set up a web service that takes job requests, processes long-running job queues, and notifies the clients individually or in groups. HTTP does not allow multicast or broadcast messaging. HTTP also requires the client to stay connected to the server while the request is being processed. Introducing HTTP Mailbox - An Asynchronous RESTful HTTP Communication System. In a nutshell, HTTP Mailbox is a mailbox for HTTP messages. Using its RESTful API, anyone can send an HTTP message (request or response) to anyone else independent of the availabi
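The "mailbox for HTTP messages" idea above can be pictured with a minimal in-memory sketch. This is only an illustration of the store-and-retrieve pattern, not the actual HTTP Mailbox API: the `Mailbox` class, its `send` and `retrieve` methods, and the example recipient URI are all hypothetical names chosen for this sketch.

```python
from collections import defaultdict
from typing import Dict, List


class Mailbox:
    """Hypothetical in-memory store of raw HTTP messages, keyed by recipient."""

    def __init__(self) -> None:
        self._messages: Dict[str, List[str]] = defaultdict(list)

    def send(self, recipient: str, http_message: str) -> int:
        """Store a raw HTTP message for a recipient and return its position."""
        self._messages[recipient].append(http_message)
        return len(self._messages[recipient]) - 1

    def retrieve(self, recipient: str) -> List[str]:
        """Return a copy of all messages stored for a recipient, oldest first."""
        return list(self._messages.get(recipient, []))


# The sender does not need the recipient to be online when the message is stored.
box = Mailbox()
box.send(
    "http://example.org/service",
    "GET /jobs/42 HTTP/1.1\r\nHost: example.org\r\n\r\n",
)
pending = box.retrieve("http://example.org/service")
```

Because messages are stored rather than delivered directly, the recipient can pick them up later, which is the asynchronous behavior the excerpt describes.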

2013-05-07: Who Is Archiving Your Tweets?

Who is archiving your tweets? You're probably thinking "the Library of Congress". And you're right: since 2010 they have been (see the announcements from Twitter and LC). But LC is currently providing access only to researchers, and the scale of the archive makes access challenging (see LC's January 2013 white paper that provides a status update on the project). To say I think this joint project between LC and Twitter is exciting and important is an understatement; I could go on about the scholarly importance, the cultural and technological record, the phenomenon of social media, etc. So I was surprised (but in retrospect, should not have been) when almost immediately afterwards projects like noloc.org surfaced so you could opt out of the archiving of your public tweets. However, while you might be able to prevent LC from archiving your tweets, companies like Topsy are archiving them, or at least some of them. Topsy is one of my new favorite sites

2013-04-22: IIPC GA 2013

From April 22--26, Michael Nelson and I attended the International Internet Preservation Consortium (IIPC) General Assembly 2013, hosted by the National and University Library of Slovenia in Ljubljana, Slovenia. This year is the ten-year anniversary of the IIPC. The GA this year has the theme "What were the past challenges? and how can we plan the future of IIPC?". Also, this year, Old Dominion University becomes an official member of the IIPC. The GA was organized into five days. Day 1: Monday, April 22, 2013 IIPC General Assembly. Mateja Komel Snoj, the director of the National and University Library of Slovenia, and Alenka Kavčič-Čolić, the head of the Library Research Center at the National and University Library of Slovenia, opened the day, welcomed the attendees, and expressed their pleasure at hosting the IIPC GA in Slovenia. Mateja emphasized the importance of digital preservation and the role of the National and University Library of Slovenia in the preservation o

2013-04-19: Carbon Dating the Web

(note: Carbon Date 2.0 was released on 2014-11-14) In the course of our research we often needed to determine when a certain web resource was created. In numerous cases, this question is fairly straightforward to answer by examining the resource itself. Articles often have publishing datetime stamps, social media contributions have posting times, and for other resources you can estimate the creation date from reading the resource itself. This process is simple when a resource is examined manually, but it is harder to automate when the dataset of resources is large. To solve this problem we conducted several experiments to determine automatically when a resource was created. When a resource is created it often gets indexed by search engines, archived in public archives, and shared on social media, thus leaving trails of existence. We trace those trails of existence and use the earliest appearance among them as a close estimate of the creation date. The timeline below illustra
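The estimation step described above, taking the earliest appearance among the trails a resource leaves behind, can be sketched in a few lines. This is a minimal illustration and not the Carbon Date implementation: the function name `estimate_creation_date`, the source labels, and the sample datetimes are hypothetical, and the sketch assumes each source has already been queried for its earliest sighting.

```python
from datetime import datetime
from typing import Mapping, Optional, Tuple


def estimate_creation_date(
    first_seen: Mapping[str, Optional[datetime]]
) -> Optional[Tuple[str, datetime]]:
    """Return (source, datetime) for the earliest trail, or None if none exist.

    `first_seen` maps a hypothetical source label to the earliest datetime at
    which that source saw the resource, or None if it never saw it.
    """
    found = [(when, source) for source, when in first_seen.items() if when is not None]
    if not found:
        return None
    when, source = min(found)  # earliest datetime wins
    return source, when


# Hypothetical sightings of a single resource (made-up values for illustration).
trails = {
    "search_engine_index": datetime(2013, 4, 3, 9, 30),
    "public_web_archive": datetime(2013, 4, 1, 12, 0),
    "social_media_share": None,  # no trail found in this source
}
estimate = estimate_creation_date(trails)
```

The result is only an upper bound on the true creation date: the resource may have existed for some time before any source first recorded it.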

2013-04-08: Grad Cohort Workshop (CRA-W) 2013

On April 5-6, I was pleased to have the opportunity to meet and network with many successful senior women as well as graduate students from other universities at the CRA-W Graduate Cohort, which was held in Boston, MA. Grad Cohort, which began in 2004, aims to increase the ranks of senior women in computing by building and mentoring nationwide communities of women through their graduate studies. Grad Cohort accepts women students in their first, second, or third year of graduate school in computer science and engineering, and provides sessions for each of the three years. Since I am now in the third year of my computer science Ph.D., I attended the third-year sessions, which I'm going to talk about in the rest of the blog post. The workshop included a mix of formal presentations and informal discussions. On the afternoon of the first day, there was a Poster Session for participants to talk about their research. I presented a poster entitled "Access Patterns for Robots and Humans