Monday, September 28, 2009

2009-09-28: OAI-ORE In 10 Minutes

A significant part of my research time in 2007-2008 was spent working on the Open Archives Initiative Object Reuse & Exchange project (OAI-ORE, or simply ORE). Producing the ORE suite of eight documents was difficult and took longer than I anticipated, but we had an excellent team and I'm extremely proud of the results. In the process, I also learned a great deal about the building blocks of ORE: the Web Architecture, Linked Data, and RDF.

I'm often asked "What is ORE?" and I don't always have a good, short answer. The simplest way I like to describe ORE is "machine readable splash-pages". More formally, ORE addresses the problem of identifying Aggregations of Resources on the Web. For example, we often use the URI of an html page as the identifier of an entire collection of Resources. Consider this YouTube URI:

Technically, it identifies just the html page that is returned when that URI is dereferenced:

But we frequently (incorrectly) use this URI to also identify all the information contained within that html page, which is actually a collection of many URIs, some of which include:

There is more to ORE than this, however. Interested readers can start with the ORE primer and then tackle more difficult documents like the ORE Abstract Data Model. There are a variety of presentations about ORE available as well, including all the presentations we gave at the 2008 Open Day at Open Repositories 2008 at Southampton University. Herbert Van de Sompel has uploaded several presentations to SlideShare (see additional presentations with the "oaiore" tag), but some are quite lengthy (160+ slides).
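To make the aggregation idea concrete, here is a minimal Python sketch of a "machine readable splash-page". It is an illustration only, not an implementation of the ORE specifications: the `Aggregation` class and every example.org URI are hypothetical, and the only terms taken from ORE itself are the `ore:describes` and `ore:aggregates` properties and the `ore:Aggregation` type from the ORE vocabulary namespace.

```python
from dataclasses import dataclass, field
from typing import Iterator, List, Tuple

# Namespaces for the ORE vocabulary and for rdf:type.
ORE = "http://www.openarchives.org/ore/terms/"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"

Triple = Tuple[str, str, str]


@dataclass
class Aggregation:
    """Hypothetical helper: a collection of resources with its own URI."""
    uri: str                # identifies the collection as a whole
    resource_map_uri: str   # identifies the machine-readable description
    aggregated: List[str] = field(default_factory=list)

    def triples(self) -> Iterator[Triple]:
        """Yield (subject, predicate, object) statements for the description."""
        yield (self.resource_map_uri, ORE + "describes", self.uri)
        yield (self.uri, RDF_TYPE, ORE + "Aggregation")
        for resource in self.aggregated:
            yield (self.uri, ORE + "aggregates", resource)


# All URIs below are made up for illustration.
agg = Aggregation(
    uri="http://example.org/aggregation/123",
    resource_map_uri="http://example.org/rem/123",
    aggregated=[
        "http://example.org/watch/123.html",    # the HTML splash page
        "http://example.org/media/123.flv",     # the video itself
        "http://example.org/thumbs/123.jpg",    # a thumbnail image
    ],
)

for subject, predicate, obj in agg.triples():
    print(subject, predicate, obj)
```

The point of the sketch is that the collection gets its own URI, separate from the URI of the HTML page, and that the HTML page becomes just one of the aggregated resources.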

Fortunately, there is now a short, gentle introduction to ORE. Herbert has just uploaded to YouTube a nice 10-minute narrated overview of ORE in preparation for the 2009 Dublin Core conference. Obviously, there is a limit to how much can be covered in a 10-minute presentation, but this should answer the question "what is ORE about?" and give you enough background to start reading the ORE suite of documents.

Thanks to Herbert for taking the time to record and upload this video.

-- Michael

Thursday, September 17, 2009

2009-09-19: Football Intelligence and Beyond

Football Intelligence (FI) is a system for gathering, storing, analyzing, and providing access to data to help football enthusiasts discover more about the performance of their favorite pastime.

While taking Dr. Nelson's Collective Intelligence class, I became fascinated with techniques for mining useful information from the "collective intelligence" of readily available data on the Internet.

We decided to apply some of the data mining techniques covered in class in an attempt to predict the 2009 NFL football season. There is a plethora of data out there that could be mined, from injury reports to betting lines, but we decided to limit the scope to box score data for training and predictions.

Using box scores from 2003 to the present, we trained a number of different models, from Support Vector Machines to Multilayer Perceptron networks. The model implementations we are using come from the Weka Data Mining Software. Weka also contains a number of tools for experimenting with and visualizing data.

For comparison and to provide some controls, we have chosen a few simple schemes: home team always wins, city population, and best mascot. For the best mascot competition, I had my daughters rank the mascots from best to worst, and that ranking will be used throughout the season. Poe from the Baltimore Ravens came out on top.
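As a rough illustration of how such control schemes can be scored, here is a small Python sketch. It is not the FI code: the `Game` record, the helper names, the opponent names, and all scores are invented for this example, and the mascot ranking is a placeholder apart from the Ravens being listed first.

```python
from typing import Callable, List, NamedTuple


class Game(NamedTuple):
    home: str
    away: str
    home_score: int
    away_score: int


def winner(game: Game) -> str:
    """Return the winning team of a game that did not end in a tie."""
    return game.home if game.home_score > game.away_score else game.away


def pick_home_team(game: Game) -> str:
    """Control scheme: the home team always wins."""
    return game.home


def make_ranking_picker(ranking: List[str]) -> Callable[[Game], str]:
    """Control scheme: pick whichever team appears earlier in a fixed ranking."""
    position = {team: i for i, team in enumerate(ranking)}

    def pick(game: Game) -> str:
        # Unranked teams sort last; min() keeps the home team if positions tie.
        return min((game.home, game.away),
                   key=lambda team: position.get(team, len(ranking)))

    return pick


def accuracy(games: List[Game], picker: Callable[[Game], str]) -> float:
    """Fraction of non-tied games for which the picker chose the winner."""
    decided = [g for g in games if g.home_score != g.away_score]
    if not decided:
        return 0.0
    correct = sum(picker(g) == winner(g) for g in decided)
    return correct / len(decided)


# Invented results, for illustration only.
GAMES = [
    Game("Ravens", "Team B", 27, 13),
    Game("Team B", "Team C", 10, 20),
    Game("Team C", "Ravens", 14, 21),
    Game("Team B", "Team C", 24, 3),
]

# Placeholder ranking; only the first entry reflects the post.
MASCOT_RANKING = ["Ravens", "Team C", "Team B"]

print("home team always wins:", accuracy(GAMES, pick_home_team))
print("best mascot:", accuracy(GAMES, make_ranking_picker(MASCOT_RANKING)))
```

Scoring the trained models with the same `accuracy` function would put them on equal footing with the controls.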

If you would like to see how we are doing, or even join us with your own predictions, we have pick'em leagues for both straight-up picks and picks against the spread.

Greg Szalkowski

Wednesday, September 16, 2009

2009-09-16: Announcing ArchiveFacebook - A Firefox Add-on for Archiving Facebook Accounts

ArchiveFacebook is a Firefox extension that helps you save web pages from Facebook and manage them easily. It saves content from Facebook directly to your hard drive so that you can view it exactly as you currently view it on Facebook.
Why would you want to do this? Facebook has become a very important part of our lives. Information about our friends, family, business contacts, and acquaintances is stored in Facebook with no easy way to get it out. ArchiveFacebook allows you to do just that. What guarantee do you have that Facebook won't accidentally, or in some cases intentionally, delete your account? Don't trust your data to one web site alone. Take matters into your own hands and preserve this information. Show it to your kids one day!
Currently ArchiveFacebook can save:
  • Photos
  • Messages
  • Activity Stream
  • Friends List
  • Notes
  • Events
  • Groups
  • Info
You can download the extension from its download page. Once at that page, press “Add to Firefox” and follow the prompts for installation. Firefox will prompt you to restart. Once you do so, you should see a new menu called “ArchiveFB”. At this point, ArchiveFacebook has been installed successfully.
Note that, at the time of this writing, ScrapBook should not be enabled while ArchiveFacebook is enabled, because running both causes instability in both extensions. You can easily disable an extension from the Firefox “Add-ons” dialog, which can be found in the “Tools” menu of Firefox.

Logging In:
First of all, make sure you are logged into your Facebook account.  ArchiveFacebook uses Firefox to view the pages of your account and then save them.  So, you need to be logged into your account in order to have the proper authentication and authorization to archive your account.
Once logged in, it is a good idea to open the sidebar. To do this, click ArchiveFB --> Show in Sidebar. The sidebar should appear on the left-hand side of the screen. The sidebar does not need to be open in order to archive, but it provides a richer interface for doing so.
To archive your account, press ArchiveFB --> Archive. This will redirect you to your Facebook profile page. You will then see a dialog box telling you that your activity stream will be expanded. If you press “Cancel”, the archiving process will be cancelled completely. If you press “Ok”, your activity stream will be expanded, displaying all activity on Facebook since your account's creation. As your activity is being retrieved, you are presented with another dialog box that shows the date of the activities currently being retrieved. You may cancel this retrieval at any time and your account will still be archived, with your activity stream archived up to the date at which you cancelled the retrieval.

Once the retrieval of your activity stream has completed, the archiving process will begin.  You will see a window that says “Capture” on it.  This window drives the archive process.  Each page to be archived will be listed in the scrollbar pane.


Browsing the Archive:
Once the archiving process has completed, you will see an entry in the sidebar that says “Facebook | username date”, where username is your Facebook username and date is the current date. Click on the entry. You will see your Facebook profile page appear, and at the bottom will be an annotation bar where you can highlight text or make comments on a page for your personal records. Click through your archived Facebook pages to ensure that all pages have been archived. All pages listed in the introduction should be archived. You can tell whether a page has been archived by placing your cursor over a link. Look in the bottom left-hand corner, where Firefox shows the location of the link: if the location starts with “file://”, the page is on your hard drive; if it starts with “http://”, it is on the web. If a page is not on your local hard drive, try to archive your account again. If the second attempt doesn’t work, please notify us and we will attempt to fix the problem.
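The same check can be expressed in a few lines of Python. This is only a sketch of the rule described above, not part of ArchiveFacebook; the function name and the example links are made up for illustration.

```python
from urllib.parse import urlparse


def is_archived_locally(link: str) -> bool:
    """Return True if the link points at a file on the local hard drive.

    Mirrors the manual check: "file://" locations are local copies,
    while "http://" (or "https://") locations are still on the web.
    """
    return urlparse(link).scheme == "file"


# Hypothetical links, for illustration only.
links = [
    "file:///home/user/archive/facebook/index.html",
    "http://www.facebook.com/profile.php",
]

for link in links:
    status = "archived locally" if is_archived_locally(link) else "still on the web"
    print(link, "->", status)
```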

ArchiveFacebook was developed at Old Dominion University (ODU) by Carlton Northern. Michael L. Nelson (ODU) and Frank McCown (Harding University) served as advisers. You can read more about the add-on in the research paper entitled What Happens When Facebook is Gone? presented at JCDL 2009.
ArchiveFacebook was developed by modifying code from ScrapBook. Note that ArchiveFacebook will not work correctly when ScrapBook is installed.  You will need to temporarily disable or uninstall ScrapBook before using ArchiveFacebook.
When running ArchiveFacebook, it may take several minutes to several hours to complete, depending on the amount of content to be archived.  At the beginning of the process, Firefox will be temporarily frozen while it retrieves your activity stream.  You may cancel this retrieval at any time and the archiving of the rest of your account will still occur.

User Manuals: