Friday, June 10, 2011

2011-06-10: Launching Synchronicity - A Firefox Add-on for Rediscovering Missing Web Pages in Real Time


Today we introduce Synchronicity, a Firefox extension that supports the user in rediscovering missing web pages. It triggers on the occurrence of 404 "Page not Found" errors, provides archived copies of the missing page as well as five methods to query search engines for the new location of the page (in case it has moved) or to obtain a good enough replacement page (in case the page is really gone).
Synchronicity works in real time and helps to mitigate the damage that link rot causes on the web.

Installation:
Download the add-on from https://addons.mozilla.org/en-US/firefox/addon/synchronicity and follow the installation instructions. After restarting Firefox you will notice Synchronicity's shrimp icon in the right corner of the status bar.

Usage:
Whenever a 404 "Page not Found" error occurs, the little icon will change color and turn red to notify the user that it has caught the error. Just click once on the red icon and the Synchronicity panel will load.
Synchronicity utilizes the Memento framework to obtain archived copies of a page. On startup you are in the Archived Version tab where two visualizations of all available archived copies are offered.
The TimeGraph is a static image giving an overview of the number of copies available per year. Three drop-down boxes enable you to pick a particular copy by date and have it displayed in the main browser window.
The TimeLine offers a "zoomable" way to explore the copies according to the time they were archived. Each copy is represented by the icon of its hosting archive. You can click on the icon to receive metadata about the copy and see a link that will display the copy. You can also filter the copies by their archive.
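
To illustrate the kind of data behind these two views, here is a minimal Python sketch. It is not Synchronicity's own code: the (datetime, archive name) tuple layout and the function names copies_per_year and filter_by_archive are assumptions made for this example only.

```python
from collections import Counter
from datetime import datetime


def copies_per_year(copies):
    """Count archived copies per year (the numbers a TimeGraph-like chart would plot)."""
    counts = Counter(captured.year for captured, _archive in copies)
    return dict(sorted(counts.items()))


def filter_by_archive(copies, archive):
    """Keep only the copies held by one archive (a TimeLine-like filter)."""
    return [copy for copy in copies if copy[1] == archive]


# Hypothetical sample data: (capture time, hosting archive)
copies = [
    (datetime(2008, 3, 1), "Internet Archive"),
    (datetime(2008, 9, 15), "WebCite"),
    (datetime(2010, 1, 20), "Internet Archive"),
]

print(copies_per_year(copies))
print(len(filter_by_archive(copies, "Internet Archive")))
```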


Based on these copies Synchronicity provides two content-based methods:
  1. the title of the page
  2. the keywords (lexical signature) of the page
Both can be used as queries against Google, Yahoo! and Bing. The idea is that these queries represent the "aboutness" of the missing page and hence make a good query to discover the page at its new location (URI) or to discover a good enough replacement page that satisfies the user's information need.
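
As a rough illustration of the keyword method, a lexical signature is commonly built by ranking the terms of a page by TF-IDF and keeping the top few. The Python sketch below assumes that approach; the function name lexical_signature, the doc_freq table and the corpus_size value are made up for this example and do not describe how Synchronicity computes its keywords.

```python
import math
import re
from collections import Counter


def lexical_signature(text, doc_freq, corpus_size, k=5):
    """Return the k terms with the highest TF-IDF score in text.

    doc_freq maps a term to the number of documents containing it
    (assumed to come from some background corpus of corpus_size documents).
    """
    terms = re.findall(r"[a-z]+", text.lower())
    tf = Counter(terms)

    def score(term):
        df = doc_freq.get(term, 0)
        return tf[term] * math.log(corpus_size / (1 + df))

    # Highest score first; ties are broken alphabetically for stable output.
    ranked = sorted(tf, key=lambda term: (-score(term), term))
    return ranked[:k]


# Hypothetical page text and document frequencies
text = "the shrimp icon the shrimp archive"
doc_freq = {"the": 900, "icon": 50, "archive": 100, "shrimp": 5}
query = " ".join(lexical_signature(text, doc_freq, corpus_size=1000, k=3))
print(query)
```

The resulting string could then be submitted to a search engine as the query.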


Synchronicity can further obtain tags from Delicious that users created to annotate the page. Even though tags are sparse, where available they can make a well-performing search engine query. Additionally, Synchronicity will extract the most salient keywords from pages that link to the missing page (link neighborhood lexical signature), which again can be used as a query.
Lastly, Synchronicity offers a convenient way to modify the URL that caused the 404 error and try again. The idea is that shortening the path may get you where you want to go.
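
The path-shortening idea can be sketched in a few lines of Python. This is only an illustration under the assumption that one path segment is dropped at a time and that query string and fragment are discarded; the function name shortened_candidates is hypothetical and not part of the add-on.

```python
from urllib.parse import urlsplit, urlunsplit


def shortened_candidates(url):
    """List progressively shorter versions of url, ending at the site root."""
    parts = urlsplit(url)
    segments = [segment for segment in parts.path.split("/") if segment]
    candidates = []
    while segments:
        segments.pop()  # drop the last path segment
        path = "/" + "/".join(segments)
        if segments:
            path += "/"
        # Query string and fragment are dropped on purpose.
        candidates.append(urlunsplit((parts.scheme, parts.netloc, path, "", "")))
    return candidates


for candidate in shortened_candidates("http://example.com/a/b/c.html"):
    print(candidate)
```

Each candidate can be tried in turn until one of them no longer returns a 404.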

These last three methods can be applied if no archived copy of the missing page can be found.

Synchronicity provides a straightforward interface but also enables more experienced users to modify all parameters underlying the extraction of titles, keywords, tags and extended keywords. The Expert Interface lets you, for example, show the titles of the last n copies, where you specify the value of n. It also enables you to pick a particular copy to extract the keywords from and to change many more parameters.



Notes:
Synchronicity is a beta release so do not let it perform open-heart surgery on your mother-in-law!
It was developed within the WS-DL research group in the Computer Science Department at Old Dominion University by Moustafa Aly and Martin Klein under the supervision of Dr. Michael L. Nelson.

Please send your feedback, comments and suggestions for improvement to
synchronicity-info@googlegroups.com

--
martin
