2018-07-02: The Off-Topic Memento Toolkit
Inspired by AlNoamany's work in "Detecting off-topic pages within TimeMaps in Web archives", I am pleased to announce an alpha release of the Off-Topic Memento Toolkit (OTMT). The results of testing with this software will be presented at iPres 2018, and those results are now available as a preprint.

Web archive collections are created with a specific purpose in mind. A curator will supply seeds for the collection and create multiple versions of these seeds, called mementos, in order to study the evolution of a web page over time. This is valuable for following the changes in an organization or the events in a news story. Unfortunately, depending on the curator's intent, sometimes these seeds go off-topic. Because web archive crawling software has no way to know that a page is off-topic, these mementos are added to the collection anyway.

Below I list a few examples of off-topic pages within Archive-It collections. This memento from the Human Rights collection at Archive-It create