2017-03-24: The Impact of URI Canonicalization on Memento Count

Mat reports that a Memento TimeMap alone is not sufficient for evaluating how well a URI is archived. We performed a study of very large Memento TimeMaps to evaluate the ratio of representations to redirects obtained when dereferencing each archived capture. Read along below or check out the full report.

Memento represents the set of captures for a URI (e.g., http://google.com) with a TimeMap. Web archives may provide a Memento endpoint that allows users to obtain this list of capture URIs, called URI-Ms. Each URI-M identifies a single capture (memento), which is accessible by dereferencing the URI-M, that is, resolving it to an archived representation of a resource. Variations of the "original URI" are canonicalized (coalescing https://google.com and http://www.google.com:80/, for instance), and the original URI (URI-R in Memento terminology) is also included in the TimeMap with a literal "original" relationship value.
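The coalescing of URI variants described above can be sketched in Python. This is a deliberately simplified illustration, not the canonicalization that any particular web archive applies. The function name `canonicalize` and its rules (ignore the scheme, lowercase the host, strip a leading `www.`, drop default ports) are assumptions made for this example.

```python
from urllib.parse import urlsplit

# Default ports per scheme; a port equal to the default carries no information.
DEFAULT_PORTS = {"http": 80, "https": 443}


def canonicalize(uri: str) -> str:
    """Return a simplified lookup key for a URI-R (hypothetical rules)."""
    parts = urlsplit(uri)

    # urlsplit already lowercases the hostname; guard against a missing host.
    host = (parts.hostname or "").lower()

    # Treat "www.example.com" and "example.com" as the same site.
    if host.startswith("www."):
        host = host[len("www."):]

    # Keep the port only when it differs from the scheme's default.
    port = parts.port
    if port is not None and port != DEFAULT_PORTS.get(parts.scheme):
        host = f"{host}:{port}"

    # An empty path and "/" identify the same resource.
    key = host + (parts.path or "/")
    if parts.query:
        key += "?" + parts.query
    return key


if __name__ == "__main__":
    for uri in ("https://google.com", "http://www.google.com:80/"):
        print(uri, "->", canonicalize(uri))
```

Under these assumed rules, both example URIs from the text reduce to the same key, which is why their captures would be listed in a single TimeMap.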