We constructed two very simple test pages, both of which include links to Voorburg's missing page http://www.dds.nl/~krantb/stellingen/.
- http://www.cs.odu.edu/~jbrunelle/wsdl/unrobustifyTest.html which does not use robustify.js
- http://www.cs.odu.edu/~jbrunelle/wsdl/robustifyTest.html which does use robustify.js
GET /services/statuscode.php?url=http%3A%2F%2Fwww.dds.nl%2F~krantb%2Fstellingen%2F HTTP/1.1 Host: digitopia.nl Connection: keep-alive Origin: http://www.cs.odu.edu User-Agent: Mozilla/5.0 (iPhone; CPU iPhone OS 8_0 like Mac OS X) AppleWebKit/600.1.3 (KHTML, like Gecko) Version/8.0 Mobile/12A4345d Safari/600.1.4 Accept: */* Referer: http://www.cs.odu.edu/~jbrunelle/wsdl/robustifyTest.html Accept-Encoding: gzip, deflate, sdch Accept-Language: en-US,en;q=0.8 HTTP/1.1 200 OK Server: nginx/1.1.19 Date: Fri, 06 Feb 2015 21:47:51 GMT Content-Type: application/json; charset=UTF-8 Transfer-Encoding: chunked Connection: keep-alive X-Powered-By: PHP/5.3.10-1ubuntu3.15 Access-Control-Allow-Origin: *
The resulting JSON is used by robustify.js to then redirect the user to the memento http://web.archive.org/web/19990830104212/http://www.dds.nl/~krantb/stellingen/ as expected.
Given this success, we wanted to understand how our test pages would behave in the archives. We also included a link to the stellingen memento in our test page before archiving to understand how a URI-M would behave in the archives. We used the Internet Archive's Save Page Now feature to create the mementos at URI-Ms http://web.archive.org/web/20150206214019/http://www.cs.odu.edu/~jbrunelle/wsdl/robustifyTest.html and http://web.archive.org/web/20150206215522/http://www.cs.odu.edu/~jbrunelle/wsdl/unrobustifyTest.html.
The Internet Archive re-wrote the embedded links to be relative to the archive in the memento, converting http://www.dds.nl/~krantb/stellingen/ to http://web.archive.org/web/20150206214019/http://www.dds.nl/~krantb/stellingen/. Upon further investigation, we noticed that robustify.js does not issue onclick events to anchor tags linking to pages within the same domain as the host page. An onclick even is not assigned to this any embedded anchor tags because all of the links point to within the Internet Archive, the host domain. Due to this design decision, robustify.js is never invoked when within the archive.
When clicking on the URI-M, the 2015-02-06 memento does not exist, so the Internet Archive redirects the user to the closest memento http://web.archive.org/web/19990830104212/http://www.dds.nl/~krantb/stellingen/. The user, when clicking the link, ends up at the 1999 memento because the Internet Archive understands how to redirect the user from the 2015 URI-M for a memento that does not exist to the 1999 URI-M for a memento that does exist. If the Internet Archive had no memento for http://www.dds.nl/~krantb/stellingen/ the user would simply receive a 404 and not have the benefit of robustify.js using the Memento Time Travel service to search additional archives.
These templates are rewritten during playback to be relative to the Internet Archive:
Instead of issuing an HTTP GET for http://digitopia.nl/services/statuscode.php?url=http%3A%2F%2Fwww.dds.nl%2F~krantb%2Fstellingen%2F, robustify.js issues an HTTP GET for
http://www.cs.odu.edu/web/20150206214020/http://digitopia.nl/services/statuscode.php?url=http%3A%2F%2Fwww.dds.nl%2F~krantb%2Fstellingen%2F which returns an HTTP 404 when dereferenced. The robustify.js script does not handle the HTTP 404 response when looking for its service, and throws an exception in this scenario. Note that the memento that references the URI-M of robustify.js does not throw an exception because the robustify.js script does not make a call to digitopia.nl/services/.
In our test mementos, the Internet Archive also re-writes the URI-M http://web.archive.org/web/19990830104212/http://www.dds.nl/~krantb/stellingen/ to http://web.archive.org/web/20150206214019/http://web.archive.org/web/19990830104212/http://www.dds.nl/~krantb/stellingen/.
Yo Dawg situation) does not exist. Clicking on the apparent memento of a memento link leads to the user being told by the Internet Archive that the page is available to be archived.
We also created an Archive.today memento of our robustifyTest.html page: https://archive.today/l9j3O. In this memento, the functionality of the robustify script is removed, redirecting the user to http://www.dds.nl/~krantb/stellingen/ which results in a HTTP 404 response from the live web. The link to the Internet Archive memento is re-written to https://archive.today/o/l9j3O/http://www.dds.nl/~krantb/stellingen/, which results in a redirect (via a refresh) to http://www.dds.nl/~krantb/stellingen/ which results in a HTTP 404 response from the live web, just as before. Archive.today uses this redirect approach as standard operating procedure. However, Archive.today re-writes all links to URI-Ms back to their respective URI-Rs.
This is a different path to a broken URI-M than the Internet Archive takes, but results in a broken URI-M, nonetheless. Note that Archive.today simply removes the robustify.js file from the memento, not only removing the functionality, but also removing any trace that it was present in the original page.