2018-07-18: Why We Need Private Web Archives: Almost Two-Thirds of Web Traffic IS NOT Publicly Archivable
Google.com mementos from May 8th 1999 on the Internet Archive

In terms of the ability to be archived in public web archives, web pages fall into one of two categories: publicly archivable, or not publicly archivable.

1. Publicly Archivable Web Pages: These pages can be archived by public web archives. They can be accessed without a login or other authentication; in other words, they do not reside behind a paywall. Grant Atkins examined paywalls in the Internet Archive for news sites and found that web pages behind paywalls may actually redirect to a login page at crawl time. A good example of a publicly archivable page is Dr. Steven Zeil's page, since no authentication is required to view it. Furthermore, it does not use client-side scripts (e.g., Ajax) to load additional content, so what you see in the web browser and what you can replay from public web archives are exactly the same.

Screen shot from Dr. Steven Zeil's page captured on 2018-07-02
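The distinction above (no authentication required, and no redirect to a login page at crawl time) can be sketched as a rough heuristic. This is only an illustration, not how any public archive actually decides what to capture: the function name `looks_publicly_archivable`, the `LOGIN_HINTS` keywords, and the example URLs are all assumptions made for this sketch, and it does not detect content loaded by client-side scripts.

```python
from urllib.parse import urlparse

# Hypothetical keywords that suggest a login or paywall page (an assumption
# for this sketch, not a list taken from any archive's crawler).
LOGIN_HINTS = ("login", "signin", "sign-in", "subscribe", "paywall")


def looks_publicly_archivable(requested_url, final_url, status_code):
    """Guess whether a crawler would have captured the real page.

    requested_url: the URL the crawler asked for
    final_url:     the URL it ended up at after following redirects
    status_code:   the HTTP status of the final response
    """
    # Anything other than a plain 200 (including 401/402/403, which signal
    # that access is restricted) is treated as not archivable.
    if status_code != 200:
        return False

    requested = urlparse(requested_url)
    final = urlparse(final_url)

    # Compare host and path only, so an http -> https upgrade alone
    # does not count as a redirect away from the page.
    redirected = (requested.netloc, requested.path) != (final.netloc, final.path)

    # A redirect whose target looks like a login page means the crawler
    # would have captured the login form, not the article.
    final_target = (final.path + "?" + final.query).lower()
    if redirected and any(hint in final_target for hint in LOGIN_HINTS):
        return False

    return True
```

With an HTTP client such as `requests`, the final URL and status are available as `response.url` and `response.status_code` after redirects have been followed, so the function can be fed directly from a fetch. A keyword match on the redirect target is a crude signal and will misclassify some sites, so treat the result as a hint only.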