2020-11-03: 19 Years of Wayback – Inspiring the collection and replay of the web
The Internet Archive’s Wayback Machine is almost 20 years old. As the Wayback Machine nears its second full decade of operation, I reflected on how my research has been inspired by the work that goes into enabling the historical replay of the web.

"The @waybackmachine is officially old enough to vote but not drink this year. 468 Billion web pages and more than 1,000,000,000,000 captures later, the Wayback Machine is still public, free, and committed to access for all. https://t.co/EdYMeVz2q5" (Internet Archive, @internetarchive, October 26, 2020)

I’ve been away from the WS-DL blog for a little while, so a reintroduction is probably worthwhile. I am a PhD alumnus of the WS-DL group and am currently a Principal Researcher at The MITRE Corporation. As you might guess, my work in the WS-DL group focused on web archiving, and specifically on the crawler and information collection trade-offs of using crawlers that exercise JavaScript on web pages (e.g., Brozzler) vs those that...