To commemorate the 25th anniversary of the Internet Archive, I used the Wayback Machine to dig into my past at Elizabethtown College, PA, in particular the period when I was featured as an International Leadership Assistant (ILA) on the college's international student website.
I graduated from Elizabethtown College (Etown) in 2018 with a Bachelor of Science in Computer Engineering. In 2016, I had the opportunity to join the International Leadership Team to help international students promote global culture through various activities. I organized Culture Through ART every Fall semester and assisted my colleague Alexandria Takahashi in developing US Culture & Slang teaching materials. These activities are a good example of how international students not only expressed their own culture through crafts (e.g., origami) and paintings but also learned to adopt US culture (Figure 1). Now, as a Ph.D. student researching Digital Libraries, Web Crawling, Natural Language Processing, and Machine Learning, I was eager to see whether I could find our ILA group from the 2016-2017 academic year in web archives.
Figure 1: Some photos of our ILA team and international student activities (2016-2018)
Initially, I went to Etown's international student resource page and found that the new UI does not have the "International Leadership Team" menu bar that the old UI had. I also searched the site for "International Leadership Team Tasin Choudhury," but the search returned no results (Figure 2). Using the Wayback Machine on the same URL (https://www.etown.edu/offices/international-students/staff.aspx), I discovered that the URL has existed for at least 10 years but has been archived only 11 times (Figure 3). Figure 3 also shows a copy from mid-August 2021, yet the next most recent copy dates all the way back to 2017.
Figure 2: New vs. Old UI and Search Result using the new UI
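The capture count read off the Wayback Machine calendar can also be checked programmatically. The following is a minimal sketch, assuming the Internet Archive's public CDX server is used for the lookup; the function names are my own, and the numbers it reports may differ from the calendar view because the CDX index lists every capture, including redirects.

```python
import json
from collections import Counter
from urllib.parse import urlencode
from urllib.request import urlopen

# Public CDX endpoint of the Wayback Machine.
CDX_ENDPOINT = "https://web.archive.org/cdx/search/cdx"


def build_cdx_query(url):
    """Build a CDX query URL asking for the capture timestamps of `url`."""
    params = {"url": url, "output": "json", "fl": "timestamp,statuscode"}
    return CDX_ENDPOINT + "?" + urlencode(params)


def captures_per_year(rows):
    """Count captures per year from CDX JSON rows (the first row is a header)."""
    if not rows:
        return Counter()
    header, *records = rows
    ts = header.index("timestamp")
    # CDX timestamps look like YYYYMMDDhhmmss, so the first 4 characters are the year.
    return Counter(record[ts][:4] for record in records)


def fetch_capture_counts(url):
    """Query the live CDX server (needs network access) and count captures per year."""
    with urlopen(build_cdx_query(url), timeout=30) as response:
        body = response.read().decode("utf-8")
    return captures_per_year(json.loads(body) if body.strip() else [])
```

Calling `fetch_capture_counts("https://www.etown.edu/offices/international-students/staff.aspx")` would return a per-year tally, which makes gaps such as the one between 2017 and 2021 easy to spot.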
I was hired as an ILA Trainee in 2016 and promoted to senior team member in 2017. Therefore, I started with the 2016 snapshot in the Wayback Machine. I found myself featured under the "International Leadership Team" menu bar (Figure 4), and the URL of that page had changed to "http://www.etown.edu/offices/international-students/peers.aspx." Since the old UI contains the "International Leadership Team" menu bar and the new UI does not, I assumed the page had been removed from the new UI. To make sure, I stripped the Wayback Machine prefix from the archived URL and ran "curl" on the original URL: http://www.etown.edu/offices/international-students/peers.aspx. Figure 5 shows that "curl" returns an HTTP 302 status code, which means the page has been temporarily moved to the URL given by the Location header. So if we search for the page in the browser, the search engine will not find it (Figure 6).
Figure 4: Featured as "Tasin Choudhury" in the old UI and URL has been changed
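The two steps in this check, recovering the original URL from an archived one and looking at the raw redirect response, can be sketched in Python. This is only an illustration under my own assumptions: the helper names are hypothetical, the 14-digit timestamp in the usage note is made up, and the HEAD request is a rough stand-in for the "curl" call rather than the exact command I ran.

```python
import re
import http.client
from urllib.parse import urlsplit

# Matches Wayback Machine URLs of the form
# https://web.archive.org/web/<timestamp>[modifier]/<original URL>
WAYBACK_PATTERN = re.compile(
    r'^https?://web\.archive\.org/web/(\d{1,14})(?:[a-z]{2}_)?/(.+)$'
)


def strip_wayback_prefix(memento_url):
    """Return the original URL inside a Wayback Machine URL, or the input unchanged."""
    match = WAYBACK_PATTERN.match(memento_url)
    return match.group(2) if match else memento_url


def head_without_redirects(url):
    """Send one HEAD request and return (status, Location header).

    http.client never follows redirects, so a 302 is reported as-is.
    """
    parts = urlsplit(url)
    conn_class = (http.client.HTTPSConnection if parts.scheme == "https"
                  else http.client.HTTPConnection)
    conn = conn_class(parts.netloc, timeout=30)
    try:
        path = parts.path or "/"
        if parts.query:
            path += "?" + parts.query
        conn.request("HEAD", path)
        response = conn.getresponse()
        return response.status, response.getheader("Location")
    finally:
        conn.close()
```

For example, `strip_wayback_prefix("https://web.archive.org/web/20160801000000/http://www.etown.edu/offices/international-students/peers.aspx")` yields the live URL, which can then be passed to `head_without_redirects` to see the status code and the Location header.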
I checked "http://www.etown.edu/offices/international-students/peers.aspx" and found that the URL has been archived only 15 times in the past 10 years. Although the initial search on the live site could not retrieve the page on which I was featured in 2016, the Internet Archive's Wayback Machine did. I also checked for mementos of "http://www.etown.edu/offices/international-students/peers.aspx" using the Time Travel service, but it found zero mementos (Figure 7).
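Counting mementos can also be done from a TimeMap, the list of archived copies defined by the Memento protocol (RFC 7089). The sketch below assumes the TimeMap has already been downloaded in link format; the fetching step is left out, the function name is hypothetical, and the regular expressions cover only the simple, well-formed entries that TimeMaps usually contain.

```python
import re

# One link-format entry: <URI> followed by ;name="value" parameters.
LINK_ENTRY = re.compile(r'<([^>]*)>((?:\s*;\s*[\w-]+="[^"]*")*)')
REL_PARAM = re.compile(r'rel="([^"]*)"')
DATETIME_PARAM = re.compile(r'datetime="([^"]*)"')


def list_mementos(timemap_text):
    """Return (URI, datetime) pairs for every entry whose rel includes 'memento'."""
    mementos = []
    for uri, params in LINK_ENTRY.findall(timemap_text):
        rel = REL_PARAM.search(params)
        # rel may hold several space-separated values, e.g. "first memento".
        if rel and "memento" in rel.group(1).split():
            dt = DATETIME_PARAM.search(params)
            mementos.append((uri, dt.group(1) if dt else None))
    return mementos
```

An empty list from `list_mementos` corresponds to the "zero mementos" result reported above, while `len(list_mementos(text))` gives the number of archived copies the TimeMap knows about.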
We can conclude that more work is needed on how often sites are crawled: some sites are crawled more often than necessary, and others not often enough. For example, Figure 8 shows that "https://twitter.com/realDonaldTrump" had 11 copies on August 14, 2021, even though the page is no longer online. Interestingly, that is the same number of copies Etown's international student resource page received over 10 years. This illustrates the importance of balancing the crawl frontier.
Acknowledgment
I am very excited to celebrate #InternetArchive25 with the Internet Archive. I would like to thank the Internet Archive for the fact that I could retrieve the page where I was featured at Etown as an ILA. I would also like to express my gratitude to Dr. Michael Nelson for assisting me in this work and Dr. Jian Wu for encouraging me to write this blog.