Tuesday, October 25, 2016

2016-10-25: Paper in the Archive

Mat reports on his journalistic experience and how we can relive it through Internet Archive (#IA20)

We have our collections, the things we care about, the mementos that remind us of our past. Many of these things reside on the Web. For those we want to recall and should have (in hindsight) saved, we turn to the Internet Archive.

As a computer science (CS) undergrad at University of Florida, I worked at the student-run university newspaper, The Independent Florida Alligator. This experience became particularly relevant with my recent scholarship to preserve online news. At the paper, we reported mostly on the university community, but also on news that catered to the ACRs through reports about Gainesville (e.g., city politics).

News is compiled late in the day to maximize temporal currency. I started at the paper as a "Section Producer" and eventually became a Managing Editor. I was in charge of the online edition, Alligator Online, the "New Media" counterpart of the daily print edition. The late shift fit well with my already established coding schedule.

Proof from '05, with the 'thew' still intact.

The Alligator is an independent newspaper -- the content we published could conflict with the university's interests without fear of being censored by the university. Typical university-affiliated college newspapers have this conflict of interest, which potentially limits their content to only that which is approved. This was part of the draw to the paper for me and, I imagine, for the student readers seeking less biased reporting. The orange boxes were often empty well before day's end. Students and ACRs read the print paper. As a CS student, I preferred Alligator Online.

With a unique technical perspective among my journalistic peers, I introduced a homebrewed content management system (CMS) into the online production process. This allowed Alligator Online to focus on porting the print content and not on futzing with markup. This also made the content far more accessible and, as time has shown thanks to Internet Archive, preservable.

Internet Archive's capture of Alligator Online at alligator.org over time with my time there highlighted in orange.

After graduating from UF in 2006, I continued to live and work elsewhere in Gainesville for a few years. Even then technically an ACR, I still preferred Alligator Online to print. A new set of students transitioned into production of Alligator Online and eventually deployed a new CMS.

Now as a PhD student of CS studying the past Web, I have observed a resultant decline in accessibility that occurred after I had moved on from the paper. This corresponds further with our work On the Change in Archivability of Websites Over Time (PDF). Thankfully, adaptations at Alligator Online and possibly IA have allowed the preservation rate to recover (see above, post-tenure).

alligator.org before (2004) and after (2006) I managed, per captures by Internet Archive.

With Internet Archive celebrating 20 years in existence (#IA20), IA has provided the means for me to see the aforementioned trend in time. My knowledge in the mid-2000s of web standards and accessibility facilitated preservation. Because of this, with special thanks to IA, the collections of pages I care about -- the mementos that remind me of my past -- are accessible and well-preserved.

— Mat (@machawk1)

NOTE: Only after publishing this post did I think to check alligator.org's robots.txt file as archived by IA. The final capture of alligator.org in 2007, before the next temporally adjacent one in 2009, occurred on August 7, 2007. At that time (and prior), no robots.txt file existed for alligator.org; IA preserved the resulting 404. Around late October of that same year, a robots.txt file was introduced with the lines:
User-Agent: *
Disallow: /
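
To illustrate why those two lines would stop captures, here is a minimal sketch using Python's standard urllib.robotparser. It is not how IA's crawler evaluates robots.txt; the helper name is_allowed and the agent string "ia_archiver" are assumptions chosen for the example.

```python
from urllib.robotparser import RobotFileParser

# The two lines quoted above from the late-2007 robots.txt.
BLOCK_ALL = ["User-Agent: *", "Disallow: /"]


def is_allowed(robots_lines, agent, url):
    """Return True if `agent` may fetch `url` under the given robots.txt lines.

    Hypothetical helper for illustration only.
    """
    parser = RobotFileParser()
    parser.parse(robots_lines)  # parse rules from memory, no network access
    return parser.can_fetch(agent, url)


if __name__ == "__main__":
    # "ia_archiver" is an assumed crawler name; the wildcard rule
    # applies to any agent string.
    print(is_allowed(BLOCK_ALL, "ia_archiver", "http://alligator.org/"))
    # With no rules at all (as before the file was introduced), nothing is blocked.
    print(is_allowed([], "ia_archiver", "http://alligator.org/"))
```

Under these rules every path on the site is disallowed for every crawler that honors robots.txt, which is consistent with the gap in captures between 2007 and 2009.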
