2014-05-28: The road to the most precious three letters, PhD

I can’t forget the commencement on May 10th, 2014, with hundreds of students in their caps and gowns, ready for the moment of graduation. For me, it was the crowning moment of a long journey toward my Ph.D. degree in computer science. A few days before that, on May 3rd, 2014, I submitted my dissertation, entitled “Web Archive Services Framework For Tighter Integration Between The Past And Present Web”, to the ODU registrar's office as a declaration that I had completed the requirements for the degree. On Feb 26th, 2014, I defended my dissertation, which I presented with these slides; the defense is also available as streaming video.

In my research, I explored a proposed service framework that provides APIs over the web archive corpus, enabling users and third-party developers to access the web archive on four levels.

  • The first level is the content level, which gives access to the actual content of web archive corpora with various filters.
  • The second level is the metadata level, which gives access to two types of metadata. The first type is the temporal web graph: the ArcLink system extracts, preserves, and delivers the temporal web graph for the corpus. ArcLink was published as a poster at JCDL 2013, with my favorite minute madness, and in a more detailed version as a tech report. ArcLink was also presented at IIPC GA 2013 and received good feedback from the web archives consortium. The second type of metadata is thumbnails: we proposed thumbnail summarization techniques to select and generate a distinctive set of pages that represent the main changes in the visual appearance of a webpage through time. This work was presented at ECIR 2014.
  • The third level is the URI level, where we tried to extend the default URI lookup interface to benefit from HTTP redirection. This research was discussed at TempWeb 2013, and the full paper is available in the proceedings.
  • The fourth level is the archive level, where we quantified current web archiving activities in two directions. The first direction was the percentage of the live web that is covered by web archives; this work was presented at JCDL 2011, and a detailed version appeared as a tech report. It attracted the attention of several reporters, with coverage in The Atlantic, The Chronicle of Higher Education, and MIT Technology Review. The second direction was the distribution of web archive materials, where we developed new methods to profile web archives based on TLD and language. This work was presented at TPDL 2013, and an extended version with a larger dataset has been accepted for publication in an IJDL special issue.

Now, while writing about it from my office at Stanford University Library, where I’m working as a web archiving engineer and leading the technical activities for the new Stanford web archiving project, I remember the long journey since I arrived in the US in Fall 2009 to start my degree. It was a long trip to earn the most precious three letters that will be attached to my name forever: Ahmed AlSum, PhD.
@JFK on Aug 2, 2009
