2020-09-28: A PhD is a very long tunnel with a light at the end

My PhD defense committee: From the top left, Dr. M. Nelson (my co-advisor),
Dr. M. Weigle (my advisor), Dr. M. Abdous, Dr. S. Jayarathna, Dr. J. Wu, and
M. Aturban (myself).


This year has been tragic and depressing for most of us because of the pandemic, but it has not been that bad for me. I reaped the fruit of almost eight years of hard work toward my PhD: I became a doctor and landed a job I love. My academic journey in the USA started around 12 years ago, when I attended an intensive English program at Portland State University in Portland, Oregon, in 2008. About eight months later, in 2009, I moved to Las Cruces, New Mexico, where I completed my master's degree at New Mexico State University. In August 2012, I was accepted into ODU's PhD program. On July 23rd, I successfully (and virtually) defended my PhD dissertation entitled "A Framework for Verifying the Fixity of Archived Web Resources" (presentation slides). I never thought that immediately after my defense, I would be sitting and celebrating this great moment and achievement by myself in the WS-DL lab because of the pandemic. On August 14th, I completed the last PhD requirement by submitting my dissertation to ProQuest for publication. On August 31st, I started my new job as an assistant professor in Computer Science at Columbia College, Missouri.

In my research, we introduced a framework for establishing and verifying the fixity of archived pages, or mementos, on playback. We built this framework because users of web archives could not verify that archived pages have not been tampered with or changed since the time archives captured them. For example, if a web page was archived in 1999 and is replayed in 2020, how do we know that this archived page has not been tampered with during those 21 years? To verify the fixity of mementos, we need to generate repeatable fixity information (e.g., hash values) on the playback of mementos.
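
As a rough illustration of what fixity information looks like (a minimal sketch, not the framework from the dissertation), the Python snippet below hashes the bytes of a replayed memento with SHA-256 and compares the result to a hash recorded earlier. The function names and the sample HTML are hypothetical and chosen only for this example.

```python
import hashlib


def fixity_hash(playback_bytes: bytes) -> str:
    """Return the SHA-256 hex digest of a memento's replayed content."""
    return hashlib.sha256(playback_bytes).hexdigest()


def is_unchanged(playback_bytes: bytes, recorded_hash: str) -> bool:
    """Check replayed content against a hash recorded at an earlier time."""
    return fixity_hash(playback_bytes) == recorded_hash


# Hypothetical content: what an archive replayed when the hash was recorded...
original = b"<html><body>Archived in 1999</body></html>"
recorded = fixity_hash(original)

# ...and what it replays later, with a few characters altered.
replayed_later = b"<html><body>Archived in 2020</body></html>"

same_ok = is_unchanged(original, recorded)            # identical bytes
tampered_ok = is_unchanged(replayed_later, recorded)  # altered bytes
```

Any change to the replayed bytes, even a small one, yields a different hash, which is why the hash is only useful for verification if the playback itself is repeatable.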

To accomplish that, we first collected 16,627 mementos from 17 public web archives. We described the four methods we used to create the dataset. We then conducted a 14-month study in which we downloaded each memento 39 times and computed its fixity information (e.g., hash values) by following our initial predefined guidelines. For each memento, we compared its 39 hash values. We found that 88.45% of mementos produced more than one hash value, and only 11.55% of mementos always produced the same hash value.

The distribution of all 16,627 mementos for distinct numbers of hash values
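
The comparison step can be sketched as follows, assuming each memento's hash values from its repeated downloads are already available as a list. The data structure, the function name, and the toy hash strings are hypothetical; they are not taken from the study's dataset.

```python
from collections import Counter


def distinct_hash_distribution(hashes_by_memento):
    """Map 'number of distinct hash values' to 'number of mementos'."""
    return Counter(len(set(hashes)) for hashes in hashes_by_memento.values())


# Toy input: three downloads per memento instead of the study's 39.
downloads = {
    "memento-a": ["h1", "h1", "h1"],  # always the same hash
    "memento-b": ["h1", "h2", "h1"],  # two distinct hashes
    "memento-c": ["h1", "h2", "h3"],  # a different hash every time
    "memento-d": ["h4", "h4", "h4"],  # always the same hash
}

distribution = distinct_hash_distribution(downloads)

# Percentage of mementos that always produced the same hash value,
# and the percentage that produced more than one.
share_stable = 100 * distribution[1] / len(downloads)
share_changed = 100 - share_stable
```

With real data, the same counting would give the distribution over 1 to 39 distinct hash values per memento.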

Based on our study results, we identified several changes causing the same memento to have different fixity information over time. This allowed us to define an archive-aware hashing function that consists of several guidelines for generating repeatable fixity information on the playback of mementos. Finally, we introduced two approaches, Atomic and Block, to disseminate fixity information to web archives and verify the fixity of mementos.

Our archive-aware hashing function

My PhD journey was long, about eight years. That is about 3,000 days. I learned a lot from these years. 

First, I assumed that I could do it in four years. It took me eight years. Many factors affect how long a PhD takes. These factors include academic performance, department requirements, research group requirements, how quickly you find a research topic, your background knowledge about the topic, how interested you are in the topic, how collaborative you and your colleagues are, etc. And believe me, there are many more!

Second, when I first started my PhD, I thought that I would do everything independently. I was not against collaboration with others, of course, but it was not a priority during my PhD. However, I learned the hard way that collaboration is critical. By working with others, you will learn from each other. You will be more productive, publish more, and your work will be cited more. Also, you will create more connections. So please, do not be a lone wolf.

Third, we all have 24 hours a day. It took me some time to realize that I should not let my PhD take priority over family or vice versa. As a PhD student, I had to juggle multiple tasks, including research, publications, projects, writing, courses, and teaching. It is essential to know how to manage your time.

I am writing these words from Columbia, Missouri, where I started my teaching position as an assistant professor at Columbia College. I am so excited! 

I want to thank my WS-DL colleagues for their help in loading the moving container to be shipped from Norfolk, Virginia to Columbia, Missouri. 

From left to right, Mohamed Aturban (myself), Sawood Alam, and Nauman Siddique

I want to thank everyone in the WS-DL research group, especially Dr. Michele Weigle and Dr. Michael Nelson, for their support during my PhD journey. Finally, I would like to mention that I left two signs on my cubicle. The first sign has the name Ahmed Alsum, who used the cubicle from 2010 to 2014, and the second sign has my name, Mohamed Aturban. I used this cubicle from 2015 to 2020. I left both signs for Himarsha Jayanetti (a current PhD student), who will be using this cubicle soon. I hope she continues this tradition by preserving these signs for the next PhD student. Our cubicle is like a relay race, where each student works hard on a PhD for several years. Once finished, the student passes the relay baton (cubicle sign) to the next student, who will do the same.

My cubicle from 2015 to 2020 at Old Dominion University.

My office at Columbia College 

--Mohamed Aturban