|From right to left: Dr. Nelson (my advisor), Yousof (my son), Yasmin (myself), Ahmed (my husband)|
I started my Ph.D. in January 2011, at the same time that the uprisings of the Jan 25 Egyptian Revolution began. I watched what was happening in Egypt from Norfolk, Virginia. During the 18 days, I could not do anything except follow the events on every news and social media channel. I wished that my son Yousof, who was less than 2 years old at that time, could know what was happening as I saw it. Luckily, I knew about Archive-It, a subscription service by the Internet Archive that allows institutions to develop, curate, and preserve collections of Web resources. Each collection in Archive-It has two dimensions: time and URI. Understanding the contents and boundaries of these archived collections is a challenge for most people, which leads to a paradox: the larger the collection, the harder it is to understand.
We named the proposed framework the Dark and Stormy Archive (DSA) framework, in which we integrate “storytelling” social media and Web archives. In the DSA framework, we identify, evaluate, and select candidate Web pages from archived collections that summarize the holdings of these collections, arrange them in chronological order, and then visualize these pages using tools that users are already familiar with, such as Storify. An example of the output is below. It shows three stories for the three collections about the Egyptian Revolution. The user can gain an understanding of the holdings of each collection from the snippets of each story.
There are multiple collections in Archive-It about the Jan. 25 Egyptian Revolution
There is more than one collection documenting the Arab Spring, and particularly the Egyptian Revolution. Documenting long-running events such as the Egyptian Revolution results in large collections that have thousands of URIs, and each URI has thousands of copies through time. It is challenging for my son to pick a specific collection to learn the key events of the Egyptian Revolution. The topic of my dissertation, which was entitled "Using Web Archives to Enrich the Live Web Experience Through Storytelling", focused on understanding the holdings of the archived collections.
|The story of the Arab Spring Collection|
|The story of the North Africa and the Middle East collection|
|The story of the Egyptian Revolution collection|
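For readers who think in code, here is a minimal sketch of the "select and arrange" idea described above. It is not the DSA implementation: the `Memento` class, the `quality` score, and the equal-width time slicing in `select_story` are all assumptions I made up for illustration. The sketch treats a collection as points on the two dimensions (URI and time), keeps the best-scoring archived page in each time slice, and returns the picks in chronological order.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class Memento:
    """One archived copy of a page: a point on the (URI, time) grid."""
    uri: str
    captured: datetime
    quality: float  # hypothetical score; higher means a better story candidate


def select_story(mementos, k):
    """Pick at most k mementos spread over time, in chronological order."""
    if not mementos or k <= 0:
        return []
    ordered = sorted(mementos, key=lambda m: m.captured)
    start, end = ordered[0].captured, ordered[-1].captured
    span = (end - start).total_seconds() or 1.0  # avoid dividing by zero
    best = {}  # slice index -> best memento seen in that slice
    for m in ordered:
        offset = (m.captured - start).total_seconds()
        idx = min(int(k * offset / span), k - 1)  # last instant joins last slice
        if idx not in best or m.quality > best[idx].quality:
            best[idx] = m
    # Slice indices grow with time, so sorting them gives chronological order.
    return [best[i] for i in sorted(best)]


# Toy data only; the URIs and scores are invented for the example.
collection = [
    Memento("http://example.org/a", datetime(2011, 1, 25), 0.4),
    Memento("http://example.org/b", datetime(2011, 1, 28), 0.9),
    Memento("http://example.org/c", datetime(2011, 2, 11), 0.7),
    Memento("http://example.org/d", datetime(2011, 2, 12), 0.2),
]
for m in select_story(collection, k=2):
    print(m.captured.date(), m.uri)
```

Slicing by time is just one simple way to keep a story from clustering around a single busy day; how pages are actually scored and filtered is the hard part, and this toy example does not attempt it.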
With the help of the Archive-It team and partners, we obtained a ground truth data set for evaluating the stories generated by the DSA framework. We used Amazon Mechanical Turk to evaluate the automatically generated stories against stories that were created by domain experts. The results show that the stories automatically generated by the DSA framework are indistinguishable from those created by human domain experts, while at the same time both kinds of stories (automatic and human) are easily distinguished from randomly generated stories. I successfully defended my Ph.D. dissertation on 06/16/2016.
Using Web Archives to Enrich the Live Web Experience Through Storytelling - Ph.D. defense presentation from Yasmina Anwar
Generating persistent stories from themed archived collections will ensure that future generations can browse the past easily. I’m glad that Yousof and those who come after him will be able to understand the past through generated stories that summarize the holdings of the archived collections.
- A list of my publications:
- Visualizing digital collections at Archive-It, JCDL 2012: ppt
- Access patterns for robots and humans in web archives, JCDL 2013: ppt
- Who and What Links to the Internet Archive, TPDL 2013: ppt
- Who and what links to the Internet Archive, IJDL 2014
- Using Web Archives to Enrich the Live Web Experience Through Storytelling, IEEE-DCDL 2013: ppt
- Detecting Off-Topic Pages in Web Archives, TPDL 2015: ppt
- Characteristics of Social Media Stories, TPDL 2015: ppt
- Characteristics of social media stories: What makes a good story?, IJDL 2016
- Detecting off-topic pages within TimeMaps in Web archives, IJDL 2016
- The resulting code from this research on GitHub:
To continue the WS-DLers’ habit of providing recaps, lessons learned, and recommendations, I will list some of the lessons I learned about what it takes to be a successful Ph.D. student, along with advice for applying to academic positions. I hope these lessons and advice will be useful for future WS-DLers and grad students. Lessons learned and advice:
- The first one and the one I always put in front of me: You can do ANYTHING!!
- Getting involved in communities in addition to your academic life is useful in many ways. I have participated in many women in technology communities such as the Anita Borg Institute and the Arab Women in Computing (ArabWIC) to increase the inclusion of women in technology. I was awarded travel scholarships to attend several well-known women in tech conferences: CRA-W (Graduate Cohort 2013), Grace Hopper Celebration of Women in Computing (GHC) 2013, GHC 2014, GHC 2015, and ArabWIC 2015. I am a member of the leadership committee of ArabWIC, which helped me gain leadership skills. Attending these meetings helps students mature, widens their personal connections, and prepares them for their future careers.
- Publications matter! If you are in WS-DL, you will have to get the targeted score 😉. You can learn more about the point system on the wiki. If you plan to apply for academic positions, your list of publications matters a great deal.
- Teaching is important for applying in academia.
- Collaboration is key to growing your connections, and it will also help you develop your skills for working in teams.
- And last, being a mom holding a Ph.D. is not easy at all!!
The trail was not easy, but it was worth it. I have learned and changed a lot since I started the program. Having enthusiastic and great advisors like Dr. Nelson and Dr. Weigle is a huge support that results in a happy ending and an achievement to be proud of.