Posts

Showing posts from October, 2021

2021-10-22: Rediscovering my Angelfire.com Pages with the Help of the Wayback Machine's CDX API

To liberally steal a line from Stephen King: "A 6-digit ICQ number fled through the ancient Internet, and Jim sporadically scrambled after it." This has been the case for well over a decade, probably closer to two decades at this point. I know there's a "7" involved. I think it started with "173." Or was it "178"? Human memory is fuzzy like that: we tend to store shape features in long-term memory, which is very problematic for recalling specific numbers twenty-something years later. In the spirit of celebrating the Internet Archive's 25th birthday (everyone else is saying anniversary, but I'm going with birthday because I like to anthropomorphize computer things), I thought it appropriate to share some very special data I was able to dig up from the strata of the ancient web during one of my fits to track down that ICQ number. I was up late one dark and stormy night, most likely training a U-Net model for a work project. If

2021-10-21: How the Internet Archive Helped Me Remember CIKM 2019

Last week, I was discussing my publications with a colleague, and I mentioned that I had a paper published at ACM CIKM 2019. They were curious about the conference and its 2019 call for papers (CFP) date. I was trying to recall the conference's venue and one of the workshop's names. I remembered that the conference was in Beijing but did not have the other information on hand during the discussion. To help answer these questions, I visited the CIKM 2019 website as linked from CIKM's page and found a very different website from what I remembered. Content drift had struck again. This is not the CIKM conference website I was looking for... I discovered that the CIKM 2019 conference proceedings contain proof of my paper publication, a welcome from the conference chairs, paper statistics, lists of committee members and organizers, sponsors, and other papers and workshops at CIKM 2019. The proceedings do not mention: the conference venue, recommended conference hotels, CFP d

2021-10-21: 9/11 through the eyes of the Internet Archive

"Where were you when the planes hit the World Trade Center?" It seems like everyone has an answer to that question. On 9/11, my mom was visiting my grandparents in a small town in Wisconsin. My dad was stationed in Korea and called my mom to tell her to turn on the news. She was standing in my grandparents' living room, on the phone with my dad, when she watched the second plane hit the World Trade Center. I was 2 years old, so what I know about 9/11 is from documentaries and stories from my parents and grandparents. The same is true for those in my generation and generations to follow. With the 20th anniversary of the 9/11 attacks, I was curious how 9/11 was reflected in the Internet Archive. I looked at the captures for 5 major news networks on September 11, 2001: CNN, Fox News, MSNBC, Washington Post, and NY Times. The first plane hit at 8:46 am and the earliest capture from those websites came from CNN at 8:03 pm, almost 12 hours later. These captures show the after e

2021-10-20: Not Your Parents’ Web: Scope, Segmentation, Stability, Resilience, and Persistence

"Even though the documents on the Internet are the easy documents to collect and archive, the average lifetime of a document is 75 days and then it is gone." -- Brewster Kahle, November, 1996
Researchers from the Internet Archive, Protocol Labs, and Old Dominion University will revisit the question “how long does a web page last?” The answer frequently given is 44, 75, or 100 days, all of which stem from research that dates back to 1996-2003. It is well known that “44-100 days on average” does not capture the complexity of HTTP activity, but it is an easy-to-remember scalar number that people can understand. Much of the circulating knowledge regarding this question comes from the 1990s and early 2000s, before the large-scale adoption of JavaScript in web pages for dynamically producing content and the proliferation of native mobile apps (e.g., the release of the iPhone in 2007). The Filecoin Foundation has generously funded ($75k) this year-long project, in

2021-10-20: Evolution of a Childhood Newspaper

Cover pages of Wijeya newspapers
When we (Gavindya and Yasith) were young adults, we read the Wijeya newspaper weekly back in Sri Lanka. Wijeya is a native-language (Sinhala) weekly newspaper in Sri Lanka, published by Wijeya Newspapers Ltd. Back in the day, the Wijeya newspaper was published only on paper. Today, however, most newspapers are also published online. We remember reading the Wijeya newspaper every week, whenever we had free time. Since our parents did not allow us to use computers much, we dedicated our leisure time mostly to reading. The Wijeya newspaper included sections for science, fiction, drawings and creative writing by children, news about schools, and general news (local and international). Sometimes the Wijeya newspaper was sold out, and our parents had to go to a different store to buy it for us. It was a very popular newspaper among young adults. We loved to read it mostly because of the presentation of information

2021-10-19: Proving I was an Etown ILA using the Internet Archive

To commemorate the 25th anniversary of the Internet Archive, I decided to use the Wayback Machine to dig through my past related to Elizabethtown College, PA, mainly when I was featured as an International Leadership Assistant (ILA) on their international student website. I graduated from Elizabethtown College (Etown) in 2018 with a Bachelor of Science in Computer Engineering. In 2016, I had the opportunity to join the International Leadership Team to help international students promote global culture through various activities. I organized Culture Through ART every Fall semester and assisted my colleague Alexandria Takahashi in developing US Culture & Slang teaching materials. These activities are a great example of how international students were not only expressing their culture through crafts (e.g., origami) and paintings but were also learning to adopt US culture (Figure 1). At present, as a Ph.D. student researching in the area of Digital Libraries, Web Crawling, Natural L

2021-10-14: ODU CS selected for national NASA@ My Library STEAM programming initiative

Dr. Poursardar, a lecturer in the ODU computer science department and a member of the Web Science and Digital Libraries Research Group (WS-DL), has been selected through a competitive application process to be part of NASA@ My Library, an education initiative created to increase and enhance STEAM (science, technology, engineering, arts, and math) learning opportunities for library patrons throughout the nation, including geographic areas and populations currently underrepresented in STEAM education. More details on this project are available on the Science-Technology Activities and Resources for Libraries (Star Net) website. ODU is one of just six universities nationwide to be selected to support this initiative, and we’re thrilled to be part of this project. We look forward to our students helping libraries bring STEAM concepts to younger patrons and exploring the universe together with people of all ages during their public programs in 2021 and 2022. As a NASA@ My Library Partner Uni

2021-10-14: Summer Internship at LANL - Memento Validator

Last summer, I had an opportunity to work as a research intern at the Los Alamos National Laboratory (LANL) with the Research and Prototyping (Proto) team of the LANL Research Library. The position was my first internship in the US, and I am extremely grateful for the opportunity. Due to the pandemic, I worked remotely from Norfolk. I worked on the Memento validator project, updating the Memento validation and compliance testing infrastructure under Dr. Lyudmila Balakireva.
Los Alamos National Laboratory (https://www.lanl.gov/)
About LANL
Los Alamos National Laboratory (LANL) is a Federally Funded Research and Development Center (FFRDC) that works to solve national security challenges through scientific research and development. The research areas of the laboratory include nuclear security, national defense, energy, counter-terrorism, and the environment. The laboratory hires approximately 2000 students each year to work on scientific research and development projects as a part of their summer i