Posts

Showing posts from July, 2018

2018-07-22: Tic-Tac-Toe and Magic Square Made Me a Problem Solver and Programmer

"How did you learn programming?" a student asked me at a recent summer camp. Dr. Yaohang Li organized the Machine Learning and Data Science Summer Camp for high school students of the Hampton Roads metropolitan region at the Department of Computer Science, Old Dominion University, from June 25 to July 9, 2018. The camp was funded by the Virginia Space Grant Consortium. More than 30 students participated. They were introduced to a variety of topics, including Data Structures, Statistics, Python, R, Machine Learning, Game Programming, Public Datasets, Web Archiving, and Docker, through discussions, hands-on labs, and lectures by professors and graduate students. I was invited to give a lecture about my research and Docker. At the end of my talk, I solicited questions and distributed Docker swag.

The question "How did you learn programming?" led me to draw a Tic-Tac-Toe game and a 3x3 Magic Square on the whiteboard. Then I told them a more than a decade o…

2018-07-18: HyperText and Social Media (HT) Trip Report

From July 9 - 12, the 2018 ACM Conference on Hypertext and Social Media (HT) took place at the College of Arts at Towson University in Baltimore, Maryland. Researchers from around the world presented the results of complete or ongoing work in tutorial, poster, and paper sessions. Also, during the conference I had the opportunity to present a full paper: "Bootstrapping Web Archive Collections from Social Media" on behalf of co-authors Dr. Michele Weigle and Dr. Michael Nelson.

Day 1 (July 9, 2018)
The first day of the conference was dedicated to a tutorial (Efficient Auto-generation of Taxonomies for Structured Knowledge Discovery and Organization) and three workshops:
1. Human Factors in Hypertext (HUMAN)
2. Opinion Mining, Summarization and Diversification
3. Narrative and Hypertext

I attended the Opinion Mining, Summarization and Diversification workshop. The workshop started with a talk titled: "On Reviews, Ratings and Collaborative Filtering," presented by Dr. Oren Sar Sh…

2018-07-18: Why We Need Private Web Archives: Almost Two-Thirds of Web Traffic IS NOT Publicly Archivable

In terms of the ability to be archived in public web archives, web pages fall into one of two categories: publicly archivable, or not publicly archivable.
1. Publicly Archivable Web Pages: These pages are archivable by public archives. The pages can be accessed without login/authentication. In other words, these pages do not reside behind a paywall. Grant Atkins examined paywalls in the Internet Archive for news sites and found that web pages behind paywalls may actually be redirecting to a login page at crawl time. A good example of a publicly archivable page is Dr. Steven Zeil's page, since no authentication is required to view it. Furthermore, it does not use client-side scripts (e.g., Ajax) to load additional content, so what you see in the web browser and what you can replay from public web archives are exactly the same.

Some web pages provide "personalized" content depending on the GeoIP of the requester. In these cases, what you see in the browser and what …