Posts

2018-09-03: Let's compare memento damage measures!

It is always nice to get a Google Scholar alert that one of my papers has been cited. In this case, I learned that the paper "Reproducible Web Corpora: Interactive Archiving with Automatic Quality Assessment" (to appear in the ACM Journal of Data and Information Quality) cited a paper that I wrote during my doctoral studies with fellow PhD students Mat Kelly and Hany SalahEldeen and our advisors Michael Nelson and Michele Weigle. More specifically, the Reproducible Web Corpora paper (by Johannes Kiesel, Florian Kneist, Milad Alshomary, Benno Stein, Matthias Hagen, and Martin Potthast) is a very important and well-executed follow-on to our paper "Not All Mementos Are Created Equal: Measuring The Impact Of Missing Resources" (a best student paper from JCDL 2014 and subsequently published in the International Journal of Digital Libraries). In this blog post, I will be providing a quick recap and analysis of the Kiesel paper from the perspective

2018-09-02: Sampath Jayarathna (Assistant Professor, Computer Science)

I am really excited to be part of Old Dominion University and the WS-DL group. I joined the faculty at Old Dominion University in 2018. Before that, I was a tenure-track assistant professor for two years at California State Polytechnic University (Cal Poly Pomona). I am truly grateful to Frank Shipman, Oleg Komogortsev, Richard Furuta, Dilma Da Silva, and Cecelia Aragon for their help throughout this faculty search. It is sad to say goodbye to my colleagues at Cal Poly, but I am excited to have an amazing bunch of mentors and colleagues here at ODU: Michael, Michele, Nikos, Ravi, Jian, Cong, Shubham, Anne, and many more. It's truly amazing that I was able to team up and put together two NSF proposals (CRII and REU Site) within a short period of time. I received my Ph.D. in Computer Science from Texas A&M University in 2016, advised by Frank Shipman. I was a member of the Center for the Study of Digital Libraries (CSDL) group. In 2012, I did a 6-month internshi

2018-08-30: Excited to Join WS-DL group in ODU!

I am an outlier compared with most computer scientists because I spent 10 years in a field called "Astronomy and Astrophysics". Very few computer scientists have followed the same path as me, transferring from a seemingly unrelated major. But this is where my passion is, so I did it, and I made it! Right after I graduated with a PhD in 2011, I joined the CiteSeerX group directed by Dr. C. Lee Giles at IST, Penn State University. I worked as a DBA for web crawling at the beginning and soon became the tech leader of the search engine, and recently the Co-PI of an NSF-awarded proposal on CiteSeerX. I spent six years as a postdoc, an unusually long time, and was then promoted to teaching faculty. However, I kept moving on, because I wanted to do research! Luckily, Michael and Michele did not mind taking the risk and betting on me to be a tenure-track faculty member at Old Dominion University. So I accepted the offer and became a member of the Web Science and Digital Libraries group at ODU

2018-08-25: Four WS-DL Classes Offered for Fall 2018

Four WS-DL classes are offered for Fall 2018: CS 418/518 Web Programming is taught by Dr. Justin Brunelle, Tuesdays 4:20-7pm. This class teaches LAMP, the original web programming stack. Even if you end up using MEAN, you still need to know LAMP. CS 431/531 Web Server Design is taught by Dr. Michael L. Nelson, Wednesdays 4:20-7pm. This class teaches REST, the primary architectural style for web programming, via implementing a fully functional web server from scratch. CS 795/895 Intro to Data Science is taught by Dr. Sampath Jayarathna, Tuesdays & Thursdays, 5:45-7pm. This course will cover Python, machine learning, NumPy, pandas, and general data wrangling. CS 795/895 Mining Scholarly Big Data is taught by Dr. Jian Wu, Tuesdays & Thursdays, 9:30-10:45am. This course will cover machine learning, data mining, and deep learning, as applied to the corpus of scholarly communication (via Dr. Wu's involvement in the CiteSeerX project). Dr. Michele C.

2018-08-01: A Preview of MementoEmbed: Embeddable Surrogates for Archived Web Pages

As commonly seen on Facebook and Twitter, the social card is a type of surrogate that provides clues as to what is behind a URI. In this case, the URI is from Google and the social card makes it clear that the document behind this long URI is directions. As I described to the audience of Dodging the Memory Hole last year, surrogates provide the reader with some clue of what exists behind a URI. The social card is one type of surrogate. Above we see a comparison between a Google URI and a social card generated from that URI. Unless a reader understands the structure of all URIs at google.com, they will not know what the referenced content is about until they click on it. The social card, on the other hand, provides clues to the reader that the underlying URI provides directions from Old Dominion University to Los Alamos National Laboratory. Surrogates allow readers to pierce the veil of the URI's opaqueness. With the death of Storify, I've been examining alternativ

2018-07-22: Tic-Tac-Toe and Magic Square Made Me a Problem Solver and Programmer

"How did you learn programming?", a student asked me at a recent summer camp. Dr. Yaohang Li organized the Machine Learning and Data Science Summer Camp for high school students of the Hampton Roads metropolitan region at the Department of Computer Science, Old Dominion University, from June 25 to July 9, 2018. The camp was funded by the Virginia Space Grant Consortium. More than 30 students participated in it. They were introduced to a variety of topics, such as Data Structures, Statistics, Python, R, Machine Learning, Game Programming, Public Datasets, Web Archiving, and Docker, in the form of discussions, hands-on labs, and lectures by professors and graduate students. I was invited to give a lecture about my research and Docker. At the end of my talk I solicited questions and distributed Docker swag. The question "How did you learn programming?" led me to draw a Tic-Tac-Toe game and a 3x3 Magic Square on the whiteboard. Then I told them a more t

2018-07-18: HyperText and Social Media (HT) Trip Report

Leaping Tiger statue next to the College of Arts at Towson University

From July 9 - 12, the 2018 ACM Conference on Hypertext and Social Media (HT) took place at the College of Arts at Towson University in Baltimore, Maryland. Researchers from around the world presented the results of complete or ongoing work in tutorial, poster, and paper sessions. Also, during the conference I had the opportunity to present a full paper: "Bootstrapping Web Archive Collections from Social Media" on behalf of co-authors Dr. Michele Weigle and Dr. Michael Nelson.

Day 1 (July 9, 2018)

The first day of the conference was dedicated to a tutorial (Efficient Auto-generation of Taxonomies for Structured Knowledge Discovery and Organization) and three workshops: Human Factors in Hypertext (HUMAN); Opinion Mining, Summarization and Diversification; and Narrative and Hypertext. I attended the Opinion Mining, Summarization and Diversification workshop. The workshop started with