Posts

Showing posts from September, 2015

2015-09-30: Digital Preservation - Magdeburg Germany Trip Report

Dr. Herzog: This large green area on your left is Sanssouci Park. It has 11 palaces in it.
Yasmin: I want to visit this park after we are back from the university, can we?
Dr. Herzog: We sure can... I think we will be back before sunset.
Yasmin: I love beautiful things.
Dr. Herzog: Who doesn't?
Sawood: [Smiles]

The three souls were heading to the Hochschule Magdeburg-Stendal University from Potsdam, Germany in Dr. Michael Herzog's car for a lunch lecture on the topic of Digital Preservation. Yasmin and Sawood from the Web Science and Digital Libraries Research Group of Old Dominion University, Norfolk, Virginia were invited for the talk by Dr. Herzog at his SPiRIT Research Group. The two WS-DL members had presented their work at TPDL 2015 in Poznan, Poland; on their way back home they were halted and hosted by Dr. Herzog in Germany for the lunch lecture. You may also enjoy the TPDL 2015 trip report by Yasmin.


Passing by beautiful landscapes, crossing bridges and rivers, obse…

2015-09-28: TPDL 2015 in Poznan, Poland

On September 15, 2015, Sawood Alam and I (Yasmin AlNoamany) attended the 2015 Theory and Practice of Digital Libraries (TPDL) Conference in Poznan, Poland. This year, WS-DL had four accepted papers in TPDL from three students: Mohamed Aturban (who could not attend the conference because of visa issues), Sawood Alam, and Yasmin AlNoamany. Sawood and I arrived in Poznan on Monday, Sept. 14. Although we were tired from travel, we could not resist walking to the best area in Poznan, the old market square. It was fascinating to see those beautiful, colorful houses at night, glistening with reflections after the rain, while many street artists played beautiful European music.

The next morning we headed to the conference, which was held at the Poznań Supercomputing and Networking Center. The organization of the conference was amazing and the general conference co-chairs, Marcin Werla and Cezary Mazurek, were always there to answer our questions. Furthermore,…

2015-09-21: InfoVis Spring 2015 Class Projects

In Spring 2015, I taught Information Visualization (CS 725/825) for MS and PhD students.  This time we used Tamara Munzner's Visualization Analysis & Design textbook, which I highly recommend:
"This highly readable and well-organized book not only covers the fundamentals of visualization design, but also provides a solid framework for analyzing visualizations and visualization problems with concrete examples from the academic community. I am looking forward to teaching from this book and sharing it with my research group."
-- Michele C. Weigle, Old Dominion University

I also tried a flipped-classroom model, where students read and answer homework questions before class so that class time can focus on discussion, student presentations, and in-class exercises. It worked really well -- students liked the format, and I didn't have to convert a well-written textbook into PowerPoint slides.

Here I highlight a couple of student projects from that course.  (All class pro…

2015-09-10: CDXJ: An Object Resource Stream Serialization Format

I have been working on an IIPC-funded project of profiling various web archives to summarize their holdings. The idea is to generate statistical measures of the holdings of an archive under various lookup keys, where a key can be a partial URI such as Top Level Domain (TLD), registered domain name, entire domain name along with any number of sub-domain segments, domain name and a few segments from the path, a given time, a language, or a combination of two or more of these. Such a document (or archive profile) can be used to answer queries like "how many *.edu Mementos are there in a given archive?", "how many copies of the pages are there in an archive that fall under netpreserve.org/projects/*", or "number of copies of *.cnn.com/* pages of 2010 in Arabic language". The archive profile can also be used to determine the overlap between two archives or visualize their holdings in various ways. Early work of this research was presented at the Internet Archive
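To make the idea of lookup keys concrete, here is a minimal sketch in Python. It is not the project's actual profiler: the function names (`profile_keys`, `build_profile`), the reversed-host key layout, and the sample URIs are all assumptions made for illustration. It emits one key per host prefix (TLD first) plus a configurable number of path segments, and does not try to detect registered domains, time, or language.

```python
from collections import Counter
from urllib.parse import urlsplit


def profile_keys(uri, path_depth=1):
    """Return lookup keys for a URI, from least to most specific.

    Key layout (an assumption for this sketch): host labels reversed
    and comma-joined, then ")/", then optional path segments.
    """
    parts = urlsplit(uri)
    host = (parts.hostname or "").lower()
    if not host:
        return []
    labels = host.split(".")[::-1]  # "cs.odu.edu" -> ["edu", "odu", "cs"]

    # One key per host prefix: TLD, domain, each sub-domain level.
    keys = [",".join(labels[:i]) + ")/" for i in range(1, len(labels) + 1)]

    # Keys that also include the first few path segments.
    segments = [s for s in parts.path.split("/") if s]
    full_host = ",".join(labels) + ")/"
    for depth in range(1, min(path_depth, len(segments)) + 1):
        keys.append(full_host + "/".join(segments[:depth]))
    return keys


def build_profile(uris, path_depth=1):
    """Count how many of the given URIs fall under each lookup key."""
    profile = Counter()
    for uri in uris:
        profile.update(profile_keys(uri, path_depth))
    return profile


# Hypothetical holdings of a tiny archive.
sample_uris = [
    "http://www.odu.edu/",
    "http://cs.odu.edu/~salam/",
    "http://netpreserve.org/projects/a",
    "http://netpreserve.org/projects/b",
]
profile = build_profile(sample_uris)
```

With such a profile, a query like "how many *.edu Mementos are there?" becomes a single lookup under the TLD key, without scanning the archive's full index.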

2015-09-08: Releasing an Open Source Python Project, the Services That Brought py-memento-client to Life

The LANL Library Prototyping Team recently received correspondence from a member of the Wikipedia team requesting Python code that could find the best URI-M for an archived web page based on the date of the page revision. Collaborating with Wikipedia, Harihar Shankar, Herbert Van de Sompel, Michael Nelson, and I were able to create the py-memento-client Python library to suit the needs of pywikibot.
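As a rough illustration of what "best URI-M for a given date" can mean, the sketch below picks the capture closest in time to a target datetime from an already-known list. This is not py-memento-client's code or API: the function name `closest_memento`, the "closest in time" reading of "best", and the example.org URIs are assumptions made for this example.

```python
from datetime import datetime


def closest_memento(mementos, target):
    """Return the (datetime, uri_m) pair whose datetime is nearest to target.

    `mementos` is a non-empty list of (datetime, uri_m) tuples.
    """
    return min(mementos, key=lambda m: abs(m[0] - target))


# Hypothetical captures of one page; these URIs are placeholders.
captures = [
    (datetime(2014, 1, 5), "http://archive.example.org/20140105/http://example.org/"),
    (datetime(2015, 3, 20), "http://archive.example.org/20150320/http://example.org/"),
    (datetime(2015, 9, 1), "http://archive.example.org/20150901/http://example.org/"),
]

revision_date = datetime(2015, 4, 2)  # date of the page revision
best_dt, best_uri = closest_memento(captures, revision_date)
```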

Over the course of library development, Wikipedia suggested the use of two services, Travis CI and PyPI, that we had not used before.  We were very pleased with the results of those services and learned quite a bit from the experience.  We have been using GitHub for years, and also include it here as part of the development toolchain for this Python project.

We present three online services that solved the following problems for our Python library:
Where do we store source code and documentation for the long term? - GitHub
How do we ensure the project is well tested in an independent environ…