Posts

2014-07-14: The Archival Acid Test: Evaluating Archive Performance on Advanced HTML and JavaScript

A large part of digital preservation is crawling pages on the live Web and saving them in a format that future generations can view. To accomplish this, web archivists use various crawlers, tools, and bits of software, often purpose-built. Because these tools are built for the task, users expect them to perform much better than a general-purpose tool. As anyone who has looked up a complex web page in The Archive can tell you, the more complex the page, the less likely it is that all of the resources needed to replay the page will be captured. Even when these pages are preserved, the replay experience is frequently inconsistent with the page on the live web. We have started building a preliminary corpus of tests to evaluate a handful of tools and web sites that were created specifically to save web pages from being lost in time. In homage to the web browser evaluation websites by the Web Standards Project, we have created The Archival Acid Test as a first step in ensuring

2014-07-10: Federal Cloud Computing Summit

As mentioned in my previous post, I attended the Federal Cloud Computing Summit on July 8th and 9th at the Ronald Reagan Building in Washington, D.C. I helped the host organization, the Advanced Technology Academic Research Center (ATARC), organize and run the MITRE-ATARC Collaboration Sessions that kicked off the event on July 8th. The summit is designed to allow Government representatives to meet and collaborate with industry, academic, and other Government cloud computing practitioners on the current challenges in cloud computing. A FedRAMP primer was held at 10:00 AM on July 8th in a Government-only session. At its conclusion, we began the MITRE-ATARC Collaboration Sessions, which focused on Cloud Computing in Austere Environments, Cloud Computing for the Mobile Worker, Security as a Service, and the Impact of Cloud Computing on the Enterprise. Because participants are protected by the Chatham House Rule, I cannot elaborate on the Government representation or discussions in the col

2014-07-08: Presenting WS-DL Research to PES University Undergrads

On July 7th and 8th, 2014, Hany SalahEldeen and I (Mat Kelly, @machawk1) were given the opportunity to present our PhD research to visiting undergraduate seniors from a leading university in Bangalore, India (PES University). About thirty students attended each session and showed their interest in the topics by asking many relevant questions. Prior to the ODU CS students' presentations, Dr. Michele C. Weigle (@weiglemc) gave the students an overview of some of WS-DL's research topics with her presentation Bits of Research. In her presentation she covered our lab's foundational work, recent work, and some outstanding research questions, as well as some potential projects to entice interested students to work with our research group. Between Hany and me, I (Mat Kelly) presented a fairly high level yet technical overview

2014-07-08: Potential MediaWiki Web Time Travel for Wayback Machine Visitors

Over the past year, I've been working on the Memento MediaWiki Extension. In addition to trying to produce a decent product, we've also been trying to build support for the Memento MediaWiki Extension at WikiConference USA 2014. Recently, we've reached out via Twitter to raise awareness and find additional supporters. To that end, we attempt to answer two questions: (1) How effective is the Wayback Machine at ensuring that visitors gain access to pages close to the datetimes they desire? (2) How many visitors of the Wayback Machine could benefit from the use of the Memento MediaWiki Extension? The Memento extension provides the ability to access the page revision closest to, but not later than, the datetime specified by the user. As mentioned in an earlier blog post, the Internet Archive only has access to the revisions of articles that existed at the time it crawled, but a wiki can access every revision. Answering the first question shows why the Wayback Machine is not a
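
The selection rule mentioned in this excerpt (pick the revision closest to, but not after, the requested datetime) can be sketched in Python. This is only an illustration of the rule, not the extension's actual code; the function name closest_not_after and the sample revision history are hypothetical.

```python
from bisect import bisect_right
from datetime import datetime
from typing import Optional, Sequence


def closest_not_after(revisions: Sequence[datetime],
                      requested: datetime) -> Optional[datetime]:
    """Return the latest revision timestamp <= requested, or None if none exists.

    `revisions` must be sorted in ascending order.
    """
    # bisect_right gives the position just past any timestamp equal to `requested`,
    # so the element before that position is the newest one not after it.
    index = bisect_right(revisions, requested)
    return revisions[index - 1] if index > 0 else None


# Hypothetical revision history for a single wiki page.
history = [
    datetime(2014, 1, 5),
    datetime(2014, 3, 20),
    datetime(2014, 6, 1),
]

print(closest_not_after(history, datetime(2014, 4, 1)))  # 2014-03-20 00:00:00
```

A crawl-based archive can only choose among the snapshots it happened to capture, whereas a wiki can apply this rule to its complete revision list.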

2014-07-07: InfoVis Fall 2012 Class Projects

(Note: This is continuing a series of posts about visualizations created either by students in our research group or in our classes.) I've been teaching the graduate Information Visualization course since Fall 2011. In this series of posts, I'm highlighting a few of the projects from each course offering. (Previous post: Fall 2011) The Fall 2012 projects were based on the 2012 ODU Infographics Contest. Participants were tasked with visualizing the history and trajectory of work done in the area of quantum sensing. (All class projects are listed in my InfoVis Gallery.) Top Quantum Sensing Trends, created by Wayne Stilwell: This project (currently available at https://ws-dl.cs.odu.edu/vis/quantum-stilwell/) is a visualization for displaying the history and trajectory of quantum sensing. History is shown as a year-by-year slideshow. The most publicized quantum sensing areas for the selected year are displayed. Clicking on a topic shows the number of publication

2014-07-02 LaTeX References, and how to control them

With just a little abuse: "Which way did they go? How many were there? I must find my references; For I am their master." LaTeX references are wonderful things. In this short epistle, we will explore some of the interesting things that you can do with them, problems that can arise from misusing them, problems that can arise from not using them, and finally how to spice them up just a little. First we will set up a conceptual model based on the LaTeX file (Listing 1), the make file (Listing 2), and some auxiliary files that LaTeX creates. To begin, copy the LaTeX file and the make file to a convenient directory. Create references.pdf from the command line by executing make. You should get a sample PDF like the one in the image (a sample page with reference problems). Now that we have something to look at, we can construct the conceptual model. Opening the references.aux and searching for the lines that begin with the \newlabel token, and comparing tha
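
As a rough aid for the step of searching references.aux for \newlabel lines, the Python sketch below pulls those entries out of the file's text. In plain LaTeX each entry has the form \newlabel{key}{{number}{page}}. The post's own listings are not reproduced here, so the sample contents and label names are invented for illustration, and read_labels is a hypothetical helper.

```python
import re
from typing import Dict, Tuple

# Matches the plain LaTeX form \newlabel{key}{{number}{page}}.
# Packages such as hyperref append extra groups, which this sketch ignores.
NEWLABEL = re.compile(r"\\newlabel\{([^}]*)\}\{\{([^}]*)\}\{([^}]*)\}")


def read_labels(aux_text: str) -> Dict[str, Tuple[str, str]]:
    """Map each label key to (reference number, page number)."""
    labels = {}
    for line in aux_text.splitlines():
        match = NEWLABEL.match(line)
        if match:
            key, number, page = match.groups()
            labels[key] = (number, page)
    return labels


# Hypothetical contents of references.aux; the label names are invented.
sample_aux = "\n".join([
    r"\relax",
    r"\newlabel{sec:intro}{{1}{1}}",
    r"\newlabel{fig:sample}{{2}{3}}",
])

print(read_labels(sample_aux))
```

Each key found this way corresponds to a \label in the source file, and the stored number and page are what \ref and \pageref print on the next LaTeX run.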

2014-07-02 An ode to the "Margin Police," or how I learned to love LaTeX margins

To the great Margin Police: "You lay down rules for all that approach you, One and a half on the left-hand edge, One on all the other edges, Page numbers one half down from the top. These are your words. And we are grateful for your guidance and direction. Lo, you lead us in the ways of professionalism and consistency. We, the unwashed, are grateful." But I have one question: Why doesn't the LaTeX style file help me achieve these goals? And so the exploration begins. Sometimes we use LaTeX to write and submit papers and reports for publication. Often the publishers provide a style file for us to use that dictates things like margins, number of columns per page, headers, footers, and other formatting directives. Other times, guidance comes from "instructions to authors" and we are expected to meet the requirements. What follows is how to see what the current margins are, how to set the margins, and how to see if your docum

2014-06-26: InfoVis Fall 2011 Class Projects

(Note: This is continuing a series of posts about visualizations created either by students in our research group or in our classes.) I've been teaching the graduate Information Visualization course (then CS 795/895, now CS 725/825) since Fall 2011. Each semester, I assign an open-ended final project that asks students to create an interactive visualization of something they find interesting. Here's an example of the project assignment. In this series of blog posts, I want to highlight a few of the projects from each course offering. Some of these projects are still active and available for use, while others became inactive after their creators graduated. The following projects are from the Fall 2011 semester. Both Sawood and Corren are PhD students in our group. Another nice project from this semester was done by our PhD student Yasmin AlNoamany and MS alum Kalpesh Padia. The project led directly to Kalpesh's MS Thesis, which has its own blog post. (All class

2014-06-23: Federal Big Data Summit

On June 19th and 20th, I attended the Federal Big Data Summit at the Ronald Reagan Building in the heart of Washington, D.C. The summit is hosted by the Advanced Technology Academic Research Center (ATARC). I participated as an employee of the MITRE Corporation; we help ATARC organize a series of collaboration sessions that are designed to help identify and make recommendations for solutions to big challenges in the federal government. I led a collaboration session between government, industry, and academic representatives on Big Data Analytics and Applications. The goal of the session was to facilitate discussions between the participants regarding the application of big data in the government and preparation for the continued growth in importance of big data. The targeted topics included access to data in disconnected environments, interoperability between data providers, parallel processing (e.g., MapReduce), and moving from data to decision in an optimal fashion. Due

2014-06-18: Google and JavaScript

In this blog post, we detail three short tests in which we challenge the Google crawler's ability to index JavaScript-dependent representations. After an introduction to the problem space, we describe our three tests as introduced below. (1) String and DOM modification: we modify a string and insert it into the DOM. Without the ability to execute JavaScript on the client, the string will not be indexed by the Google crawler. (2) Anchor Tag Translation: we decode an encoded URI and add it to the DOM using JavaScript. The Google crawler should index the decoded URI after discovering it from the JavaScript-dependent representation. (3) Redirection via JavaScript: we use JavaScript to build a URI and redirect the browser to the newly built URI. The Google crawler should be able to index the resource to which JavaScript redirects. Introduction: JavaScript continues to create challenges for web crawlers run by web archives and search engines. To summarize the problem, our web brows
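
To make the three tests concrete, here is a small Python stand-in for the kind of run-time work the test pages' JavaScript performs. It is a sketch under assumptions, not the code used in the post: the function names, the example.com URIs, and the choice of percent-encoding for the "encoded URI" are all hypothetical.

```python
from urllib.parse import unquote, urljoin


def build_string(prefix: str, suffix: str) -> str:
    """Test 1: the final string exists only after the script has run."""
    return prefix + suffix


def decode_link(encoded_uri: str) -> str:
    """Test 2: the link target is stored encoded and decoded at run time.

    Percent-encoding is an assumption; the post does not say which encoding it used.
    """
    return unquote(encoded_uri)


def build_redirect(base: str, path: str) -> str:
    """Test 3: the redirect target is assembled from parts at run time."""
    return urljoin(base, path)


# Hypothetical inputs as they might appear in the static page source.
static_source = 'var a = "archival"; var b = "-acid-test"; var l = "http%3A%2F%2Fexample.com%2Ftarget.html";'

indexed_string = build_string("archival", "-acid-test")
decoded_link = decode_link("http%3A%2F%2Fexample.com%2Ftarget.html")
redirect_target = build_redirect("http://example.com/tests/", "redirected.html")

# A crawler that reads only the static source never sees these run-time values.
for value in (indexed_string, decoded_link, redirect_target):
    print(value, value in static_source)
```

Each printed line ends in False, because none of the computed values appears verbatim in the static source. That gap is what the three tests probe.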

2014-06-18: Navy Hearing Conservation Program Visualizations

(Note: This is the first in a series of posts about visualizations created either by students in our research group or in our classes.) The US Navy runs a Hearing Conservation Program (HCP), which aims to protect the hearing of service members and prevent hearing loss. Persons who are exposed to noise levels in the range 85-100 dB are in the program and have their hearing regularly tested. During the audiogram, a beep is sounded at different frequencies with increasing volume. The person being tested raises their hand when they hear the beep, and the frequency and volume (in dBA) are recorded. A higher volume value means worse hearing (i.e., the beep had to be louder before it was audible). Not only are people in the HCP regularly tested, but they are also provided hearing protection to help prevent hearing loss. The audiogram data includes information about the job the person currently holds as well as whether they are using hearing protection. Researchers are interested in studying Noi

2014-06-02: WikiConference USA 2014 Trip Report

Amid the smell of coffee and bagels, the crowd quieted down to listen to the opening by Jennifer Baek, who, in addition to getting us energized, also paused to recognize Adrianne Wadewitz and Cynthia Ashley-Nelson, two Wikipedians who had recently passed after contributing greatly to the Wikimedia movement. The mood became more uplifting as Sumana Harihareswara began her keynote, discussing the wonders of contributing knowledge and her experience with the Ada Initiative, Geek Feminism, and Hacker School. She detailed how the Wikimedia culture can learn from the experiences at Hacker School, discussing different methods of learning and how these methods allow all of us to nurture learning in a group. She went on to discuss the difference between liberty and hospitality, and the importance of both to community, detailing how the group must ensure that individuals do not feel marginalized due to their gender or ethnicity, but also detailing how good hospitality engend