Thursday, November 15, 2018

2018-11-15: LANL Internship Report

Los Alamos National Laboratory
On May 27 I landed in sunny Santa Fe, New Mexico, to start my six-month internship at Los Alamos National Laboratory (LANL) with the Digital Library Research and Prototyping Team under the guidance of Herbert Van de Sompel and WS-DL alumnus Martin Klein.

Work Accomplished

I spent the majority of my time working on the Scholarly Orphans project, a joint project between LANL and ODU sponsored by the Andrew W. Mellon Foundation. The project explores, from an institutional perspective, how an institution can discover, capture, and archive the scholarly artifacts that its researchers deposit in various productivity portals. After months of work on the project, Martin Klein showcased the Scholarly Orphans pipeline at TPDL 2018.

Scholarly Orphans pipeline diagram

My main task for this pipeline was to create and manage two components: the artifact tracker and the pipeline orchestrator. The components communicated using ActivityStreams 2.0 (AS2) messages, sent to and received from Linked Data Notifications (LDN) inboxes. AS2 messages describe events that users have performed in a "human friendly but machine-processable" JSON format. LDN inboxes provide endpoints at which messages can be received, and these endpoints are advertised via Link headers. Applications (senders) can discover these endpoints and send messages to them (receivers). In this case each component was both a sender and a receiver. For example, the orchestrator sent an AS2 message to the tracker component's inbox to start tracking a user across a list of portals; the tracker then responded by sending an AS2 message with its results to the orchestrator's inbox, where it was saved in a database.
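To make the sender side of LDN a little more concrete, here is a minimal Python sketch of the two steps a sender performs: finding the inbox advertised in a Link header, then preparing a POST of a JSON-LD message to it. This is not the code used in the Scholarly Orphans pipeline; the function names and the example.org URLs are hypothetical, and the request is only constructed, not sent.

```python
import json
import re
import urllib.request
from typing import Optional

# Link relation that LDN receivers use to advertise their inbox.
LDN_INBOX_REL = "http://www.w3.org/ns/ldp#inbox"


def discover_inbox(link_header: str) -> Optional[str]:
    """Return the inbox URL advertised in a Link header value, if any."""
    # Each link looks like: <target>; rel="relation"; other=params
    for match in re.finditer(r'<([^>]*)>([^<]*)', link_header):
        target, params = match.group(1), match.group(2)
        rel = re.search(r'\brel\s*=\s*(?:"([^"]*)"|([^\s;,]+))', params)
        if rel is None:
            continue
        # A rel value may hold several space-separated relations.
        relations = (rel.group(1) or rel.group(2) or "").split()
        if LDN_INBOX_REL in relations:
            return target
    return None


def build_notification(inbox_url: str, message: dict) -> urllib.request.Request:
    """Prepare (but do not send) a POST of a JSON-LD message to an inbox."""
    body = json.dumps(message).encode("utf-8")
    return urllib.request.Request(
        inbox_url,
        data=body,
        method="POST",
        headers={"Content-Type": "application/ld+json"},
    )


# Hypothetical Link header, as a receiver might return it.
header = (
    '<https://tracker.example.org/inbox/>; '
    'rel="http://www.w3.org/ns/ldp#inbox", '
    '<https://tracker.example.org/about>; rel="describedby"'
)
inbox = discover_inbox(header)
request = build_notification(inbox, {"type": "Note", "content": "hello"})
```

Passing the prepared request to urllib.request.urlopen would deliver the message; that step is left out here so the sketch runs without a network.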

This pipeline was designed as a distributed network in which the orchestrator knows the location of each component's inbox before sending messages. The orchestrator tells the tracker, capture, and archiver components where to send their AS2 messages and where the AS2 event messages they generate will be accessible. For example, an AS2 message from the orchestrator to the tracker contains an event object with a "to" endpoint telling the tracker where to send its message and a "tracker:eventBaseUrl" to which the tracker appends a UUID, forming the URL where its generated event will be accessible. After the tracker has found events for the user, it generates a new AS2 message and sends it to the orchestrator's "to" endpoint.
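The exchange described above can be sketched roughly as follows. Only the "to" and "tracker:eventBaseUrl" keys come from the description; everything else (the "event" wrapper key, the activity types, the "tracker" namespace URL, the function names, and the example.org addresses) is an assumption made for illustration and may differ from the real messages.

```python
import uuid

AS2_CONTEXT = "https://www.w3.org/ns/activitystreams"


def orchestrator_request(orchestrator_inbox: str, event_base_url: str,
                         user_id: str, portals: list) -> dict:
    """Build a hypothetical orchestrator-to-tracker AS2 message."""
    return {
        # The "tracker" prefix URL below is a placeholder, not a real vocabulary.
        "@context": [AS2_CONTEXT, {"tracker": "https://example.org/ns/tracker#"}],
        "type": "Offer",  # assumed activity type
        "object": {"id": user_id, "portals": portals},
        "event": {
            # Where the tracker should send its result message.
            "to": orchestrator_inbox,
            # Base URL to which the tracker appends a UUID.
            "tracker:eventBaseUrl": event_base_url,
        },
    }


def tracker_reply(request: dict, results: list) -> tuple:
    """Return (destination inbox, AS2 result message) for a request."""
    event = request["event"]
    base = event["tracker:eventBaseUrl"].rstrip("/")
    event_url = f"{base}/{uuid.uuid4()}"  # where this event will be accessible
    reply = {
        "@context": AS2_CONTEXT,
        "type": "Announce",  # assumed activity type
        "id": event_url,
        "to": event["to"],
        "object": {
            "type": "Collection",
            "totalItems": len(results),
            "items": results,
        },
    }
    return event["to"], reply


request = orchestrator_request(
    "https://orchestrator.example.org/inbox/",
    "https://tracker.example.org/event/",
    "https://example.org/researcher/1",
    ["github", "slideshare"],
)
destination, reply = tracker_reply(request, [{"type": "Link", "href": "https://example.org/artifact"}])
```

Because the orchestrator supplies both the reply address and the event base URL in every request, a tracker written this way needs no configuration that points back at the orchestrator.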

Building the tracker and orchestrator components taught me a great deal about W3C Web standards, mostly those dealing with the Semantic Web. My work also required me to learn various technologies, including Elasticsearch as a database, Celery for task scheduling, Docker Compose in a production environment, Flask and uWSGI as a Python web server, and OAI-PMH interfaces.

I was also exposed to technologies the Prototyping Team had developed previously and incorporated them into various components of the Scholarly Orphans pipeline. These included Memento, Memento Tracer, Robust Links, and Signposting.

The prototype interface of the Scholarly Orphans project is hosted online for a limited time. On the website you can see the various steps of the pipeline, the AS2 event messages, the WARCs generated by the capture process, and the replay of the WARCs via the archiver process for each of the researcher's productivity portal events. The tracker component of the Scholarly Orphans pipeline was made available on GitHub.

New Mexico Lifestyle


During my internship I lived in a house in Los Alamos shared by multiple Ph.D. students studying in diverse fields such as Computer Vision, Nuclear Engineering, Computer Science, and Biology. The views of the mountains were always amazing and only ever accompanied by rain during the monsoon season. A surprising discovery during the summer was that there always seemed to be a forest fire somewhere in New Mexico.
Los Alamos, NM


During my stay and adventures I discovered the level of spiciness that apparently every New Mexican has become accustomed to, thanks to adding the local green chile to practically every meal.


Within the first two weeks of landing I had already planned a trip to southern New Mexico. Visiting Roswell, NM, I discovered that aliens were very real.
Roswell, NM International UFO Museum
Going further south I got to visit Carlsbad, NM, the home of Carlsbad Caverns, which were truly incredible.
Carlsbad, NM Carlsbad Caverns
I was able to visit Colorado for a few days and went on a few great hikes. On August 11, I caught the Rockies vs. Dodgers MLB game, where I saw a walk-off home run by the Rockies for the first time.

I also managed a weekend road trip to Zion Canyon, Utah, where I hiked some great trails, including Observation Point Trail, The Narrows, and Emerald Pools.
Zion Canyon, Utah - Observation Point Trail


If you're a visiting researcher not hired by the lab, consider living in a shared home with other students. This can help relieve boredom and help you find people to plan trips with. Otherwise, you will usually be excluded from the events the lab plans for other students.

If you're staying in Los Alamos, plan to make weekend trips out to Santa Fe. Los Alamos is beautiful and has some great hikes, but it is often short on entertainment.

Final Impressions

I feel very blessed to have been offered this six-month internship. At first I was reluctant to move out West; however, it allowed me to travel to many great locations with new friends. My internship exposed me to various subjects relating to WS-DL research, which will surely improve, expand, and influence my own research in the future.

A special thanks to Herbert Van de Sompel, Martin Klein, Harihar Shankar, and Lyudmila Balakireva for allowing me to collaborate with, contribute to, and learn from this fantastic team during my stay at LANL.

--Grant Atkins (@grantcatkins)
