The Web Science and Digital Libraries Research Group had a productive 2016, with two Ph.D. and one M.S. students graduating, one large research grant awarded ($830k), 16 publications, and 15 trips to conferences, workshops, hackathons, etc.
For student graduations, we had:
- Justin Brunelle defended his Ph.D. dissertation on February 5, 2016. Justin already had a full-time position at MITRE, but not coincidentally he had his choice of significant promotions at the conclusion of his Ph.D.
- Yasmin AlNoamany defended her Ph.D. dissertation on June 16, 2016. Yasmin had several opportunities, and eventually decided on a postdoc fellow position in Software Curation at UC Berkeley, with Dr. Erik Mitchell.
- Greg Szalkowski completed his M.S. in 2016 as well. We had hoped to keep him on for a Ph.D., but he's having too much fun traveling the world setting up military communications solutions.
Other student milestones:
- Shawn Jones passed his breadth exam.
- Alexander Nwala passed his breadth exam.
- Lulwah Alkwai passed her research ability exam.
- Erika Siregar, a new M.S. student, joined us via the Fulbright Scholar Program.
Our publications included:
- Shawn's paper "Avoiding spoilers: wiki time travel with Sheldon Cooper", based on his M.S. thesis research, was finally published in IJDL.
- Also in IJDL were expanded versions of three papers from TPDL 2015: Sawood Alam's paper "Web Archive Profiling Through CDX Summarization", and two from Yasmin: "Characteristics of social media stories: What makes a good story?" and "Detecting Off-Topic Pages in Web Archives".
- Mat Kelly and Sawood each had a paper at TPDL 2016.
- Alexander, Sawood, and Mat had three posters at JCDL 2016 (we did not have full paper submissions since Michele Weigle was a PC co-chair). Alexander's JCDL poster was also released as an expanded tech report.
- Justin had a paper in D-Lib Magazine about archiving a corporate intranet.
- Shawn's position at LANL was productive, with a paper in PLoS ONE about "Scholarly Context Adrift", a poster at WWW 2016 about linking (or not) to DOIs, and a tech report about extracting HTML from archived web pages, which informed several blog post proposals about formal definitions for getting at raw archived content.
- Michele had a poster at IEEE Vis 2016 based on a project from her CS 725/825 class.
- I had a tech report with David Rosenthal and Herbert Van de Sompel about various techniques for archiving journals, including Signposting.
In late April, we had Herbert, Harish Shankar, and Shawn Jones visit from LANL. Herbert has been here many times, but this was the first visit to Norfolk for Harish. It was also during this visit that Shawn took his breadth exam.
.@WebSciDL on our way to #jcdl2016 @machawk1 @weiglemc @phonedude_mln @acnwala @ibnesayeed @jcdl2016 pic.twitter.com/xI7zeDQeAd— Michael L. Nelson (@phonedude_mln) June 18, 2016
@elunca @WebSciDL @machawk1 @weiglemc @acnwala @ibnesayeed @jcdl2016 yes! Took the ferry to make it longer! ;-) pic.twitter.com/wbJENY334O— Michael L. Nelson (@phonedude_mln) June 18, 2016
In addition to the fun road trip to JCDL 2016 in New Jersey (which included beers on the Cape May-Lewes Ferry!), our group's travel included:
- Justin attended the Federal Cloud Computing Summit in January.
- Sawood, Mat, and Alexander attended the Archives Unleashed Hackathon in March at the University of Toronto.
- In April I went to both the CNI Spring Meeting in San Antonio and the IIPC General Assembly in Reykjavik.
- April also saw Shawn attend the WWW Conference in Montreal.
- In May, Erika attended the Fulbright Enrichment Seminar in Pittsburgh.
- From June through August, Alexander was at Harvard University for a summer fellowship at the Library Innovation Lab, where he worked on the Local Memory Project.
- As mentioned above, in June Mat, Sawood, Alexander, Michele, and I all went to JCDL 2016 (also the Web Archiving & Digital Libraries Workshop, and the JCDL Doctoral Consortium).
- Right before JCDL 2016, Mat, Sawood, Alexander, Shawn, John Berlin, and Mohamed Aturban attended the second Archives Unleashed Hackathon in Washington DC. While everyone else left for JCDL in New Jersey, Shawn stuck around for the Saving the Web Symposium at the Library of Congress right after the Hackathon.
- In August I attended the Documenting the Now Advisory Board meeting in St. Louis.
- Mat attended the IIPC Building Better Crawlers Hackathon in London in September.
- In October, Mat, Shawn, John, and I attended the Dodging the Memory Hole meeting at UCLA.
- Sawood went to TPDL 2016 in Hannover in October, and also visited with Dr. Michael Herzog after the conference.
- Michele attended part of IEEE Vis 2016 in Baltimore in October.
- And somehow, I don't think anyone traveled in November or December!
For the 20th Anniversary of the Internet Archive, we celebrated locally with tacos, DJ Spooky CDs, and a series of tweets & blog posts about the cultural impact and importance of web archiving. This was in solidarity with the Internet Archive's gala, which featured taco trucks and a lecture & commissioned piece by Paul Miller (aka DJ Spooky). We write plenty of papers, blog posts, etc. about technical issues and the mechanics of web archiving, but I'm especially proud of how we were able to assemble a wide array of personal stories about the social impact of web archiving. I encourage you to take the time to go through these posts:
.@WebSciDL celebrates 20 years of #webarchiving & @internetarchive w tacos and @djspooky CDs! #IA20 pic.twitter.com/AFb3qUiuzz— Michael L. Nelson (@phonedude_mln) October 26, 2016
We had only one popular press story about our research this year, with Tech.Co's "You Can’t Trust the Internet to Continue Existing" citing Hany SalahEldeen's 2012 TPDL paper about the rate of loss of resources shared via Twitter.
We released several software packages and data sets in 2016:
- Mat and Sawood continue to develop IPWB (InterPlanetary Wayback), which integrates Wayback-style web archive replay with the InterPlanetary File System (IPFS).
- Yasmin released her Off-Topic Detection code, as well as the story data sets used in her dissertation.
- Zetan Li provided a much-needed update for the Carbon Date software.
- Alexander released the first version of code for the Local Memory Project (from his time @ Harvard).
- Sawood released MemGator, which allows one to set up their own Memento Aggregator.
The research grant awarded this year addresses several areas: Signposting, automated assessment of web archiving quality, verification of archival integrity, and automating the archiving of non-journal scholarly output. We will soon be releasing several research outputs as a result of this grant.
WS-DL reviews are also available for 2015, 2014, and 2013. We're happy to have graduated Greg, Yasmin, and Justin, and we're hoping that we can get Erika back for a Ph.D. after her M.S. is completed. I'll close with celebratory images of me (one dignified, one less so...) with Dr. AlNoamany and Dr. Brunelle; may 2017 bring similarly joyous and proud moments.
.@yasmina_anwar returns from @UCBIDS for grad ceremony & avatar advancement @WebSciDL @oducs https://t.co/Vo2NH3iTpI pic.twitter.com/o02e6nlDc7— Michael L. Nelson (@phonedude_mln) December 17, 2016