2018-01-07: Review of WS-DL's 2017
Great writeup of #jcdl2017 in Toronto by @acnwala, featuring @oducs @WebSciDL (2 faculty, 2 alums, 3 grad students) https://t.co/wSirB8Jhq9 pic.twitter.com/HM0XePiz8u— ODU Computer Science (@oducs) July 28, 2017
The Web Science and Digital Libraries Research Group had a steady 2017, with one MS student graduated, one research grant awarded ($75k), 10 publications, and 15 trips to conferences, workshops, hackathons, internships, etc. In the last four years (2016--2013) we have graduated five PhD and three MS students, so the focus for this year was "recruiting" and we did pick up seven new students: three PhD and four MS. We had so many new and prospective students that Dr. Weigle and I created a new CS 891 web archiving seminar.
.@WebSciDL luncheon, joint w/ Dr Li's group and several prospective students. pic.twitter.com/uvgDpHmPWc— Michael L. Nelson (@phonedude_mln) February 10, 2017
- Erika Siregar completed her MS, and her project was on the continued development of the Memento Damage service, first introduced by Dr. Justin Brunelle in his PhD dissertation.
- For PhD students, we added: Plinio Vargas, Hussam Hallak, and Brian Griffin.
- For MS Students, we added: Miranda Smith, Grant Atkins, Nauman Siddique, and Maheedar Gunnam.
- Alexander Nwala won the departmental "Outstanding PhD researcher" award in Spring 2017.
- Mohamed Aturban published a tech report about the difficulties in simply computing fixity information about archived web pages (spoiler alert: it's a lot harder than you might think; blog post).
- Corren McCoy published a tech report about ranking universities by their "engagement" with Twitter.
- Yasmin AlNoamany, now a post-doc at UC Berkeley, published two papers based on her dissertation about storytelling: a tech report about the different kinds of stories that are possible for summarizing archival collections, and a paper at Web Science 2017 about how our automatically created stories are indistinguishable from those created by experts.
- Lulwah Alkwai published an extended version of her JCDL 2015 best student paper in ACM TOIS about the archival rate of web pages in Arabic, English, Danish, and Korean languages (spoiler alert: English (72%), Arabic (53%), Danish (35%), and Korean (32%)).
- The rest of our publications came from JCDL 2017:
- Alexander published a paper about his 2016 summer internship at Harvard and the Local Memory Project, which allows for archival collection building based on material from local news outlets.
- Justin Brunelle, now a lead researcher at Mitre, published the last paper derived from his dissertation. Spoiler alert: if you use headless crawling to activate all the JavaScript, embedded media, iframes, etc., be prepared for your crawl time to slow and your storage to balloon.
- John Berlin had a poster about the WAIL project, which allows easily running Heritrix and the Wayback Machine on your laptop (those who have tried know how hard this was before WAIL!)
- Sawood Alam had a proof-of-concept short paper about "ServiceWorker", a new JavaScript API that allows for rewriting URIs in web pages and could have significant impact on how we transform web pages in archives. I unexpectedly had to present this paper because, thanks to a flight cancellation the day before, John and Sawood were in a taxi headed to the venue during the scheduled presentation time!
- Mat Kelly had a poster (and a separate, lengthy tech report) about how difficult it is to simply count how many archived versions of a web page an archive has (spoiler alert: it has to do with deduping, scheme transition of http-->https, status code conflation, etc.). This won best poster at JCDL 2017!
- In February, I attended the National Symposium on Web Archiving Interoperability at the Internet Archive in San Francisco, and immediately after that Lulwah, Erika, Sawood, and Mohamed attended the third Archives Unleashed Hackathon. Unfortunately I could not stay for the hackathon.
- John Berlin went to Stanford University in late March for Personal Digital Archiving 2017 and presented WAIL.
- In April I went to Albuquerque NM for the Spring 2017 CNI Membership Meeting, and along with Martin Klein and Herbert Van de Sompel we presented early results from our AMF project "To the Rescue of the Orphans of Scholarly Communication".
- June was a busy month, first with Mat and Sawood attending the fourth Archives Unleashed hackathon, and then the IIPC Web Archiving Conference in London, which were held back-to-back.
- June also saw many WS-DL members (me, Dr. Weigle, Sawood, John, Alexander, and alumni Martin and Justin) attend JCDL 2017 in Toronto, for which we have separate trip reports for the main conference itself and the associated Web Archiving and Digital Libraries (WADL) Workshop.
- Shawn Jones attended Web Science 2017 in late June in Troy, NY, presenting Yasmin's paper since she was unable to attend and since he will be furthering the storytelling research she began.
- Late August saw Alexander return from his summer internship at Harvard, where he researched media manipulation.
- In October, Lulwah attended the 2017 Grace Hopper Celebration of Women in Computing (GHC). Although this was Lulwah's first time at the conference, we were fortunate enough to have Yasmin attend three times in the past (2015, 2014, and 2013).
- Also in October, I went to LANL to visit with Herbert, Martin, and Sarven Capadisli about some of the technologies and concepts that Herbert would eventually cover in his Paul Evans Peters award lecture in December.
- In November, Shawn attended the ASIST meeting in DC (along with his wife Valentina Neblitt Jones, who works at LANL), and then made it, with Dr. Weigle in attendance, to San Francisco for the Dodging the Memory Hole meeting at the Internet Archive.
- Dr. Weigle attended the Fall 2017 CNI Membership Meeting in December in Washington DC, ODU's first as a member of CNI!
- I had to miss CNI, since I first attended the ACM Workshop on Reproducibility in Publication, and then the Documenting the Now Symposium "Digital Blackness in the Archives" (which overlapped with CNI).
#DTMH2017 getting ready to get started at @internetarchive. @WebSciDL folks speaking today (@mart1nkle1n) and tomorrow (@shawnmjones). pic.twitter.com/ojyXiURKDN— Michele Weigle (@weiglemc) November 15, 2017
.@johnaberlin pitching WAIL in minute madness #jcdl2017 https://t.co/2pzlUL7vd1 @WebSciDL pic.twitter.com/4uvKKqWs3G— Michael L. Nelson (@phonedude_mln) June 20, 2017
.@mart1nkle1n introducing @justinfbrunelle's paper "discover more stuff but crawl more slowly" #jcdl2017 see also https://t.co/wTDDZokQeE pic.twitter.com/1v58FjNwnk— Michael L. Nelson (@phonedude_mln) June 20, 2017
Zombies in archives, says @justinfbrunelle - live web content leaking into archives. #jcdl2017 pic.twitter.com/QId5G420qw— Ian Milligan (@ianmilligan1) June 20, 2017
#JCDL2017 secret; a big one @ibnesayeed high poster + mythological reference in #MinuteMadness + shameless marketing = #Best #Poster #Award pic.twitter.com/YYEXYaX8e7— Sawood Alam (@ibnesayeed) June 22, 2017
@ianmilligan1 seeing off @WebSciDL @ibnesayeed @johnaberlin at Union Station after #WADL2017 #JCDL2017 pic.twitter.com/sJTTwLeTSQ— Sawood Alam (@ibnesayeed) June 24, 2017
Unicorns and zombies in #WebArchiving practice, from @ibnesayeed. On leaks from live web into archives. #jcdl2017 #wadl2017 pic.twitter.com/SpfI0ynMCN— Ian Milligan (@ianmilligan1) June 22, 2017
@WebSciDL at @internetarchive after Archives Unleashed 3.0 wrap up. We have a winner of #HackArchives pic.twitter.com/vYLi89yap0— Sawood Alam (@ibnesayeed) February 25, 2017
.@johnaberlin from @WebSciDL @oducs is demoing #WAIL @StanfordLibs #pda2017 pic.twitter.com/htNqLae4Ib— Yasmina Anwar (@yasmina_anwar) March 29, 2017
incredible suite of tools for working with and creating archive content from live web @machawk1 @Mementoweb #hackarchives @WAWeek2017 pic.twitter.com/OayUgAc5UE— Matthew Weber (@docmattweber) June 12, 2017
Thanks To Alexander Nwala (PhD) For Spending 3 Hours Teaching Me Computer Programming With Python. pic.twitter.com/7Oi98oImds— J.O. Effoduh (@effodu) August 4, 2017
WS-DL did not host any external visitors this year, but we were active with the colloquium series in the department and the broader university community:
- In February, I gave a web archiving colloquium in the Chemistry Department at ODU (if my high school chemistry teacher only knew!).
- Sawood gave a colloquium about web archiving and WS-DL research for a set of visiting undergrad students in the summer.
- Alexander presented his media manipulation research in a summer colloquium as well.
- In November, Sawood gave another departmental colloquium, this time about Docker.
- RJI ran three separate articles about Shawn, John, and Mat participating in the 2016 "Dodging the Memory Hole" meeting.
- On a less auspicious note, it turns out that Sawood and I had inadvertently uncovered the Optionsbleed bug three years ago, but failed to recognize it as an attack. This fact was covered in several articles, sometimes with the spin of us withholding or otherwise being cavalier with the information.
- Grant released a significantly reworked version of CarbonDate, which wraps a variety of heuristics for estimating the creation date of a web page.
- Alexander updated the Local Memory Project to include non-US news sources.
- John updated WAIL, and in the process released a set of support software: node-warc, node-cdxj, and Squidwarc.
- Mohamed released "archivenow", a library and service to make it easier to simultaneously push web pages to not just the Internet Archive, but also archive.is, webcitation.org, and perma.cc.
- As mentioned above, Erika implemented the Memento-Damage service. Justin's original code was just a research prototype, but she built a python library, service, and docker image.
- Mat, Sawood, and others continued to work on Interplanetary Wayback, which implements Wayback Machine functionality over IPFS.
- Corren released the data sets for her University Twitter Engagement (UTE) work mentioned above.
- Mat created a nice page for downloading WARCreate, WAIL, and Mink as part of the close out of the NEH "Archive What I See Now" project.
Another point that you can probably infer from the discussion above, but that I decided to make explicit, is that we're especially happy to be able to continue to work with so many of our alumni. The nature of certain jobs inevitably takes some people outside of the WS-DL orbit, but as you can see above, in 2017 we were fortunate to continue to work closely with Martin (2011) now at LANL, Yasmin (2016) now at Berkeley, and Justin (2016) now at Mitre.
WS-DL annual reviews are also available for 2016, 2015, 2014, and 2013. Finally, I'd like to thank all those who at various conferences and meetings have complimented our blog, students, and WS-DL in general. We really appreciate the feedback, some of which we include below.
--Michael
Researchers find Twitter followers useful proxy for typical measures of university reputation. P.S. @WebSciDL at Old Dominion is amazing. https://t.co/zofT1BozDs— Eileen Clancy (@clancynewyork) August 28, 2017
No matter the occasion, there's a @WebSciDL blog post for it ... https://t.co/arFuvRCi4n— Justin Littman (@justin_littman) December 13, 2017
.@smythbound makes a @ibnesayeed-esque shoutout to unicorns and zombies... the last #webarchiving zombie of #jcdl2017 #wadl2017? pic.twitter.com/NTbMYSW9qX— Ian Milligan (@ianmilligan1) June 23, 2017