2013-02-24: Personal Digital Archiving 2013

On February 21-22, Justin Brunelle (@justinfbrunelle) and I (@machawk1) traveled to College Park, Maryland for Personal Digital Archiving (PDA) 2013. Other members of the Web Science and Digital Libraries Research (WS-DL) Group at ODU had attended this conference before (see 2012 Trip Report and 2011 Trip Report), when it was held at Internet Archive in San Francisco, and we knew it would be informative and extremely relevant to both of our research efforts.
We had both been anticipating a few of the presentations, namely the keynotes by Sally Bedell Smith and George Sanger, and those that Erin Engle (@erinengle) promised on the Library of Congress digital preservation blog, The Signal.
For the sake of preservation, I captured videos of many of the presentations, which I posted on Internet Archive. Each available video is linked inline in this post, but for an experience closer to the original, view the videos.
As our mission at WS-DL is not solely to document conferences (ok, admittedly, documenting conferences is not in our mission), I presented a demo and poster titled "Making Enterprise-Level Archive Tools Accessible for Personal Web Archiving". The presentation described a software package I had created called Web Archiving Integration Layer (WAIL), which is open source and currently available for download. The tool packages together instances of Wayback, Heritrix and other archiving tools to allow for one-click, user-instigated preservation or, as Dr. Michael L. Nelson (@phonedude_mln) put it, "Easiest Heritrix ever".

Day One

After registration, coffee and the welcome by Bill Lefurgy (@blefurgy) and Trevor Muñoz (@trevormunoz), Sally Bedell Smith began the first keynote and talk of the conference. Sally documented her reluctance to move to modern writing programs, having long been comfortable with Xywrite and her 120+ words per minute (!) typing speed, which had previously been showcased to her colleagues. As her writing has been mostly biographical, she asked, "How do future biographers access personal media?" and stressed the need for unpublished content in the archives, including retracted content, to be utilized.
"Memory is deceptive", she said, further emphasizing the need to preserve drafts. She repeatedly illustrated the advantages of paper over digital copies, stating that it was easier to get a 10,000-foot view when she could physically spread the pages out on a table, a perspective likely not common in the crowd of a conference bearing "Digital" in its name. "Personal correspondence is lost in digital media", she said, alluding to the context lost from interviews documented digitally. She would take these printouts, annotate them, and then make the changes digitally. "Duplication is accepted/preferred", she said, and "sharing drafts is essential".
Jenny Shaw gave the first paper presentation, "Hardware and soft skills: surveying scientific personal papers in the digital age." She spoke of her work with the Human Genome Archive Project and the group's efforts in capturing scientific notes related to the Human Genome Project. She had led efforts in surveying the software used and the hardware needed. The primary aims were to cover the UK's efforts and to survey born-digital material.

After Jenny, Sudheendra Hangal (@hangal) and Monica S. Lam (@MonicaSLam) presented "Engaging users with personal archives through gamification". Sudheendra spoke of the gamification of the e-mail process through a creative fill-in-the-blank construction of a user's e-mail content in the form of crosswords and word searches. With his software, Muse, this can all be done automatically, and it has a primary use case in Alzheimer's patients. The software instills a degree of personalization into familiar games.

After a short break, Noah Lenstra (@nlenstr2) presented "Connecting Local & Family History with Personal Digital Archiving: Findings from Studies in Four Midwestern Public Libraries". His main ideas were the need to negotiate boundaries between personal and public archives and the reasons personal archives should be converted to public archives.

After Noah, Heather Gendron presented her talk, "Passionate About History and the Making of History: in Situ Dialogues with Artists and their Assistants about Studio Archives". She had spoken to many artists and investigated their archiving processes and their feelings on whether the construction phase of their works should be archived. She wished to publish good-practice methods for artists so that those with sub-par systems might learn of effective methods while still maintaining their workflow.

When Heather finished, the room was adjourned for lunch at the Banneker Room of the Stamp Student Union.

Following lunch, a series of lightning (10-minute) talks commenced. First up was Mike Ashenfelder with "The Library of Congress Personal Digital Archiving Videos". Along with Library of Congress, his group "created short, 3-5 minute videos, with clear focused messages on a single topic related to digital preservation." His intention in creating these is to reach all audiences, particularly non-technical ones. A topic his group was considering for a video in the near future is scanning, as there has been a lot of interest in the method as a means of preservation.
A few videos they have already produced are "Why Digital Preservation is Important for Everyone" and "Why Digital Preservation is Important to You" as well as Butch Lazorchak's interviews with teenagers about digital preservation. "In his interviews they had some startling realizations", Mike said, "like a teenager who said, 'when she puts something on the internet, it's always available'."

Mél Hogan (@mel_hogan) followed Mike with "Collect Yourself: Data Storage Centers as the Archive's Underbelly". Mél emphasized in her presentation (slides) the environmental impact that preservation has, using the Facebook data centers to highlight both the energy required to run the servers and the energy required to cool them.

Nigel Lepianka (@trueXstory) followed Mél with "Achievements as Personal Archives of Memory and Experience in Open World Video Games". His work has been in documenting how achievements can be used as a means of archiving how we navigate video games, with the work primarily done in games like World of Warcraft, which has been around long enough to represent an evolution of individuals and thus culture. As achievements are temporally organized, they show the order of the player's experiences.

Jan Emery next presented “Personal Artifacting”, a concept and practice for bringing dimensionality to personal archiving.

Following the lightning talks, Zach Vowell presented his paper, "The Many Faces of the Fat Man: A Case Study of a Multi-Faceted Personal Digital Archive." Zach spoke of the George Sanger (the Day Two PDA2013 keynote presenter) collection, part of the UT video game archive. The collection contained numerous obscure hardware media and software formats that he needed to recreate in their original form in order to preserve them. One of these was Sanger's recording of digital data (namely video game audio) onto specialized VHS cassettes, which confused Zach when he tried to verify the data using a VCR and received only static.

After Zach, Smiljana Antonijevic (@Smiljana_A) and Ellysa Stern Cahoy presented "Scholarly workflow and personal digital archiving".
They presented interviews with academic faculty to investigate scholarly workflow. The project began in 2012 and will conclude in June 2013. They conducted a web-based survey of faculty and graduate students asking about their digital practice. The study spanned the sciences, humanities, and social sciences and looked at how faculty manage their data. Their initial question was, broadly, "How do faculty use their personal information collections?".
Next, Sudheendra Hangal (@hangal) returned for a second paper along with Sit Manovit, Peter Chan, and Monica S. Lam (@MonicaSLam) to present "Providing Access to Email Archives for Historical Research".
Jason Matthew Zalinger and Nathan G. Freier were the final presenters of the day with "Narrative Searching Through a Scholar’s Email Archive". Jay had been given a large corpus of e-mail from InfoVis mogul Ben Shneiderman (@benbendc), who was in attendance, for the sake of researching linguistic trends or indicators in an academic's professional communications. "Ben was always aware that his e-mail could become public", Jay said. He found the keyword "however" to be a transitional phrase that denoted much emotion in Ben's writings and illustrated examples outside of this corpus that confirmed his finding. Jay continued to "look for anger" in the corpus by finding other transitional phrases that had such an effect, most of them relating to moments in an academic's career where emotions run high (e.g., the acceptance or rejection of proposals). He finished with a quote from H. Porter Abbott, "Narrative is marked almost everywhere by its lack of closure. Commonly called suspense, this lack is one of the two things that above everything else give narrative its life." He expounded, "E-mail is suspenseful."
With the closing of the paper presentations, the crowd was instructed to head to the Maryland Institute for Technology in the Humanities (MITH) for the poster session. As mentioned above, I presented my poster titled "Making Enterprise-Level Archive Tools Accessible for Personal Web Archiving" along with eleven other posters by presenters from a wide range of fields.

Day Two

Day two started off with a keynote from George Sanger (a.k.a. "The Fat Man"). George reminisced about his past with music creation, mainly for video games, and the many other endeavors he had been involved in along the way. "Archiving is like playing an electrified guitar", he said, "what's the point?". "The motivation for an archive", he later said, "is that I want the stuff gone, but I want to keep it", thus confirming its necessity.
The talks for the day started with Megan Barnard and Gabriela Redwine's presentation of "Collaborating to Improve and Protect Born-Digital Acquisitions".

The next presentation was a quasi-panel humorously titled "All Your Bits Aren't Belong To Us: Opportunities and Challenges of Personally Revealing Information in Digital Collections".

Cal Lee was the first of the panel with his theme "It's Ethics All The Way Down". Cal spoke of the levels at which information resides that might be important to document or convey the way that people interact with systems. His levels consisted of:
  • Aggregation of objects
  • Object or package
  • In-application rendering
  • File through filesystem
  • File as "raw" bitstream
  • Sub-file data structure
  • Bitstream through I/O equipment
  • Raw signal stream through I/O equipment
  • Bitstream on physical medium

Naomi Nelson followed Cal in the panel. She spoke of lists of financial information and sensitive data they had received in their collections. A further example was a set of photos of a writer's workspace that happened to have financial information in them. Beyond sensitive data, they discovered metadata for deleted files in the collections that the owners may not want exposed. She asked, "What do archives do with collections that include private info -- deleted files, drafts, cookies, geotags, etc.?"

Kam Woods (@kamwoods) spoke after Naomi as the third speaker of the panel, with a talk titled "Let There be Hope for Our Future". "The 'hope'", he described, "is that as digital materials get larger, we will have the right tools to protect donor information." His group has been building relatively simple software to prevent data corruption or manipulation through unintentional writing. A major part of his work was knowing that you have found everything there is to find on disk when a donor submits content.

Matt Kirschenbaum (@mkirschenbaum) covered the tail end of the panel with his presentation, "Robot Historians". Matt spoke of scholarship in the context of archival donors, stating, "Data has the potential, and indeed the right, to make its call - that the stuff and matter of the cultural record is vested with agency in this negotiation. Scholarship is thus a vocation in the service of the inanimate, not just in the memory and shades of Shakespeare but their irreducible, physical remainder."

Melissa Rogers (@MelissaRogers17) presented the first paper to follow the panel, "Public Displays of Affection: Digital Zine Archives and the Labor of Love". She spoke of the culture encompassed within "zines" and the natural variance, degradation and manipulation that zinesters' works possess. "Those of us interested in innovative forms of zine archiving must find a way around the limited 'to digitize or not-to-digitize' argument that seems to dominate many conversations of digital zine preservation", she said.

Seth Anderson (@AVPSeth) followed Melissa with "Protecting the Personal Narrative: An Assessment of Archival Practice's Place in Personal Digital Archiving". Seth stated, "Collection is a natural process. ... Collecting is a way of manifesting our own existence through the materials we accumulate around ourselves; we look to exist beyond our lifetimes. The materials we have represent ourselves."

After Seth, the crowd broke for lunch, only to return to a second round of lightning talks for the conference.

The first speaker of the lightning talks was Erin Engle (@erinengle) with "We've Thought Globally, Now Let's Act Locally". She spoke about how the National Digital Information Infrastructure and Preservation Program (NDIIPP) has reached out to individuals with personal digital archiving advice. They also developed the Personal Digital Archiving Day Kit, geared toward organizations and institutions, to share archiving guidance within their communities. The guidance supplied was non-technical in nature, with suggestions like "Identify where you have your individual files", "Organize your digital files", and "Make copies, at least two, and store them in different locations".

Following Erin in the lightning talks was Philip von Stade with "Memories Lost & Found: How Digital Memories Can Help Those With Alzheimers and Their Caregivers". Philip spoke of memory loss and the use of preserved photos for keeping aging people sharp via social engagement. He is in the process of creating an iPad app that allows photos to be organized and audio annotations to be added. The use case is reviewing these photos and annotations to help keep the aging from suffering from Alzheimer's, with social engagement as the treatment.

Sarah Kim was the third and last speaker of the conference's lightning talks with "The virtual presence of others and the presentation of self in personal digital archives".

Following the lightning talks, Evan Carroll (@evancarroll) presented "Law and Society: Current Advances in the Digital Afterlife". Evan discussed the idea of having a "digital executor", an option for estate planning that involves a more tech-savvy person handling digital assets. "There is a certain advantage to being dead and gone", he said of worrying about information being exposed by those not capable of properly dealing with one's digital assets after death.
After a final break, the conference closed with Leslie Swift and Lindsay Zarwell's presentation "Projections of Life: Prewar Jewish Life on Film".

Bill and Trevor closed up the conference as they began it, asking the crowd for suggestions and comments on the conference.
Overall, Justin and I found the conference very informative, and it opened our eyes to some of the concerns of those in the humanities.
— Mat (@machawk1)