2021-12-08: Collaborative Software Archiving Project funded by the Alfred P. Sloan Foundation

We are happy to announce that the Alfred P. Sloan Foundation has awarded $520,000 to our new collaboration between the New York University (NYU) Division of Libraries, the Research Library's Prototyping Team at Los Alamos National Laboratory (LANL), the University of Pittsburgh’s Computer Science Department, and the Web Science and Digital Libraries (WSDL) Research Group at Old Dominion University (ODU). The two-year project, titled “Collaborative Software Archiving for Institutions (CoSAI)”, is led by co-PIs Vicky Rampin (Librarian for Research Data Management and Reproducibility, NYU) and Martin Klein (Research Scientist, LANL), in collaboration with wilkie from the Open Curation of Computation and Metadata (OCCAM) initiative at the University of Pittsburgh and WSDL’s Michael L. Nelson and Michele Weigle. In addition, we are thrilled to welcome Talya Cooper and Emily Escamilla to the CoSAI team.

CoSAI will focus on institutional approaches to provide machine-repeatable and human-understandable workflows for preserving web-based scholarship, specifically source code, while forefronting the role of education, outreach, and community building.

Example of an arXiv pre-print that contains a reference to a source code repository (GitHub). The link to the repository is rotten (it returns an HTTP 404), but Mementos exist.

The objective of CoSAI is to lower the barrier to entry for software preservation by developing a framework for institutional archiving of open scholarly materials, focused on but not exclusive to research software. This work includes developing and testing novel technical solutions, specifically decentralized and federated technology, in order to foster collaboration and increase access to curated materials as well as to the curation workflows themselves. A key part of our approach is catalyzing the role of “Software Curation” librarians in academic and government institutions. Technical solutions are very much needed (as evidenced by the outcomes of the Sloan-funded Investigating & Archiving the Scholarly Git Experience (IASGE) environmental scan), but the work needs dedicated labor to be successful.

To support these goals, CoSAI will have three main streams of work: 

  1. technical development on open source, community-led tools for collecting, curating, and preserving open scholarship with a focus on research software (resulting in software, workflows, and documentation),

  2. community building around open scholarship, software collection and curation, and archiving of open scholarship (resulting in educational and outreach materials), and

  3. optimizing workflows for archiving open scholarship with ephemera, via machine-actionable and manual workflows (resulting in workflows, narratives, and primers).

Expected outcomes of CoSAI are a minimal-computing toolkit for federated software preservation, including (semi-)automatic quality control of archived records; a catalyzed role for software curation librarians; and a community built around the importance of long-term access to research software for reproducibility and the stability of the scholarly record.

If you are interested in this area of work or simply would like to learn more about our efforts, please feel free to get in touch (vicky.rampin@nyu.edu and mklein@lanl.gov).


Vicky & Martin