2023-07-23: "Up and running with ARK persistable identifiers" JCDL Tutorial Trip Report

 

John Kunze presenting "Up and running with ARK persistable identifiers"


JCDL 2023 (main trip report) featured a doctoral consortium, three tutorials, and six workshops (including EEKE and WADL). John Kunze (@jakkbl) presented the first tutorial of the conference on “Up and running with ARK persistable identifiers.” ARK (@arks_org) stands for Archival Resource Key. “Persistable” is preferred over the more common term, “persistent.” Persistable identifiers of different kinds have been around for 20 years and are used for academic papers (DOIs) and infrastructure (URNs and Handles). These identifiers are sometimes expensive and don’t always meet the needs of the average researcher today, for example in digital humanities projects. Thus, ARKs were created as a free persistable identifier available for a variety of uses. The site N2T.net (Name to Thing) is the main resolver website for connecting an ARK identifier to the object it identifies.


Examples of ARKs

One example of an ARK is ark:/13960/t2m620j4v. This ARK contains two parts: the identifier for the organization, in this case 13960, and the identifier for the object, in this case t2m620j4v. 13960 is the numerical identifier for the Internet Archive. This ARK resolves to a copy of Jane Austen's Persuasion available to be read at the Internet Archive.

ARK Anatomy: each organization has an identification number, and each ARK has an identifier. Some ARKs have additional suffixes. Source: https://arks.org/about/
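The two-part anatomy described above can be illustrated with a minimal Python sketch. The `Ark` type and `parse_ark` function are hypothetical names introduced here for illustration, and the sketch deliberately keeps any suffixes as part of the object name rather than trying to separate them.

```python
import re
from typing import NamedTuple


class Ark(NamedTuple):
    """Hypothetical container for the two main parts of an ARK."""
    naan: str  # identifier for the organization, e.g. "13960"
    name: str  # identifier for the object (any suffix is left attached)


# Accepts "ark:/NAAN/name"; the slash after "ark:" is treated as optional.
ARK_PATTERN = re.compile(r"^ark:/?(?P<naan>[^/]+)/(?P<name>.+)$")


def parse_ark(ark: str) -> Ark:
    """Split an ARK string into its organization and object identifiers."""
    match = ARK_PATTERN.match(ark.strip())
    if match is None:
        raise ValueError(f"not an ARK: {ark!r}")
    return Ark(match.group("naan"), match.group("name"))


persuasion = parse_ark("ark:/13960/t2m620j4v")
print(persuasion.naan, persuasion.name)
```

Splitting on the first slash after the organization number means that object names containing other punctuation, such as colons or hyphens, pass through unchanged.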

The organization with the most ARKs is FamilySearch, a genealogical records website. FamilySearch has 8 billion ARKs. For example, the ARK ark:/61903/1:1:MJB9-WB6 represents Grace Hopper's 1920 US federal census record. Logged-in users can view this record, and anyone can make a free FamilySearch account. The FamilySearch ARKs I found resolve to landing pages meant for human use (rather than machine use). This particular landing page contains a transcription of the census document, a link to view the original document, and links to additional records such as Grace's main records page and the records of other family members listed with her in the census.

YAMZ (Yet Another Metadata Zoo), a system made by researchers at Drexel University, uses ARKs to represent proposed metadata terms. For example, the ARK ark:/99152/h8072 resolves to the term "Persistent Identifier (PID)". When a user adds a new term to the system, a new ARK is created for it.

IIIF and ARKs

ARKs can also integrate with IIIF (International Image Interoperability Framework) for images, and support a variety of suffix redirections.

The BnF (Bibliothèque nationale de France) uses ARKs with IIIF in its digital library. The ARK ark:/12148/btv1b531384533 represents a design sketch of the Garnier bust at the Opéra Garnier. This ARK landing page has items including a description of the piece and an image viewer. There is also a link to the IIIF JSON manifest, which allows for use of the IIIF API by either a human or a machine. The API enables cropping, scaling, and tone changes, such as this version of the bust which zooms in on Garnier's head in black and white, through suffixes appended to the URI. All IIIF versions of the piece still share one base ARK. 

The ARK ark:/12148/btv1b531384533 can be manipulated with the IIIF API. All versions share one base ARK, and different suffixes enable viewing different versions of the image.
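To make the suffix idea concrete, here is a rough sketch of how an IIIF Image API request could be assembled from a base URI. It assumes the IIIF Image API 2.x path order of region, size, rotation, quality, and format. The function name `iiif_image_url` and the base URI `https://iiif.example.org/...` are placeholders, not the BnF's actual endpoint, and a given server may accept different quality or size keywords.

```python
def iiif_image_url(
    base: str,
    region: str = "full",     # "full" or "x,y,w,h" in pixels
    size: str = "full",       # "full" or e.g. "400," for a 400 px wide image
    rotation: int = 0,        # degrees clockwise
    quality: str = "default", # e.g. "default", "color", "gray", "bitonal"
    fmt: str = "jpg",
) -> str:
    """Append IIIF Image API style suffixes to a base image URI."""
    return f"{base.rstrip('/')}/{region}/{size}/{rotation}/{quality}.{fmt}"


# Hypothetical base URI that embeds the ARK; every derived view shares it.
BASE = "https://iiif.example.org/ark:/12148/btv1b531384533"

full_view = iiif_image_url(BASE)
cropped_gray = iiif_image_url(BASE, region="100,50,400,400", quality="gray")
print(full_view)
print(cropped_gray)
```

Only the suffix changes between the two requests, which is why every derived view can still be traced back to the same ARK.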

Tutorial closing considerations

John discussed the framework that organizations need to consider when implementing ARKs, such as whether or how ARK-identified content might change in the future, and also the technical choices behind the implementation (such as the choice of a minting algorithm for the opaque identifier). For example, Internet Archive ARKs (such as the Persuasion ARK ark:/13960/t2m620j4v from earlier) end in a check character that permits detection of the most common transcription errors.
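As an illustration of how a check character can catch transcription errors, the sketch below assumes the NOID check digit algorithm, which the report does not name explicitly. Each character of the "NAAN/name" string is weighted by its position, and the sum modulo 29 selects the check character from a 29-character alphabet of digits and consonants. The function names are hypothetical.

```python
# Digits plus consonants (no vowels, no "l"): 29 characters in total.
ALPHABET = "0123456789bcdfghjkmnpqrstvwxz"


def check_char(base: str) -> str:
    """Compute a check character over the string without its final character."""
    total = 0
    for position, ch in enumerate(base, start=1):
        # Characters outside the alphabet (such as "/") count as zero.
        value = max(ALPHABET.find(ch), 0)
        total += position * value
    return ALPHABET[total % len(ALPHABET)]


def has_valid_check_char(ark: str) -> bool:
    """Return True if the last character matches the computed check character."""
    body = ark[len("ark:/"):] if ark.startswith("ark:/") else ark
    return len(body) > 1 and check_char(body[:-1]) == body[-1]


print(has_valid_check_char("ark:/13960/t2m620j4v"))  # the Persuasion ARK
print(has_valid_check_char("ark:/13960/t2m602j4v"))  # two adjacent characters swapped
```

Because each character is weighted by its position and 29 is prime, substituting a single character or swapping two adjacent characters changes the sum, so the stored check character no longer matches.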

ARKs are decentralized, and each organization is responsible for resolving its own ARKs on its web server. The global ARK resolver at N2T.net supports the persistence of ARKs-as-URLs by permitting an organization, if it chooses, to publicize ARKs based at N2T.net rather than at its own, possibly less stable, resolver domain name. If the resolver location changes, it’s easy to update the record at N2T to correctly forward ARKs to the new resolver.
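The difference between publicizing an ARK at N2T.net and at a local resolver comes down to which hostname is placed in front of the same identifier. A minimal sketch follows; the function name `resolver_url` and the local hostname `resolver.example.org` are hypothetical.

```python
N2T_RESOLVER = "https://n2t.net"


def resolver_url(ark: str, resolver: str = N2T_RESOLVER) -> str:
    """Build an actionable URL by prefixing an ARK with a resolver hostname."""
    ark = ark.strip()
    if not ark.startswith("ark:"):
        raise ValueError(f"not an ARK: {ark!r}")
    return f"{resolver.rstrip('/')}/{ark}"


ark = "ark:/13960/t2m620j4v"
global_url = resolver_url(ark)
# Hypothetical local resolver run by the organization itself.
local_url = resolver_url(ark, "https://resolver.example.org/")
print(global_url)
print(local_url)
```

The identifier itself never changes; only the hostname does, which is what lets an organization move its resolver without reissuing its ARKs.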

Organizations can request a Name Assigning Authority Number (NAAN) for free to get started on their projects with persistable objects. Below are the slides from the tutorial.


-Lesley Frew
