2020-09-09: Theory and Practice of Digital Libraries 2020 (TPDL 2020) Non-Trip Report
The 2020 Theory and Practice of Digital Libraries conference (TPDL 2020) was planned for Lyon, France, but was instead hosted virtually via BigBlueButton. It was a joint conference with ADBIS 2020 and EDA 2020. TPDL was a fascinating look into the various projects and research efforts undertaken by members of the digital library community. Due to time zone differences and technical issues, I could not attend all sessions. As usual, I will summarize some of those I attended here.
On the Persistence of Persistent Identifiers of the Scholarly Web
Martin Klein and Lyudmila Balakireva, my teammates from the Los Alamos National Laboratory Research Library Prototyping Team, won the best paper award at TPDL 2020 for "On the Persistence of Persistent Identifiers of the Scholarly Web."
In their paper, Martin and Luda investigated how consistently scholarly publishers respond to common HTTP requests against Digital Object Identifiers (DOIs) that identify scholarly artifacts on the web. They analyzed the length of the DOI redirect chain and the HTTP response code of the chain's last link for different HTTP request methods and HTTP clients. They found significant differences between responses to clients and methods that resemble machines "browsing" the web and responses to requests that more closely resemble human browsing behavior. Fewer than 50% of DOIs returned the same response code across all requests. Overall, requests sent with the popular web browser Chrome (the method most closely resembling a human browsing) returned the most successful HTTP responses. Among the odd behavior they noticed, the simple HTTP HEAD method resulted in a sizable number of "404 - Not Found" responses, yet 25% of those DOIs returned a "200 - OK" response when any other HTTP request was sent.
Martin and Luda further investigated differences when sending requests from different network environments (with and without commercial publisher subscription levels) and when sending requests against DOIs that identify Open Access vs. non-Open Access content. Given the noticeable inconsistencies in responses to simple HTTP requests, they raise questions about trust in the persistence of these widely used persistent identifiers. Please read the paper for more details on the study and the implications of its findings.
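To make the two measurements described above concrete (redirect chain length and the response code of the last link, per request method), here is a minimal Python sketch. It is not the authors' code or methodology. The function names, the User-Agent string, and the hop limit are all assumptions made for illustration, and it keeps the same method on every hop, which real clients do not always do.

```python
import http.client
from urllib.parse import urljoin, urlsplit

# Status codes treated as redirects in this sketch.
REDIRECT_CODES = {301, 302, 303, 307, 308}


def fetch_once(url, method="GET", user_agent="doi-check-sketch/0.1"):
    """Send one request without following redirects.

    Returns (status code, Location header or None).
    The User-Agent value is a made-up placeholder.
    """
    parts = urlsplit(url)
    if parts.scheme == "https":
        conn = http.client.HTTPSConnection(parts.netloc, timeout=10)
    else:
        conn = http.client.HTTPConnection(parts.netloc, timeout=10)
    try:
        path = parts.path or "/"
        if parts.query:
            path += "?" + parts.query
        conn.request(method, path, headers={"User-Agent": user_agent})
        resp = conn.getresponse()
        return resp.status, resp.getheader("Location")
    finally:
        conn.close()


def resolve_chain(url, method="GET", fetch=fetch_once, max_hops=20):
    """Follow redirects by hand; return (visited URLs, last status seen)."""
    chain = [url]
    status = None
    for _ in range(max_hops):
        status, location = fetch(chain[-1], method)
        if status not in REDIRECT_CODES or not location:
            break
        # Location may be relative, so resolve it against the current URL.
        chain.append(urljoin(chain[-1], location))
    return chain, status


def compare_methods(url, methods=("HEAD", "GET"), fetch=fetch_once):
    """Resolve the same URL with each method and report whether the
    final status codes agree."""
    results = {m: resolve_chain(url, m, fetch) for m in methods}
    final_codes = {status for _, status in results.values()}
    return results, len(final_codes) == 1
```

Calling compare_methods with a DOI URL of your choosing performs live requests. The fetch parameter exists so the chain-following logic can be exercised with a stub instead of the network.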
Honored and humbled to receive the #TPDL2020 best paper award w/ Luda, thanks @tpdl2020 community!
— Martin Klein (@mart1nkle1n) August 27, 2020
Preprint: https://t.co/TWCPUxQgSX
Conf proceedings: https://t.co/anCCO0J5vX
Slides: https://t.co/834LowL1LZ @LosAlamosNatLab
Keynotes
Verónika Peralta from the Université de Tours shared "From source data to data narratives: accompanying users in the way to interactive data analysis." Peralta noted that data narration is narrating with data visualization: it brings analysis, synthesis, and visualization together to tell a story. Her talk (available on YouTube) contains many references on narrative discourse, data analysis, visualization, and more. She covered a four-layer model for building a data narrative. The factual layer is where we collect and analyze data. In the intentional layer, we create our messages based on our findings. The structural layer organizes these messages into units that can be discussed. The presentational layer takes these messages and visualizes them to tell our narrative. Peralta broke each of these layers into individual tasks, demonstrating the work necessary to bring a data narrative to life. She also covered how the OLAP III project is further developing these ideas to provide insights into data, trying to address which queries and models to execute against the data and which highlights to select for an interesting story. She provided many references about measures of interestingness before closing with the many open challenges surrounding building these narratives.
Selected Papers
Interested in how #ArchiveIt users view #webarchives and quality? Read my #tpdl paper "Correspondence as the primary measure of #quality for #webarchives": https://t.co/PwRvr63U1q
— Brenda Reyes (@CamtheWicked) August 24, 2020
The final authenticated version is available online at https://t.co/jrcpcMka0P
Brenda Reyes from the University of Alberta presented "Correspondence as the Primary Measure of Quality for Web Archives: A Grounded Theory Study" (slides). Reyes analyzed issues reported against Archive-It to develop a Theory of Quality based on an evaluation of correspondence (similarity between live and memento), relevance (on-topic content compared to original), and archivability (difficulty with preservation, similar to memento-damage). She found that Archive-It users have difficulty evaluating the relevance of content and are concerned as much about the "overabundance of content" as they are about coverage of their collection topic.
Andrea Mannocci is presenting “Context-driven Discoverability of Research Data” #TPDL2020
— Shawn M. Jones (@shawnmjones) August 26, 2020
Paper: https://t.co/dKncBoZeoC pic.twitter.com/jIb4Hfl8Nt
Andrea Mannocci from CNR-ISTI presented "Context-driven Discoverability of Research Data," where he noted how research data is often considered "ancillary material" and, as such, has no common practices, incentives, or mandates for its care and use. This presents issues with discoverability and reusability. To aid in discoverability, Mannocci proposes applying semantic relations connecting datasets to the documents whose access is already facilitated by existing discovery tools. He closed with a demo of his solution.
@rasa_bocyte from @ReTV_EU is presenting “Online News Monitoring for Enhanced Reuse of Audiovisual Archives” #TPDL2020
— Shawn M. Jones (@shawnmjones) August 27, 2020
Paper: https://t.co/rsdkjP6fqt
ReTV Project: https://t.co/S9AlpccSar pic.twitter.com/YDqOddJw1g
Rasa Bočytė from the ReTV Project presented "Online News Monitoring for Enhanced Reuse of Audiovisual Archives," where the authors reused news archives to gain insights. Their project analyzes cross-platform and multilingual data and produces a variety of visualizations of news across different topics and localities. Their visual dashboard produces graphs, but unlike StoryGraph, these graphs connect concepts rather than news sources. It also produces flow diagrams, bubble charts, and word clouds for topic analysis, as well as streamgraphs demonstrating changes in news over time.
Future
#TPDL2021 will likely be at @unitartu alongside #ADBIS2021 August 24-26, 2021
@marlon_dumas is discussing the 2021 conference at University of Tartu in Estonia
— Shawn M. Jones (@shawnmjones) August 27, 2020
"COVID allowing, it is going to be a comfortable trip if you stay here" #TPDL2020 #ADBIS2020 pic.twitter.com/qgVgvIAMyy
TPDL 2021 will likely occur in concert with ADBIS 2021 at the University of Tartu in Estonia. The conference covered a lot of good work in 2020. It is a shame that I did not get to view it all. I look forward to next year.