2023-07-26: ACM/IEEE Joint Conference on Digital Libraries (JCDL) 2023 Trip Report
The ACM/IEEE Joint Conference on Digital Libraries (JCDL) 2023 was a hybrid conference, with the in-person event at the Hilton Santa Fe Historic Plaza in Santa Fe, New Mexico, and virtual attendees joining via Zoom. The conference took place June 26-30, 2023, and was hosted by Los Alamos National Laboratory. JCDL is a major international forum focusing on digital libraries and associated technical, practical, and social issues.
- Best Student Paper Award: Making Changes in Webpages Discoverable: A Change-Text Search Interface for Web Archives (Lesley Frew, Michael Nelson, and Michele Weigle)
- Best Short Paper Award: MetaEnhance: Metadata Quality Improvement for Electronic Theses and Dissertations of University Libraries (Muntabir Hasan Choudhury, Lamia Salsabil, Himarsha R. Jayanetti, Jian Wu, William A. Ingram, and Edward A. Fox)
- Best Poster Award: The Memento Tracer Toolset for Human-Guided Focused Crawling of Dynamic Web (Lyudmila Balakireva, Emily Escamilla, Talya Cooper, Michael L. Nelson, and Michele C. Weigle)
Conference Venue - Hilton Santa Fe Historic Plaza, New Mexico
Opening and Keynote #1: Oksana Bruy
Day 1 of the JCDL main conference started off with some information about the Santa Fe area before the first keynote speaker.
@mart1nkle1n from @LosAlamosNatLab is kicking off day 2 of #JCDL2023! pic.twitter.com/TXDDxbWKdP
— JCDL2023 (@jcdl2023) June 27, 2023
.@OksanaBrui is talking about the destruction of libraries and books in Ukraine during the Russian invasion. The Russian invaders have removed or destroyed historical documents and fiction that does not support their narrative. #JCDL2023 pic.twitter.com/Jj4VSWikWJ
— Shawn M. Jones, PhD / @shawnmjones@hachyderm.io (@shawnmjones) June 27, 2023
Session 1: Digital Libraries
.@acnwala is kicking off the first paper session of #JCDL2023 — Session 1: Digital Libraries. pic.twitter.com/3ztRwgB1Es
— Shawn M. Jones, PhD / @shawnmjones@hachyderm.io (@shawnmjones) June 27, 2023
Session 1: Digital Libraries was chaired by Alexander Nwala (@acnwala), an assistant professor of data science at William & Mary and an alumnus of the WS-DL research group.
Christin Katharina Kreutz is presenting "Evaluating Digital Library Search Systems by using Formal Process Modelling"#JCDL2023
— Shawn M. Jones, PhD / @shawnmjones@hachyderm.io (@shawnmjones) June 27, 2023
Pre-print: https://t.co/vFgpa6r9nm pic.twitter.com/ZkfMvCON2Q
Satvik Chekuri and Bipasha Banerjee are presenting “Integrated Digital Library System for Long Documents and their Elements”#JCDL2023 pic.twitter.com/hqYUiNCkJq
— Shawn M. Jones, PhD / @shawnmjones@hachyderm.io (@shawnmjones) June 27, 2023
.@LizWoolcott is presenting about use and reuse in digital libraries - their goal is to identify use cases and develop a toolkit #jcdl2023 #rethinkingdigitalrecords pic.twitter.com/YF21bqaKci
— Lesley Frew (@lesley_elis) June 27, 2023
Session 2: Scientometrics
Session 2: Scientometrics at the @jcdl2023 just started. @weiglemc from @WebSciDL @ODU, session chair, is introducing the speakers of the session. #jcdl2023 pic.twitter.com/rnm9X8v4XQ
— Yasasi (@Yasasi_Abey) June 27, 2023
Session 3: Web Archiving
Session 3: Web Archiving at the @jcdl2023 just started.@shawnmjones, session chair, is introducing the speakers of the session. #jcdl2023 pic.twitter.com/HmPnGLeB8G
— Yasasi (@Yasasi_Abey) June 27, 2023
I am excited to present our work titled “Less than 4% of Archived Instagram Account Pages for the Disinformation Dozen are Replayable” @jcdl2023 later today!
— Himarsha R. Jayanetti (@HimarshaJ) June 27, 2023
Preprint: https://t.co/dFcLxNK3mj
🛝: https://t.co/hlA7HDVgIY@haleybragg17 @phonedude_mln @weiglemc @WebSciDL #JCDL2023
Session 4: Information Retrieval
The conference had a session on information retrieval in parallel to the session on web archiving, chaired by Dr. Martin Klein from the Los Alamos National Laboratory. The session included three long paper presentations, two of which were nominated for best paper awards. The session began with Timo Breuer from TH Köln presenting their study "Bibliometric Data Fusion for Biomedical Information Retrieval," which proposes to improve information retrieval by incorporating metadata such as citations and altmetrics. Next, Souvick Ghosh from San Jose State University presented their paper, "Toward Connecting Speech Acts and Search Actions in Conversational Search Tasks," which uses speech acts in utterances to predict system-level actions in conversational information retrieval. The session's final presentation was "Binding Data Narrations - Corroborating the Plausibility of Scientific Narratives by Open Research Data" by Denis Nagel from TU Braunschweig. Their proposal includes structuring narratives and identifying plausible supporting data for a given scientific claim using the open data repository of the World Health Organization. For example, unlike the widely accepted connection between smoking and lung cancer, scientific narratives linking smoking and tuberculosis are often questioned and require strong evidence and argumentation to be credible. Their BiND system is built to compute flexible bindings between scientific narratives and open research data to provide the missing evidence.
Session 4: Information Retrieval @jcdl2023 just started with a quick introduction by Dr. Klein (@mart1nkle1n ) @LosAlamosNatLab #JCDL2023 pic.twitter.com/TXHuzz2JsD
— Bhanuka Mahanama (@mahanama94) June 27, 2023
Panel 1: Research data without borders
@NFDI_de having a panel on how the sections of the NFDI work on cross-cutting topics across disciplines and consortia @jcdl2023 ! @nfdi4culture @nfdi4ds @NFDI4Chem @BERD_NFDI @NFDI4Cat #JCDL2023 pic.twitter.com/EV6Ho28z81
— HerresPawlisLab (@HerresLab) June 27, 2023
Poster Session
We are staring #JCDL2023 Minute Madness!!!
— JCDL2023 (@jcdl2023) June 27, 2023
Every poster and demos has 1 minute to describe your poster! Good luck! pic.twitter.com/eNbAY47AE3
The poster session started with Minute Madness, where each poster presenter had one minute to summarize their work.
“Archiving dynamic web content is like trying to catch a firefly in a jar.” The solution: “The Memento Tracer Toolset for Human-guided Focused Crawling of Dynamic Web”. @HimarshaJ from @WebSciDL is inviting everyone at #jcdl2023 minute madness to visit their poster @jcdl2023. pic.twitter.com/AafvmgSxqQ
— Yasasi (@Yasasi_Abey) June 27, 2023
Sandeep Kalari from @accessodu @WebSciDL is presenting at #JCDL2023 minute madness. Visit their poster, “Assessing the Accessibility of Web Archives” at @jcdl2023 poster session.@mk344567 @vikas_daveb @oducs pic.twitter.com/4GMSwHL4Bc
— Yasasi (@Yasasi_Abey) June 27, 2023
After Minute Madness, conference attendees could walk around to view the posters and talk to the presenters about their work.
Images from the poster session
Himarsha from WS-DL presenting at JCDL 2023 poster session
Keynote #2: Jessica Polka
Jessica Polka, the Executive Director of ASAPbio, a nonprofit organization led by researchers, delivered the second keynote speech in person. Her presentation was titled “How preprints are changing biomedical publishing” and she offered a diverse range of thought-provoking viewpoints on scholarly communication based on preprints.
We welcome our next #JCDL2023 keynote speaker @jessicapolka !
— JCDL2023 (@jcdl2023) June 28, 2023
Slides: https://t.co/PKmo2woyLp pic.twitter.com/hkeyejaoWP
During her keynote, she highlighted several key benefits of preprints, including rapid dissemination, rapid feedback, and rapid correction. She emphasized that preprints enable individuals to collaborate and communicate at an earlier stage, giving them the opportunity to receive feedback on their papers, expand their professional network, and welcome new collaborators. She also referenced a popular xkcd comic. As avid fans of xkcd ourselves at WS-DL, we couldn't resist including this delightful tidbit in the trip report.
#JCDL2023@jessicapolka just referenced this @xkcdComic https://t.co/d7GiadMLTY
— Michael L. Nelson (@phonedude_mln) June 28, 2023
Session 5: Knowledge Graphs and Knowledge Organization
Session #5: Knowledge Graphs and Knowledge Organization @jcdl2023 has started Hermann Kroll from Techincal University Braunschweig, Germany presenting their long paper titled "Enriching Simple Keyword Queries for Domain-Aware Narrative Retrieval". #jcdl2023 pic.twitter.com/yfFPi7b1t4
— Yasasi (@Yasasi_Abey) June 28, 2023
Session 6: Digital Humanities and Teaching
Following the keynote session, the digital humanities and teaching session was held in parallel to session five, with Dr. Wolf-Tilo Balke from TU Braunschweig chairing the session. It included a diverse set of publications: one long paper, three short papers, and one late-breaking study.
The session began with Nandana Kumara from the University of Waikato presenting "Reading Lists Systems' Pedagogical Features: A Comparative Analysis." In the study, the authors explore the features of existing reading lists along with their perceived value. They find that existing reading lists do not fully meet academic expectations and identify possible improvements based on their observations. Next was a short paper by Caitlin Burge from the University of Luxembourg, titled "A King's Counsel: A Network(ed) Approach; Digitizing the Privy Council Registers of Henry VIII." The study analyzes digitized historical records to identify details on power and influence among historical figures, highlighting the significance of digitized records over digitalized records.
The third presentation was the late-breaking study, "Yes but.. Can ChatGPT Identify Entities in Historical Documents?" by Carlos-Emiliano González-Gallardo from the University of La Rochelle. The study explores named entity recognition and classification by large language models in historical documents with zero-shot learning and identifies several shortcomings, such as the inaccessibility of historical archives for training these models. Next was a short paper titled "FastCat Catalogues: Interactive Entity-based Exploratory Analysis of Archival Documents" by Pavlos Fafalios from ICS-FORTH, Greece. The study proposes an application that helps maritime history researchers search archival documents using a wide variety of features, such as entity browsing. The session's final presentation was the short paper "MINE - A Text Analysis Service for Digital Humanities Scientists" by Triet Ho Anh Doan from GWDG, Göttingen. The project aims to address problems with text analysis and accessibility issues in large-scale data by simplifying the process and offering it as a service.
Session 6: Digital Humanities and Teaching @jcdl2023 started with the chair Dr. Wolf-Tilo Balke https://t.co/BqzzP4OmOs
— Bhanuka Mahanama (@mahanama94) June 28, 2023
Panel 2: AI and Public Archives
The third day of the conference began with a panel discussion titled “AI and Public Archives: Collaborative Leadership for Responsible Adoption”. The four panelists were William Ingram from Virginia Tech, Rebecca Dikow from the Smithsonian Institution Data Science Lab, Abigail Potter from the Library of Congress, and Jill Reilly from the U.S. National Archives and Records Administration (NARA).
Jill Reilly talked about using artificial intelligence tools to automate previously manual work for digital access (public access to government records). Rebecca Dikow talked about how off-the-shelf AI models misidentify historical objects. One example from the Smithsonian was the automated tagging of shackles used in the movie "Roots" as a necklace with 91% probability. Another example was tagging entries in a botanical archive gathered by Mary Vaux Walcott as being contributed by her husband, Charles Walcott, because the entry was listed as "Mrs. Charles Walcott", which was a common way that married women identified themselves during this time period. Emphasizing the significance of an AI values statement at the Smithsonian, she highlighted the institution's role as a trusted source. Their aim is to guarantee that the integration of new technology does not undermine public trust. Abigail Potter discussed the role of values as a fundamental component of an AI strategy at the Library of Congress. She introduced a framework and various tools for effective AI planning.
After their short presentations, the panelists engaged in discussions and a Q&A session with the audience.
The last day of #JCDL2023 began with the panel “AI and Public Archives: Collaborative Leadership for Responsible Adoption” with esteemed panelists @sudobear, @jillreillyjames, @rdikow, & @opba. @JCDL2023 @WebSciDL pic.twitter.com/iJGQMBxAKe
— Himarsha R. Jayanetti (@HimarshaJ) June 29, 2023
The discussions and Q&A resulted in several noteworthy points and suggestions, including but not limited to the following:
- Explicitly inform users when captions are automatically generated, thereby encouraging feedback for improvement.
- Leverage general-purpose crowdsourcing to facilitate corrections and gather public input on various aspects.
- Involve students who are studying practical applications in testing out models.
- Promote transparency by sharing models and datasets with users and researchers.
- Create a centralized registry where institutions can contribute their unique developments, allowing for cross-institutional learning and innovation. Notable platforms such as Hugging Face Hub and AI4LAMs were mentioned as valuable resources for accessing and sharing models and datasets.
Out of the panelists, only William Ingram attended the event in person. The remaining panelists participated virtually and expressed their gratitude to William Ingram for his excellent handling of the panel.
Session 7: AI/ML/Entity Extraction
The third day of the main conference had the final paper session on Artificial Intelligence, Machine Learning, and Entity Extraction. Dr. Sawood Alam from the Internet Archive chaired the session comprising six publications. The first paper was the late-breaking study/dataset, "Mining the History Sections of Wikipedia Articles," by Wolfgang Kircheis from Leipzig University. The authors propose a dataset comprising science and technology Wikipedia articles with their extracted history sections. Next was the long paper "Efficient Ultrafine Typing of Named Entities" by Alejandro Sierra-Múnera from Hasso-Plattner-Institut. The study addresses the complexities in ultrafine named entity recognition, such as the requirement of a large training dataset or the costly operation of comparing against all entity types.
The third presentation was the paper "Mining Semantic Relations in Data References to Understand the Roles of Research Data in Academic Literature," by Lizhou Fan from the University of Michigan. The study presented a workflow for identifying the relationships between publications, studies, and authors. The next presentation was the short paper "Extreme Classification for Answer Type Prediction in Question Answering" by Vinay Setty from the University of Stavanger. The author proposes improving answer type prediction by incorporating transformer models when predicting the top-k knowledge graph types, producing state-of-the-art results.
The session's final presentation was the late-breaking study, "Zero-shot Entailment of Leaderboards for Empirical AI Research," by Salomon Kabongo Kabenamualu from Leibniz University of Hannover. In the study, the authors investigate the generalizability of state-of-the-art models in identifying entailment, the directional relation between two text fragments.
Session 6 @jcdl2023 just started with @ibnesayeed
— Bhanuka Mahanama (@mahanama94) June 29, 2023
from @internetarchive chairing the session#JCDL2023 pic.twitter.com/ReAGMMljPE
Keynote #3: Sarah Lamdan
Professor Sarah Lamdan from the City University of New York School of Law delivered the final keynote, titled “Data Cartels and the Future of Digital Information Access”. The talk focused mainly on publishers' shift toward data analytics and its impact on library professionals' roles in adapting to digital platforms and products. She used LexisNexis as an example, where records containing overlapping data points are assigned a unique identifier called LexID, which is not derived from personally identifiable information. These identity profiles (LexIDs) are continuously enriched by adding new records. She discussed how this information is being sold to various entities such as governments, tenant screening companies, healthcare systems, and insurance companies. She also described incidents where incorrect data from LexisNexis had real-world implications, causing harm to the public. She further highlighted the notion that data analytics has become pervasive across all industries (with a few companies dominating all the informational markets), with publishers simply adapting to this prevailing trend. For instance, she cited the exposure of personal data through IoT devices, the emergence of technological trends like smart thermometers and smart clothing, and how contemporary cars gather extensive data about drivers.
#JCDL2023
— Michael L. Nelson (@phonedude_mln) June 29, 2023
re: @greenarchives1's note about surveillance in our modern life:
old cars don't spy on you ;-)https://t.co/jXKUlJ9q5d
For more information on this topic, interested readers can explore Sarah Lamdan's insightful book titled “Data Cartels: The Companies That Control and Monopolize Our Information”.
As the #JCDL2023 keynote speaker, @greenarchives1 asks:
— Shawn M. Jones, PhD / @shawnmjones@hachyderm.io (@shawnmjones) June 29, 2023
* What does it mean when library vendors are also doing stuff with our personal data for profit?
* Does this risk our patrons' academic or intellectual freedom? pic.twitter.com/75x1attqi7
Conference Dinner and Awards
On the night of Wednesday, June 28, the conference dinner was held at La Fonda on the Plaza, Santa Fe. Following the dinner, the best paper and poster awards were announced.
Best Poster Award
“The Memento Tracer Toolset for Human-guided Focused Crawling of Dynamic Web." poster won the best poster award @ #jcdI2023 🏆@HimarshaJ accepted the award on behalf of Lyudmila (@LosAlamosNatLab), @EmilyEscamilla_(@ODU), Talya (NYU). @phonedude_mln @weiglemc @WebSciDL pic.twitter.com/EdX38kF9mt
— Yasasi (@Yasasi_Abey) June 29, 2023
Best Short Paper Award
Best Short Paper Award won by “MetaEnhance: Metadata Quality Improvement for Electronic Theses and Dissertations of University Libraries" #jcdl2023 🏆
— Yasasi (@Yasasi_Abey) June 29, 2023
Congratulations @TasinChoudhury, @liya_lamia, @HimarshaJ, @fanchyna, William A. Ingram, and @edwardafox 👏@WebSciDL @ODUSCI pic.twitter.com/Kla2by2sMf
Best Student Paper Award
Congratulations @lesley_elis @weiglemc @phonedude_mln for winning the Best Student Paper Award at #jcdl2023 for their paper “Making Changes in Webpages Discoverable: A Change-Text Search Interface for Web Archives” 🏆👏 @WebSciDL @ODUSCI pic.twitter.com/R3L1lZmgg7
— Yasasi (@Yasasi_Abey) June 29, 2023
Vannevar Bush Best Paper Award
Congratulations to the #JCDL2023 Vannevar Bush Best Paper Award Winner:
— JCDL2023 (@jcdl2023) June 29, 2023
“SciKGTeX - A LaTeX Package to Semantically Annotate Contributions in Scientific Publications” pic.twitter.com/7IheZqjPQH
Wrap-up
.@WebSciDL folks at @jcdl2023 conference dinner.@weiglemc @phonedude_mln @mart1nkle1n @fanchyna @shawnmjones @acnwala @ibnesayeed @machawk1 @HimarshaJ @TasinChoudhury @mahanama94 @lesley_elis #jcdl2023 pic.twitter.com/qFwmmKo9en
— Yasasi (@Yasasi_Abey) June 29, 2023
--Yasasi (@Yasasi_Abey), Bhanuka (@mahanama94), Lesley (@lesley_elis), and Himarsha (@HimarshaJ)