2019-06-05: Joint Conference on Digital Libraries (JCDL) 2019 Trip Report

Alma Mater, a bronze statue at the University of Illinois by sculptor Lorado Taft. Photo by Illinois Library, used under CC BY 2.0 / Cropped from original

It's June, so it's time for the 19th ACM/IEEE Joint Conference on Digital Libraries (JCDL 2019). This year's JCDL was held at the University of Illinois at Urbana-Champaign (UIUC) from June 2 to 6. As at last year's conference, we (members of WSDL) attended paper sessions, workshops (Web Archiving and Digital Libraries), tutorials, and panels, in which researchers from multiple disciplines presented the findings or progress of their respective research efforts. Unlike in previous years, we did not have any students or faculty in this year's JCDL doctoral consortium. We regret this and hope to resume next year.

Proud to have @WebSciDL at #JCDL2019: three profs, 2 students, and 1 alum!

Excellent conf -- thanks to @profdownie and all who helped make it happen.

See you next year in Wuhan for #JCDL2020 pic.twitter.com/Knmv5Qx1kl
— Michael L. Nelson (@phonedude_mln) June 5, 2019

Day 1

The stage is ready for an eventful week, @profdownie welcomes #JCDL2019 attendees. pic.twitter.com/J4tixK81Wc
— Sawood Alam (@ibnesayeed) June 3, 2019

Following a welcome statement by Dr. Stephen Downie, Professor and Associate Dean for Research at the School of Information Sciences at UIUC, Day 1 began with a keynote from Dr. Patricia Hswe (pronounced "sway"), the program officer for Scholarly Communications at The Andrew W. Mellon Foundation. The title of her keynote was: Innovation is Dead! Long Live Innovation!

#JCDL2019 kicking off with an opening keynote from @pmhswe - really important discussion of exemplar sustainable projects: why they work, and challenges they face. I pic.twitter.com/aD4l4tt1bX
— Ian Milligan (@ianmilligan1) June 3, 2019

.@pmhswe at #JCDL2019 - Digital research workflow: gather, catalog, transcribe, identify, interpret, publish; LOD as a key technology for identify, and web annotations for interpret.
— Rob Sanderson (@azaroth42) June 3, 2019

I like this quote - after the revolution, who picks up the garbage? #JCDL2019 pic.twitter.com/mZSc3Suj6p
— Ian Milligan (@ianmilligan1) June 3, 2019

.@pmhswe mentions this article about maintenance (and makes an analogy about digital infrastructure)https://t.co/nMQgGEWuUG

#JCDL2019
— Michael L. Nelson (@phonedude_mln) June 3, 2019

.@pmhswe also mentions this article by @dancohen https://t.co/x4w3Xp0vd3

#JCDL2019
— Michael L. Nelson (@phonedude_mln) June 3, 2019

Also mentioned by @pmhswe in #JCDL2019 keynote https://t.co/ABdppiIMkq
— Michael L. Nelson (@phonedude_mln) June 3, 2019

Her keynote proposed rethinking the purpose of innovation in the Digital Libraries domain: what is being built does not have to be entirely new. Instead, innovation should include adaptation, reuse, recovery, etc., rather than a rush to build the "Next New Shiny Thing."

Three parallel paper sessions followed the keynote after a break:

Generation and Linking
Analysis and Curation, and
Search Logs

Generation and Linking Session

Pablo Figueira began this paper session with a full paper presentation titled: Automatic Generation of Initial Reading Lists: Requirements and Solutions. They proposed an automatic method for generating reading lists of scientific articles to help researchers familiarize themselves with existing literature by presenting four existing requirements, and one novel requirement for generating reading lists.

Pablo Figueira is presenting, "Automatic Generation of Initial Reading Lists: Requirements and Solutions" at #JCDL2019 pic.twitter.com/l9nbnWZjxe
— Sawood Alam (@ibnesayeed) June 3, 2019

Next, Lucy McKenna, a PhD student at Trinity College Dublin, presented a full paper titled: NAISC: An Authoritative Linked Data Interlinking Approach for the Library Domain. They showed that Information Professionals such as librarians, archivists, and cataloguers have difficulty in creating five star Linked Data. Consequently, they proposed NAISC, an approach for assisting Information Professionals in the Linked Data creation process.

Lucy McKenna is presenting, "NAISC: An Authoritative Linked Data Interlinking Approach for the Library Domain" at #JCDL2019 pic.twitter.com/hLgU76RuBt
— Sawood Alam (@ibnesayeed) June 3, 2019

Next, Rohit Sharma presented a short paper titled: BioGen: Automated Biography Generation. They proposed BioGen, a system that automatically creates biographies of people by generating short sets of biographical sentences related to multiple life events. They also showed that their system produced biographies similar to those manually written on Wikipedia.

Folks from @iitgn presenting, "BioGen: Automated Biography Generation" at #JCDL2019 pic.twitter.com/cfwnq4kXck
— Sawood Alam (@ibnesayeed) June 3, 2019

The Generation and Linking session ended with a short paper presentation by Tinghui Duan, PhD student at the University of Jena, titled: Corpus Assembly as Text Data Integration from Digital Libraries and the Web. Their work proposes a method of building Digital Humanities corpora by searching and extracting fragments of high-quality digitized versions of artifacts from the Web.

Tinghui Duan presenting, "Corpus Assembly as Text Data Integration from Digital Libraries and the Web" at #JCDL2019. pic.twitter.com/hFf4ST05Xs
— Sawood Alam (@ibnesayeed) June 3, 2019

Analysis and Curation Session

Dr. Antoine Doucet, professor of Computer Science at the University of La Rochelle, France, began this paper session by presenting their full paper: Deep Analysis of OCR Errors for Effective Post‐OCR Processing. They presented the results of a study of five general types of Optical Character Recognition (OCR) errors: misspellings (real-word and non-word errors), edit operations, length effects, character position errors, and word boundary errors. Subsequently, they recommended different approaches to designing and implementing effective OCR post-processing systems.

Did I mention we’re hiring @NewsEyeEU @L3i_LaRochelle @UnivLaRochelle ? 🙂 https://t.co/zBsFmfu6wl
— Antoine Doucet (@AntoinDoucet) June 3, 2019

Next, Colin Post, a doctoral candidate in the Information and Library Science program at the University of North Carolina, Chapel Hill, presented a full paper (best paper nominee) titled: Digital curation at work: Modeling workflows for digital archival materials. This research provides insight about digital curation in practice by studying and comparing the digital curation workflows of 12 cultural heritage institutions, and focusing on the use of open-source software in their workflows.

Colin Post (@werrthe) up next in the Analysis and Curation session with a full paper (best paper nominee) presentation: "Digital curation at work: Modeling workflows for digital archival materials."#JCDL2019 pic.twitter.com/CK8HZmArXo
— Alexander C. Nwala (@acnwala) June 3, 2019

Next was a presentation from Julianna Pakstis, Metadata Librarian at the Department of Biomedical and Health Informatics (DBHi) at the Children's Hospital of Philadelphia (CHOP), and Christiana Dobrzynski, Digital Archivist at DBHi. Their short paper presentation was titled: Advancing Reproducibility Through Shared Data: Bridging Archival and Library Practice. This research highlights the work of a team of librarians and archivists at CHOP. This team implemented Arcus, an initiative of the CHOP Research Institute with the purpose of making the biomedical research data archive and discovery catalog more broadly available within their institution.

Julianna Pakstis and Christiana Dobrzynski (@CHOP_CBMi) presenting: "Advancing Reproducibility Through Shared Data: Bridging Archival and Library Practice"#jcdl2019 pic.twitter.com/BaW0iYazie
— Alexander C. Nwala (@acnwala) June 3, 2019

The session concluded with Ana Lucic's short paper presentation titled: Unsupervised Clustering with Smoothing for Detecting Paratext Boundaries in Scanned Documents. This research addresses the problem of separating the main text of a work from its surrounding paratext, a task common to the processing of large collections of scanned text in the Digital Humanities domain. The paratext often has to be removed to avoid distorting word counts, the locating of references, etc. They proposed a method for detecting the paratext based on a smoothed unsupervised clustering technique, and showed that their method improved subsequent text processing after removal of the paratext.
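To make the idea of "smoothing" cluster output more concrete, here is a minimal Python sketch. It is not the authors' method: it assumes each scanned page has already been assigned a cluster label by some unsupervised algorithm (0 for paratext, 1 for main text, both hypothetical), and it uses a sliding-window majority vote as a stand-in smoothing step. The function name `smooth_labels` and the `window` parameter are illustrative.

```python
from collections import Counter
from typing import List


def smooth_labels(labels: List[int], window: int = 3) -> List[int]:
    """Replace each page label with the majority label in its neighborhood.

    Assumption: labels are per-page cluster ids in reading order.
    """
    half = window // 2
    smoothed = []
    for i in range(len(labels)):
        # Pages within `half` positions of page i (truncated at the edges).
        neighborhood = labels[max(0, i - half): i + half + 1]
        smoothed.append(Counter(neighborhood).most_common(1)[0][0])
    return smoothed


# Noisy per-page labels: isolated flips around the paratext/main-text boundary.
noisy = [0, 0, 1, 0, 0, 1, 1, 0, 1, 1]
clean = smooth_labels(noisy)
boundary = clean.index(1)  # first page treated as main text after smoothing
```

Smoothing of this kind removes isolated mislabeled pages so that a single boundary between paratext and main text can be read off the label sequence.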

The Analysis and Curation session ended with a short paper presentation by Ana Lucic, "Unsupervised Clustering with Smoothing for Detecting Paratext Boundaries in Scanned Documents"#JCDL2019 pic.twitter.com/Q6YXCU8vvZ
— Alexander C. Nwala (@acnwala) June 3, 2019

Search Logs Session

This session began with the first (a best paper nominee) of three full paper presentations, from Behrooz Mansouri, Computer Science PhD student at the Rochester Institute of Technology, titled: Toward math-enabled digital libraries: Characterizing searches for mathematical concepts. The work explores what queries people use to search for mathematical concepts (e.g., "Taylor series") by studying a dataset of 392,586 queries from a two-year query log. Their results show that math search sessions are typically longer and less successful than general search sessions, and that their queries are more diverse. They claim these findings could aid the design of search engines for processing mathematical notation.

Behrooz Mansouri presenting MathSeerhttps://t.co/lX76dFP65Y https://t.co/gRmoJ7fgfR

#JCDL2019 pic.twitter.com/1gkkVny3JG
— Michael L. Nelson (@phonedude_mln) June 3, 2019

Next, Maram Barifah presented a full paper titled: Exploring Usage Patterns of a Large-scale Digital Library, in which they proposed a framework for assisting librarians and webmasters in exploring the usage patterns of Digital Libraries.

Maram Barifah presenting about the RERO doc DLhttps://t.co/Qn9nTOUooZ #JCDL2019 pic.twitter.com/jUvXBRb0AJ
— Michael L. Nelson (@phonedude_mln) June 3, 2019

Finally, Yasunobu Sumikawa presented the final full paper of the session titled: Large Scale Analysis of Semantic and Temporal Aspects in Cultural Heritage Collection's Search. In this presentation they reported the results of a study of a 15-month snapshot of query logs from the online portal of the National Library of France, conducted to understand the interests of users and how users find cultural heritage content.

Yasunobu Sumikawa presenting "Large Scale Analysis of Semantic and Temporal Aspects in Cultural Heritage Collection's Search"https://t.co/ErPEB8siym

#JCDL2019 pic.twitter.com/L1tqPkluhw
— Michael L. Nelson (@phonedude_mln) June 3, 2019

Classification, Discovery and Recommendation Sessions

Following a lunch break, Abel Elekes presented the first full paper titled: Learning from Few Samples: Lexical Substitution with Word Embeddings for Short Text Classification. To improve the classification of short text when training data is scarce, this paper proposes clustering semantically similar terms.

Abel Elekes starting off the afternoon session with his talk on "Learning from Few Samples: Lexical Substitution with Word Embeddings for Short Text Classification" #JCDL2019 pic.twitter.com/YAD2EJy1h8
— Martin Klein (@mart1nkle1n) June 3, 2019

Abel Elekes begins the Classification, Discovery and Recommendation session with a full paper presentation:

"Learning from Few Samples: Lexical Substitution with Word Embeddings for Short Text Classification"#jcdl2019 pic.twitter.com/KWvbSrxDXZ
— Alexander C. Nwala (@acnwala) June 3, 2019

Next, Andrew Collins, a researcher at Trinity College Dublin, presented a short paper titled: Document Embeddings vs. Keyphrases vs. Terms for Recommender Systems: A Large‐Scale Online Evaluation. They compared a standard term-based recommendation approach to document embedding and keyphrases - two methods used for related-article recommendation in digital libraries, by applying the algorithms to multiple recommender systems.

Very nice hearing about Andrew Colin's and @JoeranBeel systematic approach to online evaluation of recommender systems and using it to compare keyphrase vs doc embeddings recommendations of scientific papers. #JCDL2019 Also great @oacore is used as a dataset for this work.
— Petr Knoth (@petrknoth) June 3, 2019

Andrew Collins on Mr.DLib: "Document Embeddings vs. Keyphrases vs. Terms for Recommender Systems: A Large-Scale Online Evaluation" #JCDL2019 pic.twitter.com/HwzX3dVbhu
— Martin Klein (@mart1nkle1n) June 3, 2019

Andrew Collins presenting a short paper "Document Embeddings vs. Keyphrases vs. Terms for Recommender Systems: A Large‐Scale Online Evaluation"#JCDL2019 pic.twitter.com/7MuuuZUn7W
— Alexander C. Nwala (@acnwala) June 3, 2019

Next, Corinna Breitinger, a PhD student at the University of Konstanz, presented her short paper titled: 'Too Late to Collaborate': Challenges to the Discovery of in-Progress Research. She presented the findings from an investigation of how computer science researchers from four disciplines currently identify ongoing research projects within their respective fields. Additionally, she outlined the challenges faced by researchers, such as avoiding duplicate research while protecting the progress of their research for fear of idea plagiarism.

@BreitingerC is right on time to talk about "'Too Late to Collaborate': Challenges to the Discovery of in-Progress Research" #JCDL2019 pic.twitter.com/6fb7F9qwEf
— Martin Klein (@mart1nkle1n) June 3, 2019

"Unpublished literature not searchable atm" @BreitingerC. Gray literature starting to be more searchable (PK: hope @oacore helping here) #JCDL2019 pic.twitter.com/gLyy761VJO
— Petr Knoth (@petrknoth) June 3, 2019

@BreitingerC presenting a short paper:

'Too Late to Collaborate': Challenges to the Discovery of in-progress Research: https://t.co/n5CQY4y96M #jcdl2019 pic.twitter.com/KkhB3HMjO4
— Alexander C. Nwala (@acnwala) June 3, 2019

@BreitingerC et al. find a discovery-confidentiality trade-off when surveying CS researchers on how they discover unpublished and ongoing work/ideas. #JCDL2019 pic.twitter.com/jVAVEgkZbu
— Martin Klein (@mart1nkle1n) June 3, 2019

Finally, Norman Meuschke, a PhD candidate at the University of Wuppertal, presented a full paper titled: Improving Academic Plagiarism Detection for STEM Documents by Analyzing Mathematical Content and Citations. He presented their approach to detecting concealed plagiarism (heavy paraphrasing, translation, etc.) in scholarly text. The approach is a two-stage detection process that combines similarity assessments of mathematical content, academic citations, and text, as well as similarity measures that consider the order of mathematical features.
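As an illustration of what an order-aware similarity measure can look like, the following Python sketch compares two sequences of mathematical identifiers using the longest common subsequence (LCS). This is a stand-in chosen for illustration, not necessarily one of the measures used in the paper; the function names and the sample identifier sequences are hypothetical.

```python
from typing import Sequence


def lcs_length(a: Sequence[str], b: Sequence[str]) -> int:
    """Length of the longest common subsequence of a and b (dynamic programming)."""
    prev = [0] * (len(b) + 1)
    for x in a:
        curr = [0]
        for j, y in enumerate(b, start=1):
            if x == y:
                curr.append(prev[j - 1] + 1)
            else:
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def order_similarity(a: Sequence[str], b: Sequence[str]) -> float:
    """LCS length normalized by the longer sequence; 0.0 if either is empty."""
    if not a or not b:
        return 0.0
    return lcs_length(a, b) / max(len(a), len(b))


# Hypothetical identifier sequences extracted from formulae in two documents.
doc_a = ["x", "y", "z", "E"]
doc_b = ["x", "z", "E", "m"]
score = order_similarity(doc_a, doc_b)
```

Unlike a set-based overlap, this score drops when the same identifiers appear in a different order, which is the property an order-sensitive measure is meant to capture.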

@normeu on "Improving Academic Plagiarism Detection for STEM Documents by Analyzing Mathematical Content and Citations" #JCDL2019 pic.twitter.com/saQB6rHMdF
— Martin Klein (@mart1nkle1n) June 3, 2019

@normeu presenting his research on academic #plagiarism detection for STEM literature at #JCDL2019 Find the preprint here: https://t.co/1CIXYhQ201 pic.twitter.com/lHU1e5zQQ8
— Corinna Breitinger (@BreitingerC) June 3, 2019

@normeu presents an example of a discovered plagiarized paper, probably from a German politician.... #JCDL2019 pic.twitter.com/zdEkJxAVNj
— Martin Klein (@mart1nkle1n) June 3, 2019

Minute Madness followed after Norman's presentation, wrapping up the scholarly activities of Day 1 of JCDL. In Minute Madness, poster presenters were given one minute to advertise their respective posters to the conference attendees. The poster session began after the minute madness.

Minute Madness

#MinuteMadness queue of an impressive lineup of posters at #JCDL2019 pic.twitter.com/t3OmN7E1d5
— Sawood Alam (@ibnesayeed) June 3, 2019

.@ianmilligan1 at #jcdl2019 minute madness pic.twitter.com/hf9DCMqU5E
— Michael L. Nelson (@phonedude_mln) June 3, 2019

@petrknoth @ #jcdl2019: @davejavupride and Petr built a platform to annotate academics citations according to type where the annotators are first authors. Would be interested to see how these de-facto GT annotations differ from annotations created by non-authors. pic.twitter.com/hagoy4bORT
— Dasha Herrmann (@robodasha) June 3, 2019

.@OpenMaze presenting at #JCDL2019 minute madness @WebSciDL pic.twitter.com/0REawwuev8
— Michael L. Nelson (@phonedude_mln) June 3, 2019

@crstlthms's "Less Than 10% of Library Service Users Ask For Help" #JCDL2019 pic.twitter.com/7QtwqtowyB
— Martin Klein (@mart1nkle1n) June 3, 2019

I love #JCDL2019 minute madness - fantastic research interspersed with high comedy (@mart1nkle1n and co making a cameo as a man in black wiping our memories of their classified research). pic.twitter.com/8jP1TTKjmW
— Ian Milligan (@ianmilligan1) June 3, 2019

#MiB: @mart1nkle1n @JoshuaFinnell @BrianCain101

#JCDL2019 pic.twitter.com/MNINgveR6t
— Michael L. Nelson (@phonedude_mln) June 3, 2019

@ianmilligan1 part deux #JCDL2019 pic.twitter.com/ibjBnl7nqv
— Michael L. Nelson (@phonedude_mln) June 3, 2019

@BorisVeytsman @ #jcdl2019: How can we make sure topic annotations of research papers are correct? Good annotations should predict citations (based on the idea authors cite related work) — we can therefore measure correctness of tags by measuring how well they predict citations pic.twitter.com/YdhlJ8NqQb
— Dasha Herrmann (@robodasha) June 3, 2019

Took some photos of the 1st day of #jcdl2019 including #minutemadness and the #postersession. Now available for download here: https://t.co/q7A7bD45Nm pic.twitter.com/JT3GJAyz7l
— Corinna Breitinger (@BreitingerC) June 4, 2019

Day 2

Keynote Spotlight!: Dr. Robert Sanderson will be presenting "Standards and Communities: Connected People, Consistent Data, Usable Applications" on Tuesday, June 4th. Read more here: https://t.co/fkMrhZXhi6 pic.twitter.com/VFKFN9oEop
— JCDL Conference (@JCDLConf) May 24, 2019

Day 2 of JCDL 2019 began with a keynote from Dr. Robert Sanderson, the Semantic Architect for the J. Paul Getty Trust: Standards and Communities: Connected People, Consistent Data, Usable Applications. The keynote highlighted the value of Web/Internet standards in providing the underlying foundation that makes the connected world possible. Additionally, the keynote explored the relationship between standards and their target communities, as well as some common inverse relationships, such as the trade-offs between completeness and usability, and between production and consumption.

Standards and Communities: Connected People, Consistent Data, Usable Applications from Robert Sanderson

@azaroth42 starting off the day with his keynote on "Standards and Communities: Connected People, Consistent Data, Usable Applications" #JCDL2019 pic.twitter.com/32Rj3vRsp9
— Martin Klein (@mart1nkle1n) June 4, 2019

.@azaroth42's keynote building on @pmhswe's keynote from yesterday #JCDL2019 pic.twitter.com/XCsZl0dG5u
— Michael L. Nelson (@phonedude_mln) June 4, 2019

Why? YUP! #JCDL2019 pic.twitter.com/spNJAwKQGm
— Martin Klein (@mart1nkle1n) June 4, 2019

Balancing member mgmt w actual standards work is a conundrum & why big organizations might be less effective though more seemingly active. But can small organizations realistically represent the whole world's needs and practices? -@azeroth42 Told ya it was interesting! #jcdl2019
— Jasmine Mulliken (@jasminemulliken) June 4, 2019

.@azaroth42 asks: "If only small groups (e.g., 20 people) work on standards, but the community is larger, how does that small group effectively represent the standards needs of the community?" Invokes @Educopia community engagement pyramid. #jcdl2019
— Patricia Hswe (@pmhswe) June 4, 2019

@azaroth42 presents great points to remember (borrowed from @cbracy) when working towards community engagement. #JCDL2019 pic.twitter.com/kChxoGlYZz
— Michele Weigle (@weiglemc) June 4, 2019

.@azaroth42 courts controversy by assessing protocols/standards he's been involved with #JCDL2019 pic.twitter.com/UYNpachPcQ
— Michael L. Nelson (@phonedude_mln) June 4, 2019

Interesting to see this chart from @azaroth42 showing ResourceSync as having less functionality and less comprehensibility than PMH. PK: Although ResourceSync has less functionality, it has IMHO somehow potential for better functionality and scalability. #JCDL2019 #ResourceSync pic.twitter.com/gTf08xGaMC
— Petr Knoth (@petrknoth) June 4, 2019

The Web Archives session followed the keynote.

Web Archives 1 Session

Sawood Alam, a PhD student at Old Dominion University and member of the WSDL group, presented a full paper on behalf of Mohamed Aturban: Archive Assisted Archival Fixity Verification Framework. Sawood presented two approaches, Atomic and Block, to establish and check the fixity of archived resources (testing whether an archived resource has remained unaltered since it was captured). The Atomic approach involves storing the fixity information of each web page in a JSON file and publishing the fixity content before it is disseminated to multiple on-demand Web archives. In contrast, the Block approach involves merging the fixity information of multiple archived pages in a single file before its publication and dissemination to the archives.
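The difference between the two approaches can be illustrated with a minimal Python sketch. It is not the authors' implementation: the manifest field names, the choice of SHA-256, and the helper names `atomic_manifest`, `block_manifest`, and `verify` are all assumptions made for illustration, and the step of disseminating the manifest to Web archives is left out.

```python
import hashlib
import json
from typing import Dict


def fixity_hash(content: bytes) -> str:
    """Digest of an archived payload (SHA-256 is an assumed choice)."""
    return hashlib.sha256(content).hexdigest()


def atomic_manifest(uri: str, content: bytes) -> str:
    """Atomic style: one JSON manifest per archived page."""
    return json.dumps({"uri": uri, "sha256": fixity_hash(content)}, sort_keys=True)


def block_manifest(pages: Dict[str, bytes]) -> str:
    """Block style: fixity information of many pages merged into one file."""
    records = [{"uri": u, "sha256": fixity_hash(c)} for u, c in sorted(pages.items())]
    return json.dumps({"records": records}, sort_keys=True)


def verify(manifest_json: str, uri: str, content: bytes) -> bool:
    """Return True if the content still matches the hash recorded for the URI."""
    manifest = json.loads(manifest_json)
    # An atomic manifest is treated as a block containing a single record.
    records = manifest.get("records", [manifest])
    recorded = {r["uri"]: r["sha256"] for r in records}
    return recorded.get(uri) == fixity_hash(content)


single = atomic_manifest("http://example.com/", b"<html>hello</html>")
merged = block_manifest({"http://a.example/": b"a", "http://b.example/": b"b"})
```

In this sketch the per-page manifest is simpler to check in isolation, while the merged manifest reduces the number of files that would have to be published.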

Archive Assisted Archival Fixity Verification Framework from Sawood Alam

@ibnesayeed presenting "Archive Assisted Archival Fixity Verification Framework," on behalf of @maturban1:https://t.co/mJk5GWkE8v #jcdl2019 pic.twitter.com/3uEnTNCC7x
— Alexander C. Nwala (@acnwala) June 4, 2019

Take-aways from @ibnesayeed’s talk on Archive Assisted Archival Fixity Verification Framework. Joint work with @maturban1 @WebSciDL #jcdl2019 pic.twitter.com/yUGdUeDCf0
— Michele Weigle (@weiglemc) June 4, 2019

Next, Dr. Martin Klein, a research scientist at the Los Alamos National Laboratory, presented a short paper titled: Evaluating Memento Service Optimizations. He explained the problem of long response times experienced by services that utilize the Memento Aggregator. This problem arises because search requests are broadcast to all Web archives connected to the Aggregator, even though a given URI request can only be fulfilled by some of the Web archives. He then reported results for performance optimizations of the Memento Aggregator, such as caching and Machine Learning-based predictions.
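The caching idea can be sketched in a few lines of Python. This is only a toy model under stated assumptions, not the Memento Aggregator's actual design: each archive is represented by a hypothetical lookup function that returns True if it holds a copy of the URI, and the class name `CachingAggregator` and its fields are invented for illustration.

```python
from typing import Callable, Dict, List, Set

# Hypothetical lookup signature: True if the archive holds a memento of the URI.
ArchiveLookup = Callable[[str], bool]


class CachingAggregator:
    def __init__(self, archives: Dict[str, ArchiveLookup]):
        self.archives = archives
        self.cache: Dict[str, Set[str]] = {}  # URI -> archives known to hold it
        self.requests_sent = 0

    def lookup(self, uri: str) -> List[str]:
        # Cache miss: broadcast to every archive. Cache hit: ask known holders only.
        targets = self.cache.get(uri, set(self.archives))
        holders = set()
        for name in targets:
            self.requests_sent += 1
            if self.archives[name](uri):
                holders.add(name)
        self.cache[uri] = holders
        return sorted(holders)


aggregator = CachingAggregator({
    "archive_a": lambda uri: uri.endswith("/news"),
    "archive_b": lambda uri: True,
    "archive_c": lambda uri: False,
})
first = aggregator.lookup("http://example.com/news")   # broadcast: 3 requests
second = aggregator.lookup("http://example.com/news")  # cached: 2 requests
```

A real cache would also need an expiry policy, since an archive that did not hold a URI yesterday may have captured it since; that is omitted here.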

Evaluating Memento Service Optimizations from Martin Klein

Next up in the #webarchiving session at #jcdl2019 is @mart1nkle1n, presenting “Evaluating Memento Service Optimizations.” Similarly, you can read the pre-print at https://t.co/L7LtYlcO7e (go web archiving team at making these papers so easily accessible!). pic.twitter.com/Ul56hcJpOM
— Ian Milligan (@ianmilligan1) June 4, 2019

Finally, Sawood Alam, again, presented a full paper (best paper nominee) titled: MementoMap Framework for Flexible and Adaptive Web Archive Profiling. Sawood additionally proposed the MementoMap framework as a flexible and adaptive means of efficiently summarizing the holdings of a Web archive, showing its application for the summary of the holdings of a Portuguese Web archive collection (http://arquivo.pt/) consisting of 5 billion mementos (archived copies of web pages).
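To give a rough feel for what summarizing an archive's holdings means, the Python sketch below rolls a list of archived URIs up into host-level keys with counts. It is a simplification for illustration only, not the MementoMap format or algorithm: the key layout (reversed host name followed by a wildcard) is loosely modelled on SURT-style keys, and the function names `host_key` and `summarize` are assumptions.

```python
from collections import Counter
from typing import Iterable
from urllib.parse import urlsplit


def host_key(uri: str) -> str:
    """Build a reversed-host wildcard key, e.g. 'pt,arquivo)/*' (illustrative format)."""
    host = urlsplit(uri).hostname or ""
    return ",".join(reversed(host.split("."))) + ")/*"


def summarize(uris: Iterable[str]) -> Counter:
    """Count archived URIs under each host-level key."""
    return Counter(host_key(u) for u in uris)


holdings = summarize([
    "http://arquivo.pt/a",
    "http://arquivo.pt/b",
    "https://www.example.com/",
])
```

A summary like this is far smaller than a full index of every archived URI, at the cost of no longer knowing which specific pages under a host are held.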

MementoMap Framework for Flexible and Adaptive Web Archive Profiling from Sawood Alam

And the final paper in this #webarchiving session at #jcdl2019 is “MementoMap Framework for Flexible and Adaptive Web Archive Profiling,” being presented again by @ibnesayeed. Pre-print at https://t.co/zLlgmSHn8N. pic.twitter.com/ziEVLA2iGW
— Ian Milligan (@ianmilligan1) June 4, 2019

@ibnesayeed presenting a full paper (best paper nominee): "MementoMap Framework for Flexible and Adaptive Web Archive Profiling"

Tech report: https://t.co/Dc7EnG8uQz
Slides: https://t.co/OdM0TELjLL
#jcdl2019 pic.twitter.com/rD8wvx2sN3
— Alexander C. Nwala (@acnwala) June 4, 2019

Other papers were presented concurrently in the Analysis and Processing session.

Analysis and Processing Session

In this session, Felix Hamborg, a PhD candidate at the University of Konstanz, presented a full paper titled: Automated Identification of Media Bias by Word Choice and Labeling in News Articles. Felix presented their research about an automatic method to detect a specific form of news bias - Word Choice and Labeling (WCL). WCL often occurs when journalists use different terms (e.g., "economic migrants" vs. "refugees.") to refer to the same concepts.

@flxhbg presenting research on media bias detection by merging approaches from the social sciences with computer science methods at #jcdl2019 pic.twitter.com/5Tz8TtkYbZ
— Corinna Breitinger (@BreitingerC) June 4, 2019

Next, Drahomira Herrmannova presented a full paper (Vannevar Bush best paper award winner) titled: Do Authors Deposit on Time? Tracking Open Access Policy Compliance. This paper presented the findings from an analysis of 800,000 research papers published over a five-year period. They investigated whether the time lag between the publication date of research papers and the dates the papers were deposited in a repository can be tracked across thousands of repositories globally.
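The deposit time lag itself is a simple date difference, as the following Python sketch shows. The function names and the `max_days` threshold are hypothetical; the paper's actual data sources and the compliance windows of specific open access policies are not modelled here.

```python
from datetime import date


def deposit_lag_days(published: date, deposited: date) -> int:
    """Days between publication and deposit (negative if deposited before publication)."""
    return (deposited - published).days


def deposited_on_time(published: date, deposited: date, max_days: int) -> bool:
    """Check the lag against a policy window; max_days is an assumed parameter."""
    return deposit_lag_days(published, deposited) <= max_days


lag = deposit_lag_days(date(2018, 1, 1), date(2018, 4, 1))
```

Computed per paper and aggregated by country, discipline, or institution, a value like this is what allows the lag to be compared across repositories.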

Drahomira Herrmannova presenting on deposit time lag for #OpenAccess publications at #jcdl2019 Deposit time lag is decreasing but still varies across countries and disciplines. pic.twitter.com/WmGzeVMHJD
— Corinna Breitinger (@BreitingerC) June 4, 2019

@robodasha Just #openaccess is not enough, we need #fastopenaccess #JCDL2019 pic.twitter.com/n5kcdjvGE4
— Petr Knoth (@petrknoth) June 4, 2019

There are over 1k #openaccess policies. #jcdl2019 @robodasha pic.twitter.com/4k5H7JuoOx
— Petr Knoth (@petrknoth) June 4, 2019

Deposit time lag across countries shows which countries get make publications available early, as featured in @PhysicsToday https://t.co/QIc0HwNmj2.… , full study: https://t.co/kqtTnVDYZY… @robodasha #JCDL2019 pic.twitter.com/KoRrFnWhfC
— Petr Knoth (@petrknoth) June 4, 2019

@robodasha Despite #openaccess compliance improving year on year, there are significant differences across institutions. #JCDL2019 https://t.co/JFJYTvkMlI #researchengland #REF2021 pic.twitter.com/4Wv5LtMytR
— Petr Knoth (@petrknoth) June 4, 2019

Following a break, the paper sessions continued.

Web Archives 2 Session

Sergej Wildemann, a researcher at the L3S Research Center, began with a full paper presentation titled: Tempurion: A Collaborative Temporal URI Collection for Named Entities, where he introduced Tempurion, a collaborative service for enriching entities (e.g., People, Places, and Creative Work) by linking them with URLs that best describe them. The URLs are dynamic in nature and change as the associated entities change.

First up: Sergej Wildemann and @helgeho and their work "Towards Temporal URI Collections for Named Entities" #JCDL2019 pic.twitter.com/qZL3QNNDlj
— Martin Klein (@mart1nkle1n) June 4, 2019

Next, I (Alexander Nwala) presented a full paper (best paper nominee) titled: Using Micro-collections in Social Media to Generate Seeds for Web Archive Collections. I highlighted the importance of Web Archive collections as a means of traveling back in time to study events (e.g., Ebola Virus Outbreak and Flint Water Crisis) that may not be properly represented on the live Web due to link rot. These archived collections begin with seed URLs that are often manually selected by experts or crowdsourced. Because collecting seed URLs for Web Archive collections is time-consuming, it is common for major news events to occur without the creation of a Web Archive collection to memorialize them, which justifies the need for automatically generating seed URLs. I showed that social media Micro-collections (curated lists created by social media users) provide the opportunity for generating seeds, and produce collections with properties distinct from those of conventional collections generated by scraping Web and Social Media Search Engine Result Pages (SERPs).

Next up, @acnwala and his work on "Using Micro-Collections in Social Media to Generate Seeds for Web Archive Collections" #JCDL2019 pic.twitter.com/rwSqVL7tyL
— Martin Klein (@mart1nkle1n) June 4, 2019

@acnwala presenting his Best Paper nominated work on collecting seeds for web archives from “micro-collections” in social media. #jcdl2019 @WebSciDL pic.twitter.com/wXGgSTv7HA
— Michele Weigle (@weiglemc) June 4, 2019

Fantastic use of @TwitterMoments and @reddit by @acnwala for automatic & immediate creation of seed URLs via Micro-collections for web archiving by @archiveitorg and @internetarchive of important social events. Temporal classification for identifying events. @JCDLConf #JCDL2019 pic.twitter.com/ZSqyGmUt1d
— Shubhanshu Mishra (@TheShubhanshu) June 4, 2019

Improve web archive collections by adding seeds gathered from social media micro-collections @acnwala @WebSciDL #jcdl2019 pic.twitter.com/Dueth5UXoY
— Michele Weigle (@weiglemc) June 4, 2019

Next, Dr. Ian Milligan, history professor at the University of Waterloo, presented a short paper titled: The Cost of a WARC: Analyzing Web Archives in the Cloud. Dr. Milligan posed and answered (US$7 per TB) the question: "How much does it cost to analyze Web archives in the cloud?" He used the Archives Unleashed platform as an example to show some of the infrastructural and financial costs associated with supporting scholarship in the humanities and social sciences.

About to present our (@RyanDeschamps @SamVFritz @lintool @ianmilligan1 & @ruebot) paper on “Cost of the WARC: Analyzing Web Archives in the Cloud,” as part of the #webarchiving track at #jcdl2019. Check out our pre-print & slides at https://t.co/q9ANLgeNoE. @unleasharchives pic.twitter.com/Io4XOB0xRo
— Ian Milligan (@ianmilligan1) June 4, 2019

Finally, Dr. Ian Milligan, again, presented another short paper titled: Building Community and Tools for Analyzing Web Archives through Datathons. In his second talk of the session, Dr. Milligan highlighted lessons learned from conducting the Archives Unleashed Datathons. The Archives Unleashed Datathons started in March 2016, as a collaborative Data hackathon in which social scientists, humanists, archivists, librarians, computer scientists, etc. work together for 2-3 days on analyzing Web archive data.

And also presenting our paper (long list of authors!) on “Building Community and Tools for Analyzing Web Archives through Datathons,” which will close out the #webarchiving session here at #jcdl2019. Our pre-print & slides available at https://t.co/Qhz3U481Qp. @unleasharchives pic.twitter.com/n3dS0Jg6na
— Ian Milligan (@ianmilligan1) June 4, 2019

@ianmilligan1 is on a roll, now on "Building Community and Tools for Analyzing Web Archives Through Datathons" #JCDL2019 pic.twitter.com/qz0P4sJdtv
— Martin Klein (@mart1nkle1n) June 4, 2019

Another series of paper sessions followed after a break.

User Interface and Behavior Session

Dr. George Buchanan and Dr. Dana Mckay, researchers at the University of Melbourne School of Computing and Information Systems, presented a full paper titled: One Way or Another I'm Gonna Find Ya: The Influence of Input Mechanism on Scrolling in Complex Digital Collections. They presented their findings from comparing the effect of input modality (touch and scrolling) on navigation in book browsing interfaces, reporting the user satisfaction associated with horizontal and two-dimensional scrolling.

@GeorgeRBuchanan and @DanaChatter present their full paper in the User Interface and Behavior session:

One Way or Another I’m Gonna Find Ya: The Influence of Input Mechanism on Scrolling in Complex Digital Collections#JCDL2019 pic.twitter.com/8N2hsU57MH
— Alexander C. Nwala (@acnwala) June 4, 2019

Next, Dr. Dagmar Kern, a Human Computer Interaction and User Interface Engineering researcher at Gesis, presented a short paper titled: Recognizing Topic Change in Search Sessions of Digital Libraries Based on Thesaurus and Classification System. She presented their thesaurus and classification-based solution for segmenting user sessions of a social science literature digital library into their topical components.

Dr. Dagmar Kern of @gesis_org presents a short paper in the User Interface and Behavior session:

Recognizing Topic Change in Search Sessions of Digital Libraries Based on Thesaurus and Classification System#JCDL2019 pic.twitter.com/X7hhpNlcLd
— Alexander C. Nwala (@acnwala) June 4, 2019

Finally, Cole Freeman, a researcher at Northern Illinois University, presented the last short paper of the session titled: Shared Feelings: Understanding Facebook Reactions to Scholarly Articles. He presented a new dataset of Facebook Reactions to research papers and the results of analyzing it.

Cole Freeman @SundryTomatoes presenting our paper 'Shared Feelings: Understanding #Facebook Reactions to Scholarly Articles' at #JCDL2019

Preprint: https://t.co/WvdnFHlPYo
Dataset: https://t.co/Q1VgXyWByu #research #altmetrics #openscience #scicomm pic.twitter.com/RvbmlMhRuf
— Hamed Alhoori (@alhoori) June 4, 2019

Curtis Freeman presents the final short paper of the User Interface and Behavior:

Shared Feelings: Understanding Facebook Reactions to Scholarly Articles

preprint: https://t.co/M6k5t3ktFK #jcdl2019 pic.twitter.com/eVLo889EUY
— Alexander C. Nwala (@acnwala) June 4, 2019

Citation Session

Dattatreya Mohapatra, a recent Computer Science graduate of Indraprastha Institute of Information Technology, Delhi (IIIT-Delhi), presented a full paper (best student paper award winner) titled: Go Wide, Go Deep: Quantifying the Impact of Scientific Papers through Influence Dispersion Trees. He presented a novel data structure, the Influence Dispersion Tree (IDT), which models the impact of a scientific paper without relying on citation counts. Instead, it captures the relationships among follow-up papers and their citation dependencies.
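To make the idea of a tree over follow-up papers more concrete, here is a minimal sketch under a stated assumption: each follow-up paper is attached to a single parent, chosen as the deepest paper it cites among the root paper and the other follow-ups. This attachment rule, the paper identifiers, and the function name are illustrative assumptions and may differ from the construction in the paper.

```python
from collections import Counter
from typing import Dict, List, Optional, Tuple


def build_tree(
    root: str, cites: Dict[str, List[str]]
) -> Tuple[Dict[str, Optional[str]], Dict[str, int]]:
    """Build a parent map and depth map for the follow-up papers of `root`.

    `cites` maps each follow-up paper to the papers it cites. Assumes every
    follow-up cites `root` and that the citation graph has no cycles.
    """
    depth: Dict[str, int] = {root: 0}
    parent: Dict[str, Optional[str]] = {root: None}

    def resolve(paper: str) -> int:
        if paper in depth:
            return depth[paper]
        # Only the root and other follow-ups can be parents.
        candidates = [c for c in cites[paper] if c == root or c in cites]
        best = max(candidates, key=resolve)  # assumed rule: deepest cited paper
        parent[paper] = best
        depth[paper] = depth[best] + 1
        return depth[paper]

    for paper in cites:
        resolve(paper)
    return parent, depth


# Hypothetical example: P is the paper of interest; A-D are follow-ups.
cites = {"A": ["P"], "B": ["P"], "C": ["P", "A"], "D": ["P", "C"]}
parent, depth = build_tree("P", cites)

tree_depth = max(depth.values())                      # longest chain of influence
tree_breadth = max(Counter(depth.values()).values())  # widest level of the tree
```

With a structure like this, a deep tree suggests follow-up work that keeps building on earlier follow-ups, while a wide tree suggests many papers citing the original independently.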

Dattatreya Mohapatra presenting “Go wide,
go deep: Quantifying the Impact of Scientific Papers through Influence Dispersion Trees'' at the #JCDL2019 #citations session pic.twitter.com/VUdz80BY8M
— Corinna Breitinger (@BreitingerC) June 4, 2019

Dattatreya Mohapatra from @IIITDelhi presenting, "Go Wide, Go Deep: Quantifying the Impact of Scientific Papers through Influence Dispersion Trees" at #JCDL2019

Preprint: https://t.co/bJ8QmR9dzb pic.twitter.com/z3KGF7vv3g
— Sawood Alam (@ibnesayeed) June 4, 2019

Next, Leonid Keselman, a researcher at Carnegie Mellon University, presented a full paper titled: Venue Analytics: A Simple Alternative to Citation-Based Metrics. He presented a means of automatically organizing and evaluating the quality of Computer Science publishing venues by formulating venue authorship as a regression problem, which produces venue scores for conferences and journals.

Leonid Keselman presenting very interesting research on ''Venue Analytics: A Simple Alternative to Citation-Based Metrics'' at #JCDL2019. Preprint: https://t.co/yOE0VTUfTT pic.twitter.com/KTDSOk0wOj
— Corinna Breitinger (@BreitingerC) June 4, 2019

Day 2 ended with the conference banquet and awards presentation at Memorial Stadium, the university's football stadium.

Cool #jcdl2019 banquet venue at the University of Illinois football stadium! I think this’d be the biggest football stadium in Canada...

Thanks for a great event @profdownie et al! pic.twitter.com/FAZj372RsA
— Ian Milligan (@ianmilligan1) June 5, 2019

The best demo award was given to MELD: a Linked Data Framework for Multimedia Access to Music Digital Libraries, by Dr. Kevin Page, David Lewis, and Dr. David M. Weigl.

Congratulations to Kevin and David for their best demo award at #JCDL2019 https://t.co/1yfC3ue3u9
— Oxford e-Research (@OxfordeResearch) June 5, 2019

The best student paper award was given to Go Wide, Go Deep: Quantifying the Impact of Scientific Papers through Influence Dispersion Trees, by Dattatreya Mohapatra, Abhishek Maiti, Dr. Sumit Bhatia, and Dr. Tanmoy Chakraborty.

Super-excited to receive Best Student Paper Award in @JCDLConf #JCDL19 for our work "Go Wide, Go Deep: Quantifying the Impact of Scientific Papers through Influence Dispersion Trees"https://t.co/5i87teBSEa
\w @sbhatia_ pic.twitter.com/ry8XZLZsgI
— Tanmoy Chakraborty (@Tanmoy_Chak) June 5, 2019

The Vannevar Bush best paper award was given to Do Authors Deposit on Time? Tracking Open Access Policy Compliance, by Drahomira Herrmannova, Nancy Pontika, and Dr. Petr Knoth.

I am utterly astonished & still can't get over the fact @petrknoth, @nancypontika and I won the best paper award last night @ #JCDL2019. Thank you Petr for allowing me to work on this important topic and thanks to the @JCDLConf peers for honouring us with this award! @kmiou pic.twitter.com/ZkYN1ugBTg
— Dasha Herrmann (@robodasha) June 5, 2019

Day 3

Day 3 of JCDL 2019 began with a keynote from Dr. John Wilkin, the Dean of Libraries and University Librarian at the University of Illinois at Urbana-Champaign. His keynote, titled How do you lift an elephant with one hand?, explored the challenges overcome in building the HathiTrust Digital Library, a large-scale digital repository that offers millions of titles digitized from libraries around the world.

"How do you lift an elephant with one hand?", tells the #keynote speaker John Wilkin from @IllinoisLibrary at #JCDL2019 pic.twitter.com/0ejPIuzCpn
— Sawood Alam (@ibnesayeed) June 5, 2019

Following the keynote was an ACM Digital Library (DL) panel session titled: Towards a DL by the Communities and for the Communities. The ACM Digital Library & Technology Committee is headed by Dr. Michael Nelson and Dr. Ed Fox, and the panel session featured talks from Dr. Daqing He, Dr. Dan Wu, Wayne Graves, and Dr. Martin Klein. During the panel, Dr. He presented usage statistics of the ACM DL; Wayne Graves, Director of Information Systems at ACM, presented the redesigned ACM DL website (available soon) and received feedback on existing and future services; and Dr. Martin Klein presented Piloting a ResourceSync Interface for the ACM Digital Library. Dr. Dan Wu invited the researchers to Wuhan University, the host of the JCDL 2020 conference, and introduced the audience to the city. Subsequently, Dr. Stephen Downie gave the conference closing remarks.

@HeDaqing shows how the ACM DL is accessed. Traffic from external links (Google, Google Scholar, conf pages) dominate.#JCDL2019 pic.twitter.com/NMHs4pEa98
— Michael L. Nelson (@phonedude_mln) June 5, 2019

It looks like @acmdl is revamping its website in big ways. We got a glimpse at #JCDL2019!
— Sawood Alam (@ibnesayeed) June 5, 2019

"Piloting a ResourceSync Interface for the ACM Digital Library" by @mart1nkle1n in the #JCDL2019 panel. pic.twitter.com/ew6bAPVDCN
— Sawood Alam (@ibnesayeed) June 5, 2019

@mart1nkle1n demoing a #ResourceSync interface to the ACM DL "sandbox".

Future plans: https://t.co/lO30oVmG29 https://t.co/yFgZczTxjG

ResourceSync:https://t.co/U4ik8Mc8YB #JCDL2019 pic.twitter.com/X8RUlUg7Jj
— Michael L. Nelson (@phonedude_mln) June 5, 2019

#JCDL2020 will be in Wuhan, China!!! #JCDL2019 pic.twitter.com/JFakSW1dKa
— Martin Klein (@mart1nkle1n) June 5, 2019

I would like to thank the organizers and sponsors of the conference; the hosts, Dr. Stephen Downie and the University of Illinois at Urbana-Champaign (UIUC); and Corinna Breitinger for taking and uploading additional photos of the conference.

-- Alexander C. Nwala (@acnwala)
