2022-07-27: Web Archiving and Digital Libraries (WADL) Workshop 2022 Trip Report
#JCDL2022 the "Web Archiving and Digital Libraries (WADL) 2022" has begun!!
— kritika garg (@kritika_garg) June 24, 2022
Program Schedule: https://t.co/2j4ukN0M22 @jcdl2022 pic.twitter.com/XIfPkD9944
Talks 1
I was pleased to talk today about the Croatian Web Archive #HAW_NSK at the #WADL2022 #JCDL2022
— Karolina Holub (@KarolHolu) June 24, 2022
Many thanks to all co-chairs for an invite! @mart1nkle1n @jcdl2022 @NSK_Zagreb #webarchiving
More about HAW: https://t.co/Na4vhnHq2s
Talks 2
Yousef Younes from the GESIS Leibniz Institute for the Social Sciences gave a talk on "Where Are the Datasets? A case study on the German Academic Web Archive". The case study addressed the research question, “How to find references to research datasets using web archives?” To find datasets, they looked at identifiers such as DOI, URL, and title. They also investigated how the volume of referenced datasets changed over time.
The Talks 2 session in the #WADL2022 workshop at @jcdl2022 has started with Yousef Younes's talk on "Where are the Datasets? A case study on the German Academic Web Archive" #JCDL2022 pic.twitter.com/XeEoYWXjNa
— Yasasi (@Yasasi_Abey) June 24, 2022
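As a rough illustration of identifier-based matching, here is a minimal sketch that pulls DOI-like strings out of page text. The pattern is modelled on a commonly used DOI regular expression and is an assumption on my part; the talk summary does not say how the authors matched DOIs, and `find_dois` is a hypothetical helper name.

```python
import re

# Assumed pattern, modelled on a commonly used DOI regex.
# It is NOT the matching rule from the talk, which this post does not describe.
DOI_PATTERN = re.compile(r"\b10\.\d{4,9}/[-._;()/:A-Za-z0-9]+")


def find_dois(text):
    """Return distinct DOI-like strings in text, in order of first appearance."""
    seen = []
    for match in DOI_PATTERN.finditer(text):
        # Drop trailing sentence punctuation; this can also trim a DOI that
        # legitimately ends in one of these characters.
        doi = match.group(0).rstrip(".,;:)")
        if doi not in seen:
            seen.append(doi)
    return seen
```

Calling `find_dois` on the extracted text of each archived page would give one candidate list per capture, which could then be counted per year to look at changes over time.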
Himarsha Jayanetta from WS-DL, Old Dominion University, presented our work on "Comparison of Access Patterns of Robots and Humans in Web Archives." This work extends Dr. Yasmin AlNoamany's research. We analyzed anonymized web access logs from the Internet Archive (IA) and Arquivo.pt to detect how bots and humans access web archive holdings. We used various heuristics to identify robot sessions. We found that 88% of sessions were robots in the IA 2012 dataset, 70% in the IA 2019 dataset, and 97% in the Arquivo.pt 2019 dataset. This work has been accepted for publication at the 26th International Conference on Theory and Practice of Digital Libraries 2022.
#WADL2022 @HimarshaJ from @WebSciDL is presenting "Comparison of Access Patterns of Robots and Humans in Web Archives" work by her and @kritika_garg @ibnesayeed @phonedude_mln @weiglemc
— Shawn M. Jones, PhD (@shawnmjones) June 24, 2022
Robots behave differently than humans and we can detect this. pic.twitter.com/XdBWALcaqw
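This post does not list the heuristics we used, so the following is only a minimal sketch of the general idea, assuming requests have already been grouped into sessions. The `Session` fields, the user-agent keyword list, and the request-rate threshold are all hypothetical illustrations and not the rules from the paper.

```python
from dataclasses import dataclass, field

# Hypothetical keyword list; the study's actual heuristics are not given here.
BOT_UA_KEYWORDS = ("bot", "crawler", "spider", "curl", "wget", "python-requests")


@dataclass
class Session:
    user_agent: str
    paths: list = field(default_factory=list)  # requested URL paths
    duration_seconds: float = 0.0


def looks_like_robot(session, max_requests_per_second=2.0):
    """Return True if any of the illustrative heuristics flags the session."""
    ua = session.user_agent.lower()
    # Heuristic 1: the User-Agent string self-identifies as automation.
    if any(keyword in ua for keyword in BOT_UA_KEYWORDS):
        return True
    # Heuristic 2: the client requested robots.txt, which browsers rarely do.
    if any(path.endswith("/robots.txt") for path in session.paths):
        return True
    # Heuristic 3: sustained request rate above an assumed threshold.
    if session.duration_seconds > 0:
        rate = len(session.paths) / session.duration_seconds
        if rate > max_requests_per_second:
            return True
    return False


def robot_share(sessions):
    """Percentage of sessions flagged as robots (0.0 for an empty list)."""
    if not sessions:
        return 0.0
    flagged = sum(1 for s in sessions if looks_like_robot(s))
    return 100.0 * flagged / len(sessions)
```

A session is flagged as soon as one heuristic matches, so `robot_share` reports the percentage of sessions caught by at least one rule.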
Dr. Sawood Alam from the Internet Archive, who is also a WS-DL alumnus, presented "Wayback Machine Video Archiving Insights." He demonstrated a dashboard that provides insights into videos archived by the Internet Archive. For a particular day, the dashboard displays the number of videos archived, the duration of videos, and the longest video archived. It also shows a word cloud of the top tags associated with the videos, the top languages, an analysis of video durations, and the top 100 uploaders.
@ibnesayeed from @internetarchive (also @WebSciDL alum) presented "Wayback Machine Video Archiving Insights" work by him, Bill O'Connor, @MarkGraham
— Shawn M. Jones, PhD (@shawnmjones) June 24, 2022
highlighting their Video Archiving Insights dashboard that provides visibility into what has been archived. #WADL2022 pic.twitter.com/PPoIoEO4pM
I, Kritika Garg from WS-DL, Old Dominion University, presented our work on “Optimizing Archival Replay by Eliminating Unnecessary Traffic to Web Archives.” We demonstrated how replaying an archived web page with carousels, widgets, etc., can generate wasteful requests. For instance, we showed a memento making 1098 requests per minute. We created a minimal reproducible example to show how missing embedded resources cause recurring requests to the web archive server. We demonstrated that we could mitigate the unnecessary requests by sending the 404 responses with a Cache-Control header.
#WADL2022 @kritika_garg from @WebSciDL presented "Optimizing Archival Replay by Eliminating Unnecessary Traffic to Web Archives" by her, @HimarshaJ @ibnesayeed @phonedude_mln @weiglemc
— Shawn M. Jones, PhD (@shawnmjones) June 24, 2022
Improving web archive access by improving performance for end users. pic.twitter.com/fJULgEuBrs
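To make the mitigation concrete, here is a minimal sketch of a server that returns a 404 response with a Cache-Control header, using only Python's standard library. The `max-age` value, the `public` directive, and the names `not_found_headers`, `ReplayHandler`, and `make_server` are assumptions for illustration; this is not the configuration of any production web archive.

```python
from http.server import BaseHTTPRequestHandler, HTTPServer

# Assumed cache lifetime in seconds; the talk summary does not state a value.
NOT_FOUND_MAX_AGE = 3600


def not_found_headers(max_age=NOT_FOUND_MAX_AGE):
    """Headers for a 404 that the browser may cache instead of re-requesting."""
    return {
        "Cache-Control": f"public, max-age={max_age}",
        "Content-Type": "text/plain; charset=utf-8",
    }


class ReplayHandler(BaseHTTPRequestHandler):
    """Toy replay endpoint that treats every resource as missing."""

    def do_GET(self):
        body = b"Resource not archived\n"
        self.send_response(404)
        for name, value in not_found_headers().items():
            self.send_header(name, value)
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # keep the sketch quiet


def make_server(host="127.0.0.1", port=8000):
    """Build (but do not start) a server; call .serve_forever() to run it."""
    return HTTPServer((host, port), ReplayHandler)
```

Running `make_server().serve_forever()` and repeatedly loading a missing image in a browser should show the later requests being answered from the browser cache rather than reaching the server, which is the effect the mitigation relies on.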
Talks 3
The Talks 3 session #WADL2022 has started with a talk on "Emulation-based long-term Access to Complex Web-sites" by Marcel Tschöpe, Rafael Gieschke and Klaus Rechert (@kurau5u) #JCDL2022 #WADL2022
— kritika garg (@kritika_garg) June 24, 2022
Related blog: https://t.co/HFyMv5BxF6 @jcdl2022 pic.twitter.com/fIL1NIdN2m
#WADL2022 @TReid803 is presenting "Web Archiving as Entertainment" exploring how to integrate gaming with web archiving work with @phonedude_mln @weiglemc
— Shawn M. Jones, PhD (@shawnmjones) June 24, 2022
Work funded by @NetPreserve: https://t.co/iVXbxWEvgQ pic.twitter.com/Ebh1lGUSHJ
Just wrapped up our presentation at the Web Archiving & Digital Libraries 2022 Workshop (#wadl2022) titled, "First steps in Identifying Academic Migration using Memento and Quasi-Canonicalization".
— Mat Kelly (@machawk1) June 24, 2022
Here are the slides: https://t.co/X7gtqnHcTw #memento #webarchiving pic.twitter.com/jnfAMFkGpl
Talks 4
.@librariancarrie and @erica_peaslee are presenting the SUCHO project where 1300+ people banded together to preserve Ukrainian cultural heritage online. They use @internetarchive and @webrecorder_io tools to archive the Ukrainian sites. #WADL2022 @jcdl2022 #JCDL2022 pic.twitter.com/jOG5nMlqW3
— kritika garg (@kritika_garg) June 24, 2022
Talks 5
.@ibnesayeed (also @WebSciDL alum) and MarkGraham from @internetarchive are now presenting "CDX Summary for Web Archival Collection Insights" at #WADL2022.
— kritika garg (@kritika_garg) June 24, 2022
CDX Summary is a tool to summarize web archive capture index (CDX) files. Tool: https://t.co/hs34VNCwCj
@jcdl2022 @oducs pic.twitter.com/gd1LiNHJAe
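To give a feel for what summarizing a CDX file involves, here is a minimal sketch that counts captures by MIME type, status code, and year. It is not the CDX Summary tool's code or output format. It assumes space-separated CDX lines whose first five fields are URL key, timestamp, original URL, MIME type, and status code, and `summarize_cdx` is a hypothetical name.

```python
from collections import Counter


def summarize_cdx(lines):
    """Count captures per MIME type, status code, and capture year.

    Assumes space-separated CDX lines whose first five fields are
    urlkey, timestamp, original URL, MIME type, and status code.
    """
    mime_types, statuses, years = Counter(), Counter(), Counter()
    for line in lines:
        line = line.strip()
        # Skip blank lines and the optional " CDX ..." header line.
        if not line or line.startswith("CDX"):
            continue
        fields = line.split(" ")
        if len(fields) < 5:
            continue  # malformed record; ignored in this sketch
        _urlkey, timestamp, _original, mime, status = fields[:5]
        mime_types[mime] += 1
        statuses[status] += 1
        years[timestamp[:4]] += 1
    return {"mime": mime_types, "status": statuses, "year": years}
```

Passing an open file object works as well as a list of strings, since the function only iterates over lines.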
.@grantcatkins (@WebSciDL alum) is presenting a talk on "Russia-Ukraine News on the Dark Web" at #WADL2022. @japharl @justinfbrunelle @jcdl2022 #JCDL2022 pic.twitter.com/KKhot4hkmG
— kritika garg (@kritika_garg) June 24, 2022
Scholars are increasingly citing repos in Git Hosting Platforms, but they aren't permanent. And preserving the code as a stand alone product isn't enough.
— Emily Escamilla (@EmilyEscamilla_) June 24, 2022
We need to archive the issues/wikis/pull requests that aid in reproducibility https://t.co/fqMRFnZjLo
Talks 6
It's been phenomenal to watch the collaboration between @helgeho & @ruebot as they've developed ARCH (@unleasharchives + @archiveitorg integration solution to analyze #webarchives)!
— The Archives Unleashed Project (@unleasharchives) June 24, 2022
Happening now @ #WADL2022 a @ruebot DEMO which tours through features + functionalities of ARCH pic.twitter.com/ivYj6Xi7OR
.@IlyaKreymer is presenting "WACZ" work by him, @edsu, and Cade Diehm at #WADL2022.
— kritika garg (@kritika_garg) June 24, 2022
Web Archive Collection Zipped (WACZ) format that allows web archives to be shared and distributed.
py-wacz: https://t.co/1UJaXF1Uaw #JCDL2022 @jcdl2022 pic.twitter.com/orVlVwt1jz
#WADL2022 @vphill is presenting "Moving the End of Term Web Archive to the Cloud to Encourage Research Use and Reuse" work with @ibnesayeed
— Shawn M. Jones, PhD (@shawnmjones) June 24, 2022
"We are also focusing on computational consumption of the collection rather than just replay."
For more info: https://t.co/H6PpNmtIba pic.twitter.com/rGchhXhZPf
Closing Session
#WADL2022 has come to a close. Thanks to all speakers, attendees, note-takers, tweeters (@WebSciDL), and our invited presenters @KarolHolu, @erica_peaslee, @librariancarrie @NSK_Zagreb @sucho_org #JCDL2022
— Martin Klein (@mart1nkle1n) June 24, 2022
#JCDL2022 came to an end with the closing of WADL. Thank you all for making @jcdl2022 a success. You all have a save trip home and looking forward to seeing you soon.
— JCDL2022 (@jcdl2022) June 25, 2022
The organizing team https://t.co/x6oWlAoGEw