2025-07-19: 17th ACM Web Science Conference (WebSci) 2025 Trip Report
The 17th ACM Web Science Conference (WebSci 2025) was held from May 20–23 at Rutgers University in New Brunswick, New Jersey. The theme was "Maintaining a Human-Centric Web in the Era of Generative AI" and highlighted the interdisciplinary nature of Web Science, which examines the complex, reciprocal relationship between the Web and society. This trip report is authored by Kritika Garg and David Calano from the Web Science and Digital Libraries (WSDL) research group at Old Dominion University, who had the pleasure of attending and presenting at the conference.
Tuesday, May 20, 2025
On the first day of the conference, a series of workshops and tutorials were held on cutting-edge topics such as Generative AI, the Human-Centric Web, and Information Security. Tutorials included sessions on using the National Internet Observatory for collecting web data for research and exploring the Meta Content Library as a research tool. We had to choose one workshop or tutorial to attend.
Day 1 of #WebSci25 @WebSciConf kicks off with an exciting lineup of workshops and tutorials!
— Kritika garg (@kritika_garg) May 20, 2025
🔗 https://t.co/ZN7SD9vnq4
Tutorial: Beyond APIs: Collecting Web Data for Research using the National Internet Observatory
The first tutorial session was “Beyond APIs”, where members of the National Internet Observatory (NIO) at Northeastern University discussed many of the current issues in interfacing with the Web, collecting data, and the ethical concerns of data usage. We at WS-DL often face many of these same challenges when working with the APIs of various sites, such as the deprecation of the original Twitter API discussed in the tutorial. In the NIO program, users opt into the study and can both voluntarily donate their data and use mobile apps and browser extensions that monitor their Web activity, allowing researchers to find interesting patterns in user behavior and the interconnectedness of the Web.
At #WebSci2025? Join our "Beyond APIs: Collecting Web Data for Research using the National Internet Observatory" tutorial that addresses the critical challenges of web data collection in the post-API era.
📍Where: ABE 2400 (15 Seminary Place)
⏰When: Tue, May 20, 9-12.
1/2 pic.twitter.com/Expoi1BWAn
Workshop: Human-GenAI Interactions: Shaping the Future of Web Science
I, Kritika Garg, participated in the workshop “Human-GenAI Interactions: Shaping the Future of Web Science,” which showcased several fascinating studies.
Lydia Manikonda from Rensselaer Polytechnic Institute presented work on characterizing linguistic differences between human and LLM-generated text using Reddit data from r/explainlikeimfive. They prompted ChatGPT with the same questions as those posed on the subreddit, then compared the top-voted human responses with the AI-generated ones, asking whether readers could distinguish between them and identify the author.
Celia Chen and Alex Leitch from the University of Maryland discussed “Evaluating Machine Expertise,” focusing on how graduate students develop frameworks to assess GenAI content. They noted that LLM-generated content often appears authoritative even without domain expertise. Their research examines whether students build mental models to decide when and how to use LLMs and how these frameworks shift across disciplines. They found that students protect work central to their professional identity, are skeptical of academic LLM content, but trust machine outputs when they can be tested. International students often verify results across languages, such as checking first in English and then confirming in Chinese.
Alexander Bringsjord from Rensselaer Polytechnic Institute explored GenAI’s dual deception based on content and perceived intelligence, highlighting LLM hallucinations and how LLMs blend prior conversation into answers rather than accurately interpreting new documents.
Lydia Manikonda also spoke about the importance of privacy and ethical practices as more companies integrate AI into customer experiences.
Finally, Eni Mustafaraj’s reflections on the Semantic Web and the current state of AI, along with her work on Credbot, left me reflecting on how we might engage with the web and information in the future. The discussion about whether we will continue to visit web pages or shift to new modes of communication felt especially relevant and worth pondering.
How is GenAI reshaping the web and our behavior online?@maidylm, @oshaniws, and Rui Fan are leading a #WebSci25 @WebSciConf workshop on Human–GenAI Interactions: exploring ethical, social, and technical impacts on the web and its users.
🔗https://t.co/Cm6EjulLHA pic.twitter.com/DZftRzCUsV
Wednesday, May 21, 2025
The conference kicked off on Wednesday with opening remarks from General Chair Matthew Weber of Rutgers University. He welcomed attendees to New Jersey and introduced the other chairs. He shared that this year there were 149 submissions from 519 authors across 29 countries, with 59 papers accepted, resulting in an acceptance rate of 39.6%.
Day 2 at #WebSci25!
First day of the main conference. Opening remarks are happening @RutgersU now!
@lifefromalaptop @WebSciDL @WebSciConf pic.twitter.com/rKuOpELo3f
Session 1: Digital Identity & Social Systems
Ines Abbes opened Session 1 with “Early Detection of DDoS Attacks via Online Social Networks Analysis”. They proposed a BERT-based approach for early detection of DDoS attacks by analyzing user reports on Twitter, demonstrating high accuracy and outperforming existing methods. Next, Sai Keerthana Karnam presented “Social Biases in Knowledge Representations of Wikidata separates Global North from Global South.” Their work investigates social biases embedded in Wikidata’s knowledge representations, showing that geographic variations in bias reflect broader socio-economic and cultural divisions worldwide. Xinhui Chen presented “Unpacking the Dilemma: The Dual Impact of AI Instructors’ Social Presence on Learners’ Perceived Learning and Satisfaction, Mediated by the Uncanny Valley”, which explores how adding social presence to AI instructors boosts learners’ perceived learning and satisfaction but also risks triggering uncanny-valley reactions. Lastly, Ben Treves presented “VIKI: Systematic Cross-Platform Profile Inference of Tech Users”. Their work introduces VIKI, a method that analyzes and compares users’ displayed personas, such as personality traits, interests, and offensive behavior, across platforms such as GitHub, LinkedIn, and X, revealing that 78% of users significantly alter how they present themselves depending on the context.
Ines Abbes is starting the first session of #WebSci25 with the presentation “Early Detection of DDoS Attacks via Online Social Networks Analysis”.
📄 DOI: https://t.co/2jgcmJ8Oln
🔗https://t.co/rfcK4I0MSI@WebSciDL @WebSciConf @RutgersU pic.twitter.com/1KtuTPBCAp
Keynote: Mor Naaman
Mor Naaman from Cornell Tech delivered the first keynote of the conference. His talk was titled “AI Everywhere all at Once: Revisiting AI-Mediated Communication”. He reflected on how, when the concept of AI-Mediated Communication (AIMC) was first introduced in 2019, it seemed mostly theoretical and academic. However, in just a few years, AI has become deeply embedded in nearly every aspect of human communication, from personal conversations to professional work and online communities. Mor revisited key studies from the AIMC area, highlighting findings such as how suspicion of AI can undermine trust in interpersonal exchanges, and how AI assistants can subtly influence not only the language and content of our communication but even our attitudes. Given the rapid growth of AI technologies like ChatGPT, he proposed an updated understanding of AIMC’s scope and shared future research directions, while emphasizing the complex challenges we face in this evolving landscape. His talk highlighted the profound and often subtle ways AI is transforming our communication, not just in what we say, but how we think and connect with one another. It made me wonder about the future of communication as AI becomes increasingly integrated into our daily interactions, raising important questions about how we can preserve authenticity and trust amid this rapid technological rise.
Keynote by Mor Naaman (@informor) from @cornell_tech is underway.
He’s discussing AI-Mediated Communication (AIMC) once a theoretical concept, now a reality influencing how we talk, work, & connect online @WebSciConf @WebSciDL @lifefromalaptop @RutgersU pic.twitter.com/UaqRMV94Tv
Session 2: Content Analysis & User Narratives
After lunch, there were two parallel sessions, and we attended Session 2, which seemed more aligned with our interests. Jessica Costa started the session with “Characterizing YouTube’s Role in Online Gambling Promotion: A Case Study of Fortune Tiger in Brazil”, which examines how YouTube facilitates the promotion of online gambling, highlighting its societal impact and providing a robust methodology for analyzing similar platforms. Next, Aria Pessianzadeh presented “Exploring Stance on Affirmative Action Through Reddit Narratives”. This study analyzes narratives on Reddit to explore public opinions on affirmative action, revealing how users express support or opposition through personal stories and thematic framing. Ashwin Rajadesingan presented “How Personal Narratives Empower Politically Disinclined Individuals to Engage in Political Discussions”, a study showing how sharing personal stories can motivate people who typically avoid politics to join conversations on Reddit, as these stories resonate more with readers and tend to receive more positive engagement than other types of comments. Wolf-Tilo Balke concluded the session with “Scientific Accountability: Detecting Salient Features of Retracted Articles”. This study identifies key characteristics of retracted scientific articles, such as citation patterns, language features, and publication metadata, to better understand their impact and improve detection of problematic research. This work offers a new lens for thinking critically about the credibility of scientific literature, especially in an era of information overload.
Jessica Costa from @UFOP is presenting “Characterizing YouTube’s Role in Online Gambling Promotion: A Case Study of Fortune Tiger in Brazil”
📄 https://t.co/gyUcfFrWcp @WebSciDL @WebSciConf @RutgersU pic.twitter.com/r8WLmboX23
Keynote: Lee Giles
Dr. Lee Giles delivered an excellent keynote on the operation and infrastructure of Web crawlers and search engines, both general-purpose ones and those he created himself. These included numerous *Seer-variant engines, such as ChemXSeer and CiteSeerX.
Since Dr. Giles is a friend of the WS-DL research group, this talk was a nice treat for a current WS-DL student and an incredible resource for other conference participants interested in Web crawlers. In discussions with other students there, many said they had attempted to work with or create Web crawlers in the past without realizing the complexity and the challenging hurdles they would need to overcome in navigating the modern Web.
Excellent keynote session on spiders, search engines, and Web crawling from @cleegiles today at @WebSciConf #WebSci25 @kritika_garg @ibnesayeed @phonedude_mln @weiglemc @webscidl pic.twitter.com/bPvwhQn4Jy
Lightning Talks & Poster Session
The WebSci ’25 Lightning Talks were brief presentations meant to advertise the large selection of posters being presented and to attract audience members to them. As with the session and keynote talks, there was no shortage of interesting work on display.
Great posters and drinks reception @WebSciConf 2025 @RutgersU. Spot two web science founders in this photo. It’s great to see the younger generation here picking up the challenge that we laid down 20 years ago and running with it. pic.twitter.com/Rk00D6EQGB
#WebSci25 Poster Session and Welcome Reception at @RutgersU, @lifefromalaptop from @oducs & @WebSciDL is presenting his paper: “GitHub Repository Complexity Leads to Diminished Web Archive Availability.” @weiglemc @phonedude_mln @WebSciConf pic.twitter.com/nDwTIch7xa
I, David Calano, presented the poster “GitHub Repository Complexity Leads to Diminished Web Archive Availability”, which highlighted the limited availability of Web-hosted (e.g., GitHub) software repositories archived to the Wayback Machine. We looked at the page damage of archived repository landing pages and the availability of the archived source files themselves to assess the viability of rebuilding archived software projects.
Thursday, May 22, 2025
Session 4: Media Credibility & Bias
The talks from Session 4 were all keenly relevant to today’s evolving political climate. The papers presented in this session included “Unite or divide? Biased search queries and Google Search results in polarized politics” by Chau Tong. All of the papers in this session presented interesting information and findings. For example, in the case of Kai-Cheng Yang and Filippo Menczer’s paper, it is interesting to note the left-wing bias inherent in LLMs and to consider what effect such biases might have. As many Web users, particularly those of younger generations, default to consulting an LLM chatbot for information and rarely conduct further searches or analysis of sources, what happens to an already polarized society? Likewise, Chau Tong’s paper explored the topic of polarization in search engine results. The DocNet paper by Zhu et al. also provided a good technical exploration of bias detection systems leveraging AI and Python.
Session 7: Online Safety & Policy
Deanna Zarrillo presented “Facilitating Gender Diverse Authorship: A Comparative Analysis of Academic Publishers’ Name Change Policies”, which examines the publicly available name change policies of nine academic journal publishers through thematic content analysis, providing insights into how publishers handle author name changes and transparency around them. Tessa Masis presented her work, “Multilingualism, Transnationality, and K-pop in the Online #StopAsianHate Movement”, which examines how the #StopAsianHate movement used multilingual posts and K-pop fan culture to build global solidarity and amplify messages opposing anti-Asian hate across different countries and communities online.
I, Kritika Garg, had the pleasure of presenting our work, “Not Here, Go There: Analyzing Redirection Patterns on the Web”. Our research examined 11 million redirecting URIs to uncover patterns in web redirections and their implications for user experience and web performance. While half of these redirections successfully reached their intended targets, the other half led to various errors or inefficiencies, including some that exceeded recommended hop limits. Notably, the study revealed “sink” URIs, where multiple redirections converge, sometimes used for playful purposes such as Rickrolling. Additionally, it highlighted issues like “soft 404” error pages, which cause unnecessary resource consumption. The research provides valuable insights for web developers and archivists aiming to optimize website efficiency and preserve long-term content accessibility.
Come get Rickrolled by @kritika_garg from @WebSciDL during her paper presentation for "Not Here, Go There: Analyzing Redirection Patterns on the Web" at #WebSci25 ! @WebSciConf @weiglemc @phonedude_mln pic.twitter.com/NNi5UnbVj5
Mohammad Namvarpour gave the last presentation of the session, “The Evolving Landscape of Youth Online Safety: Insights from News Media Analysis”, which examines how news stories about keeping kids safe online have changed over the past 20 years, showing that recent coverage focuses more on tech companies and government rules. The authors studied news articles to understand how the conversation about youth online safety has evolved.
Session 9: Contemporary Issues in Social Media
The papers from Session 9 explored a wide range of topics across social media, from war and news to mental health and safety. “A Call to Arms: Automated Methods for Identifying Weapons in Social Media Analysis of Conflict Zones” by Abedin et al. presented an interesting framework for identifying and tracking weapons in active conflict zones through social media platforms such as Telegram. Their work relies heavily on computer vision and open-source datasets and provides a window into the scale and lethality of ongoing conflicts. The paper by Saxena et al., “Understanding Narratives of Trauma in Social Media”, was incredibly valuable in discussing the effects of trauma and social media on mental health.
Web Science Panel
The Web Science panel consisted of Dame Wendy Hall, Dr. Jim Hendler, Dr. Matthew Weber, Dr. Wolf-Tilo Balke, Dr. Marlon Twyman II, and Dr. Oshani Seneviratne. Although the panel ran a little over time and few questions could be asked during the session, many were discussed at the reception afterward. It was a treat to hear from some of the key founders of the field of Web Science and core creators of the World Wide Web at large. The panel topics and moderated questions covered a broad range of topics across the spectrum of Web Science, and it was great to hear the thoughts of such key figures on issues related to social media, AI, political governance, the Semantic Web, and the broad applications of Communication and Social Science to the World Wide Web. Dame Hall and Dr. Hendler also discussed the Web Science Trust, which seeks to advance the field of Web Science and bring together researchers from across the globe.
Web Science Panel responding to attendee questions on a wide range of Web Science topics
Friday, May 23, 2025
Session 10: Platform Governance & User Safety
Session 10 also had a decent variety in terms of content. Two of our favorite papers presented were “Decentralized Discourse: Interaction Dynamics on Mastodon” by Brauweiler et al. and “Is it safe? Analysis of Live Streams Targeted at Kids on Twitch.tv” by Silva et al. Many of the WS-DL members are fans of new, unique, experimental, and decentralized Web tools and social platforms. Some of our members are active in various Mastodon communities and have even run their own instances. It was exciting to hear that some researchers are utilizing Mastodon and other social platforms and to learn how they tackled many of the technical challenges these platforms present.
Like the work of Saxena et al. from Session 9, the work by Silva et al. in researching child safety on the popular streaming platform Twitch is also of great importance for the health and wellbeing of the younger Web population. They found that Twitch currently has only minimal options in place for marking and filtering adult content, and only for select forms of media. Channels are self-reported as being for an adult audience rather than automatically tagged as such. Furthermore, even if content is not marked for an adult audience, or is explicitly marked for kids or a younger audience, there is no guarantee that the language used by the streamer or the topics discussed in chat will be suitable for younger audiences, except through voluntary moderation.
Final session of the @WebSciConf, “Platform Governance & User Safety,” is happening now!
Chaired by @ibnesayeed @lifefromalaptop @WebSciDL @RutgersU
Closing Keynote: Dame Wendy Hall
Dame Wendy Hall’s closing keynote was an excellent look through the history of Artificial Intelligence and its relation to the Web. It served as an excellent reminder that progress is not always constant and that we tend to alternate between periods of uncertainty and rapid progress that can often blind us to potential hazards. It was also a reminder of how much Artificial Intelligence relies on the World Wide Web, its users surfing the waves of hyperspace, and the information they share along the way. The collective information of the Web is what AI is built from; without the input of billions of people around the world, there would be no substance to it. Other great points from the talk concerned the dangers and politics surrounding AI research, development, and utilization, in particular how much power and control we allow AI to have in our global society and the state of global cooperation (or lack thereof) on AI regulation. The points of this keynote were extremely relevant given the simultaneous release of Anthropic’s Claude 4 LLM, which in testing was found to engage in blackmail, whistleblowing, and other interesting behaviors.
Dame Wendy Hall (@DameWendyDBE) from @unisouthampton is now delivering the closing keynote: “Generative AI: Fact or Fiction”
She is unpacking the global hype around GenAI & asking what’s real, what’s not, what comes next #AI @WebSciDL @WebSciConf @RutgersU pic.twitter.com/5dOmmAAQZV
Conference closing
Despite the week’s rainy weather, the conference was well-organized, stimulating, and rewarding. For some, this was a return to a familiar community, while for us it was a valuable first in-person conference experience. The opportunity to exchange ideas with colleagues from industry and academia worldwide was truly worthwhile. The dinner at the Rutgers Club was a fitting conclusion, providing space to connect before departing. With the next conference scheduled for Germany, we look forward to continuing these conversations there. Many thanks to the organizers for putting together an excellent event.
Snapshots from our trip: Kritika and David presenting at WebSci 2025, meeting Dame Wendy Hall and Dr. Jim Hendler, the must-have ODU WSDL group photo with our alumnus Dr. Sawood Alam, and a scenic drive back to Virginia
- Kritika Garg (@kritika_garg) and David Calano (@lifefromalaptop)