2025-09-28: Trip report: Dagstuhl Seminar 25382: Open Scholarly Information Systems: Status Quo, Challenges, Opportunities
From September 14 to 19, I had the honor of co-organizing and participating in the Dagstuhl Seminar titled Open Scholarly Information Systems: Status Quo, Challenges, Opportunities. The seminar was held at Schloss Dagstuhl, a small castle-like computer science center in western Germany.

What is unique about Dagstuhl is that it is located in the woods, far from any big city. The nearest town, Wadern, is 30 minutes away on foot, and there is no public transportation. After landing at the huge Frankfurt airport, one has to take a train for 2 hours from the regional train station to Türkismühle and then take a taxi for the final leg. Following the recommendation on the Dagstuhl website, I reserved a taxi from Taxi Martin at least 3 days in advance. Of course, I had to reserve one again for my departure at 4 am!
The center consists of two parts: an old building, built 260 years ago, and a new building, built in 2001. The two are connected by a bridge on the second floor. In addition to lecture rooms and living rooms, the old building has a dining room, a kitchen, a piano room, and a backyard, which is very nice for afternoon teas and outdoor discussions. In addition to lecture rooms and living rooms, the new building has a laundry room, a sauna, and a gym. Both buildings are covered by Wi-Fi. Coffee (including espresso drinks), sparkling water, and wine are available 24/7. The center has everything you need for research, except cars. You have to walk if you need to get out.
The core organizing team consisted of five people from around the world: Hannah Bast (chair, Germany), Marcel Ackermann (coordinator, Germany), Guillaume Cabanac (co-chair, France), Paolo Manghi (co-chair, Italy), and me (co-chair, United States). We started drafting the proposal back in March 2023. The original goal was to celebrate the 32nd anniversary of DBLP, a legendary digital library for computer and information sciences. Later on, the plan evolved into assembling about 40 scholars to discuss a broader topic: open scholarly information systems (OSIs). To carefully control the size of the seminar and guarantee attendance, the organizing team sent invitations in at least 3 rounds. The invitees included PIs, core technical people, or directors of well-known digital libraries (e.g., Google Scholar, arXiv, CORE, OpenAIRE, OpenReview, NDLTD, and CiteSeerX), researchers in particular domains (e.g., natural language processing, semantic web, digital libraries, information retrieval), and companies working with open data (e.g., Digital Science).
Unlike typical computer science conferences and workshops, the seminar consisted of short talks (10 to 20 minutes each) with plenty of time for plenary and small-group discussions and social activities. The final report is edited collaboratively. The activities of each day are outlined below.
Monday
- Welcome + Introduction (Hannah Bast)
- Lightning Talks (Everyone)
- Talks
  - Digital Science & the Research Data Ecosystem (Kathryn Weber-Boer)
  - Overview of COnnecting REpositories (CORE) (Petr Knoth)
  - Google Dataset Search (Natasha Noy)
- Lunch break
- Talks
  - CiteSeerX and NDLTD (Jian Wu)
  - Zenodo and InvenioRDM: Cross-domain digital repositories for the long tail of research (Martin Fenner)
  - Curating the dblp computer science bibliography (Marcel R. Ackermann)
- Conclusion of Day 1 + Discussion about how to continue
- After dinner discussion
Tuesday
- Talks
  - Research Fast and Slow (Min-Yen Kan)
  - FAIR Digital Objects as Scholarly Infrastructure Metadata Middleware (Carole Goble)
  - Citations in the world’s largest encyclopedia - and their future (Phoebe Ayers and Lydia Pintscher)
  - Collaborative curation of bibliographic metadata (Daniel Mietchen)
- Find Topics for Working Groups (Everyone)
- Wrap-up Working Group descriptions
- Lunch break
- Working groups, Session 1 (Everyone)
- Working groups, Session 2 (Everyone)
- Working groups, plenary (Everyone)
- After dinner discussion
Wednesday
- Talks
  - Please meet AI, our dear new colleague: Are we becoming obsolete? (Iryna Gurevych)
  - Openness aspects of scholarly information systems important for adoption, transparency and interoperability (Bianca Kramer)
  - Scholarly Knowledge Graphs: No community, no fun (Paolo Manghi)
  - Challenges and opportunities in arXiv (Ramin Zabih)
- Working groups
  - Sustainability for Open Scholarly Infrastructures (Everyone)
- Lunch break
- Social activity
  - Premium Hike around Dagstuhl for Everyone
- After dinner discussion
Thursday
- Talks
  - Narrative Information Access in Digital Libraries (Wolf-Tilo Balke)
  - OpenCitations and its new IT infrastructure (Mario Petrella)
  - The AIDA Dashboard (Angelo Salatino)
  - Navigating Scholarly Knowledge (Tilahun Abedissa Taffa)
- Working groups
  - Research in the Age of Scientific Agentic AI (Everyone)
- Lunch break
- Almost free discussions, kickoff
- Almost free discussions, wrap-up
- After dinner discussion
Friday
- Talks
  - Knowledge Graphs and QLever (Hannah Bast)
  - TIB AI Assistant for Research (Sahar Vahdati)
- Working groups
  - Consolidation (co-editing the manifesto)
- Lunch break
- Departure for the University of Trier to attend the DBLP 32nd anniversary (Keynote by Carole Goble, titled "Open Scholarly Infrastructure: Transformation in a Shifting Landscape")
The manifesto summarized the main topics, conclusions, and next steps discussed in working groups. The table of contents is shown below.
- OpenReview and DBLP
- Barcelona Declaration, arguments for open scholarly infrastructure
- Barcelona Declaration, signatories
- Wikidata and QLever
- oAsIs - The role of open agentic scholarly information systems in the age of agentic AI
- Collaborative Metadata
- Author name disambiguation
- Harmonize Subject Area Classification
- CORE to Commons workflow
- Assessing overlaps between Dimensions and Wikidata
- Schloss Singapura on AI
- Modal µ-calculus in SPARQL
- Do we need rankings OR How to change the current perverse system?
- CS Ontology and DBLP
- ACL Anthology
- Fake conference metadata discussion
- Future of bibliographic metadata in Wikidata
- Working with Scholarly Metadata Dumps
- Perspectives on How to Achieve the Sustainability of Open Scholarly Infrastructure (OSI)
- Software tagging in DBLP
- Nanopublications to track acknowledgements of DBLP/ Dagstuhl
- Find Ghost #96
As an example, the summary of one working group is given below.
- Topic: Perspectives on How to Achieve the Sustainability of Open Scholarly Infrastructure (OSI) (see also Barcelona Declaration, arguments for open scholarly infrastructure)
- People Involved: Jian Wu, Petr Knoth, Lynda Hardman, Carole Goble, Bianca Kramer, Daniel Mietchen, Martin Fenner, Nees Jan van Eck, Paolo Manghi, Mario Petrella, etc.
- Summary of Outcome: In general, the group discussion on OSI sustainability indicated that this problem has attracted the attention of the open scholarship community, and affirmed the initial concerns in a broader context. The main outcomes are itemized below.
  - Several OSIs (e.g., CiteSeerX, NDLTD, CORE) are facing sustainability challenges in the foreseeable or longer-term future due to financial, administrative, and/or human resource issues, which may lead to the loss of valuable data, software, and services to the scholarly communities.
  - In future proposals to secure financial support for OSIs, PIs are advised to emphasize how the OSIs can bring social and economic impacts that are well aligned with national priorities (e.g., digital sovereignty, security, AI) or the priorities of private funding institutions, instead of only justifying the needs from the researchers' points of view.
  - Consolidation of under-supported OSIs may be necessary to sustain the data, software, and services. PIs are also advised to consider business models (e.g., donations, subscriptions) to support the growth and maintenance of OSIs.
  - New OSIs should avoid duplicating the data and services of existing OSIs and should take a longer-term view of sustainability.
- Next Steps:
  - Collaboration with other OSIs (e.g., CORE) for research and developmental work on scholarly big data.
The seminar was highly rated by the participants, not only because of the free food, drink, and lodging but also because of the short talks and ample time for free-form discussions. Thanks to the chair, Hannah Bast! This format proved more effective for in-depth discussions that go into detail and cover many aspects of concrete problems. This is in contrast with most computer and information science conferences, whose pre-scheduled 20-minute or 30-minute presentations appear to cover a lot of material but in fact make it easy for attendees to get bored. These presentations are usually followed by very little time for Q&A and discussion. Most of the time, the session chair has to cut the discussion short and suggest continuing it "offline", which rarely happens.
At the celebration of DBLP's 32nd anniversary, Dr. Michael Ley announced his retirement. Many people agreed that the retirements of C. Lee Giles, Ed A. Fox, and Michael Ley mark the end of an era of digital libraries. DBLP will have a new director.
The food at Dagstuhl was VERY good: home-made, tasty, fresh, and healthy. The honey is produced by Dagstuhl's own beehives.
Finally, here is the featured picture of this seminar. How many people can you recognize? Some of them are well-known in their domains.
-- Jian Wu