2025-09-28: Trip report: Dagstuhl Seminar 25382: Open Scholarly Information Systems: Status Quo, Challenges, Opportunities


From September 14 to 19, I was honored to organize and participate in the Dagstuhl Seminar titled Open Scholarly Information Systems: Status Quo, Challenges, Opportunities. The seminar was held at Schloss Dagstuhl, a small castle-like computer science center in the west of Germany.


What is unique about Dagstuhl is that it is located in the woods, far from any big city. The nearest town, Wadern, is 30 minutes away on foot, and there is no public transportation. After landing at the huge Frankfurt airport, one has to take a train for 2 hours from the regional train station to Türkismühle and then a taxi for the rest of the way. Based on the recommendation on the Dagstuhl website, I reserved a taxi from Taxi Martin at least 3 days in advance. Of course, I had to reserve one again for my departure at 4 am!

The center consists of two parts: an old building, which was built 260 years ago, and a new building, built in 2001. The two buildings are connected by a bridge on the second floor. In addition to lecture rooms and living rooms, the old building has a dining room, a kitchen, a piano room, and a backyard, which is very nice for afternoon teas and outdoor discussions. In addition to lecture rooms and living rooms, the new building has a laundry room, a sauna, and a gym. Both buildings are covered by Wi-Fi. Coffee (including espresso drinks), sparkling water, and wine are available 24/7. The center has everything you need for research, except cars. You have to walk if you need to get out.

The core organization team consisted of five people from around the world: Hannah Bast (chair, Germany), Marcel Ackermann (coordinator, Germany), Guillaume Cabanac (co-chair, France), Paolo Manghi (co-chair, Italy), and me (co-chair, United States). We started drafting the proposal back in March 2023. The original goal was to celebrate the 32nd anniversary of DBLP, a legendary digital library for computer and information sciences. Later on, the plan evolved to assemble about 40 scholars to discuss a broader topic: open scholarly information systems (OSIs). To carefully control the size of the seminar and guarantee attendance, the organization team sent invitations in at least 3 rounds. The invitees included PIs, core technical staff, and directors of well-known digital libraries (e.g., Google Scholar, arXiv, CORE, OpenAIRE, OpenReview, NDLTD, and CiteSeerX), researchers in particular domains (e.g., natural language processing, semantic web, digital libraries, information retrieval), and software companies working with open data (e.g., Digital Science).

Unlike typical computer science conferences and workshops, the seminar consisted of short talks (10 to 20 minutes) with plenty of time for plenary and small-group discussions and social activities. The final report is edited collaboratively. The activities of each day are outlined below.

Monday

Tuesday 

  • Talks
  • Find Topics for Working Groups (Everyone) 
  • Wrap-up Working Group descriptions
  • Lunch break
    • Working groups, Session 1 (Everyone) 
    • Working groups, Session 2 (Everyone)
    • Working groups, plenary (Everyone) 
  • After dinner discussion 

Wednesday 

  • Talks
    • Please meet AI, our dear new colleague: Are we becoming obsolete? (Iryna Gurevych)
    • Openness aspects of scholarly information systems important for adoption, transparency and interoperability (Bianca Kramer)
    • Scholarly Knowledge Graphs: No community, no fun (Paolo Manghi)
    • Challenges and opportunities in arXiv (Ramin Zabih)
  • Working groups
    • Sustainability for Open Scholarly Infrastructures (everyone)
  • Lunch break
  • Social activity
    • Premium Hike around Dagstuhl for Everyone
  • After dinner discussion 

Thursday 

  • Talks
  • Working groups
    • Research in the Age of Scientific Agentic AI (everyone) 
  • Lunch break
  • Almost free discussions, kickoff
  • Almost free discussions, wrap-up
  • After dinner discussion 

Friday

The manifesto summarized the main topics, conclusions, and next steps discussed in working groups. The table of contents is shown below. 

  1. OpenReview and DBLP
  2. Barcelona Declaration, arguments for open scholarly infrastructure
  3. Barcelona Declaration, signatories
  4. Wikidata and QLever
  5. oAsIs - The role of open agentic scholarly information systems in the age of agentic AI
  6. Collaborative Metadata
  7. Author name disambiguation
  8. Harmonize Subject Area Classification
  9. CORE to Commons workflow
  10. Assessing overlaps between Dimensions and Wikidata
  11. Schloss Singapura on AI
  12. Modal µ-calculus in SPARQL
  13. Do we need rankings OR How to change the current perverse system?
  14. CS Ontology and DBLP
  15. ACL Anthology
  16. Fake conference metadata discussion
  17. Future of bibliographic metadata in Wikidata
  18. Working with Scholarly Metadata Dumps
  19. Perspectives on How to Achieve the Sustainability of Open Scholarly Infrastructure (OSI)
  20. Software tagging in DBLP
  21. Nanopublications to track acknowledgements of DBLP/ Dagstuhl
  22. Find Ghost #96

Take one topic as an example. The manifesto entry for it contains the following.

  • Topic: Perspectives on How to Achieve the Sustainability of Open Scholarly Infrastructure (OSI) (See also Barcelona Declaration, arguments for open scholarly infrastructure)
  • People Involved: Jian Wu, Petr Knoth, Lynda Hardman, Carole Goble, Bianca Kramer, Daniel Mietchen, Martin Fenner, Nees Jan van Eck, Paolo Manghi, Mario Petrella, etc. 
  • Summary of Outcome: 
    • In general, the group discussion on OSI sustainability indicated that this problem has drawn the attention of the open scholarship community, and it affirmed the initial concerns in a broader context. The main outcomes are itemized below.
    • Several OSIs (e.g., CiteSeerX, NDLTD, CORE) are facing sustainability challenges in the foreseeable or longer-term future due to financial, administrative, and/or human resource issues, which may lead to the loss of valuable data, software, and services to the scholarly communities.
    • In future proposals to secure financial support for OSIs, PIs are advised to emphasize how the OSIs can bring social and economic impacts that are well aligned with national priorities (e.g., digital sovereignty, security, AI) or the priorities of private funding institutions, instead of only justifying the needs from the researchers' points of view.
    • Consolidation of under-supported OSIs may be necessary to sustain the data, software, and services. PIs are also advised to consider business models (e.g., donations, subscriptions) to support the growth and maintenance of OSIs.
    • New OSIs should avoid duplicating the data and services of existing OSIs and should have a longer-term vision of sustainability.
  • Next Steps:
    • Collaboration with other OSIs (e.g., CORE) for research and developmental work on scholarly big data.

The last topic is "Find Ghost #96", which originated from a tradition of capturing the "Ghost" located everywhere in the Dagstuhl Castle! If you want to know the details, come and visit Dagstuhl!

The seminar was highly rated by the participants, not only because of the free food, drink, and lodging but also because of the short talks and sufficient time for free-form discussions. Thanks to the chair, Hannah Bast! This format proved more efficient for scholars to have in-depth discussions that go into detail and cover many aspects of concrete problems. This is in contrast with most computer and information science conferences, with their pre-scheduled 20-minute or 30-minute presentations, which appear to cover a lot of material but in fact make it easy for attendees to get bored. These presentations are usually followed by very little time for Q&A and discussion. Most of the time, the session chair has to cut the discussion short and suggest continuing it "offline", which rarely happens.

On the 32nd anniversary of DBLP, Dr. Michael Ley announced his retirement. Many people agreed that with the retirements of C. Lee Giles, Ed A. Fox, and Michael Ley, this marks the end of an era of digital libraries. DBLP will have a new director. 

The food at Dagstuhl was VERY good, featuring homemade dishes that were tasty, fresh, and healthy. The honey is produced by Dagstuhl's own bees.

Finally, here is the featured picture of this seminar. How many people can you recognize? Some of them are well-known in their domains.



-- Jian Wu

