Web Science and Digital Libraries Research Group

2024-12-31: Benchmark: Whether LLM agents can evaluate, replicate, and independently conduct the research process

By Jian Wu - December 31, 2024

I am excited to announce that Old Dominion University (ODU) is part of the multi-university grant awarded by the Open Philanthropy Foundation to support the development of a systematic benchmark assessing how effectively large language models (LLMs) can evaluate, replicate, and conduct scientific research. The lead institution is the Center for Open Science (CoS; Dr. Brian Nosek and Dr. Tim Errington), and the participating institutions are Pennsylvania State University (Dr. Sarah Rajtmajer, Dr. Qingyun Wu), the University of Notre Dame (Dr. Meng Jiang), and ODU (Dr. Jian Wu, myself). The team will test whether LLMs can determine whether claims are true or false. Here, a claim is a statement that conveys a research finding in a scientific paper. We operationalize this question as whether LLMs can assess a scientific paper and predict whether its primary findings would replicate or reproduce successfully in an independent test. In the fund...