2022-09-29: Theory Entity Extraction for Social and Behavioral Sciences Papers using Distant Supervision
In this blog post, I will talk about our recent paper "Theory Entity Extraction for Social and Behavioral Sciences Papers using Distant Supervision", which was published at the DocEng conference. In this paper, we proposed an automated framework based on distant supervision that leverages entity mentions from Wikipedia to build a ground truth corpus of more than 4500 automatically annotated sentences containing theory/model mentions. We compared four deep learning architectures and found that RoBERTa-BiLSTM-CRF performed best, with a precision as high as 89.72%. The code and data are publicly available on GitHub. You can also check the slides.

Introduction

Scientific literature has grown exponentially over the past decades. To understand the literature more quickly, people can review abstracts and high-level key phrases, but these do not provide enough detail. Theories and models extracted from the body text can provide more. While abstracts and