2022-12-12: Trip Report -- Visit to Virginia Polytechnic Institute and State University (Virginia Tech)
I was invited to give a graduate seminar at Virginia Tech yesterday titled “Towards automatically understanding scientific papers.” @WebSciDL @virginia_tech @sudobear @edwardafox Thanks @TasinChoudhury for taking the picture. pic.twitter.com/ygkOp1U9cF
— Jian Wu (@fanchyna) November 12, 2022
CS 5604: Information Storage and Retrieval
A team from course 5604: Information Storage and Retrieval presented and provided demo of Electronic Theses or Dissertations Search Engine. The team is divided into five groups led by senior PhD students at DLRL. pic.twitter.com/cpQxSBxIXy
— Muntabir Hasan Choudhury (@TasinChoudhury) November 14, 2022
Team 3 presented object detection in ETD which utilize YOLOv7. Team 4 was responsible for language model which produce chapter summary and perform classification. Lastly, team 5 was responsible for dockerixing the code, automation using Apache airflow and DAG. pic.twitter.com/e3OmYD3XEZ
— Muntabir Hasan Choudhury (@TasinChoudhury) November 14, 2022
Graduate Seminar Talk by Dr. Jian Wu
— Muntabir Hasan Choudhury (@TasinChoudhury) November 11, 2022
Presenting different published papers to address the crisis. pic.twitter.com/L4hqX0zrch
— Muntabir Hasan Choudhury (@TasinChoudhury) November 11, 2022
Information Extraction From Scientific Papers
Reproducibility Assessment
Scientific Disinformation
He introduced the "Searching for Evidence of Scientific News in Scholarly Big Data" paper to automatically find the most relevant scholarly papers given a scientific news article.
He presented an assessment titled "Scientific Claim Verification Frameworks," where the goal was to perform a generalizability test of state-of-the-art fact-checking models on scientific news.
He also talked about his ongoing research on developing a robust model to verify the soundness of scientific news with research claims.
Data Infrastructure
Dr. Wu concluded his talk by addressing the importance of Data Infrastructure. He reviewed the design, implementation, operational experience, and lessons learned from CiteSeerX, a real-world digital library search engine, and discussed the strengths and weaknesses of its current design. He then discussed the newly proposed architecture in their paper titled "Building an Accessible, Usable, Scalable, and Sustainable Service for Scholarly Big Data." He also described his work on "Building A Large Collection of Multi-domain Electronic Theses and Dissertations." There are currently 500k ETDs in this repository. He showed the top ten US universities that contributed the most to this collection.
Focused Research Discussion
Presentation by Lamia Salsabil: A Summary of Contribution to the ETD Project
Our visit to DLRL at Virginia Tech was a great learning experience. We are scheduled to meet the team again.
-- Muntabir Choudhury (@TasinChoudhury) and Lamia Salsabil (@liya_lamia)