
I was overwhelmed with joy on the day of my first meeting with my professor to discuss project briefs. I was assigned to a collaborative project with Virginia Tech called "Mining Electronic Theses and Dissertations (ETDs)," supported by the Institute of Museum and Library Services (IMLS). ETDs are scholarly works that usually serve as partial requirements of academic degrees for students pursuing higher education. These ETDs are hosted by commercial (e.g., ProQuest) or university digital library repositories. However, the digital libraries of ETDs lack computational models and services for accessing and discovering the knowledge buried in ETDs. For example, library-provided metadata often exhibits incomplete, inconsistent, and incorrect values, which harms the discoverability of ETDs. Additionally, ETDs can be either scanned (digitized from physical copies of theses and dissertations) or born-digital. To index quality metadata, we first need to extract the metadata accurately from both document types, which was one of the key research challenges I explored at the beginning of my Ph.D. journey. I briefly explained some of my earlier research in the following blog posts: Optical Character Recognition (OCR) Experiment, and Heuristic Rules to Extract Metadata.
Research & Publications
In an era where information is abundant yet often inaccessible, efficiently retrieving and utilizing scholarly knowledge is crucial. ETDs represent a vast repository of academic research, yet their complex structures and inconsistent metadata often hinder discoverability and integration into digital libraries. My research is driven by the imperative to bridge this gap, leveraging applied ML, NLP, and CV to transform ETDs into structured, accessible, and valuable components of scholarly big data.
Dissertation Defense
I developed ETDSuite, a toolkit designed to mine ETDs and their structured components. ETDSuite addresses critical challenges in digital libraries by providing machine learning-based methods for page-level segmentation, metadata extraction, citation parsing, and metadata enhancement. This toolkit has been instrumental in improving the accessibility and quality of ETD repositories, facilitating more efficient knowledge discovery and retrieval.
On November 6, 2024, I successfully defended my dissertation in front of a live audience and my committee.
I appreciate the input of all committee members toward making my dissertation better.
When addressing the research problems of mining ETDs, I raised the following four research questions (RQs):
- RQ1: Can we develop an AI method to extract metadata from the cover pages of scanned and born-digital ETDs?
- RQ2: Library-provided metadata often exhibits incomplete, inconsistent, and incorrect values. How can we leverage AI methods to improve metadata quality?
- RQ3: Will latent features that encode both text and vision modalities outperform latent features obtained from a single modality in ETD page classification?
- RQ4: Is it possible to design a universal parser that accurately parses metadata from the multi-style and multi-type citations that appear in ETDs?
To address the four RQs, I developed the following frameworks:
[Figure: Metadata Extraction Pipeline Using Heuristic Rules]
[Figure: Automatic Metadata Extraction System]
MetaEnhance: This is a framework that improves the metadata quality of ETDs by filling in missing values, correcting incorrect values and misspellings, and canonicalizing surface values using SOTA ML and DL models. The framework was evaluated on MetaEnhance-ETDQual500 and achieved nearly perfect F1-scores in detecting errors and F1-scores ranging from 85% to 100% in correcting five of seven key metadata fields. More details can be found in my published work on metadata quality improvement.
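For readers less familiar with the metric, here is a minimal sketch of how an F1-score for error detection can be computed. It is not taken from MetaEnhance; the function name `precision_recall_f1` and the record IDs are made up for illustration.

```python
def precision_recall_f1(predicted, actual):
    """Compute precision, recall, and F1 for a set of flagged items.

    predicted: set of record IDs a detector flagged as erroneous.
    actual:    set of record IDs that are truly erroneous (ground truth).
    """
    true_positives = len(predicted & actual)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(actual) if actual else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    # F1 is the harmonic mean of precision and recall.
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Made-up example: the detector flags four records, three of which are
# real errors, and it misses one real error.
flagged = {"etd-01", "etd-02", "etd-03", "etd-07"}
truth = {"etd-01", "etd-02", "etd-03", "etd-05"}
p, r, f1 = precision_recall_f1(flagged, truth)
print(f"precision={p:.2f} recall={r:.2f} f1={f1:.2f}")
```

A score near 1.0 means the detector both flags few correct values by mistake and misses few real errors.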
[Figure: MetaEnhance Framework to Improve Metadata Quality]
[Figure: Multimodal Framework to Classify ETD Pages]
LMParsCit: This is a framework based on large language models (e.g., llama3-8b-instruct, GPT-3.5 turbo, and GPT-4o-mini) that extracts key metadata fields (title, author, venue, and year) from references across a range of bibliography types (e.g., journals, conference proceedings, technical reports). It also supports multiple bibliography styles (e.g., IEEE, ACM, APA) and achieved an F1 score of 99% on CORA-ref and ETDCite.
Publications
My research contributions have been published in several peer-reviewed journals and conferences, where I have served as both first author and co-author.
Internship Opportunities
My early work on extracting metadata from scanned ETDs, where I applied NLP and CV techniques, especially OCR, helped me land my first internship in the Summer of 2020 at Los Alamos National Laboratory (LANL) in New Mexico. During my internship at LANL, I developed a framework for Offline Handwritten Mathematical Equation Recognition, where the core architecture relied on a Convolutional Neural Network (CNN) called LeNET5-CNN. Working as a Research Intern at LANL not only expanded my skills in Computer Vision but also opened the door to an internship the following year (Summer 2021) with Bihrle Applied Research Inc. (BAR), an aerospace and aerodynamics company in Hampton, Virginia. During my internship at BAR, I developed and enhanced algorithms for the Train Detection Model used by Rail-Inspector, a cloud-based software product that processes aerial imagery of railroad tracks using Machine Learning and Deep Learning.
I would say that internships are a crucial stepping stone in building a successful career. However, research experience also played an important role during my Ph.D. career. My AI-driven research has been deeply application-focused, allowing me to develop cutting-edge applications that industries value, such as TechDrawFinder (a vector search engine for finding segmented patent figures) and ETDPC (a multimodal AI framework to segment ETDs). During my Ph.D., I always strove to adopt new technology and propose frameworks that could solve complex problems with state-of-the-art (SOTA) results. For example, during the fifth year of my Ph.D., a sudden shift happened in the NLP domain: people began using Large Language Models (LLMs) because of their SOTA performance in both NLP and Natural Language Understanding (NLU). I applied LLMs in my research, which allowed me to develop a language model-based citation parser called LMParsCit, whose core architecture relies on LLMs (e.g., Llama-3-8b-instruct). Having a good understanding of language models and vision models through my research work helped me land another internship in the Summer of 2024 at the U.S. Food & Drug Administration as an ORISE Fellow (i.e., Research Fellow), where I enhanced machine learning-based algorithms for one of the regulatory projects (Analytics Driven Supplement Evaluation Model) in the Center for Drug Evaluation & Research.
Internship Blogs:
Academic and Professional Journey
After completing one year of professional experience, I enrolled in the Ph.D. program at ODU. During my time at ODU, I served as a Teaching Assistant, where I had the privilege of mentoring undergraduate and graduate students in courses such as Machine Learning (CS 722/822) and Web Programming (CS 418/518). My responsibilities included delivering lectures on applying AI frameworks and technologies, guiding students through complex concepts, and supervising research projects related to NLP and digital libraries. Moreover, I mentored high school students in developing an AI-based search engine called TechDrawFinder, which utilized vector search techniques to retrieve segmented patent images.
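To give a feel for what vector search means here, the following is a minimal sketch, not TechDrawFinder's actual code. It assumes each image has already been turned into an embedding vector by some model; the tiny three-dimensional vectors, the figure names, and the helper names `cosine_similarity` and `top_k` are all hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k(query, index, k=2):
    """Return the names of the k indexed vectors most similar to the query."""
    scored = [(cosine_similarity(query, vec), name) for name, vec in index.items()]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [name for _, name in scored[:k]]


# Hypothetical index: figure name -> embedding (real embeddings are much longer).
index = {
    "figure_a": [0.9, 0.1, 0.0],
    "figure_b": [0.0, 1.0, 0.0],
    "figure_c": [0.7, 0.3, 0.1],
}
results = top_k([1.0, 0.0, 0.0], index, k=2)
print(results)  # ['figure_a', 'figure_c']
```

A brute-force scan like this is fine for small collections; larger collections typically rely on an approximate nearest-neighbor index instead.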
Through dedication and perseverance, I have consistently improved myself, as reflected in the following achievements:
- Dominion Scholar Award (2023) – Old Dominion University.
- Best Short Paper Award (2023) – ACM/IEEE Joint Conference on Digital Libraries (JCDL).
- Outstanding Teaching Assistant Award (2022) – Old Dominion University.
- Best Poster Honorable Mention (2020) – ACM/IEEE Joint Conference on Digital Libraries (JCDL).
- Dr. Hussain Abdel-Wahab Graduate Fellowship (2020) – Old Dominion University.
- AML Summer Research Fellowship (2020) – Los Alamos National Laboratory.
In addition, I was honored to receive an invitation to serve as a reviewer for the following peer-reviewed conferences and journals:
- PeerJ Computer Science – One Manuscript Review (2024).
- Scientometrics – One Manuscript Review (2023).
- ACM/IEEE Joint Conference on Digital Libraries 2023 – One Paper Review.
- ACM/IEEE Joint Conference on Digital Libraries 2022 – One Paper Review.
- ACM/IEEE Joint Conference on Digital Libraries 2020 – 10 Poster Abstracts Review.
Postdoctoral Journey
I accepted an offer from the U.S. Food & Drug Administration (FDA) to serve as a Research Fellow in the CDER division. In this role, I will act as a subject matter expert, enhancing algorithms for a key regulatory project aimed at assessing drug products. My work will focus on leveraging state-of-the-art AI techniques to optimize drug product quality analysis.
Wrap Up
To wrap up this blog post, I want to share a few lessons I learned during my rollercoaster journey at ODU.
Work Hard, but Work Smart: One of the key lessons I learned from my advisor was the importance of working efficiently. To streamline my workflow, I developed a variety of automated scripts to handle repetitive tasks. For example, when compiling a large-scale ETD dataset for students to develop an ETD Search Engine in Dr. Wu’s Web Programming class, I automated processes such as converting large batches of PDFs to images and extracting text from those images. These scripts saved me valuable time, allowing me to focus on more critical aspects of my research rather than redoing previously completed tasks.
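As a rough illustration of this kind of script (not the actual scripts from the project), the sketch below processes a folder of files in bulk and skips anything that already has an output, so a rerun never repeats finished work. The function `process_batch` and the `placeholder_convert` step are hypothetical; a real pipeline would plug in its own PDF-to-image or OCR step there.

```python
from pathlib import Path
import tempfile


def process_batch(input_dir, output_dir, convert, in_suffix=".pdf", out_suffix=".txt"):
    """Run convert(src, dst) for every input file that has no output yet.

    Returns the list of output paths written during this run.
    """
    input_dir, output_dir = Path(input_dir), Path(output_dir)
    output_dir.mkdir(parents=True, exist_ok=True)
    written = []
    for src in sorted(input_dir.glob(f"*{in_suffix}")):
        dst = output_dir / (src.stem + out_suffix)
        if dst.exists():
            continue  # finished in an earlier run, so skip it
        convert(src, dst)
        written.append(dst)
    return written


def placeholder_convert(src, dst):
    # Stand-in for the real step (PDF-to-image conversion, OCR, and so on).
    dst.write_text(f"processed {src.name}")


# Demo on a temporary folder with two empty stand-in "PDF" files.
with tempfile.TemporaryDirectory() as tmp:
    pdf_dir = Path(tmp) / "pdfs"
    pdf_dir.mkdir()
    for name in ("a.pdf", "b.pdf"):
        (pdf_dir / name).write_bytes(b"")
    out_dir = Path(tmp) / "text"
    first_run = process_batch(pdf_dir, out_dir, placeholder_convert)
    second_run = process_batch(pdf_dir, out_dir, placeholder_convert)
    print(len(first_run), len(second_run))  # 2 0
```

Passing the conversion step in as a function keeps the skip-if-done logic reusable across different tasks.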
Effective Communication with Peers: Strong communication skills are essential, particularly when serving as a first author or co-author on research projects. Constructive criticism from peers should be taken seriously rather than personally, as it plays a crucial role in refining and improving the quality of research. Embracing feedback with an open mind fosters collaboration and ultimately leads to the production of high-quality research work.
Breadth and Depth: A successful Ph.D. journey requires balancing both breadth and depth in research. While specializing in a particular domain is essential for making novel contributions, having a broad understanding of related fields enables interdisciplinary thinking and innovative problem-solving. Expanding knowledge across multiple areas can lead to new perspectives and research opportunities.
Persevere and Push Yourself: Research is filled with challenges, setbacks, and moments of uncertainty. Perseverance is key to overcoming obstacles and making progress. It is essential to push beyond your comfort zone, take on difficult problems, and remain committed to long-term goals, even when faced with failures or unexpected hurdles.
Professional Networking is Key: Building a strong professional network can open doors to collaborations, job opportunities, and mentorship. Attending conferences, engaging with researchers in the field, and actively participating in academic discussions help establish meaningful connections that can be valuable throughout one's career.
Find an Internship: Internships provide hands-on industry experience, exposure to real-world applications of research, and opportunities to work with experts outside academia. They not only enhance technical skills but also foster professional growth, making the transition from academia to industry smoother and more impactful.
--
Research Fellow, U.S. FDA / CDER
Email: muntabirc@gmail.com