2025-12-31: From Tables to Triumph: A PhD Journey in Uncertainty-Aware Scientific Data Extraction

In January 2021, I began a journey that would span nearly five years, three children, countless late nights, and a singular focus: teaching machines to extract data from complex scientific tables with confidence—and to know when they're uncertain. On October 29, 2025, I successfully defended my dissertation titled "SCITEUQ: Toward Uncertainty-Aware Complex Scientific Table Data Extraction and Understanding" at Old Dominion University. This milestone represents not just the culmination of intensive research but a testament to perseverance, family support, and the power of focused determination.

Finding My Path at LAMP-SYS

When I joined the Lab for Applied Machine Learning and Natural Language Processing Systems (LAMP-SYS), part of ODU's Web Science and Digital Libraries Research Group (WSDL) under the guidance of Dr. Jian Wu, I knew exactly what problem I wanted to solve: making scientific table data extraction both accurate and trustworthy through uncertainty quantification.

Scientific tables are ubiquitous in research papers, containing critical experimental data, statistical results, and research findings. Yet extracting this data automatically from PDF documents remains surprisingly difficult. Unlike the simple, well-structured tables you might see on Wikipedia, scientific tables are complex beasts—featuring multi-level headers, merged cells, irregular layouts, and domain-specific notations that confound even state-of-the-art machine learning models.

But here's the real problem: existing methods don't tell you when they're wrong. They extract data with the same confidence whether they're processing a simple table or struggling with a complex one. For scientific applications where accuracy is paramount, this means researchers must manually verify every single extracted cell—a task that doesn't scale when you're dealing with thousands of tables.

The Research Challenge

My research addressed a fundamental question: How can we build systems that not only extract data from complex scientific tables but also quantify their uncertainty, allowing us to focus human verification effort only where it's needed?

To tackle this challenge, I formulated four research questions:

RQ1: What is the status of reproducibility and replicability of existing TSR models?

Before building something new, I needed to understand what already existed. I conducted the first systematic reproducibility and replicability study of 16 state-of-the-art Table Structure Recognition (TSR) methods. The results were sobering: only 8 of 16 papers made their code and data publicly available, and merely 5 had executable code. When I tested these methods on my newly created GenTSR dataset (386 tables from six scientific domains), none of the methods replicated their original performance. This highlighted a critical gap in the field. This work was published at ICDAR 2023: "A Study on Reproducibility and Replicability of Table Structure Recognition Methods."

RQ2: How do we quantify the uncertainties of TSR results?

To address this, I developed TTA-m, a novel uncertainty quantification pipeline that adapts Test-Time Augmentation specifically for TSR. Unlike vanilla TTA, my approach fine-tunes pre-trained models on augmented table images and employs ensemble-based methods to generate cell-level confidence scores. On the GenTSR dataset, TTA-m achieved an F1-score of 0.798, with over 80% accuracy for high-confidence predictions—enabling reliable automatic detection of extraction errors. This work was published at IEEE IRI 2024: "Uncertainty Quantification in Table Structure Recognition."
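The core ensemble idea can be sketched as follows: run the recognizer on the original table image plus several augmented variants, and use cross-augmentation agreement as a cell-level confidence score. This is a minimal illustration of the agreement-based voting principle, not the actual TTA-m implementation; `predict_fn` and the cell-coordinate schema are assumptions for the sketch.

```python
def tta_cell_confidence(predict_fn, images):
    """Ensemble cell-level confidence from test-time augmented inputs.

    predict_fn(image) -> dict mapping cell coordinates (row, col) to the
    predicted cell content. images holds the original table image plus its
    augmented variants. Names and schema are illustrative only.
    """
    votes = {}  # (row, col) -> {prediction: count}
    for img in images:
        for cell, pred in predict_fn(img).items():
            votes.setdefault(cell, {})
            votes[cell][pred] = votes[cell].get(pred, 0) + 1

    confidences = {}
    for cell, counts in votes.items():
        total = sum(counts.values())
        best_pred = max(counts, key=counts.get)
        # The agreement ratio across augmentations serves as the confidence.
        confidences[cell] = (best_pred, counts[best_pred] / total)
    return confidences
```

Cells where the augmented predictions disagree receive low confidence and can be routed to human review, while unanimous cells can be accepted automatically.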

RQ3: How can we integrate uncertainties from TSR and OCR for holistic table data extraction?

I designed and implemented the TSR-OCR-UQ framework, which integrates table structure recognition (using TATR), optical character recognition (using PaddleOCR), and conformal prediction-based uncertainty quantification into a unified pipeline. The results were compelling: extraction accuracy improved from 53-71% to 83-97% across table complexity levels, and the system achieved 69% precision in flagging incorrect extractions while reducing manual verification labor by 53%. This work was published at ICDAR 2025: "Uncertainty-Aware Complex Scientific Table Data Extraction."
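The conformal-prediction step behind this kind of flagging can be illustrated with a standard split conformal recipe: calibrate a threshold on held-out nonconformity scores, then flag any extracted cell whose score exceeds it. This is a generic sketch of the technique, not the TSR-OCR-UQ code; the score definition and function names are assumptions.

```python
import math

def conformal_threshold(cal_scores, alpha=0.1):
    """Split conformal calibration: given nonconformity scores from a
    held-out calibration set, return the threshold that covers at least
    (1 - alpha) of future scores under exchangeability."""
    n = len(cal_scores)
    # Standard conformal quantile: the ceil((n+1)(1-alpha))-th smallest score.
    k = min(math.ceil((n + 1) * (1 - alpha)), n)
    return sorted(cal_scores)[k - 1]

def flag_cells(extractions, threshold):
    """Flag extracted cells whose nonconformity score exceeds the
    calibrated threshold, routing them to human verification.
    extractions is a list of (cell_id, score) pairs."""
    return [cell for cell, score in extractions if score > threshold]
```

Only the flagged cells need manual checking, which is exactly where the reported reduction in verification labor comes from.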

RQ4: How well do LLMs answer questions about complex scientific tables?

To evaluate the QA capability of Large Language Models on scientific tables, I created SciTableQA, a benchmark dataset containing 8,700 question-answer pairs across 320 complex scientific tables from multiple domains. My evaluation revealed that while GPT-3.5 achieved 79% accuracy on cell selection TableQA tasks, performance dropped to 49% on arithmetic reasoning TableQA tasks—highlighting significant limitations of current LLMs when dealing with complex table structures and numerical reasoning. This work was published at TPDL 2025: "SciTableQA: A Question-Answering Benchmark for Complex Scientific Tables."
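The per-task accuracy breakdown reported above can be computed with a simple exact-match evaluator split by question type. This sketch assumes a record schema (`task`, `prediction`, `answer`) invented for illustration; it is not the benchmark's actual evaluation harness.

```python
from collections import defaultdict

def tableqa_accuracy(results):
    """Compute accuracy per TableQA task type (e.g. cell selection vs.
    arithmetic reasoning) using normalized exact match.

    results: list of dicts with keys 'task', 'prediction', 'answer'.
    The schema here is purely illustrative.
    """
    correct, total = defaultdict(int), defaultdict(int)
    for r in results:
        total[r["task"]] += 1
        # Normalize whitespace and case before the exact-match comparison.
        if str(r["prediction"]).strip().lower() == str(r["answer"]).strip().lower():
            correct[r["task"]] += 1
    return {task: correct[task] / total[task] for task in total}
```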

The SCITEUQ Framework

Putting it all together, SCITEUQ (Scientific Table Extraction with Uncertainty Quantification) represents a comprehensive solution for uncertainty-aware scientific table data extraction. The framework achieves state-of-the-art performance while providing the uncertainty quantification capabilities needed for efficient human-in-the-loop verification.

Each component contributes to a more reliable approach:

  • GenTSR provides rigorous cross-domain evaluation
  • TTA-m quantifies uncertainties in structure recognition
  • TSR-OCR-UQ integrates structure and content extraction with uncertainty maps
  • SciTableQA enables systematic evaluation of reasoning capabilities

Publications and Research Impact

My research resulted in five first-author publications at top-tier conferences and journals:

Four of these papers faced initial rejection before ultimately being accepted. This taught me an invaluable lesson: rejection is not failure; it's an opportunity to refine and improve your work.

Industry Experience: From Azure to Alexa to Microsoft AI

While my research focused on scientific tables, my internships at Microsoft and Amazon broadened my perspective on applying machine learning at scale.

Microsoft (Summers 2022, 2023, 2025)

My first two summers at Microsoft were with the Azure team, where I worked on infrastructure optimization problems far from my research area. I developed an LLM-based multi-agent system with human-in-the-loop oversight for AKS cluster configuration, reducing cluster type generation time from two weeks to one hour (link to blog post). I also designed ML anomaly detection systems on Azure Synapse that reduced hardware maintenance costs by over 20% and formulated new metrics for characterizing node interruption rates that decreased hardware downtime by 25% (link to blog post).

In Summer 2025, I joined the Microsoft AI team under the Bing organization, working on problems at the intersection of large-scale search and AI—which is what I'll be doing when I return to Microsoft full-time in January 2026.

Amazon (Summer 2024)

At Amazon, I worked with the Alexa Certification Technology team in California, where I drove 10% customer growth by designing LLM-based RAG systems with advanced prompt engineering techniques and increased revenue by over 5% by developing LLM agents on AWS to improve Alexa-enabled applications (link to blog post).

These internships, while not directly related to my dissertation research, taught me how to apply ML thinking to diverse industrial problems and to work effectively in large, complex organizations.

Balancing PhD Life with Family

Perhaps the most challenging aspect of my PhD journey had nothing to do with research—it was balancing my studies with raising three young children. My youngest son, Daniel, was born just six months after I enrolled in the PhD program. Managing research deadlines, experimental runs, paper submissions, and the demands of parenting three boys (Paul, David, and Daniel) required discipline and sacrifice.

I developed a strict routine: work from 9 AM to 3 PM every day at my research lab, then pick up my kids from school and be fully present for them. This meant no late nights in the lab, no weekend marathons of coding—just consistent, focused work during designated hours. It wasn't always easy. Conference deadlines sometimes meant asking my wife, Olabisi, to take on even more, or my mother, Beatrice, to provide extra support. But this routine kept me grounded and taught me that quality of work matters more than quantity of hours.

The Defense

On October 29, 2025, I defended my dissertation before my committee:

Their thoughtful feedback, probing questions, and constructive critiques throughout my PhD journey were instrumental in refining my research and pushing me to think deeper about the implications and limitations of my work.

Lessons Learned

Looking back on nearly five years of doctoral work, several lessons stand out:

1. Embrace Rejection as Refinement

Four of my papers were initially rejected. Each rejection stung, but each one ultimately led to a stronger paper. The review process, while sometimes frustrating, forced me to clarify my arguments, strengthen my experiments, and address weaknesses I hadn't noticed. My TPDL 2025 paper on SciTableQA went through two rounds of revisions, but the final version is significantly better than the original submission.

2. Establish Non-Negotiable Boundaries

My 9 AM to 3 PM schedule wasn't just convenient—it was essential for maintaining my sanity and my family relationships. While some might argue that PhD students need to work 80-hour weeks, I proved that focused, disciplined work during reasonable hours can produce quality research. Those boundaries also made me more efficient: when you only have six hours a day, you learn to prioritize ruthlessly.

3. Build for Reproducibility from Day One

My systematic study on TSR reproducibility taught me the hard way how difficult it is to reproduce other people's work. This experience shaped how I approached my own research. Every framework I built—TTA-m, TSR-OCR-UQ, SciTableQA—comes with comprehensive documentation, publicly available code, and clear instructions for replication. Future researchers shouldn't struggle to build upon my work the way I struggled with others'.

4. Choose Problems That Matter to You

I entered my PhD knowing I wanted to work on table extraction with uncertainty quantification, and I never wavered from that focus. This singular vision helped me navigate the inevitable setbacks and distractions that come with doctoral research. When experiments failed or papers got rejected, I could always return to the core question: How do we make scientific data extraction both accurate and trustworthy?

5. Internships Broaden Your Perspective

While my Microsoft and Amazon internships didn't directly contribute to my dissertation, they fundamentally shaped how I think about research. Working on production systems with millions of users taught me to think about scalability, robustness, and real-world constraints in ways that academic research rarely emphasizes. These experiences make me a better researcher because I can now evaluate my work not just on benchmark performance, but on whether it could actually be deployed at scale.

Looking Forward

In January 2026, I'll be joining Microsoft as a Data Scientist 2 with the Microsoft AI team at the Redmond campus in Washington state. My family and I are excited about this new chapter—moving from Norfolk, Virginia, to the Pacific Northwest, and transitioning from academic research to industry applications.

While I'll be working on different problems at Microsoft, the skills and mindset I developed during my PhD—rigorous experimentation, systematic evaluation, uncertainty quantification, and reproducible research—will continue to guide my work. I'm particularly excited about the opportunity to apply research-driven thinking to real-world problems at a scale that can impact millions of users.

Acknowledgments

This journey would have been impossible without extraordinary support:

To Dr. Jian Wu, my advisor, mentor, and guide—thank you for believing in my research vision, for pushing me to think bigger, and for your patience during the inevitable frustrations of doctoral research. Your mentorship has not only shaped my research but also my approach to solving complex problems.

To Dr. Yi He, my co-advisor at William & Mary, your expertise and thoughtful feedback greatly enriched this research. Thank you for your guidance and support throughout this journey.

To my dissertation committee—Drs. Michael Nelson, Michele Weigle, and Sampath Jayarathna—your constructive critiques and expert insights were essential in refining my ideas and strengthening this work.

To my colleagues in WSDL and LAMP-SYS, the collaborative environment, intellectual exchanges, and camaraderie made this journey both enriching and memorable.

To my wife, Olabisi—you walked beside me every step of this journey with unwavering devotion and love. Your patience during the long hours, your understanding through the challenges, and your constant encouragement when the path seemed difficult made this achievement possible. This accomplishment is as much yours as it is mine.

To my sons—Paul, David, and Daniel—you are my greatest blessings and my constant source of joy and motivation. I hope this work serves as an example that with dedication and faith, you too can achieve your dreams.

To God Almighty, who is the source of all wisdom and strength, I give thanks and praise.


-Kehinde Ajayi (@KennyAJ)
