2025-01-07: Paper Summary: "GazePrompt: Enhancing Low Vision People’s Reading Experience with Gaze-Aware Augmentations"
The ACM Conference on Human Factors in Computing Systems (CHI) serves as the leading global platform for showcasing innovative research in Human-Computer Interaction (HCI). It convenes researchers, practitioners, and industry leaders to discuss cutting-edge advancements in designing, evaluating, and utilizing technology to improve human experiences. CHI 2024, themed "Surfing the World," took place from May 11 to 16, 2024, at the Hawaiʻi Convention Center in Honolulu, Hawaiʻi, USA, with provisions for remote participation. In this blog post, I highlight a compelling recent work presented at the conference, titled "GazePrompt: Enhancing Low Vision People’s Reading Experience with Gaze-Aware Augmentations," authored by Ru Wang et al.
Figure 1 Wang et al.: GazePrompt offers two primary features: Line-Switching (LS) support and Difficult-Word (DW) support. Each feature includes two design options: (a) Line Highlighting and (b) Arrow for LS support, and (c) Text-to-Speech and (d) Word Magnifier for DW support. Note that the gaze visualizations depicted are for illustrative purposes only and are not visible to users.
Motivation
Reading is a fundamental skill for accessing information and participating in daily life, yet for millions of individuals with low vision, it remains a persistent challenge. Conditions like blurry vision, central vision loss, and peripheral vision loss make tasks such as locating the next line, distinguishing words, and maintaining reading flow difficult and frustrating. While tools like magnifiers and screen contrast enhancements offer some relief, they often create new barriers, such as limiting field of view, increasing cognitive load, and slowing navigation through text. These challenges compound over time, reducing not only reading efficiency but also the user's confidence and independence. To address these unmet needs, the authors introduce GazePrompt, a gaze-aware reading aid that provides tailored visual and auditory augmentations to support low vision individuals in overcoming reading difficulties effectively.
Background and Related Work
Commonly used tools that aid reading among low vision individuals include optical magnifiers, electronic devices, and screen magnification features. These tools enhance visibility but introduce new challenges. For instance, screen magnification often reduces the visible field, complicating navigation and increasing cognitive load. Similarly, individuals with peripheral or central vision loss struggle with identifying lines or words, as magnification alone cannot compensate for missing or distorted parts of the text. These limitations underline the need for advanced, behavior-responsive solutions.
Previous efforts have explored various strategies to assist low vision readers. Magnification tools have been complemented with features like contrast enhancement and contour adjustments to improve clarity. Eye-tracking technology has emerged as a promising approach, offering opportunities for context-aware augmentation. For instance, early research demonstrated the ability to track gaze and support tasks like resuming reading positions or highlighting difficult words. However, these solutions primarily targeted sighted users or relied on manual control methods, which were cumbersome and inaccessible for low vision individuals. This gap in tailored, automated reading aids sets the stage for systems like GazePrompt, which leverages eye-tracking to dynamically adapt to users’ behaviors, addressing both performance and experience shortcomings.
GazePrompt Design
Figure 2 Wang et al.: GazePrompt interfaces showcasing: (a) Line Highlighting for Line-Switching Support, (b) Arrow indicator for Line-Switching Support, (c) Word Magnifier for Difficult-Word Support, and (d) Customizable color options for Line-Switching Support.
Key Features
GazePrompt addresses critical challenges faced by low vision readers through two core features: Line-Switching Support and Difficult-Word Assistance, both designed to enhance navigation and comprehension during reading.
Line-Switching Support (LS): This feature helps users efficiently locate and transition to the intended line of text. Using eye-tracking data, GazePrompt identifies the line of interest and provides two augmentation options tailored to user preferences: Line Highlighting for clear visibility across the full line and Arrow Indicators for a subtle guide to the start of the line. These augmentations aim to minimize cognitive load and improve reading flow.
Difficult-Word Assistance (DW): To address word recognition difficulties, this feature identifies hesitation or extended gaze on a word and offers targeted support. Users can choose between Text-to-Speech, which reads the word aloud, or Word Magnification, which enlarges the specific word on the screen for closer examination. Both options are customizable to suit individual needs and preferences.
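Since both features hinge on per-user customization, a small preferences object is one natural way to carry these choices through the system. The sketch below is a minimal Python illustration under my own assumptions; the field names, default values, and the dwell threshold are not taken from the paper.

```python
# A minimal sketch of per-user GazePrompt preferences; field names,
# defaults, and thresholds are illustrative assumptions, not the paper's.
from dataclasses import dataclass
from typing import Literal


@dataclass
class GazePromptPrefs:
    ls_style: Literal["highlight", "arrow"] = "highlight"  # Line-Switching visual
    highlight_color: str = "#FFE08A"                        # user-chosen highlight color
    dw_style: Literal["tts", "magnifier"] = "magnifier"     # Difficult-Word feedback
    magnification: float = 2.0                              # magnifier scale factor
    dw_dwell_ms: int = 800                                  # gaze dwell (ms) that triggers DW support


# Example: a user with central vision loss who prefers a stronger magnifier
prefs = GazePromptPrefs(dw_style="magnifier", magnification=3.0)
```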
Implementation
Gaze Calibration and Data Collection
Real-Time Gaze Behavior Detection
The core of GazePrompt lies in its ability to interpret gaze behaviors dynamically. Using fixation data, the system identifies high-level actions such as line-following, line-switching, and word hesitation. Line identification is achieved through a weighted voting mechanism, analyzing fixation points relative to bounding boxes around text lines. For difficult-word detection, the system monitors fixation duration, re-fixation counts, and total gaze time on words. These behaviors trigger the corresponding augmentations, ensuring responsive and context-aware assistance.
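To make this concrete, here is a minimal Python sketch of duration-weighted voting over line bounding boxes and a dwell/re-fixation check for difficult words, loosely following the description above. The data structures and threshold values are illustrative assumptions, not the authors' implementation.

```python
# Minimal sketch of the detection logic described above: duration-weighted
# voting over line bounding boxes, and dwell/re-fixation checks for words.
# Data structures and thresholds are illustrative, not the authors' code.
from collections import Counter
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Fixation:
    x: float            # gaze x-coordinate (pixels)
    y: float            # gaze y-coordinate (pixels)
    duration_ms: float  # fixation duration


@dataclass
class LineBox:
    index: int          # line number in the document
    top: float          # top of the line's bounding box (pixels)
    bottom: float       # bottom of the line's bounding box (pixels)


def identify_line(fixations: List[Fixation], lines: List[LineBox]) -> Optional[int]:
    """Weighted vote: each fixation votes for the line whose bounding box
    contains it, weighted by fixation duration; the top-voted line wins."""
    votes: Counter = Counter()
    for f in fixations:
        for line in lines:
            if line.top <= f.y <= line.bottom:
                votes[line.index] += f.duration_ms
                break
    return votes.most_common(1)[0][0] if votes else None


def is_difficult_word(word_fixations: List[Fixation],
                      dwell_threshold_ms: float = 800,
                      refixation_threshold: int = 3) -> bool:
    """Flag a word as 'difficult' when total gaze time or the number of
    re-fixations on it exceeds a threshold (thresholds here are made up)."""
    total_dwell = sum(f.duration_ms for f in word_fixations)
    return (total_dwell >= dwell_threshold_ms
            or len(word_fixations) >= refixation_threshold)
```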
Augmentation Rendering
Once behaviors are detected, GazePrompt provides visual and auditory feedback through a web-based interface built with React. Augmentations are rendered in real time: line-switching support offers customizable highlighting or arrows, while difficult-word assistance displays magnified text or triggers text-to-speech. The interface ensures low latency, leveraging Flask-SocketIO for smooth communication between the eye tracker and the system. This seamless rendering process allows users to experience intuitive, distraction-free support tailored to their reading needs.
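The paper names Flask-SocketIO for communication between the eye tracker and the web front end; the sketch below is a rough Python illustration of how such a bridge could forward detected behaviors to the client as socket events. The event names and payload fields ("fixation_batch", "line_switch", "difficult_word") are hypothetical, not taken from the paper.

```python
# Minimal sketch: a Flask-SocketIO bridge that pushes detected gaze behaviors
# to the browser front end. Event names and payloads are hypothetical.
from flask import Flask
from flask_socketio import SocketIO

app = Flask(__name__)
socketio = SocketIO(app, cors_allowed_origins="*")


@socketio.on("fixation_batch")
def handle_fixation_batch(data):
    """Receive a batch of fixations (e.g., from the eye-tracker process)
    and emit an augmentation command when a behavior is detected."""
    # In a full system, the detection logic from the earlier sketch would
    # run here on the incoming fixation data.
    line_index = data.get("line_index")          # hypothetical payload field
    if line_index is not None:
        # Tell the client which line to highlight (or point an arrow at).
        socketio.emit("line_switch", {"line": line_index})
    if data.get("difficult_word"):               # hypothetical payload field
        # Trigger word magnification or text-to-speech on the client.
        socketio.emit("difficult_word", {"word": data["difficult_word"]})


if __name__ == "__main__":
    socketio.run(app, host="127.0.0.1", port=5000)
```

On the React side, the client would subscribe to these events and render the highlight, arrow, magnifier, or text-to-speech response accordingly.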
Evaluation
The evaluation of GazePrompt involved two studies designed to assess its effectiveness in enhancing the reading experience for individuals with low vision. Study 1 focused on a controlled reading-aloud task to measure quantitative improvements in reading performance, while Study 2 examined real-world usage through silent reading tasks to capture qualitative insights into user experiences.
Study 1: Controlled Reading-Aloud Task
In the first study, 13 participants with varying low vision conditions completed reading-aloud tasks under four conditions: without GazePrompt (baseline), with LS, with DW, and with both features enabled. Quantitative metrics, such as line-switching time, frequency of misread words, and scrolling behaviors, were recorded alongside user feedback.
Figure 4 Wang et al.: GazePrompt influenced participants’ scrolling behavior. Orange dots represent fixations, and orange line segments represent saccades. (a) Without Line-Switching Support, Participant 7 (P7) scrolled two lines upward while fixating on the line starting with ‘Matthews.’ (b) With Line-Switching Support, P7 scrolled five lines upward while fixating on the line starting with ‘very.’ The red arrow indicates the progression direction of fixations.
Results showed that LS support significantly reduced line-switching time, enabling users to locate the next line faster. Although line-switching accuracy improvements were not statistically significant, users reported less cognitive strain and improved focus. DW support also reduced the number of misread words and provided targeted assistance for word recognition, particularly for visually challenging terms. Participants expressed strong preferences for customizable features, such as highlight colors and triggering thresholds, which enhanced usability. Overall, GazePrompt was perceived as effective and easy to learn, with participants noting its potential to reduce the effort of navigating magnified text.
Study 2: Silent Reading in Realistic Settings
To evaluate GazePrompt in more naturalistic conditions, Study 2 involved silent reading tasks with 13 different participants. They read passages chosen for their consistent length and difficulty, while GazePrompt was enabled to assist as needed. Participants were encouraged to read at their own pace and answer comprehension questions afterward to validate understanding.
In this study, LS support was found to help users maintain focus and improve comprehension, particularly for long or technical texts. Participants appreciated its utility in low-light environments and for text with reduced line spacing. DW support was often used as a confirmation tool, providing reassurance when users hesitated on a word. Text-to-speech was favored by those who preferred auditory feedback, while word magnification appealed to users with central vision loss for its precision. The silent reading context revealed additional insights, such as the potential of LS support to reduce visual fatigue and DW support to enhance user confidence.
Discussion
The evaluation of GazePrompt highlights its potential to transform the reading experience for individuals with low vision by addressing key challenges like line navigation and word recognition. The LS feature demonstrated notable benefits in reducing cognitive load and improving line-switching efficiency. This finding aligns with user feedback, which emphasized the importance of maintaining focus and reducing distractions while reading magnified text. However, the results also suggest that users with peripheral vision loss may require additional design considerations, such as more prominent or dynamic indicators, to fully benefit from line-switching aids. The customization options for highlight colors and augmentation styles were particularly appreciated, underscoring the importance of adaptability in assistive technologies.
The DW feature further demonstrated how real-time gaze-aware augmentations can improve reading accuracy and confidence. While participants valued the tailored assistance, they highlighted areas for improvement, such as adjustable magnification levels and alternative placement of the magnified text. Interestingly, the silent reading study revealed that users employed DW support not just for recognition but also as a reassurance tool, demonstrating its role in enhancing user confidence. The feedback also pointed to opportunities for hybrid approaches, integrating both visual and auditory aids for a more holistic experience. Together, these insights pave the way for refining GazePrompt and expanding its applications to diverse reading scenarios and user needs.
Limitations and Future Work
While GazePrompt demonstrates promising results in improving the reading experience for individuals with low vision, several limitations warrant attention.
- GazePrompt currently relies on commercial eye-tracking hardware, which can be costly and requires precise calibration, limiting access in resource-constrained settings. Tracking can also be inaccurate for users with severe peripheral vision loss, and addressing these inaccuracies is critical to improving the system’s inclusivity and reliability.
- The system is optimized for static, text-based reading tasks and may not perform effectively with more dynamic or multimedia-rich content, such as web pages, interactive e-books, or video subtitles. Adapting GazePrompt for diverse reading environments is a key area for future development.
- The evaluation studies involved a relatively small and specific set of participants, which, while diverse, may not fully represent the wide range of low vision conditions. Larger-scale studies with broader demographic and vision profiles are needed to validate the system’s efficacy across varied user groups.
- Exploring more portable and affordable hardware solutions, such as integration with widely available devices or alternative tracking methods like head or gesture tracking, could make GazePrompt more accessible. Longitudinal studies in real-world contexts are also necessary to assess how users adapt to the system over time and its long-term impact on reading performance and confidence.
Conclusion
GazePrompt represents a significant advancement in assistive technology for low vision readers, addressing critical challenges like line navigation and word recognition through gaze-aware augmentations. The system’s Line-Switching and Difficult-Word Support features demonstrated measurable improvements in reading efficiency and user confidence, with customization options enhancing accessibility. While limitations such as hardware reliance and scalability remain, GazePrompt’s success highlights the potential of gaze-aware technologies to transform accessibility tools. Future work will focus on expanding usability, affordability, and applicability across diverse reading scenarios. GazePrompt is a step toward more inclusive design, empowering low vision individuals to read with greater ease and independence.
References
Wang, R., Potter, Z., Ho, Y., Killough, D., Zeng, L., Mondal, S., & Zhao, Y. (2024, May). GazePrompt: Enhancing Low Vision People's Reading Experience with Gaze-Aware Augmentations. In Proceedings of the CHI Conference on Human Factors in Computing Systems (pp. 1-17). DOI: 10.1145/3613904.3642878
- AKSHAY KOLGAR NAYAK @AkshayKNayak7