Area of Interest Tracking Techniques for Driving Scenarios Focusing on Visual Distraction Detection

Viktor Nagy; Péter Földesi; György Istenes | Applied Sciences
On-road driving studies are essential for comprehending real-world driver behavior. This study investigates the use of eye-tracking (ET) technology in research on driver behavior and attention during Controlled Driving Studies (CDS). One significant challenge in these studies is accurately detecting when drivers divert their attention from crucial driving tasks. To tackle this issue, we present an improved method for analyzing raw gaze data, using a new algorithm for identifying ID tags called Binarized Area of Interest Tracking (BAIT). This technique improves the detection of incidents where the driver’s eyes are off the road through binarizing frames under different conditions and iteratively recognizing markers. It represents a significant improvement over traditional methods. The study shows that BAIT performs better than other software in identifying a driver’s focus on the windscreen and dashboard with higher accuracy. This study highlights the potential of our method to enhance the analysis of driver attention in real-world conditions, paving the way for future developments for application in naturalistic driving studies.
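The idea of binarizing a frame under several conditions and retrying marker recognition can be sketched as follows. This is only an illustration of the general pattern, not the BAIT implementation: the threshold values, the `detect` callback, and the function names are all assumptions introduced here.

```python
from typing import Callable, List, Optional, Sequence, Tuple

Frame = List[List[int]]  # grayscale frame as rows of 0..255 intensities


def binarize(frame: Frame, threshold: int) -> Frame:
    """Map each pixel to 255 if it is at or above the threshold, else 0."""
    return [[255 if px >= threshold else 0 for px in row] for row in frame]


def detect_with_retries(
    frame: Frame,
    detect: Callable[[Frame], list],
    thresholds: Sequence[int] = (64, 96, 128, 160, 192),  # assumed values
) -> Optional[Tuple[int, list]]:
    """Binarize at each threshold in turn and stop at the first detection.

    `detect` is a hypothetical marker detector supplied by the caller; it
    receives a binarized frame and returns a (possibly empty) list of markers.
    """
    for threshold in thresholds:
        markers = detect(binarize(frame, threshold))
        if markers:
            return threshold, markers
    return None  # no threshold produced a detection
```

Passing the detector in as a callback keeps the retry loop independent of any particular marker library.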

A machine learning study for predicting driver goals in contingencies with leading and lagging features during goal determination

2024 | Artificial Intelligence, Cognitive Psychology, Driving | Core
Hsueh-Yi Lai | Expert Systems with Applications
Many studies have focused on decision support systems that enhance both the efficiency and safety of driving. They have also explored the potential of real-time psychological data and machine learning in predicting drivers’ cognitive state, such as their fatigue levels, drowsiness, or workload. However, few studies have investigated prediction of driving goals as a cognitive outcome. Early prediction plays an essential role in providing active decision support during driving events under time pressure conditions. In this study, machine learning algorithms and features associated with different phases of decision-making were used to predict two common driving goals: defensive driving in emerging scenarios and urgent reactions in nonroutine scenarios. The effects of perception-, reflex-, control-, and kinetic-related features and how they contribute to prediction in the context of decision-making were analyzed. A total of 49 individuals were recruited to complete simulated driving tasks, with 237 events of defensive driving and 271 events of urgent reactions identified. The results revealed premium recall with a naïve Bayes classifier, indicating the onset of decision-making, with extreme gradient boosting and random forests exhibiting superior precision in predicting defensive driving and urgent reactions, respectively. Additionally, the cutoff of the initial 0.4 s of the events was identified. Before the cutoff, the leading features were reflex- and control-related features, which were the drivers’ immediate reactions before scenario evaluation and goal determination. These leading features contributed to superior prediction results for the two types of driving goals, indicating the likelihood of early detection. After the cutoff, model performance decreased, and lagging features came into play. These lagging features comprised perception- and kinetic-related features, reflecting observation of cues and outcomes of inputs delivered to vehicles. In the first 2 s, predictive models recovered and stabilized.

User Identification via Free Roaming Eye Tracking Data

Rishabh Vallabh Varsha Haria; Amin El Abed; Sebastian Maneth | Preprint
We present a new dataset of "free roaming" (FR) and "targeted roaming" (TR): a pool of 41 participants is asked to walk around a university campus (FR) or is asked to find a particular room within a library (TR). Eye movements are recorded using a commodity wearable eye tracker (Pupil Labs Neon at 200Hz). On this dataset we investigate the accuracy of user identification using a previously known machine learning pipeline where a Radial Basis Function Network (RBFN) is used as classifier. Our highest accuracies are 87.3% for FR and 89.4% for TR. This should be compared to 95.3% which is the (corresponding) highest accuracy we are aware of (achieved in a laboratory setting using the "RAN" stimulus of the BioEye 2015 competition dataset). To the best of our knowledge, our results are the first that study user identification in a non laboratory setting; such settings are often more feasible than laboratory settings and may include further advantages. The minimum duration of each recording is 263s for FR and 154s for TR. Our best accuracies are obtained when restricting to 120s and 140s for FR and TR respectively, always cut from the end of the trajectories (both for the training and testing sessions). If we cut the same length from the beginning, then accuracies are 12.2% lower for FR and around 6.4% lower for TR. On the full trajectories accuracies are lower by 5% and 52% for FR and TR. We also investigate the impact of including higher order velocity derivatives (such as acceleration, jerk, or jounce).
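Two preprocessing steps mentioned in this abstract, cutting a fixed duration from the end of a recording and deriving higher-order velocity derivatives, can be illustrated with a minimal sketch. It assumes a one-dimensional gaze coordinate sampled uniformly at the stated 200 Hz and uses plain forward differences; the function names are hypothetical and the authors' actual pipeline may differ.

```python
from typing import Dict, List

SAMPLE_RATE_HZ = 200.0  # sampling rate stated in the abstract


def tail(samples: List[float], seconds: float,
         rate_hz: float = SAMPLE_RATE_HZ) -> List[float]:
    """Keep only the last `seconds` of a recording (cut from the end)."""
    n = int(round(seconds * rate_hz))
    return samples[-n:] if n > 0 else []


def derivative(values: List[float],
               rate_hz: float = SAMPLE_RATE_HZ) -> List[float]:
    """Forward finite difference; the result is one sample shorter."""
    dt = 1.0 / rate_hz
    return [(b - a) / dt for a, b in zip(values, values[1:])]


def motion_features(positions: List[float], max_order: int = 4,
                    rate_hz: float = SAMPLE_RATE_HZ) -> Dict[str, List[float]]:
    """Differentiate repeatedly: velocity, acceleration, jerk, jounce."""
    names = ["velocity", "acceleration", "jerk", "jounce"]
    features: Dict[str, List[float]] = {}
    current = positions
    for name in names[:max_order]:
        current = derivative(current, rate_hz)
        features[name] = current
    return features
```

Each differentiation shortens the series by one sample and amplifies measurement noise, so real pipelines typically smooth the signal first; that step is omitted here.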

Improving Human-Robot Team Transparency with Eye-tracking based Situation Awareness Assessment

2024 | HCI, Robotics | Core
Favour Aderinto; Josh Bhagat Smith; Prakash Baskaran; Julie Adams; Mark-Robin Giolando | 2024 ACM/IEEE International Conference on Human-Robot Interaction
Human-robot interactions rely on transparency to foster effective collaboration. Transparency can be assessed through metrics associated with factors such as situation awareness. This manuscript presents an ocular metric to assess situation awareness for human-machine teams. Participants used a decision support system to select a grasp for underwater manipulation. The participants' gaze behavior and visual awareness were analyzed using a wearable eye tracker. An initial analysis that measures saccadic distance provides insight into the requirements of future techniques for objectively assessing situation awareness.

MarkupLens: An AI-Powered Tool to Support Designers in Video-Based Analysis at Scale

Tianhao He; Ying Zhang; Evangelos Niforatos; Gerd Kortuem | Preprint
Video-Based Design (VBD) is a design methodology that utilizes video as a primary tool for understanding user interactions, prototyping, and conducting research to enhance the design process. Artificial Intelligence (AI) can be instrumental in video-based design by analyzing and interpreting visual data from videos to enhance user interaction, automate design processes, and improve product functionality. In this study, we explore how AI can enhance professional video-based design with a State-of-the-Art (SOTA) deep learning model. We developed a prototype annotation platform (MarkupLens) and conducted a between-subjects eye-tracking study with 36 designers, annotating videos with three levels of AI assistance. Our findings indicate that MarkupLens improved design annotation quality and productivity. Additionally, it reduced the cognitive load that designers exhibited and enhanced their User Experience (UX). We believe that designer-AI collaboration can greatly enhance the process of eliciting insights in video-based design.

Contextualizing remote fall risk: Video data capture and implementing ethical AI

2024 | Artificial Intelligence, Clinical, Computer Vision | Core
Jason Moore; Peter McMeekin; Thomas Parkes; Richard Walker; Rosie Morris; Samuel Stuart; Victoria Hetherington; Alan Godfrey | npj Digital Medicine
Wearable inertial measurement units (IMUs) are being used to quantify gait characteristics that are associated with increased fall risk, but the current limitation is the lack of contextual information that would clarify IMU data. Use of wearable video-based cameras would provide a comprehensive understanding of an individual’s habitual fall risk, adding context to clarify abnormal IMU data. Generally, suggesting the use of wearable cameras to capture real-world video is taboo, with clinical and patient apprehension due to ethical and privacy concerns. This perspective proposes that routine use of wearable cameras could be realized within digital medicine through AI-based computer vision models to obfuscate/blur/shade sensitive information while preserving helpful contextual information for a comprehensive patient assessment. Specifically, no person sees the raw video data to understand context; rather, AI interprets the raw video data first to blur sensitive objects and uphold privacy. That may be more routinely achieved than one imagines, as contemporary resources exist. Here, to showcase the potential, an exemplar model is suggested via off-the-shelf methods to detect and blur sensitive objects (e.g., people) with an accuracy of 88%. The benefit of the proposed approach includes a more comprehensive understanding of an individual’s free-living fall risk (from free-living IMU-based gait) without compromising privacy. More generally, the video and AI approach could be used beyond fall risk to better inform habitual experiences and challenges across a range of clinical cohorts. Medicine is becoming more receptive to wearables as a helpful toolbox; camera-based devices should be plausible instruments.

Coordination of gaze and action during high-speed steering and obstacle avoidance

2024 | Driving, VR/AR | VR
Nathaniel V. Powell; Xavier Marshall; Gabriel J. Diaz; Brett R. Fajen | PLOS ONE
When humans navigate through complex environments, they coordinate gaze and steering to sample the visual information needed to guide movement. Gaze and steering behavior have been extensively studied in the context of automobile driving along a winding road, leading to accounts of movement along well-defined paths over flat, obstacle-free surfaces. However, humans are also capable of visually guiding self-motion in environments that are cluttered with obstacles and lack an explicit path. An extreme example of such behavior occurs during first-person view drone racing, in which pilots maneuver at high speeds through a dense forest. In this study, we explored the gaze and steering behavior of skilled drone pilots. Subjects guided a simulated quadcopter along a racecourse embedded within a custom-designed forest-like virtual environment. The environment was viewed through a head-mounted display equipped with an eye tracker to record gaze behavior. In two experiments, subjects performed the task in multiple conditions that varied in terms of the presence of obstacles (trees), waypoints (hoops to fly through), and a path to follow. Subjects often looked in the general direction of things that they wanted to steer toward, but gaze fell on nearby objects and surfaces more often than on the actual path or hoops. Nevertheless, subjects were able to perform the task successfully, steering at high speeds while remaining on the path, passing through hoops, and avoiding collisions. In conditions that contained hoops, subjects adapted how they approached the most immediate hoop in anticipation of the position of the subsequent hoop. Taken together, these findings challenge existing models of steering that assume that steering is tightly coupled to where actors look. We consider the study’s broader implications as well as limitations, including the focus on a small sample of highly skilled subjects and inherent noise in measurement of gaze direction.

Comparing eye–hand coordination between controller-mediated virtual reality, and a real-world object interaction task

2024 | Applied Psychology, VR/AR | VR
Ewen Lavoie; Jacqueline S. Hebert; Craig S. Chapman | Journal of Vision
Virtual reality (VR) technology has advanced significantly in recent years, with many potential applications. However, it is unclear how well VR simulations mimic real-world experiences, particularly in terms of eye–hand coordination. This study compares eye–hand coordination from a previously validated real-world object interaction task to the same task re-created in controller-mediated VR. We recorded eye and body movements and segmented participants’ gaze data using the movement data. In the real-world condition, participants wore a head-mounted eye tracker and motion capture markers and moved a pasta box into and out of a set of shelves. In the VR condition, participants wore a VR headset and moved a virtual box using handheld controllers. Unsurprisingly, VR participants took longer to complete the task. Before picking up or dropping off the box, participants in the real world visually fixated the box about half a second before their hand arrived at the area of action. This 500-ms minimum fixation time before the hand arrived was preserved in VR. Real-world participants disengaged their eyes from the box almost immediately after their hand initiated or terminated the interaction, but VR participants stayed fixated on the box for much longer after it was picked up or dropped off. We speculate that the limited haptic feedback during object interactions in VR forces users to maintain visual fixation on objects longer than in the real world, altering eye–hand coordination. These findings suggest that current VR technology does not replicate real-world experience in terms of eye–hand coordination.

Robotic Care Equipment Improves Communication between Care Recipient and Caregiver in a Nursing Home as Revealed by Gaze Analysis: A Case Study

2024 | Clinical, Robotics | Invisible
Tatsuya Yoshimi; Kenji Kato; Keita Aimoto; Izumi Kondo | International Journal of Environmental Research and Public Health
The use of robotic nursing care equipment is an important option for solving the shortage of nursing care personnel, but the effects of its introduction have not been fully quantified. Hence, we aimed to verify that face-to-face care is still provided by caregivers in transfer situations when using robotic nursing care equipment. This study was conducted at a nursing home where the bed-release assist robot “Resyone Plus” is installed on a long-term basis. Caregiver gaze was analyzed quantitatively for one user of the equipment during transfer situations, and communication time, which involved looking at the face of the care recipient, as well as face-to-face vocalization, was measured. The caregiver spent 7.9 times longer looking at the face of and talking to the care recipient when using Resyone than when performing a manual transfer. In addition, the recipient was observed to smile during Resyone separation, which takes about 30 s. The results indicate a possible improvement in the QOL of care recipients through the use of robotic nursing care equipment as a personal care intervention. The ongoing development of robot technology is thus expected to continue to reduce the burden of caregiving as well as to improve the QOL of care recipients.

Automated visual acuity estimation by optokinetic nystagmus using a stepped sweep stimulus

2024 | Clinical, Oculomotor, Ophthalmology | Invisible
Jason Turuwhenua; Zaw LinTun; Mohammad Norouzifard; Misty Edmonds; Rebecca Findlay; Joanna Black; Benjamin Thompson | medRxiv
Purpose: Measuring visual acuity (VA) can be challenging in adults with cognitive impairment and young children. We developed an automatic system for measuring VA using Optokinetic Nystagmus (OKN). Methods: VA-OKN and VA by ETDRS (VA-ETDRS) were measured monocularly in healthy participants (n=23, age 30±12). VA was classified as reduced (n=22, >0.2 logMAR) or not (n=24, ≤0.2 logMAR) in each eye. VA-OKN stimulus was an array of drifting (5 deg/sec) vanishing disks presented in descending/ascending size order (0.0 to 1.0 logMAR in 0.1 logMAR steps). The stimulus was stepped every 2 seconds, and 10 sweeps were shown per eye. Eye tracking data determined when OKN activity ceased (descending sweep) or began (ascending sweep) to give an automated sweep VA. Sweep traces were randomized and assessed by a reviewer blinded to VA-ETDRS. A final per sweep VA and VA-OKN was thereby determined. Results: A single randomly selected eye was used for analysis. VA deficit group: There was no significant difference between overall mean VA-OKN and VA-ETDRS (p>0.05, paired t-test) and the r² statistic was 0.84. The 95% limits of agreement were 0.19 logMAR. No VA deficit group: There was a 0.24 logMAR bias between VA-OKN and VA-ETDRS and no correlation was found (r² = 0.06). However, the overall sensitivity/specificity for classification was 100%. Conclusions: A robust correlation between VA-ETDRS and VA-OKN was found. The method correctly detected a VA deficit. Translational relevance: OKN is a promising method for measuring VA in cognitively impaired adults and pre-verbal children.
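The stepped sweep timing described in the Methods (0.0 to 1.0 logMAR in 0.1 steps, one step every 2 seconds) can be laid out as a short sketch. The schedule follows the stated parameters; the `sweep_va` rule, which takes the smallest level at which OKN was present, is a simplifying assumption made here for illustration and is not claimed to be the authors' scoring procedure.

```python
from typing import List, Optional, Tuple


def sweep_levels(descending: bool = True) -> List[float]:
    """logMAR levels 0.0..1.0 in 0.1 steps; larger logMAR means larger disks."""
    levels = [round(i / 10, 1) for i in range(11)]
    return levels[::-1] if descending else levels


def schedule(descending: bool = True,
             step_s: float = 2.0) -> List[Tuple[float, float]]:
    """(onset time in seconds, logMAR level) for each step of one sweep."""
    return [(k * step_s, lv) for k, lv in enumerate(sweep_levels(descending))]


def sweep_va(levels: List[float], okn_present: List[bool]) -> Optional[float]:
    """Assumed rule: smallest logMAR level at which OKN was observed."""
    seen = [lv for lv, present in zip(levels, okn_present) if present]
    return min(seen) if seen else None
```

With 11 levels at 2 seconds each, one sweep lasts 22 seconds, so the 10 sweeps per eye mentioned in the abstract amount to 220 seconds of stimulus time.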