Cognitive Effort Measures Driven by Fixation Induced Retinal Flow in Visual Scanning Behavior during Virtual Driving
Abstract
In this paper, we consider the visual scanning mechanism underpinning sensorimotor tasks, such as walking and driving, in dynamic environments. We exploit eye tracking data to offer two new cognitive effort measures for visual scanning behavior in virtual driving. Using the retinal flow induced by fixation, the two measures are defined through the importance of grids in the viewing plane and through the concept of information quantity, respectively. In psychophysical studies, both proposed measures correlate significantly with widely used objective measurements of cognitive effort. Our results suggest that the quantitative exploitation of eye tracking data provides an effective approach for the evaluation of sensorimotor activities.
Keywords: Virtual/Augmented Reality, Information Theory, Eye Tracking
1 Introduction
Visual scanning and eye tracking are important to human activity in natural surroundings [1, 2]. Visual scanning is the foundation on which a human performs common, everyday sensorimotor tasks such as walking and driving. Understanding the mechanism behind visual scanning has been considered valuable since the late 1970s, and it is especially helpful for making progress from both theoretical and practical perspectives [3].
A fundamental question in understanding the visual scanning mechanism is how much visual information a human can sample through visual scanning [3]. Cognitive effort is an important approach to comprehending cognitive control and visual scanning behavior [4]. It depicts the subjective engagement used for assessing a human’s internal state during tasks [5]. Thus, cognitive effort plays an important role in visual scanning and visuo-motor behavior, but how to measure it, and in particular how to assess it objectively, has been a paramount concern in both theory and practice [4, 6].
Visual motion, which always occurs during the visual scanning behavior of sensorimotor tasks [7], has rarely been considered in measuring cognitive effort. In this paper, the importance of grids in the viewing plane is developed based on the so-called fixation induced retinal flow [8], a quantitative description of the visual motion in the interactive visual perception of environmental stimuli. A cognitive effort measure, called the view importance based cognitive effort measure (), is proposed by employing the Shannon entropy based complexity [9] of the probability distribution of the view importance of grids. Also based on the fixation induced retinal flow, the amount of perception of the visual motions during sensorimotor tasks is obtained using the square root of Jensen-Shannon divergence, which is a true mathematical metric [10]. Then, in terms of the concept of information quantity [9], the perceived amount of visual motions is transformed to define the information quantity based cognitive effort measure (), which helps in understanding the cognitive status of humans during driving tasks. To the best of our knowledge, this is the first time that visual motion and a true mathematical metric have been exploited to define quantitative and objective evaluations of the classical, subjective cognitive effort. Our proposal paves a novel path for behaviometric discovery through the utilization of eye tracking data.
2 Related Works
An initial consideration in measuring visual scanning behavior is the selection of suitable eye tracking indices [2] that are naturally associated with cognitive processing. Fixation and saccade [2] are classic indices for this purpose. However, direct and indirect uses of these indices (for example, their rates, durations, and simple combinations) are more applicable in specific application scenarios than in the general evaluation of visual scanning behavior [11]. In addition, pupil dilation and blink rate are two widely used eye tracking indices for the study of cognition and psychology [12]. Notably, eye tracking is expected to be a strong estimator of task performance in many professional fields [13].
Knowledge of cognitive effort plays a significant role in all kinds of cognitive processing [5, 6, 14]. Cognitive effort, a concept that originated in educational psychology, is a classic measure of subjective engagement [5]. From the perspective of cost-benefit decision-making, cognitive effort is deemed the amplitude or intensity of behavior in exercising cognitive control to accomplish tasks [14]. The assessment of cognitive effort, which is used to estimate a human’s internal state, has been strongly encouraged in ergonomics and human factors [6]. The measurement of cognitive effort in visual scanning and visuo-motor behavior has also attracted considerable attention in both theory and practice [4]. However, visual motion, which usually appears during sensorimotor activities, has not been used in the measurement of cognitive effort. In addition, evaluating cognitive effort, particularly in an objective way, remains a big challenge [6].
The visual motion perceived during a fixation in a sensorimotor task, called the fixation induced retinal flow in this paper, was introduced based on the concepts of eye tracking and optical flow in the literature on visual scanning and visuo-motor behavior [8]. Because the fixation induced retinal flow is the fundamental basis for the two proposed measures, its methodology is described in Section 3.1.
Previous research has also demonstrated numerous eye movement measures related to human cognitive effort during tasks. For example, when a driver’s cognitive load increases, the driver’s periphery/mirror/instrument check rate (hereafter referred to as check rate) tends to decrease [15]. Stationary gaze entropy is used to measure the level of fixation dispersion during the eye scanning process [3]. Shiferaw et al. found that during driving, if the driver is in an abnormal state such as hungover or fatigued, the stationary gaze entropy shows significant changes [3]. Entropy rate, a concept in information theory, describes the rate at which a random process generates information. In more specific applications, it can quantify the uncertainty and complexity of a signal or data sequence. It has also been confirmed to change with variations in cognitive load [16].
3 Methods
3.1 The Retinal Flow induced by Fixation and Visual Scanning Efficiency
The retinal flow induced by a fixation is introduced based on the identification of the visual motion resulting from a fixated stimulus [8].
A fixation (its index is ) with duration by an observer in a 3D environment is shown in Fig. 1. Here, is the distance between and in the direction of viewing, and is the depth of from the perspective of . Considering, during , there is a relative motion displacement happened to , and is taken as the optical flow vector [7] for the fixated stimulus by . Actually, is the projection of in the direction of optical flow vector, and thus perpendicular to the direction of viewing. For computation simplicity, the length of trajectory segment of the circular motion of centered on , , acts as an approximated magnitude of . The central angle subtended by ,
(1) |
is further used to define a perceived magnitude of optical flow, for characterizing the amount of visual motion perceived by the observer during the fixation . is called as the fixation induced retinal flow in this paper, because this quantity represents the amount of perceived optical flow in the course of a fixation. The definition of meets the usual practice that, angle is widely used as the representation of magnitude or amount in eye tracking [2]. Note explicitly uses the depth cue , enabling the fixation induced retinal flow to convey this important and special cue in 3D environments.
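The geometric idea in this paragraph can be illustrated with a minimal sketch. It relies only on the identity that a central angle in radians equals arc length divided by radius; treating the depth of the fixated stimulus as the radius, and the names `arc_length` and `depth`, are assumptions made here for illustration and are not taken from the paper's equation.

```python
import math


def central_angle_deg(arc_length: float, depth: float) -> float:
    """Central angle (in degrees) subtended by an arc.

    Assumption: the arc approximates the stimulus displacement during a
    fixation, and `depth` (the radius) is the stimulus depth from the observer.
    """
    if depth <= 0:
        raise ValueError("depth must be positive")
    # Angle in radians = arc length / radius, then converted to degrees.
    return math.degrees(arc_length / depth)


# The same displacement subtends a larger angle when the stimulus is closer.
near = central_angle_deg(arc_length=0.5, depth=5.0)
far = central_angle_deg(arc_length=0.5, depth=50.0)
```

Under these assumptions, a nearby stimulus yields a larger perceived angle than a distant one for the same displacement, which is how a depth cue can enter an angle-based magnitude.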
Visual scanning is performed through a sequence of fixations, which meets the requirement of visually sampling the surrounding environment to fulfill a sensorimotor task. The probability distribution of fixations , namely the fixation distribution, which is built from the normalized histogram of fixation locations in a 3D environment, is used as a representation of the visual scanning behavior. The fixation induced retinal flow is used to construct the retinal flow probability distribution . Previous work measured the difference between and by the Square Root of Jensen-Shannon Divergence (SRJSD) between them,
(2) |
for assessing the so-called visual scanning efficiency [8]. plays a basic function for understanding the mechanism of visual scanning behavior.
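As a rough illustration of the SRJSD computation described above, the following sketch compares two discrete distributions defined over the same bins. The function names and the choice of base-2 logarithms (which bounds the result in [0, 1]) are assumptions of this sketch; the paper's exact weighting and notation are not reproduced here.

```python
import math


def shannon_entropy(p):
    """Shannon entropy in bits; zero-probability bins contribute nothing."""
    return -sum(x * math.log2(x) for x in p if x > 0)


def srjsd(p, q):
    """Square root of the Jensen-Shannon divergence between p and q.

    Both inputs are assumed to be probability distributions over the same
    bins (for example, a fixation distribution and a retinal flow distribution).
    """
    if len(p) != len(q):
        raise ValueError("distributions must have the same number of bins")
    mixture = [(a + b) / 2 for a, b in zip(p, q)]
    jsd = shannon_entropy(mixture) - (shannon_entropy(p) + shannon_entropy(q)) / 2
    # Guard against tiny negative values caused by floating-point rounding.
    return math.sqrt(max(jsd, 0.0))


identical = srjsd([0.5, 0.5], [0.5, 0.5])
disjoint = srjsd([1.0, 0.0], [0.0, 1.0])
```

With base-2 logarithms, identical distributions give 0 and distributions with disjoint support give 1, so the value can be read as a normalized distance.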
3.2 The Proposed Cognitive Effort Measure Based on the View Importance
During a sensorimotor task such as driving, for the purpose of safety and stability, the driver usually pays different amounts of attention to different regions in the viewing plane. For example, if the road being driven appears mainly in the central viewing regions, central and peripheral viewing regions receive large and small amounts of attention and/or importance, respectively, to achieve stable driving. That is to say, the importance of stimulus observation is an important factor in performing driving tasks, considering that a stimulus in a 3D environment corresponds to at least one region in the viewing plane.
In this paper, the amount of perceived visual motion resulting from a fixated stimulus, which is characterized as the fixation induced retinal flow, is utilized to define
(3) |
as the importance of stimulus observation. Notice that the observation importance takes into account the motion displacement of a fixation during its duration, explicitly signaling the influence of the eye yaw rotation on the observation of the fixated stimulus. This means that when is larger the fixation is easier to obtain, and conversely, when is smaller the fixation is more challenging to obtain. That is, from the perspective of stimulus observation, provides an indicator of how much effort should be exerted to observe and perceive a stimulus. From the viewpoint of the stimulus itself, offers a measurement of its importance for observation and perception.
The view importance of a region in the viewing plane is then obtained by accumulating all the values of the observation importance of the stimuli corresponding to this region. A region corresponds to a single element of the grids in the viewing plane (currently, achieves good results in this paper, and other options on will be explored in future work). The normalized histogram of the view importance values of the grids is created to obtain a probability distribution of the view importance of grids. The Shannon entropy
(4) |
of this probability distribution is proposed to evaluate the degree of balance in visually scanning various grids in the viewing plane, leading to the view importance based cognitive effort measure () in our paper. This entropy based complexity evaluation of the view importance of grids takes into account the non-trivial interaction between the observations on various grids. As a result, this proposed indicates the degree of a systematic perception of all the visual motions during sensorimotor driving, suggests an intensity or amplitude of the visual scanning behavior, and behaves as an assessment function for cognitive effort.
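The accumulation, normalization and entropy steps described in this subsection can be sketched as follows. This is only an illustration under stated assumptions: fixation positions are taken as normalized viewing-plane coordinates in [0, 1), the per-fixation observation importance is supplied as an input rather than computed, and `grid_cols` and `grid_rows` are hypothetical parameters, since the grid size used in the paper is not reproduced here.

```python
import math
from collections import defaultdict


def view_importance_entropy(fixations, grid_cols, grid_rows):
    """Shannon entropy (bits) of view importance accumulated over grid cells.

    fixations: iterable of (x, y, importance), with x and y in [0, 1).
    grid_cols, grid_rows: hypothetical grid resolution of the viewing plane.
    """
    totals = defaultdict(float)
    for x, y, importance in fixations:
        # Map the normalized position to a grid cell, clamping the upper edge.
        col = min(int(x * grid_cols), grid_cols - 1)
        row = min(int(y * grid_rows), grid_rows - 1)
        totals[(row, col)] += importance
    grand_total = sum(totals.values())
    if grand_total <= 0:
        raise ValueError("total importance must be positive")
    # Normalized histogram of view importance over the visited cells.
    probs = [v / grand_total for v in totals.values() if v > 0]
    return -sum(p * math.log2(p) for p in probs)


# Importance concentrated in one cell versus spread evenly over four cells.
focused = view_importance_entropy([(0.1, 0.1, 1.0), (0.2, 0.2, 3.0)], 2, 2)
balanced = view_importance_entropy(
    [(0.1, 0.1, 1.0), (0.6, 0.1, 1.0), (0.1, 0.6, 1.0), (0.6, 0.6, 1.0)], 2, 2
)
```

In this sketch the entropy is lowest when all importance falls in a single cell and highest when it is spread evenly, which matches the reading of the measure as a degree of balance across grids.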
3.3 The Information Quantity of Perceived Visual Motion and the Corresponding Proposed Cognitive Effort Measure
The developed measure of the amount of perceived visual motions, , can be studied from the perspective of probability and information theory [9]. That is, , in fact, can be considered as the probability that an event occurs, because it ranges from 0 to 1 [10]. As a result, gives a probability of perception of the visual motions. According to information theory, the logarithmic probability of occurrence () represents the quantity of information conveyed by the occurrence [9]. The information quantity thus indicates the quantified amount of perception of all the visual motions during a sensorimotor task. Notably, the amplitude for the perception of visual motions and for the completion of sensorimotor tasks reflects the meaning of cognitive effort [14]. Thus the logarithmic transformation of perceived visual motion is taken as the core function for the proposed information quantity based cognitive effort measure ().
Because the distributions of fixations and fixation transitions reflect different aspects of visual scanning, a combination of these two distributions should provide a more complete understanding of visual scanning behavior, as has been pointed out in relevant work [3]. Indeed, the performer of visual scanning voluntarily exerts some cognitive effort to make a unidirectional switch between two neighboring fixations, and this effort is measured by the visual motion that is induced by the first fixation and perceived in the course of completing one fixation transition. Therefore, we propose to utilize the fixation transition distribution and the retinal flow distribution based on fixation transition to obtain the definition of . The approach to obtaining is similar to that of the fixation distribution , but here the fixation sequence is employed. Correspondingly, can be easily obtained based on . gives an information quantity for the perceived visual motion during a single fixation transition. We define by the division between these two information quantities, and , as
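For readers less familiar with the notion of information quantity, a small worked sketch may help. It uses the standard self-information, the negative base-2 logarithm of a probability; the sign convention, the logarithm base, and the function name are assumptions of this sketch rather than details confirmed by the text above.

```python
import math


def information_quantity(probability: float) -> float:
    """Self-information in bits of an event with the given probability.

    Assumption: `probability` is a value in (0, 1], such as a bounded
    divergence-based quantity interpreted as a probability.
    """
    if not 0.0 < probability <= 1.0:
        raise ValueError("probability must be in (0, 1]")
    return -math.log2(probability)


half = information_quantity(0.5)
quarter = information_quantity(0.25)
```

Under this convention, smaller probabilities carry more information: halving the probability adds exactly one bit.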
(5) |
for characterizing a cognitive effort during a sensorimotor task.
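The fixation transition distribution mentioned above can be sketched as a normalized count of consecutive fixation pairs. This is a minimal illustration, assuming that each fixation has already been assigned a region label; the labels and the function name are hypothetical, and the retinal flow weighting and the final ratio of information quantities are deliberately left out because their exact form is not reproduced here.

```python
from collections import Counter


def transition_distribution(region_sequence):
    """Probability of each ordered transition between consecutive fixations.

    region_sequence: fixation region labels in temporal order.
    Returns a dict mapping (from_region, to_region) to its relative frequency.
    """
    pairs = list(zip(region_sequence, region_sequence[1:]))
    if not pairs:
        raise ValueError("at least two fixations are needed")
    counts = Counter(pairs)
    total = len(pairs)
    return {pair: n / total for pair, n in counts.items()}


dist = transition_distribution(["road", "mirror", "road", "mirror"])
```

Because transitions are ordered pairs, moving from the road to the mirror and moving back are counted separately, which reflects the unidirectional switch between neighboring fixations described in the text.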
4 Experiment
4.1 Participants
Fourteen Master’s/PhD students ( females; age range: -, Mean = , SD = ) with driving experience (each has held a driver’s license for at least one and a half years) from our university volunteered to participate in the psychophysical studies. All participants have normal or corrected-to-normal visual acuity and normal color vision. No participant had an adverse reaction to the virtual environment we set up for the studies.
4.2 Apparatus
An HTC Vive headset is used to display the virtual environment to participants. The eye-tracking equipment is a 7INVENSUN Instrument aGlass DKII, which is embedded into the HTC Vive display to capture visual scanning data at a frequency of Hz and with a gaze position accuracy of . The driving device is a Logitech G29 steering wheel. Participants listen to the ambient traffic and car engine sounds in the virtual environment (VE) through speakers. The visual and driving behaviors of participants are displayed on a desktop monitor for observation.
4.3 Driving Task
As discussed in Section 4.1, a task directed focus on visual scanning and driving is adopted. Accordingly, participants are required to keep driving at a target speed of km/h, for speed control. This paper takes the inverse of the mean acceleration of the vehicle to denote the driving performance: the smaller the mean acceleration, the higher the driving performance, and vice versa. This kind of performance measure has been widely used in the literature [17]. An example of performing driving tasks is presented in Fig. 2.
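The performance measure described above can be sketched as follows. The sketch assumes that speed is sampled at a fixed interval, that acceleration is estimated by finite differences, and that the mean is taken over acceleration magnitudes; these choices, and the names `speeds_mps` and `dt`, are illustrative assumptions rather than the paper's stated procedure.

```python
def driving_performance(speeds_mps, dt):
    """Inverse of the mean absolute acceleration over a trial.

    speeds_mps: vehicle speeds in m/s sampled every `dt` seconds (assumed).
    Returns infinity for a perfectly constant speed.
    """
    if len(speeds_mps) < 2 or dt <= 0:
        raise ValueError("need at least two samples and a positive dt")
    # Finite-difference estimate of acceleration magnitude between samples.
    accels = [abs(b - a) / dt for a, b in zip(speeds_mps, speeds_mps[1:])]
    mean_acc = sum(accels) / len(accels)
    return float("inf") if mean_acc == 0 else 1.0 / mean_acc


steady = driving_performance([10.0, 10.1, 10.0, 10.1], dt=1.0)
jerky = driving_performance([10.0, 12.0, 10.0, 12.0], dt=1.0)
```

With these assumptions, a trial with small speed fluctuations around the target scores higher than one with large fluctuations.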
4.4 Procedure
Each participant completes 4 test sessions with the same task requirements and the same driving routes, with a 9-point calibration of the eye tracker at the beginning of each session and an interval of one week between consecutive sessions. Data on visual scanning and driving behaviors are recorded during the test sessions. In this paper, a trial represents a test session, and there are valid trials in all (a sufficiently large sample size [18]). A preparation session is given to participants before each test session to inform them of the purpose and procedure of the studies.
| Correlation | Pupil Size Change | | | | | | Fixation Rate | | | | | |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| | Pearson | | Kendall | | Spearman | | Pearson | | Kendall | | Spearman | |
| | CC | p-value | CC | p-value | CC | p-value | CC | p-value | CC | p-value | CC | p-value |
| | 0.38 | 0.01 | 0.27 | 0.01 | 0.39 | 0.01 | -0.19 | 0.05 | -0.04 | 0.05 | -0.04 | 0.05 |
| | 0.27 | 0.05 | 0.20 | 0.05 | 0.27 | 0.05 | -0.46 | 0.001 | -0.23 | 0.05 | -0.36 | 0.01 |
| | 0.15 | 0.05 | 0.14 | 0.05 | 0.21 | 0.05 | 0.35 | 0.01 | 0.32 | 0.01 | 0.46 | 0.01 |
| | 0.03 | 0.05 | 0.02 | 0.05 | 0.05 | 0.05 | 0.01 | 0.05 | -0.02 | 0.05 | -0.04 | 0.05 |
| | -0.02 | 0.05 | 0.01 | 0.05 | 0.03 | 0.05 | -0.07 | 0.05 | -0.10 | 0.05 | -0.18 | 0.05 |
5 Results and Discussions
5.1 Correlation Results
Pupil size change is widely used in the literature as a measure of cognitive effort [12, 19] and has been accepted as an autonomic and reflexive measure of it. In this paper, the standard deviation of pupil size [4, 20] during each trial is utilized to represent the pupil size change because of its simplicity and effectiveness. The fixation rate is also used to measure cognitive effort [21], because studies have indicated that pupil size is influenced by factors other than cognitive effort [22]. Cognitive load affects both the human pupillary response and fixation based eye movements. Therefore, both pupil size change and fixation rate are taken in this paper as the quantitative ground truth for cognitive effort. Three classic quantitative measures of cognitive effort, namely check rate [15], stationary gaze entropy [3] and entropy rate [16], are used as a comparison for evaluating the effectiveness of our proposed measures.
We validate the relationship between the two proposed measures and pupil size change/fixation rate through three commonly used correlation coefficients: the Pearson Linear Correlation Coefficient (PLCC), the Kendall Rank Order Correlation Coefficient (KROCC) and the Spearman Rank Order Correlation Coefficient (SROCC). The correlation results are listed in Table 1. We find that, shows a significant correlation with both pupil size change and fixation rate. exhibits a significant correlation with pupil size change. The check rate is not related to pupil size change, yet it has a significant correlation with fixation rate. However, stationary gaze entropy and entropy rate are not correlated with pupil size change or fixation rate. The correlation analysis between the eye movement measures and pupil size change/fixation rate is also shown in Fig. 3.
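The three correlation coefficients used in this analysis can be illustrated with a self-contained sketch. The data below are made up purely for illustration and are not the study's measurements; the Kendall coefficient is implemented as the simple tau-a variant without tie correction, which is an assumption since the variant used in the paper is not stated, and significance testing (p-values) is omitted.

```python
import math


def pearson(x, y):
    """Pearson linear correlation coefficient."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def ranks(values):
    """1-based ranks; tied values share the mean of their rank positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = mean_rank
        i = j + 1
    return result


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


def kendall_tau_a(x, y):
    """Kendall tau-a: (concordant - discordant) pairs over all pairs."""
    n = len(x)
    concordant = discordant = 0
    for i in range(n):
        for j in range(i + 1, n):
            s = (x[i] - x[j]) * (y[i] - y[j])
            if s > 0:
                concordant += 1
            elif s < 0:
                discordant += 1
    return (concordant - discordant) / (n * (n - 1) / 2)


# Illustrative values only: a hypothetical measure and a monotone response.
measure = [1.0, 2.0, 3.0, 4.0, 5.0]
response = [1.0, 4.0, 9.0, 16.0, 25.0]
plcc = pearson(measure, response)
srocc = spearman(measure, response)
krocc = kendall_tau_a(measure, response)
```

For a monotone but nonlinear relationship such as this one, the two rank-based coefficients reach 1 while the Pearson coefficient stays below 1, which is one reason for reporting all three.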
5.2 General Discussions
The proposed cognitive effort measures are based on the methodology of information theory and take advantage of the perceived visual motion that always occurs in dynamic environments during sensorimotor tasks. Our proposal satisfies the definition of cognitive effort in terms of information processing [14]. The significant positive correlation between and pupil size change suggests the principle that the more chaotic and varied the distribution of the importance of different areas in the driver’s viewing plane, the higher the corresponding cognitive effort on the driver. Indeed is designed based on this principle to assess the driver’s cognitive load. Overall, achieves the best performance in measuring cognitive effort, from both pupillary and fixation perspectives. From the viewpoints of both pupil size and fixation, a consistent conclusion can be drawn: the larger the , the greater the cognitive effort. The significant correlation between and the ground truth of cognitive effort demonstrates that the higher the proportion of perceived visual motion information among all perceived potential eye movement changes, the higher the cognitive effort on the driver. The achievement by is also evidenced by a comparison with other classic eye movement based measurements of cognitive effort. Among the three measures under comparison, only the check rate has a significant correlation with the fixation rate. We believe this is because the check rate itself is specifically related to the fixation distribution for driving. In short, we consider our proposed to have the best robustness, being applicable in a broader range of scenarios and potentially yielding a more accurate measurement of cognitive effort. Meanwhile, we believe that the definition of cognitive effort in terms of the information quantity and of “physics” is worthwhile, and we will continue to explore this avenue in depth.
Having made progress on exploiting eye tracking data, as a behaviometric, for the evaluation of cognitive effort, we will further investigate the relationship between the two proposed measures and the performance of sensorimotor tasks in future work. This could be a working path guided by the Yerkes-Dodson law [23].
The findings of this paper may not be applicable in all cases, but they do hold in the context of our topic. Because visual scanning and visuo-motor behavior are exceptionally important in virtual and real-world sensorimotor tasks, what we have achieved on the measurement of cognitive effort in virtual driving should be helpful for ergonomic evaluation in many practical and relevant applications.
6 Conclusions and Future Works
In this paper, we take an important step toward thoroughly understanding the mechanism of visual scanning in virtual driving. This paper has established, in an objective and quantitative way, two new measures for the subjective cognitive effort, mainly by utilizing information theoretic tools. Our proposal rests on a methodology that exploits the perceived visual motions in a sensorimotor task. To the best of our knowledge, no research so far has reported this kind of finding on cognitive effort measures for visual scanning behavior during sensorimotor tasks. Additionally, the proposed cognitive effort measures may offer a new perspective on the inherent relationship between task directed visual scanning and eye tracking data, and so help the development of behaviometric discovery from both theoretical and practical perspectives.
In the near future, we will investigate our proposed methodology and measures in real-life driving scenarios, for instance, for the crash risk problem [24]. Considering the critical role of illumination conditions in driving, we will manipulate illumination levels in a detailed quantitative way to comprehensively understand the mechanism of cognitive effort during visual scanning behavior. Also, physiological signals such as heartbeat [25] will be investigated to understand the relationships and interplay between these signals and eye tracking data with respect to cognitive effort.
References
- [1] Shreya Ghosh, Abhinav Dhall, Munawar Hayat, Jarrod Knibbe, and Qiang Ji. Automatic gaze analysis: A survey of deep learning based approaches. IEEE Transactions on Pattern Analysis and Machine Intelligence, 46(1):61–84, 2023.
- [2] Kenneth Holmqvist, Marcus Nyström, Richard Andersson, Richard Dewhurst, Halszka Jarodzka, and Joost Van de Weijer. Eye tracking: A comprehensive guide to methods and measures. OUP Oxford, 2011.
- [3] Brook Shiferaw, Luke Downey, and David Crewther. A review of gaze entropy as a measure of visual scanning efficiency. Neuroscience & Biobehavioral Reviews, 96:353–366, 2019.
- [4] Pauline van der Wel and Henk Van Steenbergen. Pupil dilation as an index of effort in cognitive control tasks: A review. Psychonomic bulletin & review, 25:2005–2015, 2018.
- [5] Andrew Westbrook and Todd S Braver. Cognitive effort: A neuroeconomic approach. Cognitive, Affective, & Behavioral Neuroscience, 15:395–415, 2015.
- [6] Luca Longo, Christopher D Wickens, Gabriella Hancock, and Peter A Hancock. Human mental workload: A survey and a novel inclusive definition. Frontiers in psychology, 13:883321, 2022.
- [7] S Negahdaripour. Revised definition of optical flow: integration of radiometric and geometric cues for dynamic scene analysis. IEEE Transactions on Pattern Analysis and Machine Intelligence, 20(9):961–979, 1998.
- [8] Zezhong Lv, Qing Xu, Klaus Schoeffmann, and Simon Parkinson. A Jensen-Shannon divergence driven metric of visual scanning efficiency indicates performance of virtual driving. In 2021 IEEE International Conference on Multimedia and Expo (ICME), pages 1–6. IEEE, 2021.
- [9] Thomas M Cover and Joy A Thomas. Elements of information theory. John Wiley & Sons, 2012.
- [10] J Lin. Divergence measures based on the Shannon entropy. IEEE Transactions on Information Theory, 37(1):145–151, 1991.
- [11] Heejin Jeong, Ziho Kang, and Yili Liu. Driver glance behaviors and scanning patterns: Applying static and dynamic glance measures to the analysis of curve driving with secondary tasks. Human Factors and Ergonomics in Manufacturing & Service Industries, 29(6):437–446, 2019.
- [12] Maria K Eckstein, Belén Guerra-Carrillo, Alison T Miller Singley, and Silvia A Bunge. Beyond eye gaze: What else can eyetracking reveal about cognition and cognitive development. Developmental cognitive neuroscience, 25:69–91, 2017.
- [13] Amie C Hayley, Brook Shiferaw, and Luke A Downey. Amphetamine-induced alteration to gaze parameters: A novel conceptual pathway and implications for naturalistic behavior. Progress in neurobiology, 199:101929, 2021.
- [14] Amitai Shenhav, Sebastian Musslick, Falk Lieder, Wouter Kool, Thomas L Griffiths, Jonathan D Cohen, and Matthew M Botvinick. Toward a rational and mechanistic account of mental effort. Annual review of neuroscience, 40:99–124, 2017.
- [15] Dengbo He, Ziquan Wang, Elias B Khalil, Birsen Donmez, Guangkai Qiao, and Shekhar Kumar. Classification of driver cognitive load: exploring the benefits of fusing eye-tracking and physiological measures. Transportation research record, 2676(10):670–681, 2022.
- [16] Shurong Tong and Yafei Nie. Measuring designers’ cognitive load for timely knowledge push via eye tracking. International Journal of Human–Computer Interaction, 39(6):1230–1243, 2023.
- [17] Ankit Kumar Yadav and Nagendra R Velaga. Effect of alcohol use on accelerating and braking behaviors of drivers. Traffic Injury Prevention, 20(4):353–358, 2019.
- [18] Erich Leo Lehmann. Elements of large-sample theory. Springer, 1999.
- [19] Siddhartha Joshi, Yin Li, Rishi M Kalwani, and Joshua I Gold. Relationships between pupil diameter and neuronal activity in the locus coeruleus, colliculi, and cingulate cortex. Neuron, 89(1):221–234, 2016.
- [20] Siyuan Chen, Julien Epps, Natalie Ruiz, and Fang Chen. Eye activity as a measure of human mental effort in HCI. Intelligent User Interfaces, pages 315–318, 2011.
- [21] Alexis D Souchet, Stéphanie Philippe, Domitile Lourdeaux, and Laure Leroy. Measuring visual fatigue and cognitive load via eye tracking while learning with virtual reality head-mounted displays: A review. International Journal of Human–Computer Interaction, 38(9):801–824, 2022.
- [22] Bernhard Petersch and Kai Dierkes. Gaze-angle dependency of pupil-size measurements in head-mounted eye tracking. Behavior Research Methods, 54(2):763–779, 2022.
- [23] Paul A Watters, F Martin, and Zoltan Schreter. Caffeine and cognitive performance: The nonlinear Yerkes-Dodson law. Human Psychopharmacology: Clinical and Experimental, 12(3):249–257, 1997.
- [24] Bargman J Victor T, Dozza M. Analysis of naturalistic driving study data: Safer glances, driver inattention, and crash risk. Technical report, 2015.
- [25] Alejandro Galvez-Pol, Ruth Mcconnell, and James M. Kilner. Active sampling in visual search is coupled to the cardiac cycle. Cognition, 196, 2020.