Open access

The Influence that the Complexity of the Three-Dimensional Eye Model Used to Generate Simulated Eye-tracking Data Has on the Gaze Estimation Errors Achieved Using the Data

Published: 12 November 2024

Abstract

Simulated eye-tracking data are an integral tool in the development of eye-tracking methods. Most of the simulated data used in eye-tracking-related research has been generated using low-complexity eye models that include a single spherical corneal surface. This study investigated the influence of eye-model complexity on the ability of simulated eye-tracking data to predict real-world outcomes. The experimental procedures of two pertinent comparative eye-tracking studies were replicated in a simulated environment using various eye-model complexities. The simulated outcomes were then evaluated against the findings of the comparative studies that were derived from real-world outcomes. The simulated outcomes of both comparative studies were significantly influenced by the eye-model complexity. Eye models that included an aspheric corneal surface best replicated experimental eye-tracking outcomes, while including a posterior corneal surface did not improve the ability of simulated data to replicate real-world outcomes. Using a wide-angle eye model that accurately replicates the peripheral optics of the eye did not improve simulated outcomes relative to a paraxial eye model.

1 Introduction

An eye-tracker is used to record the movement of a user's eyes and estimate the position of a user's gaze. A typical eye-tracker consists of at least one camera positioned to view the user's eyes. The camera captures a continuous stream of images of the eye region from which image features such as the position of the pupil are extracted using a feature extraction algorithm. These features are then used to derive the estimated gaze point of the user using a gaze estimation algorithm. An overview of the various types of eye tracker configurations and gaze estimation algorithms is given by Reference [27]. The largest contributor to eye-tracking errors is thought to be the systematic errors inherent in the gaze estimation algorithms and their inability to compensate for the multitude of potential sources of eye-tracking errors such as head movements and changes in the size of the pupil [1, 2].
Simulations are frequently used during the development of eye-trackers to rapidly evaluate various configurations of hardware and algorithmic components against sources of eye-tracking errors [3, 4]. A simulation consists of computational models of the hardware components of an eye-tracker, such as the camera and light sources, as well as features of the user of the device, such as the user's eye [2, 5]. Modern simulation environments also include realistic head models that allow the simulation of synthetic images of the entire eye region [6-8].
It is important to validate simulated eye-tracking outcomes against the outcomes achieved in user studies performed under similar conditions to ensure that the simulated data used to inform the development of eye-tracking algorithms accurately reflect real-world outcomes. Decisions that are ill-informed by inaccurate simulated data may result in costly mistakes, such as conducting resource-intensive user studies using a sub-optimal hardware configuration or selecting feature extraction or gaze estimation algorithms that perform well in simulations but generate large errors in user studies.
Studies that investigate gaze estimation algorithms commonly use simulated outcomes that were not validated against real-world outcomes to derive conclusions about the performance of the algorithms. Studies of this nature include investigations of the robustness of algorithms to head movements [9], comparing the performance of algorithms for different calibration configurations of a head-mounted eye-tracker [10], and the influence of corrective lenses on eye-tracking outcomes [11]. The studies that do validate their simulated data against the outcomes of user studies report conflicting outcomes. Studies such as References [12] and [13] report simulated gaze estimation errors significantly smaller than the corresponding real-world errors. Conversely, studies such as Reference [14] found significantly larger gaze estimation errors generated by simulated data than the corresponding real-world errors.
These studies indicate a clear discrepancy between simulated and real-world eye-tracking outcomes. The discrepancy between simulated and real-world outcomes is further demonstrated by the large errors achieved by learning-based gaze estimation algorithms that were trained on simulated data and evaluated on real images [15-17]. These findings suggest that further improvements to the realism of simulated data are required.
The eye model used in a simulation consists of several computational surfaces meant to replicate the eye's optical performance and is thought to have the largest influence on the outcomes of a simulation [2]. The complexity of an eye model refers to the number of parameters included in the model, with additional parameters providing a more faithful representation of the anatomy and optical performance of the human eye [18]. Various complexities of eye models have been published in the literature, ranging from as few as three parameters [19] to as many as 97 [20]. The simulated data used in the literature are commonly generated using low-complexity eye models that simulate the cornea as a spherical surface [2, 21, 22].
This bias towards using low-complexity eye models warrants some caution, as Reference [23] showed that eye models with a spherical corneal surface only accurately replicate the central visual field performance of the eye and are not suited for investigations over larger viewing angles. Furthermore, References [25] and [26] have demonstrated the influence that the peripheral optics of the eye have on the dynamics of the image features used to perform eye-tracking, such as pupil and glint features [27]. Based on these findings, it is reasonable to suspect that the complexity of the eye model used to generate simulated data may influence the ability of the data to predict real-world outcomes. A lack of eye-model complexity is also often discussed as a limitation of studies that use simulated data [1, 21].
In summary, the prevalence of simulated eye-tracking data, the disparities between simulated and real-world outcomes, the concerns surrounding the influence of the peripheral optics of the eye on eye-tracking outcomes, and the gaps in the current understanding of how eye-model complexity affects the predictive power of simulated data together motivate the present investigation. The work presented in this study evaluated the simulated data produced by five eye models of increasing complexity based on their ability to predict the real-world eye-tracking outcomes reported by two comparative studies. This work has notable applications both for the eye-tracking community and for graphics, perception, and vision researchers, as it may assist researchers in these fields in understanding how the complexity of the eye model used to generate simulated data influences the outcomes achieved using the data. The source code for the simulations performed in this study is available online.

2 Method

This section describes the development of the simulation environment that was used to replicate the real-world eye-tracking experiments performed by two comparative studies using various eye-model complexities. The section begins with an overview of the comparative studies. The eye-model complexities considered in this study are then discussed, followed by a description of the procedure used to simulate the comparative studies using each eye model. Finally, the methods used to compare the simulated and experimental outcomes are given.

2.1 Comparative Studies

Table 1 describes the two comparative studies investigated in this study. Both studies used remote eye trackers placed on a table and recorded the seated participants' gaze as they focused on targets appearing on a screen. Reference [28] used a remote eye tracker with a single camera and two light sources to evaluate a model-based gaze estimation algorithm during head movements using four participants. The algorithm uses the pupil and glint features observed by the camera and the known positions of the hardware components to calculate the orientation of a virtual eye model with a single spherical corneal surface. The orientation of the visual axis of the virtual eye model is then taken as the direction of the user's gaze. The algorithm required four parameters of the virtual eye model, namely, the radius of curvature of the cornea (Rac), the distance between the corneal centre of rotation and the anatomical pupil centre (K), the offset of the fovea from the optic axis (α, comprising horizontal and vertical angular components), and the index of refraction of the cornea (ηac), to be calibrated for each user.
Table 1.
Study | Participants | Eye tracker configuration | Gaze estimation algorithm
[28]  | 4            | Remote                    | Model
[29]  | 26           | Remote                    | Interpolation
Table 1. Comparative Studies
Each participant tested by Reference [28] sequentially focused their gaze on nine points on the screen. This fixation procedure was repeated with the participant's head in five positions: a central head position and four peripheral head positions that were 30 cm to each side and 40 cm forward and backward from the central head position. The parameters of the gaze estimation algorithm were calibrated at the central head position by calculating the parameter values that minimise the gaze estimation error over the nine fixation targets using a nonlinear search algorithm. The average angular error between the fixation targets and the estimated gaze positions was reported for the central (H1) and peripheral head positions (H2).
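The calibration step described above is, in essence, a nonlinear search for the eye-model parameter values that minimise the mean angular error over the fixation targets. The sketch below illustrates the idea with a golden-section search over a single parameter; the forward model, parameter range, and gain constant are hypothetical stand-ins for the full ray-traced virtual eye model.

```python
import numpy as np

# Hypothetical stand-in for the forward model: the mean angular gaze error
# over the fixation targets as a function of one calibrated parameter
# (the corneal radius R_ac). A real implementation would ray-trace the
# virtual eye model; here the simulated "user" has R_ac = 7.8 mm and the
# error grows with parameter mismatch and target eccentricity.
def mean_angular_error(r_ac, targets_deg, true_r=7.8, gain=0.05):
    return float(np.mean(np.abs(targets_deg) * gain * abs(r_ac - true_r)))

targets = np.linspace(-15, 15, 9)   # nine on-screen fixation targets (deg)

# Golden-section search over a plausible range of corneal radii
lo, hi = 7.0, 9.0
phi = (np.sqrt(5) - 1) / 2
for _ in range(60):
    a = hi - phi * (hi - lo)
    b = lo + phi * (hi - lo)
    if mean_angular_error(a, targets) < mean_angular_error(b, targets):
        hi = b
    else:
        lo = a
r_calibrated = (lo + hi) / 2
print(round(r_calibrated, 3))       # converges to the true radius, 7.8
```

The real procedure searches over several parameters simultaneously, but the structure is the same: a scalar error objective evaluated through the forward simulation, minimised by a derivative-free search.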
The second comparative study was performed by Reference [29]. This study used a remote eye-tracker to compare the performance of 12 regression function combinations, given in Table 2, used in an interpolation-based gaze estimation algorithm combined with six calibration configurations. Participants were tasked with sequentially fixating their gaze on 135 targets on a screen while the eye tracker recorded the positions of their pupils and the glints formed by the reflection of a light source from the cornea. The normalised pupil-glint vector for each fixation target was calculated as the distance between the pupil and glint of each eye divided by the distance between the pupils of both eyes.
Table 2.
Label | x | y
R1  | x + 1 | y + 1
R2  | x + y + xy + 1 | x + y + xy + 1
R3  | x + x² + y + y² + xy + 1 | x + x² + y + y² + xy + 1
R4  | x + x² + y + y² + xy + x²y² + 1 | x + x² + y + y² + xy + x²y² + 1
R5  | x + y + xy + 1 | x + y + y² + 1
R6  | x + x² + y + 1 | x² + y + xy + x²y + 1
R7  | x + x³ + y² + xy + 1 | x + x² + y + y² + xy + x²y + 1
R8  | x + x² + x³ + y + xy + x²y + x³y + 1 | x + x² + y + y² + xy + x²y + 1
R9  | x + x² + x³ + y + xy + x²y + x³y + 1 | x + x² + y + xy + x²y + 1
R10 | x + x² + x³ + y + xy + x²y + x³y + 1 | x + x² + x³ + x⁴ + x⁵ + y + xy + x²y + x³y + x⁴y + x⁵y + 1
R11 | x + x² + x³ + y + y² + y³ + xy + x²y + x³y + 1 | x + x² + y + xy + x²y + 1
R12 | x + x² + x³ + y + y² + y³ + xy + x²y + x³y + 1 | x + x² + x³ + x⁴ + x⁵ + y + xy + x²y + x³y + x⁴y + x⁵y + 1
Table 2. Regression Functions Used by Reference [29]
The coefficients of the regression functions are omitted. The regression functions for the x and y components of the estimated gaze position include the coefficients ak and bk, respectively, where k is the number of terms in the regression function.
The coefficients of the 12 regression functions (ak and bk) were calibrated for each eye using the pupil-glint vectors recorded during the fixations on the targets contained in each calibration configuration, consisting of 5, 9, 14, 18, 23, and 135 targets (C5, C9, C14, C18, C23, and C135). The average angular error between the 135 fixation targets and the estimated gaze points, taken as the average of the estimated gaze positions of both eyes, was then calculated for each of the 72 combinations of regression functions and calibration configurations. Using the recorded errors, the best-performing regression function combination for each calibration configuration was identified.
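Calibrating the coefficients of such a regression function reduces to an ordinary least-squares fit. The sketch below fits an R2-style mapping (gx = b0 + b1·x + b2·y + b3·xy) from normalised pupil-glint features to screen positions; the features and screen positions are synthetic values invented for illustration, not data from the comparative study.

```python
import numpy as np

# Design matrix for the R2-style regression: 1, x, y, xy
def design_matrix(x, y):
    return np.column_stack([np.ones_like(x), x, y, x * y])

rng = np.random.default_rng(0)
x = rng.uniform(-1, 1, 25)                # synthetic pupil-glint features
y = rng.uniform(-1, 1, 25)
gx_true = 100 + 300 * x + 20 * x * y      # synthetic screen positions (mm)
gy_true = 80 + 250 * y

A = design_matrix(x, y)
ax, *_ = np.linalg.lstsq(A, gx_true, rcond=None)   # coefficients a_k
ay, *_ = np.linalg.lstsq(A, gy_true, rcond=None)   # coefficients b_k

# Because the synthetic mapping lies in the span of the design matrix,
# the calibrated function reproduces the targets essentially exactly.
gx_est = A @ ax
print(float(np.max(np.abs(gx_est - gx_true))) < 1e-6)   # True
```

The higher-order combinations in Table 2 differ only in the columns of the design matrix; the fitting procedure is identical.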

2.2 Eye-model Complexities

The five eye models (E1 to E5) described in Table 3 were identified from the literature and included in the simulation environment. The eye models were chosen to represent incremental increases in eye-model complexity so the influence of individual components of the eye model on the predictive power of simulated eye-tracking data could be investigated. In the context of this study, the complexity of the corneal surface is of particular interest, as the shape of the cornea is thought to have a significant influence on eye-tracking outcomes [30]. Therefore, the differentiation between eye models was based on the parameters used to describe the cornea. A description of the various eye-model parameters and the surfaces used to construct the eye models is provided by References [18] and [23]. The parameters were defined relative to the optic axis.
Table 3.
Parameter | E1 | E2 | E3 | E4 | E5
Anterior corneal surface
Rac (mm) | 8.00 | 7.80 | 7.80 | 7.72 | (7.76, 7.74)
Qac | – | −0.25 | – | −0.26 | (−0.272, −0.361)
Tac (mm) | – | – | – | – | (0.22, −0.23)
θac (°) | – | – | – | – | (0, 1.24, 2.98)
ηac | 1.376 | 1.3375 | 1.3771 | 1.367 | ηac(ω)
Posterior corneal surface
Rpc (mm) | – | – | 6.50 | 6.50 | 6.32
Qpc | – | – | – | – | −0.038
Tpc (mm) | – | – | – | – | (0, 0.19, −0.21)
θpc (°) | – | – | – | – | (0, 1.24, 2.98)
ηpc | – | – | 1.3374 | 1.3374 | ηpc(ω)
CCT (mm) | – | – | 0.55 | 0.55 | 0.55
Pupil surface
Dp (mm) | 4 | 4 | 4 | 4 | 4
Tp (mm) | – | – | – | – | (0, −0.371, 0.04)
θp (°) | – | – | – | – | (0, 0.91, −1.64)
ACD (mm) | 3.60 | 3.60 | 3.60 | 3.60 | 3.60
Anterior lens surface
Ral (mm) | – | – | 10.2 | 10.2 | 14.304
Qal | – | – | – | −3.1326 | −3.546
Tal (mm) | – | – | – | – | (0, −0.371, 0.04)
θal (°) | – | – | – | – | (0, 0.91, −1.64)
ηal | – | – | 1.336 | 1.336 | GRIN
Posterior lens surface
Rpl (mm) | – | – | −6.0 | −6.0 | −6.884
Qpl | – | – | – | −1.0 | −3.286
Tpl (mm) | – | – | – | – | (0, −0.27, 0.09)
θpl (°) | – | – | – | – | (0, 0.91, −1.64)
ηpl | – | – | 1.336 | 1.336 | GRIN
T (mm) | – | – | 4 | 4 | 3.5
Retinal surface
Rr (mm) | 12 | 12 | 12 | 12 | 14
Qr | – | – | – | – | 0.247
AL (mm) | 24 | 24 | 24.2 | 24 | 24.06
Rotation
CORy (mm) | (−8, 0, 0) | (−13.5, 0, 0) | (−13.5, 0, 0) | (−13.5, 0, 0) | (−12.0, 0, 0.33)
CORz (mm) | (−8, 0, 0) | (−13.5, 0, 0) | (−13.5, 0, 0) | (−13.5, 0, 0) | (−14.7, 0.79, 0)
α (°) | (5.45, 2.5) | (5.45, 2.5) | (5.45, 2.5) | (5.45, 2.5) | (5.45, 2.5)
Table 3. The Parameters of the Five Eye Models Included in the Simulation Environment
ac – anterior corneal surface; pc – posterior corneal surface; p – pupil surface; al – anterior lens surface; pl – posterior lens surface; r – retinal surface; R – radius of curvature; Q – aspheric constant; T – surface apex translation relative to the optic axis; θ – surface tilt; η – index of refraction; CCT – central corneal thickness; D – diameter; ACD – anterior chamber depth; T – lens thickness; AL – axial length; CORy – centre of rotation around the y axis; CORz – centre of rotation around the z axis; α – foveal offset; ω – incident light wavelength; GRIN – gradient-index.
The first eye model (E1) included in this investigation is the reduced eye model used by Reference [31], as illustrated in Figure 1(a). Reduced eye models model the cornea as a single spherical surface described by a radius of curvature (Rac) and do not include any lens surfaces. This model is the lowest complexity eye model included in this investigation and was chosen as it is used in popular eye-tracking simulation resources such as SynthesEyes [32] and UnityEyes [22]. The centres of rotation (CORy and CORz) were placed at the centre of curvature of the cornea, as is a common assumption in eye-tracking algorithms [28, 33].
Fig. 1.
Fig. 1. The five eye-model complexities included in the simulation environment. The position of the fovea on the retinal surface is represented by a square, and the rotation centres around the y and z axes are illustrated as asterisks and circles. The optic axis is shown as the dashed line. (a) The reduced eye model (E1) includes a spherical corneal surface; (b) The corneal surface of the aspheric reduced eye model (E2) is flatter at the periphery than at the centre; (c) The four-surface eye model (E3) includes two spherical corneal and lens surfaces; (d) An aspheric constant is added to the anterior corneal surface of the aspheric four-surface eye model (E4); (e) The decentrations and tilts of the ocular surfaces of the finite eye model (E5) are visible together with the flatter lens. The finite eye model is also the only model with separate centres of rotation for y and z axes rotations that are not coincident with the optic axis.
Aspheric reduced eye models are typically created by adding an anatomically observed aspheric constant (Qac) to the corneal surface of a reduced eye model to make its wide-angle performance more realistic [8]. The aspheric reduced eye model included in this investigation (E2) is based on the Emsley reduced eye model proposed by Reference [34] with the addition of an aspheric constant (Qac) of -0.25 to the corneal surface [35]. This eye model is used in the RIT-Eyes eye-tracking simulation resource [8]. Both centres of rotation were placed on the optic axis at a more realistic 13.5 mm behind the corneal apex [36]. The optical components of the aspheric reduced eye model (E2) are illustrated in Figure 1(b). The figure demonstrates the flatter shape of the cornea compared to the reduced eye model.
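The flattening produced by a negative aspheric constant can be seen directly from the sag of a conicoid surface, z(r) = r² / (R + √(R² − (1 + Q)r²)), where R is the apical radius of curvature and Q the aspheric constant. The sketch below compares the spherical cornea of E1 against the aspheric cornea of E2 at a peripheral radial distance; the 5 mm evaluation point is an arbitrary illustrative choice.

```python
import math

# Sag (axial depth) of a conicoid surface at radial distance r from the apex:
#   z(r) = r^2 / (R + sqrt(R^2 - (1 + Q) * r^2))
def sag(r, R, Q=0.0):
    return r * r / (R + math.sqrt(R * R - (1 + Q) * r * r))

r = 5.0                              # 5 mm from the corneal apex
z_sphere = sag(r, 7.8)               # spherical cornea (Q = 0)
z_asph = sag(r, 7.8, Q=-0.25)        # aspheric cornea (Q = -0.25)
print(z_asph < z_sphere)             # less sag => flatter periphery: True
```

With Q = −0.25 the peripheral sag is reduced by roughly 0.06 mm at this eccentricity, which is the flattening visible in Figure 1(b).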
The four-surface eye model (E3) included in the investigation was based on the Le Grand full theoretical four-surface eye model [37]. The model includes spherical anterior (Rac) and posterior (Rpc) corneal surfaces and anterior (Ral) and posterior lens (Rpl) surfaces. The optical components of the four-surface eye model (E3) included in this investigation are illustrated in Figure 1(c). The inclusion of the posterior corneal surface and the lens surfaces can be observed in the figure.
The fourth eye model (E4) included in this investigation was based on the Navarro wide-angle eye model [38]. This aspheric four-surface eye model was selected as it was used in pivotal studies to motivate the undertaking of the work presented in this study [25, 26]. The model includes an aspheric anterior and spherical posterior corneal surface. The flatter aspheric anterior corneal surface compared to the four-surface eye model (E3) is illustrated in Figure 1(d).
The finite eye model (E5) included in the simulation environment was based on the model proposed by Reference [23]. This was the highest complexity eye model included in the study. This model was chosen over other finite eye models, such as the eye model proposed by Reference [5], as it has been shown to accurately replicate the wide-angle performance of the human eye for horizontal viewing angles up to 50° and vertical viewing angles up to 20°, which exceed the viewing angles of the eye during the experiments performed by the comparative studies.
The finite eye model (E5) includes an ellipsoidal anterior and aspheric posterior corneal surface, both of which include anatomically accurate ocular surface decentrations and tilts. The lens was modelled as a series of iso-indicial aspheric contours forming a gradient-index (GRIN) lens model [24]. The indexes of refraction of the optical surfaces (η) were calculated as a function of the incident light wavelength (ω) using Cauchy's equation [39]. The chromatic dispersion coefficients for the GRIN lens and corneal surfaces used in the gkaModelEye framework [5] were used. Based on the findings of References [40] and [41], separate centres of rotation were defined for rotations around the y and z axes (CORy and CORz). The optical components of the finite eye model are illustrated in Figure 1(e), in which the ocular surface decentrations and tilts, as well as the separate centres of rotation, can be observed.

2.3 Simulation Procedure

The study utilised the MATLAB-based gkaModelEye framework [5] to create a simulation environment with which the experiments performed in the comparative studies were replicated. This framework includes computational models of the eye-tracking components: the eye, camera, light sources, and screen. By employing ray-tracing operations, the framework simulates the image features observed by the eye tracker's camera, including the glint and apparent pupil. This framework was chosen over alternatives such as et_simul [2] and UnityEyes [22] for its ability to accommodate various eye-model complexities and to configure the simulation environment to replicate the comparative studies' experiments. The 10 parameters listed in Table 4 describe the simulation environment's configuration. The table also includes the values of the parameters used to replicate the experiments of the comparative studies.
Table 4.
Parameter | Description | [28] | [29]
Ct (mm) | Position of the camera's nodal point | (315, IPD*/2, −150.5) | (520, IPD*/2, −300)
Cr (°) | Rotation of the camera around its nodal point | (0, 13.8, 0) | (0, 30, 0)
Cs (mm) | Dimensions of the camera's image sensor | (4.8, 3.6) | (4.48, 3.36)
Cfl (mm) | Focal length of the camera | 35 | 10
Lt,1 (mm) | Position of the first point light source | (615, IPD*/2 + 208.5, 0) | (560, IPD*/2, −250)
Lt,2 (mm) | Position of the second point light source | (615, IPD*/2 − 208.5, 0) | –
Lw (nm) | Illumination wavelength of the light sources | 850 | 850
St (mm) | Position of the centre of the screen | (650, IPD*/2, 0) | (800, IPD*/2, −131.5)
Ss (mm) | Dimensions of the screen | (377, 301) | (495, 280)
Sf (targets) | Configuration of the fixation targets | (3, 3) | (15, 9)
Table 4. The Parameters of the Simulation Environment and the Configuration of the Parameters Used to Replicate Each Comparative Study
*IPD – Interpupillary distance
Figure 2 demonstrates the graphical output of the simulation environment that was configured to replicate the experimental setup described by Reference [29]. The black dots on the screen illustrate the fixation targets of the comparative study. The simulation environment's origin was positioned at the apex of the anterior corneal surface, where the optic axis intersects the surface, with the eye in an unrotated orientation. The x axis pointed towards the screen along the optic axis, the y axis pointed nasally for a right eye and temporally for a left eye, and the z axis pointed superiorly.
Fig. 2.
Fig. 2. The simulation environment was configured to replicate the experiment performed by Reference [29].
The eye model was sequentially fixated on a series of targets to simulate the fixation procedures used in the comparative studies. Ray-tracing operations were then used to simulate the position of the glint and apparent pupil centres on the camera's image sensor. Each fixation involved rotating the eye model according to Listing's law around its centres of rotation (CORy and CORz) to align the line of sight with the fixation target [5]. This operation is challenging due to the nonlinear change in the entrance pupil's centre position as the eye rotates [42]. The gkaModelEye framework [5] utilises a nonlinear search algorithm to determine the eye's orientation, resulting in a line of sight intersecting the fixation target, entrance pupil's centre, and fovea of the eye model.
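The fixation operation can be sketched in a simplified two-dimensional form: rotate the eye about its centre of rotation until the ray through the entrance-pupil centre passes through the target. The geometry below (centre of rotation, pupil position, target position) is loosely based on the layout in Table 4 but is hypothetical, and a bisection on the azimuth angle stands in for the framework's full three-dimensional nonlinear search.

```python
import math

COR = (-13.5, 0.0)       # centre of rotation (x, z) in mm, behind the apex
PUPIL = (-3.6, 0.0)      # unrotated entrance-pupil centre
TARGET = (650.0, 120.0)  # fixation target on the screen

def sight_ray_miss(theta):
    # Rotate the pupil centre about COR by theta, then measure how far the
    # ray COR -> pupil passes from the target (signed, via cross product).
    c, s = math.cos(theta), math.sin(theta)
    px = COR[0] + c * (PUPIL[0] - COR[0]) - s * (PUPIL[1] - COR[1])
    py = COR[1] + s * (PUPIL[0] - COR[0]) + c * (PUPIL[1] - COR[1])
    dx, dy = px - COR[0], py - COR[1]
    tx, ty = TARGET[0] - COR[0], TARGET[1] - COR[1]
    return dx * ty - dy * tx

# Bisection on the signed miss distance over a half-turn of azimuth
lo, hi = -math.pi / 2, math.pi / 2
for _ in range(60):
    mid = (lo + hi) / 2
    if sight_ray_miss(lo) * sight_ray_miss(mid) <= 0:
        hi = mid
    else:
        lo = mid
theta_fix = math.degrees((lo + hi) / 2)
print(round(theta_fix, 2))   # azimuth that points the sight ray at the target
```

The real problem is harder because the entrance-pupil centre itself shifts nonlinearly as the eye rotates, which is why the framework resorts to a nonlinear search rather than a closed-form rotation.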
During each fixation, the position of the apparent pupil boundary was simulated by projecting 30 points evenly distributed along the anatomical pupil boundary through the refractive cornea. These rays then intersected the camera's nodal point and its image sensor. The centre of the apparent pupil was then calculated by fitting an ellipse to the projected boundary points on the image sensor using a least squares solver. The glint centres were simulated by tracing rays from each point light source that reflected off the anterior corneal surface and intersected the camera's nodal point and image sensor. All ray traces in this study were confirmed to intersect the camera's nodal point within a tolerance of 10⁻⁴ mm.
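The ellipse-fitting step can be sketched as a linear least-squares problem on the general conic x² + Bxy + Cy² + Dx + Ey + F = 0 (with the x² coefficient fixed to 1), from which the centre follows in closed form. The boundary points below are synthetic, generated from a known ellipse so the recovered centre can be checked.

```python
import numpy as np

# 30 synthetic boundary points on an axis-aligned ellipse with a known centre
theta = np.linspace(0, 2 * np.pi, 30, endpoint=False)
cx, cy, a, b = 2.0, -1.0, 3.0, 1.5
x = cx + a * np.cos(theta)
y = cy + b * np.sin(theta)

# Least-squares fit of the conic coefficients B, C, D, E, F
M = np.column_stack([x * y, y * y, x, y, np.ones_like(x)])
B, C, D, E, F = np.linalg.lstsq(M, -x * x, rcond=None)[0]

# Centre of the conic Ax^2 + Bxy + Cy^2 + Dx + Ey + F with A = 1
den = 4 * C - B * B
centre = ((B * E - 2 * C * D) / den, (B * D - 2 * E) / den)
print(np.round(centre, 6))   # recovers the known centre (2, -1)
```

With noise-free points the fit is exact; in the simulation the same solver absorbs the small distortions the refractive cornea introduces into the projected boundary.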
The simulation environment only included one eye model at a time. The fixation procedure was repeated twice with the same eye model in different positions to generate binocular eye-tracking data for a single user. The screen, camera, and light source components were translated by half of the interpupillary distance (IPD) along the positive y axis for the right eye and along the negative y axis for the left eye. A fixed interpupillary distance of 63 mm, based on the average human measurement reported by Reference [43], was used in all simulations.

2.4 Analysis

The first step of the analysis was inspecting the image features simulated with each eye model to determine how the features were influenced by the parameters included in the eye models. The simulated outcomes were then evaluated against the main findings reported by the comparative studies. The main findings derived from the experimental data reported by References [28] and [29] are given in Table 5. Note that major findings pertaining to the interpersonal variance in eye-tracking outcomes reported by the comparative studies are not included in the table, as these findings could not be evaluated against data generated by the normative eye models used in this study.
Table 5.
[28]
1. The average gaze estimation errors at the central and peripheral head positions.
2. The average gaze estimation error increased by 0.13° from the central to the peripheral head positions.
[29]
1. The average errors achieved using each combination of regression functions and calibration configuration.
2. Regression function combination R2 generated the smallest errors for C5, R5 for C9, and R8 for C14, C18, C23, and C135.
3. No regression function combination produced acceptable errors using C5.
4. No significant improvement was achieved by R8 by increasing the number of calibration targets beyond C14.
Table 5. The Main Findings of the Comparative Studies against which the Simulated Outcomes Were Evaluated

3 Results

In this section, the eye-tracking data simulated using various eye-model complexities are evaluated based on their ability to replicate the findings of the comparative studies derived using real-world data. The simulated image features generated using each eye model are investigated first, followed by the simulated outcomes of each comparative study.

3.1 Simulated Image Features

The image features simulated using each of the five eye models included in the simulation environment (E1 to E5) are illustrated in Figure 3. This figure illustrates the influence that the complexity of the eye model has on the simulated image features. Each colour in Figure 3 represents the features simulated using a different eye-model complexity that was sequentially fixated on the 135 fixation targets in the experiment performed by Reference [29]. Figures 3(a) and 3(b) illustrate the simulated apparent pupil centre and glint features.
Fig. 3.
Fig. 3. A comparison of the (a) pupil centre and (b) glint image features simulated using various eye-model complexities.
The reduced eye model (E1) generated feature distributions that were much smaller than those of the other eye models in both axes, with no movement of the glint centre during changes in the orientation of the eye model. The finite eye model (E5) generated pupil centre and glint feature positions with a flatter aspect ratio than the other eye models. These differences are thought to be mainly attributable to the centres of rotation used by the eye models. The centre of rotation of the reduced eye model (E1) coincided with the centre of curvature of the cornea, leading to smaller movements of the anatomical pupil centre and no movement of the glint feature during eye rotations. The asymmetrical rotation of the finite eye model (E5) results in feature distributions with a different aspect ratio than the other eye models that used a single centre of rotation.
The curvature over the width of the image feature distribution increased with the addition of an aspheric constant to the corneal surfaces. This curvature is evident in the top left corner of Figure 3(b), with the glint centre distributions generated by the aspheric reduced (E2) and aspheric four-surface (E4) eye models curving away from the distribution generated by the four-surface eye model (E3). The image feature distributions generated by the finite eye model (E5) were slightly more curved than those generated by the other eye models. There were small variations in the glint feature distributions generated by the four-surface (E3) and aspheric four-surface (E4) eye models. However, the pupil feature distributions were almost the same. This finding demonstrates the small influence that the posterior corneal surface has on the simulated image features.

3.2 Guestrin and Eizenman, 2006

The distribution of gaze estimation errors reported by Reference [28] at H1 and H2 and the corresponding errors simulated in this study are illustrated in Figure 4. The finite eye model's (E5) results are not shown, as simulated errors exceeding 11° were achieved for both head positions.
Fig. 4.
Fig. 4. A comparison of the simulated outcomes achieved by various eye-model complexities in this study and the real-world errors reported by Reference [28].
A two-tailed one-sample t-test was performed to determine whether the simulated outcomes were significantly different from the experimental outcomes. The reduced (E1 (H1): M = 0.16; E1 (H2): M = 0.16) and four-surface (E3 (H1): M = 0.16; E3 (H2): M = 0.17) eye models predicted outcomes that are significantly different from the findings reported by Guestrin and Eizenman (2006) (H1: M = 0.38, SD = 0.08; H2: M = 0.52, SD = 0.14) at both head positions (t(3) > 4.87, p < .05). Conversely, the aspheric reduced (E2 (H1): M = 0.41; E2 (H2): M = 0.42) and aspheric four-surface (E4 (H1): M = 0.43; E4 (H2): M = 0.44) eye models predicted errors that are similar to the experimental outcomes at both head positions (t(3) < 1.39, p = [0.26, 0.61] at the two head positions). No eye model was able to simulate an increase in errors larger than 0.02° between H1 and H2.
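The reported t statistics follow directly from the summary values. As a quick consistency check, the one-sample t statistic for the reduced eye model (simulated error 0.16°) against the experimental errors at H1 (M = 0.38°, SD = 0.08°, n = 4 participants) is:

```python
import math

# One-sample t statistic from summary statistics:
#   t = (sample mean - hypothesised value) / (SD / sqrt(n))
def one_sample_t(mean, sd, n, mu0):
    return (mean - mu0) / (sd / math.sqrt(n))

t = one_sample_t(mean=0.38, sd=0.08, n=4, mu0=0.16)
print(round(t, 2))   # 5.5, consistent with the reported t(3) > 4.87
```

With 3 degrees of freedom and a two-tailed threshold, a t value of 5.5 indeed falls below the .05 significance level, consistent with the result reported above.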

3.3 Blignaut, 2014

The simulated and experimental errors produced by the best-performing regression function combinations using each calibration configuration, as reported by Reference [29], are illustrated in Figure 5. No confidence interval is shown for the experimental data, as standard deviations were not provided by Reference [29]. The simulated best-function errors deviated substantially from the experimental errors for every calibration configuration and eye model. The largest deviation exceeded 1° for C5, and the smallest was 0.40° for C135. The simulated best-function errors amounted to only 10% to 20% of the experimentally observed errors.
Fig. 5.
Fig. 5. A comparison of the simulated outcomes achieved by various eye-model complexities in this study and real-world errors reported by Reference [29].
The simulated data correctly predicted that no significant improvement was achieved by increasing the number of calibration targets past C14, with all eye models predicting a difference in error of under 0.01° between C14 and C135. The simulated data also correctly predicted a significant decrease in errors achieved by increasing the calibration targets from C5 to C9. All eye models predicted the correct regression function combinations for C9, C14, C18, C23, and C135. No eye model correctly predicted R2 as the best function for C5.
Eye-model complexity significantly influenced the simulated errors for C5 and C9. Surprisingly, the eye models with spherical corneal surfaces (E1 and E3) generated simulated errors over 70% larger than those of the other eye models. There was a negligible difference between the errors generated by the eye models for calibration configurations exceeding C9. There was also a negligible difference in the errors achieved between the reduced (E1) and four-surface (E3) eye models and between the aspheric reduced (E2) and aspheric four-surface (E4) eye models for all calibration configurations. There was no significant difference between the errors achieved by the finite eye model (E5) and the other eye models that included conic corneal surfaces (E2 and E4).

4 Discussion

The findings of this study are discussed in this section. The discussion is divided into three parts, starting with the implications of the results. The pertinent limitations of the investigation are then described. Finally, an overview of future work is given.

4.1 Implications of Findings

The three-dimensional eye model is a fundamental component of a simulated eye-tracking environment. This study aimed to determine whether eye-model complexity has an influence on the ability of simulated data to predict real-world eye-tracking outcomes. Five eye models with varying complexity were used to simulate the experiments performed in two comparative eye-tracking studies by References [28] and [29].
Eye-model complexity had a contrasting influence on the simulated outcomes of the two comparative studies. However, some important findings were common to both. Including the posterior corneal surface and the lens surfaces had a small influence on the simulated image features but did not result in significant differences in gaze estimation errors. These results indicate that both gaze estimation algorithms investigated in this study are able to calibrate for the influence of these parameters on the simulated image feature distributions.
The finite eye model (E5) generated image feature distributions that differed significantly from those of the other eye models. However, the model did not improve the predictive power of the simulated data for either comparative study. The model-based gaze estimation algorithm generated exceedingly large errors, likely caused by the introduction of ocular surface decentrations and tilts that the algorithm could not calibrate for. Conversely, the increased complexity of the finite eye model had no significant influence on the outcomes of the interpolation-based gaze estimation algorithms, indicating that these algorithms calibrated well to the image feature distribution. Given the large number of parameters of the finite model, the outcomes may have differed if another finite model with slightly different parameters, such as the model developed by Polans et al. (2015), had been used. Including too much complexity in a simulation environment may limit the simulated data's ability to predict average outcomes in a sample population.
The simulated outcomes of the comparative study by Reference [28] showed that eye models that include a conic anterior corneal surface generated simulated outcomes that closely predicted the magnitude of the average real-world errors achieved by a model-based gaze estimation algorithm in a small sample population. Given the significant influence of anterior corneal asphericity on these outcomes, it is recommended that aspheric reduced eye models be used to generate simulated eye-tracking data when evaluating the performance of model-based gaze estimation algorithms. Furthermore, the findings demonstrate that eye models with spheric anterior corneal surfaces significantly underpredict the average eye-tracking errors achieved by a model-based gaze estimation algorithm.
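The conic anterior corneal surface that distinguishes the aspheric models from the spheric ones can be expressed with the standard conic sagitta equation. The sketch below uses typical population-average values from the optics literature (apical radius R = 7.8 mm, asphericity Q = -0.26); these are illustrative assumptions, not the exact parameters of eye models E1 to E5.

```python
import numpy as np

def corneal_sag(r, R=7.8, Q=-0.26):
    """Sagitta z(r) of a conic corneal surface at radial distance r (mm).

    z = r^2 / (R * (1 + sqrt(1 - (1 + Q) * r^2 / R^2)))

    R is the apical radius of curvature and Q the asphericity; Q = 0
    reduces the surface to a sphere. Default values are typical
    literature averages, assumed here for illustration.
    """
    return r**2 / (R * (1 + np.sqrt(1 - (1 + Q) * r**2 / R**2)))

# Spheric (Q = 0) and aspheric corneas agree near the apex but diverge
# toward the periphery, where glints form at larger gaze angles. The
# prolate (Q < 0) cornea is flatter, i.e., has a smaller sag.
r = 4.0  # mm from the corneal apex
flattening = corneal_sag(r, Q=0.0) - corneal_sag(r, Q=-0.26)
```

Although the peripheral height difference is only a few hundredths of a millimetre, it shifts the simulated glint positions enough to change the errors produced by a model-based algorithm that assumes a spherical cornea.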
Despite the evidence provided by Reference [26], including a posterior corneal surface did not reproduce the reported head-movement-related eye-tracking errors. The head-movement-related errors reported by Reference [26] are likely attributable to a combination of their simulation procedure, in which changes in the eye's orientation were simulated by rotating the pupil relative to the eye rather than rotating the entire eye model, and the stereoscopic gaze estimation algorithms used. The findings of this study further support the explanation of head-movement-related errors provided by Reference [28], who argued that image-related sources of error, such as changes in illumination conditions and deformations in the appearance of the glints during head movements, cause the head-movement-related errors in model-based algorithms, rather than the optics of the eye. It is unclear whether this finding applies to other gaze estimation algorithms.
An increase in eye-model complexity from a reduced eye model did not increase the predictive power of the simulated interpolation-based gaze estimation outcomes relative to the average real-world outcomes reported by Reference [29]. These findings show that increasing the complexity of the eye model beyond a reduced eye model does not improve the ability of simulated data to predict the real-world outcomes of an interpolation-based gaze estimation algorithm. The difference in the findings for the model and interpolation-based gaze estimation algorithms indicates that different gaze estimation algorithms may have different ideal eye-model complexities.
The simulated outcomes of the comparative study by Reference [29] demonstrate that simulated data cannot predict the magnitude of average interpolation-based gaze estimation errors, regardless of the complexity of the eye model used. For example, the simulated data predicted that high-accuracy gaze estimation with errors under 0.3° is achievable using only five calibration targets. However, the real-world outcomes reported by Reference [29] show that the best function error achieved for C5 exceeded 1.3°, which is insufficient for most applications [44]. Despite significantly underpredicting the magnitude of errors, the simulated data predicted the relative influence that the addition of calibration targets has on gaze estimation errors. The simulated data also predicted the best regression function combinations for each calibration configuration except C5.
Including image-related sources of eye-tracking errors could potentially improve the ability of simulations to predict real-world outcomes. However, researchers should resist the temptation to introduce artificial noise to the simulated image feature positions to bring the magnitude of simulated outcomes closer to real-world expectations, for several reasons. First, eye-tracking devices have high sampling rates, which means that when evaluating a device using a fixation procedure, as both comparative studies did, gaze estimation errors are averaged over multiple samples, filtering out random sources of eye-tracking errors. The image-related sources of error that influence the performance of a device are therefore likely systematic in nature. Second, it is unclear how to predict the magnitude of the noise present in a particular hardware configuration, as it could be influenced by many factors, such as the feature extraction algorithm used and the ambient lighting conditions on the day the experiment is conducted. Synthetic image simulations such as UnityEyes [22] and RIT-Eyes [8] consist of large sets of realistic synthetic images of the entire eye region. These resources offer a more realistic alternative for simulating image-related sources of errors, as they simulate systematic image-related errors such as shadows, occlusions, and eye pigmentation.
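The argument above, that averaging fixation samples removes random but not systematic error, can be illustrated with a toy simulation. All magnitudes below are arbitrary and chosen purely for illustration.

```python
import numpy as np

# Toy illustration: averaging fixation samples suppresses random
# per-sample jitter but leaves systematic error intact.
rng = np.random.default_rng(42)
true_gaze = 0.0          # target position, degrees
systematic_offset = 0.5  # hypothetical calibration bias, degrees
noise_sd = 0.3           # hypothetical per-sample random jitter, degrees

# 500 samples of one fixation, each biased and jittered.
samples = true_gaze + systematic_offset + rng.normal(0.0, noise_sd, 500)
fixation_estimate = samples.mean()

# The random component shrinks roughly as noise_sd / sqrt(n), while
# the 0.5 degree systematic offset survives averaging untouched.
residual_error = abs(fixation_estimate - true_gaze)
```

With 500 samples, the residual error is close to the 0.5° systematic offset, which is why artificially injected random noise cannot explain the systematic error magnitudes observed in fixation-based evaluations.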

4.2 Limitations

The small number of comparative studies included in this study limits the strength and generalisability of the findings. Other factors may influence simulated and real-world eye-tracking outcomes strongly enough to invalidate the relationships identified in this study. For example, simulated data generated for an interpolation-based gaze estimation algorithm in a head-mounted eye-tracking configuration may be influenced more strongly by the peripheral optics of the eye model, as the distance between the camera and the user is smaller.
The small number of participants evaluated by Reference [28] resulted in non-statistically significant outcomes (p > .05) when comparing the simulated and real-world data generated by the study. However, Reference [28] was chosen as a comparative study in this work because this seminal work was the only study found that both implemented a model-based gaze estimation algorithm and described its experimental setup in enough detail to be replicated in a simulated environment with minimal assumptions.
The parameter values were not consistent across the eye models used in this investigation. For example, all eye models include an anterior corneal radius of curvature, but its value differs between some of the models. These differences could have caused some of the variation in errors generated by the eye models that was not related to differences in eye-model complexity. However, the differences between the parameter values were very small and unlikely to have a significant influence.
The gkaModelEye framework [5] was the only simulation framework available that could be used for the investigation presented in this study. The addition of realistic image-related sources of eye-tracking errors could have provided a more holistic investigation of eye-tracking simulations. However, a simulation resource that can simulate image error sources and facilitate the inclusion of various eye-model complexities does not currently exist.
The normative eye models used in this investigation to generate simulated data can only be used to make predictions about the performance of eye-tracking methods on the average person. However, many of the findings of the comparative studies were related to the variance in the performance of an eye-tracking method throughout the sample population. A recent study [45] addressed this limitation by investigating the ability of simulated eye-tracking data to predict the variance in experimental eye-tracking outcomes by simulating realistic variations in eye-model parameters.
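The approach taken by Reference [45] can be sketched as a Monte Carlo draw over eye-model parameters. The means and standard deviations below are illustrative placeholders, not the distributions used in that study.

```python
import numpy as np

# Monte Carlo sketch: draw virtual eyes with normally distributed
# biometry, then run each through the simulation pipeline. All
# distribution parameters are illustrative assumptions.
rng = np.random.default_rng(7)

def sample_eye_parameters(n):
    """Draw n sets of eye-model parameters for population-level simulation."""
    return {
        "corneal_radius_mm": rng.normal(7.8, 0.25, n),
        "corneal_asphericity_Q": rng.normal(-0.26, 0.18, n),
        "axial_length_mm": rng.normal(23.6, 0.7, n),
    }

eyes = sample_eye_parameters(1000)
# Each sampled eye would then be traced through the eye-tracking
# simulation; the spread of the resulting gaze errors estimates the
# population variance in eye-tracking performance.
```

Such a sampled population replaces the single normative eye with a distribution, allowing simulated data to predict not only the mean but also the variance of experimental outcomes.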

4.3 Future Work

This study demonstrates a systematic approach to validating simulated outcomes against real-world eye-tracking outcomes. Further research is required to understand the extent to which simulations can predict real-world outcomes for other eye-tracking hardware and algorithmic configurations. The influence that eye-model complexity has on appearance-based gaze estimation outcomes also remains unclear; a study that trains an appearance-based gaze estimation algorithm on synthetic image data generated with an aspheric eye model, such as the RIT-Eyes dataset [8], could reveal whether asphericity also leads to improved gaze estimation accuracy in that setting.

5 Conclusion

The outcomes of this study demonstrate that the complexity of the eye model used in an eye-tracking simulation can significantly influence the predictive power of the simulated outcomes. The inclusion of an aspheric anterior corneal surface significantly improves the ability of simulated outcomes generated by both interpolation- and model-based gaze estimation algorithms to predict real-world outcomes. Despite the evidence provided by References [25] and [26] that the posterior corneal surface influences simulated pupil and glint features, including the posterior corneal surface did not influence the predictive power of the simulated data generated by either gaze estimation algorithm. A high-complexity eye model validated to accurately replicate the wide-angle performance of the eye did not produce better simulated outcomes than paraxial eye models.

Acknowledgments

The views and opinions expressed are those of the authors and do not necessarily represent the official views of the South African Medical Research Council.

References

[1] K. Holmqvist, M. Nyström, R. Andersson, R. Dewhurst, H. Jarodzka, and J. Van de Weijer. 2011. Eye Tracking: A Comprehensive Guide to Methods and Measures. OUP Oxford.
[2] M. Böhme, M. Dorr, M. Graw, T. Martinetz, and E. Barth. 2008. A software framework for simulating eye trackers. In Proceedings of the Symposium on Eye Tracking Research & Applications. 251–258.
[3] A. Villanueva and R. Cabeza. 2008. A novel gaze estimation system with one calibration point. IEEE Trans. Syst., Man, Cybern., Part B (Cybern.) 38, 4 (2008), 1123–1138.
[4] F. B. Narcizo, F. E. D. Dos Santos, and D. W. Hansen. 2021. High-accuracy gaze estimation for interpolation-based eye-tracking methods. Vision 5, 3 (2021), 41.
[5] G. K. Aguirre. 2019. A model of the entrance pupil of the human eye. Scient. Rep. 9, 1 (2019), 9360.
[6] E. Wood, T. Baltrušaitis, L. P. Morency, P. Robinson, and A. Bulling. 2016. Learning an appearance-based gaze estimator from one million synthesised images. In Proceedings of the 9th Biennial ACM Symposium on Eye Tracking Research & Applications. 131–138.
[7] J. Kim, M. Stengel, A. Majercik, S. De Mello, D. Dunn, S. Laine, and D. Luebke. 2019. NVGaze: An anatomically-informed dataset for low-latency, near-eye gaze estimation. In Proceedings of the CHI Conference on Human Factors in Computing Systems. 1–12.
[8] N. Nair, R. Kothari, A. K. Chaudhary, Z. Yang, G. J. Diaz, J. B. Pelz, and R. J. Bailey. 2020. RIT-Eyes: Rendering of near-eye images for eye-tracking applications. In Proceedings of the ACM Symposium on Applied Perception. 1–9.
[9] L. Sesma-Sanchez, A. Villanueva, and R. Cabeza. 2014. Design issues of remote eye tracking systems with large range of movement. In Proceedings of the Symposium on Eye Tracking Research and Applications. 243–246.
[10] D. Mardanbegi and D. W. Hansen. 2012. Parallax error in the monocular head-mounted eye trackers. In Proceedings of the ACM Conference on Ubiquitous Computing. 689–694.
[11] T. C. Kübler, T. Rittig, E. Kasneci, J. Ungewiss, and C. Krauss. 2016. Rendering refraction and reflection of eyeglasses for synthetic eye tracker images. In Proceedings of the 9th Biennial ACM Symposium on Eye Tracking Research & Applications.
[12] M. Mansouryar, J. Steil, Y. Sugano, and A. Bulling. 2016. 3D gaze estimation from 2D pupil positions on monocular head-mounted eye trackers. In Proceedings of the 9th Biennial ACM Symposium on Eye Tracking Research & Applications. 197–200.
[13] N. M. Arar, H. Gao, and J. P. Thiran. 2016. A regression-based user calibration framework for real-time gaze estimation. IEEE Trans. Circ. Syst. Vid. Technol. 27, 12 (2016), 2623–2638.
[14] Z. Zhang and Q. Cai. 2014. Improving cross-ratio-based eye tracking techniques by leveraging the binocular fixation constraint. In Proceedings of the Symposium on Eye Tracking Research and Applications. 267–270.
[15] R. Ranjan, S. De Mello, and J. Kautz. 2018. Light-weight head pose invariant gaze tracking. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition Workshops. 2156–2164.
[16] Y. Wang, T. Zhao, X. Ding, J. Peng, J. Bian, and X. Fu. 2018. Learning a gaze estimator with neighbor selection from large-scale synthetic eye images. Knowl.-based Syst. 139 (2018), 41–49.
[17] H. Kaur and R. Manduchi. 2020. EyeGAN: Gaze-preserving, mask-mediated eye image synthesis. In Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision. 310–319.
[18] D. A. Atchison and L. N. Thibos. 2016. Optical models of the human eye. Clinic. Experim. Opt. 99, 2 (2016), 99–106.
[19] J. B. Listing. 1851. Dioptrik des Auges. In R. Wagners Handwörterbuch der Physiologie 4 (1851), 451–504.
[20] J. J. Rozema, P. Rodriguez, R. Navarro, and M. J. Tassignon. 2016. SyntEyes: A higher-order statistical eye model for healthy eyes. Investig. Ophthalm. Visual Sci. 57, 2 (2016), 683–691.
[21] L. Świrski and N. Dodgson. 2014. Rendering synthetic ground truth images for eye tracker evaluation. In Proceedings of the Symposium on Eye Tracking Research and Applications. 219–222.
[22] E. Wood, T. Baltrušaitis, L. P. Morency, P. Robinson, and A. Bulling. 2016. Learning an appearance-based gaze estimator from one million synthesised images. In Proceedings of the 9th Biennial ACM Symposium on Eye Tracking Research & Applications. 131–138.
[23] M. N. Akram, R. C. Baraas, and K. Baskaran. 2018. Improved wide-field emmetropic human eye model based on ocular wavefront measurements and geometry-independent gradient index lens. J. Opt. Soc. Am. A 35, 11 (2018), 1954–1967.
[24] R. Navarro. 2014. Adaptive model of the aging emmetropic eye and its changes with accommodation. J. Vis. 14, 13 (2014), 21.
[25] C. Fedtke, F. Manns, and A. Ho. 2010. The entrance pupil of the human eye: A three-dimensional model as a function of viewing angle. Optics Express 18, 21 (2010), 22364–22376.
[26] A. D. Barsingerhorn, F. N. Boonstra, and H. H. L. M. Goossens. 2017. Optics of the human cornea influence the accuracy of stereo eye-tracking methods: A simulation study. Biomed. Optics Express 8, 2 (2017), 712–725.
[27] A. Kar and P. Corcoran. 2017. A review and analysis of eye-gaze estimation systems, algorithms and performance evaluation methods in consumer platforms. IEEE Access 5 (2017), 16495–16519.
[28] E. D. Guestrin and M. Eizenman. 2006. General theory of remote gaze estimation using the pupil center and corneal reflections. IEEE Trans. Biomed. Eng. 53, 6 (2006), 1124–1133.
[29] P. Blignaut. 2014. Mapping the pupil-glint vector to gaze coordinates in a simple video-based eye tracker. J. Eye Movem. Res. 7 (2014), 1.
[30] P. Blignaut. 2016. Idiosyncratic feature-based gaze mapping. J. Eye Movem. Res. 9, 3 (2016).
[31] K. Ruhland, S. Andrist, J. Badler, C. Peters, N. Badler, M. Gleicher, B. Mutlu, and R. Mcdonnell. 2014. Look me in the eyes: A survey of eye and gaze animation for virtual agents and artificial systems. In Eurographics 2014 - State of the Art Reports. 69–91.
[32] E. Wood, T. Baltrusaitis, X. Zhang, Y. Sugano, P. Robinson, and A. Bulling. 2015. Rendering of eyes for eye-shape registration and gaze estimation. In Proceedings of the IEEE International Conference on Computer Vision. 3756–3764.
[33] J. Chen, Y. Tong, W. Gray, and Q. Ji. 2008. A robust 3D eye gaze tracking system using noise reduction. In Proceedings of the Symposium on Eye Tracking Research & Applications. 189–196.
[34] H. H. Emsley. 1946. Visual Optics. Hatton Press.
[35] G. M. Durr, E. Auvinet, J. Ong, J. Meunier, and I. Brunette. 2015. Corneal shape, volume, and interocular symmetry: Parameters to optimise the design of biosynthetic corneal substitutes. Investig. Ophthalm. Vis. Sci. 56, 8 (2015), 4275–4282.
[36] M. Millodot. 2014. Dictionary of Optometry and Visual Science E-book. Elsevier Health Sciences.
[37] Y. Le Grand and S. G. El Hage. 1968. Physiological Optics (translation and update of Le Grand Y, La dioptrique de l'oeil et sa correction. Optique Physiologique, Vol. 1). Springer-Verlag, 57–69.
[38] R. Navarro, J. Santamaria, and J. Bescós. 1985. Accommodation-dependent model of the human eye with aspherics. J. Opt. Soc. Am. A 2, 8 (1985), 1273–1280.
[39] D. A. Atchison and G. Smith. 2005. Chromatic dispersions of the ocular media of human eyes. J. Opt. Soc. Am. A 22, 1 (2005), 29–37.
[40] G. A. Fry. 1962. The center of rotation of the eye. Am. J. Opt. 39 (1962), 581–595.
[41] G. A. Fry and W. W. Hill. 1963. The mechanics of elevating the eye. Opt. Vis. Sci. 40, 12 (1963), 707–716.
[42] M. Nowakowski, M. Sheehan, D. Neal, and A. V. Goncharov. 2012. Investigation of the isoplanatic patch and wavefront aberration along the pupillary axis compared to the line of sight in the eye. Biomed. Opt. Express 3, 2 (2012), 240–258.
[43] N. A. Dodgson. 2004. Variation and extrema of human interpupillary distance. In Stereoscopic Displays and Virtual Reality Systems XI, Vol. 5291. SPIE, 36–46.
[44] K. Holmqvist, M. Nyström, and F. Mulvey. 2012. Eye tracker data quality: What it is and how to measure it. In Proceedings of the Symposium on Eye Tracking Research and Applications. 45–52.
[45] J. Fischer, J. van der Merwe, and D. Vandenheever. 2023. The influence of eye model parameter variations on simulated eye-tracking outcomes. J. Eye Movem. Res. 11, 4 (2023).

Published In

ACM Transactions on Applied Perception, Volume 22, Issue 1, January 2025. EISSN: 1544-3965. DOI: 10.1145/3703008.

Publisher: Association for Computing Machinery, New York, NY, United States.

Publication History: Received 06 July 2023; revised 20 March 2024; accepted 03 April 2024; online AM 28 June 2024; published 12 November 2024.

        Author Tags

        1. Eye-tracking
        2. simulation
        3. gaze estimation
        4. eye model

        Funding Sources

        • South African Medical Research Council
        • National Research Foundation of South Africa
