1 Introduction

1.1 Autonomous Vehicles

The sight of autonomous vehicles navigating through traffic has become increasingly common. These vehicles offer drivers a relief by assuming control of the vehicle, allowing them to focus on secondary tasks. The Society of Automotive Engineers (SAE) describes a continuum of six levels of driving automation, ranging from no driving automation (Level 0) to full driving automation (Level 5) [1]. As the autonomy level ascends, the driver’s obligation to manage the vehicle diminishes.

At level 0, there is no automation: the driver has complete control of the vehicle’s operations, managing the brake, steering, and gas pedal. Levels 1 and 2 introduce functionalities that control specific aspects of driving (such as adaptive cruise control, electronic stability control, and dynamic brake support). Throughout these levels, the driver is responsible for monitoring the road and maintaining control of the vehicle as needed. At levels 4 and 5, the autonomous vehicle frees the driver from vehicle control and from the need to actively monitor traffic. Level 3 vehicles, by contrast, allow the driver to engage in secondary tasks while the vehicle handles operations autonomously until an exceptional circumstance arises. At that point, the driver is expected to regain control after a Take Over Request (TOR), conveyed through visual, audio, or tactile cues preceding the exceptional event.

1.2 Secondary Tasks

In level-3 autonomous vehicles, drivers are permitted to engage in secondary tasks until the vehicle prompts a TOR. However, this allowance might lead drivers to lose awareness of the traffic situation or become disengaged from the driving task precisely when the TOR is issued. When drivers are out of the loop due to engaging in secondary tasks, combined with visual and cognitive distractions, their ability to perform takeovers in terms of quality and duration diminishes, as indicated by previous studies [2,3,4,5].

Previous research has explored various types of secondary tasks, such as text-messaging [6], reading text [7,8,9], watching a video [5], playing a game, 20-Questions Task [10, 11], n-back task [5, 12, 13], and Surrogate Reference Task [14,15,16,17]. In this study, the n-back task was chosen as the secondary task due to its ability to engage cognitive, visual, and manual aspects of the driver [11, 12]. Engaging in the n-back secondary task is expected to divert the driver’s attention away from critical elements of the traffic environment, potentially affecting their emotional state and rendering them less prepared to promptly take control of the vehicle when necessary.

1.3 Buffer Times

Research has explored the ideal buffer time that ensures optimal takeover performance, considering factors such as the duration and quality of takeovers, the driver’s attention, and their emotional states before, during, and after these takeover instances. Nevertheless, these studies have not reached a consensus on the safest buffer time for level-3 autonomous vehicles [18,19,20,21]. However, in general, it appears that 8 s and 4 s serve as optimal early and late buffer times, respectively.

1.3.1 Buffer Time in Emotional States

The emotional states of drivers could potentially impact the duration and quality of takeovers prompted by TORs in level-3 autonomous vehicles. Emotion is a cognitive ability exhibited by humans, triggered by external sensory stimuli [22]. Therefore, variations in the emotional state affect drivers’ capability to control a vehicle and may cause driving errors [23]. Various emotional states such as stress, excitement (positive arousal awareness), engagement (alertness and focused attention on task-relevant stimuli), interest, focus, and relaxation might affect driving behaviors. For instance, elevated levels of excitement have been linked to belonging to a group of traffic offenders [24], while a lack of engagement and driving while fatigued or drowsy has been identified as a significant cause of traffic accidents [10, 25]. Furthermore, frustration has been associated with increased anger, leading to risky driving behaviors like speeding, erratic driving, and reduced reaction times and distances before crashes [10, 26, 27]. However, no prior study has specifically examined the emotional states of drivers before, during, and after TORs in level-3 autonomous vehicles to ascertain the most suitable buffer times concerning these emotional states. Hence, one of the objectives of this study is to explore how different buffer times (both early and late) in level-3 autonomous vehicles impact drivers’ emotional states before, during, and after TOR incidents occurring at intersections.

In this study, emotions will be quantified using Electroencephalography (EEG). EEG is an electrophysiological monitoring technique that records brain activity associated with emotional states [26]. This technology has been widely employed in various studies as a non-invasive means of measuring and gathering data on drivers’ emotional states. For instance, EEG has been utilized to measure the physiological and driving behavior of sleep-deprived drivers [28]; assess driver distraction [29]; examine the effects of different noises within vehicles on driver relaxation and focus [26]; gauge emotional stress levels in self-driving cars [23]; and report neural activity in distinct brain regions, mental fatigue, levels of brain arousal, alertness, or engagement following extended simulated driving sessions [10, 26]. EEG serves as a valuable tool in capturing and analyzing brain activity associated with diverse emotional states, providing insights into drivers’ cognitive responses and emotional experiences while operating vehicles under varying conditions and scenarios.

1.3.2 Buffer Time on Eye Gaze

The driver’s visual attention and cognitive effort directed toward processing pertinent traffic information before approaching intersections play a pivotal role in ensuring safe driving practices. Rapidly processing data from traffic signs, other road users, and dashboard displays is crucial for making informed decisions and averting potential hazards. Visual attention can significantly affect driving performance, influencing lane control, steering precision, and the reaction time to potential dangers. Instances of inattention, improper lookout, and internal distractions are primary human-related factors contributing to over 92.6% of accidents [30]. A higher level of attention and focus signifies expertise and is closely associated with takeover performance in terms of both takeover time and quality.

Eye-gaze data, including the number and duration of fixations, can provide crucial insights into the driver’s visual attention and cognitive processes [31]. Eye fixation refers to a stable and sustained gaze on a particular location, while fixation duration (FD) represents the length of time the eye remains steadily focused within an area of interest (AOI) [5]. Longer FDs indicate increased cognitive effort, which corresponds to increased attention [21, 32]. Eye-gaze data has been used in traffic and transportation research to study focus, attention, and cognitive effort. Examples include studying drivers’ eye-scanning skills before, during, and after hazardous situations using eye-tracking technology [3] and investigating visual attention allocation to the driver’s lane and intersecting roads under varying traffic conditions and signs [9]. Research shows heightened visual attention to secondary tasks, such as texting or internet browsing, can lead to reduced cognitive effort. At the same time, an increased count of fixations indicates greater attention and focus on specific areas of interest [13, 19, 33,34,35].

One study [6] examined the effects of buffer time on drivers’ attentional focus and cognitive processing of traffic-related information during TORs in level-3 autonomous vehicles, comparing 3 s and 7 s buffer times as late and early TORs, respectively. Results showed a significant effect of buffer time on the combined measure of attention and cognitive processing, but not on either measure individually. Specifically, a 7-second buffer time resulted in greater attention to salient traffic information than a 3-second buffer time. The study’s small sample size (15 young drivers) may have contributed to the lack of individual effects.

Therefore, this study aims to investigate the impact of buffer time on the driver’s visual attention and cognitive processing of crucial information within the driving environment when a level-3 autonomous vehicle encounters a system failure just before intersections. In order to accurately record eye-gaze data during the two buffer times (4 and 8 s), eye gaze data is collected starting from 2 s before TORs and continues until the vehicle clears the intersection.

1.3.3 Buffer Time on Driving Behavior

Studies have explored the impact of buffer time on driving behavior within level 3 autonomous vehicles when drivers encounter TORs [9]. Findings have indicated that shorter buffer times (e.g., 5 s as opposed to 7 s) led to quicker decision-making and responses. However, this was accompanied by an overall poorer quality of driving behavior, characterized by reduced mirror and shoulder checks, increased acceleration, and a higher likelihood of collisions.

In a separate study [31], it was revealed that drivers needed a minimum of 7 s to effectively identify other vehicles within new or unfamiliar traffic scenes. This suggested that a buffer time of 4 s was inadequate, potentially leaving drivers in precarious situations without ample time to respond. Moreover, another study [13] demonstrated that providing drivers with less visual information during TORs resulted in slower handover times. However, this reduction in visual information did not notably affect the timing of drivers’ maneuvers aimed at avoiding collisions.

One study [6] investigated the influence of different buffer times (specifically, 3 and 7 s) on driving behavior, encompassing takeover duration, quality, attentional focus, and cognitive effort. The research reported significant results concerning the combined impact of takeover time and quality, yet did not find significant individual effects for these factors. It’s important to note that the study had a relatively small sample size, comprising only 15 participants.

1.4 Problem Statement

This study aims to achieve three specific objectives. Firstly, it seeks to explore the impact of two distinct buffer times (4 s and 8 s) on drivers’ experiences related to six emotional states — stress, excitement, engagement, interest, focus, and relaxation. Secondly, it aims to analyze how these buffer times affect drivers’ attentional focus and cognitive exertion while they process traffic-related information within five predefined Areas of Interest (AOIs): the secondary task, traffic signal, driver’s lane, intersecting roads, and the stop line at the intersection [33]. Lastly, the study aims to assess the duration and quality of takeovers performed by drivers who are engaged in a secondary task, in an “out-of-the-loop” state, and during system failures of level-3 autonomous vehicles occurring just before intersections.

2 Research Methods and Procedures

2.1 GMOST

The study employed a multiplayer driving simulator called GMOST, designed within a 3 × 3 mile terrain featuring realistic road networks, traffic signs, residential and commercial buildings, vegetation, and various environmental elements. This simulation incorporated two road types: one with two lanes and a speed limit of 25 mph, and another with four lanes and a speed limit of 35 mph. An intersection manager system replaced traditional traffic signals and signs. Within the simulation, 50 autonomous vehicles with randomized destinations were spawned throughout the environment. These vehicles were spawned at intervals of 30 s, with four vehicles introduced simultaneously until a specific threshold was reached. When these vehicles reached their destinations, they were rerouted to alternate destinations using an A* path search algorithm [5]. The simulation was designed with high traffic density, maintaining an average vehicle spacing of 100 m from each other, and a 4-second time gap [9] in all directions. Additionally, the autonomy of the vehicles could be disabled by the driver through specific actions, such as depressing the brake pedal by more than 10% or deviating from the designated path by 2 degrees. The autonomy mode could also be manually toggled on and off by pressing a button located on the steering wheel.
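The disengagement rule described above can be sketched as follows. This is a minimal illustration rather than the GMOST implementation: the function and argument names are hypothetical, and treating the 2-degree path deviation as an inclusive threshold is an assumption.

```python
BRAKE_THRESHOLD = 0.10        # brake pedal travel as a fraction ("more than 10%")
HEADING_THRESHOLD_DEG = 2.0   # deviation from the designated path ("2 degrees")


def next_autonomy_state(autonomy_on, brake_fraction, heading_deviation_deg,
                        button_pressed=False):
    """Return the autonomy state after one simulation step (hypothetical helper)."""
    if button_pressed:
        # The steering-wheel button toggles autonomy on and off.
        return not autonomy_on
    if not autonomy_on:
        return False
    if brake_fraction > BRAKE_THRESHOLD:
        return False
    # Assumption: a deviation of exactly 2 degrees already disables autonomy.
    if abs(heading_deviation_deg) >= HEADING_THRESHOLD_DEG:
        return False
    return True
```

Under these assumptions, a light brake tap below 10% leaves autonomy engaged, whereas a firmer press or a 2-degree deviation hands control to the driver.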

The intersection managers in the simulation operate on a first-come, first-served basis by considering the speed of approaching vehicles and their distance from the collider. Additionally, these managers are designed to identify instances where the driver gains control of an autonomous vehicle during a system failure, prioritizing the driver’s vehicle as the one expected to arrive first at the intersection. To facilitate this decision-making process, all artificial intelligence (A.I.) agents, including the driver’s vehicle, are equipped with ray casting capabilities. These capabilities allow the vehicles to track the speed, distance, and direction of other A.I. vehicles in the simulation. Within GMOST, intersections were strategically positioned approximately every half-mile across the environment. Around 155 m before reaching an intersection, the intersection manager communicated turn instructions (proceed or stop) to the vehicles by displaying a visual color on their respective dashboards. The driving simulation setup comprised 3 LCD screens, a driver’s seat, and a steering wheel mounted on a metal frame, complete with accelerator and brake pedals. Additionally, rear-view and side mirrors were installed to provide drivers with visibility of their surroundings from the rear and sides, enhancing situational awareness within the simulation environment.
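The first-come, first-served rule of the intersection manager can be illustrated with a short sketch. It assumes that speed and distance are combined into an estimated time of arrival (distance divided by speed), which the text does not state explicitly; the dictionary keys and the function name are hypothetical.

```python
def right_of_way_order(vehicles):
    """Order vehicle ids by priority at an intersection.

    vehicles: list of dicts with keys 'id', 'distance_m', 'speed_mps',
    and 'driver_takeover' (True if the driver took control after a failure).
    """
    def priority(vehicle):
        if vehicle["speed_mps"] > 0:
            eta_s = vehicle["distance_m"] / vehicle["speed_mps"]
        else:
            eta_s = float("inf")  # a stopped vehicle is not expected to arrive
        # A vehicle under driver takeover is treated as arriving first;
        # all others are ranked by estimated time of arrival (assumption).
        return (not vehicle["driver_takeover"], eta_s)

    return [vehicle["id"] for vehicle in sorted(vehicles, key=priority)]
```

For instance, a driver-controlled vehicle 150 m away would be ranked ahead of two autonomous vehicles 50 m and 100 m away travelling at the same speed.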

2.1.1 N-back secondary tasks

As a cognitive distraction, the n-back secondary task (Fig. 1) was introduced. This task entails the driver memorizing the sequence of letters presented, with 10-second intervals between each letter. Specifically, the driver is prompted to press the “B” button on the steering wheel if the current letter matches the one displayed two letters ago. Alternatively, the driver would press the “X” button if there is no match. For example, as shown by the numerical values 12 and 4 in Fig. 1 below, this specific participant provided 12 inaccurate and 4 accurate responses to the prompts. The aim of this task is to divert the driver’s attention away from the road, engaging their visual focus, manual involvement, and cognitive processing.

Fig. 1

The illustration of the N-back secondary task at the bottom, with the screen displaying the task’s letters as shown to the participants at the top (Displayed at the bottom of the middle LCD screen, with a size of 4 inches by 1 inch)
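The scoring of the 2-back task described above can be sketched as follows. The function name is hypothetical, and skipping the first two letters (which have no letter two positions earlier to compare against) is an assumption about how the simulator handles them.

```python
def score_n_back(letters, responses, n=2):
    """Count accurate and inaccurate responses in an n-back task.

    letters:   sequence of presented letters
    responses: button pressed for each letter, 'B' (match) or 'X' (no match)
    Returns (accurate, inaccurate).
    """
    accurate = inaccurate = 0
    # Assumption: the first n letters are not scored.
    for i in range(n, len(letters)):
        expected = "B" if letters[i] == letters[i - n] else "X"
        if responses[i] == expected:
            accurate += 1
        else:
            inaccurate += 1
    return accurate, inaccurate
```

The two returned counts correspond to the accurate and inaccurate totals displayed in Fig. 1.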

2.1.2 System Failure and Buffer Time

Participants took charge of the vehicle when the system failed, which occurred either 4 or 8 s before the vehicle reached an intersection, based on the distance and speed of the vehicle. The choice of 4 and 8 s was informed by prior research, in which 4 s was deemed insufficient [17, 36], whereas 8 s was considered adequate [36].
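Because the failure point is derived from the vehicle’s distance and speed, a buffer time can be converted into a trigger distance. The sketch below assumes constant speed up to the intersection, which is a simplification; the function name is hypothetical.

```python
MPH_TO_MPS = 0.44704  # exact conversion factor from miles per hour to metres per second


def tor_trigger_distance_m(speed_mph, buffer_s):
    """Distance before the intersection at which a failure yields the given buffer time."""
    return speed_mph * MPH_TO_MPS * buffer_s


for speed_mph in (25, 35):          # the two speed limits used in the simulation
    for buffer_s in (4, 8):         # the two buffer times
        distance = tor_trigger_distance_m(speed_mph, buffer_s)
        print(f"{speed_mph} mph, {buffer_s} s -> {distance:.1f} m")
```

At the 35 mph speed limit, for example, an 8 s buffer corresponds to roughly 125 m and a 4 s buffer to roughly 63 m under this constant-speed assumption.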

2.1.3 Takeover request (TOR)

The simulation utilized two forms of TOR: auditory and visual warnings. The auditory TOR comprised two high-pitched beeps at 65 dB sound pressure level, each lasting 240 milliseconds at 2800 Hz frequency, with intervals of 100 milliseconds between them, conforming to the NHTSA crash warning guidelines [37]. Conversely, the visual TOR involved a warning message showcased on the dashboard, featuring a red background along with the text “Fail Hands-On” (see Fig. 2). When the autonomy feature was activated, the displayed warning message changed to “Auto ON Hand Out” (see Fig. 2) [11].

Fig. 2

Representation of Auto-not-active or Auto-active (Displayed at the bottom of the middle LCD screen, each with a size of 1.25 inches by 1.25 inches)

2.2 Participants

Twenty drivers (19 male, 1 female, aged 20–35 years) with normal vision who had held a valid state driver’s license for at least 2 years were selected for the study. System failures occurred randomly at 8 or 4 s before the intersection, or not at all. Each participant experienced a random number of 8 s, 4 s, or no system failures. The total number of 8-second failures across all 20 participants was 304, while the total number of 4-second failures was 293. Accordingly, the participants were divided into two groups based on buffer times of 8 and 4 s, referred to as 8SG and 4SG, respectively. Participants had no prior experience with EEG or Emotiv.

Participants completed 20 hour-long sessions over two weeks, driving for about 25 min per session. Before each session, participants received a 30-minute training on the autonomous vehicle, including how to take control when the system fails and how to engage in the n-back secondary task. In addition, participants wore an EPOC+ headset with 14 nodes, which was calibrated for each participant once all 14 nodes were in 100% contact with the participant’s head. During the experiment, participants were instructed to engage in the n-back task while driving and to assume control of the vehicle when a TOR was issued. They were also instructed to stop at intersections if they were not given the right-of-way and to continue driving once granted the right-of-way by controlling the wheel and gas pedal as in real traffic.

2.3 Data

In this research, three categories of data were captured during the experimental sessions. These encompassed driving behavior, specifically takeover time and takeover quality; eye-gaze data, which included measures of attention and cognitive effort; and EEG data, comprising scores related to stress, excitement, engagement, interest, focus, and relaxation.

2.3.1 Driving Behavior

Takeover Time: Two metrics were utilized to compute takeover time: first-contact-time (FCT) and take control time (TCT). FCT represents the duration in milliseconds from TOR until the initial hand movement on the steering wheel, brake, or gas pedal [19, 38]. The initial FCT occurs when there is a 1-degree steering wheel rotation or any depression of the brake/gas pedal exceeding 0% [38]. On the other hand, TCT refers to the duration in milliseconds from the TOR until the first detectable brake/gas pedal or steering wheel response, defined by a 10% pedal position or a 2° steering wheel angle [18, 19, 21]. Additionally, the determination of TCT is based on whether the vehicle was signalled to stop or was granted the right of way to proceed [18].
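The two takeover-time metrics can be sketched from a logged time series. This is an illustration under stated assumptions: the sample layout and function names are hypothetical, steering is taken as the rotation in degrees relative to the wheel position at the TOR, and the 1°, 2°, and 10% thresholds are treated as inclusive while the 0% pedal threshold is strict.

```python
def _first_response_ms(samples, tor_ms, is_response):
    """Time from the TOR to the first sample that satisfies is_response, or None."""
    for t_ms, steering_deg, brake_frac, gas_frac in samples:
        if t_ms >= tor_ms and is_response(abs(steering_deg), max(brake_frac, gas_frac)):
            return t_ms - tor_ms
    return None


def first_contact_time_ms(samples, tor_ms):
    # FCT: 1-degree steering rotation or any pedal depression above 0%.
    return _first_response_ms(samples, tor_ms, lambda steer, pedal: steer >= 1.0 or pedal > 0.0)


def take_control_time_ms(samples, tor_ms):
    # TCT: 2-degree steering angle or 10% pedal position.
    return _first_response_ms(samples, tor_ms, lambda steer, pedal: steer >= 2.0 or pedal >= 0.10)


# Hypothetical log: (time in ms, steering in degrees, brake fraction, gas fraction)
log = [(0, 0.0, 0.0, 0.0), (1000, 0.0, 0.0, 0.0), (1400, 0.5, 0.0, 0.0),
       (1600, 1.2, 0.0, 0.0), (1900, 1.5, 0.05, 0.0), (2300, 1.8, 0.12, 0.0)]
print(first_contact_time_ms(log, tor_ms=1000))  # 600
print(take_control_time_ms(log, tor_ms=1000))   # 1300
```

By construction, TCT can never be shorter than FCT for the same takeover, since its thresholds are stricter.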

Takeover Quality

To calculate the takeover quality, the maximum longitudinal deceleration (MLD) was used, along with the standard deviation of lateral position (DLP). MLD characterizes whether the driver decelerated abruptly, smoothly, or with fluctuating braking when stopping at an intersection. A smoother deceleration is ideal [37].

The MLD calculation first computes the interval, that is, the distance the vehicle covers between the time deceleration begins and the time it ends. A 10% brake-pedal threshold [21] marks the onset of deceleration. The deceleration rate is then obtained, as in the formula below, by dividing the interval by the square of t, the time elapsed between the start and the end of deceleration.

$$\text{Deceleration}=\frac{\text{interval}}{t^{2}}$$
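As a minimal sketch, the formula can be evaluated as follows. The function and argument names are hypothetical, and metres and seconds are assumed as units.

```python
def deceleration(interval_m, t_start_s, t_end_s):
    """Deceleration rate as the interval divided by the squared elapsed time."""
    t = t_end_s - t_start_s
    if t <= 0:
        raise ValueError("deceleration must end after it starts")
    return interval_m / t ** 2


# Hypothetical example: 40 m covered while braking from t = 2 s to t = 6 s.
print(deceleration(40.0, 2.0, 6.0))  # 2.5
```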

DLP serves as a crucial metric for monitoring vehicle control and traffic safety [39], specifically in maintaining the vehicle within its designated lane. This parameter is computed using both horizontal and vertical coordinates, signifying the lane’s centre and the vehicle’s position. Once the ideal driving line is established and the vehicle’s position is identified, the nearest distance to this line is determined, representing the vehicle’s lateral position. The standard deviation derived from all these lateral position measurements reveals the extent of the vehicle’s swerving. A higher DLP value indicates increased horizontal movement within the lane. Therefore, a higher DLP suggests a lower-quality takeover, implying that the driver may not have maintained a straight path and might have frequently veered outside the lane. Conversely, a lower DLP signifies a higher-quality takeover. The calculation of DLP commences from the TORs and extends until the intersection point.
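The DLP computation described above can be sketched as follows. Several choices here are assumptions rather than details from the study: the ideal driving line is approximated by a straight segment between two points, the lateral offset is signed (left and right of the line have opposite signs), and the population standard deviation is used.

```python
import math
import statistics


def lateral_offset(point, line_a, line_b):
    """Signed perpendicular distance from point to the line through line_a and line_b."""
    (px, py), (ax, ay), (bx, by) = point, line_a, line_b
    dx, dy = bx - ax, by - ay
    # 2-D cross product divided by the segment length gives the signed distance.
    return (dx * (py - ay) - dy * (px - ax)) / math.hypot(dx, dy)


def dlp(positions, line_a, line_b):
    """Standard deviation of lateral position over all sampled vehicle positions."""
    offsets = [lateral_offset(p, line_a, line_b) for p in positions]
    return statistics.pstdev(offsets)


# Hypothetical positions sampled from the TOR until the intersection.
swerving = [(0, 1), (1, -1), (2, 1), (3, -1)]
steady = [(0, 0.5), (5, 0.5), (9, 0.5)]
print(dlp(swerving, (0, 0), (10, 0)))  # 1.0
print(dlp(steady, (0, 0), (10, 0)))    # 0.0
```

A vehicle that holds a constant offset from the ideal line yields a DLP of zero, while one that weaves around the line yields a larger value.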

2.3.2 Eye Gaze

The study used eye-tracking technology to analyze drivers’ attention and cognitive effort on traffic-related information at salient parts of the environment. The five AOIs are the secondary task window on the dashboard (ST); the traffic signal (TS), indicating whether the autonomous vehicle has the right of way to proceed through the intersection without stopping or needs to stop before proceeding; the driver’s lane (DL); the intersecting roads (IR), with a focus on interacting with other drivers, objects, and road lines to anticipate possible collisions and make proper decisions accordingly; and the stop line at the intersection (SL) [9]. Eye gaze data were collected on these AOIs from 2 s before TORs until the vehicle exited the intersection.

Attention

The number of eye fixations (NF) indicates a driver’s attention to an AOI [5]. Fixations are brief holding points where the gaze stays on one point before moving to another position while looking at an AOI [40]. NFs on an AOI are calculated by tallying total fixations on that AOI. Eye-gaze data is collected at intersections from 2 s before TORs until the vehicle exits, normalizing data across participants. To normalize NFs across groups, NFs are divided by buffer times. This yields five NFs for five AOIs at each intersection [7, 40].
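The NF computation can be sketched as below. The input format (one AOI label per detected fixation) and the function name are assumptions for illustration.

```python
from collections import Counter

AOIS = ("ST", "TS", "DL", "IR", "SL")


def normalized_fixation_counts(fixated_aois, buffer_s):
    """Tally fixations per AOI and divide by the buffer time (4 or 8 s)."""
    counts = Counter(fixated_aois)
    return {aoi: counts[aoi] / buffer_s for aoi in AOIS}


# Hypothetical sequence of fixated AOIs at one intersection with an 8 s buffer.
fixations = ["TS", "TS", "DL", "ST", "TS", "SL", "DL", "TS"]
print(normalized_fixation_counts(fixations, buffer_s=8))
# {'ST': 0.125, 'TS': 0.5, 'DL': 0.25, 'IR': 0.0, 'SL': 0.125}
```

Dividing by the buffer time is what makes the five NF values comparable between the 4 s and 8 s groups.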

Cognitive Effort

FD is linked to cognitive effort [40]. Mean eye fixation duration (MFD) indicates a driver’s cognitive effort in processing information on an AOI. FD is the time during which a driver stays fixated with a relatively stable eye gaze on an AOI, reflecting the cognitive effort exerted to process that AOI. MFD is calculated by dividing the total FD on a fixated AOI by the number of fixations on that AOI. Longer MFDs suggest more time spent processing the information presented by an AOI. The literature reports FDs lasting between 200 and 400 ms [40]. Thus, MFD provides a measure of cognitive effort exerted on the information presented by an AOI.
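As a minimal sketch, MFD per AOI can be computed from a list of fixations. The input format (AOI label and duration in milliseconds per fixation) and the function name are assumptions.

```python
def mean_fixation_duration_ms(fixations):
    """Return {aoi: total fixation duration / number of fixations} for fixated AOIs."""
    totals, counts = {}, {}
    for aoi, duration_ms in fixations:
        totals[aoi] = totals.get(aoi, 0) + duration_ms
        counts[aoi] = counts.get(aoi, 0) + 1
    return {aoi: totals[aoi] / counts[aoi] for aoi in totals}


# Hypothetical fixations at one intersection.
print(mean_fixation_duration_ms([("TS", 200), ("TS", 400), ("DL", 300)]))
# {'TS': 300.0, 'DL': 300.0}
```

An AOI that was never fixated has no MFD under this definition, so it is simply absent from the result.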

2.3.3 Emotional State

The research utilized the EPOC+ device, Emotiv App, EmotivPro software, and virtual serial port applications to collect data regarding six emotional states: stress, excitement, engagement, interest, focus, and relaxation. These emotional states were tracked before, during, and after TORs triggered at one of two buffer times, either 4 or 8 s before intersections. The EPOC+ device, a wireless 14-channel EEG device with 2 reference nodes, was employed to capture the six emotions via the electrical signals released by the brain [32]. EPOC+ detects these signals through the 14 electrodes placed on the head [41].

The configuration of the 14 sensors and 2 references on the EPOC+ device adheres to the 10–20 system, wherein the spacing between adjacent electrodes is determined as 10% or 20% of the total front-back or right-left distance of the skull [41]. Within this framework, four distinct categories of brainwaves are identified: delta (0.1–3.5 Hz), theta (4–7.5 Hz), beta (14–30 Hz), and gamma (> 30 Hz), each corresponding to specific states of brain activity [41]. All 14 nodes made full contact with the participants’ scalps with the help of a saline solution to ensure accurate brainwave detection and data recording. The brainwave data, collected continuously, are wirelessly transmitted to a USB dongle for processing through the EmotivPro software. Subsequently, they are visually presented in real time on the Emotiv App, detailing the six emotional states [41]. Emotiv has not publicly released its classification algorithms [41].

Each participant had a personalized user profile, and data was continually recalibrated over time to enhance precision. The six emotional states were recorded by the EPOC+ device [32] on a scale ranging from 0 to 1 for each channel, with a higher score corresponding to greater emotional intensity. The EPOC+ device did not require prior training to detect emotions. Emotiv Professional software was responsible for recording the data, with specific triggers identified within GMOST: at 2 s prior to TORs and upon the vehicle’s departure from the intersection.

2.4 Data Analysis

In this study, two groups defined by buffer times of 8 and 4 s (8SG and 4SG) were compared in terms of driving behavior, eye-gaze patterns, and EEG data related to emotional states. Because each of these three constructs was measured by several dependent variables, three separate MANOVA tests were conducted, one per construct, each with buffer-time group (8SG vs. 4SG) as the independent variable. The significance level was set at 0.05, and the overall multivariate significance of the dependent variables for each group was determined using Hotelling’s T² statistic.

2.4.1 Driving Behavior

Takeover quality is measured using MLD and DLP, whereas takeover time is measured using FCT and TCT. A MANOVA is run with four dependent variables (DLP, TCT, FCT, and MLD) and buffer-time group (8SG vs. 4SG) as the independent variable. The MANOVA compares the groups on each dependent variable and on all dependent variables combined as a single construct.

2.4.2 Eye-Gaze

For each AOI (ST, TS, DL, IR, and SL), a separate attention (NF) score is recorded, giving five NF scores. A MANOVA is run with these five NF scores as dependent variables and buffer-time group (8SG vs. 4SG) as the independent variable. This way, attention over each of the five AOIs and over all of them combined is analyzed. Likewise, for each AOI, a separate cognitive effort (MFD) score is recorded, giving five MFD scores, and a second MANOVA is run with these five MFD scores as dependent variables and buffer-time group as the independent variable. This way, cognitive effort over each of the five AOIs and over all of them combined is analyzed.

2.4.3 Emotional State

Six emotional states (stress, excitement, engagement, interest, focus, and relaxation) were considered as dependent variables, representing participants’ emotional states from 2 s before TORs until the vehicle cleared the intersection. A MANOVA was performed with buffer-time group (8SG vs. 4SG) as the independent variable and the six emotional states as the dependent variables. The assessments for eye gaze, driving behavior, and emotional data were conducted by the same raters employing a consistent rubric for evaluation. In all instances of MANOVA and ANOVA, the significance threshold was set at 0.05.

3 Results and Discussions

3.1 Emotional State

EEG data were collected and analyzed, and the results in Tables 1 and 2 show that participants in the 8SG had higher levels of focus and engagement, while those in the 4SG had higher levels of stress, relaxation, excitement, and interest.

Table 1 The means for EEG Data (emotional intensity unit on a scale from zero to one) across groups
Table 2 Standard deviations in EEG Data (emotional intensity unit on a scale from zero to one) across groups

A MANOVA was conducted to examine the statistical distinctions between two groups (4SG and 8SG) concerning all six emotional states (stress, excitement, engagement, interest, focus, and relaxation) treated as a single construct. The Box’s test of equality of covariance matrices indicated that the observed covariance matrices of the dependent variables were not equal across the groups (significance level: 0.001). However, Levene’s test of equality of error variances demonstrated that variances were equal across groups for all dependent variables. The results revealed that the multivariate analysis did not exhibit a statistically significant difference between the two groups (Pillai’s T = 0.022, F (6, 390) = 1.45, p = .193, partial eta squared = 0.022).
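The reported multivariate statistic can be related to its F value with a short sketch. When only two groups are compared, the MANOVA has a single nonzero eigenvalue, so Pillai’s statistic V converts exactly to F = (V / (1 − V)) · (df2 / df1), with df1 equal to the number of dependent variables and df2 = N − df1 − 1. The total of N = 397 observations used below is not stated in the text; it is inferred here from the reported error degrees of freedom (390 = N − 6 − 1), and the function name is hypothetical.

```python
def f_from_pillai_two_groups(pillai, n_dependent, n_total):
    """Exact F statistic and degrees of freedom for a two-group MANOVA."""
    df1 = n_dependent
    df2 = n_total - n_dependent - 1
    f_value = (pillai / (1.0 - pillai)) * (df2 / df1)
    return f_value, df1, df2


# Reported value 0.022 with six emotional states; N = 397 is an inferred assumption.
f_value, df1, df2 = f_from_pillai_two_groups(0.022, n_dependent=6, n_total=397)
print(f"F({df1}, {df2}) = {f_value:.2f}")
```

The result is approximately 1.46, close to the reported 1.45; the small gap is consistent with the statistic having been rounded to three decimals. In the two-group case the partial eta squared equals V, which matches the reported 0.022.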

The analysis of coefficients for the linear combinations distinguishing the two groups revealed that excitement (Partial eta squared = 0.004, p = .189) had the highest contribution, although it lacked statistical significance. Conversely, relaxation (Partial eta squared = 0.0, p = .904) showed the lowest contribution and was also statistically insignificant. Subsequent one-way MANOVA tests corroborated these findings, indicating that the 4SG and 8SG groups did not exhibit statistically significant differences across any of the emotional states, encompassing stress, excitement, engagement, interest, focus, and relaxation. Therefore, in this study, buffer time, whether early or late, had no statistically significant effect on drivers’ emotional states.

3.2 Eye Gaze

Table 3 presents descriptive statistics regarding attention, indicated by NFs (Number of Fixations), and cognitive effort, indicated by FDs, among all participants across 4SG and 8SG. Table 3 shows that participants in 8SG had higher NFs than those in 4SG in three AOIs (ST, TS, and DL), while those in 4SG had higher NFs than those in 8SG in two AOIs (IR and SL). However, the differences were small in both cases. This suggests that participants in 8SG paid slightly more attention to the secondary task, traffic signs, and staying in their lane, while those in 4SG focused slightly more on incoming traffic and stopping at the stop line. Table 3 also shows that participants in 4SG had slightly higher mean FD values for TS, DL, and SL compared to those in 8SG, while the FD values were the same for ST and IR. These results suggest that, while not statistically significant, participants in 4SG exerted slightly more cognitive effort in processing information presented by TS (provided by the intersection manager regarding how to proceed at the intersection), DL (to control the vehicle), and SL (to ensure proper stopping before the stop line at the intersection) compared to participants in 8SG.

Table 3 The means and Std. Deviations for number of fixations and fixation durations across the two groups

A MANOVA was conducted to determine whether there were significant differences between the two groups on ten gaze measures (NFs and FDs on ST, TS, DL, IR, and SL) when considered as a single construct. The results from the analysis indicated that there was no statistically significant difference between the two groups on all ten gaze measures when considered together (Pillai’s T = 0.048, F (10, 60) = 0.859, p = .573, partial eta squared = 0.443). Hence, when analyzing all five AOIs collectively, there were no significant differences observed between participants in the two groups concerning attention and cognitive effort.

Two separate multivariate analyses were conducted to examine the differences between the two groups concerning attention and cognitive effort across all five AOIs. The results revealed that no significant differences were found between the groups concerning attention (Pillai’s T = 0.5, F (5, 67) = 1.853, p = .105, partial eta squared = 0.623) or cognitive effort (Pillai’s T = 0.002, F (5, 65) = 0.067, p = .997, partial eta squared = 0.065). It’s worth noting that the difference between the groups was more pronounced in terms of NF gaze measures (attention) compared to FD gaze measures (cognitive effort).

The examination of the coefficients for the linear combinations distinguishing the groups showed that NFs on TS contributed significantly (partial eta squared = 0.549, p = .038) to differentiating the two groups. Conversely, NFs on ST (partial eta squared = 0.1, p = .512), DL (partial eta squared = 0.333, p = .126), IR (partial eta squared = 0.074, p = .649), and SL (partial eta squared = 0.14, p = .383), and FDs on ST (partial eta squared = 0.052, p = .9), TS (partial eta squared = 0.055, p = .84), DL (partial eta squared = 0.08, p = .608), IR (partial eta squared = 0.051, p = .945), and SL (partial eta squared = 0.055, p = .837), did not contribute significantly to differentiating the groups. These results suggest that participants in 8SG exhibited statistically significantly higher NFs on TS than those in 4SG. That is, participants in 8SG allocated statistically significantly more attention than participants in 4SG to the traffic sign conveyed by the intersection manager regarding the right of way at the approaching intersection. Follow-up ANOVAs were conducted to determine the source of the difference in attention over the AOIs. Table 4 shows a significant difference in participants’ attention to the traffic sign (TS), with 8SG paying significantly more attention than 4SG. No significant difference was found in the other dependent variables.

Drivers’ overall attention and cognitive effort across all AOIs remained similar regardless of the timing of TOR prompts (4 s or 8 s) during a level-3 autonomous vehicle system failure before approaching an intersection. However, participants exhibited statistically significantly higher attention to traffic signs when prompted with an 8-second buffer than with a 4-second buffer.

3.3 Driving Behavior

The study’s third aim was to explore whether the duration and quality of takeovers in a level-3 autonomous vehicle are influenced by buffer time when a system failure happens before an intersection. This assessment focused on instances where the driver was engaged in a secondary task and was out-of-the-loop.

Table 4 reports FCT and TCT as measures of takeover duration, and DLP and MLD as measures of takeover quality, for the 4SG and 8SG participant groups. FCT measures drivers’ motor readiness to take over control of the vehicle if needed. Table 4 shows that participants in the 8-second buffer group (8SG) had a shorter FCT than those in the 4-second buffer group (4SG), indicating that participants in 8SG were quicker to respond to TORs. Additionally, participants in 8SG had a shorter TCT, meaning they were faster at taking control of the vehicle than those in 4SG. Taken together, the mean values show a clear pattern: 8SG had both a shorter FCT and a shorter TCT, which may be attributed to the extra time participants in 8SG had to react to the TORs.

MLD was used to evaluate the quality of deceleration, where lower values indicate smoother deceleration. Table 4 shows that participants in 4SG had better deceleration quality than those in 8SG. This was unexpected: participants in 4SG were expected to have worse deceleration quality because they had less time to react.

DLP measured how well drivers kept the vehicle laterally within their allocated lane, from the TOR until either stopping at the stop line (if they were directed to stop) or passing through the intersection (if they were given the right of way to proceed). A lower value indicates better takeover quality. Participants in 4SG kept the vehicle within their allocated lane slightly better, with a slightly lower mean DLP score, as shown in Table 4. Thus, on both quality measures, participants in 4SG performed slightly better than those in 8SG.

To summarize, individuals in 8SG exhibited faster vehicle control and quicker responses to TORs, while participants in 4SG demonstrated smoother deceleration and an improved ability to maintain the vehicle within the designated lane.

Table 4 Take-Control Duration and Quality among the two groups

A MANOVA was performed to compare the driving behavior of participants in the 8SG and 4SG groups on four measures (FCT, TCT, MLD, and DLP) considered collectively. The test of equality of covariance matrices revealed that the observed covariance matrices of the dependent variables were unequal across the groups (p < .001). Nevertheless, the analysis was considered robust to this violation because the ratio of the largest group size to the smallest was 1.04 (304/293), which is below the suggested threshold of 1.5 [42].

The results showed no statistically significant difference between the groups when the four dependent variables were considered collectively (Pillai’s T = 0.013, F(4, 592) = 1.99, p = .094, partial eta squared = 0.013), indicating that buffer time had no statistically significant effect on the quality and duration of takeovers.
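As a quick consistency check on statistics of this kind, the sketch below recovers Pillai’s trace from the F statistic and its degrees of freedom. It assumes the two-group case, in which the exact relation F = (V / (1 - V)) * (df_error / df_hypothesis) holds and the multivariate partial eta squared equals V. The function name is illustrative and is not part of the study’s analysis pipeline.

```python
def pillai_from_f(f_value: float, df_hyp: int, df_err: int) -> float:
    """Recover Pillai's trace V from F for a two-group MANOVA.

    Assumption: with exactly two groups,
        F = (V / (1 - V)) * (df_err / df_hyp),
    so V = F * df_hyp / (F * df_hyp + df_err).
    In this case V also equals the multivariate partial eta squared.
    """
    return f_value * df_hyp / (f_value * df_hyp + df_err)


# Overall test on the four driving measures as reported: F(4, 592) = 1.99
v_overall = pillai_from_f(1.99, 4, 592)
print(round(v_overall, 3))  # 0.013, matching the reported Pillai's T
```

Under these assumptions, the reported F statistic, degrees of freedom, Pillai’s trace, and partial eta squared for the driving-behavior MANOVA agree with one another.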

Two additional MANOVAs were conducted to evaluate the differences between the 8SG and 4SG groups in takeover duration (FCT and TCT combined) and takeover quality (DLP and MLD combined). The findings indicated no statistically significant difference between the groups in either takeover duration (Pillai’s T = 0.009, F(2, 594) = 2.68, p = .07, partial eta squared = 0.009) or takeover quality (Pillai’s T = 0.006, F(2, 594) = 1.76, p = .172, partial eta squared = 0.006).
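To illustrate what a two-variable MANOVA of this kind computes, the following minimal sketch derives Pillai’s trace, V = tr(H(H + E)^-1), from raw observations of two dependent variables and converts it to an F statistic for the two-group case. The data are made-up placeholder values, not the study’s measurements, and all function and variable names are hypothetical. The study’s own analysis was presumably run in a statistics package.

```python
from statistics import fmean


def sscp(rows, centre):
    """2x2 sums of squares and cross-products of (x, y) rows about a centre."""
    sxx = sum((x - centre[0]) ** 2 for x, _ in rows)
    syy = sum((y - centre[1]) ** 2 for _, y in rows)
    sxy = sum((x - centre[0]) * (y - centre[1]) for x, y in rows)
    return [[sxx, sxy], [sxy, syy]]


def pillai_two_dvs(groups):
    """Pillai's trace for any number of groups and exactly two dependent variables."""
    all_rows = [row for g in groups for row in g]
    grand = (fmean(x for x, _ in all_rows), fmean(y for _, y in all_rows))
    total = sscp(all_rows, grand)          # T = H + E
    error = [[0.0, 0.0], [0.0, 0.0]]       # E: pooled within-group SSCP
    for g in groups:
        centre = (fmean(x for x, _ in g), fmean(y for _, y in g))
        within = sscp(g, centre)
        for i in range(2):
            for j in range(2):
                error[i][j] += within[i][j]
    hyp = [[total[i][j] - error[i][j] for j in range(2)] for i in range(2)]  # H
    det = total[0][0] * total[1][1] - total[0][1] * total[1][0]
    total_inv = [[total[1][1] / det, -total[0][1] / det],
                 [-total[1][0] / det, total[0][0] / det]]
    # trace(H @ T^-1)
    return sum(hyp[i][k] * total_inv[k][i] for i in range(2) for k in range(2))


# Hypothetical (FCT, TCT) pairs in seconds; NOT data from the study.
group_4s = [(1.2, 2.9), (1.5, 3.4), (1.1, 2.7), (1.6, 3.8), (1.3, 3.1)]
group_8s = [(1.0, 2.5), (1.4, 3.0), (0.9, 2.4), (1.3, 3.3), (1.2, 2.6)]

v = pillai_two_dvs([group_4s, group_8s])
n_total, p = len(group_4s) + len(group_8s), 2
df_err = n_total - p - 1                   # error df for two groups
f_value = (v / (1 - v)) * (df_err / p)     # exact F when there are two groups
print(f"Pillai's trace = {v:.3f}, F({p}, {df_err}) = {f_value:.3f}")
```

With 597 observations and two dependent variables, the same formula gives the error degrees of freedom 597 - 2 - 1 = 594, as in the F(2, 594) statistics reported above.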

The examination of the coefficients for the linear combinations distinguishing the two groups indicated that TCT (F(1, 595) = 3.861, p = .0498, partial eta squared = 0.006) contributed significantly to distinguishing the two groups, while FCT (F(1, 595) = 3.783, p = .052, partial eta squared = 0.006), MLD (F(1, 595) = 3.16, p = .076, partial eta squared = 0.005), and DLP (F(1, 595) = 0.344, p = .558, partial eta squared = 0.001) did not. This indicates that participants in 8SG were faster in taking control of the vehicle. According to the partial eta squared values, only 0.6%, 0.6%, 0.5%, and 0.1% of the variability in participants’ TCT, FCT, MLD, and DLP values, respectively, is attributable to group membership (4SG and 8SG). Group membership therefore did not have a statistically significant effect on FCT, MLD, or DLP, and its effect on TCT, although statistically significant, was very small.
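The share of variance attributable to group membership can be derived directly from each univariate F statistic. The sketch below uses the standard identity partial eta squared = F * df_effect / (F * df_effect + df_error) and expresses the result as a percentage. The function and variable names are illustrative only.

```python
def partial_eta_squared(f_value: float, df_effect: int, df_error: int) -> float:
    """Partial eta squared recovered from a univariate F test."""
    return f_value * df_effect / (f_value * df_effect + df_error)


# F(1, 595) values as reported for each driving measure
reported_f = {"TCT": 3.861, "FCT": 3.783, "MLD": 3.16, "DLP": 0.344}

percent_explained = {
    measure: round(100 * partial_eta_squared(f, 1, 595), 1)
    for measure, f in reported_f.items()
}
print(percent_explained)  # {'TCT': 0.6, 'FCT': 0.6, 'MLD': 0.5, 'DLP': 0.1}
```

Each value is well under 1%, so buffer-time group explains only a very small fraction of the variability in any of the four measures.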

In summary, the findings reveal that participants in 8SG took control of the vehicle statistically significantly faster (shorter TCT) than those in 4SG. However, there were no statistically significant differences between the groups in response times to TORs (FCT) or in takeover quality.

4 Conclusions

The first objective of this study was to examine how different buffer times (8 and 4 s) affect drivers’ emotional states (stress, excitement, engagement, interest, focus, and relaxation) before, during, and after TORs before intersections. The results did not show that the two buffer times made a difference in the six emotional states considered together. Additionally, follow-up statistical analyses did not find that the two buffer times made a difference in any of the six emotional states separately. Thus, the results provide no evidence that an early (8SG) or late (4SG) buffer time affects drivers’ emotions. Because emotions have been linked to driving performance [10, 23,24,25,26,27], this suggests that buffer time is unlikely to affect driving performance through drivers’ emotions.

The second objective was to explore the effects of the two buffer times on drivers’ attention and cognitive effort when processing traffic-related information in five specified AOIs upon a system failure in level-3 autonomous vehicles before intersections, while the driver is out-of-the-loop. The results showed that buffer time (early or late) did not have an impact on the ten gaze measures (NFs and FDs on five AOIs) considered together, indicating that drivers’ overall attention and cognitive effort were not affected by the buffer time. This contradicts the prior study [6], which reported that buffer time did have an impact on combined attention and cognitive processing effort under similar traffic conditions within a simulated traffic environment. One explanation for the contradiction could be the different early and late buffer time values (3 and 7 s in the prior study vs. 4 and 8 s in this study). On the other hand, the results revealed that drivers in 8SG paid statistically significantly more attention than those in 4SG to the traffic sign received from the intersection manager. That is, when drivers are given a longer buffer time, they pay more attention to the traffic sign, possibly because they have more time to prepare for the upcoming intersection. Again, this result contradicts the prior study [6], which did not find that buffer time affects attention or cognitive effort individually. This may be because the prior study had a smaller sample size than this study and used different buffer times.

The third objective was to investigate the impact of the two buffer times on the duration and quality of takeovers. The results showed that the buffer times did not have a statistically significant effect on the four takeover duration and quality measures (FCT, TCT, DLP, and MLD) considered together. However, TCT was the only measure that contributed statistically significantly to distinguishing the two groups, with 8SG having a shorter TCT than 4SG. These results contradict the prior study [9], which reported that a shorter buffer time leads to quicker decision-making and response. One explanation for this contradiction may be the difference in sample sizes.

A future study could examine the correlations among drivers’ eye gaze, emotional states, and driving behaviors, either across different buffer times or independent of buffer time.