Realism and Field of View Affect Presence in VR but Not the Way You Think

Authors:

Tom Charlie Lancaster,

Christof Lutteroth

CHI '23: Proceedings of the 2023 CHI Conference on Human Factors in Computing Systems

Article No.: 399, Pages 1 - 17

https://doi.org/10.1145/3544548.3581448

Published: 19 April 2023

Abstract

Presence is one of the most studied and most important variables in immersive virtual reality (VR) and it influences the effectiveness of many VR applications. Separate bodies of research indicate that presence is determined by (1) technical factors such as the visual realism of a virtual environment (VE) and the field of view (FoV), and (2) human factors such as emotions and agency. However, it remains unknown how technical and human factors may interact in the presence formation process. We conducted a user study (n=360) to investigate the effects of visual realism (high/low), FoV (high/low), emotions (focusing on fear) and agency (yes/no) on presence. Counter to previous assumptions, technical factors did not affect presence directly but were moderated through human factors. We propose TAP-Fear, a structural equation model that describes how design decisions, technical factors and human factors combine and interact in the formation of presence.

Figure 1:

1 Introduction

Immersive virtual reality (VR) allows users to have out-of-this-world experiences from the safety of their home. Head mounted display (HMD) technology can immerse a user’s senses in the virtual world and help induce feelings of presence. According to Witmer and Singer [114], presence as a construct refers to the user’s subjective feeling of actually being in a place or environment, even when they are situated in another. However, in the case of experiencing a virtual environment (VE), presence refers to experiencing the VE rather than the actual physical one [114]. In contrast, immersion refers to an objective property of a system and the extent to which it can engage the user’s sensorimotor channels and perception of the VE [92]. Immersion is therefore affected by the technical characteristics of the VR setup including characteristics of both the software, such as realism of the VE, and the hardware, such as the field of view of the HMD [19]. Critically, the level of immersion provided by the system may affect the illusion of being in the virtual world, and the process by which presence is created has been described as a ‘negotiation’ between the user and the technical, or immersive, qualities of a VR system [40].

As a result, a body of research has explored individual technical factors, showing that improvements to field of view (FoV) [11, 20, 43, 87], level of detail (LoD) [12, 118], frame rate [6, 7, 29] and stereoscopy [45, 54, 78] can enhance presence. Over the last three decades there have been considerable improvements in the technical capabilities of commercial VR HMDs. Increases in computational power and display technology have allowed increasingly sophisticated VEs to be displayed, and characteristics such as the FoV to approach those of the human eye [18, 19]. Additionally, the commercial landscape has changed, with consumers having access to a wide variety of immersive HMDs, ranging from the most expensive VR setups which require high-powered gaming computers to run (e.g., Varjo VR-3¹) to more affordable and popular untethered HMDs (e.g. Meta Quest 2²). However, despite these technological advances there are certain characteristics, such as VE realism and HMD FoV, that compete for limited computational resources, and designers of VR experiences still need to prioritise certain technical improvements over others when designing for the masses. This raises questions about how technical factors interact with each other to affect presence, and to date there has been little to no systematic combinatorial investigation.

In addition to the immersive quality of the VR setup, research has shown how human factors can affect a user’s presence. The emotions felt by the user [32] and their perceived agency within the VE [33, 50] have been identified as important human factors that affect presence, and there are complex interactions that occur during this process [42]. Emotions that rate highly on arousal, and in particular fear, appear to hold a special place in the formation of presence [10, 32, 44], having a strong evolutionary importance [65]. Furthermore, eliciting a sense of fear is at the centre of a wide range of VR applications, including therapeutic interventions [48, 70, 107], training for crisis management [28], and desensitisation to phobia-inducing stimuli [41, 69], and is even commonly used as a game mechanic in popular VR games [52]. Fear is not only important in itself but can change the way other factors affect presence: Jicol et al. [42] found that a user’s sense of agency strongly affected their presence in a fear-inducing VE but not in a happiness-inducing VE. This highlights how the formation of presence is determined by interactions with human factors, and shows that we cannot understand how a technical factor influences presence unless we systematically consider its effects in combination with human factors such as fear and agency.

The study presented here is the first to systematically investigate the effects on presence of important technical factors in the wider context of human factors. That is, we provide insights into a) interactions between the two technical factors realism and field of view, b) interactions between these technical factors and the two human factors agency and fear, and c) the importance of the technical and human factors in the formation of presence relative to each other. Such a systematic study of both technical and human factors in VEs has the potential to provide a more holistic view of how design decisions can affect presence in VR. We pose the following research questions:

RQ1

How do visual realism and FoV affect VR presence?

RQ2

How can we describe the formation of VR presence based on technical and human factors?

To address these questions, we conducted a large-scale study with 360 participants exploring the formation of presence in VR, by systematically varying the technical factors visual realism (high/low) and FoV (high/low), as well as the human factors emotion (focusing on fear) and agency (yes/no), yielding 2 × 2 × 2 × 2 = 16 between-group conditions. We consider two levels for each of the four independent variables, using levels based on realistic design choices that are popular and meaningful with regard to current consumer-grade VR hardware and experiences. For emotion, we focus our investigation on fear, and compare it with happiness as a popular emotion of opposite valence. To address RQ1, we first analysed the data based on hypotheses derived from related work using analysis of variance (ANOVA) and linear regression methods. The large sample size allowed us to follow a robust approach even where related work did not provide plausible hypotheses [17]. That is, it allowed us to avoid Type 1 errors by correcting for multiple comparisons, while still retaining a high power and avoiding Type 2 errors. To address RQ2, we used structural equation modelling (SEM) to formulate and evaluate the novel TAP-Fear model, which demonstrates the close relationships between fear, agency and presence. In comparison to earlier models of presence, such as the Presence, Emotion and Agency (PEA) model [42], the TAP-Fear model provides a better fit to our data and describes the effects of both technical and human factors. In summary, we make the following contributions:

(1)

Evidence that visual realism and FoV do not affect presence directly.

(2)

Evidence that the effects of visual realism and FoV on presence are moderated by induced fear and perceived agency, respectively.

(3)

The TAP-Fear model, describing how technical and human factors work together in presence formation.

2 Related Work

The importance of presence for the effectiveness of applications ranging from entertainment [86] to learning [67, 75, 103] and sensory-motor rehabilitation [8, 15, 16, 79] has led to significant efforts to investigate those elements that contribute to its formation. Presence is also a crucial factor that drives user retention and adoption of VR technology [39]. Earlier definitions of presence described it as merely the sensation of ‘being there’ [55]. However, Weber et al. [106] highlight that in the case of HMD VR, the feeling of ‘being there’ is easily achievable because sensory stimulation from the outside world is blocked and replaced by the virtual one. Still, that does not mean that the user regards the VE as realistic or believable, which are crucial characteristics of presence [106]. The term ‘presence’ was first adopted four decades ago, due to the need to quantify the increasing ability of new media to provide rich and realistic VEs that could transport users from the real world into the virtual [66]. This evidence points to the strong reliance of VR on technical properties to elicit presence and distinguish itself from conventional 2D screens [109]. Indeed, achieving ever higher user presence has been described as the single most important goal of VR experiences [108].

In the past few years, decreasing costs of screen and tracking technology as well as an exponential increase in computational power have made commercial VR featuring previously prohibitive technical qualities affordable to the average user [38]. However, reproducing very high fidelity VEs and affording extensive body tracking still requires state of the art hardware and software which is not yet viable in consumer grade HMDs. This is especially relevant for modern untethered VR HMDs such as the popular Meta (Oculus) Quest 2. The added portability and reduced cost of such devices come with correspondingly reduced computational power, which in turn means that technical improvements compete for limited resources. From the multitude of technical factors that characterise VR HMDs, perhaps the most important for presence and yet computationally taxing are the visual factors of the VE in what concerns software and the FoV in terms of hardware. Facilitating higher visual realism comes with increased demand for computational power as more detailed objects have to be rendered on the screen. Similarly, a wider FoV poses the same challenge because more of these objects need to be rendered to fill each of the wider frames. Furthermore, with increased FoV comes the added cost of a larger, more costly display. Maximising both realism and FoV is not yet viable on consumer-grade hardware and especially on portable HMDs [18]. This is relevant to consumers as cost and portability are significant prohibitive factors that hinder user adoption of HMD based VR [39]. It is thus crucial to first understand the actual benefits for user presence of increased visual realism and FoV, in order to inform guidelines for both consumer-grade VR hardware engineers and content creators.

2.1 Visual Realism

It has been argued that the realism of a VE is the most important factor ultimately driving user presence [91]. The visual realism of a VE has itself been described as composed of two main components. First, geometric realism refers to how realistic objects within the VE look, or how close they are to their real world counterparts [93]. The second aspect of realism is illumination realism, which refers to the fidelity of lighting and shadows cast by objects in the VE [93]. These can be further divided into quality of objects and terrain [110], texture and lighting [118], and shadow quality [60, 93, 95]. It has been shown that when presented with a VE, users will invariably compare the look of virtual objects with real life ones, in order to judge the level of congruence [100]. Indeed, Weber and colleagues [106] state that from the perceptual and conceptual point of view of the user, realism can be sub-divided into separate components such as coherence [89], fidelity [3], judgement of reality [5] and perceived realism [14, 81, 90]. Thus, most factors that determine the level of realism of a VE are heavily dependent on the way a VE looks.

However, due to the multitude of factors manipulated and hardware utilised across studies, there is currently no consensus as to the exact effect of visual realism on presence. The strong effects of visual realism on user experience are hard to contest. For example, there is evidence that perceived VE realism can even affect user behaviour in VR [88]. Some studies have found that visual realism can be beneficial for presence [49, 93, 110], while others found no such effect [23, 57, 60, 118]. Still, a portion of this work did not account for affective content, which is a characteristic of most VR games. One more recent study aimed to address this limitation and presented users with two versions of a VR game that elicited fear [38]. The authors manipulated polygon count and texture resolution and found that a higher level of realism enhanced presence [38].

Moreover, not only the software technical component contributes to visual realism. Even in a hypothetical scenario where a VR VE could be perceptually indistinguishable from reality, the way in which this VE is perceived is still mediated by the hardware of the HMD, in particular the display. In other words, presence may not be solely determined by what users perceive in a VE but also how users perceive it.

2.2 Field of View

The impact of field of view on presence has long been studied with 2D screens, e.g. [12, 37, 47], with results suggesting that wider screens enhance the immersive features of an application. VR HMD displays are still evolving, with many features such as pixel density and colour accuracy still in need of improvements [18]; the afforded FoV of HMDs has been constantly improving since their appearance on the market. It is important to acknowledge, however, that the human’s average binocular field of view (FoV) reaches up to 190° [2], whereas the most popular consumer HMDs currently stand at around half that. The benefits of wider FoVs for VR presence have been highlighted by early studies [53, 73, 98] but the findings were not unanimous [43]. Again, these studies are dated, employing setups with far lower FoVs than are prevalent today. A more recent meta-analysis by Cummings and Bailenson [19] showed that the FoV of an HMD can play an even more important role in presence formation than visual realism. However, this meta-analysis was published before the recent wave of modern consumer HMDs [86]. A recent study from 2021 showed that participants using a variety of modern HMDs experienced lower presence with a reduced FoV [102]. This study, however, was conducted remotely and due to differences in the native FoVs of various HMDs, it is difficult to draw conclusions as to what precisely the wide and narrow FoVs were. A reduced FoV not only affects presence directly but can lead to skewed distance estimation within VR [62], which influences users’ ability to co-locate themselves among other landmarks within the VE, which is a prerequisite for presence [97].

Despite the apparent advantages of wider FoVs for presence, deploying wider screens to HMDs presents several challenges. First, a wider FoV implies that a larger portion of the VE is displayed at one time, which can be extremely taxing on limited computational power. At the hardware level, as remarked in a recent review by Angelov et al. [4], HMDs often have to compromise other qualities, such as pixel density, in exchange for a wider FoV. Lower pixel density can reduce presence and potentially contribute to VR motion sickness [61]. Simultaneously increasing the FoV and VE verisimilitude meets the same challenge of limited computational resources. It is clear that hardware and software technical features need to be balanced to maximise presence while staying within the bounds of available processing power. Before beginning to understand how such technical factors impact presence, however, there is a need also to take into account factors originating from the user.

2.3 Human Factors

Earlier models of presence placed a heavy emphasis on the technical factors of VR which enabled VEs to feel more realistic, e.g. [112]. However, with significant advances in those areas, more recent research has started to investigate human factors too. More recent models acknowledge that ultimately the user determines whether presence is formed, and this also depends on how they feel within the VE [21, 39, 42, 85].

Agency, or the perception of acting within a VE, has been at times neglected in accounts of user presence. However, with advances in hardware, in particular tracking technology, agency has received more attention within the presence literature. Sanchez-Vives et al. [82] describe presence as grounded in the feeling not only of “being there” but also “doing there”. The user perceived verisimilitude of the interaction is also amongst the elements that contribute to overall perceived realism and thus user presence [106]. This is supported by Magnenat-Thalmann [58] who points out that amongst the crucial aspects that drive user presence are the presentation of the environment and the interaction that is afforded to the user within it, i.e. agency. Jang and Park [39] aimed to create a SEM model to explain user retention and adoption of HMD VR. They showed that both technical features and agency contributed to presence. However, their study did not immerse users in an actual controlled VE, but merely asked them which factors they considered important. Such reporting could be dependent on any variety of VR applications that the users had engaged with, and although informative does not allow for clear design recommendations. Moreover, the authors did not test for interactions between the design factors, nor for affective content, which can have a strong impact on presence [41, 42].

Despite this evidence that both technical factors (realism of the VE or the HMD’s FoV) and human factors (such as emotion and agency) contribute to presence, to our knowledge no previous research has investigated how they interact in the formation of presence. For example, Hvass et al. [38] immersed users in two VEs with different levels of realism, both of which afforded users agency and elicited fear. It was found that fear levels were lower in the condition presenting poorer realism. This suggests that technical factors are indeed able to moderate the intensity of felt emotion, which could in turn affect presence. However, as shown by Jicol et al.’s [42] PEA model of presence, agency also moderates the effect of fear on presence. This makes it problematic to expand the findings of Hvass et al. [38] to VEs where users do not have agency. Agency could in fact interact with technical factors in their effect on fear and perhaps presence, but such an effect has not been tested with modern hardware.

Hence, a clear understanding of whether and how technical and human factors may interact to create presence is missing. Past literature has clarified the relevance of technical characteristics such as VE realism and FoV as well as fear and agency [42]. The current study expands this knowledge by systematically investigating not only their contributions to presence in isolation but also their interactions with each other. To answer RQ1 and RQ2, we collected a new large data set and used it to create a novel SEM model including the technical factors realism and FoV, and the human factors agency and emotion. When considering emotion, our particular focus is on fear, as it holds significant importance to the VR industry and has shown intricate relationships with technical factors [38], agency and presence [42].

3 Method

To answer our research questions, we conducted an experiment with a 2 × 2 × 2 × 2 between-group design (16 conditions) that manipulated four independent variables. Emotion_VE is the emotion that was intended to be elicited by a VE, with levels Fear (F) and Happiness (H). Happiness was added as a control condition for Fear, having opposite valence. Agency_VE defines the ability of users to interact with the VE and influence it, with levels Agency (A) and Non-Agency (NA). Realism_VE describes the fidelity of the visuals presented in the VE, with levels ‘high’ and ‘low’. FoV_VE describes the width of the field of view afforded to the user, with levels ‘high’ (130°) and ‘low’ (90°). This led to a total of 16 conditions: Happiness-Agency (HA), Happiness-Non-Agency (HNA), Fear-Agency (FA) and Fear-Non-Agency (FNA), each combined with either high or low realism and either high or low FoV.
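As an illustration of the factorial structure described above, the following minimal sketch enumerates the 16 between-group conditions from the four two-level factors. The dictionary keys and the function name are hypothetical and chosen for readability; only the factor levels are taken from the text.

```python
from itertools import product

# Factor levels as named in the design description; the key names are
# illustrative and not taken from the study materials.
FACTORS = {
    "emotion": ["Fear", "Happiness"],
    "agency": ["Agency", "Non-Agency"],
    "realism": ["high", "low"],
    "fov": ["high", "low"],
}


def enumerate_conditions(factors=FACTORS):
    """Return every combination of factor levels as a list of dicts."""
    names = list(factors)
    return [dict(zip(names, levels)) for levels in product(*factors.values())]


conditions = enumerate_conditions()
print(len(conditions))  # one entry per between-group condition
```

Each participant would be assigned to exactly one of these combinations, since all four factors are varied between groups.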

3.1 Apparatus

A Valve Index HMD was used to display the VEs. This was chosen for its high FoV of 130°, which is considerably higher than that of the most popular commercial HMDs at the moment. Other HMDs such as the Pimax 8K can offer higher FoVs but present other limitations, such as limited frame rate. Moreover, past research has shown no differences in presence between 140° and 180° FoVs [53], suggesting that the Valve Index maximises the effects of FoV as far as impact on presence is concerned. The Valve Index also provides a high maximum frame rate of 144Hz, which allows us to discount frame rate as a confounding variable, and high resolution (two 1440×1600 LCD IPS Fast Switching Type Displays). Additionally, because we tested every combination of our four factors and levels of each, we used the same headset for all conditions to control for confounding variables and subtle differences between hardware, such as frame rate and pixel density. This also avoids the issue that the advertised FoV of an HMD is not necessarily perceived as such by the human eye, due to a variety of factors such as optics and positioning of the user’s eyes relative to the screen. As such, the actual perceived FoV of the Valve Index can be closer to 110°. However, this would affect both the high and low FoV conditions equally given that we used the same HMD in all conditions. The HMD was powered by a desktop computer running Windows 10 with an Intel i7-9900k processor, an RTX 2080Ti GPU and 64GB of RAM. These specifications align with recent studies using similar VR stimuli and Unity recommendations [42, 64].

3.2 Stimuli

When selecting different levels for the technical factors we focused on realistic design decisions to maximise applicability to consumer-grade VR HMDs and PCs. As we were considering the same human factors identified by the recent PEA model proposed by Jicol et al. [42], we chose to modify their VEs to suit our design. These VEs have been validated for their effects on emotion and agency, and are freely available³ (see Figure 1). Using these VEs allowed us to control the human factors we were interested in while considering changes to the VEs’ realism and FoV, allowing us to compare and contrast our findings with the previously reported PEA model.

Visual realism was adjusted as follows: for ‘high’ realism, ‘Texture Quality’ was set to ‘Full Res’ and ‘Maximum LoD Level’ was set to 0, forcing the environment to use the highest LoD level for all objects, and all textures to render at the highest quality natively available. By comparison, for ‘low’ realism, textures and shadows of a lower resolution were used, and polygon counts of 3D models were reduced compared to Jicol et al.’s original VEs. To achieve this, the quality settings within Unity were changed as follows: ‘Texture Quality’ was reduced from ‘Full Res’ to ‘Eighth Res’, decreasing the resolution of all textures to one-eighth; and the ‘Maximum LoD Level’ was increased from 0 to 3, forcing all objects to use the lowest level-of-detail (LoD) regardless of distance from the camera. All assets have several LoD profiles, ranging from the full quality high LoD object (denoted by LoD level 0) to a low-quality low LoD object (denoted by LoD level 3). These were either designed by the original asset creator or created within Unity. The ‘high’ FoV was 130°, which is the maximum afforded by the Valve Index. In order to maximise the study’s relevance to current popular HMD technology, the low FoV was set to 90°. This FoV is particularly relevant to investigate as it is the maximum FoV afforded by the Oculus Quest 2, a very popular HMD at the moment. To create the low FoV condition, a shader in the shape of a vertical rectangle was applied to the camera, blocking the user’s FoV outside of the rectangular area. This shape was created by altering the original circular 3D model using Blender’s 3D modelling tools. This was done as it more accurately represents the Oculus Quest 2’s FoV, while also ensuring that participants were not exposed to a higher risk of motion sickness from a restricted vertical FoV [115].
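The manipulations above can be summarised as a small lookup, sketched here under stated assumptions. The setting names and values for realism and FoV are those quoted in the text, but the dictionary layout and both function names are hypothetical. The second function is a simple geometric illustration, not the procedure used to build the mask: it assumes an idealised point eye and treats the stated angle as the horizontal angular extent, giving the half-width of an opening placed at a given distance in front of the eye.

```python
import math

# Values quoted in the text; the dictionary layout itself is hypothetical.
REALISM_SETTINGS = {
    "high": {"texture_quality": "Full Res", "maximum_lod_level": 0},
    "low": {"texture_quality": "Eighth Res", "maximum_lod_level": 3},
}
FOV_DEGREES = {"high": 130.0, "low": 90.0}


def stimulus_settings(realism, fov):
    """Combine the realism and FoV levels of one condition into one dict."""
    settings = dict(REALISM_SETTINGS[realism])  # copy so the table is not mutated
    settings["fov_degrees"] = FOV_DEGREES[fov]
    return settings


def aperture_half_width(fov_degrees, distance):
    """Half-width of an opening at `distance` that subtends `fov_degrees`.

    Assumes an idealised point eye on the axis of the opening.
    """
    return distance * math.tan(math.radians(fov_degrees) / 2.0)


print(stimulus_settings("low", "low"))
print(aperture_half_width(90.0, 1.0), aperture_half_width(130.0, 1.0))
```

Under these assumptions, an opening for a 90° FoV is as wide on each side as its distance from the eye, and noticeably narrower than one that would subtend 130°.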

The VEs were designed by Jicol et al. [42] based on mood induction research [25, 111]. Two of the four VEs induce fear (one with agency and one without agency), while the other two happiness (again one with agency and one without agency). Participants were afforded six degrees of freedom, thus being able to look around or tilt. Users were instructed to remain roughly in the same position during the virtual experience so that they would not hit the VE boundaries and break presence. No virtual body of the user was implemented within the VEs to avoid confounding variables due to perceptions of body ownership [99]. The duration was also unchanged from Jicol et al. [42] as this duration had been suggested as optimal for presence before [117].

3.2.1 Visuals.

The location was not altered from the original VEs [42], nor were significant elements such as trees or buildings. No changes were made to the shape of the terrain either. The VEs designed to induce happiness consisted of a park environment during a sunny day. A dog was present which walked in a scripted pattern of movement and performed semi-random actions, e.g. sniffing, playing, jumping in the air. In the conditions with no agency, users could only observe the VE and the dog. The pattern of movement of the dog was designed so that it would approach the user a total of four times, each time performing an action. In the conditions where users were afforded agency they could direct a virtual laser pointer by moving a tracked VR hand controller. When the user flashed the pointer in front of the dog on the ground, the dog would respond by performing one of the actions, or by following the pointer or jumping to catch it.

As for the fear-inducing VEs, no fundamental aspects were changed from the originals [42]. These were designed to mimic the happy VEs as closely as possible, while changing the stimuli so as to elicit fear. The same park VE was used but the sky was changed to a night one and the lighting was dimmed so as to create a night-time feel. The dog in the happy VEs was replaced by a threatening wolf with dark fur and red eyes that were meant to be menacing.

The wolf was scripted to mimic the patterns of movement of the dog, which were also timed identically. When approaching the user, the wolf jumped and attacked them, before retreating. Users could not act on this in the non-agency conditions, whereas in the agency conditions they had a flashlight that made the wolf retreat when pointed at it. To ensure that users would have an incentive to use their afforded agency, the wolf attacked unless fended off with the light and the happy dog would distance itself from the user unless interacted with.

The VR experience lasted a total of three minutes. This duration was based on previous research showing that this is the threshold at which presence can be formed, while avoiding the onset of boredom [117], and is the same as used in Jicol et al. [42].

3.2.2 Audio.

The audio was not changed from the original VEs by Jicol et al. [42], since it had been shown to be effective in eliciting the desired emotions. The same royalty-free music was used: the track “Happy Sandbox” [63] in the Happiness conditions and “Dark Ambient Music 3” [116] in the Fear conditions.

3.3 Measures

In order to facilitate comparability, we used the same questionnaires for assessing emotions, Agency and Presence as Jicol et al. [42], based on rating scales ranging from 1 (lowest) to 10 (highest). Emotions were measured using two items (“For the following emotions (happiness, fear), how intensely did you feel them during the VR experience?”), administered immediately after completing the VR experience. We will refer to reported levels of felt happiness and fear as Happiness and Fear, respectively. Moreover, Intensity in each VE will refer to the reported intensity of the specific emotion a VE is meant to elicit: it will be the reported Happiness scores in the VEs that were designed to induce happiness and the Fear scores in the VEs designed to induce fear. Perceived agency was measured using three items which were based on the “User Experience in Immersive Virtual Environments” questionnaire developed by Tcha-Tokey et al. [101].
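The definition of Intensity above amounts to selecting one of the two emotion ratings depending on the VE. A minimal sketch follows; the function and argument names are hypothetical, and the range check simply reflects the 1 to 10 rating scale described in the text.

```python
def intensity(emotion_ve, happiness, fear):
    """Return the rating of the emotion that the VE was designed to elicit.

    emotion_ve: "Happiness" or "Fear" (the level of Emotion_VE).
    happiness, fear: the participant's ratings on the 1-10 scale.
    """
    for name, value in (("happiness", happiness), ("fear", fear)):
        if not 1 <= value <= 10:
            raise ValueError(f"{name} rating {value} is outside the 1-10 scale")
    if emotion_ve == "Happiness":
        return happiness
    if emotion_ve == "Fear":
        return fear
    raise ValueError(f"unknown Emotion_VE level: {emotion_ve!r}")


# Illustrative ratings only, not data from the study.
print(intensity("Fear", happiness=3, fear=8))
```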

We measured Presence with Tcha-Tokey et al.’s version [101] of the revised Presence Questionnaire (PQ) [113] initially developed by Witmer and Singer [114]. It should be noted that Witmer and Singer defined presence in a VE as not just the feeling of ‘being there’, but also as experiencing the VE as a real environment. This has led to several items in their questionnaire (and thus also in Tcha-Tokey et al.’s revised version) being phrased around how realistic the perception of the VE is (e.g. “I could examine objects from multiple viewpoints.” and “The visual display quality distracted me from performing assigned tasks.”). This is different from other common presence measures such as the iGroup Presence Questionnaire (IPQ) [84], which solely focuses on the concept of ‘being there’. The PQ has been widely used for measuring presence in VR [31, 34] and has been correlated with other accepted measures such as the Slater–Usoh–Steed (SUS) [96] and the Multimodal Presence Scale (MPS) [59]. The revised version by Tcha-Tokey et al. is almost identical in its operationalisation of presence, but separates engagement into a separate measure.

3.4 Procedure

Participants were greeted by the experimenter and presented with an information sheet describing the study, before being asked for their consent to take part. They then completed a pre-task demographics questionnaire and were given a description of the type of VE they would experience, including whether or not they would have agency and whether the VE was designed to induce fear or happiness. This briefing was especially relevant for the fear condition as, in addition to the VE itself, fear can be elicited through a conceptual pathway by being told that there will be something scary [25, 36]. This may introduce a potentially uneven effect of priming on happiness and fear; however, we opted for this method to create expectations about emotions and potentially further increase their intensity [85]. As such, we did not aim for similar intensities of emotion, and our later SEM analysis takes into account that intensity of emotions may not be felt equally. Participants were also told they would either have agency or not for purely instructional purposes. Those in the agency conditions were told about the interactive method they had at their disposal and the others were just told that they would observe the VE. We opted for such a text-based description so as to standardise the prior information given to each participant.

Participants were not told about manipulations in realism or FoV, and were not aware of conditions other than their own. Participants were assisted to put on the Valve Index HMD. They were then presented with a blank scene with text of different sizes and underwent a calibration phase during which the HMD was adjusted until they were able to read the smallest text presented. If assigned to one of the agency conditions, participants were shown how to use the hand controller to direct the laser pointer or the flashlight respectively.

At the end of the VR experience, a screen within the VE prompted participants to remove their HMD. Next, the emotion, presence and agency questionnaires were filled in on a normal screen in this order. While within-VR measures of presence have their merits, assessing presence with a traditional questionnaire after the VR experience has been shown to not affect obtained scores [74]. An open-ended question was also administered which was answered in text: “What elements/characteristics of the virtual environment made you feel present?”. The entire session lasted approximately 20-25 minutes. At the end, a debrief was given to participants and they were paid for their time.

3.5 Hypotheses

For clarity, we divide our hypotheses into sets that each focus on a particular aspect of our investigations. Each set of hypotheses provides a building block for the novel TAP-Fear model, which is assembled at the end. Most of our hypotheses are confirmatory, i.e. they are unambiguous a priori predictions that are based on the findings of previous work. Some of our hypotheses are speculative in the sense that they consider multiple possibilities, without a specific a priori prediction, as related work has not yet explored how technical and human factors work together. As discussed later, we test the speculative hypotheses using post hoc correction, which ensures that confirmatory and speculative results have similar validity.

Figure 2:

3.5.1 Verifying and Extending the PEA Model.

Our first set of a priori hypotheses is primarily based on the work by Jicol et al. [42], predicting that the relationships between Presence, Emotion \(_\text{VE}\) and Agency \(_\text{VE}\) described by the PEA model (Figure 2) remain valid:

H1

Presence will be higher in VEs designed to induce fear compared to happiness (Emotion \(_\text{VE}\) ).

H2

Presence will be higher in VEs that afford agency (Agency \(_\text{VE}\) ).

H3

Agency \(_\text{VE}\) will moderate the effect of Emotion \(_\text{VE}\) on Presence (interaction Emotion \(_\text{VE} \times \text{Agency}_\text{VE}\) ).

Furthermore, we add hypotheses describing the straightforward immersion-enhancing effects of increased realism [12, 118] and FoV [11, 20, 43, 87] on presence that were reported in related work:

H4

Presence will be higher in VEs with higher realism (Realism \(_\text{VE}\) ).

H5

Presence will be higher in VEs with higher FoV (FoV \(_\text{VE}\) ).

The PEA model provided strong evidence of the moderating role of the design variables Emotion \(_\text{VE}\) and Agency \(_\text{VE}\) . Therefore we suspect that there is a similar moderation of the effects on Presence of Realism \(_\text{VE}\) and FoV \(_\text{VE}\) . In other words, we suspect that realism and FoV work jointly with emotion and agency to affect presence. We do not know exactly which variables are involved in such moderation, as this has never been studied before. Therefore we pose a set of hypotheses about possible moderations based on all such two-way interactions. Note that mathematically, it makes no difference in which order the variables in a moderation are specified, i.e. to say that “X moderates Y” is the same as saying “Y moderates X”:

H6A

Realism \(_\text{VE}\) will moderate the effect of Emotion \(_\text{VE}\) on Presence (interaction Realism \(_\text{VE} \times \text{Emotion}_\text{VE}\) ).

H6B

FoV \(_\text{VE}\) will moderate the effect of Emotion \(_\text{VE}\) on Presence (interaction FoV \(_\text{VE} \times \text{Emotion}_\text{VE}\) ).

H6C

Realism \(_\text{VE}\) will moderate the effect of Agency \(_\text{VE}\) on Presence (interaction Realism \(_\text{VE} \times \text{Agency}_\text{VE}\) ).

H6D

FoV \(_\text{VE}\) will moderate the effect of Agency \(_\text{VE}\) on Presence (interaction FoV \(_\text{VE} \times \text{Agency}_\text{VE}\) ).

3.5.2 Predicting Presence in the TAP-Fear Model.

The PEA model describes how the design variables Emotion \(_\text{VE}\) and Agency \(_\text{VE}\) influence perceived emotional Intensity and perceived Agency, which are in turn used to predict Presence. While the PEA model poses that emotional Intensity is a good measure to describe the effects of emotion on Presence, Jicol et al.’s results suggest that Fear may be a more appropriate variable [42]. In the PEA model (Figure 2) Presence is mainly formed when a VE affords agency (Agency \(_\text{VE}\) = 1), as can be seen by the large standardised coefficient β = 0.69 of the effect of Agency on Presence. Now let us compare the effects of an agency-affording, fear-inducing VE (Emotion \(_\text{VE}\) = −1) with those of an agency-affording, happiness-inducing VE (Emotion \(_\text{VE}\) = 1) by summing up the effects of the paths that include Emotion \(_\text{VE}\) . What we find is that the fear-inducing VE has a standardised effect on presence of (− 0.51 + 0.3) × 0.2 + (0.37 + 0.42) × 0.69 ≈ 0.50, whereas the happiness-inducing VE only has an effect of (0.51 − 0.3) × 0.2 + (− 0.37 + 0.42) × 0.69 ≈ 0.08. The large difference is due to the strong indirect effect of fear on Presence through perceived Agency (which can be loosely interpreted as ‘fear lends wings’). The special role of fear in the formation of presence, as well as its intricate relationship with other factors, has been repeatedly supported by VR literature [21, 35, 38, 72, 76] and is backed up by its strong evolutionary importance [65]. It has even been suggested that fear and presence may be mutually dependent in some VEs [71].
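The two sums above can be re-evaluated with a short script. This is a minimal sketch that only reproduces the arithmetic quoted in the text for an agency-affording VE; the function name and the grouping of the coefficients into two paths are illustrative assumptions, as the assignment of each coefficient to a specific arrow is defined by Figure 2 and not by this calculation.

```python
def summed_effect(emotion_ve: int) -> float:
    """Sum the standardised path effects quoted in the text.

    emotion_ve uses the PEA encoding: -1 for fear, +1 for happiness.
    Agency_VE is fixed at 1 (agency-affording VE), as in the text.
    """
    # Terms feeding into the path with weight 0.2 (assumed: via Intensity).
    via_first_path = (0.51 * emotion_ve - 0.30 * emotion_ve) * 0.20
    # Terms feeding into the path with weight 0.69 (via perceived Agency).
    via_agency = (-0.37 * emotion_ve + 0.42) * 0.69
    return via_first_path + via_agency


fear_effect = summed_effect(-1)
happiness_effect = summed_effect(+1)
print(round(fear_effect, 2), round(happiness_effect, 2))
```

Running the sketch gives values that round to the 0.50 and 0.08 stated above.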

As a result, we shift the focus of our analysis to considering perceived Fear as the main emotional variable, laying the foundation for the TAP-Fear model:

H7

The intensity of Fear is a positive linear predictor of Presence.

We furthermore propose that perceived agency is still an important predictor of presence, as described in the PEA model:

H8

Perceived Agency is a positive linear predictor of Presence.

Similar to our predictions in H6A-H6D of the moderating role of Realism \(_\text{VE}\) and FoV \(_\text{VE}\) on the effects of the design variables Emotion \(_\text{VE}\) and Agency \(_\text{VE}\) on Presence, the PEA model leads us to predict that Realism \(_\text{VE}\) and FoV \(_\text{VE}\) also moderate the effects of perceived Fear and Agency on Presence. In other words, we expect that feelings of fear and agency work jointly with realism and FoV to form presence. Similar to the PEA model, we encode Realism \(_\text{VE}\) and FoV \(_\text{VE}\) as dummy variables with values 1 for ‘high’ and 0 for ‘low’ visual realism or FoV respectively. Again, we do not know exactly which variables are involved in such moderation, so we propose a set of hypotheses about possible moderations, similar to H6A-H6D:

H9A

Realism \(_\text{VE}\) will moderate the effect of Fear on Presence (interaction Realism \(_\text{VE} \times \text{Fear}\) ).

H9B

FoV \(_\text{VE}\) will moderate the effect of Fear on Presence (interaction FoV \(_\text{VE} \times \text{Fear}\) ).

H9C

Realism \(_\text{VE}\) will moderate the effect of Agency on Presence (interaction Realism \(_\text{VE} \times \text{Agency}\) ).

H9D

FoV \(_\text{VE}\) will moderate the effect of Agency on Presence (interaction FoV \(_\text{VE} \times \text{Agency}\) ).

3.5.3 Predicting Fear in the TAP-Fear Model.

As shown in Figure 2, the PEA model predicts perceived emotional Intensity and Agency based on the design variables Emotion \(_\text{VE}\) and Agency \(_\text{VE}\) , i.e. based on design decisions of what emotion the VE is intended to elicit and whether the user is afforded agency. We want to make similar predictions in the TAP-Fear model, therefore we propose that Fear can be predicted in a manner similar to emotional Intensity. This leads us to the following hypotheses:

H10

Fear will be higher in VEs designed to induce fear compared to happiness (Emotion \(_\text{VE}\) ).

H11

Agency \(_\text{VE}\) will moderate the effect of Emotion \(_\text{VE}\) on Fear (interaction Emotion \(_\text{VE} \times \text{Agency}_\text{VE}\) ).

Finally, we suspect that Realism \(_\text{VE}\) and FoV \(_\text{VE}\) may influence Fear and Agency. For example, users may feel more fear when exposed to a more realistic fear-inducing stimulus that is presented within a larger FoV. Similarly, looking at the PEA model, it seems plausible that Realism \(_\text{VE}\) and FoV \(_\text{VE}\) may moderate the effects of Emotion \(_\text{VE}\) and Agency \(_\text{VE}\) on Fear and Agency respectively. However, this has never been studied before. We therefore again propose a set of speculative hypotheses about possible effects, first for Fear:

H12A

Realism \(_\text{VE}\) affects Fear (Realism \(_\text{VE}\) ).

H12B

FoV \(_\text{VE}\) affects Fear (FoV \(_\text{VE}\) ).

H12C

Realism \(_\text{VE}\) will moderate the effect of Emotion \(_\text{VE}\) on Fear (interaction Realism \(_\text{VE} \times \text{Emotion}_\text{VE}\) ).

H12D

FoV \(_\text{VE}\) will moderate the effect of Emotion \(_\text{VE}\) on Fear (interaction FoV \(_\text{VE} \times \text{Emotion}_\text{VE}\) ).

3.5.4 Predicting Agency in the TAP-Fear Model.

We further hypothesise that the prediction of Agency based on the design variables Emotion \(_\text{VE}\) and Agency \(_\text{VE}\) in the PEA model shown in Figure 2 remains valid:

H13

Agency will be higher in VEs designed to afford agency (Agency \(_\text{VE}\) ).

H14

Agency \(_\text{VE}\) will moderate the effect of Emotion \(_\text{VE}\) on Agency (interaction Emotion \(_\text{VE} \times \text{Agency}_\text{VE}\) ).

And we propose a set of speculative hypotheses for Agency, similar to H12A-D:

H15A

Realism \(_\text{VE}\) affects Agency (Realism \(_\text{VE}\) ).

H15B

FoV \(_\text{VE}\) affects Agency (FoV \(_\text{VE}\) ).

H15C

Realism \(_\text{VE}\) will moderate the effect of Agency \(_\text{VE}\) on Agency (interaction Realism \(_\text{VE} \times \text{Agency}_\text{VE}\) ).

H15D

FoV \(_\text{VE}\) will moderate the effect of Agency \(_\text{VE}\) on Agency (interaction FoV \(_\text{VE} \times \text{Agency}_\text{VE}\) ).

3.6 Participants

A total of 360 participants (130 males, 230 females) were recruited amongst university students and staff members. Recruitment was done via online posts on the university noticeboard, posters and word of mouth. Participants were randomly assigned across the 16 conditions. Participants’ age ranged from 16 to 60, and had a mean of 23.847 and standard deviation of 10.09 years. We ensured that all participants had normal or corrected to normal vision and normal hearing. Due to the intensity of emotions elicited by the VEs all participants were also screened for neurological diseases, use of medication, psychological or emotional issues, epilepsy or use of medical devices before taking part in the study. However, no participant was excluded as all passed the screening questionnaires. Participants’ level of VR experience was assessed via a single item ("Please rate the amount of experience you have with virtual reality"), which was rated from "not at all" to "a great deal", on a scale from 1 to 10. Additionally, we ensured no participant suffered from cynophobia (fear of dogs), which was assessed through the same three items as used by Jicol et al. [42]. All participants were paid £5 in cash for their time. The study received ethical approval from the Department of Psychology Ethics Committee at the University of Bath (Ethics code: 21-233).

To ensure that we had sufficient statistical power, we conducted an a priori power analysis to calculate the necessary sample size per participant group. This was done for a between-factors ANOVA with main factors and interactions, through the widely used G*Power software (version 3.1) [24]. To estimate the sample size we used a partial eta-squared \(\eta _p^2\) of 0.06 (for a medium effect size), with a level of power of 0.80 for 16 groups, 1 numerator df (degree of freedom; for main factor 4 − 1 = 3, for interaction (4 − 1) × (4 − 1) = 9), and an α-level of 0.05. The analysis parameters were chosen so as to be similar to those of Jicol et al. [42]. The G*Power analysis indicated a minimum necessary overall sample size of 254, or approximately 16 participants per condition.
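For readers who wish to retrace the effect-size input, the following sketch converts the stated \(\eta _p^2\) into Cohen's f, which is the effect-size measure G*Power expects for ANOVA designs, using the standard relation f = sqrt(η² / (1 − η²)). It also divides the reported minimum sample size by the number of groups. The function and variable names are illustrative, and the sketch does not reproduce the power computation itself.

```python
import math


def cohens_f_from_eta_squared(eta_sq: float) -> float:
    """Convert (partial) eta-squared to Cohen's f."""
    return math.sqrt(eta_sq / (1.0 - eta_sq))


effect_size_f = cohens_f_from_eta_squared(0.06)  # eta-squared from the text
participants_per_condition = 254 / 16            # minimum N / number of groups
print(round(effect_size_f, 3), round(participants_per_condition, 1))
```

The resulting f of roughly 0.25 corresponds to the conventional threshold for a medium effect, and 254 participants over 16 groups amounts to just under 16 per condition.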

3.7 Statistical Methodology

Statistical analyses were performed using JASP 0.16.4.0 [56]. First we ensured the data satisfied the assumptions of ANOVA, using Levene’s test to check equality of variances and Q-Q plots to verify that distributions were close enough to normal. Then we conducted four-way ANOVAs using factors Emotion \(_\text{VE}\) , Agency \(_\text{VE}\) , Realism \(_\text{VE}\) and FoV \(_\text{VE}\) to test H1-H6 and H10-H15. For directed hypotheses, one-tailed tests were used, and otherwise two-tailed tests. Irrespective of our hypotheses, we always report all effects tested by an ANOVA to provide a complete picture. In tables, we report Bonferroni-Holm corrected p-values p \(_\text{BH}\) for speculative sets of hypotheses, highlight significant hypotheses and their p-values, and report effect sizes (η² for ANOVAs and standardised coefficients for regressions) for all effects. The error bars in the graphs show the 95% confidence intervals of the means.

Power analyses using G*Power 3.1 show that the ANOVAs were able to detect medium main and interaction effects (Cohen’s d = 0.5) at α = .05 with a power of 0.999. In other words, thanks to our sample size it is very likely that we will detect any effects that are of at least medium size, which is a common threshold for such analyses. For the four sets of speculative hypotheses H6A-D, H9A-D, H12A-D and H15A-D, we used the robust procedure discussed in [17]. We performed four related comparisons each time (A-D); so in order to control the Type 1 error rate α, we applied the Bonferroni-Holm post hoc correction. This correction ensures α = .05 despite the multiple comparisons by increasing p-values, but it also slightly decreases the power of some of the comparisons to 0.998. This is still very high, given that a power of 0.80 or higher is generally considered good for a study. Thus, our sample size enabled us to effectively control both Type 1 and Type 2 errors even when testing multiple speculative hypotheses.

H7-H9 involve continuous predictor variables, therefore we applied linear regressions to test them, using the same robust procedure as described above for H9A-D to test multiple regression coefficients. In order to assemble our findings in a cohesive model, we constructed a structural equation model (SEM) as described in [68], using the SEM maximum likelihood estimator provided by the R package lavaan [80]. We then used accepted measures such as the root mean squared error of approximation (RMSEA) and the Comparative Fit Index (CFI) to evaluate model fit as discussed in [83].
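To make the two fit measures concrete, the sketch below implements their common textbook definitions from a model's chi-square statistic. It is not the lavaan implementation: software packages differ in details (for example, whether N or N − 1 is used in the RMSEA denominator), and all input values in the example call are hypothetical numbers chosen purely for illustration.

```python
import math


def rmsea(chi2: float, df: int, n: int) -> float:
    """Root mean squared error of approximation (textbook form, using N - 1)."""
    return math.sqrt(max(chi2 - df, 0.0) / (df * (n - 1)))


def cfi(chi2_model: float, df_model: int,
        chi2_baseline: float, df_baseline: int) -> float:
    """Comparative Fit Index relative to a baseline (independence) model."""
    d_model = max(chi2_model - df_model, 0.0)
    d_baseline = max(chi2_baseline - df_baseline, d_model)
    if d_baseline == 0.0:
        return 1.0  # neither model shows misfit beyond its degrees of freedom
    return 1.0 - d_model / d_baseline


# Hypothetical inputs, NOT values from this study.
print(rmsea(chi2=30.0, df=10, n=360))
print(cfi(chi2_model=30.0, df_model=10, chi2_baseline=500.0, df_baseline=15))
```

Both measures depend on how far the chi-square statistic exceeds its degrees of freedom, which is why they reward parsimonious models: lower RMSEA and higher CFI indicate better fit.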

4 Results

We first compared the level of user VR experience between the 16 conditions, so as to avoid confounding variables. A one-way ANOVA indicated that there was no such difference between the 16 participant groups (F(15, 342) = 0.927, p = .535). It was also confirmed via a Pearson correlation that participants’ VR experience did not correlate with Presence (r(359) = .038, p = .468).

Next, it was verified whether the VEs were successful in eliciting the intended emotions. This was confirmed because reported Happiness was significantly higher across the VEs designed to induce happiness, when compared to the fear VEs (t(360) = −8.155, p < .001\(^{**}\), d = −.857). The opposite pattern was found for Fear which was higher in the fear VEs (t(360) = 13.611, p < .001\(^{**}\), d = 1.431).

It was also tested whether the dominant emotion in each condition was the intended one, in that users reported feeling the emotion a given VE was intended to elicit as most intense. Here paired-samples t-tests indicated that participants in the happiness-inducing VEs felt more Happiness (M = 6.836, SD = 1.84) than Fear (M = 2.63, SD = 2.00), (t(182) = −18.200, p < .001\(^{**}\)). The reverse pattern was observed in the fear-inducing VEs, where Fear (M = 5.68, SD = 2.25) was indeed felt more than Happiness (M = 5.10, SD = 2.20), (t(178) = 2.035, p = .043\(^{*}\)).
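The effect sizes reported for the independent-samples comparisons are consistent with the usual conversion from a t statistic to Cohen's d, d = t · sqrt(1/n₁ + 1/n₂). The sketch below applies this conversion under the assumption that the happiness and fear groups comprised 183 and 179 participants respectively; these group sizes are inferred from the degrees of freedom of the paired-samples tests (182 and 178) and are not stated explicitly in the text.

```python
import math


def cohens_d_from_t(t: float, n1: int, n2: int) -> float:
    """Cohen's d for an independent-samples t-test with group sizes n1 and n2."""
    return t * math.sqrt(1.0 / n1 + 1.0 / n2)


# Group sizes are an assumption inferred from the paired-test df (see lead-in).
N_HAPPY, N_FEAR = 183, 179

d_happiness = cohens_d_from_t(-8.155, N_HAPPY, N_FEAR)
d_fear = cohens_d_from_t(13.611, N_HAPPY, N_FEAR)
print(round(d_happiness, 3), round(d_fear, 3))
```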

Felt Agency was also compared between the agency and non-agency VEs via an independent-samples t-test, which confirmed that agency-affording VEs (M = 6.88, SD = 1.97) led to significantly higher Agency than non-agency VEs (M = 6.08, SD = 2.14), (t(360) = −3.697, p < .001\(^{**}\)). Overall, these results confirm that the VEs were successful in eliciting the desired emotions and feeling of agency.

4.1 Verifying and Extending the PEA Model

Table 1:

Hypothesis	Effect	df	F	p	\(p_\text{BH}\)	η²
H1	Emotion \(_\text{VE}\)	1	47.323	< .001		0.108
H2	Agency \(_\text{VE}\)	1	8.286	.002		0.019
H3	Emotion \(_\text{VE}\) × Agency \(_\text{VE}\)	1	11.461	< .001		0.026
H4	Realism \(_\text{VE}\)	1	0.789	.188		0.002
H5	FoV \(_\text{VE}\)	1	1.380	.121		0.003
H6A	Realism \(_\text{VE}\) × Emotion \(_\text{VE}\)	1	0.016	.898	.898	< 0.001
H6B	FoV \(_\text{VE}\) × Emotion \(_\text{VE}\)	1	2.902	.089	.267	0.007
H6C	Realism \(_\text{VE}\) × Agency \(_\text{VE}\)	1	1.049	.306	.612	0.002
H6D	FoV \(_\text{VE}\) × Agency \(_\text{VE}\)	1	8.075	.005	.020	0.018
	Realism \(_\text{VE}\) × FoV \(_\text{VE}\)	1	0.893	0.345		0.002
	Emotion \(_\text{VE}\) × Agency \(_\text{VE}\) × Realism \(_\text{VE}\)	1	1.913	0.167		0.004
	Emotion \(_\text{VE}\) × Agency \(_\text{VE}\) × FoV \(_\text{VE}\)	1	1.189	0.276		0.003
	Emotion \(_\text{VE}\) × Realism \(_\text{VE}\) × FoV \(_\text{VE}\)	1	0.229	0.632		< 0.001
	Agency \(_\text{VE}\) × Realism \(_\text{VE}\) × FoV \(_\text{VE}\)	1	2.935	0.088		0.007
	Emotion \(_\text{VE}\) × Agency \(_\text{VE}\) × Realism \(_\text{VE}\) × FoV \(_\text{VE}\)	1	2.098	0.148		0.005
	Residuals	346

Table 1: ANOVA of Emotion \(_\text{VE}\) , Agency \(_\text{VE}\) , Realism \(_\text{VE}\) and FoV \(_\text{VE}\) on Presence.

Table 1 shows the results of the four-way ANOVA of Emotion \(_\text{VE}\) , Agency \(_\text{VE}\) , Realism \(_\text{VE}\) and FoV \(_\text{VE}\) on Presence. The main effects of Emotion \(_\text{VE}\) and Agency \(_\text{VE}\) are both significant, and so is their interaction; therefore we accept H1-H3. Neither the main effect of Realism \(_\text{VE}\) nor that of FoV \(_\text{VE}\) is significant; therefore we reject H4&5. Of the set of speculative hypotheses investigating the interactions of Realism \(_\text{VE}\) and FoV \(_\text{VE}\) with Emotion \(_\text{VE}\) and Agency \(_\text{VE}\) , which signify moderation effects on Presence, only Agency \(_\text{VE}\) × FoV \(_\text{VE}\) is significant; therefore we reject H6A-C and accept H6D.
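The \(p_\text{BH}\) column for H6A-D can be retraced from the uncorrected p-values with the Bonferroni-Holm step-down procedure. The following is a minimal sketch of that procedure in its standard form; the function name is illustrative and the code is not taken from the statistical software used for the analysis.

```python
def holm_adjust(p_values):
    """Return Bonferroni-Holm adjusted p-values in the original order."""
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_max = 0.0
    for rank, idx in enumerate(order):
        # The smallest p is multiplied by m, the next by m - 1, and so on.
        candidate = min(1.0, (m - rank) * p_values[idx])
        # Enforce monotonicity: adjusted p-values never decrease with rank.
        running_max = max(running_max, candidate)
        adjusted[idx] = running_max
    return adjusted


# Uncorrected p-values for H6A, H6B, H6C, H6D as reported in Table 1.
h6_adjusted = holm_adjust([0.898, 0.089, 0.306, 0.005])
print([round(p, 3) for p in h6_adjusted])
```

Applied to the four p-values of H6A-D, the procedure yields .898, .267, .612 and .020, matching the table; only H6D remains below α = .05.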

4.2 Predicting Presence in the TAP-Fear Model

Table 2:

Hypothesis	Predictor	Unstandardized	Standard Error	Standardized	t	p	\(p_\text{BH}\)
	(Intercept)	4.669	0.302		15.447	< .001
H7	Fear	0.107	0.032	0.232	3.367	< .001
H8	Agency	0.313	0.041	0.545	7.704	< .001
H4	Realism \(_\text{VE}\)	0.271	0.336	0.113	0.805	0.211
H5	FoV \(_\text{VE}\)	− 0.444	0.334	− 0.185	− 1.329	0.093
H9A	Realism \(_\text{VE}\) × Fear	− 0.016	0.037	− 0.039	− 0.436	0.663	1.00
H9B	FoV \(_\text{VE}\) × Fear	− 0.053	0.037	− 0.127	− 1.458	0.146	.438
H9C	Realism \(_\text{VE}\) × Agency	− 0.015	0.046	− 0.044	− 0.324	0.746	1.00
H9D	FoV \(_\text{VE}\) × Agency	0.121	0.045	0.360	2.673	0.008	.032

Table 2: Coefficients of the linear regression on Presence.

Table 2 shows the results of the linear regression analysis on Presence testing H7-H9. Perceived Fear and Agency are significant positive predictors of Presence, therefore we accept H7&8. By comparison, Intensity (r(358) = 0.027, p = .614) and Happiness (r(358) = −0.067, p = .202) do not correlate significantly with Presence. In line with our ANOVA results, Realism \(_\text{VE}\) and FoV \(_\text{VE}\) do not significantly predict Presence, which provides further support for rejecting H4&5. Of the set of speculative hypotheses investigating the interactions of Realism \(_\text{VE}\) and FoV \(_\text{VE}\) with perceived Fear and Agency, which signify moderation effects on Presence, only FoV \(_\text{VE}\) × Agency is significant; therefore we reject H9A-C and accept H9D. This is in line with the rejection of H6A-C and the acceptance of H6D, which consider similar interactions with design variables Emotion \(_\text{VE}\) and Agency \(_\text{VE}\) rather than perceived Fear and Agency.
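To illustrate what the moderation by FoV \(_\text{VE}\) means in practice, the sketch below evaluates the regression equation using the unstandardised coefficients from Table 2. It assumes that the model is an ordinary linear regression with exactly the listed main and interaction terms and that the predictors enter on their raw scales; the dictionary keys and function names are illustrative.

```python
# Unstandardised coefficients copied from Table 2.
COEF = {
    "intercept": 4.669,
    "fear": 0.107,
    "agency": 0.313,
    "realism": 0.271,
    "fov": -0.444,
    "realism_x_fear": -0.016,
    "fov_x_fear": -0.053,
    "realism_x_agency": -0.015,
    "fov_x_agency": 0.121,
}


def predict_presence(fear: float, agency: float, realism: int, fov: int) -> float:
    """Predicted Presence; realism and fov are dummies (1 = high, 0 = low)."""
    return (COEF["intercept"]
            + COEF["fear"] * fear
            + COEF["agency"] * agency
            + COEF["realism"] * realism
            + COEF["fov"] * fov
            + COEF["realism_x_fear"] * realism * fear
            + COEF["fov_x_fear"] * fov * fear
            + COEF["realism_x_agency"] * realism * agency
            + COEF["fov_x_agency"] * fov * agency)


def agency_slope(realism: int, fov: int) -> float:
    """Change in predicted Presence per unit of perceived Agency."""
    return (COEF["agency"]
            + COEF["realism_x_agency"] * realism
            + COEF["fov_x_agency"] * fov)


print(round(agency_slope(realism=0, fov=0), 3))  # low FoV
print(round(agency_slope(realism=0, fov=1), 3))  # high FoV
```

Under these assumptions, each additional unit of perceived Agency is associated with a larger increase in predicted Presence at high FoV (0.313 + 0.121 = 0.434) than at low FoV (0.313), which is the pattern that H9D captures.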

4.3 Predicting Fear in the TAP-Fear Model

Table 3:

Hypothesis	Effect	df	F	p	p \(_\text{BH}\)	η²
H10	Emotion \(_\text{VE}\)	1	191.979	< .001		0.338
	Agency \(_\text{VE}\)	1	4.212	0.041		0.007
H11	Emotion \(_\text{VE}\) × Agency \(_\text{VE}\)	1	0.454	0.501		< 0.001
H12A	Realism \(_\text{VE}\)	1	4.090	0.044	0.132	0.007
H12B	FoV \(_\text{VE}\)	1	2.179	0.141	0.282	0.004
H12C	Realism \(_\text{VE}\) × Emotion \(_\text{VE}\)	1	6.413	0.012	0.048	0.011
H12D	FoV \(_\text{VE}\) × Emotion \(_\text{VE}\)	1	0.005	0.942	0.942	< 0.001
	Agency \(_\text{VE}\) × FoV \(_\text{VE}\)	1	1.098	0.295		0.002
	Agency \(_\text{VE}\) × Emotion \(_\text{VE}\) × FoV \(_\text{VE}\)	1	0.438	0.509		< 0.001
	Agency \(_\text{VE}\) × Realism \(_\text{VE}\)	1	0.943	0.332		0.002
	Agency \(_\text{VE}\) × Emotion \(_\text{VE}\) × Realism \(_\text{VE}\)	1	6.244	0.013		0.011
	FoV \(_\text{VE}\) × Realism \(_\text{VE}\)	1	1.377	0.241		0.002
	Agency \(_\text{VE}\) × FoV \(_\text{VE}\) × Realism \(_\text{VE}\)	1	0.327	0.568		< 0.001
	Emotion \(_\text{VE}\) × FoV \(_\text{VE}\) × Realism \(_\text{VE}\)	1	2.355	0.126		0.004
	Agency \(_\text{VE}\) × Emotion \(_\text{VE}\) × FoV \(_\text{VE}\) × Realism \(_\text{VE}\)	1	0.654	0.419		0.001
	Residuals (sum of squares = 1498.896)	346

Table 3: ANOVA of Emotion \(_\text{VE}\) , Agency \(_\text{VE}\) , Realism \(_\text{VE}\) and FoV \(_\text{VE}\) on perceived Fear.

Table 3 shows the results of the four-way ANOVA of Emotion \(_\text{VE}\) , Agency \(_\text{VE}\) , Realism \(_\text{VE}\) and FoV \(_\text{VE}\) on perceived Fear. The main effect of Emotion \(_\text{VE}\) is significant, therefore we accept H10. The interaction between Emotion \(_\text{VE}\) and Agency \(_\text{VE}\) is not significant, so we reject H11. Of the set of speculative hypotheses H12A-D investigating the main effects of Realism \(_\text{VE}\) and FoV \(_\text{VE}\) , and their interactions with Emotion \(_\text{VE}\) , only the interaction of Realism \(_\text{VE}\) with Emotion \(_\text{VE}\) is significant, so we reject H12A,B&D and accept H12C.

4.4 Predicting Agency in the TAP-Fear Model

Table 4:

Hypotheses	Effect	df	F	p	p \(_\text{BH}\)	η²
	Emotion \(_\text{VE}\)	1	34.848	< .001		0.081
H13	Agency \(_\text{VE}\)	1	16.132	< .001		0.038
H14	Emotion \(_\text{VE}\) × Agency \(_\text{VE}\)	1	13.917	< .001		0.033
H15A	Realism \(_\text{VE}\)	1	0.362	0.548	1.00	< 0.001
H15B	FoV \(_\text{VE}\)	1	< 0.001	0.980	1.00	< 0.001
H15C	Realism \(_\text{VE}\) × Agency \(_\text{VE}\)	1	0.022	0.882	1.00	< 0.001
H15D	FoV \(_\text{VE}\) × Agency \(_\text{VE}\)	1	2.775	0.097	.388	0.006
	Emotion \(_\text{VE}\) × FoV \(_\text{VE}\)	1	4.139	0.043		0.010
	Agency \(_\text{VE}\) × Emotion \(_\text{VE}\) × FoV \(_\text{VE}\)	1	0.377	0.539		< 0.001
	Emotion \(_\text{VE}\) × Realism \(_\text{VE}\)	1	0.756	0.385		0.002
	Agency \(_\text{VE}\) × Emotion \(_\text{VE}\) × Realism \(_\text{VE}\)	1	3.990	0.047		0.009
	FoV \(_\text{VE}\) × Realism \(_\text{VE}\)	1	1.224	0.269		0.003
	Agency \(_\text{VE}\) × FoV \(_\text{VE}\) × Realism \(_\text{VE}\)	1	0.771	0.381		0.002
	Emotion \(_\text{VE}\) × FoV \(_\text{VE}\) × Realism \(_\text{VE}\)	1	0.278	0.598		< 0.001
	Agency \(_\text{VE}\) × Emotion \(_\text{VE}\) × FoV \(_\text{VE}\) × Realism \(_\text{VE}\)	1	2.137	0.145		0.005
	Residuals (sum of squares = 1274.779, mean square = 3.684)	346

Table 4: ANOVA of Emotion \(_\text{VE}\) , Agency \(_\text{VE}\) , Realism \(_\text{VE}\) and FoV \(_\text{VE}\) on perceived Agency.

Table 4 shows the results of the four-way ANOVA of Emotion \(_\text{VE}\) , Agency \(_\text{VE}\) , Realism \(_\text{VE}\) and FoV \(_\text{VE}\) on perceived Agency. The main effect of Agency \(_\text{VE}\) is significant, therefore we accept H13. The interaction between Emotion \(_\text{VE}\) and Agency \(_\text{VE}\) is significant, so we accept H14. Of the set of speculative hypotheses H15A-D investigating the main effects of Realism \(_\text{VE}\) and FoV \(_\text{VE}\) , and their interactions with Agency \(_\text{VE}\) , none of the effects is significant, so we reject H15A-D.

4.5 The TAP-Fear Structural Equation Model

Figure 3:

Figure 3 shows the Technical Agency-Presence-Fear (TAP-Fear) structural equation model (SEM), which was constructed based on our accepted hypotheses. Boxes are variables and arrows are regressions, so that the diagram illustrates the flow of effects from technical and design variables (marked with the subscripts VE) at the top and left, to perceived Fear and Agency in the middle, and finally to Presence. In other words, similar to the PEA model, the TAP-Fear model can be used to predict perceived Fear and Agency from technical and design variables, and finally predict Presence. As the TAP-Fear model focuses on fear, we encode the emotion the VE is designed to induce using a dummy variable called Fear \(_\text{VE}\) , with value 1 meaning the VE is designed to induce fear and 0 meaning the VE is designed to induce happiness. By contrast, the PEA model encodes Emotion \(_\text{VE}\) as 1 for happiness and −1 for fear. Our encoding of Agency \(_\text{VE}\) is the same as in the PEA model, with 1 meaning the user is afforded agency and 0 meaning they are not. Fear \(_\text{VE}\) predicts perceived Fear (H10), and Agency \(_\text{VE}\) predicts perceived Agency (H13). Additional effects of Fear \(_\text{VE}\) on Fear are moderated by Realism \(_\text{VE}\) (H12C). Additional effects of Agency \(_\text{VE}\) on Agency are moderated by Fear \(_\text{VE}\) (H14). Presence is formed through perceived Fear (H7) and Agency (H8), as well as effects of perceived Agency that are moderated by FoV \(_\text{VE}\) (H9D).
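The two encodings of the emotion design variable are related by a simple linear mapping, Fear \(_\text{VE}\) = (1 − Emotion \(_\text{VE}\) ) / 2. A minimal sketch of the conversion, with an illustrative function name:

```python
def fear_ve_from_emotion_ve(emotion_ve: int) -> int:
    """Map the PEA encoding (+1 happiness, -1 fear) to the TAP-Fear dummy
    (0 happiness, 1 fear)."""
    if emotion_ve not in (-1, 1):
        raise ValueError("emotion_ve must be -1 (fear) or +1 (happiness)")
    return (1 - emotion_ve) // 2


print(fear_ve_from_emotion_ve(-1), fear_ve_from_emotion_ve(1))
```

Because the recoding changes both the zero point and the scale of the variable, coefficients involving Emotion \(_\text{VE}\) in the PEA model are not directly comparable in magnitude or sign to those involving Fear \(_\text{VE}\) in TAP-Fear.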

Table 5:

Measure	PEA	Fear	TAP-Fear	TAP-Fear2	TAP-Fear3
R² of Intensity resp. Fear	0.077	0.340	0.358	0.358	0.358
R² of Agency	0.145	0.145	0.145	0.144	0.144
R² of Presence	0.414	0.445	0.437	0.436	0.444
Root mean square error of approximation (RMSEA)	0.096	0.073	0.100	0.111	0.071
Comparative Fit Index (CFI)	0.932	0.968	0.915	0.913	0.970
Akaike Information Criterion (AIC)		4047.02	4036.56	4034.99	4037.12
Bayesian Information Criterion (BIC)		4089.83	4087.16	4081.69	4079.93

Table 5: SEM model fit measures.

We evaluate the TAP-Fear model by considering several fit measures, as shown in Table 5. We compare TAP-Fear to the PEA model from Jicol et al. [42], as well as three model variations called Fear, TAP-Fear2 and TAP-Fear3, which have fewer variables. Considering these reduced models makes sense because many fit measures reward models that are parsimonious, i.e. are able to describe data accurately while avoiding complexity, which is a desirable model property. Different fit measures penalise the complexity of a model in different ways, so several fit measures should be taken into account when comparing models [83].

We first compare the PEA model to the Fear model, which aims to do the same as the PEA model: predict emotions (Intensity for PEA and Fear for Fear), Agency and Presence based on the design variables Fear \(_\text{VE}\) , Agency \(_\text{VE}\) and their interaction. The Fear model is the TAP-Fear model shown in Figure 3 without the technical factors, i.e. without the grey boxes at the top. Table 5 shows that Fear is better at predicting emotions (Fear) and then Presence than PEA, as indicated by markedly higher R² values. Fear also has a lower RMSEA (lower is better) and higher CFI (higher is better), with RMSEA and CFI measures indicating that PEA has a ‘mediocre’ fit and Fear an ‘adequate’ fit. This suggests that Fear is an improvement on PEA. The AIC and BIC values can only be compared when two models predict the same variables, which is not the case as PEA predicts Intensity where Fear predicts Fear. If we also included Happiness as a predictor of Presence in the Fear model, the R² value would increase marginally (0.450) while the CFI would drop drastically (0.824) due to increased model complexity, which indicates that including Happiness would lead to model overfitting.

By including the technical factor Realism \(_\text{VE}\) × Fear \(_\text{VE}\) , the TAP-Fear model improves its prediction of Fear, as seen by the higher R². However, the prediction of Presence gets slightly worse. The added complexity (more variables) causes some fit measures (RMSEA and CFI) to get worse, while others get better (AIC and BIC, lower is better). All coefficients of Fear and TAP-Fear are significant (p ≤ .023) except for the one for Agency \(_\text{VE}\) on perceived Agency (p = .529), as also suggested by its very small standardised coefficient β = −0.04. We therefore removed this effect (the dotted lines at the bottom left in the diagram), resulting in a model variation TAP-Fear2 with a prediction performance similar to TAP-Fear but better parsimony (lower AIC and BIC). We further reduced TAP-Fear2 by removing the smallest effect FoV \(_\text{VE}\) × Agency (β = 0.08, p = .023) shown in dashed lines at the top right. This results in a new model TAP-Fear3 with slightly improved prediction performance and RMSEA and CFI values that put it into the ‘adequate’ to ‘good’ category.
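The different complexity penalties of AIC and BIC can be made explicit with the values in Table 5. The sketch below assumes the standard definitions AIC = 2k − 2 ln L and BIC = k ln N − 2 ln L, so that BIC − AIC = k (ln N − 2), and it assumes that N = 360 was used when computing BIC. The implied parameter counts k are therefore an inference under these assumptions, not figures reported by the analysis.

```python
import math

N = 360  # reported sample size; assumed to be the N entering the BIC

# (AIC, BIC) pairs copied from Table 5; no AIC/BIC is listed for PEA.
FIT = {
    "Fear": (4047.02, 4089.83),
    "TAP-Fear": (4036.56, 4087.16),
    "TAP-Fear2": (4034.99, 4081.69),
    "TAP-Fear3": (4037.12, 4079.93),
}


def implied_parameter_count(aic: float, bic: float, n: int) -> float:
    """Solve BIC - AIC = k * (ln(n) - 2) for k."""
    return (bic - aic) / (math.log(n) - 2.0)


best_by_aic = min(FIT, key=lambda name: FIT[name][0])
best_by_bic = min(FIT, key=lambda name: FIT[name][1])
implied_k = {name: round(implied_parameter_count(aic, bic, N))
             for name, (aic, bic) in FIT.items()}
print(best_by_aic, best_by_bic, implied_k)
```

Under these assumptions, AIC is lowest for TAP-Fear2 whereas BIC is lowest for TAP-Fear3, reflecting the heavier per-parameter penalty of BIC at this sample size (ln 360 ≈ 5.9 versus 2 for AIC).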

5 Qualitative Analysis

We conducted a thematic analysis on the open-ended participant responses, with the following results. Agency was mentioned as one of the most prominent factors contributing to the participants’ feeling of presence, second only to sound and music. The highest frequency of reporting agency as presence-inducing was in the VEs with agency and high FoV, followed by VEs with high visual realism. Participants in VEs with low FoV mainly mentioned agency as presence-inducing only if the VE induced fear with high visual realism (“The responsiveness to the light despite it being relatively out of view”). Participants found that agency induced presence by making the VE responsive to their actions (“The fact that the virtual environment was responsive to my actions made the experience realistic”, “Interacting with the dog with a ball meant I felt like I was acting in the environment.”). This focused their attention (“the dog made it more interactive and grabbed my attention immediately”). In the fear VEs, agency gave them a purpose of defending themselves (“I quite liked that because there was a perceived threat of the wild dog coming closer if I didn’t shine the flashlight, I had to be on guard which was very immersive”).

Visual realism was another prominent factor reported as inducing presence. In VEs with high FoV, high visual realism and agency, visuals were mentioned most frequently as presence-inducing. Especially the realism of the creature appeared to contribute to presence (“I thought the animation (movement of the dog) was quite realistic and made me feel present”), with unrealistic visuals reducing presence (“sometimes the dog would easily walk through bushes or sort of jump on nothing which reminded me that it was not real.”). Visual realism of the creature was most often reported as presence-inducing in fear-inducing VEs with high visual realism, agency and low FoV. The visual realism of the scenery also contributed to presence (“the environment being very detailed (rocks, pathway, buildings, flowers etc)” and how [...] “the park was built up in a natural and realistic way”).

Visual realism was also a prominent factor contributing to emotion. The most reports of visual realism contributing to emotions were made in fear-inducing VEs with high visual realism and high FoV. Conversely, limitations of the visual realism hampered emotional response, even in VEs with high realism (“The limited graphical quality and animations ruined the fear I felt”, “they weren’t really realistic so I didn’t feel anything during the study”, “the unrealistic graphics made the experience more humorous than engaging”). Participants frequently stated that improved visual realism would increase emotional response (“More realistic textures and movement of the creature would have improved the fearfulness of the experience”, “if it was more realistic, the fear that I experienced would be 100% more intense”).

6 Discussion

In addressing RQ1 about how visual realism and FoV affect VR presence, the results suggest first of all that they do not affect presence directly to any meaningful degree. The large sample size gave the study a high power (99%), that is, a high probability of detecting at least medium effects of visual realism and FoV on VR presence. However, our hypotheses about such effects (H4 and H5) were not supported. Previous work found direct effects of visual realism [49, 93, 110] and FoV [53, 73, 98], but this could be explained by the fact that effects of realism and FoV appear to moderate the effects of human factors such as fear and agency (see grey boxes at the top of Figure 3). If a study does not consider different levels of induced fear and afforded agency, for example, because the VEs used are of a limited variety, then changes of realism and FoV could appear to affect presence directly. These findings also align with the dual model of presence which was recently proposed by Weber et al. [106]. The dual model postulates that the sensation of “being there” which is ensured by technical qualities of VR is in fact not enough to achieve presence because ultimately the user needs to interpret what they perceive as realistic. This is a direct reference to human factors and their importance to presence. To use an example from our model, VEs that elicit fear and afford agency are more likely perceived as realistic, whereas technical factors only support these human factors (e.g. realism supports fear and FoV supports agency).

The results suggest that visual realism and FoV do indeed affect presence indirectly through moderation. More precisely, visual realism appears to moderate the fear-inducing effects of a VE on presence (Realism_VE × Fear_VE, H12C), which is mediated through the fear that is actually felt (H7). In other words, visual realism makes it easier to induce fear in a VE, which in turn leads to higher presence. This finding is in line with previous work, which showed that users felt more fear in VEs that were more realistic [38]. It is not congruent, however, with a recent study that investigated whether a higher level of realism can elicit higher fear and presence in a height simulation task [32]. Unlike Hvass et al. [38], the authors of that study did not find visual realism to affect the level of perceived fear, only presence directly. A possible explanation for this lack of an effect on fear is that their participant sample was one with fear of heights and thus may have exhibited pathological fear [22], leading to a ceiling effect. This raises an interesting point about the extent to which visual realism can heighten perceived fear and where the effect might level off. Our findings also validate previous work attempting to systematically investigate the factors contributing to presence, such as the interoceptive attribution model by Diemer et al. [21]. This model postulates that presence is determined by the immersive features of the medium (i.e. technical qualities) and by the level of arousal felt by users. Our TAP-Fear model substantially enriches this paradigm by adding the effect of agency amongst human factors and describing the exact interactions that occur between human and technical factors.

FoV appears to moderate the effect of perceived agency on presence (H9D). In other words, FoV matters when a user feels in control. This finding was also backed up by qualitative reports from participants, who mentioned agency as an immersive feature more often in conditions where they were afforded agency. The observed effect is in contrast with a recent study [102], which found that reducing FoV did not impact presence, despite their VE affording users agency. However, Teixeira and Palmisano [102] used the Oculus Rift CV1 HMD, which has a maximum FoV of just below 90°, and whose FoV was reduced even further to 20% of that during the study. This suggests that the authors tested the ‘lower half’ of FoV variability, whereas we focused on the ‘upper half’ where agency may be more important. One possible explanation for the interaction effect between FoV and agency in the present study could be that motion from optical flow is primarily detected in the peripheral vision [105], which was further restricted in the low FoV conditions. In essence, users may have felt less in control of the VE because they had a limited visual window for perceiving their moving laser pointer/flashlight. Our results show that if a user is not afforded agency, the low FoV does not matter.

One of the benefits of our model is that it can give a quantifiable measure of the added benefits brought by technical factors. This is important because in some cases the disadvantages of a technical factor may outweigh its benefits to presence. For example, an increasing body of literature has shown that reducing the FoV can be effective in preventing VR motion sickness [9, 26, 46]. This practice gathered so much evidence that a few years ago it was implemented in some popular VR experiences [1]. Confusingly, there is also evidence that reducing FoV does not reduce motion sickness [1]. Designers need to decide whether the increased sense of presence from higher FoV and agency could outweigh the potential negative effects of motion sickness. This is an example of how the TAP-Fear model can inform VR design: given the small effect of FoV on presence, it may be better to opt for a reduced FoV in experiences that are known to induce motion sickness. This is especially true if the VE does not afford agency, in which case there would be no added benefit to increased FoV.

We addressed RQ2, showing how the formation of VR presence is based on technical and human factors, by creating the TAP-Fear structural equation model (Figure 3), which appears to describe our data adequately and is able to predict human factors from design and technical factors. The TAP-Fear model allows us to quantify the estimated effects of technical factors on presence: in a VE that is designed to elicit fear, the normalised effect of increased visual realism on presence, mediated through fear, is the product of the normalised coefficients 0.17 × 0.16 ≈ 0.03. Similarly, in a VE that induces fear and affords agency, the normalised effect of increased FoV on presence can be estimated as 0.40 × 0.08 ≈ 0.03. These technical effects are fairly small compared to the estimated effects of human factors on presence: designing a VE that induces fear yields 0.49 × 0.16 ≈ 0.08, and affording agency in a fear-inducing VE yields 0.40 × 0.61 ≈ 0.24. As expected based on the previous work by Jicol et al. [42] and their PEA model, no such effects were visible in conditions inducing happiness. This demonstrates once again the relevance of fear for VR applications and the formation of user presence. This pattern of results could be due to the evolutionary function of fear; it is not driven by arousal, since both fear and happiness are high-arousal emotions.
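
The estimates above multiply the normalised coefficients along a path of the model. The following minimal Python sketch reproduces that arithmetic using only the coefficients quoted in this section. The dictionary labels and the function name `indirect_effect` are illustrative assumptions made for this sketch and are not identifiers from the TAP-Fear model.

```python
from math import prod

# Normalised path coefficients exactly as quoted in the text above.
# The keys are hypothetical labels chosen for this sketch only.
PATHS = {
    "realism_via_fear": (0.17, 0.16),
    "fov_via_agency": (0.40, 0.08),
    "fear_design_via_fear": (0.49, 0.16),
    "agency_in_fear_ve": (0.40, 0.61),
}


def indirect_effect(coefficients):
    """Estimate the effect along one path as the product of its coefficients."""
    return prod(coefficients)


# Round to two decimals, matching the precision used in the text.
effects = {name: round(indirect_effect(c), 2) for name, c in PATHS.items()}

for name, value in effects.items():
    print(f"{name}: {value:.2f}")
```

Read side by side, the two technical paths each come out at roughly 0.03, whereas the two human-factor paths come out at roughly 0.08 and 0.24, which is the comparison the paragraph draws.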

Previous research has offered localised snapshots of the interactions between factors such as fear and realism [38], and fear and agency [42]. The TAP-Fear model illustrates previously unexplored interactions, such as between FoV and agency, as well as providing a broader overview of how the most prominent human and technical factors come together to form presence. What the model also demonstrates is the necessity to adopt a broader view when investigating individual factors or binary relationships between them. TAP-Fear suggests that VR technology has come a long way since its early days and improvements to technical features may have reached a point of diminishing returns for presence [94]. Arguably, even the low levels of realism and FoV that we tested are superior to many VR environments from only a decade ago – a feat which was made possible by the exponential increase in processing power of chip technology [77]. This view is supported by the number of participant comments that remarked on the quality of visuals – comments which did not differ across the two levels of realism. Although novel rendering approaches, optics and displays will certainly continue to improve the technical features of forthcoming VR systems [27], this does not mean that advances in presence formation are stalled until that time. In fact, the TAP-Fear model brings up new questions about the direction of VR hardware and software. It suggests that, at least in VEs inducing fear, better understanding the user, and how their individual characteristics and feelings shape presence, may be more effective for presence than improving particular hardware characteristics.

6.1 Limitations and Future Work

Despite the significant undertaking of systematically investigating all possible combinations of four factors with two meaningful levels each, the TAP-Fear model only offers a restricted view over the multitude of design factors and levels that are at play in VR experiences. Examples of elements not considered are VEs designed to induce emotions other than fear or happiness, and HMDs with more extreme technical capabilities. While caution needs to be employed when generalising the TAP-Fear model, incorporating other design factors and levels can be addressed in future work because our between-participants experimental design allows for extensions of the model without retesting the conditions employed in this study. This data is made publicly available for other researchers to use and provides the foundation for future models on presence.

Our VEs only depict one scenario in a single environment (park). This is a location that all users will have a level of familiarity with. It has been shown that perceived realism of a VE is determined in part by the extent to which said VE meets users’ expectations; these expectations are in turn grounded in their prior knowledge about the setting depicted in the VE [90]. Arguably, perceived realism may not have been affected as much if we had used a less-familiar VE. For example, a fantasy world could have been used, which would still have been able in principle to elicit sensations of realism [30]. Other VEs depicting a variety of settings, real and abstract, should thus be tested in the future.

In this study, we adopted a binary approach to affording agency, mainly to limit the number of levels for our experimental design and create clear design guidelines. However, agency is a continuous variable, and being able to manipulate it continuously would be useful for future work, e.g. to assess individual differences in how affordances of agency are perceived. This is increasingly relevant with the introduction of richer interaction techniques to consumer HMDs, such as controller-less hand tracking and foot trackers [13, 104]. In addition, our VE did not afford participants any form of locomotion, which was done to control for sensory input. This means that our model may not be entirely applicable to cases where users are able to navigate the VE. One can infer, however, that in such cases the level of perceived agency would be higher, and so could be presence. It is hard to predict whether the same interactions with emotions and FoV would be found with such increased agency.

Based on the technical factors we chose, the applicability of the TAP-Fear model is currently limited to consumer-grade VR hardware. The two FoV levels we used were chosen to maximise the applicability of our findings to popular consumer-grade VR HMDs, and the lower FoV was chosen to match that of Meta’s Quest 2, which is the most popular HMD. However, we acknowledge that there may be floor or ceiling effects present in our results, and that testing significantly worse or better HMDs may lead to new insights. In future, testing lower FoV may provide interesting insights for the design of low-cost HMDs, although HMD technology is fast evolving. We note that the high FoV we used is lower than that of some special-purpose HMDs, such as the Pimax Vision 8K X, which offers up to 200°. As the presented models are extendable, future work should aim to understand how a wider range of FoV and realism affect presence and the quality of experience, which is especially relevant to more specialised hardware.

Our two VEs are two points on a continuum of visual realism that extends beyond the low and high levels tested here. This was exemplified by the qualitative data where some users remarked that the low graphical quality lowered their fear levels, even in the high realism conditions. The applicability of the TAP-Fear model is thus limited by the parameters it was tested with. Still, the level of graphical fidelity in our study went beyond what is computationally possible on current untethered VR HMDs, as it was designed to make full use of the relatively high-end consumer-grade PC that was used. In addition, these models raise questions about how presence is affected in other technologies, such as MR and AR, which are still lagging behind VR in terms of technical factors and where the concept of presence is different. Furthermore, other technical factors can be incorporated into the models, such as the frame rate of the HMD.

Finally, Tcha-Tokey’s presence measure [101] was chosen for its reliability as well as to allow for validation of and comparison with the PEA model by Jicol et al. [42]. However, this also meant that presence was operationalised as a unidimensional construct, as is very commonly done in the literature. Still, it has been argued that presence can be divided into three separate dimensions, namely spatial, social and self presence [51]. This is relevant to the TAP-Fear model as emotion and agency may disproportionately affect the three sub-components of presence. Thus, further work should refine our model to distinguish between them.

6.2 Impact

Our results clarify the intricate ways in which human and technical factors can interact in the formation of VR presence. From our TAP-Fear model it becomes apparent that designers cannot ignore the influence of human factors when developing VR experiences. More precisely, it is the technical factors that should be adapted according to the specific emotion and level of agency afforded to the user, given the importance of the latter. Ultimately, technical factors need to be optimised due to limitations in computational power and high component cost that still characterise VR HMDs. Our model provides a framework for doing such an optimisation while prioritising the user experience, when designing VEs meant to elicit fear in particular. TAP-Fear can be interpreted as a structured decision tree, whereby the purpose of a VR application determines its properties and those of the HMD delivering it. In cases where the dominant intended emotion is fear, game designers should prioritise the enhancement of visual realism. In such cases, HMDs with an FoV above the 90° threshold should be required only for experiences that afford users agency, such as interactive games or training applications.
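
The decision-tree reading described above can be sketched in a few lines of Python. This is only an illustration of the two recommendations stated in this section, under the assumption that the application's purpose is reduced to a dominant emotion and a yes/no agency flag. The names `Recommendation` and `recommend` are hypothetical, and any purpose other than fear returns no guidance because the model does not cover it.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Recommendation:
    # None means the guidance in this section does not cover the case.
    prioritise_visual_realism: Optional[bool]
    require_fov_above_90: Optional[bool]


def recommend(dominant_emotion: str, affords_agency: bool) -> Recommendation:
    """Illustrative reading of the design guidance; only the fear branch is defined."""
    if dominant_emotion != "fear":
        # Outside the scope of TAP-Fear: no recommendation is derived.
        return Recommendation(None, None)
    return Recommendation(
        prioritise_visual_realism=True,       # fear-focused VEs benefit from realism
        require_fov_above_90=affords_agency,  # wide FoV only pays off with agency
    )


horror_game = recommend("fear", affords_agency=True)
passive_horror = recommend("fear", affords_agency=False)
print(horror_game)
print(passive_horror)
```

Returning `None` rather than `False` for non-fear applications keeps the sketch from implying a recommendation that the study does not make.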

7 Conclusions

In conclusion, the present study systematically investigated the roles of two technical factors and two human factors in the formation of presence within VR. Our results paint a clearer picture of how users’ perceptions ultimately shape the formation of presence, with technical factors taking only a supporting role:

(1) Visual realism and FoV do not appear to affect VR presence directly.

(2) Visual realism appears to make it easier to induce fear in a VE, which in turn leads to higher presence. FoV appears to increase presence only when a user feels agency.

(3) The effects of visual realism and FoV on presence appear to be small compared to the effects of agency afforded and fear induced in a VE.

Our work reinforces the centrality of human factors in the presence-formation process. Future work should expand our understanding of how VR hardware and software should be designed with the user in mind. In particular, a promising avenue of research could be an expansion of the TAP-Fear model to include additional levels of the tested variables, such as more emotions, levels of agency, or manipulations of the technical factors. Additionally, new factors should be considered, both technical ones, such as frame rate, and user characteristics, such as personality traits. Such efforts would contribute towards an increasingly general model of user presence in VR. This would substantially expand applicability to a wider range of uses, providing VR developers with a framework for designing hardware and content in a manner that maximises the user experience.

Acknowledgments

This work was supported and funded by The Virtual Reality Oracle Project (VRO; AH/T004673/1) and the Centre for the Analysis of Motion, Entertainment Research and Applications (CAMERA 2.0; EP/T022523/1) at the University of Bath.

Footnotes

Varjo VR-3 - https://varjo.com/products/vr-3/

Meta Quest 2 - https://store.facebook.com/gb/quest/products/quest-2/

VEs can be downloaded from https://github.com/RevealBath/vr-presence-benchmark

Supplementary Material

Supplemental Materials (3544548.3581448-supplemental-materials.zip), 5.12 KB

Video Figure: MP4 File (3544548.3581448-video-figure.mp4), 69.16 MB

Video Preview: MP4 File (3544548.3581448-video-preview.mp4), 23.00 MB

Pre-recorded Video Presentation: MP4 File (3544548.3581448-talk-video.mp4), 126.81 MB

References

[1] Isayas Berhe Adhanom, Nathan Navarro Griffin, Paul MacNeilage, and Eelke Folmer. 2020. The effect of a foveated field-of-view restrictor on VR sickness. In 2020 IEEE Conference on Virtual Reality and 3D User Interfaces (VR). IEEE, 645–652.
