1 Introduction
Immersive virtual reality (VR) allows users to have out-of-this-world experiences from the safety of their homes. Head-mounted display (HMD) technology can immerse a user’s senses in the virtual world and help induce feelings of presence.
According to Witmer and Singer [114], presence as a construct refers to the user’s subjective feeling of actually being in a place or environment, even when they are situated in another. However, in the case of experiencing a virtual environment (VE), presence refers to experiencing the VE rather than the actual physical one [114]. In contrast, immersion refers to an objective property of a system and the extent to which it can engage the user’s sensorimotor channels and perception of the VE [92]. Immersion is therefore affected by the technical characteristics of the VR setup including characteristics of both the software, such as realism of the VE, and the hardware, such as the field of view of the HMD [19]. Critically, the level of immersion provided by the system may affect the illusion of being in the virtual world, and the process by which presence is created has been described as a ‘negotiation’ between the user and the technical, or immersive, qualities of a VR system [40].
As a result, a body of research has explored individual technical factors, showing that improvements to field of view (FoV) [11, 20, 43, 87], level of detail (LoD) [12, 118], frame rate [6, 7, 29] and stereoscopy [45, 54, 78] can enhance presence. Over the last three decades there have been considerable improvements in the technical capabilities of commercial VR HMDs. Increases in computational power and display technology have allowed increasingly sophisticated VEs to be displayed, and characteristics such as the FoV to approach those of the human eye [18, 19]. Additionally, the commercial landscape has changed, with consumers having access to a wide variety of immersive HMDs, ranging from the most expensive VR setups which require high-powered gaming computers to run (e.g., Varjo VR-3) to more affordable and popular untethered HMDs (e.g., Meta Quest 2). However, despite these technological advances there are certain characteristics, such as VE realism and HMD FoV, that compete for limited computational resources, and designers of VR experiences still need to prioritise certain technical improvements over others when designing for the masses. This raises questions about how technical factors interact with each other to affect presence, and to date there has been little to no systematic combinatorial investigation.
In addition to the immersive quality of the VR setup, research has shown how human factors can affect a user’s presence. The emotions felt by the user [32] and their perceived agency within the VE [33, 50] have been identified as important human factors that affect presence, and there are complex interactions that occur during this process [42].
Emotions that rate highly on arousal, and in particular fear, appear to hold a special place in the formation of presence [10, 32, 44], having a strong evolutionary importance [65]. Furthermore, eliciting a sense of fear is at the centre of a wide range of VR applications, including therapeutic interventions [48, 70, 107], training for crisis management [28], and desensitisation to phobia-inducing stimuli [41, 69], and is even commonly used as a game mechanic in popular VR games [52].
Fear is not only important in itself but can change the way other factors affect presence: Jicol et al. [42] found that a user’s sense of agency strongly affected their presence in a fear-inducing VE but not in a happiness-inducing VE. This highlights how the formation of presence is determined by interactions with human factors, and shows that we cannot understand how a technical factor influences presence unless we systematically consider its effects in combination with human factors such as fear and agency.
The study presented here is the first to systematically investigate the effects on presence of important technical factors in the wider context of human factors. That is, we provide insights into a) interactions between the two technical factors realism and field of view, b) interactions between these technical factors and the two human factors agency and fear, and c) the importance of the technical and human factors in the formation of presence relative to each other. Such a systematic study of both technical and human factors in VEs has the potential to provide a more holistic view of how design decisions can affect presence in VR. We pose the following research questions:
RQ1: How do visual realism and FoV affect VR presence?
RQ2: How can we describe the formation of VR presence based on technical and human factors?
To address these questions, we conducted a large-scale study with 360 participants exploring the formation of presence in VR, by systematically varying the technical factors visual realism (high/low) and FoV (high/low), as well as the human factors emotion (focusing on fear) and agency (yes/no), yielding 2 × 2 × 2 × 2 = 16 between-group conditions.
We consider two levels for each of the four independent variables, using levels based on realistic design choices that are popular and meaningful with regard to current consumer-grade VR hardware and experiences. For emotion, we focus our investigation on fear, and compare it with happiness as a popular emotion of opposite valence. To address RQ1, we first analysed the data based on hypotheses derived from related work using analysis of variance (ANOVA) and linear regression methods. The large sample size allowed us to follow a robust approach even where related work did not provide plausible hypotheses [17]. That is, it allowed us to avoid Type 1 errors by correcting for multiple comparisons, while still retaining a high power and avoiding Type 2 errors. To address RQ2, we used structural equation modelling (SEM) to formulate and evaluate the novel TAP-Fear model, which demonstrates the close relationships between fear, agency and presence. In comparison to earlier models of presence, such as the Presence, Emotion and Agency (PEA) model [42], the TAP-Fear model provides a better fit to our data and describes the effects of both technical and human factors. In summary, we make the following contributions:
(1) Evidence that visual realism and FoV do not affect presence directly.
(2) Evidence that the effects of visual realism and FoV on presence are moderated by induced fear and perceived agency, respectively.
(3) The TAP-Fear model, describing how technical and human factors work together in presence formation.
2 Related Work
The importance of presence for the effectiveness of applications ranging from entertainment [86] to learning [67, 75, 103] and sensory-motor rehabilitation [8, 15, 16, 79] has led to significant efforts to investigate those elements that contribute to its formation. Presence is also a crucial factor that drives user retention and adoption of VR technology [39]. Earlier definitions of presence described it as merely the sensation of ‘being there’ [55]. However, Weber et al. [106] highlight that in the case of HMD VR, the feeling of ‘being there’ is easily achievable because sensory stimulation from the outside world is blocked and replaced by the virtual one. Still, that does not mean that the user regards the VE as realistic or believable, which are crucial characteristics of presence [106]. The term ‘presence’ was first adopted four decades ago, due to the need to quantify the increasing ability of new media to provide rich and realistic VEs that could transport users from the real world into the virtual [66]. This evidence points to the strong reliance of VR on technical properties to elicit presence and distinguish itself from conventional 2D screens [109]. Indeed, achieving ever higher user presence has been described as the single most important goal of VR experiences [108].
In the past few years, decreasing costs of screen and tracking technology as well as an exponential increase in computational power have made commercial VR featuring previously prohibitive technical qualities affordable to the average user [38]. However, reproducing very high fidelity VEs and affording extensive body tracking still requires state-of-the-art hardware and software which is not yet viable in consumer-grade HMDs. This is especially relevant for modern untethered VR HMDs such as the popular Meta (Oculus) Quest 2. The added portability and reduced cost of such devices come with correspondingly reduced computational power, which in turn means that technical improvements compete for limited resources. From the multitude of technical factors that characterise VR HMDs, perhaps the most important for presence, and yet the most computationally taxing, are the visual realism of the VE on the software side and the FoV on the hardware side. Facilitating higher visual realism comes with increased demand for computational power as more detailed objects have to be rendered on the screen. Similarly, a wider FoV poses the same challenge because more of these objects need to be rendered to fill each of the wider frames. Furthermore, with increased FoV comes the added cost of a larger, more expensive display. Maximising both realism and FoV is not yet viable on consumer-grade hardware and especially on portable HMDs [18]. This is relevant to consumers as cost and portability are significant prohibitive factors that hinder user adoption of HMD-based VR [39]. It is thus crucial to first understand the actual benefits for user presence of increased visual realism and FoV, in order to inform guidelines for both consumer-grade VR hardware engineers and content creators.
2.1 Visual Realism
It has been argued that the realism of a VE is the most important factor ultimately driving user presence [91]. The visual realism of a VE has itself been described as composed of two main components. First, geometric realism refers to how realistic objects within the VE look, or how close they are to their real world counterparts [93]. The second aspect of realism is illumination realism, which refers to the fidelity of lighting and shadows cast by objects in the VE [93]. These can be further divided into quality of objects and terrain [110], texture and lighting [118], and shadow quality [60, 93, 95]. It has been shown that when presented with a VE, users will invariably compare the look of virtual objects with real life ones, in order to judge the level of congruence [100]. Indeed, Weber and colleagues [106] state that from the perceptual and conceptual point of view of the user, realism can be sub-divided into separate components such as coherence [89], fidelity [3], judgement of reality [5] and perceived realism [14, 81, 90]. Thus, most factors that determine the level of realism of a VE are heavily dependent on the way a VE looks.
However, due to the multitude of factors manipulated and hardware utilised across studies, there is currently no consensus as to the exact effect of visual realism on presence.
The strong effects of visual realism on user experience are hard to contest. For example, there is evidence that perceived VE realism can even affect user behaviour in VR [88]. Some studies have found that visual realism can be beneficial for presence [49, 93, 110], while others found no such effect [23, 57, 60, 118]. Still, a portion of this work did not account for affective content, which is a characteristic of most VR games. One more recent study aimed to address this limitation and presented users with two versions of a VR game that elicited fear [38]. The authors manipulated polygon count and texture resolution and found that a higher level of realism enhanced presence [38].
Moreover, the software is not the only technical component that contributes to visual realism. Even in a hypothetical scenario where a VE could be perceptually indistinguishable from reality, the way in which this VE is perceived is still mediated by the hardware of the HMD, in particular the display. In other words, presence may not be solely determined by what users perceive in a VE but also by how they perceive it.
2.2 Field of View
The impact of field of view on presence has long been studied with 2D screens, e.g. [12, 37, 47], with results suggesting that wider screens enhance the immersive features of an application.
VR HMD displays are still evolving, with many features such as pixel density and colour accuracy still in need of improvements [18]; the afforded FoV of HMDs has been constantly improving since their appearance on the market. It is important to acknowledge, however, that the average human binocular field of view reaches up to 190° [2], whereas the most popular consumer HMDs currently stand at around half that. The benefits of wider FoVs for VR presence have been highlighted by early studies [53, 73, 98] but the findings were not unanimous [43]. Again, these studies are dated, employing setups with far lower FoVs than are prevalent today. A more recent meta-analysis by Cummings and Bailenson [19] showed that the FoV of an HMD can play an even more important role in presence formation than visual realism. However, this meta-analysis was published before the recent wave of modern consumer HMDs [86]. A recent study from 2021 showed that participants using a variety of modern HMDs experienced lower presence with a reduced FoV [102]. This study, however, was conducted remotely and due to differences in the native FoVs of various HMDs, it is difficult to draw conclusions as to what precisely the wide and narrow FoVs were. A reduced FoV not only affects presence directly but can also lead to skewed distance estimation within VR [62]. This in turn influences users’ ability to co-locate themselves among other landmarks within the VE, which is a prerequisite for presence [97].
Despite the apparent advantages of wider FoVs for presence, deploying wider screens to HMDs presents several challenges. First, a wider FoV implies that a larger portion of the VE is displayed at one time, which can be extremely taxing on limited computational power. At the hardware level, as remarked in a recent review by Angelov et al. [4], HMDs often have to compromise other qualities, such as pixel density, in exchange for a wider FoV. Lower pixel density can reduce presence and potentially contribute to VR motion sickness [61]. Simultaneously increasing the FoV and VE verisimilitude meets the same challenge of limited computational resources. It is clear that hardware and software technical features need to be balanced to maximise presence while staying within the bounds of available processing power. Before beginning to understand how such technical factors impact presence, however, there is a need also to take into account factors originating from the user.
2.3 Human Factors
Earlier models of presence placed a heavy emphasis on the technical factors of VR which enabled VEs to feel more realistic, e.g. [112]. However, with significant advances in those areas, more recent research has started to investigate human factors too. More recent models acknowledge that ultimately the user determines whether presence is formed, and this also depends on how they feel within the VE [21, 39, 42, 85].
Agency, or the perception of acting within a VE, has been at times neglected in accounts of user presence. However, with advances in hardware, in particular tracking technology, agency has received more attention within the presence literature. Sanchez-Vives et al. [82] describe presence as grounded in the feeling not only of “being there” but also “doing there”. The user-perceived verisimilitude of the interaction is also amongst the elements that contribute to overall perceived realism and thus user presence [106]. This is supported by Magnenat-Thalmann [58] who points out that amongst the crucial aspects that drive user presence are the presentation of the environment and the interaction that is afforded to the user within it, i.e. agency. Jang and Park [39] aimed to create a SEM model to explain user retention and adoption of HMD VR. They showed that both technical features and agency contributed to presence. However, their study did not immerse users in an actual controlled VE, but merely asked them which factors they considered important. Such reporting could depend on the variety of VR applications that the users had engaged with and, although informative, does not allow for clear design recommendations. Moreover, the authors did not test for interactions between the design factors, nor for affective content, which can have a strong impact on presence [41, 42].
Despite this evidence that both technical factors (realism of the VE or the HMD’s FoV) and human factors (such as emotion and agency) contribute to presence, to our knowledge no previous research has investigated how they interact in the formation of presence. For example, Hvass et al. [38] immersed users in two VEs with different levels of realism, both of which afforded users agency and elicited fear. It was found that fear levels were lower in the condition presenting poorer realism. This suggests that technical factors are indeed able to moderate the intensity of felt emotion, which could in turn affect presence. However, as shown by Jicol et al.’s [42] PEA model of presence, agency also moderates the effect of fear on presence. This makes it problematic to expand the findings of Hvass et al. [38] to VEs where users do not have agency. Agency could in fact interact with technical factors in their effect on fear and perhaps presence, but such an effect has not been tested with modern hardware.
Hence, a clear understanding of whether and how technical and human factors may interact to create presence is missing.
Past literature has clarified the relevance of technical characteristics such as VE realism and FoV as well as fear and agency [42]. The current study expands this knowledge by systematically investigating not only their contributions to presence in isolation but also their interactions with each other. To answer RQ1 and RQ2, we collected a new large data set and used it to create a novel SEM model including the technical factors realism and FoV, and the human factors agency and emotion. When considering emotion, our particular focus is on fear, as it holds significant importance to the VR industry and has shown intricate relationships with technical factors [38], agency and presence [42].
4 Results
We first compared the level of user VR experience between the 16 conditions, so as to avoid confounding variables. A one-way ANOVA indicated that there was no such difference between the 16 participant groups (F(15, 342) = 0.927, p = .535). It was also confirmed via a Pearson correlation that participants’ VR experience did not correlate with Presence (r(359) = .038, p = .468).
Next, it was verified whether the VEs were successful in eliciting the intended emotions. This was confirmed because reported Happiness was significantly higher across the VEs designed to induce happiness, when compared to the fear VEs (t(360) = −8.155, p < .001**, d = −.857). The opposite pattern was found for Fear which was higher in the fear VEs (t(360) = 13.611, p < .001**, d = 1.431).
It was also tested whether the dominant emotion in each condition was the intended one, in that users reported feeling the emotion a given VE was intended to elicit as most intense. Here paired-samples t-tests indicated that participants felt more Happiness in the Happy VEs (M = 6.836, SD = 1.84) compared to Fear (M = 2.63, SD = 2.00), (t(182) = −18.200, p < .001**). A similar effect in reverse was observed in the Fear-inducing VEs, where Fear (M = 5.68, SD = 2.25) was indeed felt more than Happiness (M = 5.10, SD = 2.20), (t(178) = 2.035, p = .043*).
User-felt Agency was also compared between the agency and non-agency VEs via an independent-samples t-test, which confirmed that agency-inducing VEs (M = 6.88, SD = 1.97) led to significantly higher Agency than non-agency VEs (M = 6.08, SD = 2.14), (t(360) = −3.697, p < .001**). Overall, these results confirm that the VEs were successful in eliciting the desired emotions and feeling of agency.
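The effect sizes reported for the emotion manipulation checks can be sanity-checked from the t statistics alone. The sketch below uses the standard conversion d = t · sqrt(1/n1 + 1/n2) for an independent-samples t-test with pooled variance. The group sizes of 183 (happiness VEs) and 179 (fear VEs) are an assumption, inferred here from the degrees of freedom of the paired tests above rather than stated in the text, and the function name is purely illustrative.

```python
import math


def cohens_d_from_t(t, n1, n2):
    """Convert an independent-samples t statistic (pooled variance) to Cohen's d."""
    if n1 <= 0 or n2 <= 0:
        raise ValueError("group sizes must be positive")
    return t * math.sqrt(1.0 / n1 + 1.0 / n2)


# Assumed group sizes, inferred from t(182) and t(178) in the paired tests.
N_HAPPY, N_FEAR = 183, 179

# t statistics as reported for perceived Fear and perceived Happiness.
d_fear = cohens_d_from_t(13.611, N_HAPPY, N_FEAR)
d_happiness = cohens_d_from_t(-8.155, N_HAPPY, N_FEAR)

print(f"d (Fear) = {d_fear:.3f}, d (Happiness) = {d_happiness:.3f}")
```

Under these assumed group sizes, the conversion agrees with the reported values of d = 1.431 and d = −.857 to three decimal places.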
4.1 Verifying and Extending the PEA Model
Table 1 shows the results of the four-way ANOVA of Emotion\(_\text{VE}\), Agency\(_\text{VE}\), Realism\(_\text{VE}\) and FoV\(_\text{VE}\) on Presence. The main effects of Emotion\(_\text{VE}\) and Agency\(_\text{VE}\) are both significant, and so is their interaction; therefore we accept H1-H3. The main effects of Realism\(_\text{VE}\) and FoV\(_\text{VE}\) are both not significant; therefore we reject H4&5. Of the set of speculative hypotheses investigating the interactions of Realism\(_\text{VE}\) and FoV\(_\text{VE}\) with Emotion\(_\text{VE}\) and Agency\(_\text{VE}\), which signify moderation effects on Presence, only Agency\(_\text{VE}\) × FoV\(_\text{VE}\) is significant; therefore we reject H6A-C and accept H6D.
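To make concrete what a moderation effect such as Agency\(_\text{VE}\) × FoV\(_\text{VE}\) means in a 2 × 2 layout, the following minimal sketch computes the interaction contrast as a difference of differences between cell means. The cell means are invented for illustration and are not data from this study; the function names and the 0/1 level coding are likewise assumptions.

```python
def simple_effect(cell_means, fov_level):
    """Effect of affording agency (1 vs 0) on the outcome at one FoV level."""
    return cell_means[(1, fov_level)] - cell_means[(0, fov_level)]


def interaction_contrast(cell_means):
    """Difference of differences: agency effect at high FoV minus at low FoV.

    A value of zero means the agency effect is the same at both FoV levels,
    i.e. FoV does not moderate the effect of agency.
    """
    return simple_effect(cell_means, 1) - simple_effect(cell_means, 0)


# Hypothetical mean Presence scores keyed by (agency_ve, fov_ve); NOT study data.
example_means = {
    (0, 0): 5.0,  # no agency, low FoV
    (0, 1): 5.1,  # no agency, high FoV
    (1, 0): 5.2,  # agency, low FoV
    (1, 1): 6.0,  # agency, high FoV
}

contrast = interaction_contrast(example_means)
print(f"interaction contrast = {contrast:.2f}")
```

In this made-up example the agency effect is larger at high FoV than at low FoV, which is the pattern a significant interaction term would reflect; the ANOVA additionally tests whether such a contrast is larger than expected from sampling noise.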
4.2 Predicting Presence in the TAP-Fear Model
Table 2 shows the results of the linear regression analysis on Presence testing H7-H9. Perceived Fear and Agency are significant positive predictors of Presence, therefore we accept H7&8. By comparison, Intensity (r(358) = 0.027, p = .614) and Happiness (r(358) = −0.067, p = .202) do not correlate significantly with Presence. In line with our ANOVA results, Realism\(_\text{VE}\) and FoV\(_\text{VE}\) do not significantly predict Presence, which provides further support for rejecting H4&5. Of the set of speculative hypotheses investigating the interactions of Realism\(_\text{VE}\) and FoV\(_\text{VE}\) with perceived Fear and Agency, which signify moderation effects on Presence, only FoV\(_\text{VE}\) × Agency is significant; therefore we reject H9A-C and accept H9D. This is in line with the rejection of H6A-C and the acceptance of H6D, which consider similar interactions with design variables Emotion\(_\text{VE}\) and Agency\(_\text{VE}\) rather than perceived Fear and Agency.
4.3 Predicting Fear in the TAP-Fear Model
Table 3 shows the results of the four-way ANOVA of Emotion\(_\text{VE}\), Agency\(_\text{VE}\), Realism\(_\text{VE}\) and FoV\(_\text{VE}\) on perceived Fear. The main effect of Emotion\(_\text{VE}\) is significant, therefore we accept H10. The interaction between Emotion\(_\text{VE}\) and Agency\(_\text{VE}\) is not significant, so we reject H11. Of the set of speculative hypotheses H12A-D investigating the main effects of Realism\(_\text{VE}\) and FoV\(_\text{VE}\), and their interactions with Emotion\(_\text{VE}\), only the interaction of Realism\(_\text{VE}\) with Emotion\(_\text{VE}\) is significant, so we reject H12A,B&D and accept H12C.
4.4 Predicting Agency in the TAP-Fear Model
Table 4 shows the results of the four-way ANOVA of Emotion\(_\text{VE}\), Agency\(_\text{VE}\), Realism\(_\text{VE}\) and FoV\(_\text{VE}\) on perceived Agency. The main effect of Agency\(_\text{VE}\) is significant, therefore we accept H13. The interaction between Emotion\(_\text{VE}\) and Agency\(_\text{VE}\) is significant, so we accept H14. Of the set of speculative hypotheses H15A-D investigating the main effects of Realism\(_\text{VE}\) and FoV\(_\text{VE}\), and their interactions with Agency\(_\text{VE}\), none of the effects is significant, so we reject H15A-D.
4.5 The TAP-Fear Structural Equation Model
Figure 3 shows the Technical Agency-Presence-Fear (TAP-Fear) structural equation model (SEM), which was constructed based on our accepted hypotheses. Boxes are variables and arrows are regressions, so that the diagram illustrates the flow of effects from technical and design variables (marked with the subscript VE) at the top and left, to perceived Fear and Agency in the middle, and finally to Presence. In other words, similar to the PEA model, the TAP-Fear model can be used to predict perceived Fear and Agency from technical and design variables, and finally predict Presence. As the TAP-Fear model focuses on fear, we encode the emotion the VE is designed to induce using a dummy variable called Fear\(_\text{VE}\), with value 1 meaning the VE is designed to induce fear and 0 meaning the VE is designed to induce happiness. By contrast, the PEA model encodes Emotion\(_\text{VE}\) as 1 for happiness and -1 for fear. Our encoding of Agency\(_\text{VE}\) is the same as in the PEA model, with 1 meaning the user is afforded agency and 0 meaning she is not.
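The two encodings can be contrasted in a short sketch. The coding of Fear\(_\text{VE}\), Agency\(_\text{VE}\) and the PEA emotion variable follows the description above; coding Realism\(_\text{VE}\) as 1 for high and 0 for low is an assumption made here for illustration, as are all function and key names.

```python
def encode_tap_fear(emotion, agency, realism):
    """Encode one design condition using TAP-Fear style dummy variables.

    emotion: "fear" or "happiness"; agency: bool; realism: "high" or "low".
    The 1 = high / 0 = low coding of realism is an assumption.
    """
    if emotion not in ("fear", "happiness"):
        raise ValueError(f"unknown emotion: {emotion!r}")
    if realism not in ("high", "low"):
        raise ValueError(f"unknown realism level: {realism!r}")
    fear_ve = 1 if emotion == "fear" else 0
    agency_ve = 1 if agency else 0
    realism_ve = 1 if realism == "high" else 0
    return {
        "fear_ve": fear_ve,
        "agency_ve": agency_ve,
        "realism_ve": realism_ve,
        # Product terms carry the moderation effects between design variables.
        "realism_x_fear": realism_ve * fear_ve,
        "agency_x_fear": agency_ve * fear_ve,
    }


def encode_pea_emotion(emotion):
    """PEA-style emotion coding: 1 for happiness, -1 for fear."""
    if emotion not in ("fear", "happiness"):
        raise ValueError(f"unknown emotion: {emotion!r}")
    return 1 if emotion == "happiness" else -1


condition = encode_tap_fear("fear", agency=True, realism="high")
print(condition, encode_pea_emotion("fear"))
```

With the 0/1 dummy coding, a product term such as `realism_x_fear` is non-zero only in fear-inducing VEs with high realism, so its coefficient can be read as the additional fear attributable to realism in fear VEs.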
Fear\(_\text{VE}\) predicts perceived Fear (H10), and Agency\(_\text{VE}\) predicts perceived Agency (H13). Additional effects of Fear\(_\text{VE}\) on Fear are moderated by Realism\(_\text{VE}\) (H12C). Additional effects of Agency\(_\text{VE}\) on Agency are moderated by Fear\(_\text{VE}\) (H14). Presence is formed through perceived Fear (H7) and Agency (H8), as well as effects of perceived Agency that are moderated by FoV\(_\text{VE}\) (H9D).
We evaluate the TAP-Fear model by considering several fit measures, as shown in Table 5. We compare TAP-Fear to the PEA model from Jicol et al. [42], as well as three model variations called Fear, TAP-Fear2 and TAP-Fear3, which have fewer variables. Considering these reduced models makes sense because many fit measures reward models that are parsimonious, i.e. are able to describe data accurately while avoiding complexity, which is a desirable model property. Different fit measures penalise the complexity of a model in different ways, so several fit measures should be taken into account when comparing models [83].
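As an illustration of how two common fit measures penalise complexity differently, the sketch below implements the textbook definitions AIC = 2k − 2 ln L and BIC = k ln n − 2 ln L, where k is the number of free parameters, n the number of observations and ln L the maximised log-likelihood. The log-likelihoods and parameter counts are hypothetical and do not correspond to the models in Table 5; SEM software may also report these measures with additional constants.

```python
import math


def aic(log_likelihood, n_params):
    """Akaike information criterion: 2k - 2 ln L (lower is better)."""
    return 2 * n_params - 2 * log_likelihood


def bic(log_likelihood, n_params, n_obs):
    """Bayesian information criterion: k ln n - 2 ln L (lower is better)."""
    return n_params * math.log(n_obs) - 2 * log_likelihood


# Hypothetical models: the larger one fits slightly better but uses more parameters.
N_OBS = 360
small = {"log_likelihood": -1000.0, "n_params": 8}
large = {"log_likelihood": -997.0, "n_params": 12}

for name, model in (("small", small), ("large", large)):
    print(
        name,
        round(aic(model["log_likelihood"], model["n_params"]), 2),
        round(bic(model["log_likelihood"], model["n_params"], N_OBS), 2),
    )
```

In this example both criteria prefer the smaller model, but BIC does so by a wider margin because its per-parameter penalty ln n exceeds the AIC penalty of 2 whenever n is at least 8.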
We first compare the PEA model to the Fear model, which aims to do the same as the PEA model: predict emotions (Intensity for PEA and Fear for Fear), Agency and Presence based on the design variables Fear\(_\text{VE}\), Agency\(_\text{VE}\) and their interaction. The Fear model is the TAP-Fear model shown in Figure 3 without the technical factors, i.e. without the grey boxes at the top. Table 5 shows that Fear is better at predicting emotions (Fear) and then Presence than PEA, as indicated by markedly higher \(R^2\) values. Fear also has a lower RMSEA (lower is better) and higher CFI (higher is better), with RMSEA and CFI measures indicating that PEA has a ‘mediocre’ fit and Fear an ‘adequate’ fit. This suggests that Fear is an improvement on PEA. The AIC and BIC values can only be compared when two models predict the same variables, which is not the case as PEA predicts Intensity where Fear predicts Fear.
If we also included Happiness as a predictor of Presence in the Fear model, then the \(R^2\) value would increase marginally (0.450) while the CFI would drop drastically (0.824) due to increased model complexity, which is an indication that including Happiness would lead to model overfitting.
By including the technical factor Realism\(_\text{VE}\) × Fear\(_\text{VE}\), the TAP-Fear model improves its prediction of Fear, as seen by the higher \(R^2\). However, the prediction of Presence gets slightly worse. The added complexity (more variables) causes some fit measures (RMSEA and CFI) to get worse, while others get better (AIC and BIC, lower is better). All coefficients of Fear and TAP-Fear are significant (p ≤ .023) except for the one for Agency\(_\text{VE}\) on perceived Agency (p = .529), as suggested by the very small standardised coefficient β = −0.04. We therefore removed this effect (the dotted lines at the bottom left in the diagram), resulting in a model variation TAP-Fear2 with a prediction performance similar to TAP-Fear but better parsimony (lower AIC and BIC). We further reduce TAP-Fear2 by removing the smallest effect FoV\(_\text{VE}\) × Agency (β = 0.08, p = .023) shown in dashed lines at the top right. This results in a new model TAP-Fear3 with slightly improved prediction performance and RMSEA and CFI values that put it into the ‘adequate’ to ‘good’ category.
5 Qualitative Analysis
We conducted a thematic analysis on the open-ended participant responses, with the following results. Agency was mentioned as one of the most prominent factors contributing to the participants’ feeling of presence, second only to sound and music. The highest frequency of reporting agency as presence-inducing was in the VEs with agency and high FoV, followed by VEs with high visual realism. Participants in VEs with low FoV mainly mentioned agency as presence-inducing only if the VE induced fear with high visual realism (“The responsiveness to the light despite it being relatively out of view”). Participants found that agency induced presence by making the VE responsive to their actions (“The fact that the virtual environment was responsive to my actions made the experience realistic”, “Interacting with the dog with a ball meant I felt like I was acting in the environment.”). This focused their attention (“the dog made it more interactive and grabbed my attention immediately”). In the fear VEs, agency gave them a purpose of defending themselves (“I quite liked that because there was a perceived threat of the wild dog coming closer if I didn’t shine the flashlight, I had to be on guard which was very immersive”).
Visual realism was another prominent factor reported as inducing presence. In VEs with high FoV, high visual realism and agency, visuals were mentioned most frequently as presence-inducing. The realism of the creature in particular appeared to contribute to presence (“I thought the animation (movement of the dog) was quite realistic and made me feel present”), with unrealistic visuals reducing presence (“sometimes the dog would easily walk through bushes or sort of jump on nothing which reminded me that it was not real.”). Visual realism of the creature was most often reported as presence-inducing in fear-inducing VEs with high visual realism, agency and low FoV. The visual realism of the scenery also contributed to presence (“the environment being very detailed (rocks, pathway, buildings, flowers etc)” and how [...] “the park was built up in a natural and realistic way”).
Visual realism was also a prominent factor contributing to emotion. The most reports of visual realism contributing to emotions were made in fear-inducing VEs with high visual realism and high FoV. Conversely, limitations of the visual realism hampered emotional response, even in VEs with high realism (“The limited graphical quality and animations ruined the fear I felt”, “they weren’t really realistic so I didn’t feel anything during the study”, “the unrealistic graphics made the experience more humorous than engaging”). Participants frequently stated that improved visual realism would increase emotional response (“More realistic textures and movement of the creature would have improved the fearfulness of the experience”, “if it was more realistic, the fear that I experienced would be 100% more intense”).
6 Discussion
In addressing RQ1 about how visual realism and FoV affect VR presence, the results suggest first of all that they do not affect presence directly to any meaningful degree. The large sample size gave the study a high power (99%), that is, a high probability of detecting at least medium effects of visual realism and FoV on VR presence. However, our hypotheses about such effects (H4&5) were not supported. Previous work found direct effects of visual realism [49, 93, 110] and FoV [53, 73, 98], but this could be explained by the fact that effects of realism and FoV appear to moderate the effects of human factors such as fear and agency (see grey boxes at the top of Figure 3). If a study does not consider different levels of induced fear and afforded agency, for example, because the VEs used are of a limited variety, then changes of realism and FoV could appear to affect presence directly. These findings also align with the dual model of presence which was recently proposed by Weber et al. [106]. The dual model postulates that the sensation of “being there” which is ensured by technical qualities of VR is in fact not enough to achieve presence because ultimately the user needs to interpret what they perceive as realistic. This is a direct reference to human factors and their importance to presence. To use an example from our model, VEs that elicit fear and afford agency are more likely perceived as realistic, whereas technical factors only support these human factors (e.g. realism supports fear and FoV supports agency).
The results suggest that visual realism and FoV do indeed affect presence indirectly through moderation. More precisely, visual realism appears to moderate the fear-inducing effects of a VE on presence (RealismVE × FearVE, H12C), which is mediated through the fear that is actually felt (H7). In other words, visual realism makes it easier to induce fear in a VE, which in turn leads to higher presence. This finding is in line with previous work, which showed that users felt more fear in VEs that were more realistic [
38]. This is not congruent, however, with a recent study which aimed to investigate whether the level of realism can elicit higher fear and presence in a height simulation task [
32]. Unlike Hvass et al. [
38], they found that visual realism did not affect the level of perceived fear, but only presence directly. A possible explanation for this lack of an effect on fear is that their participant sample had a fear of heights and thus may have exhibited pathological fear [
22], leading to a ceiling effect. This raises an interesting point about the extent to which visual realism can heighten perceived fear and where the effect might level off. Our findings also validate previous work attempting to systematically investigate the factors contributing to presence, such as the interoceptive attribution model by Diemer et al. [
21]. This model postulates that presence is determined by the immersive features of the medium (i.e. technical qualities) and by the level of arousal felt by users. Our TAP-Fear model substantially enriches this paradigm by adding the effect of agency amongst human factors and describing the exact interactions that occur between human and technical factors.
FoV appears to moderate the effect of perceived agency on presence (H9D). In other words, FoV matters when a user feels in control. This finding was also backed up by qualitative reports from participants, who mentioned agency as an immersive feature more often in conditions where they were afforded agency. The observed effect is in contrast with a recent study [
102], which found that reducing FoV did not impact presence, despite their VE affording users agency. However, Teixeira and Palmisano [
102] used the Oculus Rift CV1 HMD, which has a maximum FoV of just below 90°, and whose FoV was reduced even further to 20% of that during the study. This suggests that the authors have tested the ‘lower half’ of FoV variability, whereas we focused on the ‘upper half’ where agency may be more important. One possible explanation for the interaction effect between FoV and agency in the present study could be that motion from optical flow is primarily detected in the peripheral vision [
105], which was further restricted in the low FoV conditions. In essence, users may have felt less in control of the VE because they had a limited visual window for perceiving their moving laser pointer/flashlight. Our results show that if a user is not afforded agency, a low FoV does not matter.
One of the benefits of our model is that it can give a quantifiable measure of the added benefits brought by technical factors. This is important because in some cases the disadvantages of a technical factor may outweigh its benefits to presence. For example, an increasing body of literature has shown that reducing the FoV can be effective in preventing VR motion sickness [
9,
26,
46]. This practice gathered so much supporting evidence that, a few years ago, it was implemented in some popular VR experiences [
1]. Confusingly, there is also evidence that reducing FoV does not reduce motion sickness [
1]. Designers need to decide whether the increased sense of presence from higher FoV and agency outweighs the potential negative effects of motion sickness. This is an example of how the TAP-Fear model can inform VR design: given the small effect of FoV on presence, it may be better to opt for a reduced FoV in experiences that are known to induce motion sickness. This is especially true if the VE does not afford agency, in which case there would be no added benefit to increased FoV.
We addressed RQ2, showing how the formation of VR presence is based on technical and human factors, by creating the TAP-Fear structural equation model (Figure 3), which appears to describe our data adequately and is able to predict human factors from design and technical factors. The TAP-Fear model allows us to quantify the estimated effects of technical factors on presence: in a VE that is designed to elicit fear, the normalised effect of increased visual realism on presence, mediated through fear, is the product of the normalised coefficients 0.17 × 0.16 ≈ 0.03. Similarly, in a VE that induces fear and affords agency, the normalised effect of increased FoV on presence can be estimated as 0.40 × 0.08 ≈ 0.03. These technical effects are fairly small compared with the estimated effects of human factors on presence: designing a VE that induces fear yields 0.49 × 0.16 ≈ 0.08, and affording agency in a fear-inducing VE yields 0.40 × 0.61 ≈ 0.24.
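The products quoted above can be reproduced with a few lines of Python. This is only an illustrative sketch of the arithmetic: the coefficient pairs are copied from the text, whereas the dictionary labels and the function name are hypothetical and do not come from the model itself.

```python
from math import prod

# Pairs of normalised coefficients exactly as quoted in the text.
# The keys are illustrative labels chosen for this sketch only.
EFFECT_PATHS = {
    "visual realism (fear-inducing VE)": (0.17, 0.16),
    "FoV (fear-inducing VE affording agency)": (0.40, 0.08),
    "designing a fear-inducing VE": (0.49, 0.16),
    "affording agency (fear-inducing VE)": (0.40, 0.61),
}


def estimated_effect(coefficients):
    """Multiply the normalised coefficients along one path of the model."""
    return prod(coefficients)


effects = {label: estimated_effect(c) for label, c in EFFECT_PATHS.items()}

for label, value in effects.items():
    # Two decimal places, matching the precision used in the text.
    print(f"{label}: {value:.2f}")
```

Under these figures, the estimated effect of affording agency is roughly eight times that of either technical factor.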
As expected based on the previous work by Jicol et al. [42] and their PEA model, no such effects were visible in conditions inducing happiness. This demonstrates once again the relevance of fear for VR applications and the formation of user presence. This pattern of results could be due to the evolutionary function of fear rather than to arousal, since both fear and happiness are high-arousal emotions.
Previous research has offered localised snapshots of the interactions between factors such as fear and realism [
38], and fear and agency [
42]. The TAP-Fear model illustrates previously unexplored interactions, such as between FoV and agency, as well as providing a broader overview of how the most prominent human and technical factors come together to form presence. What the model also demonstrates is the necessity to adopt a broader view when investigating individual factors or binary relationships between them. TAP-Fear suggests that VR technology has come a long way since its early days and improvements to technical features may have reached a point of diminishing returns for presence [
94]. Arguably, even the low levels of realism and FoV that we tested are superior to many VR environments from only a decade ago – a feat which was made possible by the exponential increase in processing power of chip technology [
77]. This view is supported by the number of participant comments that remarked on the quality of visuals – comments which did not differ across the two levels of realism. Although novel rendering approaches, optics and displays will certainly continue to improve the technical features of forthcoming VR systems [
27], this does not mean that advances in presence formation are stalled until that time. In fact, the TAP-Fear model brings up new questions about the direction of VR hardware and software. It suggests that, at least in VEs inducing fear, better understanding the user, and how their individual characteristics and feelings shape presence, may be more effective for presence than improving particular hardware characteristics.
6.1 Limitations and Future Work
Despite the significant undertaking of systematically investigating all possible combinations of four factors with two meaningful levels each, the TAP-Fear model only offers a restricted view over the multitude of design factors and levels that are at play in VR experiences. Examples of elements not considered are VEs designed to induce emotions other than fear or happiness, and HMDs with more extreme technical capabilities. While caution needs to be employed when generalising the TAP-Fear model, incorporating other design factors and levels can be addressed in future work because our between-participants experimental design allows for extensions of the model without retesting the conditions employed in this study. This data is made publicly available for other researchers to use and provides the foundation for future models on presence.
Our VEs only depict one scenario in a single environment (park). This is a location that all users will have a level of familiarity with. It has been shown that perceived realism of a VE is determined in part by the extent to which said VE meets users’ expectations; these expectations are in turn grounded in their prior knowledge about the setting depicted in the VE [
90]. Arguably, perceived realism may not have been affected as much if we had used a less-familiar VE. For example, a fantasy world could have been used, which would still have been able in principle to elicit sensations of realism [
30]. Other VEs depicting a variety of settings, real and abstract, should thus be tested in the future.
In this study, we adopted a binary approach to affording agency, mainly to limit the number of levels for our experimental design and create clear design guidelines. However, agency is a continuous variable, and being able to manipulate it continuously would be useful for future work, e.g. to assess individual differences in how affordances of agency are perceived. This is increasingly relevant with the introduction of richer interaction techniques to consumer HMDs, such as controller-less hand tracking and foot trackers [
13,
104]. In addition, our VE did not afford participants any form of locomotion, which was done to control for sensory input. This means that our model may not be entirely applicable to cases where users are able to navigate the VE. One can infer, however, that in such cases the level of perceived agency would be higher, and so could be presence. It is hard to predict whether the same interactions with emotions and FoV would be found with such increased agency.
Based on the technical factors we chose, the applicability of the TAP-Fear model is currently limited to consumer-grade VR hardware. The two FoV levels we used were chosen to maximise the applicability of our findings to popular current consumer-grade VR HMDs, and the lower FoV was chosen to match that of Meta’s Quest 2, which is the most popular HMD. However, we acknowledge that there may be floor or ceiling effects present in our results, and that testing significantly worse or better HMDs may lead to new insights. In future, testing lower FoV may provide interesting insights for the design of low-cost HMDs, although HMD technology is evolving fast. We note that the high FoV we used is lower than that of some special-purpose HMDs, such as the Pimax Vision 8K X, which offers up to 200°. As the presented models are extendable, future work should aim to understand how a wider range of FoV and realism affect presence and the quality of experience, which is especially relevant to more specialised hardware.
Our two VEs are two points on a continuum of visual realism that extends beyond the low and high levels tested here. This was exemplified by the qualitative data where some users remarked that the low graphical quality lowered their fear levels, even in the high realism conditions. The applicability of the TAP-Fear model is thus limited by the parameters it was tested with. Still, the level of graphical fidelity in our study went beyond what is computationally possible on current untethered VR HMDs, as it was designed to make full use of the relatively high-end consumer-grade PC that was used. In addition, these models raise questions about how presence is affected in other technologies, such as MR and AR, which are still lagging behind VR in terms of technical factors and where the concept of presence is different. Furthermore, other technical factors can be incorporated into the models, such as the frame rate of the HMD.
Finally, Tcha-Tokey’s presence measure [101] was chosen for its reliability as well as to allow for validation of and comparison with the PEA model by Jicol et al. [42]. However, this also meant that presence was operationalised as a unidimensional construct, as is very commonly done in the literature. Still, it has been argued that presence can be divided into three separate dimensions, namely spatial, social and self presence [51]. This is relevant to the TAP-Fear model as emotion and agency may disproportionately affect the three sub-components of presence. Thus, further work should refine our model to distinguish between them.
6.2 Impact
Our results clarify the intricate ways in which human and technical factors can interact in the formation of VR presence. From our TAP-Fear model it becomes apparent that designers cannot ignore the influence of human factors when developing VR experiences. More precisely, it is the technical factors that should be adapted according to the specific emotion and level of agency afforded to the user, given the importance of the latter. Ultimately, technical factors need to be optimised due to limitations in computational power and high component cost that still characterise VR HMDs. Our model provides a framework for doing such an optimisation while prioritising the user experience, when designing VEs meant to elicit fear in particular. TAP-Fear can be interpreted as a structured decision tree, whereby the purpose of a VR application determines its properties and those of the HMD delivering it. In cases where the dominant intended emotion is fear, game designers should prioritise the enhancement of visual realism. In such cases, HMDs with an FoV above the 90° threshold should be required only for experiences that afford users agency, such as interactive games or training applications.
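The decision logic described above can be sketched in code. The following is a minimal illustration under stated assumptions, not part of the TAP-Fear model: the function name, the field names and the choice to return no recommendation for emotions other than fear are all hypothetical, and only the two rules themselves are taken from the text.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Recommendation:
    """Hypothetical container for the two design guidelines in the text."""
    prioritise_visual_realism: bool
    require_fov_above_90_degrees: bool


def recommend(dominant_emotion: str, affords_agency: bool) -> Optional[Recommendation]:
    """Return design guidance for fear-inducing VEs, or None otherwise.

    Assumption of this sketch: the guidelines only cover VEs whose dominant
    intended emotion is fear, so no recommendation is made for other emotions.
    """
    if dominant_emotion != "fear":
        return None
    return Recommendation(
        # Fear-inducing VEs: prioritise the enhancement of visual realism.
        prioritise_visual_realism=True,
        # An FoV above 90 degrees is required only when agency is afforded.
        require_fov_above_90_degrees=affords_agency,
    )


print(recommend("fear", affords_agency=True))
print(recommend("fear", affords_agency=False))
```

Returning `None` for other emotions keeps the sketch from implying guidance beyond the conditions covered by these guidelines.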