Open access

Flicker Augmentations: Rapid Brightness Modulation for Real-World Visual Guidance using Augmented Reality

Authors:

Jonathan Sutton,

Tobias Langlotz,

Alexander Plopski,

Kasper Hornbæk

CHI '24: Proceedings of the 2024 CHI Conference on Human Factors in Computing Systems

Article No.: 752, Pages 1 - 19

https://doi.org/10.1145/3613904.3642085

Published: 11 May 2024

Abstract

Providing attention guidance, such as assisting in search tasks, is a prominent use for Augmented Reality. Typically, this is achieved by graphically overlaying geometrical shapes such as arrows. However, providing visual guidance can cause side effects such as attention tunnelling or scene occlusions, and introduce additional visual clutter. Alternatively, visual guidance can adjust saliency but this comes with different challenges such as hardware requirements and environment dependent parameters. In this work we advocate for using flicker as an alternative for real-world guidance using Augmented Reality. We provide evidence for the effectiveness of flicker from two user studies. The first compared flicker against alternative approaches in a highly controlled setting, demonstrating efficacy (N = 28). The second investigated flicker in a practical task, demonstrating feasibility with higher ecological validity (N = 20). Finally, our discussion highlights the opportunities and challenges when using flicker to provide real-world visual guidance using Augmented Reality.

1 Introduction

Visual search is a common task in our daily lives. Efficient search is critical when looking for specific items, assembling or maintaining objects or machinery, following someone’s explanations in tutorials or guided tours, or linking complex data from different sources. However, searching can often be difficult, frustrating, time-consuming, and costly. An essential way to alleviate this is to provide visual guidance and cues.

Augmented Reality (AR) has been repeatedly highlighted as a promising way of providing visual guidance to points in the physical world (real-world guidance). It has been explored for supporting training [55], aiding search [29], and assisting focus [24] and memory [48]. With the continuing miniaturisation and technical improvements of AR, in particular of Optical See-Through Head Mounted Displays (OSTHMDs), it is easy to foresee that future AR devices will be close in their appearance to traditional glasses and become ubiquitous [17]. Thus, future visual guidance can be provided on demand, alongside other AR content, to guide the gaze to areas within our view that might otherwise be overlooked or attended to too slowly. Although this potential for visual guidance is almost undisputed, the actual realisation of effective visual guidance in the real world remains challenging.

Various methods have been explored for achieving visual guidance. Traditionally, these often used geometric overlays such as circles or arrows as they are simple and effective ways of guidance in AR [35, 37, 49, 51]. However, using geometrical overlays has several limitations that must be considered. For example, they can occlude both the real scene and potentially other AR content, contribute to visual clutter, and distract from other relevant information (attention tunnelling). Consequently, other approaches to visual guidance have been investigated to address these limitations. Saliency modulation provides visual guidance while reducing the downside of occluding scene elements [18]. Saliency modulation has been researched for AR [47, 48] and, recently, we have seen the first explorations in AR glasses [45]. However, saliency modulation proved challenging as per-pixel modulation of the scene was needed, requiring customised hardware [45].

Alternatively, flicker, the sensation evoked when intermittent light is presented to the eye [9], has been demonstrated in desktop visualisations to guide users through complex data [51], images [4], or for counting objects in a scene [29]. However, to the best of our knowledge, flicker has not yet been explored as an alternative form of real-world guidance using AR glasses or for practical tasks beyond the screen. Contrary to existing works using flicker on desktop interfaces, real-world guidance introduces challenges such as user motion, different environments and viewing positions, and inconsistent lighting. However, even for general visual interfaces not using AR, there is only limited information on the performance of flicker against alternative guidance techniques and on the performance of flicker in realistic applications.

This work addresses the increasing interest and practical relevance of using AR glasses for providing real-world visual guidance. Specifically, we explore the use of flicker and its potential to overcome the shortcomings of existing approaches for visual guidance. We address two areas to understand the role of flicker for real-world guidance. First, we investigate the relative impact and efficacy of flicker when compared to existing approaches for visual guidance. Existing works compared against geometric techniques for on-screen guidance [51] or for pre-cueing using projectors rather than guidance, and none compared flicker against saliency-based visual guidance. More importantly, they also did not consider the constraints imposed by AR glasses (e.g., relative brightness, additive-only display, different view and focus planes [21]). Second, existing studies are limited in their external validity as they assumed a static position of the user that came with constrained viewing angles and limited task engagement. In this work, we investigate the performance of flicker-based guidance in less-constrained real-world tasks, increasing the external validity of our findings.

In summary, we contribute:

(1) A comparative study running in a controlled environment exploring outlining, saliency modulation, and flicker for real-world guidance using AR glasses.

(2) A study on flicker in a less constrained realistic scenario using AR glasses for visual guidance thereby highlighting the performance in a practical task and adding external validity.

(3) A discussion reflecting on the results and relevance of flicker visual guidance and its practical application.

Our work complements existing works on using flicker as a guidance technique for desktop interfaces by exploring its performance when applied as real-world visual guidance in AR glasses, while also contributing additional insights into its performance when compared to existing guidance techniques.

2 Background

As a general concept, visually guiding attention in AR has many applications, including training, assisting search, assembly tasks, order picking tasks, or reducing distraction and increasing focus. The latter has recently seen much interest from research investigating technical aids for supporting autistic people [6, 53]. In this research, we look specifically at methods that visually draw gaze to specific objects of interest and that can be implemented on common AR glasses to provide real-world visual guidance. In the following, we briefly introduce prior works showing the applications of, and techniques used for, visual guidance on screens before expanding our discussion to real-world visual guidance.

2.1 Screen-based Visual Guidance

An increasing number of tasks rely on the information presented on digital screens, either on computers, mobile phones, or integrated into everyday objects such as cars. This increase in volume and complexity of information has made it more challenging to stay focused or refocus as part of task resumption. Visual guidance techniques, which can direct our attention using visual cues, have been increasingly explored to support human users’ effectiveness in various fields. Screen-based guidance has been used in training for reading mammography scans [44], supporting reading in the presence of interruptions [22], restoring attention [55], and managing attention between multiple screens or devices [23]. Visual guidance techniques have also been applied on the screen of VR head-mounted displays, such as for guidance in 360° media [15], cinematic VR experiences [39], or for following game narrative [10].

Various methods have been demonstrated for screen-based visual guidance. These can be separated into the categories of geometrical cues, saliency cues, and temporal cues, such as flicker, with geometrical cues being the norm. Specifically, guidance in Virtual Reality (VR) has been extensively studied and we refer to recent surveys for more details on particular techniques [38].

Geometrical cues. There are many examples of geometrical cues as they present a simple means of introducing guidance to a scene. The view of a scene can have a shape overlaid or integrated into it with some inherent significance to the user to pay attention or follow the cue. Outlines such as circular and rectangular frames have historically been the most commonly used ones, in particular for drawing gaze (e.g. as in [22, 23, 51]). However, arrows as a means of guidance have been used in more recent works in VR [39, 52, 55], alongside coloured dots [15].

Saliency cues. Saliency is the distinct subjective perceptual quality which makes some items in the world stand out from their immediate surroundings, and adjusting visual saliency has been proposed as a means for guiding attention. There is a large body of work that has researched features of visual saliency that are connected to the processing of different visual stimuli in the human visual system. It is widely agreed that brightness, colour saturation, and contrast are major factors for controlling the visual saliency of scene elements. Modulating these factors allows for changing the saliency of a scene. There are various examples of existing research that demonstrate effective saliency modulation on images [3, 28, 30] and videos [27, 43, 48] that are presented on screens and show how they can successfully draw human gaze. While comparisons are missing, it is often stated that the advantage of visual saliency modulation is that it changes the actual appearance of objects instead of introducing new geometrical features that cause additional clutter and potentially occlude other scene elements [45, 48].

Temporal cues. Temporal effects such as motion and flicker are known to be strong attractors of attention [1]. While not part of what is traditionally considered visual saliency in visual guidance, they are either captured in motion saliency or the wider temporal saliency. Both motion and flicker have been utilised as forms of guidance. The most common implementation of temporal-based guidance has been to create flickering over target areas of interest, such as in the work on Subtle Gaze Direction (SGD) [4, 29] that was developed as a means to draw attention in images. These works demonstrated that flicker can be used to assist in counting items in an image [29] using either a subtle version of the technique or a more overt one where the area of modulation was increased but is not directly observed by the user, and that the subtle modulation increased the likelihood of looking at a region (5x5) of the image [4]. Since then, alternative flicker approaches have also been proposed as subtle methods to provide guidance in visualisation by utilising high frequencies [50]. Work has also investigated applying flicker and SGD in 360° images in VR [15], cinematic VR experiences [39], and general VR [16]. Alternatively, moving geometry such as butterflies leading gaze has been shown in VR [52].

2.2 Augmented Reality Visual Guidance

There are numerous cases in which we can take advantage of visual guidance in off-screen tasks and specifically for guiding the human gaze to points in the physical world. This real-world visual guidance involves the usage of AR interfaces such as projectors (spatial AR) or head-mounted displays (HMDs) (either video see-through or optical see-through). In particular, projectors and optical see-through HMDs have been targeted as video see-through HMDs usually decouple the user from the real world (fully camera mediated), which is unacceptable in most work environments and a potential safety hazard. Specifically, order-picking tasks, such as selecting items in a warehouse setting, have been repeatedly named as important scenarios. Works have demonstrated the potential of visual guidance for assistance during picking [35, 40] and assembly [42], assisting in general search [19, 45], training [53], aiding memory [48], and reducing distractions [7].

Intuitively, one would assume real-world visual guidance to follow the same principles as screen-based visual guidance, but it comes with extra challenges. These mainly arise from the limited control when using projector-based AR or optical see-through HMDs [26] as they cannot fully diminish the physical world (e.g. they usually cannot darken the environment but are constrained to add light). Further issues include the fact that guidance is still constricted to the field of view covered by the used AR device. At the same time, visual clutter and occlusions are even more problematic in real-world visual guidance. Screen-based guidance also benefits from an ability to have prior knowledge of the content the user will view. Therefore, despite many of the techniques having the potential to be used in the real world, the transfer of findings from screen-based visual guidance to real-world visual guidance is limited and significant differences can arise. This has been evidenced in prior research on text legibility [2], using OSTHMD as part of multi-display systems [12] and interaction [32].

Figure 1:

Geometrical cues. In line with the use of AR to introduce virtual artefacts to the world, geometrical cues are the most common form of guidance employed in real-world visual guidance. While they generally translate well to AR, new challenges arise due to requiring accurate placement, and being added to complex scenes perceived from multiple angles and potentially alongside other augmentations. In AR glasses they are placed on different focal depths with constrained contrast [21]. First works on supporting order picking with AR-based interfaces for guidance demonstrated arches, frames, arrows, and tunnel visualisations [35, 40, 41]. The authors already noted that some methods were perceived better because they were less ambiguous (frames or outlines) compared to alternatives (arrows), while some other methods were considered effective but introduced a lot of clutter (tunnels). Later work therefore combined tunnels (when objects are out of view) with frames (when objects are in view) [41]. This difference between in-view and out-of-view guidance was also raised by works that navigate users to specific locations [36, 37]. Since then, geometrical cues have been widely explored in AR [19, 36, 37, 42, 49] but mainly in isolation. So far, the findings indicate that outlines such as frames or circles are the best compromise for guidance when the target is in view, while arrows are often too ambiguous. Clutter and occlusions were problems mentioned in many studies, in particular those that focus on selecting small objects or order-picking, but were mentioned less in tasks that were mainly navigation [37].

Saliency cues. Visual saliency has only recently been explored in AR and head-mounted displays. First works blurred parts of the scene to increase focus on the sharp areas in video see-through HMDs, which provide full control of the scene appearance [24]. We have also seen approaches that aim to change scene saliency via projectors [47] but results are limited to qualitative image comparisons. Only recently, we have seen saliency modulation using OSTHMDs [45]. While the authors show that saliency-based guidance can be effective in attracting our gaze, they also report on the challenges in setting the parameters as different scenes seem to benefit from different settings in saliency modulation, questioning the external validity of saliency-based real-world visual guidance. Further, the OSTHMD required modification to enable saliency modulation on a per-pixel level necessary to achieve the intended effect, and the range of adjustments that could be conducted was constrained [45].

Temporal cues. Temporal cues have seen limited application in real-world guidance. Compared to screen-based cues, real-world guidance must cope not only with unconstrained viewing angles and positions of users, but also with occlusions and uncontrolled motion and lighting from the real world. Geometric cues have been extended with motion on arrows [42] and outlines [36]. While these are temporal cues, using motion with geometrical cues can be seen as an enhancement of standard geometrical cues. The most relevant to our approach are prior works where temporal cues such as blinking were added to labels shown in AR [25, 49] and have been demonstrated as an effective means to increase the notability of virtual artefacts outside central vision [25], or to pre-cue a user where to look next [49] (a similar concept to guidance where information to the next target is conveyed rather than drawing gaze to a target). Similarly, Booth et al. used a projector [5] to introduce a blink to attract the user’s gaze in a controlled environment. However, the actual temporal cue is only vaguely described, with the blink seemingly occurring once (a flash), and its effectiveness in comparison with other cues is not studied.

2.3 Research Gap

Visual guidance has many important applications beyond on-screen guidance. In particular real-world guidance has a large potential but only a few techniques have been explored, with specifically flicker and temporal cues remaining less understood. This is partially because integrating content or modulating the real world is more limited by OSTHMDs which is well covered in the literature [21]. Notable examples are that full light control is not possible, pixel-precise integration is challenging, dynamic range produced by the displays is limited, and glasses have fixed focal planes introducing vergence-accommodation conflict. Finally, a significant unknown from all prior works using flicker to provide guidance is the impact of free positioning and motion of the users. All prior work required the user’s head to be static [4, 29, 50] or controlled the positioning of users within a scene [16, 39]. If flicker can be provided in real-world situations via AR glasses, it has the potential to overcome several of the limitations of alternative techniques. It does not require special hardware, is not reliant on scene content as it only requires a perceivable modulation to be repeatedly made and is less likely to occlude other scene elements. An added benefit of flicker is that it has been shown to be detectable in brain-computer interfaces opening up interaction and gaze confirmation possibilities [13, 46]. However, to the best of our knowledge, a detailed exploration of flicker for real-world visual guidance has not been done.

3 Evaluating Flicker for Visual Guidance in AR Glasses

As a first step towards introducing flicker for use in AR glasses, an understanding of how it compares to current approaches is required. Prior explorations of flicker have focused on the generation of potential techniques and comparisons against other approaches are limited, and particularly lacking against saliency. Further, AR glasses introduce constraints on the manipulations possible due to device capabilities. Therefore, we wanted to evaluate techniques as they can actually be produced in AR glasses; earlier work has either worked directly on screen-based content or VR views.

For the initial evaluation we focused on the relative effect of the different types of cues in their ability to draw gaze. Therefore, we chose to determine the relevance of applying flicker as an alternative approach in a highly controlled initial experiment. Modelling our study after prior works on visual guidance, and in particular flicker [4, 16, 29, 50], we created a controlled study in which users were presented with static views of a scene while ensuring that the views were representative of the guidance provided by AR glasses.

Controlling for confounds, such as the position from which the user views the scene and any variance in this due to motion, enables a high degree of internal validity. This suits the comparison between techniques we wanted to conduct, at the cost of some external validity, since the wider task and user actions will influence effectiveness in practice.

3.1 Study Outline

Conditions: We chose techniques that represent different characteristics within a conceptual design space. We eventually settled on the following techniques: visual guidance using geometrical cues, visual guidance using inherent saliency modulation, and finally, visual guidance using a flicker cue. Figure 1 gives an overview of the techniques implemented for our study and the effect of overtness variance.

For a technique using geometric cues, we chose outlining [35], in particular, the variant using halos/circles [45, 51]. This provides an easily implementable method that has been demonstrated to work in OSTHMDs and is similar to several other methods proposed in the literature [23], including frame effects [40]. This technique also avoids known problems that arise when using arrows [40] while also reducing the occlusions caused by using arrows and dots as geometrical primitives. In our implementation, we utilised a white circle encompassing the target area. To vary the overtness of the geometry, we adjusted the opacity of the circle.

For a technique demonstrating visual guidance using saliency, we used the recent technique by Sutton et al. [45] as it was, to the best of our knowledge, the only saliency-based guidance technique explored for OSTHMDs and Augmented Reality. The technique modulates contrast and saturation, increasing them in a target area while reducing them everywhere else. We adjusted the technique by using the direct outlines of the target areas rather than a blurred circle to reduce the geometrical impact of the technique. The original paper explored the parameter space to adjust the overtness of the saliency modulation. We did not apply a per-component-based optimisation but adjusted the levels of each component uniformly to change the overall overtness.
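The uniform adjustment of contrast and saturation can be illustrated with a minimal per-pixel sketch. It operates on an ordinary RGB image and therefore ignores the additive-only constraint of an OSTHMD. The function name `modulate_pixel`, the cap `max_change`, and the choice of HSV space and mid-grey as the contrast pivot are illustrative assumptions, not details taken from [45].

```python
import colorsys


def modulate_pixel(rgb, in_target, level, max_change=0.5):
    """Scale contrast and saturation of one RGB pixel (channels in 0..1).

    level in [0, 1] is the overall overtness; both components share the same
    factor. max_change is a hypothetical cap on the relative change.
    """
    change = level * max_change
    # Boost inside the target area, attenuate everywhere else.
    factor = 1.0 + change if in_target else 1.0 - change

    # Saturation: scale S in HSV space, leaving hue and value untouched.
    h, s, v = colorsys.rgb_to_hsv(*rgb)
    r, g, b = colorsys.hsv_to_rgb(h, min(1.0, s * factor), v)

    # Contrast: stretch or compress each channel around mid-grey (0.5).
    def contrast(c):
        return min(1.0, max(0.0, 0.5 + (c - 0.5) * factor))

    return (contrast(r), contrast(g), contrast(b))
```

With `level = 0` the pixel is returned unchanged, so the four overtness levels of the study would correspond to four values of `level` under this assumed mapping.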

For a flicker technique, we were inspired by screen-based information visualisations [51]. They are well documented and do not require eye tracking as other approaches used in perceptual studies [4, 29]. Similar to the original implementation for standard displays, our implementation does not rely on achieving high-frequency flicker at critical flicker frequency (CFF) as it usually cannot be achieved with current head-mounted displays, nor is the CFF consistent for all users and viewing environments. Instead, our technique briefly shows a high-frequency flicker before transitioning to a low-frequency low-intensity flicker. Besides implementing it for use in HMDs we modified the technique by adjusting the shape of the luminance adjustment (the area of flicker) to match the shape of the target. To vary the overtness of the techniques we adjusted the time spent at the various flicker frequencies.
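The two-phase behaviour described above (a brief fast burst followed by a slow, low-intensity flicker) can be sketched as a simple time-dependent gain. All numeric defaults, the square-wave shape, and the names `FlickerSchedule` and `apply_flicker` are hypothetical placeholders chosen for illustration; the frequencies and intensities used in the study are not stated in this section.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FlickerSchedule:
    """Two-phase flicker: a brief fast burst, then a slow, low-intensity phase.

    Every default below is an illustrative assumption, not a study parameter.
    """
    fast_hz: float = 10.0          # assumed rate of the initial burst
    slow_hz: float = 2.0           # assumed rate of the sustained phase
    fast_duration_s: float = 0.5   # time in the fast phase (overtness knob)
    fast_gain: float = 1.0         # added luminance in the fast phase (0..1)
    slow_gain: float = 0.3         # added luminance in the slow phase (0..1)

    def gain(self, t: float) -> float:
        """Return the additive luminance gain at t seconds after onset."""
        if t < 0:
            return 0.0
        if t < self.fast_duration_s:
            hz, amplitude = self.fast_hz, self.fast_gain
        else:
            hz, amplitude = self.slow_hz, self.slow_gain
        # Square wave: on for the first half of each period, off otherwise.
        phase = (t * hz) % 1.0
        return amplitude if phase < 0.5 else 0.0


def apply_flicker(scene_luma: float, in_target: bool, gain: float) -> float:
    """Additive display model: light is only added, and output clips at 1.0."""
    if not in_target:
        return scene_luma
    return min(1.0, scene_luma + gain)
```

Under these assumptions, lengthening `fast_duration_s` makes the cue more overt, which mirrors how the text describes varying overtness through the time spent at each flicker frequency.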

Task: The task for this study was to view a series of images. Participants were informed that we record gaze data of people viewing a set of images, some of which had been modified, and encouraged to explore the images. They were also made aware that we would be asking questions about the images afterwards.

Design: We designed a within-subjects study to investigate and compare the effectiveness of guidance techniques at different levels of overtness. We evaluated the effectiveness of the techniques using a set of real images which participants were asked to look at whilst their gaze was recorded. Our independent variable was the method of guidance provided (None, Geometric, Saliency, Flicker). We collected results for each method at four different levels of modulation overtness (25%, 50%, 75%, 100%) to evaluate their relative effectiveness at different levels. Examples of each technique applied at each level can be seen in Figure 1. We looked at the time to first fixation and the area of the image explored as the dependent variables. Subjective overtness noted by the participants was recorded on a seven-point semantically anchored scale.

Figure 2:

Apparatus: As a primary goal of this study was to explore the use of cues in AR glasses, we required users to see the guidance provided as seen through the glasses. The limited luminance range, non-linear gamut, and additive nature of the displays will affect the perceived flicker. Similarly, the saliency modulations producible are constrained to additions to the scene only, and contrast constraints will impact the visibility of outlines. However, collecting reliable gaze data in AR displays is challenging as the quality of eye trackers varies, and data access is limited when compared to eye trackers traditionally used in research. As we faced similar challenges and were interested in reliable results (internal validity), this study used a study apparatus initially proposed by Sutton et al. [45]. The key idea is to capture the view through an OSTHMD with a camera, in our case a Sony A7M3, and present this view in a VR display with an integrated eye tracker (HTC Vive Pro Eye, see Figure 2). The image dataset presented to the participants consisted of 80 images selected to represent a range of real-world scenarios in which visual guidance may be implemented. We included both natural scenes and man-made structures. We split the images into three groups based on their image saliency (High, Medium, and Low) as given by a commonly used saliency predictor [8]. We then assigned each image to a desired level of modulation from 1 (minimal modulation) to 4 (maximum modulation). This resulted in a dataset of 80 images divided into sets of 20 with an even distribution of inherent saliency spread across various real-world scenarios.
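One way to obtain four equally sized sets with an even spread of inherent saliency is a round-robin assignment within the saliency groups. The sketch below is only an assumed procedure for illustration; the section does not say how the assignment was actually made, and `assign_levels` and the image identifiers are hypothetical.

```python
def assign_levels(images_by_saliency, n_levels=4):
    """Assign each image a modulation level 1..n_levels in round-robin order.

    images_by_saliency maps a saliency group name to a list of image ids.
    The counter runs on across groups, so the level sets end up equal in size
    and each group is spread over the levels as evenly as possible.
    """
    assignment = {}
    counter = 0
    for images in images_by_saliency.values():
        for image in images:
            assignment[image] = counter % n_levels + 1
            counter += 1
    return assignment
```

In practice the image lists would be shuffled before assignment so that the level an image receives does not depend on file order.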

Based on the generated saliency map for each image, we also selected one object or area to be modulated. These objects were selected as places that were expected to see little but some attendance by viewers in the unmodulated condition. Areas were selected based on the expected saliency and the objects contained were random. Therefore, top-down processing may have introduced some variance into the degrees of attention applied, particularly under the baseline condition that would reflect natural viewing. All images had higher saliency areas which were expected to initially draw attention in the limited viewing time, and we used a counter-balanced design to mitigate any effects. We chose not to use videos in our dataset as they would introduce additional confounding factors due to motion in videos, which serve as additional salient cues.

Procedure: Participants first signed a consent form and completed a demographic survey (age, gender, if they were colourblind, any other uncorrected visual impairments). Then they put on the headset. We ran the eye tracker calibration, which was verified to be within 1°. If the error exceeded this threshold, the calibration routine was rerun. Once the calibration was verified, the participants were shown a white cross at the centre of the virtual screen and instructed to focus on it. After 3 seconds, the cross was taken away, and the participants were shown an image for 5 seconds. This was followed by showing a black screen with a question regarding the perceived obtrusiveness of the modulation using a seven-point semantically anchored scale with labels of 1: Very Subtle and 7: Very Overt. They were also given option 0: No modulation. After participants answered the question, the cross was shown again. This procedure was repeated for all images in the dataset with each image being modulated by one of the techniques at a given level. The level of modulation being applied to the images was set using double Latin squares to compensate for ordering effects. The image and technique order within each level were randomised. After viewing all images, the participants were given a break and a chance to remove the headset before continuing the study (after calibrating the system again). The participants were then shown one of three unmodified images (either low, medium, or high saliency distribution) and all modified versions of the image at one modulation level and asked for any further comments regarding the techniques. Next, this was repeated for all levels of modulation. This study was approved by the institutional ethics committee.
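Counterbalancing with Latin squares can be sketched as follows. The exact "double Latin square" construction is not specified in the text, so this example substitutes a standard balanced Latin square (Williams design) for the four modulation levels. The helper names and the mapping from participant index to row are assumptions.

```python
def balanced_latin_square(n):
    """Return an n x n balanced Latin square (Williams design) for even n."""
    if n % 2 != 0:
        raise ValueError("this construction needs an even number of conditions")
    # First row interleaves low and high indices: 0, 1, n-1, 2, n-2, ...
    first = [0]
    low, high = 1, n - 1
    while len(first) < n:
        first.append(low)
        low += 1
        if len(first) < n:
            first.append(high)
            high -= 1
    # Every later row shifts the first row by one, modulo n.
    return [[(c + shift) % n for c in first] for shift in range(n)]


LEVELS = ["25%", "50%", "75%", "100%"]


def level_order(participant_index):
    """Pick the row of the square for this participant (hypothetical mapping)."""
    square = balanced_latin_square(len(LEVELS))
    row = square[participant_index % len(LEVELS)]
    return [LEVELS[i] for i in row]
```

In this design every level appears once in each position and each level directly follows every other level exactly once across the four rows. With 28 participants, each row would be used seven times.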

Participants: We recruited 28 participants from around the campus (7 female, 21 male, mean age: 24.5, sd: 6.2). All participants could calibrate the eye-tracker sufficiently and took part in the interview.

3.2 Analysis

Figure 3:

Analysis: For statistically analysing the results, we used a significance level of p < .05. We calculated fixations using the I-VT fixation detection algorithm [31]. To determine if a fixation was within the target area, we tested if any fixation point lay within the area denoted by the outline, allowing for an error of 1 degree. With this we determined if a user fixated (F) within the target area for each image, and, if so, the time to first fixation (TtFF). If the participant did not look at the target area, we set TtFF to the maximum time (3 sec). This assumes that for all images and techniques, the participant would have looked at the target area immediately after the time shown. Whilst this is a false assumption, it is equally applied across all conditions and allows us to run statistical tests on a complete data set. This is a very conservative approach as we would expect actual values for fixations to show greater variance and actual p-values to be smaller than those found. We also analysed the area of exploration (AoE) and the duration of time (D) spent fixated on the target area. Examples of the area of images explored by participants can be seen in Figure 3.
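The fixation detection and TtFF computation can be sketched as below. This is a simplified velocity-threshold (I-VT style) classifier, not the exact implementation of [31]. The 30 deg/s threshold, the sample format, and the circular target model are assumptions made for illustration, whereas the 1 degree tolerance and the 3 second fallback follow the text.

```python
import math


def _summarise(group):
    """Collapse consecutive samples into (start, end, centroid_x, centroid_y)."""
    xs = [s[1] for s in group]
    ys = [s[2] for s in group]
    return (group[0][0], group[-1][0], sum(xs) / len(xs), sum(ys) / len(ys))


def ivt_fixations(samples, velocity_threshold=30.0):
    """Detect fixations from gaze samples given as (t_seconds, x_deg, y_deg).

    Sample-to-sample speeds below the threshold (deg/s, assumed value) are
    treated as fixation; consecutive fixation samples are merged.
    """
    fixations, group = [], []
    for prev, cur in zip(samples, samples[1:]):
        dt = cur[0] - prev[0]
        dist = math.hypot(cur[1] - prev[1], cur[2] - prev[2])
        speed = dist / dt if dt > 0 else float("inf")
        if speed < velocity_threshold:
            if not group:
                group.append(prev)
            group.append(cur)
        elif group:
            fixations.append(_summarise(group))
            group = []
    if group:
        fixations.append(_summarise(group))
    return fixations


def time_to_first_fixation(fixations, target_xy, target_radius,
                           tolerance=1.0, max_time=3.0):
    """Return onset of the first fixation on the target, else max_time.

    The target is modelled as a circle (an assumption; the study used the
    target outline) widened by the 1 degree tolerance.
    """
    for start, _end, cx, cy in fixations:
        if math.hypot(cx - target_xy[0], cy - target_xy[1]) <= target_radius + tolerance:
            return start
    return max_time
```

Trials without any fixation on the target receive `max_time`, which matches the conservative imputation described above.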

We applied Friedman’s test (degrees of freedom: 3) and Wilcoxon’s paired test (degrees of freedom: 27) with Holm-Bonferroni correction for all measures, as the data were non-parametric. In the following, we report the statistical results relevant to our main research goals. Further details are provided in the appendix.
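The Holm-Bonferroni step-down correction applied to the pairwise comparisons can be sketched in a few lines. The function name and return format are illustrative, and the p-values in the usage note are made-up inputs rather than results from the study.

```python
def holm_bonferroni(p_values, alpha=0.05):
    """Return (adjusted_p, reject) lists in the original order of p_values."""
    m = len(p_values)
    # Visit the p-values from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_max = 0.0
    for rank, idx in enumerate(order):
        # The step-down multiplier shrinks from m to 1 as p-values grow.
        candidate = min(1.0, (m - rank) * p_values[idx])
        # Enforce monotonicity so adjusted p-values never decrease.
        running_max = max(running_max, candidate)
        adjusted[idx] = running_max
    reject = [p < alpha for p in adjusted]
    return adjusted, reject
```

For example, `holm_bonferroni([0.01, 0.04, 0.03])` multiplies the smallest value by 3, the next by 2, and the largest by 1, then carries the running maximum forward. Only the first comparison remains below .05. With four conditions there are six pairwise Wilcoxon comparisons per level, so `m` would be 6 if each level is corrected separately.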

Figure 4:

Results. Looking at the TtFF (Figure 4 Left), we can see that above 25% flicker enabled the fastest fixations except when compared to geometric cues at 75%. At 25%, flicker was significantly different from, and slower than, geometric guidance only (p < .0001), which was also significantly faster than none (p < .0001) and saliency (p < .0001). Above 25% we see that flicker provides a significantly faster time to first fixation (all p < .0001), indicating that it was able to provide effective guidance in this regard. In fact, a significant effect for all techniques was found when compared to the baseline condition, indicating an improvement in effective guidance. Looking at the guidance techniques over 25%, compared to saliency-based guidance, we see that flicker is consistently significantly faster to draw gaze (all p < .0001). Compared to geometric guidance, flicker was also able to draw fixations significantly faster at 50% (p = .0022) and 100% (p = .00355), but not at 75% (p = .074). Overall, we see that once the initial high-frequency component was present, flicker proved effective and the fastest of the tested techniques. The mean values and standard deviations for each technique at each level can be seen in Figure 5 (Right).

Looking at the AoE (Figure 4 Right), we can see that there is a similar trend to TtFF with flicker being the most limiting on AoE above 25%. Geometric guidance was the only technique to show a significant effect compared to the baseline None (p = .0048) and subsequently also flicker (p = .0048) and saliency (p = .0048). Again, we see a significant effect for flicker compared to the alternative techniques. In this case, in all instances above 25% flicker was significantly different, leading to an overall decrease in image exploration among participants (all p < .0001). This indicates that once a gaze is drawn to a target by flicker, the amount of continued exploration decreases compared to the alternatives tested.

Figure 5:

Looking at the overtness (Figure 5 Left) of the conditions we see that flicker was considered the most overt in all conditions above 25% (None: all p < .001; Saliency: 50% p = .00024, 75% p = .00075, 100% p = .03435; Geometric: 50% p = .00636, 75% p = .00355) except when compared to Geometric guidance at 100% (p = .12407). This generally shows that flicker provides a very noticeable effect compared to the other conditions.

Figure 6:

We allowed participants to note if they did not perceive any modulation (modulated noted) and tested this. We also tested whether or not participants fixated on the target area based on previous work [45], and the duration of the fixations on the target areas. The results for these can be seen in Figure 6. The notability of modulations was considered significantly different across all conditions and levels of modulation, in line with the differences in subjective overtness. Fixations followed the same pattern of significance as TtFF save for flicker compared to geometric at 100% where neither was significantly better than the other (p = .0726). The duration of fixations followed the same results as AoE, with flicker producing significantly higher fixations over the conditions above 25%.

Key points raised in Interviews. The participants raised several interesting points when discussing the techniques at different modulation levels. First, participants liked the ability of flicker to provide precise guidance without obscuring the shape of a target but found the highest overtness level of flicker too intense. Participants noted the difference between the highest levels of flicker (100%) and the 2nd highest (75%) that caused the flicker to switch from constantly fast to a slower frequency. Second, a third to a half of the participants generally negatively commented on saliency modulation. While participants appreciated the reduced obtrusion of using the saliency to provide guidance, levels increasingly washed out the image, which was perceived as an undesired filter. The effectiveness of the alternatives was given as a reason for saliency not being preferred. Third, geometric outlines were often preferred, although not always considered effective. Initially, it was disliked due to the effect being too subtle, although at higher levels, issues arose with it standing out too much from the image. It was also noted that it was making it difficult to understand what was being highlighted.

Generally, the anecdotal feedback showed a preference for effective and clear techniques to provide guidance. However, this comes with the downside of reducing the overall viewing experience. This led to a dislike for saliency and to concerns about overly overt outlines and extended periods of high-frequency flicker.

3.3 Discussion

One immediate thing to note from our results was that at the 25% level, only the geometric modulation effect had any perceivable effect and was the only technique to affect their gaze patterns. It is also the only modulation technique that was noticed, though the noticeability of modulations even in the None condition was rated at 50% on average indicating a lot of false positives. This might be explained by the fact that even with no modulation (None condition) the participants had an OSTHMD in their optical path which could have caused some effects not perceived as modulation and should be considered when interpreting the other results.

While all techniques except geometric modulation seem to not be effective or even perceivable at low overtness levels, they showed a similar ability to provide effective guidance compared to baseline at all other levels. With respect to the flicker modulation, we can see that in all levels where there is an initial fixation component to the flicker effect (levels 50%, 75% and 100%), it provides the fastest TtFF and greatest F, outperforming traditional geometric cues.

Whilst being effective, flicker showed a higher tendency to cause attention tunnelling when compared to the geometric technique, a concern that needs to be considered when applying it in real-world scenarios. To avoid this, saliency modulation appears to be the best option, minimising the time the target is fixated to that necessary to identify it and directing attention faster and more consistently than the unmodulated condition. However, saliency modulation did not achieve the speed or high chance of fixation that the other techniques achieved. When looking at saliency, we can again see that precise calibration is needed. Without using calibrations of the parameters, lower modulation levels showed no significant effect with either no differences noted or a subtle shift that did not make a clear target stand out, whilst at higher levels the participants found the washing out of colours undesirable. Based on the participants’ comments, tuning saliency techniques towards directly increasing the saliency of the target, with limited reductions in the surrounding environment, may be preferable.

We also found that once a drawing time was introduced into the flicker effect, it was found to be overt and participants thought it could become annoying. Turning off the effect before it can be viewed, as demonstrated in other works [4, 29], is a potential means to alleviate this. However, this may impact the ability of users to confirm the target of guidance, as prior works have assumed a need for subtle effects that are not directly viewed, and only evaluated overtness by varying the size of modulated areas [29].

Figure 7:

One aspect to consider from our results is the need for tailoring algorithms to both user and context. Whilst there is also the confounding variable of interpretation of the question, we can see that geometric guidance was often preferred in the subjective interview. However, this was not always the case and for some participants, it was not even the optimal guidance. We believe that tailoring modulations to the needs of an individual user will be an important step forward in the development of further methods for visual guidance. Furthermore, the need for context-based modulations when applying saliency and geometric cues is apparent. We can clearly see in some images where the generic application of modulations can have little to no effect, for example, those where the target is a light area surrounded by further light areas, and those where the effect is quickly evident, for example targeting a dark area surrounded by further dark areas causing modulations to create a quick transition (Figure 7). This need for context-aware and adaptive highlighting techniques was already reported elsewhere [41] and our results indicate that flicker is another technique to add to the existing repertoire of techniques for guidance that can be applied in-context.

We also see that flicker and geometric outlining techniques both provide an effective means to quickly draw attention to a target area and hold attention there. Notably, flicker appears to be the most effective at this. Saliency was still able to effectively draw attention when compared to no modulation but fixations were generally slower and saliency could not maintain attention as well as the alternatives. This would indicate that saliency techniques may indeed be best left to utilisation in their currently indicated application areas of subtle, less obtrusive, and more scene-preserving methods of visual guidance.

The results from the study indicate the potential of the flicker technique for visual guidance to provide effective forms of visual assistance to AR glasses. However, results also indicate that modulating flicker is needed to prevent it from being overt and to reduce the attention tunnelling seen in geometric cues.

4 AR Guidance Using Flicker in A Practical Task

Our first study showed the relevance of flicker for AR guidance. However, like previous works on screens, it focused on internal validity and did not involve the users completing a practical task representative of real-world use of AR guidance. To extend the ecological validity of our results and the application of flicker to AR guidance, we focused on capturing usage data in a practical item retrieval task. This time, users directly wore the AR glasses and could freely move about the study environment, an as yet untested situation for flicker guidance. Our previous study showed that flicker was effective; however, there seemed to be potential for improvement through better regulation of the flicker, which otherwise can cause discomfort and distraction [20]. Prior works have created subtle forms of flicker that used gaze to prevent the flicker from being viewed directly and have shown this to be an effective means of guidance on screens [29], and we draw on this notion of gaze modulation for use in AR guidance on current OSTHMDs. Therefore, we aimed to explore whether the generally positive performance can be confirmed in a real task and environment while mitigating the few negative impacts through improved modulation of the flicker.

4.1 Study Outline

Conditions: With the focus of this study being on flicker, we considered three different conditions. None, as a baseline condition with no guidance. Constant flicker, as implemented in our initial study. For practicality, we elected to have the flicker run constantly rather than diminish over time. Finally, Gaze-Modulated flicker, a condition that potentially reduces the impact of the flicker effect by responding to the user’s gaze. This was based on the notion of gaze modulation shown on screens [4, 29]. However, for practical realisation in AR glasses it did not include saccade detection, so it will still create an overt flicker that clearly indicates the target. As the focus of our study was on the application of flicker in real-world tasks, we chose to use a baseline of None. While we could have further compared flicker to outlines by using them as a baseline, we wanted to focus participants on the practical use of flicker and gaze modulation, so we avoided conducting a second comparison study. The guidance approaches are conceptualised in Figure 8.
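The three conditions can be sketched as a per-frame decision about whether the flicker overlay is in its bright phase. This is a minimal illustration under stated assumptions, not the study implementation: the function names, the 10° gaze threshold, the 10 Hz square wave, and the rule of suppressing flicker whenever gaze is within the threshold of the target are all hypothetical choices made for the example.

```python
import math


def angle_between_deg(u, v):
    """Angle in degrees between two non-zero 3D direction vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    # Clamp to guard against rounding slightly outside [-1, 1].
    cos_angle = max(-1.0, min(1.0, dot / norm))
    return math.degrees(math.acos(cos_angle))


def flicker_on(mode, gaze_dir, target_dir, t,
               threshold_deg=10.0, freq_hz=10.0):
    """Return True if the overlay should be in its bright phase at time t (s).

    mode: "none", "constant", or "gaze" (gaze-modulated).
    threshold_deg and freq_hz are assumed values, not taken from the paper.
    """
    if mode == "none":
        return False
    if mode == "gaze" and angle_between_deg(gaze_dir, target_dir) < threshold_deg:
        # Assumed rule: suppress the flicker while the user looks at the target.
        return False
    # Square wave: bright during the first half of each flicker period.
    return (t * freq_hz) % 1.0 < 0.5


# Looking straight at the target suppresses gaze-modulated flicker only.
print(flicker_on("gaze", (0, 0, 1), (0, 0, 1), 0.0))
print(flicker_on("constant", (0, 0, 1), (0, 0, 1), 0.0))
```

In this sketch the constant condition ignores gaze entirely, which mirrors the description of the flicker running constantly rather than diminishing over time.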

Figure 8:

Task: The task for the study was a generalisation of a picking task in which participants had to find the correct tool among several shown tools (Figure 9 Left). The task was implemented to investigate the impact of flicker for real-world guidance in a realistic application for AR. Specifically, the participants were shown a set of tools laid out on a table. They stood in front of a monitor on a table opposite the tools (Figure 9 Right). A tool to select was displayed on the monitor. The participants turned, touched the intended tool, then turned back and clicked done. This emulates a worker retrieving tools from a table while working and is a similar scenario to many of those shown in prior research for industrial use cases [33, 54]. The design of the task allows for direct generalisation to tasks in which items need to be retrieved from a different area to the work area, such as mechanics, tutorials, and cooking. In contrast to prior works [4, 16, 29, 41], the tools selected for the task varied in size to provide a more realistic use of guidance.

Design: We designed a within-participants study to investigate the effectiveness of each guidance technique based both on their ability to draw attention and their ability to facilitate the completion of tasks. Our independent variable was the guidance condition (None, Constant flicker, Gaze-modulated flicker). Our dependent variables were eye tracking metrics and timing: time to first fixation, measured from the time the user clicked the start button until the first fixation was detected on the target object, and overall task completion time, measured from the time between clicking begin and clicking done, as a measure of impact on task performance. Whereas prior works, and our prior study, have focused solely on the instantaneous impact on search and generally consider a short time period (about 5 s) of use, search is generally conducted as a sub-task of a wider task and over longer periods of time, as in the scenarios considered by our task. As such, we were interested in the effect on our overall task and measured overall task completion time. We also collected feedback on the guidance methods using Likert-scale questions by adapting questions from the MREQ [34] to our purpose, which asked about the presence and environment integration of any virtual overlays, and complementing this with three more study-specific questions. These asked about the obtrusiveness of the display, the participants’ comfort when finding the tool, and distractions when searching for the tool. We finished the study with five open questions to collect some general feedback. Questions used for the study are included in the appendix.
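The two timing measures defined above can be sketched as follows. This is an illustrative sketch only: the function names, the representation of fixations as `(onset_time, object_id)` pairs, and the object labels are assumptions introduced for the example, not the format of the study's logs.

```python
def time_to_first_fixation(start_click, fixations, target_id):
    """Seconds from the start click to the first fixation on the target.

    fixations: iterable of (onset_time_s, object_id) pairs (assumed format).
    Returns None if the target was never fixated after the start click.
    """
    onsets = [t for t, obj in fixations
              if obj == target_id and t >= start_click]
    return min(onsets) - start_click if onsets else None


def task_completion_time(begin_click, done_click):
    """Seconds between clicking begin and clicking done."""
    return done_click - begin_click


# Hypothetical trial: timestamps in seconds since the session started.
trial_fixations = [(10.5, "hammer"), (11.25, "wrench"), (12.0, "wrench")]
print(time_to_first_fixation(10.0, trial_fixations, "wrench"))
print(task_completion_time(10.0, 18.5))
```

Returning `None` for trials without a target fixation keeps such trials distinguishable from very slow ones, which is a design choice of this sketch rather than something stated in the text.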

During the study, it was apparent that participants tended to take a relaxed approach to returning to the computer and completing the task. We anticipated that this extended task completion would impact the effect size on overall task completion. As we did not record other timings such as tool selection, we decided to also consider time facing the table as a measure of overall time spent engaged with the tools (including reconfirming object and double checking the search target).

Apparatus: We conducted this study with the users directly wearing the AR glasses and conducting the task in their physical space. We utilised a Hololens 2 and a common lab environment with tools placed on a table.

Figure 9:

Hypotheses: We formulated three hypotheses stemming from the results of our prior study:

(1)

As flicker was able to provide effective guidance with a large effect, we anticipate that flicker guidance will improve participants’ performance. This will be seen in both the time to first fixation and task completion time.

(2)

Given that perceivable flicker was generally considered very overt, gaze-modulated flicker will be preferred over constant flicker by users, being more visually comfortable and less obtrusive.

(3)

Constant flicker will be the fastest and most distracting of the conditions, with participants simply looking at the flickering.

Procedure: After reading an information sheet, signing a consent form, and filling out a demographic sheet, participants were introduced to the Hololens 2 and asked to wear it. Once the Hololens 2 was adjusted to their head, we calibrated the eye tracker using the inbuilt calibration. We verified the accuracy of the tracking, and participants were asked to recalibrate if the calibration was not accurate to within 1°. The participants then stood in front of a monitor and the process for finding tools was explained. Participants indicated they understood the task by clicking a ’begin’ button on the monitor. Once started, they saw on the monitor a photo of a tool that they needed to find. The participants then turned to the table and selected the tool shown on the screen by touching it, providing a clear visual cue to the observing researcher that they had found the correct item. They then turned back to the computer and clicked ’found’. Once clicked, the system presented the participants with the Likert-scale questions on the monitor. This was repeated for 12 trials under varying conditions in a counter-balanced order with a unique tool for each trial. After all trials, the participants were asked to answer open questions about the study and gaze guidance.

Participants: For this study, we looked to recruit 24 participants. Due to issues with the eye tracking calibration, we had to collect results from 27 participants, with 3 participants being unable to verify the eye tracker as being calibrated within an acceptable range (1°). We had to further exclude one participant after the study due to errors in the data recording.

4.2 Analysis and Results

Analysis: We analysed the results of our second study in the same fashion as the first, using a significance level of p < .05. We used Friedman’s test (degrees of freedom: 2) with Wilcoxon post hoc tests (degrees of freedom: 22 for Gaze, 19 for Likert scales) as our data violated the assumptions for parametric tests (non-normal distributions according to a Shapiro-Wilk normality test). To analyse time facing the table we considered the time from when the participant’s viewing angle was less than half the maximum offset angle (i.e., facing the monitor) as the table was placed directly opposite the monitor. The analysis of our results is in three sections: Gaze, Likert-scale Questions, Open Questions. ¹
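One possible reading of the time-facing-the-table measure is sketched below. The details are assumptions made for illustration: head yaw is taken as an offset in degrees from the monitor direction, the maximum offset is taken per trial from the recorded samples, and a sample counts as facing the table when its offset exceeds half of that maximum. The function name and sample format are hypothetical.

```python
def time_facing_table(samples):
    """Total seconds in which the head is turned towards the table.

    samples: list of (timestamp_s, yaw_offset_deg) pairs ordered by time,
    where the offset is measured from the monitor direction (assumed format).
    """
    if len(samples) < 2:
        return 0.0
    # Half of the largest observed offset separates monitor from table.
    threshold = max(abs(angle) for _, angle in samples) / 2.0
    total = 0.0
    for (t0, angle0), (t1, _) in zip(samples, samples[1:]):
        # Attribute each interval to the orientation at its start.
        if abs(angle0) > threshold:
            total += t1 - t0
    return total


# Hypothetical trial: turn to the table, stay, and turn back.
trial = [(0.0, 0), (1.0, 100), (2.0, 180), (3.0, 170), (4.0, 20), (5.0, 0)]
print(time_facing_table(trial))
```

With the table placed directly opposite the monitor, the maximum offset in such a trial would be close to 180°, so the threshold lands near 90°.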

Gaze. We found a significant effect in the time to fixation on the target (p = .000191), and there was a significant effect of the unmodulated condition being slower than both the constant flicker (p = .0022) and the gaze-modulated flicker (p = .0043) (See Figure 10 Top Left). There was no statistically significant effect between the flicker conditions (p = .58). We found a significant effect in the amount of time participants spent facing the table (p = .01913). There was a significant decrease in the time spent in the constant flicker condition compared to the unmodulated condition (p = .016) (See Figure 10 Top Center). Other differences were not considered significant (p = .6 for both). We did not find a significant effect in the overall task completion time (p = .1078) (See Figure 10 Top Right).

Figure 10:

Likert Scales. Three participants did not fill in responses for the obtrusive questions so the analysis is based on the remaining participants. For noticing the presence of the virtual overlay we saw a significant effect (p < .0001) with significant differences between all conditions (None rated lower than Constant, p = .00022; None rated lower than Gaze-modulated, p = .00022; Gaze-modulated rated lower than Constant, p = .00753). Integration of the virtual overlay with the environment again showed a significant effect (p = .006065) with the unmodulated condition significantly lower than the gaze-modulated condition. We saw significant differences between all conditions regarding obtrusion, with Friedman’s test showing a significant effect (p < .0001) and None less obtrusive than constant flicker (p = .00094) and gaze-modulated flicker (p = .00218). Between the two flicker conditions, gaze-modulated was significantly less obtrusive than constant (p = .00218). Considering the comfort of users, they reported a significant effect (p = .03422) with the gaze-modulated condition being significantly better than both None (p = .024) and Constant (p = .010). Finally, there was no evidence to support a difference in the distraction of the participants (p = .8984). See Figure 10 Bottom for the Likert-scale results.

Open Questions. When looking at the results for the final questions we did not conduct formal analysis as these did not directly answer our hypotheses, rather looked to gather further insights that might be indicative of results or of relevance to future work. Overall, they indicate a clear benefit to speed/perceived ability to find the objects and a like for the flickering. However, a couple do mention the annoyance and there is a clear theme of looking for guidance or just relying on the guidance without paying attention to the task or learning where the tools were.

(1)

What did they see: Almost all the participants noted flashing or flickering lights (flickering was given in the study description), those who did not noted highlighting. Only a few participants mentioned "varied" highlighting or that it was turned off sometimes. While participants said that they found it comfortable as a guide, or said that it was not distracting, some noted that it was both uncomfortable and distracting. Participants mentioned in their response to this question that they felt either "much quicker" and "even quicker than I could otherwise" when finding tools. They also noted the assistance for differentiating similar tools.

(2)

What they thought of the guidance: Several participants mentioned the guidance being ’very helpful’ (11 mentioned helpful) with others saying that they thought the guidance was good. However, a couple noted that the flicker obscures the target object. Two mentioned that they missed the guidance or felt dumb because they couldn’t remember where anything was when in the baseline condition. Several participants mentioned feeling quicker to find the objects and feeling more confident. The flicker was considered annoying when unanticipated but helpful when anticipated, with the gaze-modulated flicker being less obtrusive and preferred.

(3)

Perceived impact on ability and search approach: Most participants mentioned that providing guidance increased their speed and made finding the tools easier. However, they also discussed that they didn’t have to search anymore or just followed the light, some not even visually confirming that they had selected the correct object. They also mentioned missing it when it was gone and searching for guidance rather than the object. One participant mentioned disliking the need to confirm they had the right object once the flickering stopped.

(4)

Use in other tasks: Participants mentioned finding things, repetitive tasks, and reducing stress as use cases for flicker guidance. They thought it would be great for finding keys/lost items as a general use case. They specifically mentioned it would be good for finding tools and items when building or cooking. Another practical task is transferring location knowledge between people. The use in learning was also put forward under the condition that reliance on the guidance is considered. A few participants pointed to the positive effect of reducing frustration and tedium in stressful situations, and it was also noted that it should be applied only if there is a clear correct selection or an un-nuanced task.

(5)

Impact on task performance and focus: Most participants noted how they were less focused or paid less attention to the surrounding items when the guidance was present. Further, looking for guidance and ignoring the tools themselves was a common point raised.

4.3 Discussion

Looking at our results for H1 we were able to partially support this hypothesis. Participants looked at target objects significantly faster when guidance was present, however, the overall task completion time was not affected to a degree that shows statistical evidence. Therefore, while we can support the notion that guidance could effectively guide the user to their target quicker, we cannot support the notion that this will allow for increased task completion speed. On average, we saw a reduction in search time of 0.89 seconds for gaze-modulated guidance and a reduction of 1.49 seconds for constant flicker. From the participant responses, we saw this was supported with many noting that they subjectively felt faster during the task (12) and they were able to easily find the tool. Looking at the task completion time, we point to the simplicity of the overall search task used in the study and the size of the improvement provided in relation to the overall time as a potential cause for these surprising results.

We were able to confirm our second hypothesis H2. We saw a significant reduction in the obtrusion and an increase in the comfort between constant flicker and gaze-modulated flicker. We saw a significant impact on both techniques when compared to a baseline of no effect for obtrusion, in line with our previous study. This confirms the assumption of prior works that gaze modulation creates a less overt effect [4, 29] and further indicates that entirely avoiding viewing of the flicker is not needed to reduce its impact. However, it also shows that perceiving flicker to this degree is still considered overt. It is important to consider that in prior works adjusting overtness, larger targets were used to create an overt effect [29], whereas we did not tailor the size of modulations but rather highlighted the entire tool. Therefore, size may have had an effect on the perception of overtness, alongside viewing of the flicker. We did not see a significant impact on the comfort of users between the constant flicker and the unguided condition. Participants’ subjective responses discussed the impact of flicker on comfort when completing the task, noting the guidance as being very helpful and increasing confidence, although it was also noted as a bit excessive and the gaze-modulated guidance was preferred as being less intrusive. One participant pertinently noted that they found the guidance annoying until they anticipated it being there, and then it was useful. They also discussed the presence of the flicker, noting that they searched for the guidance rather than the target item and so were hampered when it was absent.

An interesting result was that comfort was not considered to be significantly reduced in either flicker condition and was in fact increased in the gaze-modulated condition. This is contrary to our expectations resulting from our initial study, as flicker is known to create visual noise and impact comfort [20]. Given that we introduced a task in which participants had to ’achieve’ something, we believe that the increased ease in searching increased participants’ comfort when completing the task while also offsetting the increases in visual discomfort. While there was no overall difference in the visual comfort, numerous participants did comment on the impact of the flicker.

We were not able to confirm our third hypothesis H3, as there was no significant difference in task completion times between the flicker styles. We also did not see a significant difference in the distraction when using the constant flicker. We did see a trend towards decreased completion time and increased distraction. Therefore, further testing may show an advantage to using constant flicker or gaze-modulated flicker in terms of these metrics. The subjective responses are indicative of this, with participants referring to the annoyance of the flicker and the gaze-modulated condition was considered less obtrusive and did not obscure the target.

The participants also mentioned side effects of using the guidance that are of relevance when considering their use in real tasks. Providing guidance subjectively reduced their awareness of the spatial layout, and they paid less attention to the task. Removing the guidance in the gaze-modulated flicker condition would remove the issue of not confirming they have touched the correct item, however, would not remove any issues with attention tunnelling and spatial awareness. The existence of these issues is indicated in the prior literature [11, 45] although we did not verify this here.

5 Discussion and Conclusion

Visual guidance has been demonstrated to provide assistance in numerous application scenarios ranging from on-screen guidance to real-world visual guidance such as order picking. Prior works on AR guidance had primarily focused on the use of geometric cues for guidance and only recently had saliency modulation been considered. At the same time, flicker had only been explored for on-screen-based guidance, where the location of targets is constrained, views are known ahead of time, and complete control of the content seen by users is possible. To explore the potential of flicker for real-world guidance, it is critical to understand, first, how flicker compares in AR glasses to existing techniques, and second, the impact of applying flicker to real-world situations. In this paper, we addressed the limitations of current approaches and presented investigations into addressing these requirements via two user studies. The first study validated the use of flicker in a controlled setting compared to current alternatives; the second validated its use in a real-world task.

5.1 Discussion of Results

Overall, both studies showed that flicker was effective in guidance and should in the future be considered as an option when realising visual guidance using AR. It provided benefits over alternatives and we showed that it can readily be applied in practical tasks using current OSTHMDs. Specifically, when comparing flicker against outlines and saliency modulation, flicker was more effective at drawing and holding gaze, while also being considered the most overt approach. We also showed improvements in practical visual search tasks, specifically in the actual visual search, even though the differences did not carry over to general task performance. Even though participants noted the obtrusion introduced by flicker, they still preferred it and the overall modulation. Introducing gaze-based modulation of the technique showed the potential to further improve comfort while maintaining benefits.

Based on our results, we pose the following recommendations for using flicker in AR: First, flicker is a fast method for drawing gaze and was highly effective at this. In fact, it emerges that flicker is best used when an immediate response or the best possible guarantee of drawing gaze is needed. Examples include emergencies and ensuring that dangers are recognised, and in fact, we already see blinking being used for this purpose in practice (e.g. lights on emergency vehicles). In contrast, if a constant reference to an object is needed then using geometric cues such as outlines may be preferable, and saliency-based techniques are better suited to subtle support of gaze as opposed to direct guidance.

Second, we argue that flicker-based guidance, and in particular its perceived comfort, can be improved by modulating it with the user’s gaze. This was already indicated in prior works on using flicker for desktop-based guidance [4, 29, 51] and seems to also hold true when users can move freely such as in AR. While not specifically explored, gaze-modulated flicker also has the potential to avoid occlusion issues when compared to geometric cues. Unlike prior work, our flicker was more noticeable (because of the constraints imposed by the OSTHMD, such as display frequency) but it seems that a more subtle effect is not necessarily needed for acceptable comfort. In fact, a need for clear guidance was raised by participants in both studies and flicker modulation was preferred. Therefore, when applying flicker, modulation should be used; however, the flicker should not completely disappear, to allow for target verification.

Third, prior works have focused on the task performance of flicker in artificial lab tasks such as increased attention to image areas [4], counting bubbles [29], and more recently quicker search times [16]. Similarly, our results showed clearer benefits in the controlled setting without a practical task, while in a practical task, measurable benefits to the search component of the task were not sufficient to impact the overall task. Future studies and applications of flicker-based guidance should therefore consider aspects beyond search times, such as comfort, task focus, or attention tunnelling, while acknowledging that the importance of these factors depends on the actual task.

Finally, flicker can have adverse effects due to visual noise and photo-sensitivity (e.g., epilepsy) with frequencies 15-20Hz over more than 10% of the field of view being of particular concern and 3-60Hz being of general note [14, 20]. While this is generally a very small part of the population and affected people almost always know of their sensitivity, the effects can be severe and therefore flicker may not be applicable for all users. That said, in our envisioned use, only the wearer of the device would be affected while earlier works on flicker visualisation on desktops or projectors have a higher chance of also affecting bystanders. Note that we clearly expressed the presence of flicker in the studies, and participants who noted adverse effects to flickering lights were excluded from participating.

5.2 Limitations and Future Work

One limitation of our studies was that we chose to use unoptimised forms of guidance techniques. All the forms of guidance have various parameters that could be adjusted and tuned to improve performance, particularly for specific users and contexts of use, as shown for flicker [4] and saliency [45]. Therefore optimisation may lead to different results, with the potential for all techniques to perform better and differences to vary. However, this reinforces the validity of the use of flicker as it was able to be applied effectively without consideration for optimisation and could still be optimised further, although any optimisations for effectiveness and overtness will be scene and user-specific [45, 50].

A first step to improving flicker-based guidance would be to further investigate the use of SGD to modulate the intensity [29]. Rather than using a base level of flicker and disabling it once the user was looking towards the target, this would involve tuning the intensity of the guidance and determining when the user is about to look at a target area to modulate it, while still allowing for clear acquisition of the target. In real-world applications, issues such as smooth pursuits, non-static targets, and close target proximity will need to be considered. An alternative extension would be to use EEG or similar signals to identify when a flickering target has been seen and modulate the flicker accordingly, which would remove the need for calibrated eye-tracking [13, 46].
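One way to picture the gaze-contingent modulation described above is a ramp that lowers flicker intensity as the gaze direction approaches the target. The following is a hypothetical sketch, not the method used in our studies: the function names, the linear ramp, and the default angles `inner_deg` and `outer_deg` are all assumptions chosen for illustration.

```python
import math


def angular_distance_deg(gaze, target):
    """Angle in degrees between two 3D direction vectors."""
    dot = sum(g * t for g, t in zip(gaze, target))
    norm = math.sqrt(sum(g * g for g in gaze)) * math.sqrt(sum(t * t for t in target))
    if norm == 0:
        raise ValueError("direction vectors must be non-zero")
    # Clamp to [-1, 1] to guard against floating-point overshoot.
    cosine = max(-1.0, min(1.0, dot / norm))
    return math.degrees(math.acos(cosine))


def flicker_intensity(angle_deg, inner_deg=5.0, outer_deg=15.0, base=1.0):
    """Scale flicker intensity by how far the gaze is from the target.

    At or beyond outer_deg the flicker runs at the base intensity, at or
    within inner_deg it is switched off, and in between it falls linearly.
    The default angles are placeholders, not tuned values.
    """
    if inner_deg >= outer_deg:
        raise ValueError("inner_deg must be smaller than outer_deg")
    if angle_deg <= inner_deg:
        return 0.0
    if angle_deg >= outer_deg:
        return base
    return base * (angle_deg - inner_deg) / (outer_deg - inner_deg)
```

With these placeholder defaults, a gaze 10 degrees from the target would receive half of the base intensity. A practical system would additionally need to handle the smooth pursuits, moving targets, and closely spaced targets noted above, which this sketch ignores.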

Similarly, future work could develop methods of guidance that incorporate multiple guidance approaches to draw on the advantages of each. Our studies showed the advantages of flicker in drawing attention and that negative impacts could be mitigated when applied in a practical task. Given that outlines were preferred to flicker as a method of guidance but are slower and occlude scene elements, future work could integrate geometric cues, such as outlines, with flicker to improve both the performance of and preference for guidance. Consideration of how to exploit the positive aspects while minimising the negative aspects is needed.

Another limitation of our work, common to most guidance studies, is the duration over which observations were gathered. Both our studies used shorter tasks. The first was based on prior works that consider initial views of images and looked only at effects within five seconds. Over longer durations, e.g., a day of work, the impacts on user perception may vary, with impacts from modulating the guidance potentially becoming more pronounced. The second study used a simple item retrieval task, and users were not conducting other tasks in between or learning the layout of items. While this was a longer task and showed wider use, a long-term study would be able to make observations about some of the comments raised by our participants around the impacts on paying attention to the task at hand and reliance on guidance. Of particular interest are the impact of known environments on the performance of guidance, the impact on learning of locations, and use when the location of objects can vary between or during use.

Finally, given that our work focused on introducing flicker in the users’ view to guide visual attention, one area of future work is directing users to out-of-view objects. While it is feasible in search tasks to know the area in which to look or to scan a room, guiding a user’s heading and position are areas of actively ongoing research that complement our own.

5.3 Conclusion

Overall, we demonstrate the practical relevance of using flicker in AR glasses to provide visual guidance and assist with search tasks. The results from our two studies both point to the advantages of employing this style of guidance and its ready application in current devices while also reinforcing the need to modulate guidance to prevent discomfort or scene occlusion. We believe our research is relevant to advancing the use of visual guidance in real-world situations and overcoming the limitations of current applications.

Author Contributions

Jonathan Sutton performed writing - original draft, conceptualisation, investigation, software, and formal analysis; Tobias Langlotz performed writing - review & editing, conceptualisation, supervision, project administration, and funding acquisition; Alexander Plopski performed writing - review & editing and investigation; Kasper Hornbæk performed writing - review & editing.

Acknowledgments

The authors thank Felix Schrimper for aid in conducting user studies. This work was supported by the Marsden Fund Council (grants MFP-UOO2124 and UOO1834) administered by the Royal Society of New Zealand and by the Villum Fonden (grant VIL-50108).

A Extended Results from Study 1

The following tables summarise the statistical p-values from the tests run on the data gathered during the initial study.

Fixation	Friedman:	2.71e-09		Time	Friedman:	5.306e-10
	None	Flicker	Saliency		None	Flicker	Saliency
Flicker	0.81258	-	-	Flicker	0.69	-	-
Saliency	0.35257	0.81258	-	Saliency	0.40	0.94	-
Geometrical	7.2e-05	0.00018	0.00018	Geometrical	7.5e-08	3.0e-07	4.5e-08

Duration	Friedman:			Area	Friedman:
	None	Flicker	Saliency		None	Flicker	Saliency
Flicker	0.89	-	-	Flicker	1.0000	-	-
Saliency	0.12	0.89	-	Saliency	1.0000	1.0000	-
Geometrical	4.5e-08	4.5e-08	6.0e-08	Geometrical	0.0176	0.0048	0.0048

Noted	Friedman:	8.039e-07		Overtness	Friedman:	0.009654
	None	Flicker	Saliency		None	Flicker	Saliency
Flicker	1.00000	-	-	Flicker	1.0000	-	-
Saliency	1.00000	0.48443	-	Saliency	1.0000	1.0000	-
Geometrical	0.00196	0.00053	0.00067	Geometrical	0.0023	0.0203	0.0048

Table 1: 25%

Fixation	Friedman:	3.326e-15		Time	Friedman:	2.42e-15
	None	Flicker	Saliency		None	Flicker	Saliency
Flicker	2.5e-05	-	-	Flicker	4.5e-08	-	-
Saliency	7.2e-05	3.0e-05	-	Saliency	2.6e-05	4.5e-08	-
Geometrical	2.2e-05	0.083	3.0e-05	Geometrical	4.5e-08	0.0022	5.6e-07

Duration	Friedman:	2.345e-16		Area	Friedman:	8.873e-10
	None	Flicker	Saliency		None	Flicker	Saliency
Flicker	4.5e-08	-	-	Flicker	6.3e-07	-	-
Saliency	2.6e-05	4.5e-08	-	Saliency	0.0232	4.1e-06	-
Geometrical	4.5e-08	0.00079	4.5e-08	Geometrical	4.4e-05	7.1e-05	0.0013

Noted	Friedman:	1.606e-09		Overtness	Friedman:	3.536e-06
	None	Flicker	Saliency		None	Flicker	Saliency
Flicker	0.00052	-	-	Flicker	0.00044	-	-
Saliency	0.00549	0.00163	-	Saliency	0.00760	0.00024	-
Geometrical	0.00101	0.00444	0.00267	Geometrical	0.00065	0.00024	0.00636

Table 2: 50%

Fixation	Friedman:	9.83e-13		Time	Friedman:	2.978e-12
	None	Flicker	Saliency		None	Flicker	Saliency
Flicker	2.1e-05	-	-	Flicker	4.5e-08	-	-
Saliency	4.6e-05	0.00046	-	Saliency	2.1e-06	3.1e-06	-
Geometrical	3.9e-05	0.59098	0.00055	Geometrical	4.5e-08	0.074	6.7e-06

Duration	Friedman:	3.706e-14		Area	Friedman:	7.989e-10
	None	Flicker	Saliency		None	Flicker	Saliency
Flicker	4.5e-08	-	-	Flicker	1.3e-07	-	-
Saliency	3.0e-07	3.1e-07	-	Saliency	3.5e-05	5.0e-06	-
Geometrical	4.5e-08	0.00042	5.5e-05	Geometrical	2.0e-05	3.3e-06	0.019

Noted	Friedman:	2.608e-10		Overtness	Friedman:	3.409e-07
	None	Flicker	Saliency		None	Flicker	Saliency
Flicker	0.00036	-	-	Flicker	0.00049	-	-
Saliency	0.00138	0.00617	-	Saliency	0.00355	0.00075	-
Geometrical	0.00036	0.07260	0.00866	Geometrical	0.00034	0.00355	0.00355

Table 3: 75%

Fixation	Friedman:	8.175e-13		Time	Friedman:	4.661e-13
	None	Flicker	Saliency		None	Flicker	Saliency
Flicker	2.3e-05	-	-	Flicker	7.5e-08	-	-
Saliency	5.0e-05	0.00049	-	Saliency	7.5e-08	5.7e-06	-
Geometrical	2.3e-05	0.07625	0.00499	Geometrical	4.5e-08	0.00170	0.00068

Duration	Friedman:	3.001e-14		Area	Friedman:	8.873e-10
	None	Flicker	Saliency		None	Flicker	Saliency
Flicker	6.0e-08	-	-	Flicker	6.3e-07	-	-
Saliency	4.5e-08	2.2e-07	-	Saliency	0.0232	4.1e-06	-
Geometrical	4.5e-08	4.1e-05	4.8e-05	Geometrical	4.4e-05	7.1e-05	0.0013

Noted	Friedman:	1.362e-08		Overtness	Friedman:	1.357e-07
	None	Flicker	Saliency		None	Flicker	Saliency
Flicker	0.00062	-	-	Flicker	0.00056	-	-
Saliency	0.00055	0.25717	-	Saliency	0.00056	0.03435	-
Geometrical	0.00062	0.25717	0.32096	Geometrical	0.00034	0.12407	0.03435

Table 4: 100%
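For readers who wish to run a comparable analysis on their own data, the sketch below computes a Friedman chi-square statistic across conditions and adjusts a set of pairwise p-values. It makes several assumptions: the tables above do not state which post hoc test or correction produced the pairwise values, so Holm's step-down adjustment is used purely as an illustration, the function names are our own, and no tie correction is applied to the Friedman statistic. In practice, a library routine such as `scipy.stats.friedmanchisquare` would typically be used to obtain both the statistic and its p-value.

```python
def average_ranks(values):
    """Rank values from 1 to k, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for pos in range(i, j + 1):
            ranks[order[pos]] = avg
        i = j + 1
    return ranks


def friedman_statistic(rows):
    """Friedman chi-square for rows = participants, columns = conditions.

    Ranks are computed within each participant. No tie correction is
    applied, so results can differ from library routines when ties occur.
    """
    n = len(rows)
    if n == 0:
        raise ValueError("at least one participant is required")
    k = len(rows[0])
    if k < 2 or any(len(row) != k for row in rows):
        raise ValueError("every row needs the same number (>= 2) of conditions")
    rank_sums = [0.0] * k
    for row in rows:
        for col, rank in enumerate(average_ranks(row)):
            rank_sums[col] += rank
    sum_of_squares = sum(s * s for s in rank_sums)
    return 12.0 * sum_of_squares / (n * k * (k + 1)) - 3.0 * n * (k + 1)


def holm_adjust(p_values):
    """Holm step-down adjustment, returned in the original order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_max = 0.0
    for rank, idx in enumerate(order):
        candidate = min(1.0, (m - rank) * p_values[idx])
        running_max = max(running_max, candidate)  # keep adjusted values monotone
        adjusted[idx] = running_max
    return adjusted
```

Here `rows` would hold one list per participant with one measurement per guidance condition (None, Flicker, Saliency, Geometrical), and `holm_adjust` would receive the raw p-values of the six pairwise comparisons.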

B Questions Used in Study 2

Likert-like scales used for per-trial responses. All scales were 7-point, ranging from strongly agree to strongly disagree.

• There was a virtual overlay
• The virtual overlay belonged to the real environment
• The overlay was obtrusive
• Finding the tool was visually comfortable
• I was distracted when trying to find the tool

Open questions asked after the study.

• Throughout the study different forms of visual guidance were applied. What, if anything, did you perceive?
• How did you find the forms of guidance? What did you think of them?
• How did the guidance impact your ability to find the target tool? Did it affect how you searched?
• Would you consider using any of the forms of guidance to assist in actual tasks? Why, and for which tasks, or why not?
• How did the guidance forms impact your ability to perform the task at hand and focus on the tools on the table?

Footnote

We noted that one participant adjusted the brightness of the Hololens 2 when putting it on. This may reduce the effectiveness of the flicker and potentially comfort. We did not tune the flickering luminance or see any changes in the results once this was corrected. As such, we consider this a confound for the within-subject design, alongside individual susceptibility to lights and flicker.

Supplemental Material

MP4 File - Video Presentation

References

[1] Richard A Abrams and Shawn E Christ. 2003. Motion onset captures attention. Psychological Science 14, 5 (2003), 427–432.
