1. Introduction
In the future world of work, employees will be empowered by assistance systems to expand their skills. Immersive systems promise to give smart operators the right information in the right place at the right time. This enables them to carry out the tasks assigned to them in less time, with better quality and higher output [1]. One concrete example is a measurement application using augmented reality, which allows users to interact virtually with the real world around them and to annotate it. Such a system can geodetically map the real environment in real time [2], so that real dimensions and references can be augmented virtually. It can support employees especially when measurement points are difficult to access or out of reach for operators. These conditions could favour immersive systems over conventional measurement systems, which are limited by their inherent measurement principles.
However, the productivity paradox described by Schweikl and Obermaier [3] shows that the potential of modern information and communication technology (ICT) applied in assistance systems is not fully exploited. The paradox states that modern automation and digitalisation are expected to deliver higher overall productive performance than is currently measured. Due to fundamentally new interaction paradigms and the interplay of virtual assets within the real world, users face major obstacles in understanding the logic of immersive systems. Especially for beginners, error rates are very high due to the participation gap. This masks the added value of the immersive features along the user journey. This dilemma poses major challenges for both developers and users. The question from our research association is whether the system design or the training design has a greater influence on user experience. According to Schweikl and Obermaier [3], the reasons for this paradox are multi-layered, including exaggerated expectations, adjustment delays, and misjudgements of the added value. However, a key aspect is the mismanagement of these technologies and the lack of complementary preconditions: operators lack the skill level needed to use the systems effectively. According to [3], insufficient resources are invested in the necessary training. This training error, as outlined in the AR failure taxonomy [4], contributes to adverse effects on performance (as discussed by Bahaei et al. [4] and Simatupang and Saroyeni [5]) and acceptance (indirect correlation as discussed by Giovanni Mariani et al. [6] and Marshall et al. [7]). It also contributes to technostress [8, 9, 10], which negatively impacts overall productivity. In general, immersive technologies promise significant potential for education and training in the industrial environment [
11, 12, 13], such as speeding up the reconfiguration of production lines, supporting shop-floor operations, or virtual training for the assembly of parts. Initial concepts have also been explored for training the use and handling of immersive systems in university teaching [14]. The critical literature search revealed that no contribution deals with training scenarios, evaluation methods, or guidelines specifically for learning the usage of immersive systems in an industrial context. However, there is a demand for research (see [7, 15, 16, 17, 18,
19]) on how immersive assistance systems can be introduced to user groups effectively and efficiently in order to avoid negative effects. Indeed, there is a lack of applicable design guidelines and recommendations regarding training scenarios for specialised, professional use. In our research association, a research prototype of an industrial AR application was developed for layout planning, and its effects on long-term use were investigated. The first challenge was to tailor the application design towards the industrial requirements. The aim was to explore effective and efficient AR training scenarios to pilot tangible software prototypes under real conditions. In preliminary research for this study, an optimised training scenario was developed, the results of which were incorporated into the current research design as the factor level of an independent variable. Motivated by the prior experience of the research association, the study we present here aims to investigate the effects of the quality of training on performance, acceptance, and stress. This study contributes to the state of scientific knowledge in the following way: in contrast to the related work, this article addresses training scenarios in an industrial context. Additionally, unlike theoretical approaches, practical implications can be drawn from this work, e.g., how the right mix of training strategy (emphasising quality over quantity or intensity) delivers effective training success. Furthermore, a replicable experimental design (with penalty times) for measuring the efficacy of training scenarios is presented. In addition, the ‘performance’ index is introduced as a new metric (accounting for the time and correctness of a task). In summary, tangible design guidelines for training scenarios towards immersive assistance systems are presented. This topic may also be meaningful in the context of the industrial metaverse.
2. Related Work
Relevant findings have been researched and embedded in the context of ICT and immersive assistance systems.
Despite substantial investments in information and communication technologies, the anticipated improvements in productivity, as measured by output per worker or output per unit of capital, have not always been realised. According to a recent meta-review [3] of its effects, causes, and socio-technical contexts, the productivity paradox also applies to immersive assistance systems. Research from the last three decades suggests that ‘adjustment delays, measurement issues, exaggerated expectations and mismanagement’ [3] are root causes of the recent deceleration in productivity growth. A critical examination of mismanagement highlights that insufficient introduction of ICT and missing organisational change can lead to substantial productivity loss. In addition, facilitating organisational change to fundamentally transform business processes is a necessary action for management to undertake [3].
Preliminary work identifies the following training needs for increasing productivity. Workers should continuously develop their skills, and companies should invest in training and development programs to ensure that their workforce has the skills needed to use new technologies effectively [3]. There is a demand for training and education programs that equip workers with the skills needed to use immersive technology correctly [3], thus allowing them to succeed in the changing workplace. Zoetbrood [20], in contrast, observed that training does not have a significant impact on productivity, pointing instead to other interacting factors and multiple deficiencies, e.g., in integration into workflows and processes, linkage with other organisational changes, adaptation to employees’ needs, and continuity and sustainability.
According to Kurilovas [
21], quality is defined as the extent to which an entity (e.g., a product or service) is able to meet specific requirements. In the context of training quality, quality refers to the effectiveness of training scenarios and their suitability to learners’ personal needs. It is emphasised that the quality of learning content should be assessed through both an expert-centred (top-down) and a learner-centred (bottom-up) approach. Using the top-down approach, external experts carry out the assessment by applying various methods of decision analysis. They draw up a quality model, determine the weighting of the quality criteria, and apply suitable evaluation methods. The bottom-up approach focuses on direct feedback and evaluation from the actual users of the training scenario. This can be achieved through surveys, interviews, case studies, or other methods to obtain a comprehensive understanding of how learners perceive the scenario and what improvements could be made from their perspective. In the context of ICT, the quality of training is regarded as the user’s acquisition of skills that enable them to perform their technical task effectively and efficiently together with the immersive assistance system [
22].
Preliminary work also reports results on the impact of training on stress. For instance, Korunka and Vitouch [23] and Day et al. [15] demonstrate that adequately trained workers involved in the implementation of new ICT experience less stress. The findings presented by Wang et al. [24] indicate that technostress can be reduced by fostering a learning organisation and providing sufficient training.
Previous work has also examined the influence of training on acceptance. To determine how to improve the implementation and adoption of new technologies, Marshall et al. [7] examined the impact of end-user training on technology acceptance among oral surgeons. Both determinants, performance expectancy and effort expectancy, were positively correlated with training. The work by Alqahtani et al. [25] demonstrates the influence of educational quality, including the features and functionality of the e-learning system, on the acceptance factors perceived ease of use and perceived usefulness. As a summary of past studies, Giovanni Mariani et al. [6] (n = 497 participants) analysed the influence of training opportunities when introducing new IT tools, using selected factors of the TAM model. The results show that providing training opportunities indirectly impacts the intention to use (acceptance). Training opportunities were shown to have a direct positive impact on IT self-competence, job satisfaction, perceived ease of use, and perceived usefulness. The last two determinants, in turn, have a direct impact on the intention to use. Further effects were found for moderating factors such as the organisational area, gender and age, educational level, IT experience, and learning strategies.
In the following, learning factors, methods, and approaches for ICT, as well as for augmented reality, are presented. In line with Korpelainen and Kira [
22], two basic learning strategies are differentiated: formal and informal. Formal learning is structured training guided by a trainer. It includes presenting help through user training sessions organised by the organisation’s support staff. These sessions are designed to offer employees a structured and guided approach on how to use the ICT system. Formal training may also include self-study courses (e.g., e-learning) or training exercises with peers. In contrast, informal learning is not structured or guided by a trainer. It consists of practical application, seeking help from written material, or interpersonal help from peers. Practical application involves trying things out by trial and error, exploring different functions, and learning by doing. Written material, representing manuals or email information from the organisation’s IT support department, must be studied by the participants. The interviews conducted by Korpelainen and Kira [
22] show that the test subjects favoured informal learning strategies, such as practical experiments, using helpdesk services, or help from peers. These informal strategies offer employees flexibility and autonomy in the learning process and are, therefore, recommended according to Korpelainen and Kira [
22]. Giovanni Mariani et al. [
6] recommend that companies consider various learning strategies to maximise the effectiveness of training initiatives and meet the individual learning needs of employees. This includes creating opportunities for informal learning, promoting learning with others, and taking into account the different learning strategies utilised by employees. The following implications can be drawn from the field of agile learning formats and practice-based learning environments: learners should use competency-based agile learning formats, learning environments with incentives and opportunities to apply their knowledge through a mixture of learning elements, case studies, and practical sessions [
26].
Additional learning approaches for industrial use include blended [
27] and flipped learning [
28,
29]. Both approaches aim to enhance the effectiveness of learning by integrating various teaching methods and technologies. Blended learning emphasises the integration of digital learning tools, e.g., e-learning, video tutorials, manuals, and different learning formats, such as face-to-face training, practical sessions, and so on. Flipped learning focuses on reversing the traditional teaching structure by providing learning materials and practical sessions before personal training. Both approaches can be leveraged in practice to develop effective training programs and meet the requirements of ICT.
Dörner and Horst [
14] present relevant methods, discussions, and inferred best practices for overcoming challenges when teaching hands-on courses about mixed reality in higher education. The key method is the Circuit Parcours Technique [
14], which provides a structured mixture of hands-on experiences with theoretical backgrounds and evaluations for higher education. It involves role-based learning and group collaboration and aims to achieve pedagogical goals, such as active learning and peer teaching. This method can be integrated into the curriculum to offer a comprehensive learning experience and includes reflection and evaluation phases for continuous improvement. Overall, it offers an interactive approach for educators to promote practical learning and skill development in VR and AR.
The preceding study by Bradley et al. [30] discusses schematically how the learning curve of task difficulty with new technologies develops over time; see Figure 1a. Accordingly, the novice user starts their conventional work task at a certain difficulty level. If the novice starts their learning journey from this point, they must invest a certain amount of effort until they reach maximum learning pain. As soon as they overcome this turning point, they will have acquired enough learning experience that the task difficulty decreases over time. With increasing training and experience, the user masters the assistance system, thus benefitting from the potential of the easiness opportunity when carrying out their activity in combination with the assistance system.
In accordance with Dunleavy [
32], three design principles were formulated for the development of AR learning experiences. The first principle is to enable and then challenge, which involves providing support for positive interdependence among learners to achieve a common goal in a physical space [
32]. The second principle is driven by gamification, which involves direct player interaction and learning through gamified narratives to enhance engagement and learning outcomes [
32]. The third principle is to spark curiosity by enabling the exploration of spatial awareness features through AR technology for immersive learning experiences [
32]. Due to the high density of previous work on training with AR, the findings and pedagogical concepts can be generalised to the learning of immersive systems. Mystakidis et al. [
33] recommended engaging novices in playful activities to promote critical thinking and reflection, utilising AR and technical skill development. The most commonly used instructional technique involves learners constructing knowledge by interacting with AR-enhanced content [
33]. This empowers novices to take control of their learning through hands-on activities [
33]. Combining various AR approaches provides diverse and engaging learning experiences [
33]. Implementing playful designs and strategies enhances novices’ engagement and motivation [
33].
The following results were obtained concerning the learning and adoption of technologies. Barnard et al. [31] discuss the specific relationship between acceptance and the adoption of new technologies and identify two basic conditions that must be met for users to eagerly learn a new technology: intention to use and usability. It is, therefore, important to strengthen the determinants of these two basic conditions when training new technologies in order to ensure that they are well accepted by users. For the intention to use, we refer to the UTAUT acceptance model [7] and its four determinants: performance expectancy, effort expectancy, social influence, and facilitating conditions. According to the definition of ISO 9241 [34], the usability of a technology is the perceived effectiveness, efficiency, and satisfaction with which users achieve their goal, where effectiveness refers to accuracy and completeness. In contrast, Nielsen [35] characterises usability by the following five attributes: easy to learn, efficient to use, easy to remember, minimal errors, and subjectively appealing. In turn, Rogers et al. [36] identify five factors that influence the adoption of innovations: relative advantage, compatibility, complexity, trialability, and observability. These factors combine user and system aspects and emphasise the importance of user experience in the technology acceptance process. Barnard et al. [31] also justify the relevance of experimenting with technology [37] using the incorporation phase of the STAM model, in which the user should be given an understanding of the usefulness of the system. Based on these findings from the literature, Barnard et al. [31] conducted two qualitative case studies to gain insights into the experiences, opinions, and challenges of older adults in using technology. One study dealt with the use of mobile technologies when walking, the other with errors that older users experience when using a tablet computer for the first time. The researchers used interviews, open discussions, and experimental methods to collect and analyse the data. Derived from these data, together with the results of related work, two theoretical models were postulated, which include the influencing factors of the ease of learning perspective and the system and user perspective [31]:
(1) Model of technology acceptance from an ease of learning perspective [31]: Perceived self-efficacy and perceived difficulty shape an individual’s attitude to learning. This attitude is also affected by the social environment and the availability of support. Together, they lead to the crucial decision-making point: the intention to learn, which is essentially the determination to engage with the technology. In line with this intention, the user may proceed through experimentation and exploration, encountering either a barrier to learning, which can lead to rejection, or, if the difficulty is manageable and the experience is positive, acceptance and satisfaction.
(2) Model of technology acceptance from a system and user perspective [31]: In short, several factors influence the perceived difficulty of learning when a user explores a new technology. These factors include the actual difficulties of the system, the user’s experience with technology, the transfer of learning experiences, feedback, error recovery, quality of training and training materials, and support from the social environment. Their influence on the perceived difficulty of learning can ultimately lead to acceptance or rejection of the system.
Our study continues the preliminary work in the context of the human–machine interaction loop model [
19]. This model investigates the errors and their effects on immersive assistance systems in industrial work environments. The design was derived from the Human–Cyber–Physical System (HCPS; see [
38,
39,
40]) and included a control loop approach for human–computer interaction. This research model was created to analyse the effects of data errors, visibility errors, interaction errors, and training errors on user performance, acceptance, and stress. For the industrial environment, the initial results show that erroneous speech interaction has a negative impact on performance, acceptance, and stress in human–machine interaction.
Our work investigates the training error within the augmented reality error taxonomy by Bahaei et al. [4]. The model by Rosilius et al. [19] was, in turn, motivated by this taxonomy of faults as its theoretical base. The taxonomy places training within the categories of personnel and organisational faults, which refer to the lack of skills and knowledge required to perform a task. This highlights the importance of technology adoption through appropriate training and skill development to mitigate human error in AR-enabled systems.
The following research demand was identified with respect to the related work. Summarising the above, it is essential for research to explore the interplay of organisational support, such as resources for training new technologies that take the specific needs of individual workers into account [22]. ‘End-user training appears to be an important and understudied factor in technology acceptance’ [7] (p. 5). According to Palanque et al. [16], errors within the human–computer interaction loop can be avoided by choosing suitable training content. Relevant research demand was highlighted concerning industrial use and acceptance factors for augmented reality [17]. As noted by Graser and Böhm [18] (p. 7), ‘There is a particular [lack] of models that reflect application conditions in the field of corporate training outside schools and academic institutions’. Based on the literature reviews conducted, it is apparent that in the field of immersive assistance training, studies on performance, acceptance factors, and technostress, as well as practitioner notes, are appreciably underrepresented. In line with Rosilius et al. [19], further studies could focus on a research design to investigate inferior training.
5. Discussion
In contrast to most existing AR studies [11, 12, 13], our contribution does not focus on learning with AR applications but on training scenarios for immersive systems. The critical literature review revealed that no contribution deals with training scenarios, evaluation methods, or guidelines specifically for learning the usage of immersive systems in the industrial context. Dörner and Horst [14] developed an AR learning course for the university environment, but its implications are transferable to industrial use only to a very limited extent.
Preliminary work left it uncertain whether, and to what extent, end-user training affects performance and acceptance; the research questions and results of our study address this gap. In general, it is widely understood that end-user training is an important factor that requires further research [7, 16, 17, 18, 19, 22]. However, contrary to the results of Zoetbrood [20], our study shows that training indeed has a significant influence on productivity (p = 0.009; Cohen’s d = −1.049). First, Zoetbrood’s correlation matrix [20] showed only a very small, statistically non-significant effect (Pearson = 0.69; p > 0.05) between training practices and productivity. Second, the hypothesis ‘The more training and development practices used in an organisation the greater the productivity benefits because of ICT investments’ could not be confirmed on the basis of the multivariate analysis (β = 0.27, t(98) = 1.003, p = 0.318). In particular, the development of hypothesis H2 in [20] is based only on indirect and weak assumptions, so further investigations are necessary. In addition, the research design in [20] and the related work do not allow any conclusions to be drawn about a differential analysis of the quality and quantity of the training provided, which marks a relevant difference to our DOE. However, we agree with the described interaction effects: performance can only be exploited effectively if other (facilitating) conditions are also implemented, such as the integration of training measures into process structures, linkage with organisational changes, and the necessary continuity.
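The effect sizes reported alongside the p-values in this discussion are standardised mean differences (Cohen’s d). As a reading aid, the following minimal Python sketch computes Cohen’s d for two independent groups using the pooled standard deviation. The function name `cohens_d` and the sample values are hypothetical illustrations, and the pooled-variance variant is an assumption: this section does not state which variant was used in the study.

```python
from math import sqrt
from statistics import mean, variance


def cohens_d(group_a, group_b):
    """Cohen's d for two independent samples (pooled standard deviation).

    Assumption: pooled-variance form; the study may have used another variant.
    """
    n_a, n_b = len(group_a), len(group_b)
    if n_a < 2 or n_b < 2:
        raise ValueError("each group needs at least two observations")
    # Pooled variance weights each sample variance by its degrees of freedom.
    pooled_var = (
        (n_a - 1) * variance(group_a) + (n_b - 1) * variance(group_b)
    ) / (n_a + n_b - 2)
    if pooled_var == 0:
        raise ValueError("pooled variance is zero; d is undefined")
    return (mean(group_a) - mean(group_b)) / sqrt(pooled_var)


# Hypothetical scores for two training conditions (not study data).
d = cohens_d([2, 4, 6], [5, 7, 9])
print(round(d, 3))  # -1.5
```

The sign of d depends only on the order in which the two groups are passed, so a negative value indicates that the first group has the lower mean, not that the effect is small.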
The sensitivity of performance with regard to QoT supports the postulated mismanagement as a cause of the productivity paradox [3], indicating that insufficient training in ICT and missing organisational changes can lead to substantial productivity loss. The statement by Schweikl and Obermaier [3] that companies should invest in training and development programmes to ensure the correct use of new technologies and strengthen their workforce can also be confirmed. Due to the differences in the technologies, DOE, and hypotheses used, the results of the preliminary work are not directly comparable to our study. The preliminary work could, therefore, only be considered as a basis for the hypotheses. In line with the results obtained for oral surgeons [7], the influence of end-user training on the acceptance factors PE (Pearson = 0.769, significant) and EE (Pearson = 0.840, significant) was confirmed when transferred to immersive assistance systems, according to our results for PE (p = 0.030; Cohen’s d = −0.903) and EE (p = 0.032; Cohen’s d = −0.898). Both studies exhibit relatively high effect sizes, but [7] does not allow any conclusions to be drawn about the scope and quality of the training scenarios. Similarly, Alqahtani et al. [25] showed that educational quality has an influence on the acceptance factors perceived ease of use and perceived usefulness for e-learning systems. We also confirm the influence of training opportunities on intention to use (BI) in the field of immersive assistance systems. In contrast to Giovanni Mariani et al. [6], we did not conduct a confirmatory study based on the TAM model. However, the results can be transferred to the extent that the influence of training opportunities on the determinants EE (which corresponds to perceived ease of use) and PE (which corresponds to perceived usefulness) can be confirmed by the UTAUT2 model.
In line with the implications of Korunka and Vitouch [23] and Wang et al. [24], our results confirm that appropriate QoT can reduce the stress levels caused by ICT. This preliminary work indicates the need for appropriate training scenarios, but no proof has been provided yet. Bala and Venkatesh [10] investigated the effects of training effectiveness on perceived threat and found highly significant effects. In our research, the perceived total stress or workload (Total_Mean) across the subscales of the NASA TLX increased by 20.32 units from OT to MT and by 16.97 units from OT to DT. Other subscales of the NASA TLX also exhibited significant differences (see Table 5). The Mean MD as the psychological response to stress increased significantly between OT and MT by 17.19 units (p = 0.045; Cohen’s d = 0.846) and between OT and DT by 19.56 units (p = 0.017; Cohen’s d = 0.963). The physiological responses also supported the impact of insufficient training. HR_Diff increased significantly from OT to MT by 4.91 Hz (p = 0.045; Cohen’s d = 1.036) and from OT to DT by 8.11 Hz (p = 0.017; Cohen’s d = 1.712). SCL_Diff exhibited a significant increase from DT to MT by 1.309 µS (p = 0.008; Cohen’s d = 1.09). However, the SCL_Diff results are conspicuous in detail: DT compared to MT shows a statistically significant difference (p = 0.008), whereas MT compared to OT does not (p = 0.077). The analyses show negative values for Mean SCL_Diff at OT, i.e., the SCL was higher in the baseline (resting) phase than in the action phase. This could be explained by the high susceptibility of the measurement system to interference, including electrode wear, skin care products, disinfection of the hands, incorrect positioning, and sequence effects during measurement.
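The sign convention behind the negative Mean SCL_Diff values can be made explicit with a small sketch. It assumes, as the text implies, that a difference score is the action-phase mean minus the baseline (resting) mean; the function name `phase_difference` and the sample readings are hypothetical and are not taken from the study data.

```python
from statistics import mean


def phase_difference(baseline, action):
    """Return mean(action) - mean(baseline) for one participant's signal.

    Assumption: difference scores are defined as action minus baseline.
    """
    if not baseline or not action:
        raise ValueError("both phases need at least one sample")
    return mean(action) - mean(baseline)


# Hypothetical skin conductance readings in microsiemens (not study data).
baseline_scl = [2.0, 2.2, 2.4]
action_scl = [1.9, 2.0, 2.1]
scl_diff = phase_difference(baseline_scl, action_scl)
print(scl_diff < 0)  # True: the resting level exceeded the action level
```

Under this convention, a negative difference does not by itself indicate relaxation during the task; it can equally result from an inflated baseline, which is why the interference sources listed above matter.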
It should be emphasised that the teaching methods and models from the educational sector are not directly transferable to the industrial context. According to Barnard et al. [
31] and Giovanni Mariani et al. [
6], age, level of education, technology affinity, intention to learn, and organisational conditions, such as the expectation of productive success, etc., play a pivotal role as differentiating factors. This can also be deduced from the large standard deviation in the mean performance in the OT scenario of 2.09%/min. These factors were also confirmed by our own experience from the industrial research project. In summary, we recommend that training for immersive assistance systems should always be based on the user’s specific tasks. There should also be as few barriers as possible and a good trainer for troubleshooting. A mix of formal methods with blended learning approaches, personal guidance, and guided learning-by-doing sessions is recommended. It is important to differentiate between system levels, e.g., for hardware and the operational system. It seems advantageous to present the learning content via structured videos with analogies to familiar knowledge (Windows operations). In general, it has been found that when there is a lot of input, learners tend to forget important learning content and become overwhelmed or confused. These findings contradict the implications of Korpelainen and Kira [
22], Dörner and Horst [
14], and Giovanni Mariani et al. [
6].
Figure 1b presents postulated special learning curves for immersive systems with different task difficulties and learning efforts per system level. Handling the hardware and the operational system is relatively simple. The main challenge with the operational system (HL2) is the interaction paradigm. The similarity to Windows means that the user journey is easy to manage, based on previous experience. Here, the user encounters a customised user interface, incorporating a demanding interaction paradigm and spatial dependencies, among others. It can, therefore, be indirectly concluded that errors within the human–computer interaction loop can be avoided through suitable training content [
16]. Our study addresses the demand for acceptance research with augmented reality in an industrial context [
17] and with the aim of finding training scenarios for corporate training outside schools and academic institutions [
18].
A critical examination of our study allows the following claim to be made: with more effort, the levels of the independent variable QoT could have been designed more appropriately. In order to reproduce realistic corporate conditions, the MT scenario used correct content but with very simplified learning and insufficiently designed training methods. Naturally, there is no limit on how incorrectly this scenario can be designed. The intermediate DT level was designed so that a trainer endeavours to convey the training content as effectively as possible in person during a session. Other scenarios are conceivable here, such as a trainer accompanying a GLBD session throughout, which could theoretically deliver promising results. As described in Section 3.1, a mixed-methods optimisation approach was chosen. Of course, there is still further research potential in the OT scenario.
In contrast to conventional studies [45, 46], in which performance was always assessed separately via duration and accuracy, which can lead to unreliable results, the performance index was refined for this DOE. Performance is derived using the speed–accuracy trade-off [48], following Equation (5). This index is analogous to the visibility index proposed by Rosilius et al. [42]. In relation to the overarching research approach by Rosilius et al. [19], our contribution has shown that a deficient training scenario has a negative impact on performance, acceptance, and stress. In the case of erroneous speech interaction, a significant difference in performance, acceptance, and frustration was found, but not in terms of strain and stress. Hereby, a contribution was made within this research context, and a dedicated research design for the independent variable QoT was developed, which provides an opportunity for further research activities. In quantitative terms, the BI was reduced by an average of 1.4 units due to an incorrect training scenario. By comparison, in Rosilius et al. [19], the BI was degraded by 2.3 units on average due to erroneous speech interaction. These differences in mean values, or weightings of different errors in human–machine interaction, need to be analysed further using inferential statistics, as these purely descriptive observations do not allow any general statements to be made.
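To illustrate how a single performance value can combine time and correctness, consider the following minimal Python sketch. It is not a reproduction of Equation (5), which is not shown in this section. The ratio form (correctness in percent divided by task time in minutes, consistent with the unit %/min reported above), the additive treatment of penalty times, and all names are assumptions made for illustration.

```python
def performance_index(correct_steps, total_steps, duration_min, penalty_min=0.0):
    """Hypothetical performance index in %/min.

    Assumed form: correctness (%) / (task time + penalty time) in minutes.
    This is an illustrative stand-in, not the study's Equation (5).
    """
    if total_steps <= 0:
        raise ValueError("total_steps must be positive")
    if not 0 <= correct_steps <= total_steps:
        raise ValueError("correct_steps must lie between 0 and total_steps")
    effective_time = duration_min + penalty_min
    if effective_time <= 0:
        raise ValueError("effective time must be positive")
    correctness_pct = 100.0 * correct_steps / total_steps
    return correctness_pct / effective_time


# Hypothetical run: 8 of 10 steps correct in 4 min, plus 1 min penalty time.
print(performance_index(8, 10, 4.0, penalty_min=1.0))  # 16.0
```

Under this assumed form, a run that is faster but less accurate and a run that is slower but more accurate are placed on the same scale, which is the motivation the text gives for not assessing duration and accuracy separately.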