1 Introduction

Virtual reality has been gaining a lot of attention in the last few years with the new generation head mounted displays and more content. Besides entertainment and social interactions, virtual reality is also used for training people on various skills. These applications can be considered as serious games for virtual reality. Serious games are described as games having a purpose beyond entertainment such as teaching a new skill or providing training to enhance existing skills [11]. Virtual reality training offers several advantages over real world training such as safety, easy customization, gamification, real time alteration of scenarios and environmental elements, automated data collection and no severe real life consequences of mistakes. Many early studies agree on several benefits that virtual reality training provides in many diverse areas such as medical training [10, 14], aeronautics and space training [5] and vehicle operation training [2].

Serious games are an important form of games since they have the aim of training users on some specific skills, which may later be transferred to real life. Because of this, serious games usually involve a lot of tasks to be practiced. The user first needs to understand what they need to do and then perform the tasks. Onboarding users with easy to understand and user friendly instructions is an important aspect of these games since otherwise users may get confused, may not enjoy training and may not benefit from the serious game. Since virtual reality is not mainstream yet, many users are expected to have little to no prior virtual reality experience, which especially calls for user friendly instructions for virtual reality serious games. Not much has been explored yet about the effectiveness of different instruction methods in virtual reality serious games. In this study, we examined effects of instruction giving methods on user experience in a virtual reality serious game for baseline vocational warehouse skills training. Our motivation is to provide insight into future virtual reality serious games for more effective and user friendly instructions. Our results may help in both tutorial level and in game instruction design, which are crucial components of serious games. We believe our results will also help in virtual reality games for entertainment since all games need to have some kind of instructional aspect to them and instructions for games in virtual reality haven’t been well studied yet.

The four instruction methods that were explored in our study are 3D animated, pictograph, written and verbal instructions. The selection of these four instruction methods was inspired by real vocational training games. In instruction preparation, an important factor is the cost of the chosen method. 3D animations and picture based instructions are usually costlier to prepare than written or verbal instructions. Hence, if animated or pictograph methods provide no significant value over written or verbal methods, developers can choose the less costly written or verbal methods in their virtual reality serious games.

For our study, we designed eight simple tasks to be performed in a virtual warehouse environment. The tasks were vocational warehouse skills related and included tangible object interaction with tracked boxes. Instructions for these tasks were given to the users with one of these four methods: animated, pictograph, written and verbal. We performed a user study with 15 college aged users. We mainly examined task performance, replays requested for instructions, time to complete the tasks, user preference on instruction methods, ease of understanding and frustration. To make sure that our task design and virtual reality implementation were reasonable, we also looked at ease of the tasks, presence and motion sickness ranked by the users.

2 Related Work

Although no previous work that we are aware of analyzed effects of instruction methods in virtual reality serious games, many previous works are still related to our study in various aspects. Bowman et al. studied spatial information presentations inside a virtual zoo environment to provide better learning [3]. In the virtual reality application, authors employed verbal and text based information, and a few images to accompany these only for more complex content. Ragan et al. also studied effects of supplementary spatial information presentations on user performance in virtual environments [13]. Authors used written and symbolic information in their training system. Most virtual reality training applications have some form of instructions, if they do not solely rely on human tutors. Recent virtual reality training systems utilize different forms of instructions and in-game information. Oliveira et al. utilized text based instructions in their industrial training virtual reality application [12]. Bobadilla et al. used animated, written and verbal forms of instructions and information in their underground power distribution lines maintenance virtual reality training system [9]. In their virtual reality system for training athletes for high pressure situations, Stinson and Bowman used written messages for information conveyance [15]. Carlson et al. utilized video based instructions in the virtual reality assembly tasks training system they developed [6]. The videos were pre-recorded and showed demonstrations of using the input devices and carrying out the in-game tasks. Chittaro and Buttussi utilized written and very brief picture based instructions in their aviation safety training game [7]. In their study of virtual reality laparoscopic surgery training curriculum development, Aggarwal et al. used a one to one human training approach to familiarize the users with the system first [1]. 
Then, inside the virtual reality training module, written and brief visual based instructions were used. Corato et al. developed a virtual reality training system for the hand washing procedure of surgery staff [8]. The authors utilized overlaid real time animations that performed the same task along with the user. There were also supplementary written on-screen instructions to explain to the users what to do. Although these studies used various forms of instructions, the authors did not explore the effects of different instruction methods, since the main focus of the studies was providing effective training with virtual reality serious games.

3 Instruction Methods Experiment

To examine user preference on instruction methods in virtual reality serious games, we decided on four instruction methods that were commonly used in these games: animated, pictograph, verbal and written. We designed and implemented eight baseline warehouse tasks. The tasks were intentionally designed to be simple to allow the users to focus on their instruction method preference rather than struggling with the tasks. The simplicity of the tasks was also expected to offset any possible cognitive load difference between the instruction methods. A professional job trainer helped us in designing the tasks to ensure that the tasks were of similar baseline difficulty and the instructions were appropriate. Tasks were designed to differ from each other in at least one element. To make sure that the tasks were of the same difficulty level in terms of vocational skills and the instructions were meaningful, we demonstrated the designed tasks to six professional vocational trainers. They all stated that the tasks had a similar level of difficulty and the instructions were meaningful and easy to understand.

To ensure variety and to minimize any possible learning effect, tasks were designed in three categories: sorting, fetching and alignment. Tasks had a roughly even distribution between these categories (3 sorting, 3 fetching and 2 alignment tasks). Between the tasks, textures on the boxes and work station labels were changed. Sorting tasks were based on price, product label and size. Fetching tasks were based on color and product labels. Alignment tasks were based on expiration dates and barcodes. These tasks were presented to the users with one of the four instruction methods: animated, pictograph, verbal and written. To perform the tasks, users needed to interact with tangible boxes that were equipped with markers to be tracked by the motion tracking system in real time. A general overview of our system can be seen in Fig. 1. The reason behind the selection of tangible boxes was the positive scores the users gave to this type of interaction in a previous study of ours which included many forms of different virtual reality interactions [4]. Different textures were projected onto the boxes in the virtual world to create variety in the tasks. Real tables were used as workstations. These tables also had accurate virtual representations of them. Different labels such as ‘Regular QC’ and ‘Inspection Area’ were projected to the white areas on the virtual tables to specify work stations in different tasks.

Fig. 1.
A general overview of the virtual reality serious game setup. The user interacts with the tangible boxes on the workstations. View of the user through the HMD is projected on a curtain display only for outside viewing purposes.

For animated instructions, we used a realistic 3D animation approach to convey instructions in a way that is close to the real world and to make sure that the difference between this method and the 2D picture based instructions was obvious. As an example, the animation strip of a task is shown in Fig. 2. In the animated method, as a virtual warehouse supervisor character demonstrated the tasks, he also described what he did with a few brief words to explain the task better, like a supervisor teaching a task to their employee by demonstration. 3D animations were played in the virtual world in real time instead of being pre-rendered and displayed as a video overlay. The reason for that was to give the user the feeling of watching a real demonstration in the same environment in which they were present instead of watching an overlaid video. As an example, the virtual tutor says “sorting by price” as he sorts the boxes in Fig. 2. These brief words were recorded as clear voice over audio by a male native English speaker. Pictographs consisted of simple drawings with brief explanations (see Fig. 3). The reason for selecting pictographs as one of our methods was their prevalent use in today’s workplaces for instructions, especially in fast food and retail stores, and in training games. Written instructions consisted of brief yet clear explanations of the tasks to be done (see Fig. 3). We took great care to make these instructions easy to understand by using common words in simple English. For the verbal instructions, the content of the written instructions was read clearly and recorded as voice over audio by a female native English speaker.

Fig. 2.
Animation strip of Task 8 from user’s viewpoint. Virtual character sorts the boxes according to their price tags.

Fig. 3.
Pictograph and written instruction methods. Left: pictograph. Right: written.

Instructions had different pre-defined durations proportional to their content. Animated instructions varied between 8 and 18 s. All pictograph durations were 10 s since their content was similar. Written instruction durations varied between 10 and 12 s and verbal instruction durations varied between 5 and 9 s based on the length of the content. In our virtual reality serious game, first the users were presented with the instructions and requested to watch/read/listen to the instruction method until it disappeared. To make the users focus on the instructions for the same amount of time, possible actions of the users were restricted in this mode. After the instructions disappeared, virtual boxes appeared and the users were then able to interact with the objects in the virtual world. We preferred this approach to ensure that all users were exposed to the instructions for the same amount of time yielding comparable results. Animated instructions took place in a position that was very close to the user in the virtual world. Pictograph and written instructions were presented as overlays covering 60% of the screen. Verbal instructions did not have a visual cue.
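The timed instruction phase described above, together with the two additional replays mentioned in the procedure, can be sketched as a small state object. This is only an illustrative sketch: the class name `InstructionPhase`, its fields, and the rule that a replay can be requested only after the instruction has finished are assumptions made here, not details of the actual Unity implementation.

```python
from dataclasses import dataclass

MAX_REPLAYS = 2  # the study allowed two additional replays per task


@dataclass
class InstructionPhase:
    """Hypothetical model of one instruction presentation."""
    method: str            # "animated", "pictograph", "verbal" or "written"
    duration_s: float      # pre-defined duration of this instruction
    elapsed_s: float = 0.0
    replays_used: int = 0

    @property
    def showing(self) -> bool:
        """True while the instruction is still being presented."""
        return self.elapsed_s < self.duration_s

    def interaction_enabled(self) -> bool:
        """Boxes appear and can be handled only after the instruction ends."""
        return not self.showing

    def tick(self, dt_s: float) -> None:
        """Advance the presentation clock by dt_s seconds."""
        if self.showing:
            self.elapsed_s = min(self.duration_s, self.elapsed_s + dt_s)

    def request_replay(self) -> bool:
        """Restart the instruction if it has finished and replays remain."""
        if self.showing or self.replays_used >= MAX_REPLAYS:
            return False
        self.replays_used += 1
        self.elapsed_s = 0.0
        return True
```

Keeping the duration fixed per instruction, rather than letting users dismiss it early, mirrors the stated goal of exposing all users to the instructions for the same amount of time.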

3.1 Hardware

We used 12 OptiTrack V100R2 FLEX cameras for real time motion tracking. Our tracked area was 8 ft by 8 ft, and the tasks were designed so that the user never needed to step outside of the tracked area. A VR2200 head mounted display (HMD) was used for viewing. The HMD was tracked by the system in real time via markers attached on top. Users wore a backpack that contained the battery for the HMD and the port for the VGA cable. This backpack weighed around 2 lb. The VGA cable went to the server computer through a tool balancer mounted on the ceiling. The software was implemented using the Unity game engine and ran at around 60 frames per second. Users also wore hand bands that were equipped with reflective markers for real time hand tracking. The boxes were also equipped with markers for real time tracking. Four surrounding speakers (Creative A550) were used for audio.

3.2 Experiment Design

A within-subjects experiment was performed with instruction method as the independent variable. The independent variable had four levels (animated, pictograph, verbal and written instructions) that were varied within subjects. Each participant was presented with two different instances of each of the four methods. The order of the independent variable levels was assigned randomly. Counterbalancing was also used to obtain an even distribution.
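One possible way to combine random assignment with counterbalancing is a cyclic Latin square over the four methods, repeated to cover the eight tasks. The sketch below is an assumption about how such a schedule could be generated, not the procedure used in the study; the function names `method_sequence` and `assign_participants` and the fixed seed are hypothetical.

```python
import random

METHODS = ["animated", "pictograph", "verbal", "written"]


def method_sequence(row_index, n_tasks=8):
    """Return one instruction method per task for a given Latin square row.

    Rotating the method list by the row index gives a cyclic Latin square,
    so across any four consecutive rows every method appears exactly once
    at each task position.
    """
    k = len(METHODS)
    shift = row_index % k
    rotated = METHODS[shift:] + METHODS[:shift]
    # Repeat the rotated order until all tasks are covered
    # (two instances of each method for eight tasks).
    return [rotated[i % k] for i in range(n_tasks)]


def assign_participants(n_participants, seed=0):
    """Randomly assign each participant to one row of the rotation."""
    rng = random.Random(seed)
    rows = list(range(n_participants))
    rng.shuffle(rows)
    return {p: method_sequence(row) for p, row in enumerate(rows)}


if __name__ == "__main__":
    for participant, sequence in assign_participants(4).items():
        print(participant, sequence)
```

With a sample size that is not a multiple of four, such as 15, the distribution of methods over task positions can only be approximately even. A cyclic square also does not balance which method follows which, so a design concerned with carryover effects would need a different square.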

3.3 Research Questions and Hypothesis

Our study aims to answer the following research question: What are the effects of instruction methods on user experience in virtual reality serious games? We developed the following two hypotheses: (H1) Animated instructions will be the most preferred method as compared to the other three methods. (H2) Written instructions will be the least preferred method as compared to the other three methods.

When constructing these hypotheses, we reasoned that 3D animation was the method closest to real-life training, whereas written instructions were the least prevalently used method in real life, considering the recent increase in the use of visual communication in many areas.

3.4 Data Collection

We collected automated data for the following: successful completion of tasks and fails with their time logs, time it took to complete the tasks and number of instruction replays requested by the users.

After the users completed all of the eight tasks, a survey was given to them. This survey asked questions about their preferred instruction method, ease of understanding each instruction method provided, frustration each instruction method caused, level of immersion and motion sickness during the whole experiment, and ease of the tasks.

3.5 Participants

Fifteen adults participated in the study (N = 15). Participants were recruited via e-mail announcements and word of mouth. All participants were undergraduate or graduate university students from different majors. Participants were aged between 21 and 33 with mean (M) 25.80 and standard deviation (SD) 3.05. All participants were either native English speakers or fulfilled the English proficiency requirement with a TOEFL iBT score above 79 or an IELTS score above 6.5. Gender distribution was 5 females and 10 males. Thirteen participants had no prior virtual reality experience whereas 2 participants had minimal prior virtual reality experience. The study was conducted under the IRB Pro00013008.

3.6 Procedure

Participants arrived at the laboratory. They read and signed the consent form and filled out the demographics questionnaire. Then, the research staff briefly explained the equipment and the user’s objective in the experiment. The users were told that they would be presented with different instruction methods and that their aim was to do what they understood from each instruction. They were also told that they could request two more replays of the instructions if they felt that they did not understand them. Following this, the research staff helped the users put on the head mounted display and the hand bands, and a familiarization session began. The aim of this session was to get the users comfortable with the virtual reality system and the tangible box interaction. Research staff first explained the elements in the virtual world briefly: virtual warehouse, workstations and boxes. Research staff then asked the users to perform some basic actions such as looking at their hands, looking around, moving around on the tracked area, holding and moving the tangible boxes, rotating the boxes and touching the workstations. The familiarization session ended when the user stated that they were comfortable with the virtual reality system, which took approximately 80 s on average. Then, the experiment began.

The users were presented with eight tasks. Each task instance consisted of first watching or listening to the instructions and then doing what the instruction requested. The users could also request two more replays of instructions if they wanted to. After finishing all of the eight tasks each presented with one of the four instruction methods, the research staff helped the users to take off the worn equipment. The users then filled out a survey about their experience. Virtual reality exposure time was around 10 min and the survey filling took 3–5 min per user.

4 Results

Although we tried to design the tasks with the same level of difficulty, with the help of professional job trainers, one of the eight tasks turned out to be significantly more confusing than the others. Independent of the instruction method, 11 users out of 15 failed in Task 5. Hence, we excluded all data of this task from our analysis and examined the results of the remaining 7 tasks. Error bars in all of the charts in the paper represent the standard error of the mean.
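For reference, the standard error of the mean used for the error bars is the sample standard deviation divided by the square root of the sample size. A minimal sketch follows; the function name `standard_error` and the example values are illustrative and are not data from the study.

```python
import math
import statistics


def standard_error(values):
    """Standard error of the mean: sample SD divided by sqrt(n)."""
    if len(values) < 2:
        raise ValueError("at least two values are needed")
    return statistics.stdev(values) / math.sqrt(len(values))


if __name__ == "__main__":
    illustrative_scores = [1, 2, 3, 4, 5]  # made-up values
    print(round(standard_error(illustrative_scores), 4))
```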

Task 5 asked the users to move the cookware box to the Regular QC area (QC stood for Quality Control). Pictograph instruction of Task 5 can be seen in Fig. 2. In the beginning, there was one cookware box and one silverware box on each workstation, with identical alignment. One workstation was labeled as ‘Urgent QC’ and the other was labeled as ‘Regular QC’. The users needed to pick up the cookware box in the Urgent QC area and put it in the Regular QC area. Initial box positions for this task are presented in Fig. 4. The distribution of the 4 successful completions of this task over the instruction methods was as follows: 1 animated, 1 pictograph, 1 verbal, and 1 written. The distribution of the 11 failed completions of this task over the instruction methods was as follows: 3 animated, 3 pictograph, 3 verbal, and 2 written.

Fig. 4.
Starting box positions of Task 5. Workstations were labeled as Urgent QC and Regular QC. Each workstation had one silverware box (back) and one cookware box (front).

4.1 Performance

We examined the percentage of successful completions in all seven tasks for the different instruction methods. The percentages of fails and successful completions are presented in Fig. 5. When we analyzed the effect of instruction method on successful completion percentage using single factor ANOVA with alpha 0.05, no significant effect was found (F(3, 11) = 1.690, p = 0.184). However, it can be observed that all users were able to complete the tasks correctly with the animated instructions whereas the success percentage was below 89% for the other methods.
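As a reminder of what a single factor ANOVA computes, the sketch below derives the F statistic from between-group and within-group sums of squares. It treats the groups as independent samples, which is an assumption of this sketch; the function name `one_way_anova` and the example data are made up and do not reproduce the study's values.

```python
def one_way_anova(groups):
    """Return (F, df_between, df_within) for a list of numeric samples."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    means = [sum(g) / len(g) for g in groups]

    # Variation of the group means around the grand mean.
    ss_between = sum(len(g) * (m - grand_mean) ** 2
                     for g, m in zip(groups, means))
    # Variation of the observations around their own group mean.
    ss_within = sum((x - m) ** 2
                    for g, m in zip(groups, means) for x in g)

    df_between = k - 1
    df_within = n_total - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


if __name__ == "__main__":
    # Made-up scores for three hypothetical groups.
    example = [[1, 2, 3], [2, 3, 4], [3, 4, 5]]
    print(one_way_anova(example))  # F = 3.0 with df (2, 6)
```

The p-value is then read from the F distribution with the returned degrees of freedom; in practice a statistics package such as SciPy (`scipy.stats.f_oneway`) provides both values directly.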

Fig. 5.
Bar chart of the successful completion and fail percentages for different instruction methods.

4.2 Preference

At the end of the experiment, we asked each user to rank the instruction methods according to their preference. Then, we gave weights to these results so that the first choice had a weight of 4, the second choice had a weight of 3, the third choice had a weight of 2 and the fourth choice had a weight of 1. After that, we multiplied these weights by the number of times each method was ranked in each position, summed the products and divided the result by our sample size of 15. This gave us the weighted averages of the user preference results, which can be seen in Fig. 6. A single factor ANOVA showed a significant difference in the preference results (F(3, 11) = 8.372, p < 0.001). Follow-up t-tests showed a significant difference between all method pairs except pictograph and written (see Table 1). Animated instructions were the most preferred with a score of 3.47, which supported H1. Verbal instructions were the least preferred with a score of 1.53, so H2 was not supported.
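The weighting scheme described above can be sketched in a few lines. The rank counts below are invented for a hypothetical group of four participants and are not the study's data; the names `WEIGHTS` and `weighted_preference` are likewise illustrative.

```python
WEIGHTS = (4, 3, 2, 1)  # weight for 1st, 2nd, 3rd and 4th choice


def weighted_preference(rank_counts, n_participants):
    """Weighted average preference score per method.

    rank_counts maps a method name to a tuple with the number of times
    it was ranked 1st, 2nd, 3rd and 4th.
    """
    return {
        method: sum(w * c for w, c in zip(WEIGHTS, counts)) / n_participants
        for method, counts in rank_counts.items()
    }


if __name__ == "__main__":
    # Invented counts for four hypothetical participants.
    counts = {
        "animated":   (3, 1, 0, 0),
        "pictograph": (1, 2, 1, 0),
        "written":    (0, 1, 2, 1),
        "verbal":     (0, 0, 1, 3),
    }
    print(weighted_preference(counts, n_participants=4))
    # {'animated': 3.75, 'pictograph': 3.0, 'written': 2.0, 'verbal': 1.25}
```

Because every participant hands out each of the four weights exactly once, the four scores always sum to 10, which gives a quick sanity check on tabulated rankings.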

Fig. 6.
Bar chart of the weighted averages of user preference ranking scores for different instruction methods.

Table 1. Significant two sample t-test results for the preference ranking scores.

5 Discussion

Results revealed that the participants were able to complete all tasks successfully with the animated instructions. This could be because of the resemblance of the animated instructions to having a tutor in real life. Although there was no statistically significant difference, the fail ratio was highest with the verbal method.

User preference notably favored the animated instructions. Pictograph was second in preference, slightly ahead of the written instructions in third place. We interpret that although the pictographs gave the participants some difficulty in information processing, their visual nature was still found to be interesting and favorable. Verbal instructions clearly seemed to be the least preferred choice of the participants. Some of our participants complained about the anxiety they had regarding the possibility of missing the verbal instructions.

Overall, animated instructions were favored by the users whereas verbal instructions were the least favorite. If possible, having animated instructions in serious games for training users might be a better practice. If not, as a less costly alternative, pictograph instructions that remain on screen as an overlay might work well as a second option. Users showed more interest in the pictograph instructions than the written instructions although the two methods were similar in many results. We suggest that verbal instructions should be avoided in virtual reality serious games since they may create slight anxiety in users and may not be as effective as the other three methods, especially with longer instructions. If verbal instructions need to be used, we suggest having supplementary captions as overlays on screen or giving the users the option to replay the instructions as many times as they want as more user friendly alternatives. This way, the users may read the parts they missed or replay the instruction.

6 Conclusions and Future Work

In this study, we examined the effects of instruction methods on user experience in virtual reality serious games. Four instruction methods were explored: animated, pictograph, written and verbal. Eight simple vocational tasks to be performed in an immersive virtual warehouse environment were designed and implemented. A user study with 15 participants revealed that the animated instructions provided better user experience whereas verbal instructions were the least preferred among the four methods. Hence, we suggest using animated instructions in virtual reality serious games. Pictograph and written instructions shared the middle position in the ranking of the methods. Between the pictograph and the written methods, the users were more excited about the pictograph; however, it was slightly more difficult for them to remember all of the elements in the pictographs when the instructions disappeared. Hence, we suggest using pictograph instructions that remain on screen as overlays as the second choice of instruction method, followed by written instructions. We do not recommend using verbal instructions. If they need to be used, we suggest that having captions on screen or enabling a replay option may be user friendly improvements for verbal instructions.

Future work may include evaluating effects of instruction methods in terms of cognitive load, exploring other forms of instruction methods such as video, and evaluating effects of instruction methods on user performance for complex tasks. Different levels of the instruction methods mentioned in this study such as speed and fidelity may also be examined in terms of effectiveness.