Understanding Software Quality Metrics For Virtual Reality Products - A Mapping Study
ABSTRACT
Virtual Reality (VR) software has become more mainstream in recent years. This has given VR practitioners an opportunity to explore new domains and deliver cutting-edge products. The success of a VR product depends primarily on its contextual relevance and the qualities it exhibits. However, it is unclear how VR practitioners curb software quality challenges and improve the essence of the VR product over every release. In this paper, we present a Systematic Mapping Study of the software quality metrics adopted by VR practitioners for assessing the quality of their VR products. The study showed that practitioners used unique metrics to measure the quality of their VR products in addition to adopting some existing enterprise software metrics. Further, we consolidate these metrics into different themes that future practitioners may use for developing VR products.

CCS CONCEPTS
• Software and its engineering → Software development techniques; Software testing and debugging.

KEYWORDS
Software Quality; Virtual Reality; Industrial Practices; Metrics

ACM Reference Format:
Mohit Kuri, Sai Anirudh Karre, and Y. Raghu Reddy. 2021. Understanding Software Quality Metrics for Virtual Reality Products - A Mapping Study. In 14th Innovations in Software Engineering Conference (formerly known as India Software Engineering Conference) (ISEC 2021), February 25–27, 2021, Bhubaneswar, Odisha, India. ACM, New York, NY, USA, 11 pages. https://doi.org/10.1145/3452383.3452391

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org.
ISEC 2021, February 25–27, 2021, Bhubaneswar, Odisha, India
© 2021 Association for Computing Machinery.
ACM ISBN 978-1-4503-9046-0/21/02...$15.00
https://doi.org/10.1145/3452383.3452391

1 MOTIVATION
Virtual Reality (VR) is known for interpreting complex visual experiences into simple ones for real-world events using Head Mounted Devices (HMDs) [31]. In a Gartner report [12], VR is presented as a ‘Strategic Technology Trend’, i.e. it is meant to guide organizations that have digital use-cases best solved using immersive experience. Technology based on VR can help people perceive the digital world in various use-cases like news, shopping, education, art, and analytics. In the initial days, VR systems were mostly limited to the Aviation and Shipping industries. They were built to train pilots and sailors on navigation systems. With the advent of abridged versions of various Head Mounted Devices (HMDs), a multitude of personalized products are being built using VR, creating a significant impact in the digital consumer market.

The practices followed by VR developers originated from the Gaming Industry due to its widespread presence in Online Games [63]. Game Developers started contributing to core VR product development with the idea of building serious enterprise VR products. Although the game development life cycle resembles the traditional software (for example, enterprise) product development life cycle, VR products are yet to adopt many of its practices. The major reason is the set of challenges specific to the VR setup that carry over to its products. As a result, assessment of the quality of VR software products is still not as systematic as it is for enterprise software. Previously, we conducted a study to understand the modalities of Virtual Reality Product Development in the Software Industry [34]. Some of the observations pertinent to process and product quality in VR software captured from the study are:
• VR software development is complex, disorganized and can be correlated to the level of practitioners’ participation.
• Quality assessment for VR software products is considered to be cost-intensive. Also, it is difficult to generalize the quality attributes to all the end-users as VR products tend to be personalized.
• Design and Usability reflect VR product sensitivities. They have a direct impact on product quality.
• Design versioning and sustenance maintenance are time-consuming and confusing at times for unstructured VR product builds.
• Support tools for VR product development practices are inadequate.
• Stakeholder conflicts are far more frequent given the wider variety of stakeholder involvement in the development of VR products.
• There are almost no comprehensive testing strategies for VR Products that can help improve them over multiple product releases.

The above observations motivated us to explore the current quality measures, more specifically, software quality metrics of VR products in the existing literature. The rest of the paper is structured as follows: Section 2 provides details about the systematic mapping study process along with related work. Section 3 includes a discussion on giving preferences to VR practitioners on adopting
appropriate metrics while conducting Software Quality studies during VR Product development. In Section 4, we discuss some threats to validity, and finally we present some conclusions in Section 5.

2 THE MAPPING STUDY
The ISO Standard 9126 [17] quality model classifies quality into a collection of quality characteristics and sub-characteristics. Various metrics can be used to measure these quality characteristics. Product owners have to adopt diverse software quality metrics to track the health of the software product after every product release. Software quality of an enterprise software product involves quality assessment, quality assurance, and quality evaluation. Researchers have proposed various approaches and metrics to address software quality problems at different stages of software production. Shihab et al. [72] conducted a literature review of more than 100 published research papers on software defect prediction. It was found that most of the approaches did not provide guidance on industrial adoption and rarely considered the impact, risk, and dependency associated with the predicted or forecasted defects. The practical adoption of software quality methods in the industry is limited as the software industry tends to be reactive. Compared to traditional software developers, VR practitioners have fairly limited knowledge about the state-of-the-art metrics needed for quality management of VR products [34]. In this paper, we detail a systematic mapping study performed on existing VR literature to explore software quality metrics or indicators used by VR developers while developing VR products.

2.1 Research Questions
The Systematic Mapping Study described in this paper uses the guidelines suggested by Petersen et al. [56]. The primary goal of the study is to capture the details of software quality indicators or metrics adopted by VR developers while developing their respective VR Products. We followed an evidence-based approach called the PICOC method. PICOC is an acronym for Population, Intervention, Comparison, Outcome, and Context [46]. Application of the PICOC concepts to the VR products study is shown in Table 1. The PICOC method helped formulate the research questions effectively and document the scope of the mapping study. The work presented in this paper addresses the following research questions:
R1: What are the existing software quality metrics/indicators used as part of VR product(s)/app(s)?
R2: Is there any trend in adapting certain software quality metrics/indicators in VR product(s)/app(s)?
R3: Are there any domain specific VR Software Quality metrics?

Table 1: Application of PICOC method to VR products study
Criterion | Description
Population | Virtual Reality related software products and applications
Intervention | Software quality metrics or indicators
Comparison | Comparison between the results captured in various software quality metrics
Outcome | Studies where software quality metrics/indicators are applied to VR products and apps
Context | Academia, software industry and other empirical studies

2.2 Search Strategy
For any systematic mapping study, establishing a search string is key. As a first step, keywords pertinent to Software Quality metrics in VR applications were identified. The research questions R1, R2, and R3 are related to each other and hence only one search string was constructed. The search strategy was set to enable identification of studies that describe the presence of at least one software quality metric or indicator applied to a VR Software Product.

Search String: The search terms were chosen with concepts resulting from the PICOC method. Below are the details of the search strings.
C1: “Virtual Reality” OR “Virtual Programming” OR “Virtual Reality Product” OR “Virtual Learning” OR “VR” OR “Virtual Environment”
C2: “Software Quality” OR “Software Metrics” OR “Software Indicators” OR “Software Evaluation” OR “Metric” OR “Metrics” OR “Indicator” OR “Indicators” OR “Quality Assessment” OR “Quality Improvement” OR “Quality Evaluation” OR “Quality Measurement”
C3: “Publication year” > “2000”

The resulting string formulated for addressing research questions R1, R2, and R3 is ‘C1’ AND ‘C2’ AND ‘C3’. We considered Virtual Reality and Virtual Programming related keywords as the initial search filter. Software Quality Metrics is a major factor, hence we expanded the search space to consider all potential keywords pertaining to quality metrics. We conducted a multi-level analysis [39] on the Virtual Reality research area and found that with the advent of new hardware there was a significant shift in VR technology and corresponding software after the year 2000. Hence, the year 2000 was considered as a limit for publication year to extract the literature.

Search Quality Assessment: We reviewed the search strings multiple times and incrementally developed them based on a peer-review approach [39]. We also conducted a manual search of the search string incrementally and compared the results in every validation cycle. Our peer-reviewers graded the manual search validation results and helped us finalize the search string. We worked with fellow researchers from similar research areas to finalize our search string. Once the search string was finalized, the authors independently conducted the search activity against all available attributes of a research paper including abstract, contents of the paper, keywords, etc. and recorded the respective results. We filtered these attributes further to avoid miscellaneous research papers. Our review supplement data can be found here [47].
2.3 Databases and Paper Selection
We conducted this search activity against major electronic research databases including IEEE Xplore, ACM Digital Library, Springer, and ScienceDirect. The search order was based on the databases that returned the most results. The search fields and search string were formulated to ensure that the search process was similar across these electronic research databases. We omitted the grey literature (research produced by organizations outside of the traditional commercial or academic publishing and distribution channels) and focused only on active publications. Our review considers research papers published till September 2020.
Exclusion Criteria - Articles short on metric description details were ignored. Research papers with no clarity on their VR product setup were ignored, as the setup is critical for judging the quality metric used in a VR scene. Articles that focused only on software quality processes/models/techniques, topics related to the description of software quality engineering, or industry white papers were excluded from our study. Also, papers that did not mention anything about the quality aspect of the VR product built as part of their research were not considered. Books were not included as part of this mapping study, as books tend to cover broader data that is more useful for in-depth analysis than for a mapping study.
Inclusion Criteria - Papers that discussed the use or implementation of Software Quality Metric(s) or Indicator(s) in their title, abstract, or keywords were considered. Peer-reviewed publications that contain clear details about the VR products and documents were given primary consideration. Only articles written in English were considered as part of our study.

spectrum of VR product development. In the review process, we came across several papers where researchers used software quality metrics for validating their VR Prototype or Product.
Based on the review of literature, we conducted a coding based thematic analysis [11] of the gathered research papers. Based on the coding results, we broadly categorized the gathered data into six themes: VR Audio Quality, VR Scene Quality, VR Video Quality, VR QoE (Quality of Experience), VR Image Quality and VR Code Quality. In this section, we analyze the research questions and discuss the relevant observations.
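
The exclusion and inclusion criteria of Section 2.3 can be read as a screening predicate applied to each candidate paper. The sketch below is a hypothetical encoding, not the authors' tooling: every field of `Paper` stands for one manual reviewer judgment, and all names are assumptions introduced for illustration.

```python
from dataclasses import dataclass


@dataclass
class Paper:
    # Hypothetical record; each flag stands for one manual reviewer judgment.
    title: str
    language: str = "English"
    peer_reviewed: bool = True
    is_book: bool = False
    is_grey_literature: bool = False
    process_only_or_white_paper: bool = False
    describes_metric_in_detail: bool = True
    describes_vr_setup: bool = True
    metric_in_title_abstract_or_keywords: bool = True


def is_included(p: Paper) -> bool:
    """Apply the exclusion criteria first, then the inclusion criteria."""
    excluded = (
        p.is_book
        or p.is_grey_literature
        or p.process_only_or_white_paper
        or not p.describes_metric_in_detail
        or not p.describes_vr_setup
    )
    if excluded:
        return False
    return p.language == "English" and p.metric_in_title_abstract_or_keywords


def screen(papers):
    """Keep included papers; peer-reviewed ones come first (stable sort)."""
    included = [p for p in papers if is_included(p)]
    return sorted(included, key=lambda p: not p.peer_reviewed)
```

Peer review is modeled as an ordering preference and not as a hard filter, because the text says such publications were "given primary consideration" without stating that others were rejected.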
using 3D Convolutional Neural Networks. They conducted an empirical study on a 3D Panoramic VR Video Dataset and determined the initial quality of these videos using a subjective score for each data sample. They proposed a ‘fusion strategy score’ to rank and determine the quality of the VR Video project process. They considered MultimediaQA, VRVideoQA, and ImageQA indexes as a measure for the quality of their assessment. Their future plans include the application of their proposed score on large VR video databases to help VR Practitioners adopt the proposed score and metrics as part of conventional VR product development. Alireza et al. [87] conducted a comparative quality assessment study between the tile-based method and the truncated square pyramid (TSP) method of projecting VR Videos. These methods are primarily involved in Streaming VR Videos, which are subject to latency issues. The Quality-Assessment-View (QAV) Index was used to assess the quality of VR Video. Subjective evaluation was performed and the observed data was analyzed to determine the merits and demerits of these methods. Sijia Chen et al. [10] reviewed Omnidirectional VR Videos and conducted an objective evaluation to calculate the spherical structural similarity index (SSSI) and compare the results of this quality metric with traditional heuristic methods. They also conducted an experimental assessment to determine the relationship between two domains to determine the video quality.
Sebastian et al. [70] made attempts to understand the methods to render virtual viewports from supplementary depth information to create a better VR video quality. Depth-image-based rendering and peak signal-to-noise ratio are the two quality metrics used to evaluate the VR video quality. They conducted a subjective evaluation and published their findings. Naty et al. [73] conducted an experimental study on understanding the performance and computational complexity of 360-degree VR Video. They were part of a research group that developed new coding tools to address video encoding and decoding to avoid noise and achieve a better bit rate in VR Videos. They used Weighted Peak Signal-to-Noise Ratio as a custom metric to evaluate the 360-degree VR Video quality. Shu Yang et al. introduced a quality assessment method for panoramic videos which is based on multi-level quality factors. This is calculated based on the region of interest in a given VR Scene [85]. They conducted a subjective evaluation using a few panoramic scenes and captured insightful results. Their observations show that the quality assessment method is easy to implement when compared with traditional video quality assessment methods. Carlos et al. proposed two novel metrics to study user behavior under 360-degree movie cuts [45]. This is to examine the influence of user perception on 360-degree movie cuts over large scale video scenes.

VR Image Quality: Wei et al. [75] conducted a subjective quality evaluation of compressed virtual reality images in VR scenes. They performed a correlation study with popular objective quality measures and published their observations. They used the Single-Stimulus method to collect the subjective scores from participants and computed the MultiScale Structural Similarity Index (MSSI) to determine the quality of the images in a VR Scene. Huiyu Duan et al. [14] established an exhaustive VR Image dataset and worked on perceptual quality assessment of Omnidirectional images in VR. The image quality assessment measures were applied on their dataset and suggestions were provided on the improvement of images based on VR Scene setup. Junfei Qiao et al. [60] reviewed 3D Synthesized views used for the rapid development of high-quality VR Scenes. They formulated an algorithm as part of their previous work and compared it with their existing dedicated No-Reference Image Quality Assessment (NIQA) method to assess the quality of the Image in a high-quality VR Scene. They set up an experimental study to comprehend image quality degradation and distortions. Further, they recommended guidelines for low-level compression of images in a VR Scene to avoid rendering issues. Rahim et al. [62] introduced a content-dependent objective quality assessment procedure to evaluate the distortions that occur while building the viewport in VR Scenes. They used a supervised learning method to classify their dataset to determine the viewport quality of 360-degree images in a given VR Scene. They set up an experiment to predict the proposed metric against the viewport quality of the image set with reasonable accuracy.

VR Audio Quality: Miroslaw et al. reviewed issues with spatial audio as part of their previous work and have now reviewed the quality aspects of compressed audio on emerging HMDs [50]. They adopted the MUSHRA test methodology (Multiple Stimuli with Hidden Reference and Anchor) to assess the quality of the soundscape of a 360-degree streaming VR for immersion setup. They conducted a subjective evaluation of raw audio content and captured the quality of user experience. They reviewed the options of compressing the spatial audio and proposed a few guidelines for practitioners. They plan to develop an objective spatial audio quality metric as part of their future work. Jules et al. [18] were the first to study continuous movement recognition and real-time sound parameter generation in a VR Scene. They conducted a series of experiments to understand the mapping between the design process and user interaction through a VR Scene using a machine learning approach. They used Auditory Feedback as a quality measure to determine the health of their study. David Triantafyllou et al. [78] are pioneers in studying sound in VR scenes. They conducted two experiments to determine the relationship and shortcomings between physical-world sound and VR Scene sound. They identified the difference between these two experimental conditions and used auditory feedback to assess the quality. They published a few guidelines for practitioners on building combinations of surfaces with better natural sounds in virtual environments. Ceenu et al. [22] investigated whether audio signals and haptic feedback can act as indicators for real-world boundaries, such as objects, walls, and people. They used the NASA TLX Survey and the Presence questionnaire to gather feedback on presence and workload from the participants in their study setup. Adrielle et al. studied audio localization in a VR Scene using spatial audio metrics like the NASA TLX questionnaire to capture workload while performing actions like gaze pointing and wand pointing towards the projected audio in the VR Scene [48].

VR Scene Quality: Blaine Bell et al. [5] were the first to investigate the quality of a rendered VR plane. As part of their work, the authors focused on view management in a 3D view plane and determined the properties of objects, position, size, transparency, and shapes of the virtual world. They proposed a layout decision approach and conducted a subjective evaluation. Further, they proposed a custom quality metric to determine the view plane representation
in a virtual world. Ying Zhang et al. [86] created a multi-modal interface for the virtual environment for an assembly application. They evaluated multi-modal feedback on assembly task activities through simulation in the virtual world. They were the first to build such an interface and conduct a heuristic auditory and sensory feedback study to assess the quality of their environment. They also conducted a subjective evaluation and captured the observations via a questionnaire. They provided recommendations to practitioners on building an efficient task performance based VR Scene. Shun Li et al. worked on a virtual surgical simulation setup with the intent to understand the thermal damage of bone tissue caused by bone drilling [41]. They formulated a virtual surgical process and evaluated the virtual scenes using a customized quality metric called ‘Temperature Distribution’. Dongdong et al. explored the differences in visual discomfort caused by long-term immersion between virtual environments and physical environments [26] across a variety of VR scenes. Visual fatigue scale (VFS), change of pupil size (PS), and accommodation response (ACR) were captured as metrics. Bilal et al. conducted a task-based subjective evaluation of a driving simulation [64] using the user-experience questionnaire and Gaze interaction metrics.
Carvalheiro et al. [8] proposed a haptic based interaction system for virtual reality products. This includes a combination of tracking devices for hands and objects in a given scene. Their solution receives haptic stimuli by manipulating real objects mapped to virtual objects. They conducted a subjective evaluation via an experiment setup and proposed a quality metric called ‘Simulation Awareness’ to understand the stimuli experience of the end-users in a virtual world. Rohan et al. [9] presented a novel algorithmic framework to optimize the depth of camera placements for a given virtual environment. As part of the study, they utilized quality metrics called simulated annealing and depth inaccuracy to evaluate the quality of the construction of the scene in virtual space. In [38], the author investigated a unique visual comfort model for the real prediction of a 3D VR Scene, which is based on the physiologic mechanism. They conducted an experimental study to evaluate their model by utilizing a customized quality metric called multi-modal interactive continuous scoring. This helped them understand the stability and perception of 3D VR Scene images. Jann et al. conducted a VR Scene tolerance study using a custom scale called the Cybersickness Susceptibility Questionnaire [21]. The scope of this metric is to predict the Scene tolerance of the participant.
Hak et al. [37] proposed a quality metric of exceptional motion in VR Video Contents for VR sickness assessment. Their metric was developed to improve the quality of VR scenes to avoid sickness issues. They validated the work using Simulator sickness questionnaires in VR environments. Viktorija et al. studied a Levitation Simulator using a VR based shooting scene to capture workload using the NASA TLX Task-based survey [53]. Jeremy et al. proposed an innovative method to navigate into VR Scenes by opting for acceleration parameters from the users in real-time [57]. The motivation was to address VR Sickness issues. They conducted a case study and used customized quality metrics called Motion-Sickness Dose Value and Electro-dermal activity to judge their results. Ke Gu et al. proposed a novel referenceless quality metric for depth-image based rendering of VR Scenes [25]. This is to ensure that the synthesized free-viewpoint videos are generated with higher accuracy and with less geometric distortion. An experiment was conducted to review the quality method, and it was compared with traditional models. Yingxue et al. worked on improving the immersive viewing experience for VR Scenes [88]. They proposed a display protocol and evaluated it against panoramic VR Scene videos to review the distortions and video coding compression. They conducted a case study on all these VR Scenes and heuristically reviewed the quality of the scenes using the mean opinion score method from the participants. Deba et al. proposed a method for defining a quality metric on context-aware intelligent environments with inference to address physiological processes [67]. Their work focused on VR Scenes which require a visual feedback model, and they developed a Supermarket application for validation. They used electrodermal activity as a quality metric and conducted a task-based experiment to assess their proposed method.
Brendean John et al. [32] worked towards investigating pupillary light response in the virtual scene setup. They used a custom quality metric called pupil light response as a reference scale to detect and identify the rates of cognitive-emotional responses from the participants who were part of the case study. They built a task-based activity experiment and proposed guidelines for building effective scenes with reasonable pupillary interaction. Markus Wirth et al. investigated interaction techniques in a virtual reality scene setup focused on a diagnostic radiology application. This is the first domain-specific VR Application of its kind where a VR Scene based quality metric was evaluated from a Software Engineering perspective [83]. Attractiveness and Pragmatic/Hedonic Quality attributes were evaluated as part of a thorough experiment for radiologists.

VR QoE (Quality of Experience): Mapar et al. [44] and Akpan et al. [1] adopted a heuristic-based personalized quality metric to evaluate a space flight simulation setup and an assembly product. They conducted a study to determine the quality of experience and used egocentric depth perception as a metric [33]. Jarvinen et al. [30] conducted a series of experiments on the spatial setup in VR scenes and evaluated their experiment VR scene to assess the quality of experience in spatial memory. They used a customized test called the Spatial memory test to gauge the quality. Ruddle et al. [66] conducted a VR scene evaluation using travel time, collision index, and speed profile index as the quality of experience in their case study VR Scenes. Markus et al. [82] assessed the personality traits of athletes in a VR football game scene using the Presence Questionnaire. Monthir et al. compared Game-pad and Naturally-mapped Controller Effects on Perceived Virtual Reality Experiences [2] and captured the customized metrics Self-reported Presence, Self-reported Engagement, and Self-reported Accuracy. Peng et al. [55] conducted a comparative study between a PC and VR based presence evaluation of emotional challenge-based games.
Lugrin et al. [42] conducted a subjective evaluation of a task-based assembly activity to analyze the quality of experience of a participant through the quality metrics In-game performance index, In-game navigation, and Multi-Screen usage index to determine the adaptability of the VR Scene. Charles et al. [58] built a workstation in a VR environment to understand the physical risk factors in humans. The quality of the experience was evaluated in a subjective evaluation using rapid upper limb assessment, averaged muscle activations, and Total task time as quality indicators. Hamam et al. [27]
Table 2: Software Quality Metrics mapped with respective type and VR Theme
VR Quality Domain (or) Theme | Software Quality Metrics Used | Type of VR Product | References
Streaming VR Video Quality | Quality Assessment View (QAV) | Streaming VR Video | [87]
VR Audio Quality | MUSHRA test methodology | Use-Case Specific VR Product | [50]
VR Audio Quality | Auditory Feedback | ML based Approach | [18]
VR Audio Quality | Auditory Feedback | Questionnaire | [78] [22] [48]
VR Code Quality | LOC, CC, %Lack of Cohesion | Case Study | [23]
VR Image Quality | MultiScale Structural Similarity (MSSIM) | Use-Case Specific VR Product | [75]
VR Image Quality | Spherical Structural Similarity Index | Use-Case Specific VR Product | [10]
VR Image Quality | DIBR-synthesized IQA metric | Use-Case Specific VR Product | [60]
VR Image Quality | Perceived viewport quality, Content Dependent Objective quality metric | 360 degree video/image | [62]
VR Panoramic Scene Quality | BP-based quality assessment of panoramic videos | Use-Case Specific VR Product | [85]
VR Panoramic Scene Quality | Heuristic | Coding Lab | [88] [76]
VR QoE | Electrodermal Activity, Heart Rate, Miss Clicks | Speech and Language Therapy | [35] [36]
VR QoE | Immersive Experience Questionnaire | Use-Case Specific VR Product | [65]
VR QoE | Interaction Modality | Use-Case Specific VR Product | [52]
VR QoE | Hand Movement Velocity | Rehabilitation Game | [43]
VR QoE | Interface quality, Realism Index | Use-Case Specific VR Product | [71]
VR QoE | Locomotion Index | Use-Case Specific VR Product | [7] [6] [19]
VR QoE | UserState Measure, Perception Measure, Physiological Measure | Game | [27]
VR QoE | Egocentric Depth Perception | Task Based Activity | [33]
VR QoE | Electrocardiographic signal, galvanic skin response, blood volume pressure, electrodermal activity | Task Based Activity | [69] [81]
VR QoE | Flow Experience Analysis | Task Based Activity | [29]
VR QoE | Heuristic | Assembly Product | [1]
VR QoE | Presence Evaluation | Use-Case Specific VR Product | [74] [82] [2]
VR QoE | Subjective Evaluation | Exercise IoT App | [61] [79]
VR QoE | Subjective Evaluation | Use-Case Specific VR Product | [66] [20]
VR QoE | Subjective Evaluation | Task Based Activity | [28] [68]
VR QoE | Subjective Evaluation, content resolution, start delay, stalling ratio | Task Based Activity | [15] [40]
VR QoE | Subjective Evaluation | Walk-In-Place activity | [54]
VR QoE | Heuristic Auditory and Sensory Feedback | Case Study - Automobile Driving | [80]
VR QoE | Heart Rate, Skin Conductivity | Case Study - Public Speaking | [16]
VR QoE | Spatial Memory Test | Case Study - Spaces | [30] [49]
VR QoE | Vibrotactile Feedback | Task Based Activity | [51]
VR QoE | Realism, control, interface quality, ability to examine, performance, haptic sub-scales | Haptic Based Case Study | [4] [55]
VR QoE | Laban Movement Analysis | Task Based Activity | [3]
VR QoE | In-game Performance, In-game Navigation, Multi-Screen usage | Task Based Activity | [42] [24]
VR Scene Quality | Simulated Annealing, Depth Inaccuracy | Use-Case Specific VR Product | [9]
VR Scene Quality | Motion-Sickness Dose Value, Electro-Dermal Activity | Use-Case Specific VR Product | [57]
VR Scene Quality | Depth image-based rendering | Use-Case Specific VR Product | [25]
VR Scene Quality | Multi-modal interactive continuous scoring | Use-Case Specific VR Product | [38]
VR Scene Quality | Temperature distribution | Virtual Surgery - Bone Drilling | [41]
VR Scene Quality | Electrodermal activity quality metric | Super Market | [67]
VR Scene Quality | Subjective Evaluation, Visual Discomfort | Task Based Activity | [32] [26]
VR Scene Quality | Subjective Evaluation | Custom Haptic Setup | [8]
VR Scene Quality | Heuristic Auditory and Sensory Feedback | Assembly Product | [86]
VR Scene Quality | Heuristic Auditory and Sensory Feedback | Case Study | [37]
VR Scene Quality | View Plane Representation | View Plan - Visible Surface Determination | [5]
VR Scene Quality | Attractiveness, Pragmatic and Hedonic Quality | Diagnostic Radiology - Interaction | [83]
VR Simulation Quality | Heuristic | Space Flight Simulation, Driving | [44] [64]
VR Task Quality | Rapid upper limb assessment, averaged muscle activations, Total Task Time | Assembly Product | [58]
VR Video Quality | MultimediaQA, VRVideoQA, Image QA | Use-Case Specific VR Product | [84]
VR Video Quality | Depth-image-based rendering, peak signal-to-noise ratio | Use-Case Specific VR Product | [70]
VR Video Quality | Omnidirectional IQA | Use-Case Specific VR Product | [14]
VR Video Quality | Weighted Peak Signal to Noise Ratio | Use-Case Specific VR Product | [73] [45]
Understanding Software Quality Metrics
for Virtual Reality Products - A Mapping Study ISEC 2021, February 25–27, 2021, Bhubaneswar, Odisha, India
built a game to analyze the quality of user experience in a VR scene by capturing user state measures, perception measures, and physiological measures as quality indicators, and shared significant insights. Steinicke et al. [74] built a simulation system to study the presence of a participant using a presence evaluation method. Aristidou et al. [3] investigated motion capture technology for virtual spaces with a specific focus on folk dance motion analysis. They present a framework to identify the styles in dance motion and use Laban Movement Analysis to study the quality of experience of the generated scene. Narasimham et al. studied spatial memory with respect to heights in adults and teens using a virtual staircase [49]. Metrics like Turning Error, Latency, and Corsi Scores are captured as a case study. Precision and Task Completion Time are captured along with a subject questionnaire to evaluate collaborative tasks in VR [20]. Additionally, various other quality metrics are implemented on prototypes for focused applications through a subjective evaluation to understand the quality of experience. They include Interaction Modality [52] and Locomotion Index [7] [19]. Haptic-based systems used realism, control, interface quality, ability to examine, performance, and haptic subscales [4]. Speech and language-based therapy applications used Electrodermal Activity, Heart Rate, and Miss Clicks [35], and a rehabilitation game employed Hand Movement Velocity as a customized metric [43]. Heuristic and survey-based evaluations were conducted as part of studies [28], [54], [80], [65], where perceptual ratio, estimated path length, recall time, and immersion score were used as metrics. Blaga et al. conducted a quality evaluation of the effect of thermal visual representation on user grasping interaction in a Virtual Reality application [6]. A novel metric called Grasp Aperture was presented and used as part of the study.

The VR applications focused on healthcare can broadly be categorized as detection applications and intervention applications. Given that VR is still nascent, studies have focused more on detection than on intervention. Metrics like electrocardiograph signal, galvanic skin response, blood volume pressure, and electrodermal activity [69] [36]; Flow Experience Analysis for task-based activities [29]; Miss Rate, Merge Rate, and Fragmentation Rate in rehabilitation and trauma-based VR applications [61]; Effect of Reachability [15]; Vibrotactile Feedback [51]; Interface Quality and Realism Index along with an immersive tendency survey [71]; and Heart Rate and Skin Conductivity [16] are used to assess the quality of applications. In [81], PPG and EEG signals are captured to study the participant’s attention in VR applications for learning. This study was conducted to understand the multi-dimensional physiological characteristics of the participants towards immersed learning. Elena et al. were the first to study anti-stress adaptation to a new educational environment for foreign students [77] using a VR tool called Emotional Experience designer. This tool captures emotional and muscle relaxation as quality metrics. Multi-user Isness medical condition experiences were studied using a discussion-based questionnaire [24].

VR applications that focus on providing a high quality of experience through enhanced bandwidth use interesting approaches. Krogfoss et al. studied the impact of 5G bandwidth on improving VR audio and video quality. They conducted a QoE assessment with customized metrics like content resolution (cod – coding), start delay (s), and stalling ratio (t) to assess the experience of a VR scene using 5G bandwidth [40]. Robertro et al. developed a two-stage method to enable efficient streaming via QoE-aware mobile networks for AR/VR scenes [76]. The proposed dynamic path selection approach uses a new QoE model for video streaming. A pupillometry-based metric was proposed by Kenya et al. as a method to evaluate reading comprehension in VR-based educational comics [68]. Maria et al. adapted a human-centered QoE approach to assess delay, jitter, and packet loss in VR applications [79].

R2: Is there any trend in adapting certain software quality metrics/indicators in VR product(s)/app(s)?

As part of our study, we observed that the set of metrics commonly used across VR software applications is fairly limited. Most of the quality indicators or metrics for VR applications are limited to a particular usage and need. The trend is towards researchers opting for customized quality metrics and building a methodology around the metric with a subjective or objective evaluation. Most of these customized metrics are unique and are used in focused VR applications like rehabilitation, trauma, education, and fun task-based prototypes. Metrics like Temperature Distribution [41], Electrodermal Activity, Heart Rate, Miss Clicks [35], Hand Movement Velocity [43], and Pragmatic and Hedonic Quality [83] have been widely used in healthcare-based applications for over a decade. Other metrics like Rapid Upper Limb Assessment, averaged muscle activations, Total Task Time [58], Heuristic Evaluation [44] [1], Auditory and Sensory Feedback [86], and Immersion Score [80] have been distinctively followed in the manufacturing domain for over a decade. Due to the differences in scene design across domains, we found no trace of a generic software metric or indicator in practice.

In some cases [22], researchers relied on a common quality indicator for multiple quality factors like workload and presence. This shows that there is a need for new approaches for use-case based adoption. It also shows that a practitioner’s attitude towards adopting a software quality metric is unique and varies based on the intent of the scene. Most practitioners have not considered even one metric in common to address essential quality requirements. These observations motivate the need for a Software Quality Evaluation framework for VR applications, which includes strategies for addressing image, video, code, and audio quality challenges in a generic way. Such frameworks can be realized only if multiple empirical studies are conducted on large VR product data sets. This will help future VR practitioners adopt generalized quality metrics as a basis and then develop focused quality metrics on top of the framework, based on their business needs. We plan to explore this avenue as part of our future work and will attempt to formulate a generalized Software Quality Evaluation framework for VR products.

Scope for Automation - As part of our review study, it was surprising to note that practitioners or VR product(s) are not using automated methods or metrics to assess software quality. Developers are making progress in building frameworks/tools that can be used for VR testing. While most of the VR quality metrics are defined to achieve Quality of Experience, a large number of these metrics are being evaluated manually. Metrics like lines of code [23],
ISEC 2021, February 25–27, 2021, Bhubaneswar, Odisha, India M. Kuri et al.
lack of cohesion [23], Omnidirectional IQA [14], and depth image-based rendering [25] are currently captured using semi-automated methods. New automated methods, as well as moving current semi-automated methods to fully automated methods, can make the assessment more objective. There is tremendous scope for developing automated approaches to assess and improve VR quality. For example, in VR simulation software, interface testing software, etc., automated assessment of various qualities can significantly help in making the applications better.

R3: Are there any domain specific VR Software Quality metrics?

Table 2 illustrates the details of the metrics used as part of VR product quality assessment across specific Quality Themes (or) Domains. Domain - It is not a type of industry or business but a common theme under which a VR product is developed. To categorize, we consider VR applications that involve assembling objects in a specific order to come under the Assembly domain. VR apps that require users to perform tasks and generate actions in the given VR scene are regarded as the Task-Action domain. VR apps based on games for fun or serious games are regarded as the Gaming domain. VR apps that provide healthcare solutions come under the Healthcare domain. The categorization of domains here is not specific to a business need; it is a heuristic set portrayed as a domain. To present the metrics in Table 2 in a well-defined form, we categorized them into the types below.

Widely Used Metrics: Heuristic Evaluation with Presence Survey is the most widely used quality metric among simulation-based VR applications [44] [88] [55] [1]. Auditory Feedback is another metric used by practitioners to assess VR audio quality [18] [78]. Apart from these two approaches, the rest appear to be either distinct to a particular VR product or not relevant for generic usage. QoE-based metrics like Jitter, Delay, and Packet Loss were captured using Subjective Evaluation [79].

Unique Metrics: We observed a few quality metrics that are one of a kind. Flow Experience Analysis [29] is used in task-based quality assessment methods. This metric can be customized to a specific scale and updated based on the practitioner’s strategy. User State Measure [27], Realism Index [4], Spatial Memory Test [30], Simulation Awareness [8], and Laban Movement Analysis [3] are quality metrics that are unique in how results are gathered and evaluated. In one case, researchers used the Presence Survey and NASA TLX Survey to capture quality data for both audio and haptic feedback, which was uncommon in other works [22]. Visual Fatigue Scale (VFS), Change of Pupil Size (PS), and Accommodation Response (ACR) are captured to evaluate the fatigue rate due to prolonged immersion in a large VR scene [26]. A gender-based case study was conducted to understand VR scene tolerance [21] by introducing a metric called the Cybersickness Susceptibility Questionnaire.

Novel Metrics: We observed a few novel metrics that are not found in traditional software product development. Effect of Reachability [15], Pupillary Light Response [32], Miss Rate - Merge Rate - Fragmentation Rate [61], Temperature Distribution [41], and Vibrotactile Feedback [51] are a few of these metrics. We observe that these metrics are novel and can be adopted by upcoming VR products that plan to focus on Quality-of-Experience and VR scene quality. New QoE metrics, namely content resolution (cod – coding), start delay (s), and stalling ratio (t), were used to assess the experience of a VR scene using 5G bandwidth [40]. Frames to reach a ROI (framesToROI) and Percentage of total fixations inside the ROI (percFixInside) are new metrics developed to study user behavior in a 360-degree video scene [45]. Pupillometry-based metrics like pupil size and dilation time were captured to study reading comprehension in educational comics [68]. Customized metrics like Self-reported Presence, Self-reported Engagement, and Self-reported Accuracy [2] are captured in comparison studies. Grasp in VR is studied through the Grasp Aperture metric [6].

Empirical Metric Evaluation Methods: The domain-specific VR applications relied on empirical approaches to gather the metric data. Question Survey [23] [35], Subjective Survey [84] [44], Presence Survey [7], and Presence Questionnaire [4] [22] [82] have primarily been used as part of Task-Action based VR applications. Mean Opinion Scores [88], Comparative Analysis [60], and Case Study [57] [87] [58] [14] [9] [25] [41] [73] [67] [36] [27] [61] [21] [24] [32] based empirical methods are practiced in Gaming-based VR applications. Experimental Setup [33] [69] [29] [1] [66] [28] [15] [54] [8] [86] [37] [80] [16] [30] [51] [5] [18] [78] [83] [3] [42] [20] [77] [64], Temporal Visual Comfort Model [38], Kinect skeletal model [43], and Kennedy-Lane Simulator Sickness Questionnaires [74] based empirical methods are used in the Healthcare domain. Single-Stimulus [75] and Immersive Experience Questionnaire [65] [81] [52] are followed in Assembly-simulation based VR applications. Susanne et al. studied the effectiveness of questionnaires in VR user studies as a quality aspect and found that questionnaires reduce Break in Presence (BIP) and theoretical bias [59].

Customized Quality Models: A few researchers proposed quality models for improving the streaming of audio and video data [40] [76]. These models are formulated based on network bandwidth. LTE and 5G bandwidths play a vital role in these quality models, where network key quality indicators and QoE quality indicators are compared across varied scales of network bandwidth to judge the streaming quality. A Point Cloud Quality Assessment Model using local binary patterns was proposed to improve the screen quality of VR HMDs for rendering effective scene quality [13].

4 THREATS TO VALIDITY

In this section, we cover the threats to validity of our systematic literature review.

Conclusion Validity - In our study, we considered research papers written in English only. This helped us construct the search string in an appropriate language. There is a possibility of research work involving Software Quality for VR in other languages. We overlooked such papers as it is challenging to comprehend the observations in all possible languages.

Internal Validity - We worked with Software Quality domain experts to monitor and assess the quality of the string search, filtration of search content, review of results, and overall analysis. We received constant feedback from domain experts from both industry and academia to judge our search strategy and the progress of our study. Of course, there could be minor mistakes by the authors regarding the judgment of a research paper during the filtration
process. The peer researchers agreed on the search strings. A keyword or two from the search string might have been overlooked. The publications and discussion in this paper were evaluated based on the judgment and experience of the authors, and other researchers may have appraised these publications previously.

Construct Validity - The review records observations directly from the recognized research papers. The review results consider the credibility of the hypothesis and design of the VR product or scene developed as part of research work.

External Validity - We have made every attempt to conduct this study under a systematic literature search and review protocol. Results may differ if the search strategy and data extraction are altered with a different review protocol.

5 CONCLUSION

In this paper, we discussed the use of software quality metrics or indicators in VR applications or prototypes via a systematic review. The primary motivation of the study was to analyze the current state of the art of practices in adopting software quality metrics or indicators while building virtual reality products or prototypes. There are domain-specific quality metrics that are built for targeted users and focused market(s). Despite the variety of software quality metrics or indicators, there is uncertainty about the most suitable set of metrics for analyzing a particular domain-specific virtual reality product. In a larger context, we found that there is a need for a generalized software quality evaluation framework from which VR practitioners can choose and adopt certain basic metrics that help them achieve quality.

Future Work: In the future, the study can be extended to support choosing an appropriate set of software quality metrics or indicators for future VR products or prototypes developed in both academia and industry. A set of tools can be built using themes like VR scene quality, VR audio quality, etc. to automate the evaluation of VR products. Also, we may focus on developing a Generalized Software Quality Framework for Virtual Reality products so that developers can adopt and assess basic quality indicators during product development.

ACKNOWLEDGMENTS

The authors thank the peer reviewers from industry and academia for participating in various meetings on finalizing the search strategy and review study.

REFERENCES
[1] Justice I. Akpan and Roger J. Brooks. 2005. Experimental Investigation of the Impacts of Virtual Reality on Discrete-event Simulation. In Proceedings of the 37th Conference on Winter Simulation (Orlando, Florida) (WSC ’05). Winter Simulation Conference, 1968–1975. http://dl.acm.org/citation.cfm?id=1162708.1163049
[2] Monthir Ali and Rogelio E. Cardona-Rivera. 2020. Comparing Gamepad and Naturally-Mapped Controller Effects on Perceived Virtual Reality Experiences. In ACM Symposium on Applied Perception 2020 (Virtual Event, USA) (SAP ’20). Association for Computing Machinery, New York, NY, USA, Article 10, 10 pages. https://doi.org/10.1145/3385955.3407923
[3] Andreas Aristidou, Efstathios Stavrakis, Panayiotis Charalambous, Yiorgos Chrysanthou, and Stephania Loizidou Himona. 2015. Folk Dance Evaluation Using Laban Movement Analysis. J. Comput. Cult. Herit. 8, 4, Article 20 (Aug. 2015), 19 pages. https://doi.org/10.1145/2755566
[4] Mahdi Azmandian, Mark Hancock, Hrvoje Benko, Eyal Ofek, and Andrew D. Wilson. 2016. Haptic Retargeting: Dynamic Repurposing of Passive Haptics for Enhanced Virtual Reality Experiences. In Proceedings of the 2016 CHI Conference on Human Factors in Computing Systems (San Jose, California, USA) (CHI ’16). ACM, New York, NY, USA, 1968–1979. https://doi.org/10.1145/2858036.2858226
[5] Blaine Bell, Steven Feiner, and Tobias Höllerer. 2001. View Management for Virtual and Augmented Reality. In Proceedings of the 14th Annual ACM Symposium on User Interface Software and Technology (Orlando, Florida) (UIST ’01). ACM, New York, NY, USA, 101–110. https://doi.org/10.1145/502348.502363
[6] Andreea Dalia Blaga, Maite Frutos-Pascual, Chris Creed, and Ian Williams. 2020. Too Hot to Handle: An Evaluation of the Effect of Thermal Visual Representation on User Grasping Interaction in Virtual Reality. In Proceedings of the 2020 CHI Conference on Human Factors in Computing Systems (Honolulu, HI, USA) (CHI ’20). Association for Computing Machinery, New York, NY, USA, 1–16. https://doi.org/10.1145/3313831.3376554
[7] Evren Bozgeyikli, Andrew Raij, Srinivas Katkoori, and Rajiv Dubey. 2016. Locomotion in Virtual Reality for Individuals with Autism Spectrum Disorder. In Proceedings of the 2016 Symposium on Spatial User Interaction (Tokyo, Japan) (SUI ’16). ACM, New York, NY, USA, 33–42. https://doi.org/10.1145/2983310.2985763
[8] Cristiano Carvalheiro, Rui Nóbrega, Hugo da Silva, and Rui Rodrigues. 2016. User Redirection and Direct Haptics in Virtual Environments. In Proceedings of the 24th ACM International Conference on Multimedia (Amsterdam, The Netherlands) (MM ’16). ACM, New York, NY, USA, 1146–1155. https://doi.org/10.1145/2964284.2964293
[9] R. Chabra, A. Ilie, N. Rewkowski, Y. Cha, and H. Fuchs. 2017. Optimizing placement of commodity depth cameras for known 3D dynamic scene capture. In 2017 IEEE Virtual Reality (VR). 157–166. https://doi.org/10.1109/VR.2017.7892243
[10] S. Chen, Y. Zhang, Y. Li, Z. Chen, and Z. Wang. 2018. Spherical Structural Similarity Index for Objective Omnidirectional Video Quality Assessment. In 2018 IEEE International Conference on Multimedia and Expo (ICME). 1–6. https://doi.org/10.1109/ICME.2018.8486584
[11] D. S. Cruzes and T. Dyba. 2011. Recommended Steps for Thematic Synthesis in Software Engineering. In 2011 International Symposium on Empirical Software Engineering and Measurement. 275–284. https://doi.org/10.1109/ESEM.2011.36
[12] Brian Burke David Cearley. October 2018. Top 10 Strategic Technology Trends for 2019. Gartner Technology Report (October 2018).
[13] R. Diniz, P. G. Freitas, and M. C. Q. Farias. 2020. Towards a Point Cloud Quality Assessment Model using Local Binary Patterns. In 2020 Twelfth International Conference on Quality of Multimedia Experience (QoMEX). 1–6.
[14] H. Duan, G. Zhai, X. Min, Y. Zhu, Y. Fang, and X. Yang. 2018. Perceptual Quality Assessment of Omnidirectional Images. In 2018 IEEE International Symposium on Circuits and Systems (ISCAS). 1–5. https://doi.org/10.1109/ISCAS.2018.8351786
[15] Elham Ebrahimi, Andrew Robb, Leah S. Hartman, Christopher C. Pagano, and Sabarish V. Babu. 2018. Effects of Anthropomorphic Fidelity of Self-avatars on Reach Boundary Estimation in Immersive Virtual Environments. In Proceedings of the 15th ACM Symposium on Applied Perception (Vancouver, British Columbia, Canada) (SAP ’18). ACM, New York, NY, USA, Article 2, 8 pages. https://doi.org/10.1145/3225153.3225170
[16] Meriem El-Yamri, Alejandro Romero-Hernandez, Manuel Gonzalez-Riojo, and Borja Manero. 2019. ComunicArte: A Public Speaking Trainer in Virtual Reality. In Extended Abstracts of the 2019 CHI Conference on Human Factors in Computing Systems (Glasgow, Scotland UK) (CHI EA ’19). ACM, New York, NY, USA, Article VS05, 2 pages. https://doi.org/10.1145/3290607.3311777
[17] International Organization for Standardization and International Electrotechnical Commission. 2001. Software Engineering–Product Quality: Quality model. Vol. 1. ISO/IEC.
[18] Jules Françoise and Frédéric Bevilacqua. 2018. Motion-Sound Mapping Through Interaction: An Approach to User-Centered Design of Auditory Feedback Using Machine Learning. ACM Trans. Interact. Intell. Syst. 8, 2, Article 16 (June 2018), 30 pages. https://doi.org/10.1145/3211826
[19] Jann Philipp Freiwald, Oscar Ariza, Omar Janeh, and Frank Steinicke. 2020. Walking by Cycling: A Novel In-Place Locomotion User Interface for Seated Virtual Reality Experiences. In Proceedings of the 2020 CHI Conference on Human Factors in Computing Systems (Honolulu, HI, USA) (CHI ’20). Association for Computing Machinery, New York, NY, USA, 1–12. https://doi.org/10.1145/3313831.3376574
[20] Jann Philipp Freiwald, Lennart Diedrichsen, Alexander Baur, Oliver Manka, Pedram Berendjy Jorshery, and Frank Steinicke. 2020. Conveying Perspective in Multi-User Virtual Reality Collaborations. In Proceedings of the Conference on Mensch Und Computer (Magdeburg, Germany) (MuC ’20). Association for Computing Machinery, New York, NY, USA, 137–144. https://doi.org/10.1145/3404983.3405521
[21] Jann Philipp Freiwald, Yvonne Göbel, Fariba Mostajeran, and Frank Steinicke. 2020. The Cybersickness Susceptibility Questionnaire: Predicting Virtual Reality Tolerance. In Proceedings of the Conference on Mensch Und Computer (Magdeburg, Germany) (MuC ’20). Association for Computing Machinery, New York, NY, USA, 115–118. https://doi.org/10.1145/3404983.3410022
[22] C. George, P. Tamunjoh, and H. Hussmann. 2020. Invisible Boundaries for VR: Auditory and Haptic Signals as Indicators for Real World Boundaries. IEEE Transactions on Visualization and Computer Graphics (2020), 1–1.
[23] N. Ghrairi, S. Kpodjedo, A. Barrak, F. Petrillo, and F. Khomh. 2018. The State of 1 (2009), 7–15. https://doi.org/10.1016/j.infsof.2008.09.009
Practice on Virtual Reality (VR) Applications: An Exploratory Study on Github [40] B. Krogfoss, J. Duran, P. Perez, and J. Bouwen. 2020. Quantifying the Value of 5G
and Stack Overflow. In 2018 IEEE International Conference on Software Quality, and Edge Cloud on QoE for AR/VR. In 2020 Twelfth International Conference on
Reliability and Security (QRS). 356–366. Quality of Multimedia Experience (QoMEX). 1–4.
[24] David R. Glowacki, Mark D. Wonnacott, Rachel Freire, Becca R. Glowacki, Ella M. [41] S. Li, Y. Chui, and P. Heng. 2014. Simulation of thermal damage to bone tissue
Gale, James E. Pike, Tiu de Haan, Mike Chatziapostolou, and Oussama Metatla. during bone drilling. In 2014 4th IEEE International Conference on Information
2020. Isness: Using Multi-Person VR to Design Peak Mystical Type Experiences Science and Technology. 569–573. https://doi.org/10.1109/ICIST.2014.6920542
Comparable to Psychedelics. In Proceedings of the 2020 CHI Conference on Human [42] Jean-Luc Lugrin, Marc Cavazza, Fred Charles, Marc Le Renard, Jonathan Freeman,
Factors in Computing Systems (Honolulu, HI, USA) (CHI ’20). Association for and Jane Lessiter. 2013. Immersive FPS Games: User Experience and Performance.
Computing Machinery, New York, NY, USA, 1–14. https://doi.org/10.1145/ In Proceedings of the 2013 ACM International Workshop on Immersive Media Expe-
3313831.3376649 riences (Barcelona, Spain) (ImmersiveMe ’13). ACM, New York, NY, USA, 7–12.
[25] K. Gu, V. Jakhetiya, J. Qiao, X. Li, W. Lin, and D. Thalmann. 2018. Model- https://doi.org/10.1145/2512142.2512146
Based Referenceless Quality Metric of 3D Synthesized Images Using Local Image [43] Mengxuan Ma, Rachel Proffitt, and Marjorie Skubic. 2017. Quantitative Assess-
Description. IEEE Transactions on Image Processing 27, 1 (Jan 2018), 394–405. ment and Validation of a Stroke Rehabilitation Game. In Proceedings of the Second
https://doi.org/10.1109/TIP.2017.2733164 IEEE/ACM International Conference on Connected Health: Applications, Systems
[26] J. Guo, D. Weng, H. Fang, Z. Zhang, J. Ping, Y. Liu, and Y. Wang. 2020. Exploring and Engineering Technologies (Philadelphia, Pennsylvania) (CHASE ’17). IEEE
the Differences of Visual Discomfort Caused by Long-term Immersion between Press, Piscataway, NJ, USA, 255–257. https://doi.org/10.1109/CHASE.2017.90
Virtual Environments and Physical Environments. In 2020 IEEE Conference on [44] J. Mapar, K. Brown, J. Medina, K. Laskey, and C. Conaty. 2001. NASA Goddard
Virtual Reality and 3D User Interfaces (VR). 443–452. Space Flight Center Virtual System Design Environment. In 2001 IEEE Aerospace
[27] Abdelwahab Hamam, Abdulmotaleb El Saddik, and Jihad Alja’am. 2014. A Conference Proceedings (Cat. No.01TH8542), Vol. 7. 7–3580 vol.7. https://doi.org/
Quality of Experience Model for Haptic Virtual Environments. ACM Trans. 10.1109/AERO.2001.931435
Multimedia Comput. Commun. Appl. 10, 3, Article 28 (April 2014), 23 pages. [45] C. Marañes, D. Gutierrez, and A. Serrano. 2020. Exploring the impact of 360°
https://doi.org/10.1145/2540991 movie cuts in users’ attention. In 2020 IEEE Conference on Virtual Reality and 3D
[28] Chih-Fan Hsu, Anthony Chen, Cheng-Hsin Hsu, Chun-Ying Huang, Chin-Laung User Interfaces (VR). 73–82.
Lei, and Kuan-Ta Chen. 2017. Is Foveated Rendering Perceivable in Virtual [46] Helen Roberts Mark Petticrew. 2008. Sociology - Reviewed Work: Systematic
Reality?: Exploring the Efficiency and Consistency of Quality Assessment Meth- Reviews in the Social Sciences: A Practical Guide Vol:42, 5 (2008), 1032–1034.
ods. In Proceedings of the 25th ACM International Conference on Multimedia https://www.jstor.org/stable/42857205?seq=1#page_scan_tab_contents
(Mountain View, California, USA) (MM ’17). ACM, New York, NY, USA, 55–63. [47] Y. Raghu Reddy Mohit Kuri, Sai Anirudh Karre. Last Accessed - Sept 2020. SLR
https://doi.org/10.1145/3123266.3123434 Suppliment data. (Last Accessed - Sept 2020). https://serc.iiit.ac.in/resources/
[29] Meng-Hsuan Huang and Saiau-Yue Tsau. 2018. A Flow Experience Analysis on the projects/smsvrqa/
Virtual Reality Artwork: La Camera Insabbiata. In Proceedings of the International [48] A. N. Moraes, R. Flynn, A. Hines, and N. Murray. 2020. Evaluating the User
Conference on Machine Vision and Applications (Singapore, Singapore) (ICMVA in a Sound Localisation Task in a Virtual Reality Application. In 2020 Twelfth
2018). ACM, New York, NY, USA, 51–55. https://doi.org/10.1145/3220511.3220514 International Conference on Quality of Multimedia Experience (QoMEX). 1–6.
[30] Hannu Järvinen, Ulysses Bernardet, and Paul F.M.J. Verschure. 2011. Interaction [49] Gayathri Narasimham, Haley Adams, John Rieser, and Bobby Bodenheimer. 2020.
Mapping Affects Spatial Memory and the Sense of Presence when Navigating Encoding Height: Egocentric Spatial Memory of Adults and Teens in a Virtual
in a Virtual Environment. In Proceedings of the Fifth International Conference on Stairwell. In ACM Symposium on Applied Perception 2020 (Virtual Event, USA)
Tangible, Embedded, and Embodied Interaction (Funchal, Portugal) (TEI ’11). ACM, (SAP ’20). Association for Computing Machinery, New York, NY, USA, Article 8,
New York, NY, USA, 321–324. https://doi.org/10.1145/1935701.1935776 8 pages. https://doi.org/10.1145/3385955.3407938
[31] Sankar Jayaram, Hugh I. Connacher, and Kevin W. Lyons. 1997. Virtual assembly [50] M. Narbutt, S. O’Leary, A. Allen, J. Skoglund, and A. Hines. 2017. Streaming
using virtual reality techniques. Computer-Aided Design 29, 8 (1997), 575–584. VR for immersion: Quality aspects of compressed spatial audio. In 2017 23rd
https://doi.org/10.1016/S0010-4485(96)00094-2 International Conference on Virtual System Multimedia (VSMM). 1–6. https:
[32] Brendan John, Pallavi Raiturkar, Arunava Banerjee, and Eakta Jain. 2018. An //doi.org/10.1109/VSMM.2017.8346301
Evaluation of Pupillary Light Response Models for 2D Screens and VR HMDs. In [51] Tomi Nukarinen, Jari Kangas, Jussi Rantala, Toni Pakkanen, and Roope Raisamo.
Proceedings of the 24th ACM Symposium on Virtual Reality Software and Technology 2018. Hands-free Vibrotactile Feedback for Object Selection Tasks in Virtual
(Tokyo, Japan) (VRST ’18). ACM, New York, NY, USA, Article 19, 11 pages. https://doi.org/10.1145/3281505.3281538
Reality. In Proceedings of the 24th ACM Symposium on Virtual Reality Software and Technology (Tokyo, Japan) (VRST ’18). ACM, New York, NY, USA, Article 94, 2 pages. https://doi.org/10.1145/3281505.3283375
[33] J. Adam Jones, J. Edward Swan, II, Gurjot Singh, Eric Kolstad, and Stephen R. Ellis. 2008. The Effects of Virtual Reality, Augmented Reality, and Motion Parallax on Egocentric Depth Perception. In Proceedings of the 5th Symposium on Applied Perception in Graphics and Visualization (Los Angeles, California) (APGV ’08). ACM, New York, NY, USA, 9–14. https://doi.org/10.1145/1394281.1394283
[34] Sai Anirudh Karre, Neeraj Mathur, and Y. Raghu Reddy. 2019. Is Virtual Reality Product Development Different?: An Empirical Study on VR Product Development Practices. In Proceedings of the 12th Innovations on Software Engineering Conference (Formerly Known As India Software Engineering Conference) (Pune, India) (ISEC ’19). ACM, New York, NY, USA, Article 3, 11 pages. https://doi.org/10.1145/3299771.3299772
[35] Conor Keighrey, Ronan Flynn, Siobhan Murray, Sean Brennan, and Niall Murray. 2017. Comparing User QoE via Physiological and Interaction Measurements of Immersive AR and VR Speech and Language Therapy Applications. In Proceedings of the on Thematic Workshops of ACM Multimedia 2017 (Mountain View, California, USA) (Thematic Workshops ’17). ACM, New York, NY, USA, 485–492. https://doi.org/10.1145/3126686.3126747
[36] C. Keighrey, R. Flynn, S. Murray, and N. Murray. 2020. A Physiology-based QoE Comparison of Interactive Augmented Reality, Virtual Reality and Tablet-based Applications. IEEE Transactions on Multimedia (2020), 1–1.
[37] Hak Gu Kim, Wissam J. Baddar, Heoun-taek Lim, Hyunwook Jeong, and Yong Man Ro. 2017. Measurement of Exceptional Motion in VR Video Contents for VR Sickness Assessment Using Deep Convolutional Autoencoder. In Proceedings of the 23rd ACM Symposium on Virtual Reality Software and Technology (Gothenburg, Sweden) (VRST ’17). ACM, New York, NY, USA, Article 36, 7 pages. https://doi.org/10.1145/3139131.3139137
[38] T. Kim. 2017. Theoretical analysis of the physiologic mechanism for visual comfort in 3D virtual reality. In 2017 IEEE International Conference on Consumer Electronics (ICCE). 302–305. https://doi.org/10.1109/ICCE.2017.7889329
[39] Barbara Kitchenham, Pearl Brereton, David Budgen, Mark Turner, John Bailey, and Stephen G. Linkman. 2009. Systematic literature reviews in software engineering - A systematic literature review. Information & Software Technology 51,
[52] Yun Suen Pai, Benjamin Outram, Noriyasu Vontin, and Kai Kunze. 2016. Transparent Reality: Using Eye Gaze Focus Depth As Interaction Modality. In Proceedings of the 29th Annual Symposium on User Interface Software and Technology (Tokyo, Japan) (UIST ’16 Adjunct). ACM, New York, NY, USA, 171–172. https://doi.org/10.1145/2984751.2984754
[53] Viktorija Paneva, Myroslav Bachynskyi, and Jörg Müller. 2020. Levitation Simulator: Prototyping Ultrasonic Levitation Interfaces in Virtual Reality. In Proceedings of the 2020 CHI Conference on Human Factors in Computing Systems (Honolulu, HI, USA) (CHI ’20). Association for Computing Machinery, New York, NY, USA, 1–12. https://doi.org/10.1145/3313831.3376409
[54] Richard Paris, Miti Joshi, Qiliang He, Gayathri Narasimham, Timothy P. McNamara, and Bobby Bodenheimer. 2017. Acquisition of Survey Knowledge Using Walking in Place and Resetting Methods in Immersive Virtual Environments. In Proceedings of the ACM Symposium on Applied Perception (Cottbus, Germany) (SAP ’17). ACM, New York, NY, USA, Article 7, 8 pages. https://doi.org/10.1145/3119881.3119889
[55] Xiaolan Peng, Jin Huang, Alena Denisova, Hui Chen, Feng Tian, and Hongan Wang. 2020. A Palette of Deepened Emotions: Exploring Emotional Challenge in Virtual Reality Games. In Proceedings of the 2020 CHI Conference on Human Factors in Computing Systems (Honolulu, HI, USA) (CHI ’20). Association for Computing Machinery, New York, NY, USA, 1–13. https://doi.org/10.1145/3313831.3376221
[56] Kai Petersen, Sairam Vakkalanka, and Ludwik Kuzniarz. 2015. Guidelines for conducting systematic mapping studies in software engineering: An update. Information and Software Technology 64 (2015), 1–18. https://doi.org/10.1016/j.infsof.2015.03.007
[57] J. Plouzeau, J. Chardonnet, and F. Merienne. 2018. Using Cybersickness Indicators to Adapt Navigation in Virtual Reality: A Pre-Study. In 2018 IEEE Conference on Virtual Reality and 3D User Interfaces (VR). 661–662. https://doi.org/10.1109/VR.2018.8446192
[58] C. Pontonnier, A. Samani, M. Badawi, P. Madeleine, and G. Dumont. 2014. Assessing the Ability of a VR-Based Assembly Task Simulation to Evaluate Physical Risk
Understanding Software Quality Metrics
for Virtual Reality Products - A Mapping Study ISEC 2021, February 25–27, 2021, Bhubaneswar, Odisha, India
Factors. IEEE Transactions on Visualization and Computer Graphics 20, 5 (May 2014), 664–674. https://doi.org/10.1109/TVCG.2013.252
[59] Susanne Putze, Dmitry Alexandrovsky, Felix Putze, Sebastian Höffner, Jan David Smeddinck, and Rainer Malaka. 2020. Breaking The Experience: Effects of Questionnaires in VR User Studies. In Proceedings of the 2020 CHI Conference on Human Factors in Computing Systems (Honolulu, HI, USA) (CHI ’20). Association for Computing Machinery, New York, NY, USA, 1–15. https://doi.org/10.1145/3313831.3376144
[60] J. Qiao, M. Liu, S. Li, Z. He, and Z. Yang. 2018. Highly Efficient Quality Assessment of 3D-Synthesized Views Based on Compression Technology. IEEE Access 6 (2018), 42309–42318. https://doi.org/10.1109/ACCESS.2018.2859439
[61] Fazlay Rabbi, Taiwoo Park, Biyi Fang, Mi Zhang, and Youngki Lee. 2018. When Virtual Reality Meets Internet of Things in the Gym: Enabling Immersive Interactive Machine Exercises. Proc. ACM Interact. Mob. Wearable Ubiquitous Technol. 2, 2, Article 78 (July 2018), 21 pages. https://doi.org/10.1145/3214281
[62] F. Rahim, M. P. Queluz, and J. Ascenso. 2018. Objective Assessment of Line Distortions in Viewport Rendering of 360° Images. In 2018 IEEE International Conference on Artificial Intelligence and Virtual Reality (AIVR). 68–75. https://doi.org/10.1109/AIVR.2018.00017
[63] R. Ramadan and Y. Widyani. 2013. Game development life cycle guidelines. In 2013 International Conference on Advanced Computer Science and Information Systems (ICACSIS). 95–100.
[64] Andreas Riegler, Bilal Aksoy, Andreas Riener, and Clemens Holzmann. 2020. Gaze-Based Interaction with Windshield Displays for Automated Driving: Impact of Dwell Time and Feedback Design on Task Performance and Subjective Workload. In 12th International Conference on Automotive User Interfaces and Interactive Vehicular Applications (Virtual Event, DC, USA) (AutomotiveUI ’20). Association for Computing Machinery, New York, NY, USA, 151–160. https://doi.org/10.1145/3409120.3410654
[65] Katja Rogers, Giovanni Ribeiro, Rina R. Wehbe, Michael Weber, and Lennart E. Nacke. 2018. Vanishing Importance: Studying Immersive Effects of Game Audio Perception on Player Experiences in Virtual Reality. In Proceedings of the 2018 CHI Conference on Human Factors in Computing Systems (Montreal QC, Canada) (CHI ’18). ACM, New York, NY, USA, Article 328, 13 pages. https://doi.org/10.1145/3173574.3173902
[66] Roy A. Ruddle, Ekaterina Volkova, and Heinrich H. Bülthoff. 2013. Learning to Walk in Virtual Reality. ACM Trans. Appl. Percept. 10, 2, Article 11 (June 2013), 17 pages. https://doi.org/10.1145/2465780.2465785
[67] D. P. Saha, L. Thomas Martin, and R. Benjamin Knapp. 2018. Towards Defining a Quality-Metric for Affective Feedback in an Intelligent Environment. In 2018 IEEE International Conference on Pervasive Computing and Communications Workshops (PerCom Workshops). 609–614. https://doi.org/10.1109/PERCOMW.2018.8480245
[68] K. Sakamoto, S. Shirai, J. Orlosky, H. Nagataki, N. Takemura, M. Alizadeh, and M. Ueda. 2020. Exploring Pupillometry as a Method to Evaluate Reading Comprehension in VR-based Educational Comics. In 2020 IEEE Conference on Virtual Reality and 3D User Interfaces Abstracts and Workshops (VRW). 422–426.
[69] Débora Pereira Salgado, Felipe Roque Martins, Thiago Braga Rodrigues, Conor Keighrey, Ronan Flynn, Eduardo Lázaro Martins Naves, and Niall Murray. 2018. A QoE Assessment Method Based on EDA, Heart Rate and EEG of a Virtual Reality Assistive Technology System. In Proceedings of the 9th ACM Multimedia Systems Conference (Amsterdam, Netherlands) (MMSys ’18). ACM, New York, NY, USA, 517–520. https://doi.org/10.1145/3204949.3208118
[70] S. Schwarz and M. M. Hannuksela. 2017. Perceptual quality assessment of HEVC main profile depth map compression for six degrees of freedom virtual reality video. In 2017 IEEE International Conference on Image Processing (ICIP). 181–185. https://doi.org/10.1109/ICIP.2017.8296267
[71] Valentin Schwind, Pascal Knierim, Nico Haas, and Niels Henze. 2019. Using Presence Questionnaires in Virtual Reality. In Proceedings of the 2019 CHI Conference on Human Factors in Computing Systems (Glasgow, Scotland UK) (CHI ’19). ACM, New York, NY, USA, Article 360, 12 pages. https://doi.org/10.1145/3290605.3300590
[72] Emad Shihab. 2014. Practical Software Quality Prediction. In 30th IEEE International Conference on Software Maintenance and Evolution, Victoria, BC, Canada, September 29 - October 3, 2014. 639–644. https://doi.org/10.1109/ICSME.2014.114
[73] N. Sidaty, P. Cabarat, W. Hamidouche, D. Menard, and O. Deforges. 2018. Performance and Computational Complexity of the Future Video Coding. In 2018 IEEE International Workshop on Signal Processing Systems (SiPS). 31–36. https://doi.org/10.1109/SiPS.2018.8598306
[74] Frank Steinicke and Gerd Bruder. 2014. A Self-experimentation Report About Long-term Use of Fully-immersive Technology. In Proceedings of the 2nd ACM Symposium on Spatial User Interaction (Honolulu, Hawaii, USA) (SUI ’14). ACM, New York, NY, USA, 66–69. https://doi.org/10.1145/2659766.2659767
[75] W. Sun, K. Gu, G. Zhai, S. Ma, W. Lin, and P. Le Callet. 2017. CVIQD: Subjective quality evaluation of compressed virtual reality images. In 2017 IEEE International Conference on Image Processing (ICIP). 3450–3454. https://doi.org/10.1109/ICIP.2017.8296923
[76] R. I. Tavares da Costa Filho, F. De Turck, and L. P. Gaspary. 2020. From 2D to Next Generation VR/AR Videos: Enabling Efficient Streaming via QoE-aware Mobile Networks. In NOMS 2020 - 2020 IEEE/IFIP Network Operations and Management Symposium. 1–6.
[77] Elena Tikhonova, Galina Efremova, and Irina Terehina. 2020. Virtual Reality as A Tool for Foreign Students’ Anti-Stress Adaptation to A New Educational Environment. In 2020 The 4th International Conference on Education and Multimedia Technology (Kyoto, Japan) (ICEMT 2020). Association for Computing Machinery, New York, NY, USA, 127–132. https://doi.org/10.1145/3416797.3416833
[78] David Triantafyllou, Angeliki Antoniou, and George Lepouras. 2016. Sound and Kinesthesis in Virtual Environments: Pilot Experiment to Compare Physical and Digital Sound Contradictions. In Proceedings of the 20th Pan-Hellenic Conference on Informatics (Patras, Greece) (PCI ’16). ACM, New York, NY, USA, Article 55, 4 pages. https://doi.org/10.1145/3003733.3003737
[79] S. Van Damme, M. T. Vega, and F. De Turck. 2020. Human-centric Quality Management of Immersive Multimedia Applications. In 2020 6th IEEE Conference on Network Softwarization (NetSoft). 57–64.
[80] Marcel Walch, Julian Frommel, Katja Rogers, Felix Schüssel, Philipp Hock, David Dobbelstein, and Michael Weber. 2017. Evaluating VR Driving Simulation from a Player Experience Perspective. In Proceedings of the 2017 CHI Conference Extended Abstracts on Human Factors in Computing Systems (Denver, Colorado, USA) (CHI EA ’17). ACM, New York, NY, USA, 2982–2989. https://doi.org/10.1145/3027063.3053202
[81] B. Wan and J. Guo. 2020. Learning Immersion Assessment Model Based on Multi-dimensional Physiological Characteristics. In 2020 IEEE International Conference on Power, Intelligent Computing and Systems (ICPICS). 87–90.
[82] M. Wirth, S. Gradl, W. A. Mehringer, R. Kulpa, H. Rupprecht, D. Poimann, A. F. Laudanski, and B. M. Eskofier. 2020. Assessing Personality Traits of Team Athletes in Virtual Reality. In 2020 IEEE Conference on Virtual Reality and 3D User Interfaces Abstracts and Workshops (VRW). 101–108.
[83] Markus Wirth, Stefan Gradl, Jan Sembdner, Soeren Kuhrt, and Bjoern M. Eskofier. 2018. Evaluation of Interaction Techniques for a Virtual Reality Reading Room in Diagnostic Radiology. In Proceedings of the 31st Annual ACM Symposium on User Interface Software and Technology (Berlin, Germany) (UIST ’18). ACM, New York, NY, USA, 867–876. https://doi.org/10.1145/3242587.3242636
[84] J. Yang, T. Liu, B. Jiang, H. Song, and W. Lu. 2018. 3D Panoramic Virtual Reality Video Quality Assessment Based on 3D Convolutional Neural Networks. IEEE Access 6 (2018), 38669–38682. https://doi.org/10.1109/ACCESS.2018.2854922
[85] S. Yang, J. Zhao, T. Jiang, J. W. T. Rahim, B. Zhang, Z. Xu, and Z. Fei. 2017. An objective assessment method based on multi-level factors for panoramic videos. In 2017 IEEE Visual Communications and Image Processing (VCIP). 1–4. https://doi.org/10.1109/VCIP.2017.8305133
[86] Ying Zhang, T. Fernando, R. Sotudeh, and Hannan Xiao. 2005. The use of visual and auditory feedback for assembly task performance in a virtual environment. In Ninth International Conference on Information Visualisation (IV ’05). 779–784. https://doi.org/10.1109/IV.2005.127
[87] A. Zare, A. Aminlou, and M. M. Hannuksela. 2017. Virtual reality content streaming: Viewport-dependent projection and tile-based techniques. In 2017 IEEE International Conference on Image Processing (ICIP). 1432–1436. https://doi.org/10.1109/ICIP.2017.8296518
[88] Y. Zhang, Y. Wang, F. Liu, Z. Liu, Y. Li, D. Yang, and Z. Chen. 2018. Subjective Panoramic Video Quality Assessment Database for Coding Applications. IEEE Transactions on Broadcasting 64, 2 (June 2018), 461–473. https://doi.org/10.1109/TBC.2018.2811627