Having established the importance of realistic 3D product display for immersive e-commerce, as well as the resource constraints faced by smaller businesses, we now study scanning methods that seek a balance between quality, cost and real-time rendering performance on VR headsets.
In the following subsections we present in detail the software and hardware used to digitise physical products, the characteristics and quality metrics used to compare the digitisation techniques and, finally, the experiments carried out on a set of sample products.
3.1. Software Tools Selection Based on Scanning Techniques
In
Section 2, state-of-the-art 3D scanning techniques were introduced, including photogrammetry, LiDAR mapping and NeRF. In the current section, we focus on choosing low-cost tools and technologies that support these techniques and meet the needs of small businesses. We would like to point out that all the selected software tools have a user interface very similar to that of the camera application integrated into a mobile phone. In addition, they usually include tooltips and visual guides for performing scans in video mode. This results in a gentle learning curve for people who are not experts in scanning technologies [28], which is an incentive to recommend these applications over a dedicated 3D scanner.
LiDAR sensors were first integrated into consumer mobile devices with the 4th-generation iPad Pro and the iPhone 12 Pro (2020). Several authors have conducted studies showing that these sensors produce accurate models. For example, ref. [29] reported an accuracy of ±1 cm for small objects with a side length greater than 10 cm, the detection limit being around 5 cm. This suggests that LiDAR sensors might be more accurate, and therefore preferable, for large objects. Even though these devices are considerably cheaper than professional 3D scanners, they may not be affordable for every small business owner. It is therefore necessary to also explore scanning tools available on Android devices, as this mobile operating system holds a 71.44% global market share (https://www.statista.com/statistics/272698/global-market-share-held-by-mobile-operating-systems-since-2009/ accessed on 7 July 2024). Consequently, we searched for scanning applications available on either Android or iOS to select a set of tools that allow us to study the three technologies mentioned above. Even though some applications had both Android and iOS versions, the features offered differed in some cases. We therefore explored scanning applications from various sources, including the literature, to identify those most used by researchers.
Regarding the current state of photogrammetry applications, the image acquisition phase is easily handled by the cameras of current mobile devices [30]. These cameras have far better specifications than those of earlier handsets, especially on high-end devices, making it possible to obtain high-quality images from which to generate the 3D models. In addition, the remaining phases (image processing, point cloud generation, mesh generation and post-processing) are performed by the more powerful processors of current mobile devices or offloaded to a cloud-based computing service. We therefore explored free or low-cost applications available for Android devices, considering users' ratings and reviews as well as published results obtained with such applications when deciding whether to include them. Since all of them are based on photogrammetry, we also considered additional features made public by their developers, such as in-app editing tools for the obtained model, the estimated time for generating the model, subscription prices, export formats and the guides offered for scanning.
With respect to NeRF, we believe that this novel technique can offer high-quality results, even when compared with professional scanners. For instance, the authors of [31] showed encouraging results: compared with Agisoft Metashape 2.1.0, accuracy comparisons performed over different datasets gave differences of less than 1 cm, averaging around 0.5 cm. Due to the novelty of this technique, we chose only Luma AI, since it is free and accessible to small retailers, aligning with our goal of identifying low-cost solutions, and it is the most widely used in the literature, demonstrating its effectiveness and reliability in generating high-quality 3D models with the NeRF technique.
Table 1 shows the relevant features of the scanning applications found, with respect to the economic resources of small businesses.
The ability to export models is essential both to evaluate their characteristics and to use them in further VR applications. Note that the pricing of these applications is very similar; hence, pricing is not a differentiating factor. However, after conducting a test scan of one of the selected objects under the same conditions with each application, we decided to use Polycam as the application for evaluating the photogrammetry and LiDAR techniques. Polycam was selected for its user-friendly interface, extensive usage in the literature and comprehensive feature set. It supports both photogrammetry and LiDAR scanning, which is crucial for our comparative analysis. Additionally, Polycam offers multiple export formats and editing tools, making it a versatile option for small businesses aiming to digitise their product catalogues for VR shopping environments.
Therefore, we selected Polycam and Luma AI as the scanning applications for this study, the latter being the only application using NeRF technology at the time we carried out the selection process. Once the selection was complete, we began the scanning process and defined the characteristics to be measured.
3.4. Scanning of Basic 3D Primitives
Because of the above, we decided to build, scan and model the objects shown in Figure 1. The cube, pyramid and sphere are basic primitives used in computer graphics software and are easy to model in a program like Blender 3.5 (https://www.blender.org/ accessed on 7 July 2024) with little experience, as are the letters 'A', 'I' and 'R', in this case in Times New Roman font. After measuring the expanded polystyrene objects and modelling them in Blender, we obtained models close to the real ones with a significantly reduced polygon count, which allowed us to calculate the error of the technology used as a reference in the comparisons. Thus, even when using the most detailed model from one of the tools as a reference, we know its inherent error in advance.
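As an illustration of how little effort this modelling step requires, the following minimal sketch uses Blender's Python API (bpy) to generate the three primitives programmatically. The 30 cm dimensions and output file name are placeholders, not the exact measurements of our polystyrene objects.

```python
# Minimal Blender (bpy) sketch: generate low-poly reference primitives
# at real-world scale (the 0.3 m dimensions are illustrative only).
import bpy

# Cube with a 0.3 m side
bpy.ops.mesh.primitive_cube_add(size=0.3, location=(0, 0, 0.15))

# A four-sided cone approximates a pyramid with a square base
bpy.ops.mesh.primitive_cone_add(
    vertices=4, radius1=0.15, depth=0.3, location=(0.5, 0, 0.15))

# UV sphere; segments/rings control the polygon count of the reference
bpy.ops.mesh.primitive_uv_sphere_add(
    radius=0.15, segments=32, ring_count=16, location=(1.0, 0, 0.15))

# Export the scene as a single OBJ reference model (Blender 3.x exporter)
bpy.ops.wm.obj_export(filepath="reference_primitives.obj")
```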
Following the scanning process defined in
Section 3.5, these figures were scanned with Luma AI, exporting the resulting high-poly models, as well as with Polycam on both devices. We then used the 3D models designed in Blender as reference models to obtain the MSDM2 (Mesh Structural Distortion Measure 2) metric [36] and the accuracy of the 3D models (see Section 3.3) produced by the scans. In this way, we quantified the inherent errors of these scanning applications. The philosophy behind MSDM2 is very similar to that of CMDM (see Section 3.3), which aims to score the subjective visual quality, as perceived by human vision, of a 3D model with respect to a reference model. However, unlike MSDM2, CMDM also considers colour-related features, so we decided not to use it here: the models created with Blender have no textures, which could compromise the accuracy of the resulting score.
Table 2 shows the results obtained for these metrics. The RMS values obtained from the comparisons in CloudCompare 2.9.3 were 0.00425 ± 0.00253 m. That is, on average we can expect an error of approximately 0.42 cm in point-to-point distances between the real object and the generated 3D model. Depending on the dimensions of the object, 0.42 cm could be critical; however, considering the dimensions of the real expanded polystyrene objects (around 30 cm; they can be consulted in the following repository: https://github.com/AIR-Research-Group-UCLM/PDIVR-ZOCO accessed on 7 July 2024), the average relative error is 1.4%. With respect to the MSDM2 values obtained, the mean is 0.112375. A value of 0, as in CMDM, indicates that the meshes are identical. Furthermore, in the work of the authors of the metric, graphical examples show that a value of 0.14 is barely noticeable to the human eye [36]. With these results, we can anticipate the error introduced by the scans when using this type of model as a reference for comparisons.
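For readers wishing to reproduce this kind of accuracy figure, the following sketch computes a cloud-to-cloud RMS in the spirit of CloudCompare's distance tool, here using the Open3D library; the file names, sample counts and 30 cm object size are illustrative assumptions, not our exact setup.

```python
# Sketch: cloud-to-cloud RMS between a scanned model and the Blender
# reference, analogous to CloudCompare's distance computation.
import numpy as np
import open3d as o3d

reference = o3d.io.read_triangle_mesh("cube_reference.obj")      # placeholder
scanned = o3d.io.read_triangle_mesh("cube_luma_high_poly.obj")   # placeholder

# Sample dense point clouds from both meshes
ref_pcd = reference.sample_points_uniformly(number_of_points=100_000)
scan_pcd = scanned.sample_points_uniformly(number_of_points=100_000)

# Nearest-neighbour distances from the scanned cloud to the reference
distances = np.asarray(scan_pcd.compute_point_cloud_distance(ref_pcd))
rms = np.sqrt(np.mean(distances ** 2))

# Relative error for an object of a given size (0.30 m here):
# e.g. 0.00425 m / 0.30 m ≈ 1.4%, matching the figure reported above.
object_size = 0.30
print(f"RMS: {rms:.5f} m ({100 * rms / object_size:.1f}% of object size)")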
3.5. Scanning of Objects Selected for the Study
We have followed best practice guidelines provided by Polycam 1.3.10 (
https://learn.poly.cam/about accessed on 7 July 2024) and Luma 1.3.8 (
https://docs.lumalabs.ai/MCrGAEukR4orR9 accessed on 7 July 2024), including ensuring good lighting, moving slowly while capturing and avoiding objects with transparent materials or complex reflections. There is a set of well-known materials for which it can be hard to obtain good-quality scan results: transparent materials (e.g., glass or plastic), reflective materials (e.g., mirror-like or metallic surfaces), smooth and even surfaces (e.g., a white wall) and furry materials (e.g., a carpet). It can be seen in Figure 2b that the mirror-like material of the object is badly represented, with holes in the 3D model and the bottom part left unfilled. Therefore, we tried to avoid objects with such materials in this work in order to obtain high visual quality for the later VR experimentation, in which a high level of realism is important. However, we did select two objects made of one of the materials listed above, in order to research the impressions they cause on users in further studies.
Although the guidance interface in Luma is more comprehensive than that of Polycam, it required, on average, 4 min more to complete the capture rings. We placed each object on a round table with a radius of 50 cm and moved around it to capture the different points of view. We did not take the processing time for generating the model into account, since both applications rely on cloud computing for this step. However, the use of LiDAR sensors in Polycam allows for local processing, generating the model in between 1 min 30 s and 3 min, depending on the product scanned.
We note that the scanning time with Polycam was around 4 min, while with Luma it was around 8 min. The model generation time, on the other hand, depends on the workload of the cloud servers at a given moment: models generated with Polycam took around 3–5 min, while those generated with Luma AI could take up to 10 min. These times were measured at the time of the study and may change in the near future.
It is important to consider the various formats these applications offer for the generated 3D models. Ordered from the least to the most detailed model, Luma AI offers low-poly, medium-poly and high-poly, while Polycam provides optimized, medium, full and raw. Regarding Polycam, the last two options are better suited to VFX and professional workflows, whereas the first two are designed for game engines [37]. The optimised option is the most convenient for fast loading and real-time rendering, which matches our target of allowing potential customers to interact with the 3D models of products in the VR environment. Although it might seem that, among the options provided by Luma AI, low-poly is the best for our use case, we explored every format, since there is no existing research specifically focused on this technology for our use case.
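Where an application's low-poly export is not sufficient, a high-poly model can also be decimated offline to a chosen polygon budget. The following sketch illustrates this with Open3D's quadric decimation; it is not part of either application's pipeline, and the file names and the 20,000-triangle target are assumptions.

```python
# Sketch: reduce a high-poly export to a VR-friendly polygon budget
# with quadric decimation, as an alternative to the apps' own low-poly
# exports. The 20k-triangle target is an assumption, not a recommendation
# from either application.
import open3d as o3d

mesh = o3d.io.read_triangle_mesh("sneakers_luma_high_poly.obj")  # placeholder
print(f"Input triangles: {len(mesh.triangles)}")

simplified = mesh.simplify_quadric_decimation(target_number_of_triangles=20_000)
simplified.compute_vertex_normals()  # recompute shading normals after decimation

o3d.io.write_triangle_mesh("sneakers_vr_ready.obj", simplified)
print(f"Output triangles: {len(simplified.triangles)}")
```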
We selected five objects (see
Figure 3) made of different materials and of different sizes that could potentially be showcased in a VR e-commerce platform. In addition, we considered objects large enough for LiDAR sensors to detect them effectively, since these sensors commonly give worse results with small objects, such as LEGO blocks [
19]. Therefore, we selected objects sized between 20 and 100 cm in height or width. It was also important to select objects with different colours, as LiDAR sensors seem to return more 3D data points when lighter coloured objects are scanned [
6,
19].
Figure 4 shows the 3D models, in .obj format, exported as low-poly models from Luma AI.
The selected objects represent a spectrum of surface types and interactions with light, which is essential for assessing scanning effectiveness and detail accuracy. We include organic materials and textiles, which predominantly absorb and scatter light, as examples of diffuse surfaces. Ceramics were chosen for their semi-specular properties, providing a balanced mix of light absorption and moderate reflection. Metallic objects, with their highly reflective, specular surfaces, were included to test the applications' ability to handle intense reflections and mirroring effects.
3.6. 3D Model Quality Evaluation
Table 3 shows the measurements taken for each 3D model obtained from the scanning process, for each of the features described above. On the one hand, it is noteworthy that the polygon counts of Luma AI's high-poly models are between 9 and 10 times higher than those of the medium-poly models, and approximately 50 times higher than those of the low-poly models. This seems to indicate that high-poly models will not be the most feasible to use in VR environments, in terms of their impact on VR headset performance. This hypothesis will be tested in
Section 4.
As can be seen, the Luma AI high-poly files are very large compared to the other types, as are their textures, which have a higher resolution. The scanning applications do not offer control over the desired texture resolution when exporting. Polycam generates a single texture file: 4096 × 4096 pixels in the Android version (photogrammetry), while in the iOS version (LiDAR sensors) the file is 4096 pixels wide with a variable height depending on the scanned object. Luma AI, on the other hand, generates several texture files in all three formats in a much more flexible way. For example, for the sneakers in high-poly format, 78 files were generated, the highest resolution being 4096 × 4096, while for the octopus 56 files were generated, the highest resolution being 2048 × 2048. The same applies to the rest of the objects, with 3–4 files generated for low-poly models and between 10 and 30 for medium-poly ones. This may indicate that Luma AI's algorithms adapt the textures and their resolution to the scanned object more intelligently than Polycam's, since the process and conditions under which the scanning was performed were identical. However, a downside of these models is the need to apply more than one material when rendering them in a graphics engine, which may worsen performance.
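These texture statistics can be gathered automatically; the sketch below counts the texture files in an exported model folder and reports the highest resolution, using Pillow. The folder names are placeholders for the applications' export directories.

```python
# Sketch: report texture count and resolutions for an exported model
# directory (folder names are placeholders for the apps' export folders).
from pathlib import Path
from PIL import Image

def texture_stats(folder: str) -> None:
    """Print the number of texture files in a folder and the largest size."""
    files = sorted(Path(folder).glob("*.jpg")) + sorted(Path(folder).glob("*.png"))
    sizes = []
    for f in files:
        with Image.open(f) as img:
            sizes.append(img.size)  # (width, height) in pixels
    print(f"{folder}: {len(files)} texture file(s)")
    if sizes:
        print(f"  largest texture: {max(sizes)}")

texture_stats("sneakers_luma_high_poly")   # e.g. 78 files, up to 4096x4096
texture_stats("octopus_luma_high_poly")    # e.g. 56 files, up to 2048x2048
```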
Regarding the quality metrics NR-3DQA and MM-PCQA, a low score does not necessarily indicate low visual quality. These metrics are regression-based: they are trained on features extracted from the models in the WPC database [33] and on the mean opinion scores (MOS) obtained from the volunteer study associated with it. Since the training data include not only original models but also distorted versions of them, the results must also be interpreted relative to the group of models of the same object.
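To clarify how such learning-based no-reference metrics operate, the sketch below shows the general pattern: hand-crafted features regressed onto MOS labels. The feature set, the SVR regressor and the random placeholder data are purely illustrative; this does not reproduce the actual NR-3DQA or MM-PCQA pipelines (the latter uses a neural network).

```python
# Conceptual sketch of a regression-based no-reference quality metric:
# features are mapped to MOS values by a regressor trained on a
# subjective-study database such as WPC. Illustrative only; NOT the
# actual NR-3DQA or MM-PCQA implementation.
import numpy as np
from sklearn.svm import SVR

def extract_features(vertices: np.ndarray, colors: np.ndarray) -> np.ndarray:
    """Toy geometry/colour statistics standing in for the real features."""
    return np.concatenate([
        vertices.std(axis=0),   # spatial spread per axis
        colors.mean(axis=0),    # mean colour
        colors.std(axis=0),     # colour variance
    ])

# Training: features of original + distorted models with their MOS labels
train_features = np.random.rand(100, 9)   # placeholder training data
train_mos = np.random.rand(100) * 5       # placeholder MOS in [0, 5]
regressor = SVR(kernel="rbf").fit(train_features, train_mos)

# Scoring a new scan: the predicted MOS is only meaningful relative to
# other models of the same object, as discussed above.
features = extract_features(np.random.rand(5000, 3), np.random.rand(5000, 3))
score = regressor.predict(features.reshape(1, -1))
print(f"predicted MOS: {score[0]:.2f}")
```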
Analysing the results, we observed discrepancies between the metrics, due to the characteristics of the models used in their algorithms. In the case of NR-3DQA, the best-performing Luma models were the medium-poly ones, except for the sport shoes, for which the Polycam Android model scored marginally higher. It can also be seen that, apart from the sport shoes, Luma's models outperform Polycam's in both versions. Calculating the percentage difference between the top-performing models and those with inferior results, we find the following. The greatest difference is seen in the trophy, where Polycam's models are outperformed by 45% by Luma's medium-poly model, followed by the octopus (39% for the iOS version and 20% for the Android version) and the ceramic mushroom (Polycam's Android model) with a 17% difference. The smallest difference is found for the burner, with just a 6% gap between the best Luma model and Polycam's; it is the object for which the quality of all models is most similar.
The situation with the MM-PCQA results is slightly different: here the high-poly models yield the best results, except for the burner, where the low-poly model scores best. These outcomes continue to demonstrate good visual quality for both the Luma and the Polycam models, with the gap narrowing compared to the NR-3DQA results. For Polycam's Android models, the results (from top to bottom in Table 3) are 4%, 16%, 1%, 19% and 5% lower than the highest-scoring model. Once again, the burner is the object whose models were generated most similarly by both applications. For the iOS models, the percentage differences are even greater in some cases, with two models standing out for their low visual quality: the plush octopus (74% lower) and the ceramic mushroom (46% lower).
Based on these results, we can determine that Luma AI's models offer the highest visual quality, with percentage differences ranging from 1% (MM-PCQA of the burner) to 74% (MM-PCQA of the plush octopus). Calculating the mean and median of the percentage differences between Luma AI's top-scoring model and Polycam's models gives a more general picture of Luma AI's superior visual quality. For the two metrics, we obtain the following: (i) NR-3DQA: mean 17.78% and median 17.72% better results compared to Polycam's photogrammetric models (Android); mean 22.73% and median 15.18% better results compared to the models obtained with Polycam's LiDAR sensors (iOS); (ii) MM-PCQA: mean 9.43% and median 5.2% better results compared to Polycam's photogrammetric models; mean 28.19% and median 9.21% better results compared to the models obtained with Polycam's LiDAR sensors.
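For transparency, the percentage differences and their mean/median aggregates follow this simple computation; the example scores below are placeholders, not values from Table 3.

```python
# Sketch: percentage difference between the top-scoring model and a
# competing model, plus mean/median aggregation across objects.
import numpy as np

def pct_below_best(best: float, other: float) -> float:
    """How far 'other' falls below 'best', as a percentage of 'best'."""
    return 100.0 * (best - other) / best

# Placeholder per-object scores: (best Luma model, Polycam model)
pairs = [(0.85, 0.80), (0.90, 0.74), (0.78, 0.77), (0.88, 0.71), (0.82, 0.78)]

diffs = [pct_below_best(best, other) for best, other in pairs]
print(f"mean: {np.mean(diffs):.2f}%, median: {np.median(diffs):.2f}%")
```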
In
Figure 5, we can observe a comparison of models in which a colour scale visually indicates the distances between the points of the two models' point clouds. In the parts of the compared model where the colours shift towards red or blue hues, the distances are greater, which translates into lower precision in the compared model.
Figure 6a,b present a comparison of the meshes of the high-poly and low-poly 3D models. It can be noted that the number of vertices and edges in the high-poly model is significantly greater than in the low-poly one, which implies much more computation time and effort for the game engine rendering such a model. Moreover, Figure 7 shows a detailed comparison between the results of the different technologies used and the real scanned object.
Furthermore, the accuracy of the generated 3D models is presented in the remainder of this subsection. As stated before, given the absence of a ground-truth model, the high-poly model of each object was employed as the reference model, as the data obtained confirm its ability to capture the most intricate details. The process implemented in the CloudCompare 2.9.3 tool is comprehensively outlined in its documentation. In addition, this tool provides the mean and standard deviation of the distances calculated during the comparison.
Table 4 shows the RMS given by CloudCompare 2.9.3 for the comparisons performed, as well as the CMDM values for each comparison, obtained with the MEPP2 0.15.1 platform (https://github.com/MEPP-team/MEPP2 accessed on 7 July 2024). When the polygon counts differ greatly, the application's manual recommends sampling points on the compared model. Therefore, points were sampled to match the number of vertices in the reference model.
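This sampling step can be approximated as follows with Open3D: the compared mesh is sampled to match the reference model's vertex count before nearest-neighbour distances are computed. File names are placeholders, and this is a sketch of the idea rather than CloudCompare's exact implementation.

```python
# Sketch: sample the compared mesh so its point count matches the number
# of vertices in the reference model, mirroring the sampling step
# recommended by the CloudCompare manual when polygon counts differ widely.
import numpy as np
import open3d as o3d

reference = o3d.io.read_triangle_mesh("trophy_luma_high_poly.obj")    # placeholder
compared = o3d.io.read_triangle_mesh("trophy_polycam_optimized.obj")  # placeholder

# Match the compared model's sample count to the reference's vertex count
n_points = len(reference.vertices)
compared_pcd = compared.sample_points_uniformly(number_of_points=n_points)

# Distances from the sampled points to the reference vertices
ref_pcd = o3d.geometry.PointCloud(reference.vertices)
distances = np.asarray(compared_pcd.compute_point_cloud_distance(ref_pcd))
print(f"RMS: {np.sqrt(np.mean(distances ** 2)):.5f} m, "
      f"mean ± std: {distances.mean():.5f} ± {distances.std():.5f} m")
```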
Analysing the RMS results, Luma's medium and low-poly models are, as expected, the most precise, maintaining millimetre-level accuracy relative to the reference model. For the remaining models it can be observed that, except in one case, the models obtained with Polycam's photogrammetry are more precise than those obtained with its LiDAR sensors. For the models obtained by photogrammetry, the average of the RMS results indicates a 2.36 cm difference in the measured distances.
Inspecting the CMDM results, the medium-poly models achieved the best results overall, with an average perceived distortion of 13.22% across all comparisons. Only for the mushroom did another comparison achieve better results, namely the one between the two Polycam models, with the Android version as the reference; however, this was the model that produced the least realistic results, due to the material the object is made of. Again, the object with the best results in both applications was the burner. It is interesting to note that the difference in perceived distortion between the medium-poly and low-poly results is 3.4% on average, which indicates very good quality for the low-poly models as well. This metric also shows the overall low capability of LiDAR sensors to provide good visual quality when scanning moderately sized objects.
Having analysed the data, we can conclude that LiDAR sensors with the Polycam application should not be the preferred option for obtaining realistic 3D models to showcase in VR shopping environments: the various metrics employed yielded the worst visual quality results for these models, both in comparisons using a reference model and in those without. For obtaining a realistic 3D model, the preferred option is therefore NeRF, specifically in the form of a mobile application such as Luma AI, when available on the device. The data from the metrics show that the models from Luma AI have the highest visual quality, except for NR-3DQA in the case of the sports shoes. For high-poly models, values between 5 and 20% higher were obtained compared to the photogrammetry application used; for low-poly models, this range narrows to between 2 and 15%. Given the visual results of the photogrammetry application, and considering that the differences are not very large, this option is also viable for obtaining high visual quality 3D models for use in VR.
Discussion
Our evaluation revealed that the difference in visual quality between Luma AI models and those obtained through photogrammetry was generally minor, ranging from 10% to 20%. Specifically, the root mean square (RMS) values for Luma AI's medium-poly models were as low as 0.00041, while Polycam's photogrammetry models showed higher RMS values of approximately 0.01269 for the optimised models. The difference with respect to the models obtained using LiDAR sensors was significantly greater, with some cases showing RMS differences up to 70% higher, such as the RMS value of 0.05976 for low-poly models.
This substantial disparity leads us to consider the models produced by Luma AI as having the highest visual quality among the technologies we tested. As such, we recommend Luma AI for product digitisation when the highest visual fidelity is required. Additionally, Luma AI provides comprehensive support for NeRF technology, which is critical for achieving high-quality 3D models.
It is crucial to remember that we are focusing on technologies that can be easily used by individuals without advanced technical knowledge, such as small business owners, who may not have access to expensive hardware and software resources. Therefore, it is important to find solutions that offer a good balance between quality and accessibility. Polycam stands out in this regard due to its user-friendly interface and extensive usage in the literature. It supports both photogrammetry and LiDAR scanning technologies, which are crucial for our comparative analysis. Polycam’s average NR-3DQA score for optimised models was 0.8111, indicating good quality while maintaining accessibility for non-experts.
Conversely, our results indicate that models obtained using LiDAR sensors are less suitable for 3D object scanning in the context of product digitisation. The lower visual quality and higher error rates observed with LiDAR make it less ideal for creating detailed product models. For example, the CMDM values for LiDAR models reached up to 0.318699, significantly higher than those of photogrammetry models. However, LiDAR can be more appropriate for scanning larger environments, such as entire stores or shopping spaces, where the focus is on capturing the overall layout rather than fine details.
Further limitations of these scanning technologies were detected. Products made of transparent materials or with specular surfaces require a specific scanning environment in which light sources do not degrade the results. Moreover, objects with very small dimensions are quite difficult to scan with the proposed technologies; they would instead require a professional camera whose optical zoom allows for quality photos, together with photogrammetry software to which those photos can be transferred.
In
Section 4, we will analyse the performance of these models on various VR headsets currently on the market, including the Meta Quest 2, Quest Pro and Quest 3. This analysis aims to determine the practical feasibility of using these models in VR environments and to provide guidelines for their optimal use in different VR shopping scenarios.