Deep-Learning-Incorporated Augmented Reality Application for Engineering Lab Training
Abstract
1. Introduction
2. Overview of Deep Learning and Augmented Reality
2.1. Deep Learning
2.2. Augmented Reality
2.2.1. Marker-Based AR
2.2.2. Marker-Less AR
- (a)
- Location-based AR: In this type of AR, simultaneous localization and mapping (SLAM) technology is used to track the user’s location as the map is generated and updated on the user’s mobile device [39]. To display AR content in the physical environment, the mobile device must first detect a surface [40,41]. For example, the world-famous AR game Pokémon Go uses SLAM technology to allow its users to battle, navigate, and search for interactive 3D objects based on their geographical locations [42].
- (b)
- Superimposition-based AR: Superimposition-based AR applications provide an additional view alongside the original view of the object. Object recognition is required to determine the type of object so that it can be partially or completely replaced in the user’s environment with a digital image [43,44]. Using HoloLens glasses, for instance, surgeons can superimpose images previously gathered through scanners or X-rays onto the patient’s body during an operation, helping them anticipate potential problems.
- (c)
- Projection-based AR: Projection-based AR (also known as projection mapping and augmented spatial reality) is a technique that does not require the use of head-mounted or hand-held devices. This method allows augmented information to be viewed immediately from a natural perspective. Using projection mapping, projection-based AR turns an uneven surface into a projection screen. This method allows for the creation of optical illusions [45].
- (d)
- Outlining-based AR: This type of AR employs image recognition with special cameras to create contours or shapes that highlight components of the real world. The generated outlines designate specific items for the human eye, making scenes easier to interpret. Vuforia’s Model Targets are an example of outlining-based AR. Vuforia is a platform that enables developers to quickly incorporate AR technology into their applications, and Model Targets allow apps to recognize and track real-world objects based on their shape [46].
3. Design and Implementation of the AR App
3.1. Object Detection Framework
3.2. TensorFlow Object Detection API
- (a)
- Image Dataset: The model was trained on 643 images collected from various perspectives and under different lighting conditions. Each image is 4032 × 3024 pixels. These images must be annotated before they can be used to train the model. The annotation is performed [51] with LabelImg [50], a tool that allows users to draw a rectangle around a specific area of an image. During training, these annotations help the model precisely locate the object in the image. The coordinates of each outlined region are generated and saved in an XML file.
- (b)
- TensorFlow Dataset: TF records use a binary file format that makes computation in the DL framework efficient. They store the dataset as a sequence of binary strings, which speeds up data loading during training while using less disk space. We converted the XML files generated by LabelImg into TF records using a Python script (a minimal sketch of this conversion appears after this list). The last step in configuring the TF dataset is to create a .pbtxt file containing all of the categorical label classes stored in the TF record file.
- (c)
- Configuration File: Multiple pre-trained models based on the common objects in context (COCO) dataset are available in TF. These models can be used to initialize DL models prior to training on a new dataset. Table 1 lists several popular architectures with pre-trained models. For instance, ssd_MobileNet_v1_coco is an SSD with a MobileNet v1 configuration, ssd_inception_v2_coco is an SSD with an Inception v2 configuration, and faster_rcnn_resnet101_coco is a Faster R-CNN with a ResNet-101 (v1) configuration. All of these configurations were derived for the COCO dataset. From Table 1, it can be observed that ssd_MobileNet_v1_coco reaches the fastest inference speed of 30 ms but has the lowest mean average precision (mAP). In contrast, faster_rcnn_resnet101_coco has the slowest inference speed but the highest mAP of 32. We tested both MobileNet SSD v2 and Faster R-CNN [52] and concluded that MobileNet SSD v2 performs inference faster on mobile devices than Faster R-CNN in our study. Using a pre-trained model saves time and computing resources. In addition to the pre-trained model, a configuration file is also required, and it must match the architecture of the pre-trained model. It is recommended to fine-tune the model to maximize the prediction outcome. The fine-tuning process is divided into two steps: restoring weights and updating weights. After completing these requirements, we ran the Python training script provided with the TF API to start the training job. Following training, the API generates a training checkpoint file in a specific format named .ckpt. This binary file contains the values of all the weights, biases, and other variables.
- (d)
- Inference: After training the model, the last step is to put it into production and feed it live data to compute predictions. Before deployment, we can evaluate the model’s accuracy using mAP; the evaluation results are described in detail in Section 4. We also need a lightweight version of the model to perform inference on mobile devices, so we chose the OpenCV library (a Python sketch of such an inference check also appears after this list).
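The conversion from LabelImg annotations to a TF record and the accompanying .pbtxt label map can be sketched in Python as follows. This is a minimal illustration rather than the exact script used in this work; the directory paths and the single "multimeter" class are assumptions.

```python
# Minimal sketch: convert LabelImg (PASCAL VOC) XML annotations into a TFRecord
# file and write the label map expected by the TF Object Detection API.
# Paths and the class list are illustrative assumptions.
import glob
import os
import xml.etree.ElementTree as ET

import tensorflow as tf

CLASSES = {"multimeter": 1}   # assumed label-to-id mapping
IMAGE_DIR = "images/"         # assumed directory of training images
XML_DIR = "annotations/"      # assumed directory of LabelImg XML files
RECORD_PATH = "train.record"
LABEL_MAP_PATH = "label_map.pbtxt"


def _bytes(values):
    return tf.train.Feature(bytes_list=tf.train.BytesList(value=values))


def _ints(values):
    return tf.train.Feature(int64_list=tf.train.Int64List(value=values))


def _floats(values):
    return tf.train.Feature(float_list=tf.train.FloatList(value=values))


def xml_to_example(xml_path):
    """Build one tf.train.Example from a single LabelImg XML annotation."""
    root = ET.parse(xml_path).getroot()
    filename = root.find("filename").text
    width = int(root.find("size/width").text)
    height = int(root.find("size/height").text)
    with open(os.path.join(IMAGE_DIR, filename), "rb") as f:
        encoded_jpg = f.read()

    xmins, xmaxs, ymins, ymaxs, texts, labels = [], [], [], [], [], []
    for obj in root.findall("object"):
        name = obj.find("name").text
        box = obj.find("bndbox")
        xmins.append(float(box.find("xmin").text) / width)   # normalize to [0, 1]
        xmaxs.append(float(box.find("xmax").text) / width)
        ymins.append(float(box.find("ymin").text) / height)
        ymaxs.append(float(box.find("ymax").text) / height)
        texts.append(name.encode("utf8"))
        labels.append(CLASSES[name])

    features = tf.train.Features(feature={
        "image/encoded": _bytes([encoded_jpg]),
        "image/format": _bytes([b"jpeg"]),
        "image/filename": _bytes([filename.encode("utf8")]),
        "image/width": _ints([width]),
        "image/height": _ints([height]),
        "image/object/bbox/xmin": _floats(xmins),
        "image/object/bbox/xmax": _floats(xmaxs),
        "image/object/bbox/ymin": _floats(ymins),
        "image/object/bbox/ymax": _floats(ymaxs),
        "image/object/class/text": _bytes(texts),
        "image/object/class/label": _ints(labels),
    })
    return tf.train.Example(features=features)


# Write the TFRecord file ...
with tf.io.TFRecordWriter(RECORD_PATH) as writer:
    for xml_file in glob.glob(os.path.join(XML_DIR, "*.xml")):
        writer.write(xml_to_example(xml_file).SerializeToString())

# ... and the .pbtxt label map listing every categorical class.
with open(LABEL_MAP_PATH, "w") as f:
    for name, idx in CLASSES.items():
        f.write(f"item {{\n  id: {idx}\n  name: '{name}'\n}}\n")
```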
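Likewise, once the .ckpt checkpoint has been exported as a frozen inference graph (the standard TF1 Object Detection API export), a quick inference check of the trained detector can be run in Python before moving to the lightweight OpenCV deployment. The graph path, test image, and the reuse of the 90% confidence threshold below are assumptions for illustration.

```python
# Minimal sketch: run one test image through an exported frozen inference graph
# (TF1 Object Detection API format) as a sanity check of the trained detector.
# Paths and the test image are assumptions.
import numpy as np
import tensorflow.compat.v1 as tf
from PIL import Image

tf.disable_v2_behavior()

FROZEN_GRAPH = "exported_model/frozen_inference_graph.pb"  # assumed export path
TEST_IMAGE = "test_images/multimeter.jpg"                  # assumed test image

# Load the frozen graph produced from the .ckpt checkpoint.
graph = tf.Graph()
with graph.as_default():
    graph_def = tf.GraphDef()
    with open(FROZEN_GRAPH, "rb") as f:
        graph_def.ParseFromString(f.read())
    tf.import_graph_def(graph_def, name="")

# The exported graph expects a uint8 batch of shape [1, H, W, 3].
image = np.expand_dims(np.array(Image.open(TEST_IMAGE)), axis=0)

with tf.Session(graph=graph) as sess:
    # Standard output tensors of a frozen TF1 Object Detection API graph.
    boxes, scores, classes, num = sess.run(
        ["detection_boxes:0", "detection_scores:0",
         "detection_classes:0", "num_detections:0"],
        feed_dict={"image_tensor:0": image})

for box, score, cls in zip(boxes[0], scores[0], classes[0]):
    if score >= 0.9:  # same 90% confidence threshold used in the mobile app
        print(f"class {int(cls)}: confidence {score:.2f}, box {box}")
```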
3.3. Augmented Reality Framework
- (a)
- Object Scanning: The Vuforia ObjectScanner app [53] is the primary tool for generating an object data file, which is required for creating an object target in the target manager on the Vuforia webpage. The app can be downloaded for free from the Vuforia developer website after creating a developer account. To scan the object and collect data points, the Vuforia developer portal provides a printable target image that defines the target’s position and orientation relative to the local coordinate space and helps distinguish and remove undesirable areas. During scanning, the printable target image must be placed under the object to be scanned. Using the scanning mobile application, the user can then start collecting data points from the object. To achieve the best scanning quality, it is recommended to work in a noise-free environment with moderately bright and diffuse lighting and to avoid objects with reflective surfaces. In this work, a multimeter met all of these requirements, and a successful scan was achieved.
- (b)
- Data File: Following the scanning, an object data file is created. The mobile app will also show how many scanning points the object has. The completed scanning area is evidenced by a square grid that changes color from gray to green. The object data file contains all the object’s information. There is a test scenario to determine whether the scanned object has sufficient data for augmentation. In this scenario, a green rectangular prism will be drawn in one of the object corners relative to the target image coordinate space.
- (c)
- Target Management System: Vuforia has a framework that allows developers to choose from various target types, such as picture targets, stimuli targets, cylinder targets, object targets, and VuMarks. The system processes and manages the data for visual examination. A developer license is required to fully utilize the Vuforia manager web-based tool, which includes access to a target manager panel where a database can be uploaded and targets can be added to the management system. The 3D object option must be chosen as the target type, and the object data file must be uploaded.
- (d)
- Export Target Dataset: After the web tool has processed the information, the database can be downloaded for the desired platform. It is packaged so that it can be used in the primary development process to create AR experiences in Unity.
3.4. Lab Training Application Framework
- (a)
- Setup Environment: The setup starts with the creation of a new project using the Unity Hub. After creating and opening the project, it is essential to select the appropriate build platform, since Unity allows us to create once and deploy anywhere; that is, we can select a platform from the list of available platforms in Unity, such as Windows, WebGL, Android, iOS, or any gaming console. We chose Android as the deployment platform for this project. The platform can be changed in the Build Settings window, which can be accessed via the File menu. Additionally, the rendering, scripting, and project configuration must be modified.
- (b)
- Libraries and Assets:
- (1)
- OpenCV Library: OpenCV for Unity [54] is a Unity Asset Store plugin that uses AI algorithms to analyze and interpret images on computers and mobile devices. It allows users to run pre-trained AI models inside executable applications on mobile devices. The model is loaded by a script that requires the binary file of a DL model with trained weights (the weights of the deep neural network are not modified at this stage) and a network configuration file. This script is granted access to device resources, specifically the camera, so that it can pass input to the model and run object detection inference, which generates bounding boxes and labels around the detected objects.
- (2)
- Vuforia Engine: This library allows Unity to create AR experiences for mobile devices. It is a collection of scripts and pre-made components for developing Vuforia apps in Unity and includes API libraries written in C# that expose the Vuforia APIs. The library supports all trackable functions as well as high-level access to device hardware such as the camera.
- (3)
- Assets: Assets are graphical representations of any items that could be used in the project. They comprise user interface sprites, 3D models, images, materials, and sounds, each with its own design and functionality. Photoshop is used to create art sprites, such as images for the virtual manual, and Blender, a 3D modeling software, is used to create the 3D models.
- (c)
- Scenario Creation:
- (1)
- Main menu: The application includes a menu scenario, as shown in Figure 7, that will allow the user to select various modes based on their preferences. It includes a tutorial that teaches students how to use the application. There is a training mode to help students learn more about lab equipment or electrical components.
- (2)
- Object detection: In this scenario, the DL model is used in conjunction with the OpenCV library in Unity. The application has access to the device’s camera, from which it runs inference with the object detection model provided by the object detection framework. Depending on the object being targeted, the application automatically draws a bounding box around the desired object with its respective label and confidence. When the user points at the desired equipment, a bottom panel appears with the option to load the AR experience or continue looking for other lab equipment. During model inference, the OpenCV library allows us to specify the desired confidence threshold, and the model draws a green rectangle around the detected equipment (a Python sketch of this detection flow is given after this list). The detection confidence threshold is set to 90%, meaning that the confidence must be greater than or equal to 90% for a detection to be indicated with a rectangular bounding box. This value was chosen because the pieces of lab equipment are visually quite distinct, so a 90% threshold still ensures that detections have a high confidence level.
- (3)
- Learning scenarios: A 3D visual aid is provided in this scenario to understand the essential functions of the equipment selected during the detection scenario. Figure 8 shows how users will be able to access an interactive 3D model representation of the equipment or view the equipment from various perspectives. In other words, it is a virtual manual introductory guide.
- (4)
- AR training scenario: When the application detects an object that has previously been configured in Unity, a 3D model will be superimposed on top of the physical object in the mobile app. It will also include a UI for the user to interact with, allowing them to understand and explore the physical object, while the mobile application provides additional information in the form of holograms, as shown in Figure 9.
- (d)
- Deployment: The final step of the framework is to build the project for the desired end platform, which can be Android or iOS. The scenarios in Unity must be selected and linked together. Unity then builds a platform-specific package that can be installed directly on mobile devices.
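As referenced in the object detection scenario above, the sketch below shows the detection flow in Python with OpenCV’s DNN module; the actual application implements the same steps through OpenCV for Unity in C#. The model file names, the label map, and the webcam loop are assumptions for illustration.

```python
# Minimal sketch of the detection flow used in the AR app, here with OpenCV's
# Python DNN module instead of OpenCV for Unity (C#). File names are assumptions.
import cv2

MODEL_WEIGHTS = "frozen_inference_graph.pb"  # binary DL model with trained weights
MODEL_CONFIG = "ssd_mobilenet_v2.pbtxt"      # network configuration file
CONF_THRESHOLD = 0.9                         # 90% confidence threshold
LABELS = {1: "multimeter"}                   # assumed label map

net = cv2.dnn.readNetFromTensorflow(MODEL_WEIGHTS, MODEL_CONFIG)
cap = cv2.VideoCapture(0)                    # device camera

while True:
    ok, frame = cap.read()
    if not ok:
        break
    h, w = frame.shape[:2]
    # SSD MobileNet v2 expects a 300x300 input blob with channels in RGB order.
    blob = cv2.dnn.blobFromImage(frame, size=(300, 300), swapRB=True)
    net.setInput(blob)
    detections = net.forward()               # shape: [1, 1, N, 7]

    for det in detections[0, 0]:
        _, class_id, confidence, x1, y1, x2, y2 = det
        if confidence >= CONF_THRESHOLD:
            p1 = (int(x1 * w), int(y1 * h))
            p2 = (int(x2 * w), int(y2 * h))
            cv2.rectangle(frame, p1, p2, (0, 255, 0), 2)   # green bounding box
            label = f"{LABELS.get(int(class_id), int(class_id))}: {confidence:.2f}"
            cv2.putText(frame, label, (p1[0], p1[1] - 5),
                        cv2.FONT_HERSHEY_SIMPLEX, 0.6, (0, 255, 0), 2)

    cv2.imshow("Detection", frame)
    if cv2.waitKey(1) & 0xFF == ord("q"):    # press 'q' to quit
        break

cap.release()
cv2.destroyAllWindows()
```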
4. Experimental Results
4.1. Object Detection
4.2. Evaluation Metrics
- Precision: The percentage of positive detections that are correct. A model that produces no false positives has a precision of 1.0. Equation (2) defines precision as the number of true positives (TP) divided by the sum of TP and false positives (FP). A TP is a correct prediction of the positive class, whereas an FP is an incorrect prediction of the negative class as the positive class.
- Recall: The percentage of actual positives that are correctly identified by the model. A model with a recall of 1.0 produces zero false negatives. Recall is computed as the ratio of TP to the sum of TP and false negatives (FN), as shown in Equation (3). An FN is an incorrect prediction of the positive class as the negative class.
- Intersection over Union (IoU): Also known as the Jaccard index, it measures the similarity and diversity of sample sets. In an object detection task, it describes the similarity between the predicted bounding box and the ground-truth bounding box. Equation (4) expresses IoU in terms of the areas of the predicted and ground-truth bounding boxes. A threshold on IoU must be defined to decide whether a prediction counts as correct.
- mean Average Precision (mAP): It takes both precision and recall into consideration. The average precision for a class is the area under its precision–recall curve, and the mAP is computed by averaging the average precision over all classes (the equations referenced in this list, together with a small IoU example, are given below).
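The equations referenced in this subsection appear as images in the published article; written out in the standard forms consistent with the definitions above (with $B_{p}$ and $B_{gt}$ denoting the predicted and ground-truth boxes, $N$ the number of classes, and $p_i(r)$ the precision as a function of recall for class $i$), they are:

```latex
\begin{align}
\text{Precision} &= \frac{TP}{TP + FP} \tag{2} \\
\text{Recall}    &= \frac{TP}{TP + FN} \tag{3} \\
\text{IoU}       &= \frac{\operatorname{area}(B_{p} \cap B_{gt})}
                         {\operatorname{area}(B_{p} \cup B_{gt})} \tag{4} \\
\text{mAP}       &= \frac{1}{N} \sum_{i=1}^{N} \text{AP}_i,
  \qquad \text{AP}_i = \int_{0}^{1} p_i(r)\,\mathrm{d}r \notag
\end{align}
```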
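As a small worked example of the IoU computation, the following minimal Python sketch compares a predicted box with a ground-truth box (the coordinates are illustrative):

```python
# Minimal sketch: IoU between a predicted and a ground-truth box, with
# axis-aligned boxes given as (xmin, ymin, xmax, ymax). Values are illustrative.
def iou(box_a, box_b):
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    # Intersection rectangle (may be empty).
    ix1, iy1 = max(ax1, bx1), max(ay1, by1)
    ix2, iy2 = min(ax2, bx2), min(ay2, by2)
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    area_a = (ax2 - ax1) * (ay2 - ay1)
    area_b = (bx2 - bx1) * (by2 - by1)
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0

# A prediction overlapping most of the ground truth:
print(iou((10, 10, 110, 110), (20, 20, 120, 120)))  # ~0.68, correct at IoU >= 0.5
```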
4.3. Augmented Reality
5. Conclusions and Future Work
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Conflicts of Interest
References
- Mejías Borrero, A.; Andújar Márquez, J. A pilot study of the effectiveness of augmented reality to enhance the use of remote labs in electrical engineering education. J. Sci. Educ. Technol. 2012, 21, 540–557. [Google Scholar] [CrossRef]
- Singh, G.; Mantri, A.; Sharma, O.; Dutta, R.; Kaur, R. Evaluating the impact of the augmented reality learning environment on electronics laboratory skills of engineering students. Comput. Appl. Eng. Educ. 2019, 27, 1361–1375. [Google Scholar] [CrossRef]
- Ray, S. A quick review of machine learning algorithms. In Proceedings of the 2019 International Conference on Machine Learning, Big Data, Cloud and Parallel Computing (COMITCon), Faridabad, India, 14–16 February 2019; pp. 35–39. [Google Scholar]
- Xue, M.; Zhu, C. A study and application on machine learning of artificial intelligence. In Proceedings of the 2009 International Joint Conference on Artificial Intelligence, Hainan, China, 25–26 April 2009; pp. 272–274. [Google Scholar]
- Gong, L.; Fast-Berglund, Å.; Johansson, B. A framework for extended reality system development in manufacturing. IEEE Access 2021, 9, 24796–24813. [Google Scholar] [CrossRef]
- Nisiotis, L.; Alboul, L. Work-In-Progress-An Intelligent Immersive Learning System Using AI, XR and Robots. In Proceedings of the 2021 7th International Conference of the Immersive Learning Research Network (iLRN), 17 May–10 June 2021; pp. 1–3. [Google Scholar]
- Khomh, F.; Adams, B.; Cheng, J.; Fokaefs, M.; Antoniol, G. Software engineering for machine-learning applications: The road ahead. IEEE Softw. 2018, 35, 81–84. [Google Scholar] [CrossRef]
- Hu, M.; Weng, D.; Chen, F.; Wang, Y. Object Detecting Augmented Reality System. In Proceedings of the 2020 IEEE 20th International Conference on Communication Technology (ICCT), Nanning, China, 28–31 October 2020; pp. 1432–1438. [Google Scholar]
- Ang, I.J.X.; Lim, K.H. Enhancing STEM Education Using Augmented Reality and Machine Learning. In Proceedings of the 2019 7th International Conference on Smart Computing & Communications (ICSCC), Miri, Malaysia, 28–30 June 2019; pp. 1–5. [Google Scholar]
- Thiwanka, N.; Chamodika, U.; Priyankara, L.; Sumathipala, S.; Weerasuriya, G. Augmented Reality Based Breadboard Circuit Building Guide Application. In Proceedings of the 2018 3rd International Conference on Information Technology Research (ICITR), Moratuwa, Sri Lanka, 5–7 December 2018; pp. 1–6. [Google Scholar]
- Sandoval Pérez, S.; Gonzalez Lopez, J.M.; Villa Barba, M.A.; Jimenez Betancourt, R.O.; Molinar Solís, J.E.; Rosas Ornelas, J.L.; Riberth García, G.I.; Rodriguez Haro, F. On the Use of Augmented Reality to Reinforce the Learning of Power Electronics for Beginners. Electronics 2022, 11, 302. [Google Scholar] [CrossRef]
- Technologies, U. Unity Real-Time Development Platform. 2005. Available online: https://unity.com/ (accessed on 8 April 2022).
- Yu, H.; Chen, C.; Du, X.; Li, Y.; Rashwan, A.; Hou, L.; Jin, P.; Yang, F.; Liu, F.; Kim, J.; et al. TensorFlow 1 Detection Model Zoo. 2020. Available online: https://github.com/tensorflow/models/blob/master/research/object_detection/g3doc/tf1_detection_zoo.md/ (accessed on 28 April 2022).
- Chen, X.W.; Lin, X. Big data deep learning: Challenges and perspectives. IEEE Access 2014, 2, 514–525. [Google Scholar] [CrossRef]
- Alom, M.Z.; Taha, T.M.; Yakopcic, C.; Westberg, S.; Sidike, P.; Nasrin, M.S.; Hasan, M.; Van Essen, B.C.; Awwal, A.A.; Asari, V.K. A state-of-the-art survey on deep learning theory and architectures. Electronics 2019, 8, 292. [Google Scholar] [CrossRef] [Green Version]
- Deng, L.; Liu, Y. Deep Learning in Natural Language Processing; Springer: Berlin/Heidelberg, Germany, 2018. [Google Scholar]
- Deng, L.; Hinton, G.; Kingsbury, B. New types of deep neural network learning for speech recognition and related applications: An overview. In Proceedings of the 2013 IEEE International Conference on Acoustics, Speech and Signal Processing, Vancouver, BC, Canada, 26–31 May 2013; pp. 8599–8603. [Google Scholar]
- Nishani, E.; Çiço, B. Computer vision approaches based on deep learning and neural networks: Deep neural networks for video analysis of human pose estimation. In Proceedings of the 2017 6th Mediterranean Conference on Embedded Computing (MECO), Bar, Montenegro, 11–15 June 2017; pp. 1–4. [Google Scholar]
- LeCun, Y.; Bottou, L.; Bengio, Y.; Haffner, P. Gradient-based learning applied to document recognition. Proc. IEEE 1998, 86, 2278–2324. [Google Scholar] [CrossRef] [Green Version]
- Pandiya, M.; Dassani, S.; Mangalraj, P. Analysis of Deep Learning Architectures for Object Detection-A Critical Review. In Proceedings of the 2020 IEEE-HYDCON, Hyderabad, India, 11–12 September 2020; pp. 1–6. [Google Scholar]
- Arora, D.; Garg, M.; Gupta, M. Diving deep in deep convolutional neural network. In Proceedings of the 2020 2nd International Conference on Advances in Computing, Communication Control and Networking (ICACCCN), Greater Noida, India, 18–19 December 2020; pp. 749–751. [Google Scholar]
- Girshick, R.; Donahue, J.; Darrell, T.; Malik, J. Region-based convolutional networks for accurate object detection and segmentation. IEEE Trans. Pattern Anal. Mach. Intell. 2015, 38, 142–158. [Google Scholar] [CrossRef] [PubMed]
- Redmon, J.; Divvala, S.; Girshick, R.; Farhadi, A. You only look once: Unified, real-time object detection. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, NV, USA, 27–30 June 2016; pp. 779–788. [Google Scholar]
- Liu, W.; Anguelov, D.; Erhan, D.; Szegedy, C.; Reed, S.; Fu, C.Y.; Berg, A.C. SSD: Single Shot MultiBox Detector. In Proceedings of the European Conference on Computer Vision, Amsterdam, The Netherlands, 11–14 October 2016; pp. 21–37. [Google Scholar]
- Ryu, J.; Kim, S. Chinese character detection using modified single shot multibox detector. In Proceedings of the 2018 18th International Conference on Control, Automation and Systems (ICCAS), PyeongChang, Korea, 17–20 October 2018; pp. 1313–1315. [Google Scholar]
- Chiu, Y.C.; Tsai, C.Y.; Ruan, M.D.; Shen, G.Y.; Lee, T.T. Mobilenet-SSDv2: An improved object detection model for embedded systems. In Proceedings of the 2020 International Conference on System Science and Engineering (ICSSE), Kagawa, Japan, 31 August–3 September 2020; pp. 1–5. [Google Scholar]
- Heirman, J.; Selleri, S.; De Vleeschauwer, T.; Hamesse, C.; Bellemans, M.; Schoofs, E.; Haelterman, R. Exploring the possibilities of Extended Reality in the world of firefighting. In Proceedings of the 2020 IEEE International Conference on Artificial Intelligence and Virtual Reality (AIVR), Utrecht, The Netherlands, 14–18 December 2020; pp. 266–273. [Google Scholar]
- Andrade, T.M.; Smith-Creasey, M.; Roscoe, J.F. Discerning User Activity in Extended Reality Through Side-Channel Accelerometer Observations. In Proceedings of the 2020 IEEE International Conference on Intelligence and Security Informatics (ISI), Arlington, VA, USA, 9–10 November 2020; pp. 1–3. [Google Scholar]
- Dandachi, G.; Assoum, A.; Elhassan, B.; Dornaika, F. Machine learning schemes in augmented reality for features detection. In Proceedings of the 2015 Fifth International Conference on Digital Information and Communication Technology and Its Applications (DICTAP), Beirut, Lebanon, 29 April–1 May 2015; pp. 101–105. [Google Scholar]
- Sendari, S.; Anggreani, D.; Jiono, M.; Nurhandayani, A.; Suardi, C. Augmented reality performance in detecting hardware components using marker based tracking method. In Proceedings of the 2020 4th International Conference on Vocational Education and Training (ICOVET), Malang, Indonesia, 19 September 2020; pp. 1–5. [Google Scholar]
- Mahurkar, S. Integrating YOLO Object Detection with Augmented Reality for iOS Apps. In Proceedings of the 2018 9th IEEE Annual Ubiquitous Computing, Electronics & Mobile Communication Conference (UEMCON), New York, NY, USA, 8–10 November 2018; pp. 585–589. [Google Scholar]
- El Filali, Y.; Krit, S.D. Augmented reality types and popular use cases. Int. J. Eng. Sci. Math. 2019, 8, 91–97. [Google Scholar]
- Poetker, B. What Is Augmented Reality? (+Most Common Types of AR Used Today). 2018. Available online: https://www.g2.com/articles/augmented-reality (accessed on 8 April 2022).
- Gao, Y.F.; Wang, H.Y.; Bian, X.N. Marker tracking for video-based augmented reality. In Proceedings of the 2016 International Conference on Machine Learning and Cybernetics (ICMLC), Jeju Island, Korea, 10–13 July 2016; Volume 2, pp. 928–932. [Google Scholar]
- Sendari, S.; Firmansah, A. Performance analysis of augmented reality based on vuforia using 3d marker detection. In Proceedings of the 2020 4th International Conference on Vocational Education and Training (ICOVET), Malang, Indonesia, 19 September 2020; pp. 294–298. [Google Scholar]
- Vidya, K.; Deryl, R.; Dinesh, K.; Rajabommannan, S.; Sujitha, G. Enhancing hand interaction patterns for virtual objects in mobile augmented reality using marker-less tracking. In Proceedings of the 2014 International Conference on Computing for Sustainable Global Development (INDIACom), New Delhi, India, 5–7 March 2014; pp. 705–709. [Google Scholar]
- Beier, D.; Billert, R.; Bruderlin, B.; Stichling, D.; Kleinjohann, B. Marker-less vision based tracking for mobile augmented reality. In Proceedings of the The Second IEEE and ACM International Symposium on Mixed and Augmented Reality, Tokyo, Japan, 7–10 October 2003; pp. 258–259. [Google Scholar]
- Pooja, J.; Vinay, M.; Pai, V.G.; Anuradha, M. Comparative analysis of marker and marker-less augmented reality in education. In Proceedings of the 2020 IEEE International Conference for Innovation in Technology (INOCON), Bangalore, India, 6–8 November 2020; pp. 1–4. [Google Scholar]
- Batuwanthudawa, B.; Jayasena, K. Real-Time Location based Augmented Reality Advertising Platform. In Proceedings of the 2020 2nd International Conference on Advancements in Computing (ICAC), Malabe, Sri Lanka, 10–11 December 2020; Volume 1, pp. 174–179. [Google Scholar]
- Unal, M.; Bostanci, E.; Sertalp, E.; Guzel, M.S.; Kanwal, N. Geo-location based augmented reality application for cultural heritage using drones. In Proceedings of the 2018 2nd International Symposium on Multidisciplinary Studies and Innovative Technologies (ISMSIT), Ankara, Turkey, 19–21 October 2018; pp. 1–4. [Google Scholar]
- Argotti, Y.; Davis, L.; Outters, V.; Rolland, J.P. Dynamic superimposition of synthetic objects on rigid and simple-deformable real objects. Comput. Graph. 2002, 26, 919–930. [Google Scholar] [CrossRef] [Green Version]
- Ketchell, S.; Chinthammit, W.; Engelke, U. Situated storytelling with SLAM enabled augmented reality. In Proceedings of the The 17th International Conference on Virtual-Reality Continuum and Its Applications in Industry, Brisbane, QLD, Australia, 14–16 November 2019; pp. 1–9. [Google Scholar]
- Knopp, S.; Klimant, P.; Schaffrath, R.; Voigt, E.; Fritzsche, R.; Allmacher, C. Hololens ar-using vuforia-based marker tracking together with text recognition in an assembly scenario. In Proceedings of the 2019 IEEE International Symposium on Mixed and Augmented Reality Adjunct (ISMAR-Adjunct), Beijing, China, 10–18 October 2019; pp. 63–64. [Google Scholar]
- Soulami, K.B.; Ghribi, E.; Labyed, Y.; Saidi, M.N.; Tamtaoui, A.; Kaabouch, N. Mixed-reality aided system for glioblastoma resection surgery using microsoft HoloLens. In Proceedings of the 2019 IEEE International Conference on Electro Information Technology (EIT), Brookings, SD, USA, 20–22 May 2019; pp. 79–84. [Google Scholar]
- Lee, J.D.; Wu, H.K.; Wu, C.T. A projection-based AR system to display brain angiography via stereo vision. In Proceedings of the 2018 IEEE 7th Global Conference on Consumer Electronics (GCCE), Nara, Japan, 9–12 October 2018; pp. 130–131. [Google Scholar]
- Vuforia Developer Library. Introduction to Model Targets. 2021. Available online: https://library.vuforia.com/articles/Solution/introduction-model-targets-unity.html (accessed on 8 April 2022).
- Zhang, S.; Tian, J.; Zhai, X.; Ji, Z. Detection of Porcine Huddling Behaviour Based on Improved Multi-view SSD. In Proceedings of the 2020 Chinese Automation Congress (CAC), Shanghai, China, 6–8 November 2020; pp. 5494–5499. [Google Scholar]
- Rios, A.C.; dos Reis, D.H.; da Silva, R.M.; Cuadros, M.A.D.S.L.; Gamarra, D.F.T. Comparison of the YOLOv3 and SSD MobileNet v2 Algorithms for Identifying Objects in Images from an Indoor Robotics Dataset. In Proceedings of the 2021 14th IEEE International Conference on Industry Applications (INDUSCON), Sao Paulo, Brazil, 15–18 August 2021; pp. 96–101. [Google Scholar]
- Phadnis, R.; Mishra, J.; Bendale, S. Objects talk-object detection and pattern tracking using tensorflow. In Proceedings of the 2018 Second International Conference on Inventive Communication and Computational Technologies (ICICCT), Coimbatore, India, 20–21 April 2018; pp. 1216–1219. [Google Scholar]
- Lin, T. LabelImg. 2015. Available online: https://github.com/tzutalin/labelImg (accessed on 8 April 2022).
- Kilic, I.; Aydin, G. Traffic sign detection and recognition using tensorflow’s object detection API with a new benchmark dataset. In Proceedings of the 2020 International Conference on Electrical Engineering (ICEE), Istanbul, Turkey, 25–27 September 2020; pp. 1–5. [Google Scholar]
- Ren, S.; He, K.; Girshick, R.; Sun, J. Faster r-cnn: Towards real-time object detection with region proposal networks. In Proceedings of the 28th Annual Conference on Neural Information Processing Systems, Montreal, Canada, 7–12 December 2015; pp. 91–99. [Google Scholar]
- Park, Y.; Chin, S. An Efficient Method of Scanning and Tracking for AR. Int. J. Adv. Cult. Technol. 2019, 7, 302–307. [Google Scholar]
- Enoxsoftware. About OpenCV for Unity. 2016. Available online: https://enoxsoftware.com/opencvforunity/documentation/about-opencv-for-unity (accessed on 8 April 2022).
Model Name | Speed (ms) | COCO (mAP) | Output |
---|---|---|---|
ssd_mobilenet_v1_coco | 30 | 21 | Boxes |
ssd_mobilenet_v2_coco | 31 | 22 | Boxes |
ssd_inception_v2_coco | 42 | 24 | Boxes |
faster_rcnn_resnet101_coco | 106 | 32 | Boxes |
Description | IoU | mAP |
---|---|---|
Average Precision | 0.50:0.95 | 0.814 |
Average Precision | 0.50 | 0.976 |
Average Precision | 0.75 | 0.954 |
Mobile Device | CPU | RAM | FPS |
---|---|---|---|
Samsung S21 Ultra | Samsung Exynos 2100 | 12 GB | 8.5 |
One Plus 6T | Octa-core Kryo 385 | 6 GB | 5.5 |
Device | Luminous Intensities | Distance (cm) | Focus | Detection Status |
---|---|---|---|---|
1 | 25 | 50 | No | Yes |
1 | 25 | 100 | No | No |
1 | 25 | 50 | Yes | Yes |
1 | 25 | 100 | Yes | Yes |
1 | 150 | 50 | No | Yes |
1 | 150 | 100 | No | No |
1 | 150 | 50 | Yes | Yes |
1 | 150 | 100 | Yes | Yes |
1 | 350 | 50 | No | Yes |
1 | 350 | 100 | No | Yes |
1 | 350 | 50 | Yes | Yes |
1 | 350 | 100 | Yes | Yes |
2 | 25 | 50 | No | Yes |
2 | 25 | 100 | No | No |
2 | 25 | 50 | Yes | Yes |
2 | 25 | 100 | Yes | No |
2 | 150 | 50 | No | Yes |
2 | 150 | 100 | No | No |
2 | 150 | 50 | Yes | Yes |
2 | 150 | 100 | Yes | No |
2 | 350 | 50 | No | Yes |
2 | 350 | 100 | No | No |
2 | 350 | 50 | Yes | Yes |
2 | 350 | 100 | Yes | Yes |