AI, Volume 5, Issue 3 (September 2024) – 12 articles

  • Issues are regarded as officially published after their release is announced to the table of contents alert mailing list.
  • You may sign up for e-mail alerts to receive the table of contents of newly released issues.
  • PDF is the official format for papers published in both HTML and PDF forms. To view a paper in PDF format, click the "PDF Full-text" link and open it with the free Adobe Reader.
24 pages, 7706 KiB  
Article
Computer Vision for Safety Management in the Steel Industry
by Roy Lan, Ibukun Awolusi and Jiannan Cai
AI 2024, 5(3), 1192-1215; https://doi.org/10.3390/ai5030058 - 19 Jul 2024
Abstract
The complex nature of the steel manufacturing environment, characterized by different types of hazards from materials and large machinery, makes objective and automated monitoring critical as a replacement for traditional methods, which are manual and subjective. This study explores the feasibility of implementing computer vision for safety management in steel manufacturing, with a case study implementation for automated hard hat detection. The research combines hazard characterization, technology assessment, and a pilot case study. First, a comprehensive review of steel manufacturing hazards was conducted, followed by the application of TOPSIS, a multi-criteria decision analysis method, to select a candidate computer vision system from eight commercially available systems. The pilot study evaluated YOLOv5m, YOLOv8m, and YOLOv9c models on 703 grayscale images from a steel mini-mill, assessing performance through precision, recall, F1-score, mAP, specificity, and AUC metrics. Results showed high overall accuracy in hard hat detection, with YOLOv9c slightly outperforming the others, particularly in detecting safety violations. Challenges emerged in handling class imbalance and accurately identifying absent hard hats, especially given the limitations of grayscale imagery. Despite these challenges, this study affirms the feasibility of computer-vision-based safety management in steel manufacturing, providing a foundation for future automated safety monitoring systems. The findings underscore the need for larger, more diverse datasets and advanced techniques to address industry-specific complexities, paving the way for enhanced workplace safety in challenging industrial environments.
(This article belongs to the Special Issue Artificial Intelligence-Based Image Processing and Computer Vision)
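The TOPSIS selection step mentioned in the abstract can be sketched in a few lines. The candidate systems, criteria, and weights below are hypothetical placeholders, not the eight commercial systems the authors actually compared:

```python
# Minimal TOPSIS sketch: rank alternatives by closeness to an ideal solution.
import math

def topsis(matrix, weights, benefit):
    # matrix: rows = alternatives, cols = criteria; benefit[j] marks
    # whether criterion j is better when larger.
    ncols = len(weights)
    # 1. Vector-normalize each column, then apply weights.
    norms = [math.sqrt(sum(row[j] ** 2 for row in matrix)) for j in range(ncols)]
    v = [[row[j] / norms[j] * weights[j] for j in range(ncols)] for row in matrix]
    # 2. Ideal (best) and anti-ideal (worst) points per criterion.
    ideal = [max(col) if benefit[j] else min(col) for j, col in enumerate(zip(*v))]
    anti = [min(col) if benefit[j] else max(col) for j, col in enumerate(zip(*v))]
    # 3. Closeness coefficient: distance to anti-ideal over total distance.
    scores = []
    for row in v:
        d_pos = math.dist(row, ideal)
        d_neg = math.dist(row, anti)
        scores.append(d_neg / (d_pos + d_neg))
    return scores

# Three hypothetical vision systems scored on accuracy (benefit) and cost (cost).
systems = [[0.90, 120.0], [0.85, 80.0], [0.70, 60.0]]
scores = topsis(systems, weights=[0.6, 0.4], benefit=[True, False])
best = max(range(len(scores)), key=scores.__getitem__)
```

With these toy numbers the middle system wins: it trades a little accuracy for a much lower cost.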

20 pages, 1623 KiB  
Article
Optimization Strategies for Atari Game Environments: Integrating Snake Optimization Algorithm and Energy Valley Optimization in Reinforcement Learning Models
by Sadeq Mohammed Kadhm Sarkhi and Hakan Koyuncu
AI 2024, 5(3), 1172-1191; https://doi.org/10.3390/ai5030057 - 17 Jul 2024
Abstract
One of the biggest problems in gaming AI is how to optimize and adapt a deep reinforcement learning (DRL) model, especially when it runs inside complex, dynamic environments like “PacMan”. Existing research has concentrated largely on basic DRL approaches, with little use of advanced optimization methods. This paper addresses these gaps by proposing an innovative methodology that combines DRL with high-level metaheuristic optimization methods. Specifically, it tunes DRL models on the “PacMan” domain with the Energy Serpent Optimizer (ESO) for hyperparameter search. These adaptations give the AI agent a major performance boost, with measurable gains in adaptability, response time, and efficiency in the more complex game space. This work innovatively incorporates a metaheuristic optimization algorithm into DRL for Atari gaming AI. This integration is essential for the improvement of DRL models in general and allows for more efficient, real-time game play. The work delivers a comprehensive empirical study of these algorithms that not only verifies their capabilities in practice but also sets a state of the art for AI-driven game development. Beyond improving gaming AI, the developments could eventually apply to more sophisticated gaming environments, ongoing improvement of algorithms during execution, real-time adaptation in learning, and even robotics and autonomous systems. The study further illustrates the need for an even-handed and conscientious application of AI in gaming, specifically regarding questions of fairness and addiction.
(This article belongs to the Section AI Systems: Theory and Applications)
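A population-based hyperparameter search of the kind the abstract describes can be sketched as follows. The fitness function and the elitist perturbation rule are illustrative stand-ins, not the authors' actual Energy Serpent Optimizer:

```python
# Generic elitist metaheuristic search over two DRL hyperparameters.
import random

random.seed(0)

def fitness(lr, gamma):
    # Stand-in for "average episode reward of an agent trained with (lr, gamma)";
    # here it simply peaks at lr=1e-3, gamma=0.99.
    return -((lr - 1e-3) * 1000) ** 2 - ((gamma - 0.99) * 10) ** 2

pop = [(random.uniform(1e-4, 1e-2), random.uniform(0.8, 0.999)) for _ in range(20)]
for step in range(50):
    pop.sort(key=lambda p: fitness(*p), reverse=True)
    elite = pop[:5]                      # keep the best candidates unchanged
    pop = elite + [                      # perturb elites to explore nearby space
        (max(1e-5, lr + random.gauss(0, 1e-3)),
         min(0.999, max(0.8, g + random.gauss(0, 0.01))))
        for lr, g in random.choices(elite, k=15)
    ]
best_lr, best_gamma = max(pop, key=lambda p: fitness(*p))
```

In a real DRL pipeline, evaluating `fitness` would mean training an agent per candidate, so the search budget dominates the cost.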

40 pages, 5912 KiB  
Article
ConVision Benchmark: A Contemporary Framework to Benchmark CNN and ViT Models
by Shreyas Bangalore Vijayakumar, Krishna Teja Chitty-Venkata, Kanishk Arya and Arun K. Somani
AI 2024, 5(3), 1132-1171; https://doi.org/10.3390/ai5030056 - 11 Jul 2024
Abstract
Convolutional Neural Networks (CNNs) and Vision Transformers (ViTs) have shown remarkable performance in computer vision tasks, including object detection and image recognition. These models have evolved significantly in architecture, efficiency, and versatility. Concurrently, deep-learning frameworks have diversified, with versions that often complicate reproducibility and unified benchmarking. We propose ConVision Benchmark, a comprehensive framework in PyTorch, to standardize the implementation and evaluation of state-of-the-art CNN and ViT models. This framework addresses common challenges such as version mismatches and inconsistent validation metrics. As a proof of concept, we performed an extensive benchmark analysis on a COVID-19 dataset, encompassing nearly 200 CNN and ViT models in which DenseNet-161 and MaxViT-Tiny achieved exceptional accuracy with a peak performance of around 95%. Although we primarily used the COVID-19 dataset for image classification, the framework is adaptable to a variety of datasets, enhancing its applicability across different domains. Our methodology includes rigorous performance evaluations, highlighting metrics such as accuracy, precision, recall, F1 score, and computational efficiency (FLOPs, MACs, CPU, and GPU latency). The ConVision Benchmark facilitates a comprehensive understanding of model efficacy, aiding researchers in deploying high-performance models for diverse applications.
(This article belongs to the Special Issue Artificial Intelligence-Based Image Processing and Computer Vision)
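The latency side of such a benchmark loop can be sketched with a stub model. In the real framework the model would be a PyTorch CNN or ViT and the report would also cover FLOPs and MACs; this sketch only shows the warm-up-then-measure pattern:

```python
# Measure mean and tail inference latency for a callable "model".
import time
import statistics

def dummy_model(x):
    # Stand-in inference: a fixed amount of arithmetic per call.
    return sum(v * v for v in x)

def benchmark(model, inp, warmup=5, runs=20):
    for _ in range(warmup):              # warm-up calls excluded from timing
        model(inp)
    times = []
    for _ in range(runs):
        t0 = time.perf_counter()
        model(inp)
        times.append((time.perf_counter() - t0) * 1e3)   # milliseconds
    return {"mean_ms": statistics.mean(times),
            "p95_ms": sorted(times)[int(0.95 * runs) - 1]}

stats = benchmark(dummy_model, list(range(10_000)))
```

Reporting a tail percentile alongside the mean matters because GPU and CPU latency distributions are often skewed by scheduling noise.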

21 pages, 2574 KiB  
Article
ZTCloudGuard: Zero Trust Context-Aware Access Management Framework to Avoid Medical Errors in the Era of Generative AI and Cloud-Based Health Information Ecosystems
by Khalid Al-hammuri, Fayez Gebali and Awos Kanan
AI 2024, 5(3), 1111-1131; https://doi.org/10.3390/ai5030055 - 8 Jul 2024
Abstract
Managing access between large numbers of distributed medical devices has become a crucial aspect of modern healthcare systems, enabling the establishment of smart hospitals and telehealth infrastructure. However, as telehealth technology continues to evolve and Internet of Things (IoT) devices become more widely used, they are also increasingly exposed to various types of vulnerabilities and medical errors. In healthcare information systems, about 90% of vulnerabilities emerge from medical error and human error. As a result, there is a need for additional research and development of security tools to prevent such attacks. This article proposes a zero-trust-based context-aware framework for managing access to the main components of the cloud ecosystem, including users, devices, and output data. The main goal and benefit of the proposed framework is to build a scoring system to prevent or alleviate medical errors while using distributed medical devices in cloud-based healthcare information systems. The framework has two main scoring criteria to maintain the chain of trust. First, it proposes a critical trust score based on cloud-native microservices for authentication, encryption, logging, and authorizations. Second, a bond trust scoring system is created to assess the real-time semantic and syntactic analysis of attributes stored in a healthcare information system. The analysis is based on a pre-trained machine learning model that generates the semantic and syntactic scores. The framework also takes into account regulatory compliance and user consent in the creation of the scoring system. The advantage of this method is that it applies to any language and adapts to all attributes, as it relies on a language model, not just a set of predefined and limited attributes. The results show a high F1 score of 93.5%, demonstrating the framework's validity for detecting medical errors.
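A minimal sketch of the two-criteria scoring idea, assuming hypothetical weights, sub-scores, and threshold (the paper derives its semantic and syntactic scores from a pre-trained language model rather than from fixed values like these):

```python
# Illustrative chain-of-trust scoring with two criteria.

def critical_trust(auth, encryption, logging, authorization):
    # Each sub-score in [0, 1], e.g. emitted by cloud-native microservices.
    return (auth + encryption + logging + authorization) / 4

def bond_trust(semantic, syntactic):
    # Agreement scores for the attributes being read or written.
    return 0.5 * semantic + 0.5 * syntactic

def access_decision(critical, bond, threshold=0.8):
    # Weighted combination; deny access when the chain of trust is too weak.
    score = 0.6 * critical + 0.4 * bond
    return ("allow" if score >= threshold else "deny", round(score, 3))

decision, score = access_decision(critical_trust(1.0, 1.0, 0.9, 1.0),
                                  bond_trust(0.95, 0.90))
```

The zero-trust point is that the decision is recomputed per request from live signals, never cached as a standing grant.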

16 pages, 1862 KiB  
Article
Predicting Number of Vehicles Involved in Rural Crashes Using Learning Vector Quantization Algorithm
by Sina Shaffiee Haghshenas, Giuseppe Guido, Sami Shaffiee Haghshenas and Vittorio Astarita
AI 2024, 5(3), 1095-1110; https://doi.org/10.3390/ai5030054 - 8 Jul 2024
Abstract
Roads represent very important infrastructure and play a significant role in economic, cultural, and social growth. Therefore, there is a critical need for many researchers to model crash injury severity in order to study how safe roads are. When measuring the cost of crashes, the severity of the crash is a critical criterion, and it is classified into various categories. The number of vehicles involved in the crash (NVIC) is a crucial factor in all of these categories. For this purpose, this research examines road safety and provides a prediction model for the number of vehicles involved in a crash. Specifically, learning vector quantization (LVQ 2.1), one of the sub-branches of artificial neural networks (ANNs), is used to build a classification model. The novelty of this study lies in demonstrating LVQ 2.1’s efficacy in categorizing accident data and its ability to improve road safety strategies. The LVQ 2.1 algorithm is particularly suitable for classification tasks and works by adjusting prototype vectors to improve the classification performance. The research emphasizes how urgently better prediction algorithms are needed to handle issues related to road safety. In this study, a dataset of 564 crash records from rural roads in Calabria, a region in southern Italy, between 2017 and 2048 was utilized. The study analyzed several key parameters, including daylight, the crash type, day of the week, location, speed limit, average speed, and annual average daily traffic, as input variables to predict the number of vehicles involved in rural crashes. The findings revealed that the “crash type” parameter had the most significant impact, whereas “location” had the least significant impact on the occurrence of rural crashes in the investigated areas.
(This article belongs to the Section AI Systems: Theory and Applications)
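The prototype-adjustment rule that defines LVQ 2.1 can be sketched as follows; the prototypes and sample below are toy values, not the study's crash records:

```python
# One LVQ 2.1 training step: adjust the two nearest prototypes when one is
# correct, one is wrong, and the sample falls inside the update window.
import math

def lvq21_step(x, y, prototypes, labels, lr=0.1, w=0.3):
    d = [math.dist(x, m) for m in prototypes]
    i, j = sorted(range(len(d)), key=d.__getitem__)[:2]   # two nearest prototypes
    window = (1 - w) / (1 + w)
    # Update only if exactly one neighbor has the correct label and x lies
    # inside the window between the two.
    if (labels[i] == y) != (labels[j] == y) and min(d[i] / d[j], d[j] / d[i]) > window:
        if labels[j] == y:
            i, j = j, i                   # make i the correctly labeled prototype
        prototypes[i] = [m + lr * (xi - m) for m, xi in zip(prototypes[i], x)]  # attract
        prototypes[j] = [m - lr * (xi - m) for m, xi in zip(prototypes[j], x)]  # repel
    return prototypes

protos = [[0.0, 0.0], [1.0, 1.0]]
labels = [0, 1]
protos = lvq21_step([0.4, 0.4], 0, protos, labels)
```

The window condition is what distinguishes LVQ 2.1 from plain LVQ: only samples near the decision boundary move the prototypes.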

29 pages, 861 KiB  
Article
ChatGPT Code Detection: Techniques for Uncovering the Source of Code
by Marc Oedingen, Raphael C. Engelhardt, Robin Denz, Maximilian Hammer and Wolfgang Konen
AI 2024, 5(3), 1066-1094; https://doi.org/10.3390/ai5030053 - 2 Jul 2024
Abstract
In recent times, large language models (LLMs) have made significant strides in generating computer code, blurring the lines between code created by humans and code produced by artificial intelligence (AI). As these technologies evolve rapidly, it is crucial to explore how they influence code generation, especially given the risk of misuse in areas such as higher education. The present paper explores this issue by using advanced classification techniques to differentiate between code written by humans and code generated by ChatGPT, a type of LLM. We employ a new approach that combines powerful embedding features (black-box) with supervised learning algorithms including Deep Neural Networks, Random Forests, and Extreme Gradient Boosting to achieve this differentiation with an impressive accuracy of 98%. For the successful combinations, we also examine their model calibration, showing that some of the models are extremely well calibrated. Additionally, we present white-box features and an interpretable Bayes classifier to elucidate critical differences between the code sources, enhancing the explainability and transparency of our approach. Both approaches work well, but provide at most 85–88% accuracy. Tests on a small sample of untrained humans suggest that humans do not solve the task much better than random guessing. This study is crucial in understanding and mitigating the potential risks associated with using AI in code generation, particularly in the context of higher education, software development, and competitive programming.
(This article belongs to the Topic AI Chatbots: Threat or Opportunity?)
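The white-box side of such an approach can be illustrated with a toy feature extractor. The particular features below (average line length, comment ratio, keyword ratio) are plausible style signals for this kind of classifier, not necessarily the authors' feature set:

```python
# Extract simple interpretable style features from a Python source string.
import keyword

def code_features(src):
    lines = [ln for ln in src.splitlines() if ln.strip()]
    tokens = src.split()
    return {
        "avg_line_len": sum(map(len, lines)) / len(lines),
        "comment_ratio": sum(ln.lstrip().startswith("#") for ln in lines) / len(lines),
        "keyword_ratio": sum(t in keyword.kwlist for t in tokens) / len(tokens),
    }

sample = "def add(a, b):\n    # sum two numbers\n    return a + b\n"
feats = code_features(sample)
```

Features like these feed directly into an interpretable classifier, whereas embedding (black-box) features trade that transparency for accuracy.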

17 pages, 3202 KiB  
Article
Arabic Spam Tweets Classification: A Comprehensive Machine Learning Approach
by Wafa Hussain Hantom and Atta Rahman
AI 2024, 5(3), 1049-1065; https://doi.org/10.3390/ai5030052 - 2 Jul 2024
Abstract
Nowadays, one of the most common problems faced by Twitter (also known as X) users, including individuals as well as organizations, is dealing with spam tweets. The problem continues to proliferate due to the increasing popularity and number of users of social media platforms. Due to this overwhelming interest, spammers can post texts, images, and videos containing suspicious links that can be used to spread viruses, rumors, negative marketing, and sarcasm, and potentially hack the user’s information. Spam detection is among the hottest research areas in natural language processing (NLP) and cybersecurity. Several studies have been conducted in this regard, but they mainly focus on the English language. However, Arabic tweet spam detection still has a long way to go, especially for the diverse dialects other than modern standard Arabic (MSA), since the standard dialect is seldom used in tweets. The situation demands an automated, robust, and efficient Arabic spam tweet detection approach. To address the issue, this research investigates various machine learning and deep learning models for detecting spam tweets in Arabic, including Random Forest (RF), Support Vector Machine (SVM), Naive Bayes (NB), and Long Short-Term Memory (LSTM). In this regard, we have focused on the words as well as the meaning of the tweet text. Over several experiments, the proposed models produced promising results in contrast to previous approaches on the same and diverse datasets. The results showed that the RF classifier achieved 96.78% accuracy and the LSTM classifier 94.56%, followed by the SVM classifier at 82%. Further, in terms of F1-score, there is an improvement of 21.38%, 19.16%, and 5.2% using the RF, LSTM, and SVM classifiers, respectively, compared to previous schemes on the same dataset.
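As a minimal illustration of the classification task, here is a bag-of-words Naive Bayes spam classifier. English toy tokens stand in for the Arabic tweet vocabulary, and the paper's best-performing model is Random Forest rather than NB:

```python
# Tiny multinomial Naive Bayes with Laplace smoothing: 1 = spam, 0 = ham.
from collections import Counter
import math

train = [("win free prize now", 1), ("free money click", 1),
         ("meeting at noon", 0), ("see you at lunch", 0)]

counts = {0: Counter(), 1: Counter()}
priors = Counter(y for _, y in train)
for text, y in train:
    counts[y].update(text.split())
vocab = set(counts[0]) | set(counts[1])

def predict(text):
    scores = {}
    for y in (0, 1):
        s = math.log(priors[y] / len(train))
        total = sum(counts[y].values())
        for w in text.split():
            # Laplace smoothing keeps unseen words from zeroing the score.
            s += math.log((counts[y][w] + 1) / (total + len(vocab)))
        scores[y] = s
    return max(scores, key=scores.get)

label = predict("free prize click now")
```

A real Arabic pipeline would add dialect-aware tokenization and normalization before counting, which is where much of the difficulty lies.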

19 pages, 543 KiB  
Article
Bio-Inspired Hyperparameter Tuning of Federated Learning for Student Activity Recognition in Online Exam Environment
by Ramu Shankarappa, Nandini Prasad, Ram Mohana Reddy Guddeti and Biju R. Mohan
AI 2024, 5(3), 1030-1048; https://doi.org/10.3390/ai5030051 - 1 Jul 2024
Abstract
Nowadays, online examination (exam for short) platforms are becoming more popular, demanding strong security measures for digital learning environments. This includes addressing key challenges such as head pose detection and estimation, which are integral to applications like automatic face recognition, advanced surveillance systems, intuitive human–computer interfaces, and driving safety measures. The proposed work holds significant potential for enhancing the security and reliability of online exam platforms by accurately classifying students’ attentiveness based on distinct head poses, leveraging federated learning and deep learning models. In this work, we considered five head poses: front face, down face, right face, up face, and left face. A federated learning (FL) framework with a pre-trained deep learning model (ResNet50) was used to accomplish the classification task on the FL framework’s local client devices. However, identifying the best hyperparameters for the local client ResNet50 model is challenging. Hence, in this study, we propose two hybrid bio-inspired optimization methods, namely Particle Swarm Optimization with Genetic Algorithm (PSOGA) and Particle Swarm Optimization with Elitist Genetic Algorithm (PSOEGA), to fine-tune the hyperparameters of the ResNet50 model. The bio-inspired optimized methods train and classify students’ behavior in an online exam environment. The FL framework trains the client model locally and sends the updated weights to the server model. The proposed hybrid bio-inspired algorithms outperform GA and PSO when used independently. The proposed PSOGA not only outperforms the proposed PSOEGA but also outperforms the benchmark algorithms considered for performance evaluation, achieving an accuracy of 95.97%.
(This article belongs to the Section AI Systems: Theory and Applications)
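A PSO-plus-GA hybrid of the kind the abstract describes can be sketched on a toy objective. The objective, constants, and update details below are illustrative, not the paper's ResNet50 hyperparameter search:

```python
# Toy PSOGA-style hybrid: standard PSO updates, then a GA crossover/mutation
# pass that replaces the worst particle each generation.
import random

random.seed(1)

def objective(p):
    # Stand-in for validation accuracy; maximized at (0.5, 0.5).
    return -((p[0] - 0.5) ** 2 + (p[1] - 0.5) ** 2)

n, dims = 10, 2
pos = [[random.random() for _ in range(dims)] for _ in range(n)]
vel = [[0.0] * dims for _ in range(n)]
pbest = [p[:] for p in pos]
gbest = max(pos, key=objective)[:]

for it in range(40):
    for i in range(n):
        for d in range(dims):            # PSO velocity/position update
            vel[i][d] = (0.7 * vel[i][d]
                         + 1.5 * random.random() * (pbest[i][d] - pos[i][d])
                         + 1.5 * random.random() * (gbest[d] - pos[i][d]))
            pos[i][d] += vel[i][d]
        if objective(pos[i]) > objective(pbest[i]):
            pbest[i] = pos[i][:]
        if objective(pos[i]) > objective(gbest):
            gbest = pos[i][:]
    # GA pass: cross two random personal bests, mutate, replace the worst particle.
    a, b = random.sample(pbest, 2)
    child = [random.choice(g) + random.gauss(0, 0.02) for g in zip(a, b)]
    worst = min(range(n), key=lambda i: objective(pos[i]))
    pos[worst] = child
```

In the paper's setting each `objective` call is a federated training round, so the hybrid's value lies in finding good hyperparameters in few evaluations.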

19 pages, 2134 KiB  
Article
Utilizing Genetic Algorithms in Conjunction with ANN-Based Stock Valuation Models to Enhance the Optimization of Stock Investment Decisions
by Ying-Hua Chang and Chen-Wei Huang
AI 2024, 5(3), 1011-1029; https://doi.org/10.3390/ai5030050 - 1 Jul 2024
Abstract
Navigating the stock market’s unpredictability and reducing vulnerability to its volatility requires well-informed decisions on stock selection, capital allocation, and transaction timing. While stock selection can be accomplished through fundamental analysis, the extensive data involved often pose challenges in discerning pertinent information. Timing, typically managed through technical analysis, may experience delays, leading to missed opportunities for stock transactions. Capital allocation, a quintessential resource optimization dilemma, necessitates meticulous planning for resolution. Consequently, this thesis leverages the optimization attributes of genetic algorithms, in conjunction with fundamental analysis and the concept of combination with repetition optimization, to identify appropriate stock selection and capital allocation strategies. Regarding timing, it employs deep learning coupled with the Ohlson model for stock valuation to ascertain the intrinsic worth of stocks. This lays the groundwork for transactions to yield favorable returns. In terms of experimentation, this study juxtaposes the integrated analytical approach of this thesis with the equal capital allocation strategy, TAIEX, and the Taiwan 50 index. The findings affirm that irrespective of the Taiwan stock market’s bullish or bearish tendencies, the method proposed in this study indeed facilitates investors in making astute investment decisions and attaining substantial profits.
(This article belongs to the Special Issue AI in Finance: Leveraging AI to Transform Financial Services)
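The "combination with repetition" allocation idea can be sketched with a toy genetic algorithm: each chromosome assigns one of the selected stocks to each equal capital unit, so the same stock may be picked repeatedly. The expected returns below are made-up numbers, not outputs of the paper's Ohlson-based valuation model:

```python
# Toy GA for allocating 10 equal capital units among three stocks.
import random

random.seed(2)

returns = {"A": 0.08, "B": 0.12, "C": 0.05}   # hypothetical expected returns
stocks, units = list(returns), 10

def fitness(alloc):                           # alloc: one stock name per unit
    return sum(returns[s] for s in alloc)

pop = [[random.choice(stocks) for _ in range(units)] for _ in range(30)]
for gen in range(60):
    pop.sort(key=fitness, reverse=True)
    parents = pop[:10]                        # elitist selection
    children = []
    for _ in range(20):
        a, b = random.sample(parents, 2)
        cut = random.randrange(1, units)      # one-point crossover
        child = a[:cut] + b[cut:]
        if random.random() < 0.2:             # mutation: reassign one unit
            child[random.randrange(units)] = random.choice(stocks)
        children.append(child)
    pop = parents + children
best = max(pop, key=fitness)
```

With a linear fitness like this the optimum is trivially all-one-stock; real fitness functions add risk and diversification terms, which is where the GA earns its keep.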

21 pages, 2273 KiB  
Review
Artificial Intelligence-Driven Facial Image Analysis for the Early Detection of Rare Diseases: Legal, Ethical, Forensic, and Cybersecurity Considerations
by Peter Kováč, Peter Jackuliak, Alexandra Bražinová, Ivan Varga, Michal Aláč, Martin Smatana, Dušan Lovich and Andrej Thurzo
AI 2024, 5(3), 990-1010; https://doi.org/10.3390/ai5030049 - 27 Jun 2024
Abstract
This narrative review explores the potential, complexities, and consequences of using artificial intelligence (AI) to screen large government-held facial image databases for the early detection of rare genetic diseases. Government-held facial image databases, combined with the power of artificial intelligence, offer the potential to revolutionize the early diagnosis of rare genetic diseases. AI-powered phenotyping, as exemplified by the Face2Gene app, enables highly accurate genetic assessments from simple photographs. This and similar breakthrough technologies raise significant privacy and ethical concerns about potential government overreach augmented with the power of AI. This paper explores the concept, methods, and legal complexities of AI-based phenotyping within the EU. It highlights the transformative potential of such tools for public health while emphasizing the critical need to balance innovation with the protection of individual privacy and ethical boundaries. This comprehensive overview underscores the urgent need to develop robust safeguards around individual rights while responsibly utilizing AI’s potential for improved healthcare outcomes, including within a forensic context. Furthermore, the intersection of AI and sensitive genetic data necessitates proactive cybersecurity measures. Current and future developments must focus on securing AI models against attacks, ensuring data integrity, and safeguarding the privacy of individuals within this technological landscape.

42 pages, 9818 KiB  
Review
A Review of Natural-Language-Instructed Robot Execution Systems
by Rui Liu, Yibei Guo, Runxiang Jin and Xiaoli Zhang
AI 2024, 5(3), 948-989; https://doi.org/10.3390/ai5030048 - 26 Jun 2024
Abstract
It is natural and efficient to use human natural language (NL) directly to instruct robot task executions without prior user knowledge of instruction patterns. Currently, NL-instructed robot execution (NLexe) is employed in various robotic scenarios, including manufacturing, daily assistance, and health caregiving. It is imperative to summarize the current NLexe systems and discuss future development trends to provide valuable insights for upcoming NLexe research. This review categorizes NLexe systems into four types based on the robot’s cognition level during task execution: NL-based execution control systems, NL-based execution training systems, NL-based interactive execution systems, and NL-based social execution systems. For each type of NLexe system, typical application scenarios are introduced along with advantages, disadvantages, and open problems. Typical implementation methods and future research trends of NLexe systems are then discussed to guide future NLexe research.
(This article belongs to the Section AI in Autonomous Systems)

10 pages, 1527 KiB  
Article
TLtrack: Combining Transformers and a Linear Model for Robust Multi-Object Tracking
by Zuojie He, Kai Zhao and Dan Zeng
AI 2024, 5(3), 938-947; https://doi.org/10.3390/ai5030047 - 26 Jun 2024
Abstract
Multi-object tracking (MOT) aims at estimating the locations and identities of objects in videos. Many modern multiple-object tracking systems follow the tracking-by-detection paradigm, consisting of a detector followed by a method for associating detections into tracks. Associating detections through motion-based similarity heuristics is the most basic approach. Motion models aim to utilize motion information to estimate future locations, playing an important role in enhancing the performance of association. Recently, a large-scale dataset, DanceTrack, in which objects have uniform appearance and diverse motion patterns, was proposed. With existing hand-crafted motion models, it is hard to achieve decent results on DanceTrack because of the lack of prior knowledge. In this work, we present a motion-based algorithm named TLtrack, which adopts a hybrid strategy to make motion estimates based on confidence scores. For high-confidence detections, TLtrack employs transformers to predict their locations. For low-confidence detections, a simple linear model that estimates locations from trajectory history is used. TLtrack can not only consider the historical information of the trajectory but also analyze the latest movements. Our experimental results on the DanceTrack dataset show that our method achieves the best performance compared with other motion models.
(This article belongs to the Section AI in Autonomous Systems)
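TLtrack's confidence-gated hybrid can be sketched as follows. The transformer branch is stubbed out with the same linear rule, since only the gating logic and the constant-velocity fallback are illustrated here:

```python
# Confidence-gated motion estimate: pick a prediction branch per detection.

def linear_estimate(history):
    # history: list of (x, y) box centers, oldest first; constant-velocity model
    # extrapolates from the last two positions.
    (x0, y0), (x1, y1) = history[-2], history[-1]
    return (2 * x1 - x0, 2 * y1 - y0)

def transformer_estimate(history):
    # Stand-in for the learned transformer predictor used in the real system.
    return linear_estimate(history)

def predict_next(history, conf, threshold=0.6):
    # High-confidence detections go to the (learned) branch,
    # low-confidence ones fall back to the linear model.
    branch = transformer_estimate if conf >= threshold else linear_estimate
    return branch(history)

track = [(10.0, 5.0), (12.0, 6.0), (14.0, 7.0)]
pred = predict_next(track, conf=0.3)   # low confidence -> linear branch
```

The gate keeps the expensive learned model for detections worth the cost, while the linear fallback stays robust when the detector is unsure.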
