
Search Results (252)

Search Parameters:
Keywords = Histogram of Oriented Gradients

20 pages, 7366 KiB  
Article
Histogram of Polarization Gradient for Target Tracking in Infrared DoFP Polarization Thermal Imaging
by Jianguo Yang, Dian Sheng, Weiqi Jin and Li Li
Remote Sens. 2025, 17(5), 907; https://doi.org/10.3390/rs17050907 - 4 Mar 2025
Viewed by 123
Abstract
Division-of-focal-plane (DoFP) polarization imaging systems have demonstrated considerable promise in target detection and tracking in complex backgrounds. However, existing methods face challenges, including dependence on complex image preprocessing procedures and limited real-time performance. To address these issues, this study presents a novel histogram of polarization gradient (HPG) feature descriptor that enables efficient feature representation of polarization mosaic images. First, a polarization distance calculation model based on normalized cross-correlation (NCC) and local variance is constructed, which enhances the robustness of gradient feature extraction through dynamic weight adjustment. Second, a sparse Laplacian filter is introduced to achieve refined gradient feature representation. Subsequently, adaptive polarization channel correlation weights and the second-order gradient are utilized to reconstruct the degree of linear polarization (DoLP). Finally, the gradient and DoLP sign information are ingeniously integrated to enhance the capability of directional expression, thus providing a new theoretical perspective for polarization mosaic image structure analysis. The experimental results obtained using a self-developed long-wave infrared DoFP polarization thermal imaging system demonstrate that, within the same FBACF tracking framework, the proposed HPG feature descriptor significantly outperforms traditional grayscale {8.22%, 2.93%}, histogram of oriented gradient (HOG) {5.86%, 2.41%}, and mosaic gradient histogram (MGH) {27.19%, 18.11%} feature descriptors in terms of precision and success rate. The processing speed of approximately 20 fps meets the requirements for real-time tracking applications, providing a novel technical solution for polarization imaging applications. Full article
(This article belongs to the Special Issue Recent Advances in Infrared Target Detection)
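The DoLP reconstruction at the heart of this descriptor builds on the standard Stokes-vector relations for a DoFP sensor. As a rough illustration (a plain Stokes computation, not the paper's adaptively weighted reconstruction), and assuming a 2x2 superpixel layout with 0/45/90/135-degree analyzers:

```python
import numpy as np

def dolp_from_dofp(mosaic):
    """Degree of linear polarization (DoLP) from a DoFP mosaic.

    Assumes a 2x2 superpixel layout: [[I0, I45], [I135, I90]].
    This is the textbook Stokes computation, not the weighted
    reconstruction proposed in the paper.
    """
    i0 = mosaic[0::2, 0::2].astype(float)
    i45 = mosaic[0::2, 1::2].astype(float)
    i90 = mosaic[1::2, 1::2].astype(float)
    i135 = mosaic[1::2, 0::2].astype(float)

    s0 = 0.5 * (i0 + i45 + i90 + i135)  # total intensity
    s1 = i0 - i90                       # horizontal vs. vertical
    s2 = i45 - i135                     # diagonal components
    return np.sqrt(s1 ** 2 + s2 ** 2) / np.maximum(s0, 1e-9)
```

Real DoFP pipelines also interpolate each channel back to full resolution before this step; the superpixel layout itself varies by sensor.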

7 pages, 2251 KiB  
Proceeding Paper
Image Classification Models as a Balancer Between Product Typicality and Novelty
by Hung-Hsiang Wang and Hsueh-Kuan Chen
Eng. Proc. 2025, 89(1), 21; https://doi.org/10.3390/engproc2025089021 - 26 Feb 2025
Viewed by 74
Abstract
Car styling is crucial for consumer acceptance and market success. As vehicle manufacturers shift to producing electric vehicles, they face the challenge of maintaining the typicality of their original products while presenting the innovation of new technologies. We propose a method that integrates artificial intelligence (AI)-generated images and image classification technology to help designers balance typicality and novelty effectively. We collected 118 pictures of electric vehicles and 122 pictures of fuel vehicles in 2024 from the BMW official website. Focusing on seven key visual features of the vehicles, we used the Waikato Environment for Knowledge Analysis (WEKA) to train an image classification model on the dataset through three separate training and testing sessions. First, we used prompts describing typical BMW design to generate images of new BMW electric vehicles in Stable Diffusion. The images consisted of 21 front views, 20 side views, and 20 rear views. The accuracy of the front-view model trained with the pyramid histogram of oriented gradients (PHOG) filter and a random forest classifier was 78.5%, and its test accuracy reached 95%. The accuracy of the rear-view model trained with the BinaryPatternsPyramid filter and a random forest classifier was 80.5%, and its test accuracy was 90%. However, the accuracy of the side-view model did not reach 70%. This implies that the distinction between BMW fuel vehicles and its electric vehicles rests mainly on the front and rear views rather than the side view. The results of this study showed that integrating image classification and AI-generated images can be used to examine the balance between product typicality and novelty, demonstrating the application of machine learning and AI tools to the study of car styling. Full article

17 pages, 1978 KiB  
Article
Lightweight Deepfake Detection Based on Multi-Feature Fusion
by Siddiqui Muhammad Yasir and Hyun Kim
Appl. Sci. 2025, 15(4), 1954; https://doi.org/10.3390/app15041954 - 13 Feb 2025
Viewed by 649
Abstract
Deepfake technology utilizes deep learning (DL)-based face manipulation techniques to seamlessly replace faces in videos, creating highly realistic but artificially generated content. Although this technology has beneficial applications in media and entertainment, misuse of its capabilities may lead to serious risks, including identity theft, cyberbullying, and false information. The integration of DL with visual cognition has resulted in important technological improvements, particularly in addressing privacy risks caused by artificially generated “deepfake” images on digital media platforms. In this study, we propose an efficient and lightweight method for detecting deepfake images and videos, making it suitable for devices with limited computational resources. To reduce the computational burden usually associated with DL models, our method integrates machine learning classifiers in combination with keyframing approaches and texture analysis. Moreover, features extracted with a histogram of oriented gradients (HOG), local binary patterns (LBP), and KAZE bands were integrated and evaluated using random forest, extreme gradient boosting, extra trees, and support vector classifier algorithms. Our findings show that a feature-level fusion of HOG, LBP, and KAZE features improves accuracy to 92% and 96% on FaceForensics++ and Celeb-DF(v2), respectively. Full article
(This article belongs to the Collection Trends and Prospects in Multimedia)
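The feature-level fusion described here amounts to concatenating per-image descriptor histograms before classification. A minimal numpy sketch with a deliberately simplified single-cell HOG and a basic 8-neighbour LBP (the paper's actual extractors, and the KAZE bands, are richer):

```python
import numpy as np

def hog_histogram(img, bins=9):
    """Global histogram of gradient orientations, weighted by magnitude.
    A single-cell simplification of HOG (no cells or block normalization)."""
    gy, gx = np.gradient(img.astype(float))
    mag = np.hypot(gx, gy)
    ang = np.mod(np.degrees(np.arctan2(gy, gx)), 180.0)  # unsigned orientation
    hist, _ = np.histogram(ang, bins=bins, range=(0, 180), weights=mag)
    return hist / (hist.sum() + 1e-9)

def lbp_histogram(img):
    """Histogram of plain 8-neighbour local binary patterns (no uniform mapping)."""
    img = img.astype(float)
    c = img[1:-1, 1:-1]
    code = np.zeros(c.shape, dtype=int)
    shifts = [(-1, -1), (-1, 0), (-1, 1), (0, 1), (1, 1), (1, 0), (1, -1), (0, -1)]
    for k, (dy, dx) in enumerate(shifts):
        nb = img[1 + dy:img.shape[0] - 1 + dy, 1 + dx:img.shape[1] - 1 + dx]
        code += (nb >= c).astype(int) << k
    hist, _ = np.histogram(code, bins=256, range=(0, 256))
    return hist / (hist.sum() + 1e-9)

def fused_features(img):
    """Feature-level fusion: concatenate the descriptors into one vector."""
    return np.concatenate([hog_histogram(img), lbp_histogram(img)])
```

The fused vector would then be fed to any of the classifiers the abstract lists (random forest, XGBoost, extra trees, SVC).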

24 pages, 3877 KiB  
Article
A Hybrid Approach for Sports Activity Recognition Using Key Body Descriptors and Hybrid Deep Learning Classifier
by Muhammad Tayyab, Sulaiman Abdullah Alateyah, Mohammed Alnusayri, Mohammed Alatiyyah, Dina Abdulaziz AlHammadi, Ahmad Jalal and Hui Liu
Sensors 2025, 25(2), 441; https://doi.org/10.3390/s25020441 - 13 Jan 2025
Viewed by 631
Abstract
This paper presents an approach for event recognition in sequential images using human body part features and their surrounding context. Key body points were approximated to track and monitor their presence in complex scenarios. Various feature descriptors, including MSER (Maximally Stable Extremal Regions), SURF (Speeded-Up Robust Features), distance transform, and DOF (Degrees of Freedom), were applied to skeleton points, while BRIEF (Binary Robust Independent Elementary Features), HOG (Histogram of Oriented Gradients), FAST (Features from Accelerated Segment Test), and Optical Flow were used on silhouettes or full-body points to capture both geometric and motion-based features. Feature fusion was employed to enhance the discriminative power of the extracted data and the physical parameters calculated by different feature extraction techniques. The system utilized a hybrid CNN (Convolutional Neural Network) + RNN (Recurrent Neural Network) classifier for event recognition, with Grey Wolf Optimization (GWO) for feature selection. Experimental results showed significant accuracy, achieving 98.5% on the UCF-101 dataset and 99.2% on the YouTube dataset. Compared to state-of-the-art methods, our approach achieved better performance in event recognition. Full article
(This article belongs to the Section Intelligent Sensors)

28 pages, 46346 KiB  
Article
Optimizing Image Feature Extraction with Convolutional Neural Networks for Chicken Meat Detection Applications
by Azeddine Mjahad, Antonio Polo-Aguado, Luis Llorens-Serrano and Alfredo Rosado-Muñoz
Appl. Sci. 2025, 15(2), 733; https://doi.org/10.3390/app15020733 - 13 Jan 2025
Viewed by 888
Abstract
The food industry continuously prioritizes methods and technologies to ensure product quality and safety. Traditional approaches, which rely on conventional algorithms that utilize predefined features, have exhibited limitations in representing the intricate characteristics of food items. Recently, a significant shift has emerged with the introduction of convolutional neural networks (CNNs). These networks have emerged as powerful and versatile tools for feature extraction, standing out as a preferred choice in the field of deep learning. The main objective of this study is to evaluate the effectiveness of convolutional neural networks (CNNs) when applied to the classification of chicken meat products by comparing different image preprocessing approaches. This study was carried out in three phases. In the first phase, the original images were used without applying traditional filters or color modifications, processing them solely with a CNN. In the second phase, color filters were applied to help separate the images based on their chromatic characteristics, while still using a CNN for processing. Finally, in the third phase, additional filters, such as Histogram of Oriented Gradients (HOG), Local Binary Pattern (LBP), and saliency, were incorporated to extract complementary features from the images, without discontinuing the use of a CNN for processing. Experimental images, sourced from the Pygsa Group databases, underwent preprocessing using these filters before being input into a CNN-based classification architecture. The results show that the developed models outperformed conventional methods, significantly improving the ability to differentiate between chicken meat types, such as yellow wing, white wing, yellow thigh, and white thigh, with the training accuracy reaching 100%. This highlights the potential of CNNs, especially when combined with advanced architectures, for efficient detection and analysis of complex food matrices. 
In conclusion, these techniques can be applied to food quality control and other detection and analysis domains. Full article
(This article belongs to the Special Issue Technical Advances in Food and Agricultural Product Quality Detection)

20 pages, 7090 KiB  
Article
An Infrared and Visible Image Alignment Method Based on Gradient Distribution Properties and Scale-Invariant Features in Electric Power Scenes
by Lin Zhu, Yuxing Mao, Chunxu Chen and Lanjia Ning
J. Imaging 2025, 11(1), 23; https://doi.org/10.3390/jimaging11010023 - 13 Jan 2025
Viewed by 540
Abstract
In grid intelligent inspection systems, automatic registration of infrared and visible light images in power scenes is a crucial research technology. Since there are obvious differences in key attributes between visible and infrared images, direct alignment often fails to achieve the expected results. To overcome the difficulty of aligning infrared and visible light images, an image alignment method is proposed in this paper. First, we use the Sobel operator to extract the edge information of the image pair. Second, the feature points in the edges are recognised by a curvature scale space (CSS) corner detector. Third, the Histogram of Oriented Gradients (HOG) is extracted as the gradient distribution characteristic of the feature points, which is normalised with the Scale Invariant Feature Transform (SIFT) algorithm to form feature descriptors. Finally, initial matching and accurate matching are achieved by the improved fast approximate nearest-neighbour matching method and adaptive thresholding, respectively. Experiments show that this method can robustly match the feature points of image pairs under rotation, scale, and viewpoint differences, and achieves excellent matching results. Full article
(This article belongs to the Topic Computer Vision and Image Processing, 2nd Edition)
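The nearest-neighbour matching step is not spelled out in the abstract, but its classic building block is Lowe's nearest-neighbour distance ratio test. A brute-force sketch (the paper uses an improved fast approximate matcher with adaptive thresholding instead; the ratio value here is the conventional default, not the paper's):

```python
import numpy as np

def nndr_match(desc1, desc2, ratio=0.8):
    """Match descriptor sets with the nearest-neighbour distance ratio test.
    Returns (index_in_desc1, index_in_desc2) pairs; desc2 needs >= 2 rows."""
    matches = []
    for i, d in enumerate(desc1):
        dists = np.linalg.norm(desc2 - d, axis=1)
        order = np.argsort(dists)
        nearest, second = order[0], order[1]
        # accept only when the best match is clearly better than the runner-up
        if dists[nearest] < ratio * dists[second]:
            matches.append((i, int(nearest)))
    return matches
```

Ambiguous points (two near-equal candidates) are rejected, which is what makes the test robust on repetitive structures such as power-line hardware.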

25 pages, 9394 KiB  
Article
Transmitted Light Measurement to Determine the Local Structural Characteristics of Paperboard: Grammage, Thickness, and Fiber Orientation
by Cedric W. Sanjon, Yuchen Leng, Marek Hauptmann, Peter Groche and Jens-Peter Majschak
Fibers 2024, 12(12), 113; https://doi.org/10.3390/fib12120113 - 23 Dec 2024
Viewed by 678
Abstract
This study presents a novel transmission-based method for characterizing local structural features, including the grammage, thickness, and fiber orientation, of paper materials. Some non-destructive techniques, such as micro-computed tomography (μ-CT), microscopy, and radiation-based methods, are costly, time-consuming, and lack the ability to provide comprehensive local structural information within a single measurement. The proposed method utilizes a single light transmission measurement to assess local grammage and thickness through histogram matching with reference data obtained via β-radiography and profilometry. The same light transmission images are also used to determine local fiber orientation, employing image analysis techniques. The structure tensor method, which analyzes gradients of light transmission images, provides detailed insight into the local fiber orientation. The results show that thickness and grammage measurements are independent of which side of the paper is evaluated, while the fiber orientation distribution varies between the front and back sides, reflecting differences in fiber arrangement due to manufacturing processes. Various distribution functions are compared, and the Pearson Type 3, log-normal, and gamma distributions are found to most accurately describe the grammage, thickness, and fiber orientation distributions. The study includes a variety of paper types, ensuring a robust and comprehensive analysis of material behavior, and confirms that the method can effectively infer the inhomogeneous features from a single light transmission measurement. Full article
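The structure tensor method mentioned above reduces, for a single window, to a closed-form angle computed from averaged gradient products. A minimal sketch assuming whole-patch averaging rather than the paper's local windows; note that the returned angle is the dominant gradient direction, so the fiber direction is perpendicular to it:

```python
import numpy as np

def dominant_orientation(img):
    """Dominant gradient orientation (radians) from the structure tensor,
    averaged over the whole patch. Eigen-analysis of the 2x2 tensor
    reduces to this closed form; fibers run perpendicular to this angle."""
    gy, gx = np.gradient(img.astype(float))
    jxx = np.mean(gx * gx)
    jyy = np.mean(gy * gy)
    jxy = np.mean(gx * gy)
    return 0.5 * np.arctan2(2.0 * jxy, jxx - jyy)
```

For the per-pixel orientation maps used in the paper, the same products would be smoothed with a local (e.g., Gaussian) window instead of a global mean.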

9 pages, 3908 KiB  
Proceeding Paper
Automated Glaucoma Detection in Fundus Images Using Comprehensive Feature Extraction and Advanced Classification Techniques
by Vijaya Kumar Velpula, Jyothisri Vadlamudi, Purna Prakash Kasaraneni and Yellapragada Venkata Pavan Kumar
Eng. Proc. 2024, 82(1), 33; https://doi.org/10.3390/ecsa-11-20437 - 25 Nov 2024
Viewed by 292
Abstract
Glaucoma, a primary cause of irreversible blindness, necessitates early detection to prevent significant vision loss. In the literature, fundus imaging is identified as a key tool in diagnosing glaucoma, which captures detailed retina images. However, the manual analysis of these images can be time-consuming and subjective. Thus, this paper presents an automated system for glaucoma detection using fundus images, combining diverse feature extraction methods with advanced classifiers, specifically Support Vector Machine (SVM) and AdaBoost. The pre-processing step incorporated image enhancement via Contrast-Limited Adaptive Histogram Equalization (CLAHE) to enhance image quality and feature extraction. This work investigated individual features such as the histogram of oriented gradients (HOG), local binary patterns (LBP), chip histogram features, and the gray-level co-occurrence matrix (GLCM), as well as their various combinations, including HOG + LBP + chip histogram + GLCM, HOG + LBP + chip histogram, and others. These features were utilized with SVM and Adaboost classifiers to improve classification performance. For validation, the ACRIMA dataset, a public fundus image collection comprising 369 glaucoma-affected and 309 normal images, was used in this work, with 80% of the data allocated for training and 20% for testing. The results of the proposed study show that different feature sets yielded varying accuracies with the SVM and Adaboost classifiers. For instance, the combination of LBP + chip histogram achieved the highest accuracy of 99.29% with Adaboost, while the same combination yielded a 65.25% accuracy with SVM. The individual feature LBP alone achieved 97.87% with Adaboost and 98.58% with SVM. Furthermore, the combination of GLCM + LBP provided a 98.58% accuracy with Adaboost and 97.87% with SVM. 
The results demonstrate that CLAHE and combined feature sets significantly enhance detection accuracy, providing a reliable tool for early and precise glaucoma diagnosis, thus facilitating timely intervention and improved patient outcomes. Full article
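CLAHE extends plain histogram equalization by operating per tile with a clip limit and interpolating between tile mappings. A sketch of the underlying global equalization step in numpy (production code would normally use an existing CLAHE implementation rather than this simplification):

```python
import numpy as np

def equalize_hist(img):
    """Global histogram equalization for an 8-bit grayscale image.
    CLAHE applies the same idea per tile, clips the histogram to limit
    contrast amplification, and interpolates between tile mappings."""
    hist = np.bincount(img.ravel(), minlength=256)
    cdf = hist.cumsum()
    cdf_min = cdf[np.nonzero(cdf)[0][0]]
    # map the CDF onto the full 0..255 output range
    lut = np.clip(
        np.round((cdf - cdf_min) / max(cdf[-1] - cdf_min, 1) * 255), 0, 255
    ).astype(np.uint8)
    return lut[img]
```

The clip limit is what distinguishes CLAHE clinically: it prevents noise in flat retinal regions from being amplified along with genuine cup/disc contrast.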

20 pages, 5608 KiB  
Article
Cross-Granularity Infrared Image Segmentation Network for Nighttime Marine Observations
by Hu Xu, Yang Yu, Xiaomin Zhang and Ju He
J. Mar. Sci. Eng. 2024, 12(11), 2082; https://doi.org/10.3390/jmse12112082 - 18 Nov 2024
Viewed by 830
Abstract
Infrared image segmentation in marine environments is crucial for enhancing nighttime observations and ensuring maritime safety. While recent advancements in deep learning have significantly improved segmentation accuracy, challenges remain because nighttime marine scenes exhibit low contrast and noisy backgrounds. This paper introduces a cross-granularity infrared image segmentation network, CGSegNet, designed to address these challenges specifically for infrared images. The proposed method designs a hybrid feature framework with cross-granularity to enhance segmentation performance in complex water surface scenarios. To suppress the semantic disparity between features of different granularity, we propose an adaptive multi-scale fusion module (AMF) that combines local granularity extraction with global context granularity. Additionally, incorporating handcrafted histogram of oriented gradients (HOG) features, we designed a novel HOG feature fusion module to improve edge detection accuracy under low-contrast conditions. Comprehensive experiments conducted on a public infrared segmentation dataset demonstrate that our method outperforms state-of-the-art techniques, achieving superior segmentation results compared to professional infrared image segmentation methods. The results highlight the potential of our approach in facilitating accurate infrared image segmentation for nighttime marine observation, with implications for maritime safety and environmental monitoring. Full article
(This article belongs to the Section Ocean Engineering)

16 pages, 6259 KiB  
Article
Spectrogram-Based Arrhythmia Classification Using Three-Channel Deep Learning Model with Feature Fusion
by Alaa Eleyan, Fatih Bayram and Gülden Eleyan
Appl. Sci. 2024, 14(21), 9936; https://doi.org/10.3390/app14219936 - 30 Oct 2024
Cited by 2 | Viewed by 1343
Abstract
This paper introduces a novel deep learning model for ECG signal classification using feature fusion. The proposed methodology transforms the ECG time series into a spectrogram image using a short-time Fourier transform (STFT). This spectrogram is further processed to generate a histogram of oriented gradients (HOG) and local binary pattern (LBP) features. Three separate 2D convolutional neural networks (CNNs) then analyze these three image representations in parallel. To enhance performance, the extracted features are concatenated before feeding them into a gated recurrent unit (GRU) model. The proposed approach is extensively evaluated on two ECG datasets (MIT-BIH + BIDMC and MIT-BIH) with three and five classes, respectively. The experimental results demonstrate that the proposed approach achieves superior classification accuracy compared to existing algorithms in the literature. This suggests that the model has the potential to be a valuable tool for accurate ECG signal classification, aiding in the diagnosis and treatment of various cardiovascular disorders. Full article
(This article belongs to the Topic Applications in Image Analysis and Pattern Recognition)
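The STFT-to-spectrogram step can be sketched directly in numpy: window each frame, take the real FFT, and keep the magnitudes. The frame length and hop below are illustrative placeholders, not the paper's settings:

```python
import numpy as np

def spectrogram(x, frame_len=64, hop=32):
    """Magnitude spectrogram via a short-time Fourier transform:
    Hann-windowed frames -> rFFT -> magnitudes. The paper renders this
    array as an image before feeding it to a 2D CNN."""
    window = np.hanning(frame_len)
    n_frames = 1 + (len(x) - frame_len) // hop
    frames = np.stack([x[i * hop:i * hop + frame_len] * window
                       for i in range(n_frames)])
    return np.abs(np.fft.rfft(frames, axis=1)).T  # (freq_bins, time_frames)
```

From this array, the HOG and LBP channels the authors describe would be computed by treating the spectrogram as an ordinary grayscale image.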

21 pages, 12827 KiB  
Article
Research on the Registration of Aerial Images of Cyclobalanopsis Natural Forest Based on Optimized Fast Sample Consensus Point Matching with SIFT Features
by Peng Wu, Hailong Liu, Xiaomei Yi, Lufeng Mo, Guoying Wang and Shuai Ma
Forests 2024, 15(11), 1908; https://doi.org/10.3390/f15111908 - 29 Oct 2024
Viewed by 940
Abstract
The effective management and conservation of forest resources hinge on accurate monitoring. Nonetheless, individual remote-sensing images captured by low-altitude unmanned aerial vehicles (UAVs) fail to encapsulate the entirety of a forest’s characteristics. The application of image-stitching technology to high-resolution drone imagery facilitates a prompt evaluation of forest resources, encompassing quantity, quality, and spatial distribution. This study introduces an improved SIFT algorithm designed to tackle the challenges of low matching rates and prolonged registration times encountered with forest images characterized by dense textures. By implementing the SIFT-OCT (SIFT omitting the initial scale space) approach, the algorithm bypasses the initial scale space, thereby reducing the number of ineffective feature points and augmenting processing efficiency. To bolster the SIFT algorithm’s resilience against rotation and illumination variations, and to furnish supplementary information for registration even when fewer valid feature points are available, a gradient location and orientation histogram (GLOH) descriptor is integrated. For feature matching, the more computationally efficient Manhattan distance is utilized to filter feature points, which further optimizes efficiency. The fast sample consensus (FSC) algorithm is then applied to remove mismatched point pairs, thus refining registration accuracy. This research also investigates the influence of vegetation coverage and image overlap rates on the algorithm’s efficacy, using five sets of Cyclobalanopsis natural forest images. Experimental outcomes reveal that the proposed method significantly reduces registration time by an average of 3.66 times compared to that of SIFT, 1.71 times compared to that of SIFT-OCT, 5.67 times compared to that of PSO-SIFT, and 3.42 times compared to that of KAZE, demonstrating its superior performance. Full article
(This article belongs to the Section Forest Inventory, Modeling and Remote Sensing)

28 pages, 7535 KiB  
Article
A New Computer-Aided Diagnosis System for Breast Cancer Detection from Thermograms Using Metaheuristic Algorithms and Explainable AI
by Hanane Dihmani, Abdelmajid Bousselham and Omar Bouattane
Algorithms 2024, 17(10), 462; https://doi.org/10.3390/a17100462 - 18 Oct 2024
Viewed by 1643
Abstract
Advances in the early detection of breast cancer and treatment improvements have significantly increased survival rates. Traditional screening methods, including mammography, MRI, ultrasound, and biopsies, while effective, often come with high costs and risks. Recently, thermal imaging has gained attention due to its minimal risks compared to mammography, although it is not widely adopted as a primary detection tool since it depends on identifying skin temperature changes and lesions. The advent of machine learning (ML) and deep learning (DL) has enhanced the effectiveness of breast cancer detection and diagnosis using this technology. In this study, a novel interpretable computer aided diagnosis (CAD) system for breast cancer detection is proposed, leveraging Explainable Artificial Intelligence (XAI) throughout its various phases. To achieve these goals, we proposed a new multi-objective optimization approach named the Hybrid Particle Swarm Optimization algorithm (HPSO) and Hybrid Spider Monkey Optimization algorithm (HSMO). These algorithms simultaneously combined the continuous and binary representations of PSO and SMO to effectively manage trade-offs between accuracy, feature selection, and hyperparameter tuning. We evaluated several CAD models and investigated the impact of handcrafted methods such as Local Binary Patterns (LBP), Histogram of Oriented Gradients (HOG), Gabor Filters, and Edge Detection. We further shed light on the effect of feature selection and optimization on feature attribution and model decision-making processes using the SHapley Additive exPlanations (SHAP) framework, with a particular emphasis on cancer classification using the DMR-IR dataset. The results of our experiments demonstrate in all trials that the performance of the model is improved. With HSMO, our models achieved an accuracy of 98.27% and F1-score of 98.15% while selecting only 25.78% of the HOG features. 
This approach not only boosts the performance of CAD models but also ensures comprehensive interpretability. This method emerges as a promising and transparent tool for early breast cancer diagnosis. Full article

22 pages, 3158 KiB  
Article
Sensitivity Analysis of Traffic Sign Recognition to Image Alteration and Training Data Size
by Arthur Rubio, Guillaume Demoor, Simon Chalmé, Nicolas Sutton-Charani and Baptiste Magnier
Information 2024, 15(10), 621; https://doi.org/10.3390/info15100621 - 10 Oct 2024
Viewed by 1597
Abstract
Accurately classifying road signs is crucial for autonomous driving due to the high stakes involved in ensuring safety and compliance. As Convolutional Neural Networks (CNNs) have largely replaced traditional Machine Learning models in this domain, the demand for substantial training data has increased. This study aims to compare the performance of classical Machine Learning (ML) models and Deep Learning (DL) models under varying amounts of training data, particularly focusing on altered signs to mimic real-world conditions. We evaluated three classical models: Support Vector Machine (SVM), Random Forest, and Linear Discriminant Analysis (LDA), and one Deep Learning model: Convolutional Neural Network (CNN). Using the German Traffic Sign Recognition Benchmark (GTSRB) dataset, which includes approximately 40,000 German traffic signs, we introduced digital alterations to simulate conditions such as environmental wear or vandalism. Additionally, the Histogram of Oriented Gradients (HOG) descriptor was used to assist classical models. Bayesian optimization and k-fold cross-validation were employed for model fine-tuning and performance assessment. Our findings reveal a threshold in training data beyond which accuracy plateaus. Classical models showed a linear performance decrease under increasing alteration, while CNNs, despite being more robust to alterations, did not significantly outperform classical models in overall accuracy. Ultimately, classical Machine Learning models demonstrated performance comparable to CNNs under certain conditions, suggesting that effective road sign classification can be achieved with less computationally intensive approaches. Full article
(This article belongs to the Special Issue Machine Learning and Artificial Intelligence with Applications)

19 pages, 25232 KiB  
Article
OS-PSO: A Modified Ratio of Exponentially Weighted Averages-Based Optical and SAR Image Registration
by Hui Zhang, Yu Song, Jingfang Hu, Yansheng Li, Yang Li and Guowei Gao
Sensors 2024, 24(18), 5959; https://doi.org/10.3390/s24185959 - 13 Sep 2024
Viewed by 1008
Abstract
Optical and synthetic aperture radar (SAR) images exhibit non-negligible intensity differences due to their unique imaging mechanisms, which makes it difficult for classical SIFT-based algorithms to obtain sufficiently correct correspondences when registering these two types of images. To tackle this problem, an accurate optical and SAR image registration algorithm based on the SIFT algorithm (OS-PSO) is proposed. First, a modified ratio of exponentially weighted averages (MROEWA) operator is introduced to resolve the sudden dark patches in SAR images, thus generating more consistent gradients between optical and SAR images. Next, we innovatively construct the Harris scale space to replace the traditional difference of Gaussians (DoG) scale space, identify repeatable key-points by searching for local maxima, and perform localization refinement on the identified key-points to improve their accuracy. Then, the gradient location orientation histogram (GLOH) method is adopted to construct the feature descriptors. Finally, we propose an enhanced matching method. The transformation relation is obtained in the initial matching stage using the nearest neighbor distance ratio (NNDR) and fast sample consensus (FSC) methods. The re-matching stage then takes into account the location, scale, and main direction of key-points to increase the number of correctly corresponding points. The proposed OS-PSO algorithm has been implemented on the Gaofen and Sentinel series with excellent results. The superior performance of the designed registration system can also be applied in complex scenarios, including urban, suburban, river, farmland, and lake areas, with more efficiency and accuracy than the state-of-the-art methods based on the WHU-OPT-SAR dataset and the BISTU-OPT-SAR dataset. Full article
(This article belongs to the Section Sensing and Imaging)
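The initial matching stage mentioned above uses the nearest neighbor distance ratio (NNDR) test: a putative correspondence is accepted only when its nearest descriptor in the other image is markedly closer than the second-nearest. A brute-force NumPy sketch with synthetic descriptors follows (the 0.8 ratio is a conventional choice, not necessarily the paper's; the FSC refinement and re-matching steps are omitted):

```python
import numpy as np

def nndr_match(desc_a, desc_b, ratio=0.8):
    """Nearest-neighbor distance ratio matching over (N, D) descriptor arrays."""
    matches = []
    for i, d in enumerate(desc_a):
        dists = np.linalg.norm(desc_b - d, axis=1)
        order = np.argsort(dists)
        nearest, second = dists[order[0]], dists[order[1]]
        # Accept only if the best match clearly beats the runner-up
        if nearest < ratio * second:
            matches.append((i, int(order[0])))
    return matches

rng = np.random.default_rng(1)
b = rng.normal(size=(50, 16))                       # "SAR image" descriptors
a = b[:5] + rng.normal(scale=0.01, size=(5, 16))    # noisy copies: true matches
print(nndr_match(a, b))
```

In the full pipeline the surviving correspondences would then be screened with FSC to estimate the transformation relation before re-matching.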
35 pages, 6064 KiB  
Article
Multi-Index Driver Drowsiness Detection Method Based on Driver’s Facial Recognition Using Haar Features and Histograms of Oriented Gradients
by Eduardo Quiles-Cucarella, Julio Cano-Bernet, Lucas Santos-Fernández, Carlos Roldán-Blay and Carlos Roldán-Porta
Sensors 2024, 24(17), 5683; https://doi.org/10.3390/s24175683 - 31 Aug 2024
Cited by 2 | Viewed by 1599
Abstract
It is estimated that 10% to 20% of road accidents are related to fatigue, with accidents caused by drowsiness up to twice as deadly as those caused by other factors. In order to reduce these numbers, strategies such as advertising campaigns, the implementation of driving recorders in vehicles used for road transport of goods and passengers, or the use of drowsiness detection systems in cars have been implemented. Within the scope of the latter area, the technologies used are diverse. They can be based on the measurement of signals such as steering wheel movement, vehicle position on the road, or driver monitoring. Driver monitoring is a technology that has been little exploited so far and can be implemented in many different ways. This work addresses the evaluation of a multidimensional drowsiness index based on the recording of facial expressions, gaze direction, and head position and studies the feasibility of its implementation in a low-cost electronic package. Specifically, the aim is to determine the driver’s state by monitoring facial expressions such as the frequency of blinking, yawning, eye-opening, gaze direction, and head position. For this purpose, an algorithm capable of detecting drowsiness has been developed. Two approaches are compared: facial recognition based on Haar features and facial recognition based on Histograms of Oriented Gradients (HOG). The implementation has been carried out on a Raspberry Pi, a low-cost device that allows the creation of a prototype that can detect drowsiness and interact with peripherals such as cameras or speakers. The results show that the proposed multi-index methodology performs better in detecting drowsiness than algorithms based on one-index detection. Full article
(This article belongs to the Special Issue Sensors and Systems for Automotive and Road Safety (Volume 2))
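The multi-index idea above combines several per-frame facial cues into a single drowsiness score. The sketch below uses the eye aspect ratio (EAR), a common 6-landmark proxy for eye closure, as a stand-in blink index; the EAR threshold and the index weights are hypothetical illustrations, not the paper's calibrated values, and the face/landmark detection step (Haar or HOG based) is assumed to have already run.

```python
import numpy as np

def eye_aspect_ratio(eye):
    """EAR from 6 eye landmarks (p1..p6): small when the eye is closed."""
    p = np.asarray(eye, dtype=float)
    v1 = np.linalg.norm(p[1] - p[5])     # vertical distances
    v2 = np.linalg.norm(p[2] - p[4])
    h = np.linalg.norm(p[0] - p[3])      # horizontal distance
    return (v1 + v2) / (2.0 * h)

def drowsiness_score(ear_seq, yawn_seq, head_down_seq,
                     ear_thr=0.2, weights=(0.5, 0.3, 0.2)):
    """Combine three per-frame binary/continuous indices into one score in [0, 1]."""
    closed = np.mean(np.asarray(ear_seq) < ear_thr)   # fraction of eyes-closed frames
    yawning = np.mean(yawn_seq)                       # fraction of yawning frames
    head_down = np.mean(head_down_seq)                # fraction of head-dropped frames
    w = np.asarray(weights)
    return float(w @ [closed, yawning, head_down])

open_eye = [(0, 0), (1, 2), (3, 2), (4, 0), (3, -2), (1, -2)]
closed_eye = [(0, 0), (1, 0.3), (3, 0.3), (4, 0), (3, -0.3), (1, -0.3)]
print(eye_aspect_ratio(open_eye), eye_aspect_ratio(closed_eye))
```

Thresholding the combined score over a sliding window of frames, rather than any single cue, is what distinguishes a multi-index detector from the one-index baselines the study compares against.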