

Search Results (500)

Search Parameters:
Keywords = image set matching

19 pages, 1006 KiB  
Article
Semantic Interaction Meta-Learning Based on Patch Matching Metric
by Baoguo Wei, Xinyu Wang, Yuetong Su, Yue Zhang and Lixin Li
Sensors 2024, 24(17), 5620; https://doi.org/10.3390/s24175620 - 30 Aug 2024
Viewed by 408
Abstract
Metric-based meta-learning methods have demonstrated remarkable success in the domain of few-shot image classification. However, their performance is significantly contingent upon the choice of metric and the feature representation for the support classes. Current approaches, which predominantly rely on holistic image features, may inadvertently disregard critical details necessary for novel tasks, a phenomenon known as “supervision collapse”. Moreover, relying solely on visual features to characterize support classes can prove to be insufficient, particularly in scenarios involving limited sample sizes. In this paper, we introduce an innovative framework named Patch Matching Metric-based Semantic Interaction Meta-Learning (PatSiML), designed to overcome these challenges. To counteract supervision collapse, we have developed a patch matching metric strategy based on the Transformer architecture to transform input images into a set of distinct patch embeddings. This approach dynamically creates task-specific embeddings, facilitated by a graph convolutional network, to formulate precise matching metrics between the support classes and the query image patches. To enhance the integration of semantic knowledge, we have also integrated a label-assisted channel semantic interaction strategy. This strategy merges word embeddings with patch-level visual features across the channel dimension, utilizing a sophisticated language model to combine semantic understanding with visual information. Our empirical findings across four diverse datasets reveal that the PatSiML method achieves a classification accuracy improvement of 0.65% to 21.15% over existing methodologies, underscoring its robustness and efficacy. Full article
(This article belongs to the Special Issue Advances in Remote Sensing Image Enhancement and Classification)
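The patch-level matching idea described in the abstract, comparing a set of query patch embeddings against support-class patch embeddings instead of holistic image features, can be sketched as follows. This is a toy illustration with random synthetic embeddings, not the authors' PatSiML implementation (which uses Transformer patch embeddings and a graph convolutional network):

```python
import numpy as np

def patch_matching_score(query_patches, support_patches):
    # Cosine similarity between every query patch embedding and every
    # support-class patch embedding; take the best support match per
    # query patch and average over query patches.
    q = query_patches / np.linalg.norm(query_patches, axis=1, keepdims=True)
    s = support_patches / np.linalg.norm(support_patches, axis=1, keepdims=True)
    sim = q @ s.T                    # (num_query_patches, num_support_patches)
    return sim.max(axis=1).mean()

rng = np.random.default_rng(0)
class_a = rng.normal(size=(9, 64)) + 2.0   # patch embeddings clustered near +2
class_b = rng.normal(size=(9, 64)) - 2.0   # patch embeddings clustered near -2
query = rng.normal(size=(9, 64)) + 2.0     # query image patches, drawn near class A

score_a = patch_matching_score(query, class_a)
score_b = patch_matching_score(query, class_b)  # query matches class A better
```

Classification then reduces to picking the support class with the highest aggregate patch-matching score.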

27 pages, 4723 KiB  
Review
Methods for Detecting the Patient’s Pupils’ Coordinates and Head Rotation Angle for the Video Head Impulse Test (vHIT), Applicable for the Diagnosis of Vestibular Neuritis and Pre-Stroke Conditions
by G. D. Mamykin, A. A. Kulesh, Fedor L. Barkov, Y. A. Konstantinov, D. P. Sokol’chik and Vladimir Pervadchuk
Computation 2024, 12(8), 167; https://doi.org/10.3390/computation12080167 - 18 Aug 2024
Viewed by 496
Abstract
In the contemporary era, dizziness is a prevalent ailment among patients. It can be caused by either vestibular neuritis or a stroke. Given the lack of diagnostic utility of instrumental methods in acute isolated vertigo, the differentiation of vestibular neuritis and stroke is primarily clinical. As part of the initial differential diagnosis, the physician focuses on the characteristics of nystagmus and the results of the video head impulse test (vHIT). Instruments for accurate vHIT are costly and are often utilized exclusively in healthcare settings. The objective of this paper is to review contemporary methodologies for accurately detecting the position of the pupil centers in both eyes of a patient and for precisely extracting their coordinates. Additionally, the paper describes methods for accurately determining the head rotation angle under diverse imaging and lighting conditions. Furthermore, the suitability of these methods for vHIT is evaluated. We assume the maximum allowable error is 0.005 radians per frame for detecting the pupils’ coordinates, or 0.3 degrees per frame for detecting the head position. We found that, under these conditions, the most suitable approaches for head posture detection are deep learning (including LSTM networks), search by template matching, linear regression of EMG sensor data, and optical fiber sensor usage. The most relevant approaches to pupil localization for our medical tasks are deep learning, geometric transformations, decision trees, and RANSAC. This study might assist in identifying a number of approaches that can be employed in the future to construct a high-accuracy system for vHIT based on a smartphone or a home computer, with subsequent signal processing and initial diagnosis. Full article
(This article belongs to the Special Issue Deep Learning Applications in Medical Imaging)
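Among the approaches the review surveys, search by template matching is the simplest to illustrate. Below is a minimal normalized cross-correlation matcher run on a synthetic eye frame; the disc size, intensity values, and coordinates are invented for the example, and a real vHIT pipeline would use an optimized library implementation on camera frames:

```python
import numpy as np

def match_template(image, template):
    # Brute-force normalized cross-correlation; returns the (row, col) of
    # the window that best matches the template.
    th, tw = template.shape
    t = template - template.mean()
    best, best_pos = -np.inf, (0, 0)
    for r in range(image.shape[0] - th + 1):
        for c in range(image.shape[1] - tw + 1):
            w = image[r:r + th, c:c + tw]
            w = w - w.mean()
            denom = np.sqrt((w ** 2).sum() * (t ** 2).sum())
            score = (w * t).sum() / denom if denom > 0 else 0.0
            if score > best:
                best, best_pos = score, (r, c)
    return best_pos

# Synthetic frame: bright background with a dark "pupil" disc centered at (18, 25).
frame = np.full((40, 50), 200.0)
yy, xx = np.mgrid[0:40, 0:50]
frame[(yy - 18) ** 2 + (xx - 25) ** 2 <= 16] = 20.0

# 9x9 template containing the same dark disc at its center (4, 4).
pupil_template = np.full((9, 9), 200.0)
ty, tx = np.mgrid[0:9, 0:9]
pupil_template[(ty - 4) ** 2 + (tx - 4) ** 2 <= 16] = 20.0

row, col = match_template(frame, pupil_template)
center = (row + 4, col + 4)   # add the template-center offset -> pupil center
```

The per-frame angular error budget quoted in the review would then constrain how finely this localization must resolve `center` relative to the camera geometry.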

15 pages, 7315 KiB  
Article
Computer Vision Algorithms on a Raspberry Pi 4 for Automated Depalletizing
by Danilo Greco, Majid Fasihiany, Ali Varasteh Ranjbar, Francesco Masulli, Stefano Rovetta and Alberto Cabri
Algorithms 2024, 17(8), 363; https://doi.org/10.3390/a17080363 - 18 Aug 2024
Viewed by 482
Abstract
The primary objective of a depalletizing system is to automate the process of detecting and locating specific variable-shaped objects on a pallet, allowing a robotic system to accurately unstack them. Although many solutions exist for the problem in industrial and manufacturing settings, its application to small-scale scenarios such as retail vending machines and small warehouses has not received much attention so far. This paper presents a comparative analysis of four different computer vision algorithms for the depalletizing task, implemented on a Raspberry Pi 4, a very popular single-board computer with low computational power suitable for IoT and edge computing. The algorithms evaluated are pattern matching, the scale-invariant feature transform (SIFT), Oriented FAST and Rotated BRIEF (ORB), and the Haar cascade classifier. Each technique is described and its implementation is outlined. The algorithms are evaluated on the task of box detection and localization in test images to assess their suitability in a depalletizing system. Their performance is given in terms of accuracy, robustness to variability, computational speed, detection sensitivity, and resource consumption. The results reveal the strengths and limitations of each algorithm, providing valuable insights for selecting the most appropriate technique based on the specific requirements of a depalletizing system. Full article
(This article belongs to the Special Issue Recent Advances in Algorithms for Computer Vision Applications)
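Of the four algorithms compared, ORB's descriptor side can be illustrated with a toy BRIEF-style binary descriptor: intensity comparisons at fixed pixel pairs yield a bit-string, and detections are matched by Hamming distance. This sketch omits ORB's FAST keypoints and orientation compensation, and the pixel pairs and patches are random stand-ins:

```python
import numpy as np

def brief_descriptor(patch, pairs):
    # BRIEF-style binary descriptor: one bit per pixel-pair intensity
    # comparison inside the patch.
    return np.array([patch[r1, c1] < patch[r2, c2]
                     for (r1, c1), (r2, c2) in pairs], dtype=np.uint8)

def hamming(d1, d2):
    # Number of differing bits between two binary descriptors.
    return int(np.count_nonzero(d1 != d2))

rng = np.random.default_rng(3)
# 32 random comparison pairs inside an 8x8 patch (real BRIEF uses 256 pairs).
pairs = [((int(a), int(b)), (int(c), int(d)))
         for a, b, c, d in rng.integers(0, 8, size=(32, 4))]

patch = rng.random((8, 8))
same = patch + rng.normal(0, 0.005, size=(8, 8))  # slightly noisy view of the same patch
other = rng.random((8, 8))                        # an unrelated patch

d_ref, d_same, d_other = (brief_descriptor(p, pairs) for p in (patch, same, other))
# d_ref is close to d_same in Hamming distance, far from d_other.
```

The Hamming-distance test is what makes ORB matching cheap enough for a Raspberry Pi-class device.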

14 pages, 1899 KiB  
Article
Using ArcFace Loss Function and Softmax with Temperature Activation Function for Improvement in X-ray Baggage Image Classification Quality
by Nikita Andriyanov
Mathematics 2024, 12(16), 2547; https://doi.org/10.3390/math12162547 - 18 Aug 2024
Viewed by 368
Abstract
Modern aviation security systems rely heavily on the work of screening operators. Owing to human physical limitations, operators are prone to problems such as fatigue and loss of attention. Automated methods for recognizing forbidden items exist, but they face difficulties such as the specific structure of luggage X-ray images. Furthermore, such systems require significant computational resources as model sizes grow. Overcoming the first two disadvantages largely lies in the hardware plane: it requires new introscopes and registration techniques, as well as more powerful computing devices. For processing, however, it is preferable to improve quality without increasing the computational power requirements of the recognition system. This can be achieved with traditional neural network architectures, but with a more complex training process. This study proposes a new training approach, including new methods of baggage X-ray image augmentation and advanced approaches to training convolutional neural networks and vision transformer networks. It is shown that using the ArcFace loss function for the binary classification of items into forbidden and allowed classes provides a gain of about 3–5% across different architectures. At the same time, using a softmax activation function with temperature yields more flexible estimates of class-membership probability: raising the decision threshold significantly increases the accuracy of recognizing forbidden items, while lowering it provides high recall. The developed augmentations, based on doubly stochastic image models, increase the recall of recognizing dangerous items by 1–2%. On the basis of the developed classifier, the YOLO detector was modified, yielding an mAP gain of 0.72%. Thus, the research results match the goal of increasing efficiency in X-ray baggage image processing. Full article
(This article belongs to the Special Issue Advanced Research in Fuzzy Systems and Artificial Intelligence)
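The softmax-with-temperature idea mentioned in the abstract is easy to make concrete: dividing the logits by a temperature before the softmax sharpens (T < 1) or softens (T > 1) the class-membership probabilities, which is what makes the decision threshold a useful knob. A generic sketch, not the author's training code:

```python
import numpy as np

def softmax_t(logits, temperature=1.0):
    # Softmax with temperature: T < 1 sharpens the distribution,
    # T > 1 flattens it toward uniform.
    z = np.asarray(logits, dtype=float) / temperature
    z = z - z.max()          # subtract max for numerical stability
    e = np.exp(z)
    return e / e.sum()

logits = [2.0, 1.0, 0.5]
p_sharp = softmax_t(logits, temperature=0.5)   # more confident top class
p_soft = softmax_t(logits, temperature=5.0)    # closer to uniform
```

With a softened distribution, a classification threshold on the "forbidden" probability can be tuned along the precision/recall trade-off the abstract describes.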

24 pages, 8028 KiB  
Article
SPTrack: Spectral Similarity Prompt Learning for Hyperspectral Object Tracking
by Gaowei Guo, Zhaoxu Li, Wei An, Yingqian Wang, Xu He, Yihang Luo, Qiang Ling, Miao Li and Zaiping Lin
Remote Sens. 2024, 16(16), 2975; https://doi.org/10.3390/rs16162975 - 14 Aug 2024
Viewed by 450
Abstract
Compared to hyperspectral trackers that adopt the “pre-training then fine-tuning” training paradigm, those using the “pre-training then prompt-tuning” training paradigm can inherit the expressive capabilities of the pre-trained model with fewer training parameters. Existing hyperspectral trackers utilizing prompt learning lack an adequate prompt template design, thus failing to bridge the domain gap between hyperspectral data and pre-trained models. Consequently, their tracking performance suffers. Additionally, these networks have a poor generalization ability and require re-training for the different spectral bands of hyperspectral data, leading to the inefficient use of computational resources. In order to address the aforementioned problems, we propose a spectral similarity prompt learning approach for hyperspectral object tracking (SPTrack). First, we introduce a spectral matching map based on spectral similarity, which converts 3D hyperspectral data with different spectral bands into single-channel hotmaps, thus enabling cross-spectral domain generalization. Then, we design a channel and position attention-based feature complementary prompter to learn blended prompts from spectral matching maps and three-channel images. Extensive experiments are conducted on the HOT2023 and IMEC25 data sets, and SPTrack is found to achieve state-of-the-art performance with minimal computational effort. Additionally, we verify the cross-spectral domain generalization ability of SPTrack on the HOT2023 data set, which includes data from three spectral bands. Full article
(This article belongs to the Special Issue Advances in Hyperspectral Data Processing)
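The core conversion the abstract describes, turning a hyperspectral cube with an arbitrary number of bands into a single-channel hotmap via spectral similarity, can be sketched with per-pixel cosine similarity against a target spectrum. The cube, band count, and target spectrum here are synthetic, and SPTrack's actual spectral matching map construction may differ in detail:

```python
import numpy as np

def spectral_matching_map(cube, target_spectrum):
    # Collapse an (H, W, B) hyperspectral cube into a single-channel
    # "hotmap" of cosine similarity between each pixel's spectrum and
    # the target spectrum, independent of the number of bands B.
    h, w, b = cube.shape
    flat = cube.reshape(-1, b).astype(float)
    t = np.asarray(target_spectrum, dtype=float)
    num = flat @ t
    denom = np.linalg.norm(flat, axis=1) * np.linalg.norm(t)
    sim = np.where(denom > 0, num / denom, 0.0)
    return sim.reshape(h, w)

# Toy 4x4 cube with 8 bands: flat background spectrum, target-like pixel at (1, 2).
cube = np.ones((4, 4, 8))
target = np.linspace(1.0, 2.0, 8)
cube[1, 2] = target
hotmap = spectral_matching_map(cube, target)   # hottest at the target pixel
```

Because the hotmap has a fixed single-channel shape regardless of B, the same downstream tracker can be reused across spectral bands, which is the cross-spectral generalization the paper targets.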

18 pages, 7322 KiB  
Article
Aerial Map-Based Navigation by Ground Object Pattern Matching
by Youngjoo Kim, Seungho Back, Dongchan Song and Byung-Yoon Lee
Drones 2024, 8(8), 375; https://doi.org/10.3390/drones8080375 - 5 Aug 2024
Viewed by 603
Abstract
This paper proposes a novel approach to map-based navigation for unmanned aircraft. The proposed approach employs pattern matching of ground objects, not feature-to-feature or image-to-image matching, between an aerial image and a map database. Deep learning-based object detection converts the ground objects into labeled points, and the objects’ configuration is used to find the corresponding location in the map database. Using the deep learning technique as a tool for extracting high-level features reduces the image-based localization problem to a pattern-matching problem. The pattern-matching algorithm proposed in this paper does not require altitude information or a camera model to estimate the horizontal geographical coordinates of the vehicle. Moreover, it requires significantly less storage because the map database is represented as a set of tuples, each consisting of a label, latitude, and longitude. Probabilistic data fusion with the inertial measurements by the Kalman filter is incorporated to deliver a comprehensive navigational solution. Flight experiments demonstrate the effectiveness of the proposed system in real-world environments. The map-based navigation system successfully provides the position estimates with RMSEs within 3.5 m at heights over 90 m without the aid of the GNSS. Full article
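The abstract's key design points, a map database stored as (label, coordinate) tuples and localization by matching the configuration of detected objects rather than image features, can be sketched with a simple hypothesize-and-vote scheme. All labels and coordinates below are invented, and the paper's actual pattern-matching algorithm is more sophisticated:

```python
# Map database as a set of (label, x, y) tuples in a local metric frame;
# labels and coordinates are made up for illustration.
map_db = [("tank", 10.0, 20.0), ("building", 14.0, 21.0), ("tank", 40.0, 5.0),
          ("building", 44.0, 6.0), ("pool", 12.0, 25.0)]

# Objects detected in the aerial image, in camera-relative coordinates.
detections = [("tank", 0.0, 0.0), ("building", 4.0, 1.0), ("pool", 2.0, 5.0)]

def locate(detections, map_db, tol=0.5):
    # Every same-label pairing of a detection with a map object proposes a
    # translation hypothesis; the hypothesis that aligns the most detections
    # with map objects (within tol) wins.
    best_offset, best_votes = None, 0
    for lbl, dx, dy in detections:
        for mlbl, mx, my in map_db:
            if mlbl != lbl:
                continue
            ox, oy = mx - dx, my - dy          # candidate translation
            votes = 0
            for l2, x2, y2 in detections:
                for m2, xm, ym in map_db:
                    if (m2 == l2 and abs(x2 + ox - xm) <= tol
                            and abs(y2 + oy - ym) <= tol):
                        votes += 1
                        break
            if votes > best_votes:
                best_offset, best_votes = (ox, oy), votes
    return best_offset, best_votes

offset, votes = locate(detections, map_db)   # recovers the camera's map offset
```

Note how small the map storage is: three numbers per object, with no imagery, which is the storage advantage the abstract highlights.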

22 pages, 5018 KiB  
Article
Color Standardization of Chemical Solution Images Using Template-Based Histogram Matching in Deep Learning Regression
by Patrycja Kwiek and Małgorzata Jakubowska
Algorithms 2024, 17(8), 335; https://doi.org/10.3390/a17080335 - 1 Aug 2024
Viewed by 562
Abstract
Color distortion in an image presents a challenge for machine learning classification and regression when the input data consist of pictures. As a result, a new algorithm for the color standardization of photos is proposed, forming the foundation for a deep neural network regression model. This approach utilizes a self-designed color template that was developed based on an initial series of studies and digital imaging. Using the equalized histograms of the R, G, B channels of the digital template and its photo, a color mapping strategy was computed. By applying this approach, the histograms were adjusted and the colors of photos taken with a smartphone were standardized. The proposed algorithm was developed for a series of images where the entire surface roughly maintained a uniform color and the differences in color between the photographs of individual objects were minor. This optimized approach was validated in the colorimetric determination procedure of vitamin C. The dataset for the deep neural network in the regression variant was formed from photos of samples under two separate lighting conditions. For the vitamin C concentration range from 0 to 87.72 µg·mL⁻¹, the RMSE for the test set ranged between 0.75 and 1.95 µg·mL⁻¹, compared to the non-standardized variant, where this indicator was at the level of 1.48–2.29 µg·mL⁻¹. The consistency of the predicted concentration results with the actual data, expressed as R², ranged between 0.9956 and 0.9999 for each of the standardized variants. This approach allows for the removal of light reflections on the shiny surfaces of solutions, which is a common problem with liquid samples. The color-matching algorithm is universal in character, and its scope of application is not limited to this use case. Full article
(This article belongs to the Special Issue Machine Learning Models and Algorithms for Image Processing)
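The histogram-matching step underlying the proposed standardization can be sketched for a single channel: remap source intensities so that their cumulative distribution follows the reference's. This is the textbook operation the paper builds on; the authors' full pipeline adds a self-designed color template and per-channel R, G, B processing:

```python
import numpy as np

def match_histogram(source, reference):
    # Histogram specification: map each source intensity to the reference
    # intensity with the same cumulative-distribution value.
    s_vals, s_counts = np.unique(source.ravel(), return_counts=True)
    r_vals, r_counts = np.unique(reference.ravel(), return_counts=True)
    s_cdf = np.cumsum(s_counts) / source.size
    r_cdf = np.cumsum(r_counts) / reference.size
    mapped = np.interp(s_cdf, r_cdf, r_vals)   # CDF-to-CDF lookup
    lut = dict(zip(s_vals, mapped))
    return np.vectorize(lut.get)(source)

rng = np.random.default_rng(1)
dark_photo = rng.integers(0, 120, size=(32, 32))   # under-exposed channel
template = rng.integers(80, 250, size=(32, 32))    # reference template channel
standardized = match_histogram(dark_photo, template)
# standardized now follows the template's intensity distribution.
```

Applying this per channel against a fixed template photo is what makes images from different lighting conditions comparable before regression.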

22 pages, 6010 KiB  
Article
pH-Sensitive Fluorescent Marker Based on Rhodamine 6G Conjugate with Its FRET/PeT Pair in “Smart” Polymeric Micelles for Selective Imaging of Cancer Cells
by Igor D. Zlotnikov, Alexander A. Ezhov and Elena V. Kudryashova
Pharmaceutics 2024, 16(8), 1007; https://doi.org/10.3390/pharmaceutics16081007 - 30 Jul 2024
Viewed by 460
Abstract
Cancer cells are known to create an acidic microenvironment (the Warburg effect). At the same time, fluorescent dyes can be sensitive to pH, showing a sharp increase or decrease in fluorescence depending on pH. However, modern applications, such as confocal laser scanning microscopy (CLSM), set additional requirements for such fluorescent markers to be of practical use, namely, high quantum yield, low bleaching, minimal quenching in the cell environment, and minimal overlap with auto-fluorophores. R6G could be the perfect match for these requirements, but its fluorescence is not pH-dependent. We have attempted to develop an R6G conjugate with its FRET or PeT pair that would grant it pH sensitivity in the desired range (5.5–7.5) and enable the selective targeting of tumor cells, thus improving CLSM imaging. Covalent conjugation of R6G with NBD using a spermidine (spd) linker produced a pH-sensitive FRET effect, but within the pH range of 7.0–9.0. Shifting this effect to the target pH range of 5.5–7.5 appeared possible by incorporating the R6G-spd-NBD conjugate within a “smart” polymeric micelle based on chitosan grafted with lipoic acid. From our previous studies, one could conclude that the polycationic properties of chitosan could make this pH shift possible. As a result, the micellar form of the NBD-spd-R6G fluorophore demonstrates a sharp turn-on of fluorescence, by 40% per pH unit, in the pH range from 7.5 to 5. Additionally, “smart” polymeric micelles based on chitosan allow the label to selectively target tumor cells. Due to the pH sensitivity of the fluorophore NBD-spd-R6G and the selective targeting of cancer cells, the efficient visualization of A875 and K562 cells was achieved. CLSM imaging showed that the dye actively penetrates cancer cells (A875 and K562), while minimal accumulation and low fluorophore emission are observed in normal cells (HEK293T).
It is noteworthy that by using “smart” polymeric micelles based on polyelectrolytes of different charges and structures, we create the possibility of regulating the pH dependence of the fluorescence in the desired interval, which means that these “smart” polymeric micelles can be applied to the visualization of a variety of cell types, organelles, and other structures. Full article
(This article belongs to the Special Issue Polymeric Micelles for Drug Delivery and Cancer Therapy)

18 pages, 3862 KiB  
Article
Spatial Distribution of Cropping Systems in South Asia Using Time-Series Satellite Data Enriched with Ground Data
by Murali Krishna Gumma, Pranay Panjala, Sunil K. Dubey, Deepak K. Ray, C. S. Murthy, Dakshina Murthy Kadiyala, Ismail Mohammed and Yamano Takashi
Remote Sens. 2024, 16(15), 2733; https://doi.org/10.3390/rs16152733 - 26 Jul 2024
Viewed by 1466
Abstract
A cropping system practice is the sequential cultivation of crops in the different crop seasons of a year. Cropping system practices determine the land productivity and sustainability of agriculture in a region; therefore, information on the cropping systems of different regions, in the form of maps and statistics, forms a critical input for crop planning and the optimal use of resources. Although satellite-based crop mapping is widely practiced, deriving cropping system maps from satellites is less often reported. Here, we developed moderate-resolution maps of the major cropping systems of South Asia for the year 2014–2015 using multi-temporal satellite data together with a spectral matching technique (SMT) developed with an extensive set of field observation data, supplemented with expert-identified crops in high-resolution satellite images. We identified and mapped 27 major cropping systems of South Asia at 250 m spatial resolution. The rice-wheat cropping system is the dominant system, followed by millet-wheat and soybean-wheat. Maps showing the cropping system practices of regions open up many use cases related to the agricultural performance of those regions. Comparing such maps across time periods offers insights into sensitive regions, and analyzing them in conjunction with resource maps (climate, soil, etc.) enables the optimization of resources while enhancing land productivity. Thus, the current study offers new opportunities to revisit cropping system practices and redesign them to meet the challenges of food security and climate-resilient agriculture. Full article
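The spectral matching technique (SMT) the abstract relies on can be illustrated at its simplest: each pixel's multi-temporal profile is compared against reference class signatures and assigned the best-correlating class. The signatures and class names below are invented placeholders, not the study's calibrated library:

```python
import numpy as np

# Idealized multi-temporal signatures (12 periods) for two cropping systems;
# the values and class names are illustrative, not from the paper's library.
signatures = {
    "rice-wheat":    np.array([.2, .3, .5, .7, .6, .3, .2, .4, .6, .8, .6, .3]),
    "soybean-wheat": np.array([.2, .2, .3, .5, .7, .7, .4, .3, .5, .7, .5, .3]),
}

def classify_pixel(series, signatures):
    # Assign the cropping system whose reference time signature correlates
    # best with the pixel's observed series.
    best_label, best_r = None, -2.0
    for label, sig in signatures.items():
        r = np.corrcoef(series, sig)[0, 1]
        if r > best_r:
            best_label, best_r = label, r
    return best_label

# A pixel whose profile tracks the rice-wheat signature (with a constant offset).
pixel = signatures["rice-wheat"] + 0.02
label = classify_pixel(pixel, signatures)
```

Using correlation rather than absolute values makes the match tolerant of uniform brightness offsets between sensor dates, one reason signature matching is practical at a 250 m scale.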

18 pages, 20454 KiB  
Article
RCRFNet: Enhancing Object Detection with Self-Supervised Radar–Camera Fusion and Open-Set Recognition
by Minwei Chen, Yajun Liu, Zenghui Zhang and Weiwei Guo
Sensors 2024, 24(15), 4803; https://doi.org/10.3390/s24154803 - 24 Jul 2024
Viewed by 557
Abstract
Robust object detection in complex environments, poor visual conditions, and open scenarios presents significant technical challenges in autonomous driving. These challenges necessitate the development of advanced fusion methods for millimeter-wave (mmWave) radar point cloud data and visual images. To address these issues, this paper proposes a radar–camera robust fusion network (RCRFNet), which leverages self-supervised learning and open-set recognition to effectively utilise the complementary information from both sensors. Specifically, the network uses matched radar–camera data through a frustum association approach to generate self-supervised signals, enhancing network training. The integration of global and local depth consistencies between radar point clouds and visual images, along with image features, helps construct object class confidence levels for detecting unknown targets. Additionally, these techniques are combined with a multi-layer feature extraction backbone and a multimodal feature detection head to achieve robust object detection. Experiments on the nuScenes public dataset demonstrate that RCRFNet outperforms state-of-the-art (SOTA) methods, particularly in conditions of low visual visibility and when detecting unknown class objects. Full article

16 pages, 6425 KiB  
Article
A Robust AR-DSNet Tracking Registration Method in Complex Scenarios
by Xiaomei Lei, Wenhuan Lu, Jiu Yong and Jianguo Wei
Electronics 2024, 13(14), 2807; https://doi.org/10.3390/electronics13142807 - 17 Jul 2024
Viewed by 443
Abstract
A robust AR-DSNet (Augmented Reality method based on DSST and SiamFC networks) tracking registration method for complex scenarios is proposed to improve the ability of AR (Augmented Reality) tracking registration to distinguish the target foreground from semantically interfering background, and to address registration failure caused by drift toward similar targets when scale information is obtained from predicted target positions. First, the pre-trained network in SiamFC (Siamese Fully-Convolutional) is used to obtain the response map of a larger search area, and a threshold is set to filter out the initial possible positions of the target. Then, exploiting the DSST (Discriminative Scale Space Tracking) filter tracker's ability to update the template online, a new scale filter is trained after collecting multi-scale images at the initial possible positions of the target to infer the target's scale change, and linear interpolation is used to update the correlation coefficients to determine the final tracking position based on the difference between two frames. Finally, ORB (Oriented FAST and Rotated BRIEF) feature detection and matching are performed on the accurately localized target image, and the registration matrix is calculated from the matching relationships to overlay the virtual model onto the real scene, achieving augmentation of the real world. Simulation experiments show that in complex scenarios such as similar-object interference, target occlusion, and local deformation, the proposed AR-DSNet method can complete target registration in AR 3D tracking, ensuring real-time performance while improving the robustness of the AR tracking registration algorithm. Full article
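Two of the building blocks in the pipeline above are easy to make concrete: thresholding a SiamFC-style response map to seed candidate target positions, and a linear-interpolation online update of filter coefficients in the spirit of DSST's template update. The response values and learning rate are illustrative choices, not the paper's settings:

```python
import numpy as np

def candidate_positions(response, thresh):
    # Positions of a SiamFC-style response map that clear the threshold,
    # used to seed the possible target locations.
    return np.argwhere(response >= thresh)

def update_filter(old_coef, new_coef, alpha=0.025):
    # Linear-interpolation online update of correlation-filter coefficients
    # between frames (alpha is an illustrative learning rate).
    return (1.0 - alpha) * old_coef + alpha * new_coef

resp = np.array([[0.10, 0.90, 0.20],
                 [0.30, 0.40, 0.85],
                 [0.00, 0.10, 0.20]])
cands = candidate_positions(resp, 0.8)   # two candidate target positions
coef = update_filter(np.array([1.0, 0.0]), np.array([0.0, 1.0]))
```

Keeping several above-threshold candidates, rather than only the argmax, is what lets the method recover when a similar object briefly produces the strongest response.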

18 pages, 15854 KiB  
Article
IRBEVF-Q: Optimization of Image–Radar Fusion Algorithm Based on Bird’s Eye View Features
by Ganlin Cai, Feng Chen and Ente Guo
Sensors 2024, 24(14), 4602; https://doi.org/10.3390/s24144602 - 16 Jul 2024
Viewed by 521
Abstract
In autonomous driving, the fusion of multiple sensors is considered essential to improve the accuracy and safety of 3D object detection. Currently, a fusion scheme combining low-cost cameras with highly robust radars can counteract the performance degradation caused by harsh environments. In this paper, we propose the IRBEVF-Q model, which mainly consists of a BEV (Bird’s Eye View) fusion coding module and an object decoder module. The BEV fusion coding module solves the problem of the unified representation of different modal information by fusing image and radar features through 3D spatial reference points as a medium. The query in the object decoder, as a core component, plays an important role in detection. In this paper, Heat Map-Guided Query Initialization (HGQI) and Dynamic Position Encoding (DPE) are proposed for query construction to increase the a priori information of the query. An Auxiliary Noise Query (ANQ) then helps to stabilize the matching. The experimental results demonstrate that the proposed fusion model IRBEVF-Q achieves an NDS of 0.575 and an mAP of 0.476 on the nuScenes test set. Compared to recent state-of-the-art methods, our model shows significant advantages, indicating that our approach contributes to improving detection accuracy. Full article
(This article belongs to the Section Radar Sensors)

16 pages, 7155 KiB  
Article
Overlapping Image-Set Determination Method Based on Hybrid BoVW-NoM Approach for UAV Image Localization
by Juyeon Lee and Kanghyeok Choi
Appl. Sci. 2024, 14(13), 5839; https://doi.org/10.3390/app14135839 - 4 Jul 2024
Viewed by 604
Abstract
With the increasing use of unmanned aerial vehicles (UAVs) in various fields, achieving the precise localization of UAV images is crucial for enhancing their utility. Photogrammetry-based techniques, particularly bundle adjustment, serve as foundational methods for accurately determining the spatial coordinates of UAV images. The effectiveness of bundle adjustment is significantly influenced by the selection of input data, particularly the composition of overlapping image sets. The selection of overlapping images significantly impacts both the accuracy of spatial coordinate determination and the computational efficiency of UAV image localization; therefore, a strategic approach to this selection is crucial for optimizing the performance of bundle adjustment in UAV image processing. In this context, we propose an efficient methodology for determining overlapping image sets. The proposed method selects overlapping images based on image similarity, leveraging the complementary strengths of the bag-of-visual-words (BoVW) and number-of-matches (NoM) techniques. Essentially, our method achieves both high accuracy and high speed by using BoVW for candidate selection and the number of matches for an additional similarity assessment in overlapping image-set determination. We compared the performance of the proposed methodology with conventional NoM-based and BoVW-based methods for overlapping image-set determination. In the comparative evaluation, the proposed method demonstrated an average precision of 96%, comparable to that of the NoM-based approach, while surpassing the 62% precision achieved by both BoVW-based methods. Moreover, its processing time decreased to approximately 0.11 times that of the NoM-based methods, demonstrating relatively high efficiency. Furthermore, in the bundle adjustment results using the selected image sets, the proposed method, along with the NoM-based methods, showed reprojection error values of less than 1, indicating relatively high accuracy and contributing to improved accuracy in estimating image positions. Full article

22 pages, 4997 KiB  
Article
A Sheep Identification Method Based on Three-Dimensional Sheep Face Reconstruction and Feature Point Matching
by Jing Xue, Zhanfeng Hou, Chuanzhong Xuan, Yanhua Ma, Quan Sun, Xiwen Zhang and Liang Zhong
Animals 2024, 14(13), 1923; https://doi.org/10.3390/ani14131923 - 29 Jun 2024
Viewed by 596
Abstract
As the sheep industry rapidly moves towards modernization, digitization, and intelligence, there is a need to build breeding farms integrated with big data. By collecting individual information on sheep, precision breeding can be conducted to improve breeding efficiency, reduce costs, and promote healthy breeding practices. In this context, the accurate identification of individual sheep is essential for establishing digitized sheep farms and precision animal husbandry. Currently, scholars utilize deep learning technology to construct recognition models, learning the biological features of sheep faces to achieve accurate identification. However, existing research methods are limited to pattern recognition at the image level, leading to a lack of diversity in recognition methods. Therefore, this study focuses on the small-tailed Han sheep and develops a sheep face recognition method based on three-dimensional reconstruction technology and feature point matching, aiming to enrich the theoretical research on sheep face recognition technology. The specific recognition approach is as follows: full-angle sheep face images of experimental sheep are collected, and corresponding three-dimensional sheep face models are generated using three-dimensional reconstruction technology, further yielding three-dimensional sheep face images from three different perspectives. Additionally, this study developed a sheep face orientation recognition algorithm (SFORA), which incorporates the efficient channel attention (ECA) mechanism to further enhance recognition performance. Ultimately, the SFORA has a model size of only 5.3 MB, with its accuracy and F1 score reaching 99.6% and 99.5%, respectively. 
During the recognition task, the SFORA is first used for sheep face orientation recognition; the input image is then matched against the corresponding three-dimensional sheep face image using the SuperGlue feature-matching algorithm, and the recognition result is output. Experimental results indicate that when the confidence threshold is set to 0.4, SuperGlue achieves the best matching performance, with matching accuracies for the front, left, and right faces reaching 96.0%, 94.2%, and 96.3%, respectively. This study enriches the theoretical research on sheep face recognition technology and provides technical support. Full article
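The orientation-gated matching step can be sketched as below. The match confidences are simulated stand-ins for real SuperGlue output (which would come from the neural matcher, not from this code), and the sheep ids and score values are hypothetical; only the 0.4 confidence threshold comes from the abstract.

```python
def identify(probe_orientation, match_results, conf_threshold=0.4):
    """Pick the gallery sheep whose reference view (same orientation as the
    probe, as predicted by an orientation classifier such as SFORA) yields
    the most matcher correspondences above the confidence threshold.

    match_results: {sheep_id: {orientation: [match confidences]}} -- a
    stand-in for per-correspondence confidences from a matcher like SuperGlue."""
    scores = {}
    for sheep_id, views in match_results.items():
        confidences = views.get(probe_orientation, [])
        # count only correspondences the matcher is confident about
        scores[sheep_id] = sum(1 for c in confidences if c >= conf_threshold)
    best = max(scores, key=scores.get)
    return best, scores

# Simulated confidences between one frontal probe image and two gallery sheep
results = {
    "sheep_01": {"front": [0.92, 0.81, 0.55, 0.31]},
    "sheep_02": {"front": [0.45, 0.22, 0.10]},
}
best, scores = identify("front", results)
print(best, scores)  # sheep_01 {'sheep_01': 3, 'sheep_02': 1}
```

Gating on orientation first means each probe is only compared against reference views taken from the same perspective, which is what makes the per-view matching accuracies (front/left/right) separable in the first place.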

16 pages, 4304 KiB  
Article
PDE-Constrained Scale Optimization Selection for Feature Detection in Remote Sensing Image Matching
by Yunchao Peng, Bin Zhou and Feng Qi
Mathematics 2024, 12(12), 1882; https://doi.org/10.3390/math12121882 - 17 Jun 2024
Viewed by 509
Abstract
Feature detection and matching is a key technique for remote sensing image processing and related applications. In this paper, a PDE-constrained optimization model is proposed to determine the scale levels advantageous for feature detection. A variance estimation technique is introduced to handle observed optical images corrupted by additive zero-mean Gaussian noise and to determine the parameter of a nonlinear scale space governed by a partial differential equation. Additive operator splitting (AOS) is applied to efficiently solve the PDE constraint, and an iterative algorithm is proposed to approximate the optimal subset of the original scale level set. The selected levels are distributed more uniformly in the total variation sense and help generate more accurate and robust feature points. The experimental results show that the proposed method achieves about a 30% improvement in the number of correct matches with only a small increase in time cost. Full article
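The AOS-style semi-implicit solve mentioned in the abstract can be illustrated with a one-dimensional nonlinear diffusion step: the tridiagonal system (I + tau*A(u)) u_next = u is solved directly with the Thomas algorithm, so the step is unconditionally stable. The Perona-Malik diffusivity, the parameters `tau` and `lam`, and the toy signal are illustrative assumptions, not the paper's scale-space model.

```python
def aos_step(u, tau, lam):
    """One semi-implicit step of 1-D nonlinear diffusion with Neumann
    boundaries: solve (I + tau*A(u)) x = u, A built from edge-stopping
    diffusivities, via the Thomas tridiagonal algorithm."""
    n = len(u)
    # Perona-Malik diffusivity at half points: small across strong gradients
    g = [1.0 / (1.0 + ((u[i + 1] - u[i]) / lam) ** 2) for i in range(n - 1)]
    a = [0.0] * n  # sub-diagonal
    b = [0.0] * n  # main diagonal
    c = [0.0] * n  # super-diagonal
    for i in range(n):
        gl = g[i - 1] if i > 0 else 0.0      # Neumann: no flux at ends
        gr = g[i] if i < n - 1 else 0.0
        b[i] = 1.0 + tau * (gl + gr)
        a[i] = -tau * gl
        c[i] = -tau * gr
    # Thomas algorithm: forward sweep ...
    cp = [0.0] * n
    dp = [0.0] * n
    cp[0] = c[0] / b[0]
    dp[0] = u[0] / b[0]
    for i in range(1, n):
        m = b[i] - a[i] * cp[i - 1]
        cp[i] = c[i] / m
        dp[i] = (u[i] - a[i] * dp[i - 1]) / m
    # ... then back substitution
    x = [0.0] * n
    x[-1] = dp[-1]
    for i in range(n - 2, -1, -1):
        x[i] = dp[i] - cp[i] * x[i + 1]
    return x

u0 = [0.0, 0.0, 0.0, 10.0, 0.0, 0.0, 0.0]
smoothed = aos_step(u0, tau=1.0, lam=2.0)
print([round(v, 3) for v in smoothed])
```

Because the system matrix has unit row sums and is symmetric, each step conserves the signal's total mass while strictly reducing its peaks, which is what makes large time steps safe when sampling scale levels.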
