Search Results (115)

Search Parameters:
Keywords = Siamese convolutional neural network

16 pages, 7952 KiB  
Technical Note
Unsupervised Domain Adaptation with Contrastive Learning-Based Discriminative Feature Augmentation for RS Image Classification
by Ren Xu, Alim Samat, Enzhao Zhu, Erzhu Li and Wei Li
Remote Sens. 2024, 16(11), 1974; https://doi.org/10.3390/rs16111974 - 30 May 2024
Viewed by 524
Abstract
High- and very high-resolution (HR, VHR) remote sensing (RS) images can provide comprehensive and intricate spatial information for land cover classification, which is particularly crucial when analyzing complex built-up environments. However, the application of HR and VHR images to large-scale and detailed land cover mapping is constrained by the intricacy of land cover classification models, the exorbitant cost of collecting training samples, and variations in geography or acquisition conditions. To overcome these limitations, we propose an unsupervised domain adaptation (UDA) method with contrastive learning-based discriminative feature augmentation (CLDFA) for RS image classification. In detail, our method first utilizes contrastive learning (CL) through a memory bank in order to memorize sample features and improve model performance; the approach employs an end-to-end Siamese network and incorporates dynamic pseudo-label assignment and class-balancing strategies for adaptive domain joint learning. By transferring classification models trained on a source domain (SD) to an unlabeled target domain (TD), our proposed UDA method enables large-scale land cover mapping. We conducted experiments using a five-billion-pixel dataset as the SD, tested on HR and VHR RS images of five typical Chinese cities as the TD, and applied the method to a completely unlabeled WorldView-3 (WV3) image of Urumqi. The experimental results demonstrate that our method excels in large-scale HR and VHR RS image classification tasks, highlighting the advantages of semantic segmentation based on end-to-end deep convolutional neural networks (DCNNs).
(This article belongs to the Special Issue Advances in Deep Fusion of Multi-Source Remote Sensing Images)

20 pages, 1720 KiB  
Article
Enhancing Sika Deer Identification: Integrating CNN-Based Siamese Networks with SVM Classification
by Sandhya Sharma, Suresh Timilsina, Bishnu Prasad Gautam, Shinya Watanabe, Satoshi Kondo and Kazuhiko Sato
Electronics 2024, 13(11), 2067; https://doi.org/10.3390/electronics13112067 - 26 May 2024
Viewed by 735
Abstract
Accurately identifying individual wildlife is critical to effective species management and conservation efforts. However, it becomes particularly challenging when distinctive features, such as spot shape and size, serve as primary discriminators, as in the case of Sika deer. To address this challenge, we employed four different Convolutional Neural Network (CNN) base models (EfficientNetB7, VGG19, ResNet152, Inception_v3) within a Siamese Network Architecture that used triplet loss functions for the identification and re-identification of Sika deer. We then determined the best-performing model based on its ability to capture discriminative features. From this model, we extracted embeddings representing the learned features. We then applied a Support Vector Machine (SVM) to these embeddings to classify individual Sika deer. We analyzed a dataset of 5169 images of seven individual Sika deer captured with three camera traps deployed on farmland in Hokkaido, Japan, for over 60 days. During our analysis, ResNet152 performed exceptionally well, achieving a training accuracy of 0.97 and a validation accuracy of 0.96, with mAP scores for the training and validation datasets of 0.97 and 0.96, respectively. We extracted 128-dimensional embeddings from ResNet152 and performed Principal Component Analysis (PCA) for dimensionality reduction. PCA1 and PCA2, which together accounted for over 80% of the variance, were selected for subsequent SVM analysis. The Radial Basis Function (RBF) kernel, which yielded a cross-validation score of 0.96, proved to be most suitable for our research. Hyperparameter optimization using GridSearchCV resulted in a gamma value of 10 and a C value of 0.001. The OneVsRest SVM classifier achieved an overall accuracy of 0.97 and 0.96 for the training and validation datasets, respectively. This study presents a precise model for identifying individual Sika deer using images and video frames, which can be replicated for other species with unique patterns, thereby assisting conservationists and researchers in effectively monitoring and protecting the species.

22 pages, 686 KiB  
Article
Enhancing Human Activity Recognition with Siamese Networks: A Comparative Study of Contrastive and Triplet Learning Approaches
by Byung-Rae Cha and Binod Vaidya
Electronics 2024, 13(9), 1739; https://doi.org/10.3390/electronics13091739 - 1 May 2024
Viewed by 760
Abstract
This paper delves into the realm of human activity recognition (HAR) by leveraging the capabilities of Siamese neural networks (SNNs), focusing on the comparative effectiveness of contrastive and triplet learning approaches. Against the backdrop of HAR’s growing importance in healthcare, sports, and smart environments, the need for advanced models capable of accurately recognizing and classifying complex human activities has become paramount. Addressing this, we have introduced a Siamese network architecture integrated with convolutional neural networks (CNNs) for spatial feature extraction, bidirectional LSTM (Bi-LSTM) for temporal dependency capture, and attention mechanisms to prioritize salient features. Employing both contrastive and triplet loss functions, we meticulously analyze the impact of these learning approaches on the network’s ability to generate discriminative embeddings for HAR tasks. Through extensive experimentation, the study reveals that Siamese networks, particularly those utilizing triplet loss functions, demonstrate superior performance in activity recognition accuracy and F1 scores compared with baseline deep learning models. The inclusion of a stacking meta-classifier further amplifies classification efficacy, showcasing the robustness and adaptability of our proposed model. Conclusively, our findings underscore the potential of Siamese networks with advanced learning paradigms in enhancing HAR systems, paving the way for future research in model optimization and application expansion. Full article
(This article belongs to the Special Issue Recent Advances in Wireless Ad Hoc and Sensor Networks)
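
The abstract above compares contrastive and triplet learning for Siamese networks. As a rough illustration of the difference between the two objectives, the following sketch evaluates both losses on hand-written 2-D embeddings. It uses the common textbook forms of the losses with Euclidean distance; the function names, the margin of 1.0, and the toy vectors are assumptions made for illustration and are not taken from the paper.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length embedding vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def contrastive_loss(a, b, same, margin=1.0):
    """Pairwise loss: pull matching pairs together, push others past the margin."""
    d = euclidean(a, b)
    if same:
        return d ** 2
    return max(0.0, margin - d) ** 2


def triplet_loss(anchor, positive, negative, margin=1.0):
    """Triplet loss: the positive should be closer to the anchor than the
    negative by at least the margin."""
    return max(0.0, euclidean(anchor, positive) - euclidean(anchor, negative) + margin)


# Toy embeddings (hypothetical values chosen so the arithmetic is easy to follow).
anchor = [0.0, 0.0]
positive = [0.0, 0.5]   # same activity as the anchor
negative = [3.0, 4.0]   # different activity

print(contrastive_loss(anchor, positive, same=True))    # 0.25
print(contrastive_loss(anchor, negative, same=False))   # 0.0
print(triplet_loss(anchor, positive, negative))         # 0.0
```

The contrastive loss looks at one pair at a time, while the triplet loss only constrains the relative ordering of two distances, which is one commonly cited reason the two can behave differently in practice.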

19 pages, 5280 KiB  
Article
A New Method for Bearing Fault Diagnosis across Machines Based on Envelope Spectrum and Conditional Metric Learning
by Xu Yang, Junfeng Yang, Yupeng Jin and Zhongchao Liu
Sensors 2024, 24(9), 2674; https://doi.org/10.3390/s24092674 - 23 Apr 2024
Cited by 1 | Viewed by 618
Abstract
In recent years, most research on bearing fault diagnosis has assumed that the source domain and target domain data come from the same machine. The differences in equipment lead to a decrease in diagnostic accuracy. To address this issue, unsupervised domain adaptation techniques have been introduced. However, most cross-device fault diagnosis models overlook the discriminative information under the marginal distribution, which restricts the performance of the models. In this paper, we propose a bearing fault diagnosis method based on envelope spectrum and conditional metric learning. First, envelope spectral analysis is used to extract frequency domain features. Then, to fully utilize the discriminative information from the label distribution, we construct a deep Siamese convolutional neural network based on conditional metric learning to eliminate the data distribution differences and extract common features from the source and target domain data. Finally, dynamic weighting factors are employed to improve the convergence performance of the model and optimize the training process. Experimental analysis is conducted on 12 cross-device tasks and compared with other relevant methods. The results show that the proposed method achieves the best performance on all three evaluation metrics. Full article

20 pages, 3396 KiB  
Article
Non-Intrusive Load Identification Based on Retrainable Siamese Network
by Lingxia Lu, Ju-Song Kang, Fanju Meng and Miao Yu
Sensors 2024, 24(8), 2562; https://doi.org/10.3390/s24082562 - 17 Apr 2024
Cited by 1 | Viewed by 743
Abstract
Non-intrusive load monitoring (NILM) can identify each electrical load and its operating state in a household by using the voltage and current data measured at a single point on the bus, making it a key technology for smart grid construction and effective energy consumption. The existing NILM methods mainly focus on the identification of pre-trained loads, for which they can achieve high identification accuracy. However, these methods rarely address unknown load identification, and the scalability of NILM remains a crucial problem. In light of this, we propose a non-intrusive load identification method based on a Siamese network, which can be retrained after the detection of an unknown load to increase the identification accuracy for unknown loads. The proposed Siamese network comprises a fixed convolutional neural network (CNN) and two retrainable back propagation (BP) networks. When an unknown load is detected, the low-dimensional features of its voltage–current (V-I) trajectory are extracted by using the fixed CNN model, and the BP networks are retrained online. Fine-tuning the BP network parameters through retraining improves the representation ability of the network model; thus, a high accuracy of unknown load identification can be achieved by updating the Siamese network in real time. The public WHITED and PLAID datasets are used to validate the proposed method. Finally, the practicality and scalability of the method are demonstrated in a real-house test of online retraining on an embedded Linux system with an STM32MP1 at its core.
(This article belongs to the Section Intelligent Sensors)

15 pages, 4091 KiB  
Article
SSTrack: An Object Tracking Algorithm Based on Spatial Scale Attention
by Qi Mu, Zuohui He, Xueqian Wang and Zhanli Li
Appl. Sci. 2024, 14(6), 2476; https://doi.org/10.3390/app14062476 - 15 Mar 2024
Viewed by 784
Abstract
The traditional Siamese object tracking algorithm uses a convolutional neural network as the backbone and has achieved good results in improving tracking precision. However, because such trackers lack global information and make little use of spatial and scale information, their accuracy and speed still need to be improved in complex environments such as rapid motion and illumination variation. In response to these problems, we propose SSTrack, an object tracking algorithm based on spatial scale attention. We use a dilated convolution branch and covariance pooling to build a spatial scale attention module, which can extract the spatial and scale information of the target object. By embedding the spatial scale attention module into Swin Transformer as the backbone, the ability to extract local detailed information is enhanced, and the success rate and precision of tracking are improved. At the same time, to reduce the computational complexity of self-attention, Exemplar Transformer is applied to the encoder structure. SSTrack achieved 71.5% average overlap (AO), 86.7% normalized precision (NP), and 68.4% area under curve (AUC) scores on GOT-10k, TrackingNet, and LaSOT, respectively. The tracking speed reached 28 fps, which can meet the need for real-time object tracking.

16 pages, 3440 KiB  
Article
Siamese Networks for Clinically Relevant Bacteria Classification Based on Raman Spectroscopy
by Jhonatan Contreras, Sara Mostafapour, Jürgen Popp and Thomas Bocklitz
Molecules 2024, 29(5), 1061; https://doi.org/10.3390/molecules29051061 - 28 Feb 2024
Cited by 2 | Viewed by 1038
Abstract
Identifying bacterial strains is essential in microbiology for various practical applications, such as disease diagnosis and quality monitoring of food and water. Classical machine learning algorithms have been utilized to identify bacteria based on their Raman spectra. Convolutional neural networks (CNNs) offer higher classification accuracy, but they require extensive training sets, and retraining them for previously untrained target classes can be costly and time-consuming. Siamese networks have emerged as a promising solution. They are composed of two CNNs with the same structure and a final network that acts as a distance metric, converting the classification problem into a similarity problem. Classical machine learning approaches, shallow and deep CNNs, and two Siamese network variants were tailored and tested on Raman spectral datasets of bacteria. The methods were evaluated based on mean sensitivity, training time, prediction time, and the number of parameters. In this comparison, Siamese-model2 achieved the highest mean sensitivity of 83.61 ± 4.73 and demonstrated remarkable performance in handling unbalanced and limited data scenarios, achieving a prediction accuracy of 73%. The choice of model therefore depends on the specific trade-off between accuracy, prediction and training time, and resources for the particular application. Classical machine learning models and shallow CNN models may be more suitable if time and computational resources are a concern. Siamese networks are a good choice for small datasets, and CNNs for extensive data.
(This article belongs to the Special Issue Chemometrics Tools in Analytical Chemistry 2.0)
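
The abstract above describes how a Siamese network turns classification into a similarity problem. A minimal sketch of that idea follows: a query is assigned the label of the most similar labelled reference, so a new class can be added by supplying a reference example rather than retraining a classifier head. Everything here is an assumption for illustration: `embed` is a stand-in for the shared CNN branch (it only L2-normalizes its input), a fixed Euclidean distance replaces the learned distance network mentioned in the abstract, and the strain names and vectors are made up.

```python
import math


def embed(x):
    """Stand-in for the shared CNN branch: L2-normalize the input vector."""
    norm = math.sqrt(sum(v * v for v in x)) or 1.0
    return [v / norm for v in x]


def distance(a, b):
    """Euclidean distance between the embeddings of two inputs.
    Both inputs go through the same embed function (shared weights)."""
    return math.sqrt(sum((p - q) ** 2 for p, q in zip(embed(a), embed(b))))


def classify(query, references):
    """Return the label of the reference closest to the query.
    references is a list of (label, vector) pairs."""
    label, _ = min(references, key=lambda item: distance(query, item[1]))
    return label


# Hypothetical reference spectra, one per known class.
references = [
    ("strain_A", [1.0, 0.1, 0.0]),
    ("strain_B", [0.0, 0.2, 1.0]),
]
print(classify([0.9, 0.2, 0.1], references))   # strain_A

# Adding a new class only requires a new reference example.
references.append(("strain_C", [0.0, 1.0, 0.0]))
print(classify([0.1, 0.9, 0.0], references))   # strain_C
```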

19 pages, 7376 KiB  
Article
Siamese Tracking Network with Spatial-Semantic-Aware Attention and Flexible Spatiotemporal Constraint
by Huanlong Zhang, Panyun Wang, Jie Zhang, Fengxian Wang, Xiaohui Song and Hebin Zhou
Symmetry 2024, 16(1), 61; https://doi.org/10.3390/sym16010061 - 3 Jan 2024
Viewed by 1119
Abstract
Siamese trackers based on classification and regression have drawn extensive attention due to their appropriate balance between accuracy and efficiency. However, most of them are prone to failure in the face of abrupt motion or appearance changes. This paper proposes a Siamese-based tracker that incorporates spatial-semantic-aware attention and a flexible spatiotemporal constraint. First, we develop a spatial-semantic-aware attention model, which identifies the importance of each feature region and channel to target representation through a single convolution attention network with a loss function and increases the corresponding weights in the spatial and channel dimensions to reinforce the target region and semantic information on the target feature map. Second, considering that the traditional method unreasonably weights the target response in abrupt motion, we design a flexible spatiotemporal constraint. This constraint adaptively adjusts the constraint weights on the response map by evaluating the tracking result. Finally, we propose a new template-updating strategy. This strategy adaptively adjusts the contribution weights of the tracking result to the new template using depth correlation assessment criteria, thereby enhancing the reliability of the template. The Siamese network used in this paper is a symmetric neural network with dual input branches sharing weights. The experimental results on five challenging datasets show that our method outperformed other advanced algorithms.

27 pages, 15549 KiB  
Article
Loop Closure Detection Based on Compressed ConvNet Features in Dynamic Environments
by Shuhai Jiang, Zhongkai Zhou and Shangjie Sun
Appl. Sci. 2024, 14(1), 8; https://doi.org/10.3390/app14010008 - 19 Dec 2023
Cited by 1 | Viewed by 822
Abstract
In dynamic environments, convolutional neural networks (CNNs) often produce image feature maps with significant redundancy due to external factors such as moving objects and occlusions. These feature maps are inadequate as precise image descriptors for similarity measurement, hindering loop closure detection. Addressing this issue, this paper proposes feature compression of convolutional neural network output. The approach is detailed as follows: (1) employing ResNet152 as the backbone feature-extraction network, a Siamese neural network is constructed to enhance the efficiency of feature extraction; (2) utilizing KL transformation to extract principal components from the backbone network’s output, thereby eliminating redundant information; (3) employing the compressed features as input for NetVLAD to construct a spatially informed feature descriptor for similarity measurement. Experimental results demonstrate that, on the New College dataset, the proposed improved method exhibits an approximately 9.98% enhancement in average accuracy compared to the original network. On the City Center dataset, there is an improvement of approximately 2.64%, with an overall increase of about 23.51% in time performance. These findings indicate that the enhanced ResNet152 performs better than the original network in environments with more moving objects and occlusions. Full article
(This article belongs to the Section Robotics and Automation)

22 pages, 6937 KiB  
Article
A Full-Scale Connected CNN–Transformer Network for Remote Sensing Image Change Detection
by Min Chen, Qiangjiang Zhang, Xuming Ge, Bo Xu, Han Hu, Qing Zhu and Xin Zhang
Remote Sens. 2023, 15(22), 5383; https://doi.org/10.3390/rs15225383 - 16 Nov 2023
Cited by 4 | Viewed by 1401
Abstract
Recent studies have introduced transformer modules into convolutional neural networks (CNNs) to solve the inherent limitations of CNNs in global modeling and have achieved impressive performance. However, some challenges have yet to be addressed: first, networks with simple connections between the CNN and transformer perform poorly in small change areas; second, networks that only use transformer structures are prone to attaining coarse detection results and excessively generalizing feature boundaries. In addition, the methods of fusing the CNN and transformer have the issue of a unilateral flow of feature information and inter-scale communication, leading to a loss of change information across different scales. To mitigate these problems, this study proposes a full-scale connected CNN–Transformer network, which incorporates the Siamese structure, Unet3+, and transformer structure, used for change detection in remote sensing images, namely SUT. A progressive attention module (PAM) is adopted in SUT to deeply integrate the features extracted from both the CNN and the transformer, resulting in improved global modeling, small target detection capacities, and clearer feature boundaries. Furthermore, SUT adopts a full-scale skip connection to realize multi-directional information flow from the encoder to decoder, enhancing the ability to extract multi-scale features. Experimental results demonstrate that the method we designed performs best on the CDD, LEVIR-CD, and WHU-CD datasets with its concise structure. In particular, based on the WHU-CD dataset, SUT upgrades the F1-score by more than 4% and the intersection over union (IOU) by more than 7% compared with the second-best method. Full article
(This article belongs to the Section Remote Sensing Image Processing)
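
The abstract above reports F1-score and intersection over union (IoU) for change detection. As a reminder of how these two metrics relate, the sketch below computes both from a predicted and a reference binary change mask using the standard definitions. The function name and the six-pixel example masks are assumptions for illustration and have no connection to the datasets or results in the paper.

```python
def change_detection_scores(pred, truth):
    """Return (F1, IoU) for the 'changed' class of two flat binary masks."""
    tp = sum(1 for p, t in zip(pred, truth) if p == 1 and t == 1)
    fp = sum(1 for p, t in zip(pred, truth) if p == 1 and t == 0)
    fn = sum(1 for p, t in zip(pred, truth) if p == 0 and t == 1)

    # F1 = 2TP / (2TP + FP + FN); IoU = TP / (TP + FP + FN).
    f1 = 2 * tp / (2 * tp + fp + fn) if (tp + fp + fn) else 0.0
    iou = tp / (tp + fp + fn) if (tp + fp + fn) else 0.0
    return f1, iou


# Hypothetical flattened masks: 1 = changed pixel, 0 = unchanged pixel.
pred = [1, 1, 0, 0, 1, 0]
truth = [1, 0, 0, 0, 1, 1]

f1, iou = change_detection_scores(pred, truth)
print(f"F1 = {f1:.3f}, IoU = {iou:.3f}")   # F1 = 0.667, IoU = 0.500
```

For the same masks, IoU is never larger than F1 (IoU = F1 / (2 - F1)), so a given improvement shows up as different numerical gains in the two metrics.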

16 pages, 519 KiB  
Article
Domain-Specific Processing Stage for Estimating Single-Trail Evoked Potential Improves CNN Performance in Detecting Error Potential
by Andrea Farabbi and Luca Mainardi
Sensors 2023, 23(22), 9049; https://doi.org/10.3390/s23229049 - 8 Nov 2023
Cited by 1 | Viewed by 872
Abstract
We present a novel architecture designed to enhance the detection of Error Potential (ErrP) signals during ErrP stimulation tasks. In the context of predicting ErrP presence, conventional Convolutional Neural Networks (CNNs) typically accept a raw EEG signal as input, encompassing both the information associated with the evoked potential and the background activity, which can potentially diminish predictive accuracy. Our approach involves advanced Single-Trial (ST) ErrP enhancement techniques for processing raw EEG signals in the initial stage, followed by CNNs for discerning between ErrP and NonErrP segments in the second stage. We tested different combinations of methods and CNNs. As far as ST ErrP estimation is concerned, we examined various methods encompassing subspace regularization techniques, Continuous Wavelet Transform, and ARX models. For the classification stage, we evaluated the performance of EEGNet, CNN, and a Siamese Neural Network. A comparative analysis against the method of directly applying CNNs to raw EEG signals revealed the advantages of our architecture. Leveraging subspace regularization yielded the best improvement in classification metrics, at up to 14% in balanced accuracy and 13.4% in F1-score. Full article
(This article belongs to the Special Issue EEG Signal Processing Techniques and Applications—2nd Edition)

15 pages, 3070 KiB  
Article
Deep-Learning-Based Mixture Identification for Nuclear Magnetic Resonance Spectroscopy Applied to Plant Flavors
by Yufei Wang, Weiwei Wei, Wen Du, Jiaxiao Cai, Yuxuan Liao, Hongmei Lu, Bo Kong and Zhimin Zhang
Molecules 2023, 28(21), 7380; https://doi.org/10.3390/molecules28217380 - 1 Nov 2023
Cited by 1 | Viewed by 1404
Abstract
Nuclear magnetic resonance (NMR) is a crucial technique for analyzing mixtures consisting of small molecules, providing non-destructive, fast, reproducible, and unbiased benefits. However, it is challenging to perform mixture identification because of the offset of chemical shifts and peak overlaps that often exist in mixtures such as plant flavors. Here, we propose a deep-learning-based mixture identification method (DeepMID) that can be used to identify plant flavors (mixtures) in a formulated flavor (mixture consisting of several plant flavors) without the need to know the specific components in the plant flavors. A pseudo-Siamese convolutional neural network (pSCNN) and a spatial pyramid pooling (SPP) layer were used to solve the problems due to their high accuracy and robustness. The DeepMID model is trained, validated, and tested on an augmented data set containing 50,000 pairs of formulated and plant flavors. We demonstrate that DeepMID can achieve excellent prediction results in the augmented test set: ACC = 99.58%, TPR = 99.48%, FPR = 0.32%; and two experimentally obtained data sets: one shows ACC = 97.60%, TPR = 92.81%, FPR = 0.78% and the other shows ACC = 92.31%, TPR = 80.00%, FPR = 0.00%. In conclusion, DeepMID is a reliable method for identifying plant flavors in formulated flavors based on NMR spectroscopy, which can assist researchers in accelerating the design of flavor formulations. Full article
(This article belongs to the Section Analytical Chemistry)

12 pages, 483 KiB  
Article
Infrared Fault Classification Based on the Siamese Network
by Lili Zhang, Xiuhui Wang, Qifu Bao, Bo Jia, Xuesheng Li and Yaru Wang
Appl. Sci. 2023, 13(20), 11457; https://doi.org/10.3390/app132011457 - 19 Oct 2023
Cited by 1 | Viewed by 863
Abstract
Solar energy technology has made significant progress in recent years, but the daily maintenance of solar panels faces significant challenges. The diagnosis of solar panel failures by infrared detection devices can improve the efficiency of maintenance personnel. Currently, due to the scarcity of infrared solar panel failure samples and the lack of clear effective image features, traditional deep neural network models can easily encounter overfitting and poor generalization performance under small sample conditions. To address these problems, this paper proposes a solar panel failure diagnosis method based on an improved Siamese network. Firstly, two types of solar panel samples of the same category are constructed. Secondly, the images of the samples are input into a feature model combining convolution, adaptive coordinate attention (ACA), and the feature fusion module (FFM) to extract features, learning the similarities between different types of solar panel samples. Finally, the trained model is used to determine the similarity of the input solar image, obtaining the failure diagnosis results. Here, adaptive coordinate attention can effectively obtain the effective feature information of interest, and the feature fusion module can integrate the different effective information obtained, further enriching the feature information. The ACA-FFM Siamese network method can alleviate the problem of insufficient sample quantity and effectively improve the classification accuracy, achieving a classification accuracy of 83.9% on an open-access infrared failure dataset with high similarity.

18 pages, 5370 KiB  
Article
Siamese Trackers Based on Deep Features for Visual Tracking
by Su-Chang Lim, Jun-Ho Huh and Jong-Chan Kim
Electronics 2023, 12(19), 4140; https://doi.org/10.3390/electronics12194140 - 4 Oct 2023
Cited by 1 | Viewed by 1569
Abstract
Visual object tracking poses challenges due to deformation of target object appearance, fast motion, brightness change, occlusion by obstacles, etc. In this paper, a Siamese network that is configured using a convolutional neural network is proposed to improve tracking accuracy and robustness. Object tracking accuracy is dependent on features that can well represent objects. Thus, we designed a convolutional neural network structure that can preserve feature information that is produced in the previous layer to extract spatial and semantic information. Features are extracted from the target object and search area using a Siamese network, and the extracted feature map is input into the region proposal network, where fast Fourier-transform convolution is applied. From the feature map, the network produces probability scores for the presence of an object in each region, and regions with high similarity are searched for the target. The network was trained with the ImageNet Large Scale Visual Recognition Challenge video dataset. In the experiment, quantitative and qualitative evaluations were conducted using the object-tracking benchmark dataset. The proposed method achieved competitive results for some video attributes, with a success metric of 0.632 and a precision metric of 0.856 as quantitative values.

20 pages, 8917 KiB  
Article
TTNet: A Temporal-Transform Network for Semantic Change Detection Based on Bi-Temporal Remote Sensing Images
by Liangcun Jiang, Feng Li, Li Huang, Feifei Peng and Lei Hu
Remote Sens. 2023, 15(18), 4555; https://doi.org/10.3390/rs15184555 - 15 Sep 2023
Cited by 2 | Viewed by 1465
Abstract
Semantic change detection (SCD) holds a critical place in remote sensing image interpretation, as it aims to locate changing regions and identify their associated land cover classes. Presently, post-classification techniques stand as the predominant strategy for SCD due to their simplicity and efficacy. However, these methods often overlook the intricate relationships between alterations in land cover. In this paper, we argue that comprehending the interplay of changes within land cover maps holds the key to enhancing SCD’s performance. With this insight, a Temporal-Transform Module (TTM) is designed to capture change relationships across temporal dimensions. TTM selectively aggregates features across all temporal images, enhancing the unique features of each temporal image at distinct pixels. Moreover, we build a Temporal-Transform Network (TTNet) for SCD, comprising two semantic segmentation branches and a binary change detection branch. TTM is embedded into the decoder of each semantic segmentation branch, thus enabling TTNet to obtain better land cover classification results. Experimental results on the SECOND dataset show that TTNet achieves enhanced performance when compared to other benchmark methods in the SCD task. In particular, TTNet elevates mIoU accuracy by a minimum of 1.5% in the SCD task and 3.1% in the semantic segmentation task. Full article