Search Results (20)

Search Parameters:
Keywords = few-shot semantic segmentation

14 pages, 9682 KiB  
Article
Global–Local Query-Support Cross-Attention for Few-Shot Semantic Segmentation
by Fengxi Xie, Guozhen Liang and Ying-Ren Chien
Mathematics 2024, 12(18), 2936; https://doi.org/10.3390/math12182936 - 21 Sep 2024
Viewed by 686
Abstract
Few-shot semantic segmentation (FSS) models aim to segment unseen target objects in a query image given only a few annotated support samples. This challenging task requires effective use of the information contained in the limited support set. However, most existing FSS methods either compress support features into a few prototype vectors or construct pixel-wise support-query correlations to guide the segmentation, and thus fail to exploit support information from a combined global–local perspective. In this paper, we propose Global–Local Query-Support Cross-Attention (GLQSCA), which exploits both global semantics and local details. Implemented with multi-head attention in a transformer architecture, GLQSCA treats every query pixel as a token and aggregates its segmentation label from the support mask values, weighted by its similarities with all foreground prototypes (global information) and with the support pixels (local information). Experiments show that GLQSCA significantly surpasses state-of-the-art methods on the standard FSS benchmarks PASCAL-5i and COCO-20i.
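
As a concrete illustration of the idea, the sketch below (PyTorch; my assumption of the mechanism, not the authors' released code) treats each query pixel as an attention token whose soft label is aggregated from prototype and support-pixel labels. The function name, tensor shapes, and single-head formulation are all illustrative.

```python
import torch

def query_support_cross_attention(q_feat, s_feat, s_mask, prototypes):
    """q_feat: (N_q, C) query-pixel tokens; s_feat: (N_s, C) support-pixel
    tokens; s_mask: (N_s,) float support mask values in {0, 1};
    prototypes: (P, C) foreground prototypes, each carrying label 1."""
    keys = torch.cat([prototypes, s_feat], dim=0)                  # global + local tokens
    values = torch.cat([torch.ones(prototypes.shape[0]), s_mask])  # labels to aggregate
    # scaled dot-product attention: each query pixel attends over all support tokens
    attn = torch.softmax(q_feat @ keys.t() / q_feat.shape[1] ** 0.5, dim=-1)
    return attn @ values                                           # (N_q,) soft labels
```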

14 pages, 14439 KiB  
Article
Class-Aware Self- and Cross-Attention Network for Few-Shot Semantic Segmentation of Remote Sensing Images
by Guozhen Liang, Fengxi Xie and Ying-Ren Chien
Mathematics 2024, 12(17), 2761; https://doi.org/10.3390/math12172761 - 6 Sep 2024
Viewed by 899
Abstract
Few-Shot Semantic Segmentation (FSS) has drawn massive attention recently due to its remarkable ability to segment novel-class objects given only a handful of support samples. However, current FSS methods mainly focus on natural images and pay little attention to more practical and challenging scenarios such as remote sensing image segmentation. In remote sensing image analysis, characteristics such as complex backgrounds and tiny foreground objects make novel-class segmentation challenging. To cope with these obstacles, we propose a Class-Aware Self- and Cross-Attention Network (CSCANet) for FSS in remote sensing imagery, consisting of a lightweight self-attention module and a supervised prior-guided cross-attention module. Concretely, the self-attention module abstracts robust unseen-class information from the support features, while the cross-attention module generates a high-quality query attention map that directs the network to focus on novel objects. Experiments demonstrate that CSCANet achieves outstanding performance on the standard remote sensing FSS benchmark iSAID-5i, surpassing existing state-of-the-art FSS models across all combinations of backbone networks and K-shot settings.
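
One common ingredient behind such prior-guided cross-attention is a query attention map scored against a pooled support prototype. The sketch below shows that generic step only (my hedged reading, not CSCANet's actual modules), with all names and shapes illustrative.

```python
import torch
import torch.nn.functional as F

def query_attention_map(s_feat, s_mask, q_feat, eps=1e-6):
    """s_feat, q_feat: (C, H, W) support/query feature maps;
    s_mask: (H, W) binary support mask."""
    # masked average pooling: one class prototype from the support foreground
    proto = (s_feat * s_mask).sum(dim=(1, 2)) / (s_mask.sum() + eps)  # (C,)
    # cosine similarity of every query location to the prototype
    sim = F.cosine_similarity(q_feat, proto[:, None, None], dim=0)    # (H, W)
    return sim.clamp(min=0)  # keep positive evidence as the attention prior
```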

15 pages, 1225 KiB  
Article
A Self-Supervised Few-Shot Semantic Segmentation Method Based on Multi-Task Learning and Dense Attention Computation
by Kai Yi, Weihang Wang and Yi Zhang
Sensors 2024, 24(15), 4975; https://doi.org/10.3390/s24154975 - 31 Jul 2024
Viewed by 1015
Abstract
Autonomous driving technology is now widely prevalent, and intelligent vehicles are equipped with various sensors (e.g., vision sensors, LiDAR, and depth cameras). Among them, vision systems with tailored semantic segmentation and perception algorithms play a critical role in scene understanding. However, traditional supervised semantic segmentation requires a large number of pixel-level manual annotations for model training. Although few-shot methods reduce the annotation work to some extent, they remain labor intensive. In this paper, a self-supervised few-shot semantic segmentation method based on Multi-task Learning and Dense Attention Computation (dubbed MLDAC) is proposed. The salient region of an image is split into two parts: one serves as the support mask for few-shot segmentation, while cross-entropy losses are calculated separately between the prediction and the other part and between the prediction and the entire salient region, as a multi-task objective that improves the model's generalization ability. A Swin Transformer backbone extracts feature maps at different scales, which are then fed to multiple dense attention computation blocks to enhance pixel-level correspondence. The final prediction is obtained through inter-scale mixing and feature skip connections. The experimental results indicate that MLDAC obtains 55.1% and 26.8% one-shot mIoU for self-supervised few-shot segmentation on the PASCAL-5i and COCO-20i datasets, respectively. In addition, it achieves 78.1% on the FSS-1000 few-shot dataset, proving its efficacy.
(This article belongs to the Section Sensing and Imaging)
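
A minimal sketch of the multi-task objective as I read the abstract (not the released MLDAC code): one half of the salient mask plays the support mask, and the prediction is supervised against both the held-out half and the full salient region. The loss weight `alpha` is an assumed hyperparameter.

```python
import torch
import torch.nn.functional as F

def mldac_style_loss(logits, half_a, half_b, alpha=0.5):
    """logits: (B, 2, H, W) binary segmentation logits; half_a / half_b:
    (B, H, W) binary halves of the salient mask, half_a having served as
    the support mask for the episode."""
    full = ((half_a + half_b) > 0).long()               # the whole salient region
    loss_half = F.cross_entropy(logits, half_b.long())  # task 1: recover the unseen half
    loss_full = F.cross_entropy(logits, full)           # task 2: recover the full region
    return alpha * loss_half + (1 - alpha) * loss_full
```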

29 pages, 13960 KiB  
Article
Few-Shot Image Segmentation Using Generating Mask with Meta-Learning Classifier Weight Transformer Network
by Jian-Hong Wang, Phuong Thi Le, Fong-Ci Jhou, Ming-Hsiang Su, Kuo-Chen Li, Shih-Lun Chen, Tuan Pham, Ji-Long He, Chien-Yao Wang, Jia-Ching Wang and Pao-Chi Chang
Electronics 2024, 13(13), 2634; https://doi.org/10.3390/electronics13132634 - 4 Jul 2024
Cited by 1 | Viewed by 1635
Abstract
With the rapid advancement of modern hardware technology, breakthroughs have been made in many areas of artificial intelligence research, leading toward machine replacement or assistance in various fields. However, most artificial intelligence and deep learning techniques require large amounts of training data and are typically applicable to a single task objective. Acquiring such large training datasets can be particularly challenging, especially in domains like medical imaging. In image processing, few-shot image segmentation is an area of active research; recent studies have employed deep learning and meta-learning approaches to enable models to segment objects with only a small amount of training data and to adapt quickly to new task objectives. This paper proposes a network architecture for meta-learning few-shot image segmentation that uses a meta-learning classification weight transfer network to generate masks. The architecture leverages pre-trained classification weight transfer to generate informative prior masks and a pre-trained feature extraction architecture to extract features of query and support images. Furthermore, it utilizes a Feature Enrichment Module to adaptively propagate information from finer to coarser features in a top-down manner for query image feature extraction. Finally, a classification module predicts the query image segmentation. Experimental results demonstrate that, with mean Intersection over Union (mIoU) as the evaluation metric, accuracy increases over the baseline by 1.7% in the one-shot experiment and by 2.6% in the five-shot experiment. Thus, the proposed architecture with a meta-learning classification weight transfer network for mask generation exhibits superior performance in few-shot image segmentation.
(This article belongs to the Special Issue Intelligent Big Data Analysis for High-Dimensional Internet of Things)
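
The sketch below shows one plausible realization of "classification weight transfer for prior masks" (an assumption on my part, not the paper's implementation): the pre-trained classifier weight of the support class is correlated with every query-feature location.

```python
import torch
import torch.nn.functional as F

def prior_mask_from_classifier(q_feat, class_weight):
    """q_feat: (C, H, W) query features from a frozen backbone;
    class_weight: (C,) the pre-trained classification-head row for the class."""
    sim = F.cosine_similarity(q_feat, class_weight[:, None, None], dim=0)  # (H, W)
    # min-max normalize to [0, 1] so the map can be stacked as a prior-mask channel
    return (sim - sim.min()) / (sim.max() - sim.min() + 1e-6)
```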

18 pages, 5724 KiB  
Article
Pixel-Wise and Class-Wise Semantic Cues for Few-Shot Segmentation in Astronaut Working Scenes
by Qingwei Sun, Jiangang Chao, Wanhong Lin, Dongyang Wang, Wei Chen, Zhenying Xu and Shaoli Xie
Aerospace 2024, 11(6), 496; https://doi.org/10.3390/aerospace11060496 - 20 Jun 2024
Cited by 1 | Viewed by 834
Abstract
Few-shot segmentation (FSS) is a cutting-edge technology that can meet segmentation requirements with a small annotation workload. With the development of China's aerospace engineering, FSS plays a fundamental role in the intelligent parsing of astronaut working scenes (AWSs). Although mainstream FSS methods have made considerable breakthroughs on natural data, they are not suitable for AWSs, which are characterized by similar foreground (FG) and background (BG), hard-to-distinguish categories, and strong lighting effects, all of which place higher demands on FSS methods. We design a pixel-wise and class-wise network (PCNet) to match support and query features using pixel-wise and class-wise semantic cues. Specifically, PCNet extracts pixel-wise semantic information at each layer of the backbone using a novel cross-attention. Dense prototypes are further utilized to extract class-wise semantic cues as a supplement, and the deep prototype is distilled in reverse to the shallow layers to improve its quality. Furthermore, we build a customized dataset for AWSs and conduct extensive experiments. The results indicate that PCNet outperforms the best published method by 4.34% and 5.15% in accuracy under the one-shot and five-shot settings, respectively. Moreover, PCNet compares favorably with traditional semantic segmentation models under the 13-shot setting.
(This article belongs to the Section Astronautics & Space Science)

20 pages, 2767 KiB  
Article
A Robust Chinese Named Entity Recognition Method Based on Integrating Dual-Layer Features and CSBERT
by Yingjie Xu, Xiaobo Tan, Xin Tong and Wenbo Zhang
Appl. Sci. 2024, 14(3), 1060; https://doi.org/10.3390/app14031060 - 26 Jan 2024
Cited by 5 | Viewed by 1569
Abstract
In the rapidly evolving field of cybersecurity, the integration of multi-source, heterogeneous, and fragmented data into a coherent knowledge graph has garnered considerable attention. Such a graph elucidates semantic interconnections, thereby facilitating sophisticated analytical decision support. Central to the construction of a cybersecurity knowledge graph is Named Entity Recognition (NER), a critical technology that converts unstructured text into structured data. The efficacy of NER is pivotal, as it directly influences the integrity of the knowledge graph. NER in cybersecurity, particularly within the Chinese linguistic context, presents distinct challenges: Chinese text lacks explicit space delimiters and features complex contextual dependencies, which exacerbates the difficulty of discerning and categorizing named entities and leads to word-segmentation errors and semantic ambiguities that impede NER accuracy. This paper introduces a novel NER methodology tailored to the Chinese cybersecurity corpus, termed CSBERT-IDCNN-BiLSTM-CRF. This approach harnesses Iterative Dilated Convolutional Neural Networks (IDCNN) for extracting local features and Bi-directional Long Short-Term Memory networks (BiLSTM) for contextual understanding. It incorporates CSBERT, a pre-trained model adept at processing few-shot data, to derive input feature representations, and culminates with Conditional Random Fields (CRF) for precise sequence labeling. To compensate for the scarcity of publicly accessible Chinese cybersecurity datasets, this paper synthesizes a bespoke dataset, authenticated against data from the China National Vulnerability Database and processed via the YEDDA annotation tool. Empirical analysis affirms that the proposed CSBERT-IDCNN-BiLSTM-CRF model surpasses existing Chinese NER frameworks, with an F1-score of 87.30% and a precision of 85.89%. This marks a significant advancement in the accurate identification of cybersecurity entities in Chinese text, reflecting the model's robust capability to address the unique challenges presented by the language's structural intricacies.
(This article belongs to the Special Issue Natural Language Processing (NLP) and Applications—2nd Edition)
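
A structural sketch of the tagger as I read the abstract; the sequential IDCNN-then-BiLSTM wiring, the layer sizes, and the tag count are assumptions, and the CRF is left abstract (an external CRF layer would consume the emission scores).

```python
import torch
import torch.nn as nn

class IDCNNBiLSTMEncoder(nn.Module):
    def __init__(self, emb_dim=768, hidden=256, n_tags=9, dilations=(1, 1, 2)):
        super().__init__()
        # iterated dilated convolutions extract local features at growing receptive fields
        self.idcnn = nn.ModuleList(
            nn.Conv1d(emb_dim, emb_dim, kernel_size=3, padding=d, dilation=d)
            for d in dilations
        )
        # BiLSTM adds long-range bidirectional context
        self.bilstm = nn.LSTM(emb_dim, hidden, batch_first=True, bidirectional=True)
        self.emissions = nn.Linear(2 * hidden, n_tags)  # per-token scores for the CRF

    def forward(self, bert_out):               # bert_out: (B, T, emb_dim) from CSBERT
        x = bert_out.transpose(1, 2)           # (B, emb_dim, T) for Conv1d
        for conv in self.idcnn:
            x = torch.relu(conv(x))
        x, _ = self.bilstm(x.transpose(1, 2))  # (B, T, emb_dim) -> (B, T, 2 * hidden)
        return self.emissions(x)               # (B, T, n_tags) emission scores
```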

19 pages, 1069 KiB  
Article
PCNet: Leveraging Prototype Complementarity to Improve Prototype Affinity for Few-Shot Segmentation
by Jing-Yu Wang, Shang-Kun Liu, Shi-Cheng Guo, Cheng-Yu Jiang and Wei-Min Zheng
Electronics 2024, 13(1), 142; https://doi.org/10.3390/electronics13010142 - 28 Dec 2023
Cited by 2 | Viewed by 1162
Abstract
With the advent of large-scale datasets, significant advancements have been made in image semantic segmentation. However, annotating these datasets demands substantial human and financial resources, so the focus of research has shifted towards few-shot semantic segmentation, which leverages a small number of labeled samples to segment unknown categories effectively. Current mainstream methods use a meta-learning framework to achieve model generalization, with two main challenges. (1) The trained model is biased towards the seen classes, so it misactivates seen classes when segmenting an unseen class, making the idealized class-agnostic behavior difficult to achieve. (2) When the sample size is limited, an intra-class gap exists between the provided support images and the query images, significantly impacting the model's generalization capability. To solve these two problems, we propose a network with prototype complementarity characteristics (PCNet). Specifically, we first generate a self-support query prototype from the query image. Through self-distillation, the query prototype and the support prototype perform complementary feature learning, which effectively reduces the influence of the intra-class gap on model generalization. A standard semantic segmentation model is introduced to segment the seen classes during training, thereby accurately shielding irrelevant classes. After that, we extract a background prototype from the rough prediction map and use it to shield the background in the query image, obtaining more accurate fine-grained segmentation results. The proposed method exhibits superiority in extensive experiments on the PASCAL-5i and COCO-20i datasets. We achieve new state-of-the-art results in the few-shot semantic segmentation task, with mIoU values of 71.27% and 51.71%, respectively, in the 5-shot setting. Comprehensive ablation experiments and visualization studies show that the proposed method is highly effective for few-shot semantic segmentation.
(This article belongs to the Special Issue Recent Advances in Computer Vision: Technologies and Applications)
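
A hedged sketch of the background-shielding step described above (function names and the 0.5 threshold are my assumptions): a background prototype is pooled from the rough prediction and used to suppress background-like query locations.

```python
import torch
import torch.nn.functional as F

def shield_background(q_feat, rough_pred, fg_score, eps=1e-6):
    """q_feat: (C, H, W) query features; rough_pred: (H, W) rough foreground
    probability; fg_score: (H, W) foreground similarity map to refine."""
    bg_mask = (rough_pred < 0.5).float()                                   # rough background
    bg_proto = (q_feat * bg_mask).sum(dim=(1, 2)) / (bg_mask.sum() + eps)  # (C,)
    bg_sim = F.cosine_similarity(q_feat, bg_proto[:, None, None], dim=0)   # (H, W)
    return fg_score * (1 - bg_sim.clamp(min=0))  # suppress background-like pixels
```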

34 pages, 3055 KiB  
Review
Deep Learning Methods for Semantic Segmentation in Remote Sensing with Small Data: A Survey
by Anzhu Yu, Yujun Quan, Ru Yu, Wenyue Guo, Xin Wang, Danyang Hong, Haodi Zhang, Junming Chen, Qingfeng Hu and Peipei He
Remote Sens. 2023, 15(20), 4987; https://doi.org/10.3390/rs15204987 - 16 Oct 2023
Cited by 11 | Viewed by 5258
Abstract
The annotations used during training are crucial for the inference results of deep learning frameworks on remote sensing images (RSIs). Unlabeled RSIs can be obtained relatively easily, but pixel-level annotation requires a high level of expertise and experience. Consequently, small-sample training methods have attracted widespread attention because they alleviate the reliance of current deep learning methods on large amounts of high-quality labeled data. Moreover, research on small-sample learning is still in its infancy owing to the unique challenges of semantic segmentation with RSIs. To better understand and stimulate future research on RSI semantic segmentation with small data, we summarize the supervised learning methods and the challenges they face. We also review the currently popular approaches to learning from limited data, to help elucidate how a limited number of samples can be used efficiently for semantic segmentation of RSIs. The main methods discussed are self-supervised learning, semi-supervised learning, weakly supervised learning, and few-shot methods; solutions to cross-domain challenges are also discussed. Furthermore, we identify multi-modal methods, prior-knowledge-constrained methods, and the future research required to optimize deep learning models for various downstream RSI tasks.

21 pages, 11351 KiB  
Article
Learn to Few-Shot Segment Remote Sensing Images from Irrelevant Data
by Qingwei Sun, Jiangang Chao, Wanhong Lin, Zhenying Xu, Wei Chen and Ning He
Remote Sens. 2023, 15(20), 4937; https://doi.org/10.3390/rs15204937 - 12 Oct 2023
Cited by 4 | Viewed by 1395
Abstract
Few-shot semantic segmentation (FSS) aims to segment new classes with only a few labels. Generally, FSS assumes that base classes and novel classes belong to the same domain, which limits its application in a wide range of areas; in particular, since annotation is time-consuming, it is not cost-effective to process remote sensing images using FSS. To address this issue, we designed a feature transformation network (FTNet) for learning to few-shot segment remote sensing images from irrelevant data (FSS-RSI). The main idea is to train networks on irrelevant, already-labeled data but perform inference on remote sensing images; in other words, the training and testing data belong to neither the same domain nor the same categories. FTNet contains two main modules: a feature transformation module (FTM), which transforms features into a domain-agnostic high-level anchor, and a hierarchical transformer module (HTM), which hierarchically enhances matching between support and query features. Moreover, to promote the development of FSS-RSI, we established a new benchmark for other researchers to use. Our experiments demonstrate that our model outperforms the cutting-edge few-shot semantic segmentation method by 25.39% and 21.31% in the one-shot and five-shot settings, respectively.
(This article belongs to the Special Issue Remote Sensing Image Classification and Semantic Segmentation)

18 pages, 1098 KiB  
Article
CLIP-Driven Prototype Network for Few-Shot Semantic Segmentation
by Shi-Cheng Guo, Shang-Kun Liu, Jing-Yu Wang, Wei-Min Zheng and Cheng-Yu Jiang
Entropy 2023, 25(9), 1353; https://doi.org/10.3390/e25091353 - 18 Sep 2023
Cited by 3 | Viewed by 3475
Abstract
Recent research has shown that visual–text pretrained models perform well in traditional vision tasks. CLIP, as the most influential such work, has garnered significant attention from researchers, and thanks to its excellent visual representation capabilities, many recent studies have used CLIP for pixel-level tasks. We explore the potential of CLIP in the field of few-shot segmentation. The current mainstream approach is to utilize support and query features to generate class prototypes and then use the prototype features to match image features. We propose a new method that utilizes CLIP to extract text features for a specific class; these text features are then used as training samples in the model's training process. The addition of text features enables the model to extract features containing richer semantic information, making it easier to capture potential class information. To better match the query image features, we also propose a new prototype generation method that incorporates multi-modal fusion features of text and images. Adaptive query prototypes are generated by combining foreground and background information from the images with the multi-modal support prototype, allowing better matching of image features and improved segmentation accuracy. We provide a new perspective on the task of few-shot segmentation in multi-modal scenarios. Experiments demonstrate that our proposed method achieves excellent results on two common datasets, PASCAL-5i and COCO-20i.
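
A minimal sketch of folding a CLIP text feature into a support prototype; the prompt template and the simple normalized-sum fusion are my assumptions (the paper's fusion is more elaborate), and the visual prototype is assumed to be already projected into CLIP's joint embedding space.

```python
import clip
import torch
import torch.nn.functional as F

model, _ = clip.load("ViT-B/32", device="cpu")  # pretrained CLIP text/image encoders

def multimodal_prototype(visual_proto, class_name):
    """visual_proto: (D,) masked-average-pooled support prototype in CLIP space."""
    tokens = clip.tokenize([f"a photo of a {class_name}"])  # hypothetical prompt
    with torch.no_grad():
        text_feat = model.encode_text(tokens)[0].float()    # (D,) class text feature
    text_feat = F.normalize(text_feat, dim=0)
    visual_proto = F.normalize(visual_proto, dim=0)
    return F.normalize(text_feat + visual_proto, dim=0)     # fused multi-modal prototype
```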

18 pages, 2852 KiB  
Article
Self-Enhanced Mixed Attention Network for Three-Modal Images Few-Shot Semantic Segmentation
by Kechen Song, Yiming Zhang, Yanqi Bao, Ying Zhao and Yunhui Yan
Sensors 2023, 23(14), 6612; https://doi.org/10.3390/s23146612 - 22 Jul 2023
Cited by 3 | Viewed by 1773
Abstract
As an important computer vision technique, image segmentation has been widely used in various tasks. However, in some extreme cases insufficient illumination greatly degrades model performance, so more and more fully supervised methods use multi-modal images as their input. Large, densely annotated datasets are difficult to obtain, but few-shot methods can still achieve satisfactory results with few pixel-annotated samples. Therefore, we propose a few-shot semantic segmentation method for Visible-Depth-Thermal (three-modal) images. It utilizes both the homogeneous information of the three-modal images and the complementary information of the different modalities, which improves the performance of few-shot segmentation tasks. We constructed a novel indoor dataset, VDT-2048-5i, for this task, and we propose a Self-Enhanced Mixed Attention Network (SEMANet), which consists of a Self-Enhanced (SE) module and a Mixed Attention (MA) module. The SE module amplifies the differences between different kinds of features and strengthens weak connections for foreground features, while the MA module fuses the three-modal features into a better representation. Compared with the most advanced previous methods, our model improves mIoU by 3.8% and 3.3% in the 1-shot and 5-shot settings, respectively, achieving state-of-the-art performance. In the future, we will address failure cases by obtaining more discriminative and robust feature representations, and explore achieving high performance with fewer parameters and computational costs.
(This article belongs to the Special Issue Multi-Modal Image Processing Methods, Systems, and Applications)
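
For orientation only, here is a deliberately naive stand-in for three-modal fusion (the paper's SE and MA modules are considerably more elaborate): concatenate per-modality features and let a 1x1 convolution learn the mixing.

```python
import torch
import torch.nn as nn

class NaiveTriModalFusion(nn.Module):
    def __init__(self, channels=256):
        super().__init__()
        self.mix = nn.Conv2d(3 * channels, channels, kernel_size=1)  # learned mixing

    def forward(self, rgb, depth, thermal):  # each modality: (B, C, H, W)
        return torch.relu(self.mix(torch.cat([rgb, depth, thermal], dim=1)))
```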

18 pages, 9491 KiB  
Article
An Environmental Pattern Recognition Method for Traditional Chinese Settlements Using Deep Learning
by Yueping Kong, Peng Xue, Yuqian Xu and Xiaolong Li
Appl. Sci. 2023, 13(8), 4778; https://doi.org/10.3390/app13084778 - 11 Apr 2023
Cited by 4 | Viewed by 2214
Abstract
The recognition of environmental patterns for traditional Chinese settlements (TCSs) is a crucial task in rural planning. Traditionally, this task relies primarily on manual operations, which are inefficient and time-consuming. In this paper, we study the use of deep learning techniques to automatically recognize environmental patterns in TCSs from environmental features learned from remote sensing images and digital elevation models. Specifically, due to the lack of available datasets, a new TCS dataset was created featuring five representative environmental patterns. We benchmark several representative CNNs on the new dataset, finding that overfitting and geographical discrepancies largely account for the low classification performance. Consequently, we employ a semantic segmentation model to extract the dominant elements of the input data and utilize a metric-based meta-learning method that enables few-shot recognition of TCS samples in new areas by comparing their similarities. Extensive experiments on the newly created dataset validate the effectiveness of the proposed method, which significantly improves the generalization ability and performance of the baselines. In sum, the proposed method can automatically recognize TCS samples in new areas, providing a powerful and reliable tool for environmental pattern research in TCSs.
(This article belongs to the Section Computing and Artificial Intelligence)

17 pages, 21429 KiB  
Article
MCEENet: Multi-Scale Context Enhancement and Edge-Assisted Network for Few-Shot Semantic Segmentation
by Hongjie Zhou, Rufei Zhang, Xiaoyu He, Nannan Li, Yong Wang and Sheng Shen
Sensors 2023, 23(6), 2922; https://doi.org/10.3390/s23062922 - 8 Mar 2023
Cited by 8 | Viewed by 2136
Abstract
Few-shot semantic segmentation has attracted much attention because it requires only a few labeled samples to achieve good segmentation performance. However, existing methods still suffer from insufficient contextual information and unsatisfactory edge segmentation results. To overcome these two issues, this paper proposes a multi-scale context enhancement and edge-assisted network (called MCEENet) for few-shot semantic segmentation. First, rich support and query image features are extracted using two weight-shared feature extraction networks, each consisting of a ResNet and a Vision Transformer. Subsequently, a multi-scale context enhancement (MCE) module fuses the ResNet and Vision Transformer features and further mines the contextual information of the image using cross-scale feature fusion and multi-scale dilated convolutions. Furthermore, we design an Edge-Assisted Segmentation (EAS) module, which fuses the shallow ResNet features of the query image with edge features computed by the Sobel operator to assist the final segmentation task. Experiments on the PASCAL-5i dataset demonstrate the effectiveness of MCEENet: the 1-shot and 5-shot results are 63.5% and 64.7%, surpassing the previous state-of-the-art results by 1.4% and 0.6%, respectively.
(This article belongs to the Section Sensing and Imaging)
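
The Sobel edge features consumed by the EAS module can be computed with two fixed 3x3 kernels; the sketch below shows that standard step only (the fusion with shallow ResNet features is not reproduced).

```python
import torch
import torch.nn.functional as F

def sobel_edges(gray):
    """gray: (B, 1, H, W) grayscale query image -> (B, 1, H, W) edge magnitude."""
    kx = torch.tensor([[-1., 0., 1.],
                       [-2., 0., 2.],
                       [-1., 0., 1.]]).view(1, 1, 3, 3)  # horizontal gradient
    ky = kx.transpose(2, 3)                              # vertical gradient
    gx = F.conv2d(gray, kx, padding=1)
    gy = F.conv2d(gray, ky, padding=1)
    return torch.sqrt(gx ** 2 + gy ** 2 + 1e-12)         # gradient magnitude
```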

14 pages, 2624 KiB  
Article
Multi-Scale and Multi-Match for Few-Shot Plant Disease Image Semantic Segmentation
by Wenji Yang, Wenchao Hu, Liping Xie and Zhenji Yang
Agronomy 2022, 12(11), 2847; https://doi.org/10.3390/agronomy12112847 - 15 Nov 2022
Cited by 2 | Viewed by 2087
Abstract
Deep convolutional neural networks have achieved great success in semantic segmentation tasks, but existing methods require a large number of annotated images for training and do not scale well to new objects. Therefore, few-shot semantic segmentation methods that can identify new objects with only one or a few annotated images are gradually gaining attention. However, current few-shot segmentation methods cannot segment plant diseases well. Given this situation, a few-shot plant disease semantic segmentation model with multi-scale and multi-prototype matching (MPM) is proposed. The method generates multiple prototypes and multiple query feature maps and then establishes the relationships between them. Specifically, support and query features are first extracted from the high-scale layers of the feature extraction network; masked average pooling is then applied to the support features to generate prototypes for a similarity match with the query features. At the same time, low-scale and high-scale features are fused to generate another pair of support and query features that mix detailed features, and a new prototype is generated through masked average pooling to establish a relationship with the query features at this scale. Subsequently, to overcome the lack of spatial distance awareness in traditional cosine similarity, a CES (cosine-Euclidean similarity) module is designed to relate prototypes and query feature maps. To verify the superiority of our method, experiments are conducted on our constructed PDID-5i dataset; the mIoU is 40.5%, which is 1.7% higher than that of the original network.
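
A hedged sketch of a CES-style score that adds distance awareness to plain cosine similarity; the mapping of Euclidean distance to a similarity and the mixing weight `beta` are my assumptions, not the paper's exact formula.

```python
import torch
import torch.nn.functional as F

def ces_score(q_feat, proto, beta=0.5):
    """q_feat: (C, H, W) query feature map; proto: (C,) support prototype."""
    cos = F.cosine_similarity(q_feat, proto[:, None, None], dim=0)  # (H, W) angular term
    dist = torch.norm(q_feat - proto[:, None, None], dim=0)         # (H, W) Euclidean term
    euc = 1.0 / (1.0 + dist)              # map distance into a (0, 1] similarity
    return beta * cos + (1 - beta) * euc  # combined cosine-Euclidean evidence
```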

14 pages, 1139 KiB  
Article
Multilevel Features-Guided Network for Few-Shot Segmentation
by Chenjing Xin, Xinfu Li and Yunfeng Yuan
Electronics 2022, 11(19), 3195; https://doi.org/10.3390/electronics11193195 - 5 Oct 2022
Viewed by 1617
Abstract
The purpose of few-shot semantic segmentation is to segment unseen classes with only a few labeled samples. However, most methods ignore the guidance of low-level features for segmentation, leading to unsatisfactory results. Therefore, we propose a multilevel features-guided network built on convolutional neural network techniques that fully utilizes features from every level. It includes two novel designs: (1) a similarity-guided feature reinforcement module (SRM), which uses features from different levels to enable sufficient guidance from the support set to the query set, avoiding the loss of feature information in deep network computation; and (2) a method that bridges query features at each level to the decoder to guide the segmentation, making full use of local and edge information to improve model performance. Experiments on the PASCAL-5i and COCO-20i datasets demonstrate the effectiveness of the model: the 1-shot and 5-shot results on PASCAL-5i are 64.7% and 68.0%, which are 3.9% and 6.1% higher than the baseline model, respectively, and the results on COCO-20i are also improved.
