Search Results (3,786)

Search Parameters:
Keywords = semantic information

29 pages, 7411 KiB  
Article
Continuous Online Semantic Implicit Representation for Autonomous Ground Robot Navigation in Unstructured Environments
by Quentin Serdel, Julien Marzat and Julien Moras
Robotics 2024, 13(7), 108; https://doi.org/10.3390/robotics13070108 - 18 Jul 2024
Viewed by 48
Abstract
While mobile ground robots now have the physical capacity to travel in challenging unstructured environments such as extraterrestrial surfaces or devastated terrains, their safe and efficient autonomous navigation must still be improved before they can be entrusted with complex unsupervised missions in such conditions. Recent advances in machine learning applied to semantic scene understanding and environment representations, coupled with modern embedded computational means and sensors, hold promise in this regard. This paper therefore combines semantic understanding, continuous implicit environment representation and smooth informed path-planning in a new method named COSMAu-Nav. It is specifically dedicated to autonomous ground robot navigation in unstructured environments and is adaptable for embedded, real-time usage without requiring any form of telecommunication. Data clustering and Gaussian processes are employed to perform online regression of the environment topography, occupancy and terrain traversability from 3D semantic point clouds while providing uncertainty modelling. The continuous and differentiable properties of Gaussian processes allow gradient-based optimisation to be used for smooth local path-planning with respect to the terrain properties. The proposed pipeline has been evaluated and compared with two reference 3D semantic mapping methods in terms of quality of representation under localisation and semantic segmentation uncertainty, using a Gazebo simulation derived from the 3DRMS dataset. Its computational requirements have been evaluated using the Rellis-3D real-world dataset. It has been implemented on a real ground robot and successfully employed for its autonomous navigation in a previously unknown outdoor environment.
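The Gaussian-process terrain regression this abstract describes can be illustrated with a minimal sketch. This is not the authors' COSMAu-Nav implementation; it only shows how a GP fitted to scattered elevation samples yields a continuous, differentiable surface together with an uncertainty estimate, using scikit-learn. The kernel choice and the synthetic data are assumptions.

```python
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF, WhiteKernel

# Scattered (x, y) -> elevation samples, standing in for points
# extracted from a 3D semantic point cloud (synthetic data).
rng = np.random.default_rng(0)
xy = rng.uniform(-5.0, 5.0, size=(200, 2))
z = np.sin(xy[:, 0]) * np.cos(xy[:, 1]) + rng.normal(0.0, 0.05, 200)

# An RBF kernel gives a smooth, differentiable posterior mean; the
# white-noise term models sensor/localisation uncertainty.
kernel = 1.0 * RBF(length_scale=1.0) + WhiteKernel(noise_level=0.05)
gp = GaussianProcessRegressor(kernel=kernel, normalize_y=True).fit(xy, z)

# Query the continuous representation anywhere, with uncertainty.
query = np.array([[0.5, -1.0], [3.0, 2.0]])
mean, std = gp.predict(query, return_std=True)
print(mean, std)  # elevation estimate and its standard deviation
```

Because the posterior mean is smooth in the query coordinates, a gradient-based local planner can differentiate through it, which is the property the paper exploits.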
20 pages, 5228 KiB  
Article
Remote Sensing Image Change Detection Based on Deep Learning: Multi-Level Feature Cross-Fusion with 3D-Convolutional Neural Networks
by Sibo Yu, Chen Tao, Guang Zhang, Yubo Xuan and Xiaodong Wang
Appl. Sci. 2024, 14(14), 6269; https://doi.org/10.3390/app14146269 - 18 Jul 2024
Viewed by 125
Abstract
Change detection (CD) in high-resolution remote sensing imagery remains challenging due to the complex nature of objects and varying spectral characteristics across different times and locations. Convolutional neural networks (CNNs) have shown promising performance in CD tasks by extracting meaningful semantic features. However, traditional 2D-CNNs may struggle to accurately integrate deep features from multi-temporal images, limiting their ability to improve CD accuracy. This study proposes a Multi-level Feature Cross-Fusion (MFCF) network with 3D-CNNs for remote sensing image change detection. The network aims to effectively extract and fuse deep features from multi-temporal images to identify surface changes. To bridge the semantic gap between high-level and low-level features, an MFCF module is introduced. A channel attention mechanism (CAM) is also integrated to enhance model performance, interpretability, and generalization capabilities. The proposed methodology is validated on the LEVIR construction dataset (LEVIR-CD). The experimental results demonstrate superior performance compared to the current state of the art on evaluation metrics including recall, F1 score, and IoU. The MFCF network, which combines 3D-CNNs and a CAM, effectively utilizes multi-temporal information and deep feature fusion, resulting in precise and reliable change detection in remote sensing imagery. This study contributes to the advancement of change detection methods, facilitating more efficient management and decision making across domains such as urban planning, natural resource management, and environmental monitoring.
(This article belongs to the Special Issue Advances in Image Recognition and Processing Technologies)
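As a rough sketch of the two ingredients the abstract names, the block below combines a 3D convolution over a stacked bi-temporal image pair with a squeeze-and-excitation-style channel attention module in PyTorch. The layer sizes and the fusion topology are assumptions; the paper's MFCF module is more elaborate.

```python
import torch
import torch.nn as nn

class ChannelAttention(nn.Module):
    """Squeeze-and-excitation-style channel attention (illustrative)."""
    def __init__(self, channels: int, reduction: int = 4):
        super().__init__()
        self.fc = nn.Sequential(
            nn.AdaptiveAvgPool3d(1), nn.Flatten(),
            nn.Linear(channels, channels // reduction), nn.ReLU(),
            nn.Linear(channels // reduction, channels), nn.Sigmoid(),
        )
    def forward(self, x):
        w = self.fc(x).view(x.size(0), x.size(1), 1, 1, 1)
        return x * w  # reweight channels by learned importance

class BiTemporal3DBlock(nn.Module):
    """3D convolution over the temporal axis of a stacked image pair."""
    def __init__(self, in_ch: int = 3, out_ch: int = 16):
        super().__init__()
        self.conv = nn.Conv3d(in_ch, out_ch, kernel_size=(2, 3, 3),
                              padding=(0, 1, 1))
        self.att = ChannelAttention(out_ch)
    def forward(self, t1, t2):
        x = torch.stack([t1, t2], dim=2)   # (B, C, T=2, H, W)
        f = torch.relu(self.conv(x))       # fuse the two dates: (B, out, 1, H, W)
        return self.att(f).squeeze(2)      # (B, out, H, W)

block = BiTemporal3DBlock()
a, b = torch.randn(1, 3, 64, 64), torch.randn(1, 3, 64, 64)
print(block(a, b).shape)  # torch.Size([1, 16, 64, 64])
```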

16 pages, 4388 KiB  
Article
CellGAN: Generative Adversarial Networks for Cellular Microscopy Image Recognition with Integrated Feature Completion Mechanism
by Xiangle Liao and Wenlong Yi
Appl. Sci. 2024, 14(14), 6266; https://doi.org/10.3390/app14146266 - 18 Jul 2024
Viewed by 136
Abstract
In response to the challenges of high noise, high adhesion, and a low signal-to-noise ratio in microscopic cell images, as well as the difficulty existing deep learning models such as UNet, ResUNet, and SwinUNet have in producing segmentations with clear boundaries at high resolution, this study proposes CellGAN, a semantic segmentation method based on a generative adversarial network with a Feature Completion Mechanism. The method incorporates a Transformer to supplement long-range semantic information. In the self-attention module of the Transformer generator, bilinear interpolation for feature completion is introduced, reducing the computational complexity of self-attention to O(n). Additionally, two-dimensional relative positional encoding is employed in the self-attention mechanism to supplement positional information and facilitate position recovery. Experimental results demonstrate that this method outperforms ResUNet and SwinUNet in segmentation performance on the rice leaf cell, MuNuSeg, and Nucleus datasets, achieving improvements of up to 23.45% and 19.90% in the Intersection over Union and Similarity metrics, respectively. The method provides an automated and efficient analytical tool for cell biology, enabling more accurate segmentation of cell images and contributing to a deeper understanding of cellular structure and function.
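The O(n) self-attention idea, with keys and values compressed by bilinear interpolation so that attention is computed against a fixed number of tokens, can be sketched as follows. This is a generic linearised-attention pattern under assumed dimensions, not CellGAN's actual module.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class InterpolatedAttention(nn.Module):
    """Self-attention whose K/V maps are bilinearly resized to a fixed
    m x m grid, so the cost is O(n * m^2) = O(n) for constant m."""
    def __init__(self, dim: int, m: int = 8):
        super().__init__()
        self.q = nn.Conv2d(dim, dim, 1)
        self.k = nn.Conv2d(dim, dim, 1)
        self.v = nn.Conv2d(dim, dim, 1)
        self.m = m
        self.scale = dim ** -0.5

    def forward(self, x):                                   # (B, C, H, W)
        B, C, H, W = x.shape
        q = self.q(x).flatten(2).transpose(1, 2)            # (B, HW, C)
        k = F.interpolate(self.k(x), size=(self.m, self.m),
                          mode="bilinear", align_corners=False)
        v = F.interpolate(self.v(x), size=(self.m, self.m),
                          mode="bilinear", align_corners=False)
        k = k.flatten(2).transpose(1, 2)                    # (B, m*m, C)
        v = v.flatten(2).transpose(1, 2)
        att = torch.softmax(q @ k.transpose(1, 2) * self.scale, dim=-1)
        out = (att @ v).transpose(1, 2).reshape(B, C, H, W)
        return out

attn = InterpolatedAttention(32)
print(attn(torch.randn(1, 32, 64, 64)).shape)  # torch.Size([1, 32, 64, 64])
```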

18 pages, 5494 KiB  
Article
Design and Implementation of Time Metrology Vocabulary Ontology
by Mingxin Du, Boyong Gao, Shuaizhe Wang, Zilong Liu, Xingchuang Xiong and Yuqi Luo
Electronics 2024, 13(14), 2828; https://doi.org/10.3390/electronics13142828 - 18 Jul 2024
Viewed by 118
Abstract
The advent of the digital era has created an urgent need for the digitization of metrology, and digitizing metrology vocabularies is one of the fundamental steps in achieving that transformation. A metrology vocabulary ontology can facilitate the exchange and sharing of data and is an important route to digitizing the metrology vocabulary. The time metrology vocabulary is a special and important part of the whole metrology vocabulary, and constructing its ontology can reduce problems caused by semantic confusion, help metrological work proceed smoothly, and promote the digital transformation of metrology. Currently, the main existing ontology for metrology vocabulary is MetrOnto, but it lacks a systematic description of the vocabulary of time metrology. To address this issue, improve the metrology vocabulary ontology, and lay the groundwork for the digital transformation of metrology, this paper takes the time metrology vocabulary as its research object; proposes a classification principle that meets the inherent requirements of time transfer in the digital world; adopts the seven-step method of ontology construction to build an ontology specialized in time metrology vocabulary, OTMV (Ontology of Time Metrology Vocabulary); and subjects it to an ontology consistency check, machine-readability validation, primary machine-understandability validation, and information retrieval validation. The validation results show that OTMV is syntactically correct and logically consistent, is machine-readable and machine-understandable, and supports information retrieval. The ontology provides a systematic description of the time metrology vocabulary that can address the problem of expressing it in the digital world and lays the foundation for the digitization, readability, understandability, and sharing of the metrology vocabulary.
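The flavour of such a machine-readable vocabulary ontology can be sketched with rdflib. The namespace, class, and property names below are invented stand-ins, not the actual OTMV terms.

```python
from rdflib import Graph, Literal, Namespace
from rdflib.namespace import OWL, RDF, RDFS

# Hypothetical namespace and terms; the real OTMV IRIs differ.
OTMV = Namespace("http://example.org/otmv#")
g = Graph()
g.bind("otmv", OTMV)

# A small class hierarchy for time-metrology terms.
g.add((OTMV.TimeMetrologyTerm, RDF.type, OWL.Class))
g.add((OTMV.TimeScale, RDF.type, OWL.Class))
g.add((OTMV.TimeScale, RDFS.subClassOf, OTMV.TimeMetrologyTerm))
g.add((OTMV.UTC, RDF.type, OTMV.TimeScale))
g.add((OTMV.UTC, RDFS.label, Literal("Coordinated Universal Time")))

# A datatype property linking a term to its definition text.
g.add((OTMV.hasDefinition, RDF.type, OWL.DatatypeProperty))
g.add((OTMV.UTC, OTMV.hasDefinition,
       Literal("Time scale maintained by the BIPM (illustrative text).")))

print(g.serialize(format="turtle"))  # machine-readable Turtle output
```

Serializing to Turtle or RDF/XML is what makes the vocabulary machine-readable; consistency checking of the kind the paper reports is typically done by loading such a file into an OWL reasoner.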

23 pages, 7788 KiB  
Article
A Novel Mamba Architecture with a Semantic Transformer for Efficient Real-Time Remote Sensing Semantic Segmentation
by Hao Ding, Bo Xia, Weilin Liu, Zekai Zhang, Jinglin Zhang, Xing Wang and Sen Xu
Remote Sens. 2024, 16(14), 2620; https://doi.org/10.3390/rs16142620 - 17 Jul 2024
Viewed by 256
Abstract
Real-time remote sensing segmentation technology is crucial for unmanned aerial vehicles (UAVs) in battlefield surveillance, land characterization observation, earthquake disaster assessment, and related applications, and it can significantly enhance the application value of UAVs in military and civilian fields. To realize this potential, it is essential to develop real-time semantic segmentation methods that can be deployed on resource-limited platforms, such as edge devices. The majority of mainstream real-time semantic segmentation methods rely on convolutional neural networks (CNNs) and transformers. However, CNNs cannot effectively capture long-range dependencies, while transformers have high computational complexity. This paper proposes a novel remote sensing Mamba architecture for real-time segmentation tasks in remote sensing, named RTMamba. Specifically, the backbone utilizes a Visual State-Space (VSS) block to extract deep features while maintaining linear computational complexity, thereby capturing long-range contextual information. Additionally, a novel Inverted Triangle Pyramid Pooling (ITP) module is incorporated into the decoder. The ITP module effectively filters redundant feature information and enhances the perception of objects and their boundaries in remote sensing images. Extensive experiments were conducted on three challenging aerial remote sensing segmentation benchmarks: Vaihingen, Potsdam, and LoveDA. The results show that RTMamba achieves competitive segmentation accuracy and inference speed compared to state-of-the-art CNN and transformer methods. To further validate the model's deployment potential on embedded devices with limited resources, such as UAVs, tests were conducted on the Jetson AGX Orin edge device. The experimental results demonstrate that RTMamba achieves impressive real-time segmentation performance.
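The decoder's pyramid-pooling idea can be illustrated generically. The sketch below is ordinary multi-scale average pooling with learned fusion, in the PSPNet style, under assumed sizes; the paper's ITP module and its Mamba/VSS backbone are more involved and are not reproduced here.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class PyramidPooling(nn.Module):
    """Multi-scale pooling plus 1x1 fusion (a generic stand-in for ITP)."""
    def __init__(self, in_ch: int, scales=(1, 2, 4)):
        super().__init__()
        branch_ch = in_ch // len(scales)
        self.branches = nn.ModuleList(
            nn.Sequential(nn.AdaptiveAvgPool2d(s),
                          nn.Conv2d(in_ch, branch_ch, 1))
            for s in scales)
        self.fuse = nn.Conv2d(in_ch + branch_ch * len(scales), in_ch, 1)

    def forward(self, x):
        h, w = x.shape[-2:]
        # Pool at several grid sizes, project, and upsample back.
        feats = [x] + [F.interpolate(b(x), size=(h, w), mode="bilinear",
                                     align_corners=False)
                       for b in self.branches]
        return self.fuse(torch.cat(feats, dim=1))

pp = PyramidPooling(48)
print(pp(torch.randn(1, 48, 32, 32)).shape)  # torch.Size([1, 48, 32, 32])
```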

17 pages, 10905 KiB  
Article
Complementary-View SAR Target Recognition Based on One-Shot Learning
by Benteng Chen, Zhengkang Zhou, Chunyu Liu and Jia Zheng
Remote Sens. 2024, 16(14), 2610; https://doi.org/10.3390/rs16142610 - 17 Jul 2024
Viewed by 183
Abstract
The coherent speckle noise in SAR images easily interferes with the semantic information of the target, and the limited supervisory information available in one-shot learning leads to poor performance. To address these issues, we propose an SAR target recognition model based on one-shot learning. The model incorporates a background noise removal technique to eliminate the interference caused by speckle noise in the image. A global and local complementary strategy is then employed to use the data's inherent a priori information as a supplement to the supervisory information. The experimental results show that our approach achieves a recognition performance of 70.867% under the three-way one-shot condition, an improvement of at least 7.467% over five state-of-the-art one-shot learning methods. Ablation studies demonstrate the efficacy of each design introduced in our model.
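Speckle suppression as a preprocessing step can be approximated with a classical Lee filter. The implementation below is a standard textbook version run on synthetic data, not the paper's background-removal technique.

```python
import numpy as np
from scipy.ndimage import uniform_filter

def lee_filter(img: np.ndarray, size: int = 5) -> np.ndarray:
    """Classical Lee speckle filter (textbook form, illustrative)."""
    mean = uniform_filter(img, size)
    sq_mean = uniform_filter(img ** 2, size)
    var = np.maximum(sq_mean - mean ** 2, 0.0)
    # Crude noise-variance estimate: the mean local variance.
    noise_var = var.mean()
    weight = var / (var + noise_var + 1e-12)
    return mean + weight * (img - mean)  # flat areas -> mean, edges kept

# Synthetic "SAR" patch: smooth structure times multiplicative speckle.
rng = np.random.default_rng(0)
clean = np.outer(np.linspace(1, 2, 64), np.linspace(1, 2, 64))
speckled = clean * rng.gamma(shape=4.0, scale=0.25, size=clean.shape)
print(np.std(speckled - clean), np.std(lee_filter(speckled) - clean))
```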

16 pages, 6425 KiB  
Article
A Robust AR-DSNet Tracking Registration Method in Complex Scenarios
by Xiaomei Lei, Wenhuan Lu, Jiu Yong and Jianguo Wei
Electronics 2024, 13(14), 2807; https://doi.org/10.3390/electronics13142807 - 17 Jul 2024
Viewed by 197
Abstract
We propose AR-DSNet (an Augmented Reality method based on DSST and SiamFC networks), a robust tracking registration method for complex scenarios. It improves the ability of AR (Augmented Reality) tracking registration to distinguish the target foreground from semantically interfering background, and it addresses registration failures caused by drift onto similar targets when scale information is obtained from predicted target positions. First, the pre-trained network in SiamFC (Siamese Fully-Convolutional) is used to obtain the response map of a larger search area, and a threshold is set to filter out initial candidate positions of the target. Then, exploiting the ability of the DSST (Discriminative Scale Space Tracking) filter tracker to update its template online, a new scale filter is trained on multi-scale images collected at the candidate positions to infer changes in target scale, and linear interpolation is used to update the correlation coefficient, determining the final tracked position from the difference between two frames. Finally, ORB (Oriented FAST and Rotated BRIEF) feature detection and matching are performed on the image at the accurate target position, and the registration matrix is computed from the matching relationships to overlay the virtual model onto the real scene, thereby augmenting the real world. Simulation experiments show that in complex scenarios such as similar-object interference, target occlusion, and local deformation, the proposed AR-DSNet method completes target registration for AR 3D tracking, ensuring real-time performance while improving the robustness of the AR tracking registration algorithm.
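The final registration step, ORB matching followed by homography estimation, is standard OpenCV. A minimal sketch follows; the image paths and parameter values are placeholders, and the tracking stages that precede this step are omitted.

```python
import cv2
import numpy as np

# Reference template of the target and the current camera frame
# (file paths are illustrative placeholders).
template = cv2.imread("target_template.png", cv2.IMREAD_GRAYSCALE)
frame = cv2.imread("current_frame.png", cv2.IMREAD_GRAYSCALE)

orb = cv2.ORB_create(nfeatures=1000)
kp1, des1 = orb.detectAndCompute(template, None)
kp2, des2 = orb.detectAndCompute(frame, None)

# Hamming distance suits binary ORB descriptors; cross-check prunes
# asymmetric matches.
matcher = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True)
matches = sorted(matcher.match(des1, des2), key=lambda m: m.distance)[:50]

src = np.float32([kp1[m.queryIdx].pt for m in matches]).reshape(-1, 1, 2)
dst = np.float32([kp2[m.trainIdx].pt for m in matches]).reshape(-1, 1, 2)

# RANSAC homography: the registration matrix used to overlay the
# virtual model onto the tracked target region.
H, inliers = cv2.findHomography(src, dst, cv2.RANSAC, 5.0)
print(H)
```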

22 pages, 23824 KiB  
Article
DEDNet: Dual-Encoder DeeplabV3+ Network for Rock Glacier Recognition Based on Multispectral Remote Sensing Image
by Lujun Lin, Lei Liu, Ming Liu, Qunjia Zhang, Min Feng, Yasir Shaheen Khalil and Fang Yin
Remote Sens. 2024, 16(14), 2603; https://doi.org/10.3390/rs16142603 - 16 Jul 2024
Viewed by 226
Abstract
Understanding the distribution of rock glaciers provides key information for investigating and recognizing the status and changes of the cryosphere environment. Deep learning algorithms and the red-green-blue (RGB) bands of high-resolution satellite images have been extensively employed to map rock glaciers. However, the near-infrared (NIR) band offers rich spectral information and sharp edge features that could contribute significantly to semantic segmentation tasks, yet it is rarely used in rock glacier identification models because classical semantic segmentation networks, such as DeeplabV3+, are limited to three input bands. In this study, a dual-encoder DeeplabV3+ network (DEDNet) was designed to overcome this limitation of the classical DeeplabV3+ network (CDNet) when identifying rock glaciers in multispectral remote sensing images, extracting spatial and spectral features from the RGB and NIR bands, respectively. The network, trained with manually labeled rock glacier samples from the Qilian Mountains, achieved accuracy, precision, recall, specificity, and mIoU (mean intersection over union) of 0.9131, 0.9130, 0.9270, 0.9195, and 0.8601, respectively. The trained model was applied to identify new rock glaciers in a test region, achieving a producer's accuracy of 93.68% and a user's accuracy of 94.18%. Furthermore, the model performed with high accuracy in two further study areas, in northern Tien Shan (Kazakhstan) and Daxue Shan (Hengduan Shan, China), showing that DEDNet is robust across diverse geographic regions and offers a solution for mapping rock glaciers more accurately at larger scales.
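The dual-encoder idea, one branch for the RGB bands and one for NIR, fused before the decoder, reduces to the following pattern in PyTorch. The encoder depths and fusion rule are toy assumptions; the paper builds both branches on DeeplabV3+.

```python
import torch
import torch.nn as nn

def conv_block(in_ch: int, out_ch: int) -> nn.Sequential:
    return nn.Sequential(nn.Conv2d(in_ch, out_ch, 3, stride=2, padding=1),
                         nn.BatchNorm2d(out_ch), nn.ReLU())

class DualEncoder(nn.Module):
    """Separate RGB (spatial) and NIR (spectral) encoders fused by
    concatenation: a toy stand-in for DEDNet's two branches."""
    def __init__(self):
        super().__init__()
        self.rgb_enc = nn.Sequential(conv_block(3, 32), conv_block(32, 64))
        self.nir_enc = nn.Sequential(conv_block(1, 16), conv_block(16, 32))
        self.fuse = nn.Conv2d(64 + 32, 64, 1)

    def forward(self, rgb, nir):
        f = torch.cat([self.rgb_enc(rgb), self.nir_enc(nir)], dim=1)
        return self.fuse(f)  # fused features would feed the decoder

enc = DualEncoder()
out = enc(torch.randn(1, 3, 128, 128), torch.randn(1, 1, 128, 128))
print(out.shape)  # torch.Size([1, 64, 32, 32])
```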

17 pages, 7301 KiB  
Article
Vision-Based Situational Graphs Exploiting Fiducial Markers for the Integration of Semantic Entities
by Ali Tourani, Hriday Bavle, Deniz Işınsu Avşar, Jose Luis Sanchez-Lopez, Rafael Munoz-Salinas and Holger Voos
Robotics 2024, 13(7), 106; https://doi.org/10.3390/robotics13070106 - 16 Jul 2024
Viewed by 287
Abstract
Situational Graphs (S-Graphs) merge geometric models of the environment generated by Simultaneous Localization and Mapping (SLAM) approaches with 3D scene graphs into a multi-layered, jointly optimizable factor graph. S-Graphs not only offer more comprehensive robotic situational awareness by combining geometric maps with diverse, hierarchically organized semantic entities and their topological relationships within one graph, but they also improve localization and mapping performance at the SLAM level by exploiting semantic information. In this paper, we introduce a vision-based version of S-Graphs in which a conventional Visual SLAM (VSLAM) system is used for low-level feature tracking and mapping. In addition, the framework exploits fiducial markers (both visible and our recently introduced transparent or fully invisible markers) to encode comprehensive information about environments and the objects within them. The markers aid in identifying and mapping structural-level semantic entities, including walls and doors, with reliable poses in the global reference frame, subsequently establishing meaningful associations with higher-level entities such as corridors and rooms. Beyond adding semantic entities, the semantic and geometric constraints imposed by the fiducial markers are also used to improve the quality of the reconstructed map and to reduce localization errors. Experimental results on a real-world dataset collected using legged robots show that our framework excels at crafting a richer, multi-layered hierarchical map while enhancing robot pose accuracy.
(This article belongs to the Special Issue Localization and 3D Mapping of Intelligent Robotics)
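Detecting fiducial markers and recovering their poses, the low-level operation behind the semantic entities above, can be sketched with OpenCV's ArUco module. This assumes OpenCV 4.7 or later (the ArucoDetector API), a calibrated camera, and illustrative intrinsics and marker size; it is not the paper's pipeline.

```python
import cv2
import numpy as np

# Illustrative camera intrinsics and marker size (assumed values).
K = np.array([[800.0, 0.0, 320.0], [0.0, 800.0, 240.0], [0.0, 0.0, 1.0]])
dist = np.zeros(5)
MARKER_SIZE = 0.15  # marker edge length in metres

dictionary = cv2.aruco.getPredefinedDictionary(cv2.aruco.DICT_4X4_50)
detector = cv2.aruco.ArucoDetector(dictionary, cv2.aruco.DetectorParameters())

gray = cv2.imread("frame.png", cv2.IMREAD_GRAYSCALE)  # placeholder path
corners, ids, _ = detector.detectMarkers(gray)

# 3D corner coordinates of a square marker in its own frame
# (top-left, top-right, bottom-right, bottom-left).
s = MARKER_SIZE / 2
obj = np.array([[-s, s, 0], [s, s, 0], [s, -s, 0], [-s, -s, 0]],
               dtype=np.float32)

for c, marker_id in zip(corners or [], ids if ids is not None else []):
    ok, rvec, tvec = cv2.solvePnP(obj, c.reshape(-1, 2), K, dist)
    if ok:  # marker pose in the camera frame -> one semantic landmark
        print(int(marker_id), tvec.ravel())
```

In an S-Graphs-style system, each such pose becomes a factor connecting the camera trajectory to a wall or door node in the graph.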

15 pages, 925 KiB  
Article
Entity-Alignment Interaction Model Based on Chinese RoBERTa
by Ping Feng, Boning Zhang, Lin Yang and Shiyu Feng
Appl. Sci. 2024, 14(14), 6162; https://doi.org/10.3390/app14146162 - 15 Jul 2024
Viewed by 252
Abstract
Entity alignment aims to match entities with the same semantics across different knowledge graphs. Most existing studies use neural networks to combine graph-structure information with additional entity information (such as names, descriptions, images, and attributes) to achieve entity alignment. However, due to the heterogeneity of knowledge graphs, aligned entities often do not have the same neighbors, which makes it difficult to exploit the structural information of the graphs and reduces alignment accuracy. In this paper, we therefore propose an interaction model that exploits only the additional entity information. Our model uses the names, attributes, and neighbors of entities for interaction and introduces attention-based interaction to extract features and evaluate the matching scores between entities. The model is applicable to Chinese datasets, and experimental results show that it achieves good results on the Chinese medical dataset MED-BBK-9K.
(This article belongs to the Special Issue Natural Language Processing (NLP) and Applications—2nd Edition)
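A stripped-down version of name-based matching with a Chinese RoBERTa encoder might look as follows. The checkpoint name is a common public one assumed for illustration, the mean-pooling and cosine-similarity matching are simplifications, and the paper's attention-interaction layers are not reproduced.

```python
import torch
from transformers import AutoModel, AutoTokenizer

# A widely used Chinese RoBERTa checkpoint (assumed; the paper's exact
# model and interaction architecture differ).
name = "hfl/chinese-roberta-wwm-ext"
tok = AutoTokenizer.from_pretrained(name)
model = AutoModel.from_pretrained(name).eval()

def embed(texts):
    """Mean-pooled token embeddings as simple entity-name vectors."""
    batch = tok(texts, padding=True, return_tensors="pt")
    with torch.no_grad():
        hidden = model(**batch).last_hidden_state
    mask = batch["attention_mask"].unsqueeze(-1)
    return (hidden * mask).sum(1) / mask.sum(1)

kg1 = ["阿司匹林", "高血压"]      # entity names from one knowledge graph
kg2 = ["乙酰水杨酸", "高血压病"]   # candidate names from the other graph
sim = torch.nn.functional.cosine_similarity(
    embed(kg1).unsqueeze(1), embed(kg2).unsqueeze(0), dim=-1)
print(sim)  # score matrix; row-wise argmax gives a naive alignment
```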

20 pages, 28729 KiB  
Article
Unmanned Aerial Vehicle Object Detection Based on Information-Preserving and Fine-Grained Feature Aggregation
by Jiangfan Zhang, Yan Zhang, Zhiguang Shi, Yu Zhang and Ruobin Gao
Remote Sens. 2024, 16(14), 2590; https://doi.org/10.3390/rs16142590 - 15 Jul 2024
Viewed by 251
Abstract
General deep learning methods achieve high-level semantic feature representations by aggregating hierarchical features, an approach that performs well in object detection tasks. However, issues arise when these methods are applied to UAV-based remote sensing image object detection. First, common feature aggregation methods such as stride convolution may lose information from the input samples. Second, common FPN methods introduce conflicting information by directly fusing feature maps from different levels. These shortcomings limit detection performance on small and weak targets in remote sensing images. In response, we propose an unmanned aerial vehicle (UAV) object detection algorithm, IF-YOLO. Specifically, the algorithm leverages an Information-Preserving Feature Aggregation (IPFA) module to construct semantic feature representations while preserving the intrinsic features of small objects. Furthermore, to filter out the irrelevant information introduced by direct fusion, we introduce the Conflict Information Suppression Feature Fusion Module (CSFM) to improve the feature fusion approach. Additionally, the Fine-Grained Aggregation Feature Pyramid Network (FGAFPN) facilitates interaction between feature maps at different levels, reducing the generation of conflicting information during multi-scale feature fusion. Experimental results on the VisDrone2019 dataset demonstrate that, compared to the standard YOLOv8-s, our enhanced algorithm achieves a mean average precision (mAP) of 47.3%, with precision and recall improved by 6.3% and 5.6%, respectively.
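Information-preserving aggregation in the spirit of IPFA can be contrasted with strided downsampling using a space-to-depth rearrangement, which halves resolution without discarding any pixels. This is a generic sketch of that idea, not the paper's module.

```python
import torch
import torch.nn as nn

class SpaceToDepthDown(nn.Module):
    """Downsample by rearranging each 2x2 neighbourhood into channels,
    then mixing with a 1x1 conv. Every input value is retained, unlike
    strided sampling, which can drop the fine detail of small objects."""
    def __init__(self, in_ch: int, out_ch: int):
        super().__init__()
        self.s2d = nn.PixelUnshuffle(2)         # (B,C,H,W) -> (B,4C,H/2,W/2)
        self.mix = nn.Conv2d(4 * in_ch, out_ch, 1)

    def forward(self, x):
        return self.mix(self.s2d(x))

down = SpaceToDepthDown(32, 64)
print(down(torch.randn(1, 32, 64, 64)).shape)  # torch.Size([1, 64, 32, 32])
```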
14 pages, 4537 KiB  
Article
Multimodal Hateful Meme Classification Based on Transfer Learning and a Cross-Mask Mechanism
by Fan Wu, Guolian Chen, Junkuo Cao, Yuhan Yan and Zhongneng Li
Electronics 2024, 13(14), 2780; https://doi.org/10.3390/electronics13142780 - 15 Jul 2024
Viewed by 323
Abstract
Hateful memes are malicious, biased sentiment content spread widely on the internet. Detecting hateful memes differs from traditional multimodal tasks: in conventional tasks the visual and textual information align semantically, whereas in memes the images and text may be only weakly related or unrelated, requiring models to understand the content and perform multimodal reasoning. To address this, we introduce a multimodal fine-grained hateful memes detection model named TCAM. The model leverages advanced encoders from TweetEval and CLIP and introduces enhanced Cross-Attention and Cross-Mask Mechanisms (CAM) in the feature fusion stage to improve multimodal correlation. It embeds fine-grained features of the data and of image descriptions into the model through transfer learning. This paper uses the Area Under the Receiver Operating Characteristic curve (AUROC) as the primary metric for evaluating the model's discriminatory ability. The approach achieves an AUROC of 0.8362 and an accuracy of 0.764 on the Facebook Hateful Memes Challenge (FHMC) dataset, confirming its high discriminatory capability, and performs better than ensemble machine learning methods.
(This article belongs to the Special Issue Application of Data Mining in Social Media)
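Cross-attention fusion between text and image token sequences, the core of the fusion stage described above, reduces to a few lines of PyTorch. The dimensions and the single fusion direction shown are assumptions, not TCAM's actual configuration.

```python
import torch
import torch.nn as nn

class CrossAttentionFusion(nn.Module):
    """Text tokens attend to image tokens (one direction shown)."""
    def __init__(self, dim: int = 512, heads: int = 8):
        super().__init__()
        self.attn = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.norm = nn.LayerNorm(dim)

    def forward(self, text_tokens, image_tokens):
        fused, _ = self.attn(query=text_tokens, key=image_tokens,
                             value=image_tokens)
        return self.norm(text_tokens + fused)  # residual + layer norm

fusion = CrossAttentionFusion()
txt = torch.randn(2, 16, 512)   # e.g. TweetEval-style text features
img = torch.randn(2, 50, 512)   # e.g. CLIP image patch features
print(fusion(txt, img).shape)   # torch.Size([2, 16, 512])
```

A cross-mask mechanism of the kind the abstract names would additionally pass an attention mask to `self.attn` to suppress selected token pairs; that part is omitted here.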

23 pages, 19814 KiB  
Article
Semi-Supervised One-Stage Object Detection for Maize Leaf Disease
by Jiaqi Liu, Yanxin Hu, Qianfu Su, Jianwei Guo, Zhiyu Chen and Gang Liu
Agriculture 2024, 14(7), 1140; https://doi.org/10.3390/agriculture14071140 - 14 Jul 2024
Viewed by 215
Abstract
Maize is one of the most important crops globally, and accurate diagnosis of leaf diseases is crucial for ensuring increased yields. Despite continuous progress in computer vision, detecting maize leaf diseases with deep learning still relies on large amounts of manually labeled data, and the labeling process is time-consuming and labor-intensive. Moreover, the detectors currently used for identifying maize leaf diseases have relatively low accuracy in complex experimental fields. We therefore propose Agronomic Teacher, an object detection algorithm that utilizes limited labeled data and abundant unlabeled data, and apply it to maize leaf disease recognition. In this work, a semi-supervised object detection framework is built on a single-stage detector, integrating a Weighted Average Pseudo-labeling Assignment (WAP) strategy and an AgroYOLO detector that combines an Agro-Backbone network with an Agro-Neck network. The WAP strategy uses weight adjustments to make objectness and classification scores the evaluation criteria for assigning pseudo-label reliability. The Agro-Backbone network accurately extracts features of maize leaf diseases and obtains richer semantic information, while the Agro-Neck network enhances feature fusion by combining multi-layer features. The effectiveness of the proposed method is validated on the MaizeData and PascalVOC datasets at different annotation ratios. Compared to the baseline model, Agronomic Teacher leverages abundant unlabeled data to achieve a 6.5% increase in mAP (0.5) on 30% labeled MaizeData. On the 30% labeled PascalVOC dataset, mAP (0.5) improves by 8.2%, demonstrating the method's potential for generalization.
(This article belongs to the Special Issue Advanced Image Processing in Agricultural Applications)
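One plausible reading of a WAP-style reliability score, a weighted combination of objectness and best-class confidence used to select and weight pseudo-labels, is sketched below. The combination rule, the weight alpha, and the threshold are all assumptions; the paper's exact strategy may differ.

```python
import torch

def pseudo_label_weights(objectness: torch.Tensor,
                         cls_scores: torch.Tensor,
                         alpha: float = 0.5) -> torch.Tensor:
    """Per-box reliability from objectness and best-class confidence,
    a simplified, assumed reading of a weighted-average assignment."""
    best_cls = cls_scores.max(dim=-1).values
    return alpha * objectness + (1 - alpha) * best_cls

obj = torch.tensor([0.9, 0.4, 0.7])               # detector objectness
cls = torch.tensor([[0.8, 0.1], [0.3, 0.5], [0.2, 0.9]])
w = pseudo_label_weights(obj, cls)
keep = w > 0.6                                    # illustrative threshold
print(w, keep)  # reliable pseudo-labels would get larger loss weights
```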

15 pages, 1727 KiB  
Article
Multi-Level Attention with 2D Table-Filling for Joint Entity-Relation Extraction
by Zhenyu Zhang, Lin Shi, Yang Yuan, Huanyue Zhou and Shoukun Xu
Information 2024, 15(7), 407; https://doi.org/10.3390/info15070407 - 14 Jul 2024
Viewed by 276
Abstract
Joint entity-relation extraction is a fundamental task in the construction of large-scale knowledge graphs. The task relies not only on the semantics of a text span but also on its intricate connections, including classification and structural details that most previous models overlook. In this paper, we propose incorporating this information into the learning process. Specifically, we design a novel two-dimensional word-pair tagging scheme to define the entity and relation extraction task, which allows type markers to focus on text tokens and gather information for their corresponding spans. Additionally, we introduce a multi-level attention neural network to enhance the model's capacity to perceive structure-aware features. Our experiments show that the approach overcomes the limitations of earlier tagging methods and yields more accurate results. We evaluate our model on three datasets: SciERC, ADE, and CoNLL04. It demonstrates performance competitive with the state of the art, surpassing other approaches on the majority of evaluated metrics.
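The generic table-filling pattern behind such schemes, scoring every (head, tail) token pair into a 2D tag table, can be sketched with a biaffine layer. The sizes and tag set are illustrative, and the paper's multi-level attention network is not reproduced.

```python
import torch
import torch.nn as nn

class BiaffineTagger(nn.Module):
    """Scores every (head, tail) token pair into a 2D tag table,
    the generic table-filling pattern for joint extraction."""
    def __init__(self, hidden: int = 128, num_tags: int = 5):
        super().__init__()
        self.head = nn.Linear(hidden, hidden)
        self.tail = nn.Linear(hidden, hidden)
        self.bilinear = nn.Bilinear(hidden, hidden, num_tags)

    def forward(self, h):                       # h: (B, N, hidden)
        B, N, _ = h.shape
        hh = self.head(h).unsqueeze(2).expand(B, N, N, -1)
        tt = self.tail(h).unsqueeze(1).expand(B, N, N, -1)
        scores = self.bilinear(hh.reshape(-1, hh.size(-1)),
                               tt.reshape(-1, tt.size(-1)))
        return scores.view(B, N, N, -1)         # one tag score per pair

tagger = BiaffineTagger()
table = tagger(torch.randn(2, 10, 128))
print(table.shape)  # torch.Size([2, 10, 10, 5])
```

Decoding entities and relations then amounts to reading spans off the diagonal region and relation links off the off-diagonal cells of the predicted table.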
20 pages, 2006 KiB  
Article
Multi-Source Information Graph Embedding with Ensemble Learning for Link Prediction
by Chunning Hou, Xinzhi Wang, Xiangfeng Luo and Shaorong Xie
Electronics 2024, 13(14), 2762; https://doi.org/10.3390/electronics13142762 - 13 Jul 2024
Viewed by 362
Abstract
Link prediction is a key technique for connecting entities and relationships in the field of graph reasoning. It leverages known graph structure data to predict missing factual information. Previous studies have focused either on the semantic representation of a single triplet or on the graph structure built from triples. The former ignores associations between different triples, and the latter ignores the meaning of the node itself. Furthermore, common graph-structured datasets inherently face challenges such as missing information and incompleteness. In light of this, we present a novel model called Multi-source Information Graph Embedding with Ensemble Learning for Link Prediction (EMGE), which effectively improves link prediction reasoning. Ensemble learning is applied systematically throughout model training. At the data level, entity embeddings are enhanced by integrating structured graph information and unstructured textual data as multi-source inputs, whose fusion is handled by an attention mechanism. During training, ensemble learning is used to extract semantic features from multiple neural network models, facilitating the interaction of enriched information. To ensure effective learning, a novel loss function based on contrastive learning is devised, minimizing the discrepancy between predicted values and the ground truth. Moreover, to enhance the semantic representation of graph nodes, two rules incorporating the concept of spreading activation are introduced during the aggregation of graph structure information, enabling a more comprehensive understanding of the relationships between nodes and edges. The EMGE model is validated on three datasets: WN18RR, FB15k-237, and a private Chinese financial dataset. Compared to the baseline model, the experimental results demonstrate a 0.2-fold reduction in mean rank (MR), a 5.9% improvement in mean reciprocal rank (MRR), and a 12.9% increase in Hit@1.
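The MR, MRR, and Hit@1 metrics reported above are straightforward to compute from a model's candidate scores. The sketch below uses synthetic scores and omits the filtered-ranking protocols common in link-prediction evaluation.

```python
import numpy as np

def ranking_metrics(scores: np.ndarray, true_idx: np.ndarray):
    """MR, MRR, and Hit@1 for link prediction: scores[q] holds one
    query's scores over all candidate entities, true_idx[q] the gold
    entity index (synthetic data; filtering is omitted)."""
    order = np.argsort(-scores, axis=1)  # best-scoring candidate first
    ranks = np.array([np.where(order[q] == true_idx[q])[0][0] + 1
                      for q in range(scores.shape[0])])
    return ranks.mean(), (1.0 / ranks).mean(), (ranks == 1).mean()

rng = np.random.default_rng(0)
scores = rng.normal(size=(100, 50))      # 100 queries, 50 candidates
gold = rng.integers(0, 50, size=100)
scores[np.arange(100), gold] += 2.0      # make gold entities rank higher
mr, mrr, hit1 = ranking_metrics(scores, gold)
print(f"MR={mr:.1f}  MRR={mrr:.3f}  Hit@1={hit1:.3f}")
```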