Search Results (65)

Search Parameters:
Keywords = hierarchical level semantic information

17 pages, 7301 KiB  
Article
Vision-Based Situational Graphs Exploiting Fiducial Markers for the Integration of Semantic Entities
by Ali Tourani, Hriday Bavle, Deniz Işınsu Avşar, Jose Luis Sanchez-Lopez, Rafael Munoz-Salinas and Holger Voos
Robotics 2024, 13(7), 106; https://doi.org/10.3390/robotics13070106 - 16 Jul 2024
Viewed by 287
Abstract
Situational Graphs (S-Graphs) merge geometric models of the environment generated by Simultaneous Localization and Mapping (SLAM) approaches with 3D scene graphs into a multi-layered jointly optimizable factor graph. As an advantage, S-Graphs not only offer a more comprehensive robotic situational awareness by combining geometric maps with diverse, hierarchically organized semantic entities and their topological relationships within one graph, but they also lead to improved performance of localization and mapping on the SLAM level by exploiting semantic information. In this paper, we introduce a vision-based version of S-Graphs where a conventional Visual SLAM (VSLAM) system is used for low-level feature tracking and mapping. In addition, the framework exploits the potential of fiducial markers (both visible and our recently introduced transparent or fully invisible markers) to encode comprehensive information about environments and the objects within them. The markers aid in identifying and mapping structural-level semantic entities, including walls and doors in the environment, with reliable poses in the global reference, subsequently establishing meaningful associations with higher-level entities, including corridors and rooms. However, in addition to including semantic entities, the semantic and geometric constraints imposed by the fiducial markers are also utilized to improve the reconstructed map’s quality and reduce localization errors. Experimental results on a real-world dataset collected using legged robots show that our framework excels in crafting a richer, multi-layered hierarchical map and enhances robot pose accuracy at the same time.
(This article belongs to the Special Issue Localization and 3D Mapping of Intelligent Robotics)

20 pages, 28730 KiB  
Article
Unmanned Aerial Vehicle Object Detection Based on Information-Preserving and Fine-Grained Feature Aggregation
by Jiangfan Zhang, Yan Zhang, Zhiguang Shi, Yu Zhang and Ruobin Gao
Remote Sens. 2024, 16(14), 2590; https://doi.org/10.3390/rs16142590 - 15 Jul 2024
Viewed by 251
Abstract
General deep learning methods achieve high-level semantic feature representation by aggregating hierarchical features, which performs well in object detection tasks. However, issues arise with general deep learning methods in UAV-based remote sensing image object detection tasks. Firstly, general feature aggregation methods such as stride convolution may lead to information loss in input samples. Secondly, common FPN methods introduce conflicting information by directly fusing feature maps from different levels. These shortcomings limit the model’s detection performance on small and weak targets in remote sensing images. In response to these concerns, we propose an unmanned aerial vehicle (UAV) object detection algorithm, IF-YOLO. Specifically, our algorithm leverages the Information-Preserving Feature Aggregation (IPFA) module to construct semantic feature representations while preserving the intrinsic features of small objects. Furthermore, to filter out irrelevant information introduced by direct fusion, we introduce the Conflict Information Suppression Feature Fusion Module (CSFM) to improve the feature fusion approach. Additionally, the Fine-Grained Aggregation Feature Pyramid Network (FGAFPN) facilitates interaction between feature maps at different levels, reducing the generation of conflicting information during multi-scale feature fusion. The experimental results on the VisDrone2019 dataset demonstrate that in contrast to the standard YOLOv8-s, our enhanced algorithm achieves a mean average precision (mAP) of 47.3%, with precision and recall rates enhanced by 6.3% and 5.6%, respectively.

15 pages, 6519 KiB  
Article
FF-HPINet: A Flipped Feature and Hierarchical Position Information Extraction Network for Lane Detection
by Xiaofeng Zhou and Peng Zhang
Sensors 2024, 24(11), 3502; https://doi.org/10.3390/s24113502 - 29 May 2024
Viewed by 368
Abstract
Effective lane detection technology plays an important role in the current autonomous driving system. Although deep learning models, with their intricate network designs, have proven highly capable of detecting lanes, there persist key areas requiring attention. Firstly, the symmetry inherent in visuals captured by forward-facing automotive cameras is an underexploited resource. Secondly, the vast potential of position information remains untapped, which can undermine detection precision. In response to these challenges, we propose FF-HPINet, a novel approach for lane detection. We introduce the Flipped Feature Extraction module, which models pixel pairwise relationships between the flipped feature and the original feature. This module allows us to capture symmetrical features and obtain high-level semantic feature maps from different receptive fields. Additionally, we design the Hierarchical Position Information Extraction module to meticulously mine the position information of the lanes, vastly improving target identification accuracy. Furthermore, the Deformable Context Extraction module is proposed to distill vital foreground elements and contextual nuances from the surrounding environment, yielding focused and contextually apt feature representations. Our approach achieves excellent performance with the F1 score of 97.00% on the TuSimple dataset and 76.84% on the CULane dataset.

12 pages, 990 KiB  
Article
Multi-Modal Sarcasm Detection with Sentiment Word Embedding
by Hao Fu, Hao Liu, Hongling Wang, Linyan Xu, Jiali Lin and Dazhi Jiang
Electronics 2024, 13(5), 855; https://doi.org/10.3390/electronics13050855 - 23 Feb 2024
Cited by 2 | Viewed by 1906
Abstract
Sarcasm poses a significant challenge for detection due to its unique linguistic phenomenon where the intended meaning is often opposite of the literal expression. Current sarcasm detection technology primarily utilizes multi-modal processing, but the connotative semantic information provided by the modality itself is limited. It is a challenge to mine the semantic information contained in the combination of sarcasm samples and external commonsense knowledge. Furthermore, as the essence of sarcasm detection lies in measuring emotional inconsistency, the rich semantic information may introduce excessive noise to inconsistency measurement. To mitigate these limitations, we propose a hierarchical framework in this paper. Specifically, to enrich the semantic information of each modality, our approach uses sentiment dictionaries to obtain the sentiment vectors by evaluating the words extracted from various modalities, and then combines them with each modality. Furthermore, in order to mine the joint semantic information implied in the modalities and improve measurement of emotional inconsistency, the emotional information representation obtained by fusing each modality’s data is concatenated with the sentiment vector. Then, cross-modal fusion is performed through cross-attention, and, finally, the sarcasm is recognized by fusing low-level information in the cross-modal fusion layer. Our model is evaluated on a public multi-modal sarcasm detection dataset based on Twitter, and the results demonstrate its superiority.
(This article belongs to the Special Issue New Advances in Affective Computing)

12 pages, 1219 KiB  
Article
Hierarchical Perceptual Graph Attention Network for Knowledge Graph Completion
by Wenhao Han, Xuemei Liu, Jianhao Zhang and Hairui Li
Electronics 2024, 13(4), 721; https://doi.org/10.3390/electronics13040721 - 9 Feb 2024
Viewed by 931
Abstract
Knowledge graph completion (KGC), the process of predicting missing knowledge through known triples, is a primary focus of research in the field of knowledge graphs. As an important graph representation technique in deep learning, graph neural networks (GNNs) perform well in knowledge graph completion, but most existing graph neural network-based knowledge graph completion methods tend to aggregate neighborhood information directly and individually, ignoring the rich hierarchical semantic structure of KGs. As a result, how to effectively deal with multi-level complex relations is still not well resolved. In this study, we present a hierarchical knowledge graph completion technique that combines both relation-level and entity-level attention and incorporates a weight matrix to enhance the significance of the embedded information under different semantic conditions. Furthermore, it updates neighborhood information to the central entity using a hierarchical aggregation approach. The proposed model enhances the capacity to capture hierarchical semantic feature information and is adaptable to various scoring functions as decoders, thus yielding robust results. We conducted experiments on a public benchmark dataset and compared it with several state-of-the-art models, and the experimental results indicate that our proposed model outperforms existing models in several aspects, proving its superior performance and validating the effectiveness of the model.
(This article belongs to the Special Issue Natural Language Processing and Information Retrieval, 2nd Edition)

21 pages, 5618 KiB  
Article
ResU-Former: Advancing Remote Sensing Image Segmentation with Swin Residual Transformer for Precise Global–Local Feature Recognition and Visual–Semantic Space Learning
by Hanlu Li, Lei Li, Liangyu Zhao and Fuxiang Liu
Electronics 2024, 13(2), 436; https://doi.org/10.3390/electronics13020436 - 20 Jan 2024
Cited by 1 | Viewed by 1118
Abstract
In the field of remote sensing image segmentation, achieving high accuracy and efficiency in diverse and complex environments remains a challenge. Additionally, there is a notable imbalance between the underlying features and the high-level semantic information embedded within remote sensing images, and both global and local recognition improvements are also limited by the multi-scale remote sensing scenery and imbalanced class distribution. These challenges are further compounded by inaccurate local localization segmentation and the oversight of small-scale features. To achieve balance between visual space and semantic space, to increase both global and local recognition accuracy, and to enhance the flexibility of input scale features while supplementing global contextual information, in this paper, we propose a U-shaped hierarchical structure called ResU-Former. The incorporation of the Swin Residual Transformer block allows for the efficient segmentation of objects of varying sizes against complex backgrounds, a common scenario in remote sensing datasets. With the specially designed Swin Residual Transformer block as its fundamental unit, ResU-Former accomplishes the full utilization and evolution of information, and the maximum optimization of semantic segmentation in complex remote sensing scenarios. Experimental results on benchmark datasets such as Vaihingen (Overall Accuracy of 81.5%) show ResU-Former’s potential to improve segmentation tasks across various remote sensing applications.

36 pages, 28285 KiB  
Article
Construction of a Type Knowledge Graph Based on the Value Cognitive Turn of Characteristic Villages: An Application in Jixi, Anhui Province, China
by Kai Ren and Khaliun Buyandelger
Land 2024, 13(1), 9; https://doi.org/10.3390/land13010009 - 19 Dec 2023
Viewed by 978
Abstract
Currently, Chinese villages are grappling with the issue of regional value collapse within the long-standing ‘urban-rural dual system’ strategy. Characteristic villages, as integral components of the urban–rural hierarchical spatial system and pivotal agents in rural development, wield significant influence in addressing China’s rural crises. The construction practice of characteristic villages showcases the cognitive evolution of ‘element-industry-function-type’. Within the value perception of characteristic villages, these practices reflect fundamental orientations in the interaction between humans and land, emphasizing the symbiotic relationship between production, life, and ecology. In alignment with this value perception, and drawing upon the existing studies on the classification of characteristic village types in Jixi County, this paper establishes a comprehensive type knowledge graph of characteristic villages. The framework of this graph’s expression revolves around ‘spatial elements-spatial combination-spatial organization’. This graph delineates a knowledge progression encompassing ‘information-knowledge-strategy’, characterized by three levels: the factual knowledge graph, conceptual knowledge graph and regular knowledge graph. The type knowledge graph systematically accumulates insights derived from the spatiotemporal transmission path of the village spatial structure. It formulates a structured progression of knowledge as follows: cognition of the village entity information → analysis of the village landscape structure → examination of the village social relationships. This constructed graph translates type-data information into spatial strategy knowledge, serving as a pivotal process in amalgamating characteristic village spatial data with semantic networks, particularly in expressing authenticity inspection and gene transfer.

14 pages, 1731 KiB  
Article
An Efficient and Light Transformer-Based Segmentation Network for Remote Sensing Images of Landscapes
by Lijia Chen, Honghui Chen, Yanqiu Xie, Tianyou He, Jing Ye and Yushan Zheng
Forests 2023, 14(11), 2271; https://doi.org/10.3390/f14112271 - 20 Nov 2023
Viewed by 973
Abstract
High-resolution image segmentation for landscape applications has garnered significant attention, particularly in the context of ultra-high-resolution (UHR) imagery. Current segmentation methodologies partition UHR images into standard patches for multiscale local segmentation and hierarchical reasoning. This creates a pressing dilemma, where the trade-off between memory efficiency and segmentation quality becomes increasingly evident. This paper introduces the Multilevel Contexts Weighted Coupling Transformer (WCTNet) for UHR segmentation. This framework comprises the Multi-level Feature Weighting (MFW) module and Token-based Transformer (TT) designed to weigh and couple multilevel semantic contexts. First, we analyze the multilevel semantics within a local patch without image-level contextual reasoning. It avoids complex image-level contextual associations and eliminates the misleading information carried. Second, MFW is developed to weigh shallow and deep features for enhancing object-related attention at different grain sizes from multilevel semantics. Third, the TT module is introduced to couple multilevel semantic contexts and transform them into semantic tokens using spatial attention. Then, we can capture token interactions and obtain clearer local representations. The suggested contextual weighting and coupling of single-scale patches empower WCTNet to maintain a well-balanced relationship between accuracy and computational overhead. Experimental results show that WCTNet achieves state-of-the-art performance on two UHR datasets of DeepGlobe and Inria Aerial.

23 pages, 3418 KiB  
Article
Semantic Attention and Structured Model for Weakly Supervised Instance Segmentation in Optical and SAR Remote Sensing Imagery
by Man Chen, Kun Xu, Enping Chen, Yao Zhang, Yifei Xie, Yahao Hu and Zhisong Pan
Remote Sens. 2023, 15(21), 5201; https://doi.org/10.3390/rs15215201 - 1 Nov 2023
Viewed by 945
Abstract
Instance segmentation in remote sensing (RS) imagery aims to predict the locations of instances and represent them with pixel-level masks. Thanks to the more accurate pixel-level information for each instance, instance segmentation has enormous potential applications in resource planning, urban surveillance, and military reconnaissance. However, current RS imagery instance segmentation methods mostly follow the fully supervised paradigm, relying on expensive pixel-level labels. Moreover, remote sensing imagery suffers from cluttered backgrounds and significant variations in target scales, making segmentation challenging. To accommodate these limitations, we propose a semantic attention enhancement and structured model-guided multi-scale weakly supervised instance segmentation network (SASM-Net). Building upon the modeling of spatial relationships for weakly supervised instance segmentation, we further design the multi-scale feature extraction module (MSFE module), semantic attention enhancement module (SAE module), and structured model guidance module (SMG module) for SASM-Net to enable a balance between label production costs and visual processing. The MSFE module adopts a hierarchical approach similar to the residual structure to establish equivalent feature scales and to adapt to the significant scale variations of instances in RS imagery. The SAE module is a dual-stream structure with semantic information prediction and attention enhancement streams. It can enhance the network’s activation of instances in the images and reduce cluttered backgrounds’ interference. The SMG module can assist the SAE module in the training process to construct supervision with edge information, which can implicitly lead the model to a representation with structured inductive bias, reducing the impact of the low sensitivity of the model to edge information caused by the lack of fine-grained pixel-level labeling. 
Experimental results indicate that the proposed SASM-Net is adaptable to optical and synthetic aperture radar (SAR) RS imagery instance segmentation tasks. It accurately predicts instance masks without relying on pixel-level labels, surpassing the segmentation accuracy of all weakly supervised methods. It also shows competitiveness when compared to hybrid and fully supervised paradigms. This research provides a low-cost, high-quality solution for the instance segmentation task in optical and SAR RS imagery.
(This article belongs to the Section AI Remote Sensing)

23 pages, 10169 KiB  
Article
DB-Tracker: Multi-Object Tracking for Drone Aerial Video Based on Box-MeMBer and MB-OSNet
by Yubin Yuan, Yiquan Wu, Langyue Zhao, Jinlin Chen and Qichang Zhao
Drones 2023, 7(10), 607; https://doi.org/10.3390/drones7100607 - 27 Sep 2023
Cited by 1 | Viewed by 2126
Abstract
Drone aerial videos offer a promising future in modern digital media and remote sensing applications, but effectively tracking several objects in these recordings is difficult. Drone aerial footage typically includes complicated sceneries with moving objects, such as people, vehicles, and animals. Complicated scenarios such as large-scale viewing angle shifts and object crossings may occur simultaneously. Random finite sets are mixed in a detection-based tracking framework, taking the object’s location and appearance into account. It maintains the detection box information of the detected object and constructs the Box-MeMBer object position prediction framework based on the MeMBer random finite set point object tracking. We develop a hierarchical connection structure in the OSNet network, build MB-OSNet to get the object appearance information, and connect feature maps of different levels through the hierarchy such that the network may obtain rich semantic information at different sizes. Similarity measurements are computed and collected for all detections and trajectories in a cost matrix that estimates the likelihood of all possible matches. The cost matrix entries compare the similarity of tracks and detections in terms of position and appearance. The DB-Tracker algorithm performs excellently in multi-target tracking of drone aerial videos, achieving MOTA of 37.4% and 46.2% on the VisDrone and UAVDT data sets, respectively. DB-Tracker achieves high robustness by comprehensively considering the object position and appearance information, especially in handling complex scenes and target occlusion. This makes DB-Tracker a powerful tool in challenging applications such as drone aerial videos.
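Editor's illustration (not from the article): the abstract above describes a cost matrix whose entries compare tracks and detections by position and appearance. The sketch below shows one generic way such a matrix could be built and used. Everything in it is an assumption made for illustration: box overlap (IoU) stands in for the position term, cosine similarity of feature vectors stands in for the appearance term, the weight `w_pos`, the threshold `max_cost`, and the greedy matching step are hypothetical choices, and none of it reproduces Box-MeMBer or MB-OSNet.

```python
import math


def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def cosine(u, v):
    """Cosine similarity of two feature vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(u, v))
    norm = math.sqrt(sum(x * x for x in u)) * math.sqrt(sum(y * y for y in v))
    return dot / norm if norm > 0 else 0.0


def cost_matrix(tracks, detections, w_pos=0.5):
    """cost[i][j] = 1 - weighted similarity of track i and detection j.

    w_pos is a hypothetical weight balancing position against appearance.
    """
    return [
        [
            1.0 - (w_pos * iou(t["box"], d["box"])
                   + (1.0 - w_pos) * cosine(t["feat"], d["feat"]))
            for d in detections
        ]
        for t in tracks
    ]


def greedy_match(cost, max_cost=0.7):
    """Take the cheapest pairs first; each track and detection is used at most once.

    Greedy matching is a simple stand-in, not the article's association method.
    """
    pairs = sorted((c, i, j) for i, row in enumerate(cost) for j, c in enumerate(row))
    used_tracks, used_dets, matches = set(), set(), []
    for c, i, j in pairs:
        if c <= max_cost and i not in used_tracks and j not in used_dets:
            matches.append((i, j))
            used_tracks.add(i)
            used_dets.add(j)
    return sorted(matches)


# Toy data: two tracks and two detections listed in swapped order.
tracks = [
    {"box": (0, 0, 10, 10), "feat": (1.0, 0.0)},
    {"box": (50, 50, 60, 60), "feat": (0.0, 1.0)},
]
detections = [
    {"box": (51, 50, 61, 60), "feat": (0.1, 0.9)},
    {"box": (1, 0, 11, 10), "feat": (0.9, 0.1)},
]
matches = greedy_match(cost_matrix(tracks, detections))
print(matches)  # (track index, detection index) pairs
```

In this toy case each track is paired with the detection that overlaps it and has a similar feature vector. An optimal assignment solver could replace the greedy step when many objects cross paths.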

14 pages, 1916 KiB  
Article
Self-Supervised Skin Lesion Segmentation: An Annotation-Free Approach
by Abdulrahman Gharawi, Mohammad D. Alahmadi and Lakshmish Ramaswamy
Mathematics 2023, 11(18), 3805; https://doi.org/10.3390/math11183805 - 5 Sep 2023
Cited by 1 | Viewed by 1078
Abstract
Skin cancer poses a significant health risk, affecting multiple layers of the skin, including the dermis, epidermis, and hypodermis. Melanoma, a severe type of skin cancer, originates from the abnormal proliferation of melanocytes in the epidermis. Current methods for skin lesion segmentation heavily rely on large annotated datasets, which are costly, time-consuming, and demand specialized expertise from dermatologists. To address these limitations and improve logistics in dermatology practices, we present a self-supervised strategy for accurate skin lesion segmentation in dermatologist images, eliminating the need for manual annotations. Unlike the traditional approach, our proposed approach integrates a hybrid CNN/Transformer model, harnessing the complementary strengths of both architectures. The Transformer module captures long-range contextual dependencies, enabling a comprehensive understanding of image content, while the CNN encoder extracts local semantic information. To dynamically recalibrate the representation space, we introduce a contextual attention module that effectively combines hierarchical features and pixel-level information. By incorporating local and global dependencies among image pixels, we perform a clustering process that organizes the image content into a meaningful space. Furthermore, as another contribution, we incorporate a spatial consistency loss to promote the gradual merging of clusters with similar representations, thereby improving the segmentation quality. Experimental evaluations conducted on two publicly available skin lesion segmentation datasets demonstrate the superiority of our proposed method, outperforming both unsupervised and self-supervised strategies, and achieving state-of-the-art performance in this challenging task.

19 pages, 9866 KiB  
Article
Hierarchical Edge-Preserving Dense Matching by Exploiting Reliably Matched Line Segments
by Yi Yue, Tong Fang, Wen Li, Min Chen, Bo Xu, Xuming Ge, Han Hu and Zhanhao Zhang
Remote Sens. 2023, 15(17), 4311; https://doi.org/10.3390/rs15174311 - 1 Sep 2023
Viewed by 858
Abstract
Image dense matching plays a crucial role in the reconstruction of three-dimensional models of buildings. However, large variations in target heights and serious occlusion lead to obvious mismatches in areas with discontinuous depths, such as building edges. To solve this problem, the present study mines the geometric and semantic information of line segments to produce a constraint for the dense matching process. First, a disparity consistency-based line segment matching method is proposed. This method correctly matches line segments on building structures in discontinuous areas based on the assumption that, within the corresponding local areas formed by two corresponding line pairs, the disparity obtained by the coarse-level matching of the hierarchical dense matching is similar to that derived from the local homography estimated from the corresponding line pairs. Second, an adaptive guide parameter is designed to constrain the cost propagation between pixels in the neighborhood of line segments. This improves the rationality of cost aggregation paths in discontinuous areas, thereby enhancing the matching accuracy near building edges. Experimental results using satellite and aerial images show that the proposed method efficiently obtains reliable line segment matches at building edges with a matching precision exceeding 97%. Under the constraint of the matched line segments, the proposed dense matching method generates building edges that are visually clearer, and achieves higher accuracy around edges, than without the line segment constraint.

16 pages, 4631 KiB  
Article
Hierarchical Fusion Network with Enhanced Knowledge and Contrastive Learning for Multimodal Aspect-Based Sentiment Analysis on Social Media
by Xiaoran Hu and Masayuki Yamamura
Sensors 2023, 23(17), 7330; https://doi.org/10.3390/s23177330 - 22 Aug 2023
Viewed by 1353
Abstract
Aspect-based sentiment analysis (ABSA) is a task of fine-grained sentiment analysis that aims to determine the sentiment of a given target. With the increased prevalence of smart devices and social media, diverse data modalities have become more abundant. This fuels interest in multimodal ABSA (MABSA). However, most existing methods for MABSA prioritize analyzing the relationship between aspect–text and aspect–image, overlooking the semantic gap between text and image representations. Moreover, they neglect the rich information in external knowledge, e.g., image captions. To address these limitations, in this paper, we propose a novel hierarchical framework for MABSA, known as HF-EKCL, which also offers perspectives on sensor development within the context of sentiment analysis. Specifically, we generate captions for images to supplement the textual and visual features. The multi-head cross-attention mechanism and graph attention neural network are utilized to capture the interactions between modalities. This enables the construction of multi-level aspect fusion features that incorporate element-level and structure-level information. Furthermore, for this paper, we integrated modality-based and label-based contrastive learning methods into our framework, making the model learn shared features that are relevant to the sentiment of corresponding words in multimodal data. The results, based on two Twitter datasets, demonstrate the effectiveness of our proposed model.

24 pages, 1087 KiB  
Article
Topic Mining and Future Trend Exploration in Digital Economy Research
by Changlu Zhang, Qiong Yang, Jian Zhang, Liming Gou and Haojie Fan
Information 2023, 14(8), 432; https://doi.org/10.3390/info14080432 - 1 Aug 2023
Cited by 3 | Viewed by 1934
Abstract
This work proposes a new literature topic clustering analysis framework, based on which the topics of digital-economy-related studies are condensed. First, we calculated the word vector of keywords using the FastText model, and then the keywords were merged according to semantic similarity. A hierarchical clustering method based on the Jaccard coefficient was employed to cluster the domain documents. Finally, the information gain method was applied to estimate the high-gain feature words for each category of topics. Based on the above framework, 23 categories of research topics were formed. We divided these topics into layers of digital technology, convergence innovation and digital governance, and we constructed a three-level digital economy research framework. Thereafter, the current hot spots and frontier trends were derived based on the number and growth rate of the literature. Our study revealed that the research on digital technology, which is the basic layer of the digital economy, has waned. The field related to the integration and innovation of digital technology and the real economy was the current research focus, among which the results with respect to “New Business Forms in the Digital Age”, “Circular Economy” and “Gig Economy” were abundant. The problems of the unbalanced development of the digital economy and digital monopoly have strengthened research on digital governance. Furthermore, research on “Regional Digital Economy”, “Chinese Digital Economy” and “Data Management” is in its initial stage and is a potential area of future research.
(This article belongs to the Special Issue Digital Economy and Management)
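
The hierarchical clustering step based on the Jaccard coefficient can be sketched as follows. This is a minimal illustration under stated assumptions: each document is represented as a set of (already merged) keywords, clusters are joined bottom-up with average linkage, and merging stops at a similarity `threshold`. The linkage rule, the threshold value, and the toy keyword sets are assumptions; the abstract does not specify them.

```python
def jaccard(a, b):
    """Jaccard coefficient |a ∩ b| / |a ∪ b| of two keyword sets."""
    union = a | b
    return len(a & b) / len(union) if union else 1.0


def cluster_documents(docs, threshold=0.3):
    """Agglomerative clustering of documents by keyword overlap.

    Repeatedly merges the two clusters with the highest average pairwise
    Jaccard coefficient until no pair reaches `threshold`.
    Returns clusters as sorted lists of document indices.
    """
    clusters = [[i] for i in range(len(docs))]

    def average_similarity(c1, c2):
        sims = [jaccard(docs[i], docs[j]) for i in c1 for j in c2]
        return sum(sims) / len(sims)

    while len(clusters) > 1:
        best_sim, best_pair = -1.0, None
        for x in range(len(clusters)):
            for y in range(x + 1, len(clusters)):
                sim = average_similarity(clusters[x], clusters[y])
                if sim > best_sim:
                    best_sim, best_pair = sim, (x, y)
        if best_sim < threshold:
            break  # remaining clusters are too dissimilar to merge
        x, y = best_pair
        clusters[x] = clusters[x] + clusters[y]
        del clusters[y]
    return [sorted(c) for c in clusters]


# Hypothetical keyword sets for four documents.
documents = [
    {"blockchain", "big data", "cloud computing"},
    {"blockchain", "big data", "artificial intelligence"},
    {"digital governance", "data management"},
    {"digital governance", "digital monopoly", "data management"},
]
print(cluster_documents(documents))
```

With these toy sets, the two technology documents share two of four distinct keywords and the two governance documents share two of three, while the groups have nothing in common, so the procedure stops with two clusters.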

18 pages, 1064 KiB  
Article
Topic Discovery and Hotspot Analysis of Sentiment Analysis of Chinese Text Using Information-Theoretic Method
by Changlu Zhang, Haojie Fan, Jian Zhang, Qiong Yang and Liqian Tang
Entropy 2023, 25(6), 935; https://doi.org/10.3390/e25060935 - 13 Jun 2023
Viewed by 1473
Abstract
Currently, sentiment analysis is a research hotspot in many fields such as computer science and statistical science. Topic discovery of the literature in the field of text sentiment analysis aims to provide scholars with a quick and effective understanding of its research trends. In this paper, we propose a new model for the topic discovery analysis of literature. Firstly, the FastText model is applied to calculate the word vectors of literature keywords, and cosine similarity between these vectors is used to identify and merge synonymous keywords. Secondly, the hierarchical clustering method based on the Jaccard coefficient is used to cluster the domain literature and count the literature volume of each topic. Thirdly, the information gain method is applied to extract the high-information-gain feature words of each topic, based on which the connotation of each topic is condensed. Finally, by conducting a time series analysis of the literature, a four-quadrant matrix of topic distribution is constructed to compare the research trends of each topic within different stages. The 1186 articles in the field of text sentiment analysis from 2012 to 2022 can be divided into 12 categories. Comparing the topic distribution matrices of the two phases, 2012 to 2016 and 2017 to 2022, shows that research on the various topic categories has changed clearly between the phases. The results show that: ① Among the 12 categories, online opinion analysis of social media comments represented by microblogs is one of the current hot topics. ② The integration and application of methods such as sentiment lexicon, traditional machine learning and deep learning should be enhanced. ③ Semantic disambiguation of aspect-level sentiment analysis is one of the current difficult problems this field faces. ④ Research on multimodal sentiment analysis and cross-modal sentiment analysis should be promoted.
(This article belongs to the Special Issue Information-Theoretic Methods in Data Analytics)
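
The information gain step can be illustrated with the standard definition: the entropy of the topic labels minus the expected entropy after splitting the documents on whether a word occurs. The sketch below assumes documents are keyword sets with one topic label each; the function names and the toy data are hypothetical and are not taken from the paper.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of topic labels."""
    total = len(labels)
    if total == 0:
        return 0.0
    return -sum((c / total) * math.log2(c / total) for c in Counter(labels).values())


def information_gain(docs, labels, word):
    """Information gain of `word` with respect to the topic labels.

    IG = H(topic) - [P(word) * H(topic | word) + P(no word) * H(topic | no word)]
    """
    with_word = [lab for doc, lab in zip(docs, labels) if word in doc]
    without_word = [lab for doc, lab in zip(docs, labels) if word not in doc]
    n = len(labels)
    conditional = (len(with_word) / n) * entropy(with_word) + (
        len(without_word) / n
    ) * entropy(without_word)
    return entropy(labels) - conditional


# Hypothetical keyword sets and topic assignments.
documents = [
    {"microblog", "opinion"},
    {"microblog", "lexicon"},
    {"multimodal", "image"},
    {"multimodal", "lexicon"},
]
topics = ["social media", "social media", "multimodal", "multimodal"]

vocabulary = sorted(set().union(*documents))
ranked = sorted(vocabulary, key=lambda w: information_gain(documents, topics, w), reverse=True)
print(ranked)
```

Words that occur in exactly one topic (here "microblog" and "multimodal") receive the maximum gain, while a word spread evenly across topics (here "lexicon") receives zero, which is why high-gain words are useful for summarizing what a topic is about.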