Out-of-vocabulary handling and topic quality control strategies in streaming topic models
Topic models have become ubiquitous tools for analyzing streaming data. However, existing streaming topic models suffer from several limitations when applied to real-world data streams. These include the inability to accommodate evolving ...
A survey of deep learning algorithms for colorectal polyp segmentation
Early detection and removal of cancerous colorectal polyps can effectively reduce the risk of colorectal cancer. Computer intelligent segmentation techniques (CIST) can improve the detection rate of polyps by drawing the boundaries of colorectal ...
Highlights
- Summarize the development of colorectal polyp segmentation tasks over the past decade.
- Focus on four challenges in polyp segmentation tasks.
- Provide an overview of the available private datasets.
- Suggest five future directions ...
Implicit expression recognition enhanced table-filling for aspect sentiment triplet extraction
Aspect sentiment triplet extraction (ASTE) is a challenging task in aspect-based sentiment analysis (ABSA), involving the identification of aspect terms, opinion terms, and their corresponding sentiment polarities within comments to form ...
A review of AI edge devices and lightweight CNN and LLM deployment
Artificial Intelligence of Things (AIoT), which integrates artificial intelligence (AI) and the Internet of Things (IoT), has attracted increasing attention recently. With the remarkable development of AI, convolutional neural networks (CNN) have ...
Easy and effective! Data augmentation for knowledge-aware dialogue generation via multi-perspective sentences interaction
In recent years, knowledge-based dialogue generation has garnered significant attention due to its capacity to produce informative and coherent responses through the integration of external knowledge into models. However, obtaining high-quality ...
Mixed-scale cross-modal fusion network for referring image segmentation
Referring image segmentation aims to segment the target by a given language expression. Recently, the bottom-up fusion network utilizes language features to highlight the most relevant regions during the visual encoder stage. However, it is not ...
Imperceptible rhythm backdoor attacks: Exploring rhythm transformation for embedding undetectable vulnerabilities on speech recognition
Speech recognition is an essential starting point of human–computer interaction. Recently, deep learning models have achieved excellent success in this task. However, the model training and private data provider are sometimes separated, and potential ...
IoU-guided Siamese network with high-confidence template fusion for visual tracking
Existing IoU-guided trackers use the IoU score to weight the classification score only in the testing phase; this model mismatch between the training and testing phases leads to poor tracking performance, especially when facing background distractors. In this ...
Highlights
- Design an IoU-guided distractor suppression network to ensure the model consistency.
- Give a high-confidence template fusion network to generate discriminative features.
- Propose the tracker SiamIH by integrating IDSNet and HTFNet.
Physically-guided open vocabulary segmentation with weighted patched alignment loss
Open vocabulary segmentation is a challenging task that aims to segment out the thousands of unseen categories. Directly applying CLIP to open-vocabulary semantic segmentation is challenging due to the granularity gap between its image-level ...
Perceptual metric for face image quality with pixel-level interpretability
This paper tackles the shortcomings of image evaluation metrics in evaluating facial image quality. Conventional metrics neither accurately reflect the unique attributes of facial images nor correspond with human visual perception. To address ...
Industrial and medical anomaly detection through cycle-consistent adversarial networks
In this study, a new Anomaly Detection (AD) approach for industrial and medical images is proposed. This method leverages the theoretical strengths of unsupervised learning and the data availability of both normal and abnormal classes. Indeed, ...
Highlights
- Use abnormal data through a Cycle-GAN for AD, for better discrimination.
- Provide intuition on why the identity loss is meaningful for AD.
- Discuss the performance for diverse industrial and medical AD problems.
- Conduct an ...
A three-stage model for camouflaged object detection
Camouflaged objects are typically assimilated into their backgrounds and exhibit fuzzy boundaries. The complex environmental conditions and the high intrinsic similarity between camouflaged targets and their surroundings pose significant ...
Highlights
- A high-performance method is proposed for camouflaged object detection.
- A novel scheme for camouflaged object detection is proposed.
- Multiple modules are developed to boost the performance.
- The proposed method outperforms other ...
CoFiNet: Unveiling camouflaged objects with multi-scale finesse
Camouflaged Object Detection (COD) is a critical aspect of computer vision aimed at identifying concealed objects, with applications spanning military, industrial, medical and monitoring domains. To address the problem of poor detail segmentation ...
Adaptive feature alignment network with noise suppression for cross-domain object detection
Recently, unsupervised domain adaptive object detection methods have been proposed to address the challenge of detecting objects across different domains without labeled data in the target domain. These methods focus on aligning features either ...
MHEC: One-shot relational learning of knowledge graphs completion based on multi-hop information enhancement
With the wide application of knowledge graphs, knowledge graph completion has garnered increasing attention in recent years. However, we find that long-tail relations are more common in KGs. These relations typically do not have a large ...
Multi-attention associate prediction network for visual tracking
Classification-regression prediction networks have realized impressive success in several modern deep trackers. However, there is an inherent difference between classification and regression tasks, so they have diverse, even opposite, demands for ...
Highlights
- Two novel feature matchers are proposed to fully capture the category semantic patterns and the spatial detailed cues.
- We present an associate prediction network to achieve both robust classification and precise location.
- Numerous ...
Active self-semi-supervised learning for few labeled samples
Training deep models with limited annotations poses a significant challenge when applied to diverse practical domains. Employing semi-supervised learning alongside the self-supervised model offers the potential to enhance label efficiency. ...
Simulation-based effective comparative analysis of neuron circuits for neuromorphic computation systems
Spiking neural networks (SNNs), which are inspired by the human brain, offer wider scope for application in the growth of neuromorphic computing systems due to their brain-level computational capabilities, reduced power consumption, and minimal ...
A pseudo-3D coarse-to-fine architecture for 3D medical landmark detection
The coarse-to-fine architecture is a benchmark method designed to enhance the accuracy of 3D medical landmark detection. However, incorporating 3D convolutional neural networks into the coarse-to-fine architecture leads to a significant increase ...
Graph-Based Similarity of Deep Neural Networks
Understanding the enigmatic black-box representations within Deep Neural Networks (DNNs) is an essential problem in the community of deep learning. An initial step towards tackling this conundrum lies in quantifying the degree of similarity ...
Highlights
- New framework for gauging the similarity between neural network representations.
- Comprehensive comparison against SOTA method CKA.
- Showcase its application to downstream tasks.
Cross-view action recognition understanding from exocentric to egocentric perspective
Understanding action recognition in egocentric videos has emerged as a vital research topic with numerous practical applications. Given the limited scale of egocentric data collection, learning robust deep learning-based action ...
Diffusion model conditioning on Gaussian mixture model and negative Gaussian mixture gradient
Diffusion models (DMs) are a type of generative model that has had a significant impact on image synthesis and beyond. They can incorporate a wide variety of conditioning inputs — such as text or bounding boxes — to guide generation. In this work,...
Highlights
- A diffusion model conditioning on Gaussian mixture model is proposed.
- Latent distributions built by features are proven better than by classes.
- A new negative Gaussian mixture gradient is integrated into our diffusion model.
- ...
Observer-based adaptive neural network event-triggered quantized control for active suspensions with actuator saturation
This paper proposes an adaptive neural network event-triggered and quantized output feedback control scheme for quarter vehicle active suspensions with actuator saturation. The scheme uses neural networks to approximate the unknown parts of the ...
Highlights
- The control scheme can realize dynamic event-triggered sampling and quantization.
- A state observer is used to estimate the unavailable states of a suspension system.
- The control scheme is easily implemented in automotive network ...
Auditing privacy budget of differentially private neural network models
In recent years, neural network models have been used in various tasks. To address privacy concerns, differential privacy (DP) is introduced into the training phase of neural network models. However, introducing DP into neural network models is very ...
Learning from different perspectives for regret reduction in reinforcement learning: A free energy approach
Reinforcement learning (RL) is the core method for interactive learning in living and artificial creatures. Nevertheless, in contrast to humans and animals, artificial RL agents are very slow in learning and suffer from the curse of ...
Deep belief network with fuzzy parameters and its membership function sensitivity analysis
Over the last few years, deep belief networks (DBNs) have been extensively utilized for efficient and reliable performance in several complex systems. One critical factor contributing to the enhanced learning of the DBN layers is the handling of ...
Global and local semantic enhancement of samples for cross-modal hashing
Hashing has become popular in cross-modal retrieval due to its exceptional performance in both search and storage. However, existing cross-modal hashing (CMH) methods may (a) neglect to learn sufficient modal-specific information, and (b) fail to ...
Dual-referenced assistive network for action quality assessment
Action quality assessment (AQA) aims to evaluate how well a specific action is performed. It is a challenging task, as it requires identifying the subtle differences between videos containing the same action. Most existing AQA methods ...
Highlights
- We propose a Rating-guided Attention module, which introduces a set of semantic-level referenced assistants to refine coarse-grained features into rating-informed features. These rating-informed features integrate hierarchical semantic ...
CRISP: A cross-modal integration framework based on the surprisingly popular algorithm for multimodal named entity recognition
The multimodal named entity recognition task on social media involves recognizing named entities with textual and visual information, which is of great significance for information processing. Nevertheless, many existing models still face the ...