- research-article, February 2025
Semantic decomposition and enhancement hashing for deep cross-modal retrieval
Abstract: Deep hashing has garnered considerable interest and has shown impressive performance in the domain of retrieval. However, the majority of the current hashing techniques rely solely on binary similarity evaluation criteria to assess the semantic ...
- research-article, February 2025
BiFPN-YOLO: One-stage object detection integrating Bi-Directional Feature Pyramid Networks
Abstract: Object detection is a key component in computer vision research, allowing a system to determine the location and type of object within any given scene. YOLOv5 is a modern object detection model, which utilises the advantages of the original YOLO ...
- research-article, February 2025
Diffusion-based framework for weakly-supervised temporal action localization
Abstract: Weakly supervised temporal action localization aims to localize action instances with only video-level supervision. Due to the absence of frame-level annotation supervision, how to effectively separate action snippets and backgrounds from ...
Highlights:
- We propose a diffusion-based framework for weakly-supervised temporal action localization.
- We leverage a local masking module to separate local action instances from backgrounds.
- We propose a new-refining strategy to improve the ...
- research-article, February 2025
UM-CAM: Uncertainty-weighted multi-resolution class activation maps for weakly-supervised segmentation
Abstract: Weakly-supervised medical image segmentation methods utilizing image-level labels have gained attention for reducing the annotation cost. They typically use Class Activation Maps (CAM) from a classification network but struggle with incomplete ...
Highlights:
- A novel weakly-supervised segmentation method learning from image-level labels.
- Uncertainty-weighted multi-resolution class activation map to generate pixel-level pseudo-labels.
- Geodesic distance-based seed expansion to generate ...
- research-article, February 2025
SEMACOL: Semantic-enhanced multi-scale approach for text-guided grayscale image colorization
Abstract: High-quality colorization of grayscale images using text descriptions presents a significant challenge, especially in accurately coloring small objects. The existing methods have two major flaws. First, text descriptions typically omit size ...
Highlights:
- SEMACOL integrates the CMTA, MCL, and TICA modules, and the DFF strategy.
- The CMTA module enhances text features by integrating object size information.
- The MCL module leverages multi-scale visual features to precisely locate ...
- research-article, February 2025
Automatic cervical cancer classification using adaptive vision transformer encoder with CNN for medical application
Abstract: Accurate and early cervical cancer screening can reduce the mortality rate of cervical cancer patients. The Pap test, often known as a Pap smear, is one of the frequently used methods for the early diagnosis of cervical cancer. However, manual ...
- research-article, February 2025
Frequency domain-based latent diffusion model for underwater image enhancement
Abstract: The degradation of underwater images, due to complex factors, negatively impacts the performance of underwater visual tasks. However, most underwater image enhancement (UIE) methods have been confined to the spatial domain, disregarding the ...
Highlights:
- The LDM with high-low frequency distinction enhancement strategy for UIE.
- High-low frequency priors effectively assist enhancement.
- High-low frequency residual attention achieves dynamic fusion of priors.
- Low-frequency ...
- research-article, February 2025
Multi-branch feature transformation cross-domain few-shot learning for hyperspectral image classification
Abstract: In the field of hyperspectral image (HSI) classification, a source dataset with ample labeled samples is commonly utilized to enhance the classification performance of a target dataset with few labeled samples. Existing few-shot learning (FSL) ...
Highlights:
- A novel multi-branch feature extraction and fusion module.
- Feature-wise transformation for feature diversity and model generalization.
- Effective domain adaptation strategy for hyperspectral image few-shot classification.
- The ...
- research-article, February 2025
Recursive reservoir concatenation for salt-and-pepper denoising
Abstract: We propose a recursive reservoir concatenation architecture in reservoir computing for salt-and-pepper noise removal. The recursive algorithm consists of two components. One is the initial network training for the recursion. Since the standard ...
Highlights:
- Reservoir computing can be adapted to image data through nonlinear image filtration.
- The new architecture is computationally more efficient than deep neural networks.
- The concatenation approach guarantees training error decrease ...
- research-article, February 2025
RNDiff: Rainfall nowcasting with Condition Diffusion Model
Abstract: Diffusion models are widely used in image generation because they can generate high-quality and realistic samples. In contrast, generative adversarial networks (GANs) and variational autoencoders (VAEs) have some limitations in terms of image ...
Highlights:
- A diffusion model method called RNDiff is proposed for rainfall forecasting.
- We propose a Condition Encoder for guided image generation in training and sampling.
- The proposed model obtained outstanding performance in rainfall ...
- research-article, February 2025
Percept, Chat, Adapt: Knowledge transfer of foundation models for open-world video recognition
Abstract: Open-world video recognition is challenging since traditional networks do not generalize well to complex environment variations. Alternatively, foundation models with rich knowledge have recently shown their generalization power. However, how ...
- research-article, February 2025
CDHN: Cross-domain hallucination network for 3D keypoints estimation
Abstract: This paper presents a novel method to estimate sparse 3D keypoints from single-view RGB images. Our network is trained in two steps using a knowledge distillation framework. In the first step, the teacher is trained to extract 3D features from ...
Highlights:
- We estimate keypoints from RGB images by leveraging information learnt from 3D data.
- Our approach is flexible and can predict a geometry-based number of keypoints.
- It can be trained for various categories simultaneously.
- It is ...
- research-article, February 2025
Explainable monotonic networks and constrained learning for interpretable classification and weakly supervised anomaly detection
Abstract: Deep network interpretability is fundamental in critical domains like medicine: using easily explainable networks with decisions based on radiological signs and not on spurious confounders would reassure the clinicians. Confidence is reinforced ...
- research-article, February 2025
Exploring Latent Transferability of feature components
Abstract: Feature disentanglement techniques have been widely employed to extract transferable (domain-invariant) features from non-transferable (domain-specific) features in Unsupervised Domain Adaptation (UDA). However, due to the complex interplay among ...
Highlights:
- This paper introduces two new types of partially transferable feature components.
- The proposed model consists of rough feature disentanglement and dynamic adjustment.
- The proposed model is more applicable to diverse and complex ...
- research-article, February 2025
STARNet: Low-light video enhancement using spatio-temporal consistency aggregation
- Zhe Wu,
- Zehua Sheng,
- Xue Zhang,
- Si-Yuan Cao,
- Runmin Zhang,
- Beinan Yu,
- Chenghao Zhang,
- Bailin Yang,
- Hui-Liang Shen
Abstract: In low-light environments, capturing high-quality videos is an imaging challenge due to the limited number of photons. Previous low-light enhancement approaches usually result in over-smoothed details, temporal flickers, and color deviation. We ...
- research-article, February 2025
Lightweight remote sensing super-resolution with multi-scale graph attention network
Abstract: Remote Sensing Super-Resolution (RS-SR) constitutes a pivotal component in the domain of remote sensing image analysis, aimed at enhancing the spatial resolution of low-resolution imagery. Recent advancements have seen deep learning techniques ...
Highlights:
- We propose an efficient lightweight reconstruction network for remote sensing.
- We develop a multi-scale graph attention module to extract multi-scale features.
- We design a global feature fusion module to fuse global contextual ...
- research-article, February 2025
Uncertainty estimation in color constancy
Abstract: Computational color constancy is an under-determined problem. As such, a key objective is to assign a level of uncertainty to the output illuminant estimations, which can significantly impact the reliability of the corrected images for downstream ...
Highlights:
- We formalize the concept of uncertainty in color constancy and define three forms of uncertainty.
- We apply our uncertainty estimators to different categories of color constancy methods and quantify their validity.
- We show the ...
- research-article, February 2025
Deep learning-enhanced environment perception for autonomous driving: MDNet with CSP-DarkNet53
Abstract: Implementing environmental perception in intelligent vehicles is a crucial application, but the parallel processing of numerous algorithms on the vehicle side is complex, and their integration remains a critical challenge. To address this problem, ...
- research-article, February 2025
Distilling heterogeneous knowledge with aligned biological entities for histological image classification
Abstract: In the task of classifying histological images, prior works widely leverage graph neural networks (GNNs) to aggregate histological knowledge from multi-level biological entities (e.g., cell and tissue). However, current GNN-based methods suffer ...
Highlights:
- A heterogeneous GNN distillation model is proposed for histological image analysis.
- A biological affiliation recognition module is designed to align the semantics.
- Extensive experiments demonstrate the effectiveness of the proposed ...
- research-article, February 2025
Learning data association for multi-object tracking using only coordinates
Abstract: We propose a novel Transformer-based module to address the data association problem for multi-object tracking. From detections obtained by a pretrained detector, this module uses only coordinates from bounding boxes to estimate an affinity score ...
Highlights:
- Our Transformer-based model, TWiX, can learn to associate objects using only coordinates.
- We show that motion priors or an intersection-over-union measure are not required for tracking. Using pairs of tracks is sufficient.
- Tracking ...