Multi-view spectral clustering based on constrained Laplacian rank
The graph-based approach is a representative clustering method among multi-view clustering algorithms. However, it remains a challenge to quickly acquire complementary information in multi-view data and to execute effective clustering when the ...
High-accuracy 3D locators tracking in real time using monocular vision
In the field of medical applications, precise localization of medical instruments and bone structures is crucial to ensure computer-assisted surgical interventions. In orthopedic surgery, existing devices typically rely on stereoscopic vision. ...
Obs-tackle: an obstacle detection system to assist navigation of visually impaired using smartphones
As the prevalence of vision impairment continues to rise worldwide, there is an increasing need for affordable and accessible solutions that improve the daily experiences of individuals with vision impairment. The Visually Impaired (VI) are often ...
Interaction semantic segmentation network via progressive supervised learning
Semantic segmentation requires both low-level details and high-level semantics, preserving detail while maintaining inference speed. Most existing segmentation approaches leverage low- and high-level features from pre-trained ...
A review of adaptable conventional image processing pipelines and deep learning on limited datasets
The objective of this paper is to study the impact of limited datasets on deep learning techniques and conventional methods in semantic image segmentation and to conduct a comparative analysis in order to determine the optimal scenario for ...
Optimization model based on attention mechanism for few-shot image classification
Deep learning has emerged as the leading approach for pattern recognition, but its reliance on large labeled datasets poses challenges in real-world applications where obtaining annotated samples is difficult. Few-shot learning, inspired by human ...
Regional filtering distillation for object detection
Knowledge distillation is a common and effective method in model compression, which trains a compact student model to mimic the capability of a large teacher model to get superior generalization. Previous works on knowledge distillation are ...
STARNet: spatio-temporal aware recurrent network for efficient video object detection on embedded devices
The challenge of converting various object detection methods from image to video remains unsolved. When applied to video, image methods frequently fail to generalize effectively due to issues such as blurriness, different and unclear positions, ...
Tackling confusion among actions for action segmentation with adaptive margin and energy-driven refinement
Video action segmentation is a crucial task in evaluating the ability to understand human activities. Previous works on this task mainly focus on capturing complex temporal structures and fail to consider the feature ambiguity among similar ...
SGBGAN: minority class image generation for class-imbalanced datasets
Class imbalance frequently arises in the context of image classification. Conventional generative adversarial networks (GANs) have a tendency to produce samples from the majority class when trained on class-imbalanced datasets. To address this ...
End-to-end optimized image compression with the frequency-oriented transform
Image compression constitutes a significant challenge amid the era of information explosion. Recent studies employing deep learning methods have demonstrated the superior performance of learning-based image compression methods over traditional ...
Target–distractor memory joint tracking algorithm via Credit Allocation Network
The tracking framework based on the memory network has gained significant attention due to its enhanced adaptability to variations in target appearance. However, the performance of the framework is limited by the negative effects of distractors in ...
That’s BAD: blind anomaly detection by implicit local feature clustering
Recent studies on visual anomaly detection (AD) of industrial objects/textures have achieved quite good performance. They consider an unsupervised setting, specifically the one-class setting, in which we assume the availability of a set of normal (...
A gradient fusion-based image data augmentation method for reflective workpieces detection under small size datasets
Various Convolutional Neural Network-based object detection models have been widely used in the industrial field. However, high detection accuracy is difficult to obtain with these models on the industrial sorting line. This is ...
A pixel and channel enhanced up-sampling module for biomedical image segmentation
Up-sampling operations are frequently utilized to recover the spatial resolution of feature maps in neural networks for segmentation tasks. However, current up-sampling methods, such as bilinear interpolation or deconvolution, do not fully consider ...
Self-supervised Siamese keypoint inference network for human pose estimation and tracking
Human pose estimation and tracking are important tasks to help understand human behavior. Currently, human pose estimation and tracking face the challenges of missed detection due to sparse annotation of video datasets and difficulty in ...
FESAR: SAR ship detection model based on local spatial relationship capture and fused convolutional enhancement
Synthetic aperture radar (SAR) is instrumental in ship monitoring owing to its all-weather capabilities and high resolution. In SAR images, ship targets frequently display blurred or mixed boundaries with the background, and instances of occlusion ...
An adaptive interpolation and 3D reconstruction algorithm for underwater images
3D reconstruction technology is gradually being applied to underwater scenes and has become a crucial research direction for human ocean exploration and exploitation. However, due to the complexity of the underwater environment, the number of high-...