- research-article, November 2024
A semantic guidance-based fusion network for multi-label image classification
Pattern Recognition Letters (PTRL), Volume 185, Issue C, Pages 254–261
https://doi.org/10.1016/j.patrec.2024.08.020
Abstract
Multi-label image classification (MLIC), a fundamental task assigning multiple labels to each image, has seen notable progress in recent years. Considering simultaneous appearances of objects in the physical world, modeling object ...
Highlights
- The SGFN model proposed in this paper considers both image spatial correlation and label semantic correlation, and fuses the two hierarchically to improve model performance.
- Regarding spatial correlation modeling, ...
- research-article, November 2024
Contraction mapping of feature norms for data quality imbalance learning
Pattern Recognition Letters (PTRL), Volume 185, Issue C, Pages 232–238
https://doi.org/10.1016/j.patrec.2024.08.016
Abstract
The popular softmax loss and its recent extensions have achieved great success in deep learning-based image classification. However, the data for training image classifiers often exhibit a highly skewed distribution in quality, i.e., the number ...
Highlights
- Considering the problem of data quality differences in learning objectives.
- Finding the positive correlation between data quality and feature norm via softmax.
- Proposing a novel learning approach using contraction mapping of ...
- research-article, November 2024
Deep NRSFM for multi-view multi-body pose estimation
Pattern Recognition Letters (PTRL), Volume 185, Issue C, Pages 218–224
https://doi.org/10.1016/j.patrec.2024.08.015
Abstract
This paper addresses the challenging task of unsupervised relative human pose estimation. Our solution exploits the potential offered by utilizing multiple uncalibrated cameras. It is assumed that spatial human pose and camera parameter ...
Highlights
- Estimating the relative body poses of several people is essential for studying social behavior.
- In some cases, several cameras are available, but the calibration of the cameras cannot be solved.
- With the Non Rigid Structure from ...
- research-article, November 2024
An unsupervised video anomaly detection method via Optical Flow decomposition and Spatio-Temporal feature learning
Pattern Recognition Letters (PTRL), Volume 185, Issue C, Pages 239–246
https://doi.org/10.1016/j.patrec.2024.08.013
Abstract
The purpose of this paper is to present an unsupervised video anomaly detection method using Optical Flow decomposition and Spatio-Temporal feature learning (OFST). This method employs a combination of optical flow reconstruction and video frame ...
Highlights
- Propose a method by Optical Flow decomposition and Spatio-Temporal learning for VAD.
- Design a Multi-Granularity Memory-Augmented AE with Optical Flow Decomposition.
- Design a two-stream network structure to learn spatiotemporal ...
- research-article, November 2024
Coding self-representative and label-relaxed hashing for cross-modal retrieval
Pattern Recognition Letters (PTRL), Volume 185, Issue C, Pages 264–270
https://doi.org/10.1016/j.patrec.2024.08.011
Abstract
In cross-modal retrieval, most existing hashing-based methods merely considered the relationship between feature representations to reduce the heterogeneous gap for data from various modalities, whereas they neglected the correlation between ...
Highlights
- Coding self-representative learning captures the essential semantic information.
- Label-relaxed regression ensures hash code consistency with label information matching.
- Log similarity preservation extracts non-linear features to ...
- research-article, November 2024
Meta-learning from learning curves for budget-limited algorithm selection
Pattern Recognition Letters (PTRL), Volume 185, Issue C, Pages 225–231
https://doi.org/10.1016/j.patrec.2024.08.010
Abstract
Training a large set of machine learning algorithms to convergence in order to select the best-performing algorithm for a dataset is computationally wasteful. Moreover, in a budget-limited scenario, it is crucial to carefully select an algorithm ...
Highlights
- Partial learning curves are critical information in budget-limited learning scenarios.
- Meta-learning from past learning curves improves algorithm selection for new tasks.
- Learning policies for both algorithm selection and budget ...
- research-article, November 2024
A lightweight attention-driven distillation model for human pose estimation
Pattern Recognition Letters (PTRL), Volume 185, Issue C, Pages 247–253
https://doi.org/10.1016/j.patrec.2024.08.009
Abstract
Currently, research on human pose estimation tasks primarily focuses on heatmap-based and regression-based methods. However, the increasing complexity of heatmap models and the low accuracy of regression methods are becoming significant barriers ...
Highlights
- Convolution effectively extracts features but introduces redundant parameters.
- We propose the CAU module, using split-transform-fusion strategy to cut redundancy.
- Increasing transformer layers improves performance but adds more ...
- research-article, November 2024
Saliency-based video summarization for face anti-spoofing
Pattern Recognition Letters (PTRL), Volume 185, Issue C, Pages 190–196
https://doi.org/10.1016/j.patrec.2024.08.008
Abstract
With the growing availability of databases for face presentation attack detection, researchers are increasingly focusing on video-based face anti-spoofing methods that involve hundreds to thousands of images for training the models. However, ...
Highlights
- A saliency-based video summarization method is proposed specifically for the face anti-spoofing task.
- Two-scale image decomposition is employed to achieve reduced computational cost.
- The state-of-the-art results on four databases ...
- research-article, November 2024
Contrastive Learning for Lane Detection via cross-similarity
Pattern Recognition Letters (PTRL), Volume 185, Issue C, Pages 175–183
https://doi.org/10.1016/j.patrec.2024.08.007
Abstract
Detecting lane markings in road scenes poses a significant challenge due to their intricate nature, which is susceptible to unfavorable conditions. While lane markings have strong shape priors, their visibility is easily compromised by varying ...
Highlights
- In this paper, we propose a new contrastive learning (CL) method for lane detection.
- The article introduces cross-similarity operation.
- We discussed why previous CL methods are unsuitable for lane detection.
- The article ...
- research-article, November 2024
Scale-aware token-matching for transformer-based object detector
Pattern Recognition Letters (PTRL), Volume 185, Issue C, Pages 197–202
https://doi.org/10.1016/j.patrec.2024.08.006
Abstract
Owing to the advancements in deep learning, object detection has made significant progress in estimating the positions and classes of multiple objects within an image. However, detecting objects of various scales within a single image remains a ...
Highlights
- Each token independently learns scale-specific information with our token matching method.
- Only small-scale information is processed separately to mitigate the heterogeneity of small objects.
- Improve small object detection by ...
- research-article, November 2024
Semantic-aware hyper-space deformable neural radiance fields for facial avatar reconstruction
Pattern Recognition Letters (PTRL), Volume 185, Issue C, Pages 160–166
https://doi.org/10.1016/j.patrec.2024.08.004
Abstract
High-fidelity facial avatar reconstruction from monocular videos is a prominent research problem in computer graphics and computer vision. Recent advancements in the Neural Radiance Field (NeRF) have demonstrated remarkable proficiency in ...
Highlights
- We propose a generalized semantic-aware hyper-space deformable NeRF-based framework for reconstructing high-fidelity facial avatars from monocular videos, which can be driven by either 3DMM coefficients or audio input.
- We introduce a ...
- research-article, November 2024
Enhancing zero-shot object detection with external knowledge-guided robust contrast learning
Pattern Recognition Letters (PTRL), Volume 185, Issue C, Pages 152–159
https://doi.org/10.1016/j.patrec.2024.08.003
Abstract
Zero-shot object detection aims to identify objects from unseen categories not present during training. Existing methods rely on category labels to create pseudo-features for unseen categories, but they face limitations in exploring semantic ...
Highlights
- Using large language models to provide rich semantic information as external knowledge.
- Supervised contrastive learning can optimize the distribution of visual features.
- It is robust for both natural and fine-grained scenes.
- ...
- research-article, November 2024
Feature-consistent coplane-pair correspondence- and fusion-based point cloud registration
Pattern Recognition Letters (PTRL), Volume 185, Issue C, Pages 143–149
https://doi.org/10.1016/j.patrec.2024.08.001
Abstract
It is an important and challenging task to register two point clouds, and the estimated registration solution can be applied in 3D vision. In this paper, an outlier removal method is first proposed to delete redundant coplane-pair correspondences ...
Highlights
- All source and target coplane candidates are divided into three groups.
- Three feature-consistent coplane-pair correspondence subsets are constructed.
- Final registration result is obtained by fusing the best solutions of three ...
- editorial, November 2024
- research-article, November 2024
Feature decomposition-based gaze estimation with auxiliary head pose regression
Pattern Recognition Letters (PTRL), Volume 185, Issue C, Pages 137–142
https://doi.org/10.1016/j.patrec.2024.07.021
Abstract
Recognition and understanding of facial images or eye images are critical for eye tracking. Recent studies have shown that the simultaneous use of facial and eye images can effectively lower gaze errors. However, these methods typically consider ...
Highlights
- Gaze estimation with feature decomposition as the core concept.
- Correction of gaze estimation by purifying head pose.
- Leverage the relationship between facial and eye images at the feature level.
- Decompose features through ...
- research-article, November 2024
Adversarial self-training for robustness and generalization
Pattern Recognition Letters (PTRL), Volume 185, Issue C, Pages 117–123
https://doi.org/10.1016/j.patrec.2024.07.020
Abstract
Adversarial training is currently one of the most promising ways to achieve adversarial robustness of deep models. However, even the most sophisticated training methods are far from satisfactory, as improvement in robustness requires either ...
Highlights
- An adversarial training technique using self-training is proposed.
- Consistency regularization is applied to suppress the distortion of representations in latent space.
- The proposed technique can be easily generalized to other ...
- research-article, November 2024
DECA-Net: Dual encoder and cross-attention fusion network for surgical instrument segmentation
Pattern Recognition Letters (PTRL), Volume 185, Issue C, Pages 130–136
https://doi.org/10.1016/j.patrec.2024.07.019
Abstract
Minimally invasive surgery is now widely used to reduce surgical risks, and automatic and accurate instrument segmentation from endoscope videos is crucial for computer-assisted surgical guidance. However, given the rapid development of CNN-based ...
Highlights
- A novel deep segmentation network, called DECA-Net, is proposed to realize precise surgical instrument segmentation.
- A dual encoder unit is proposed for effective global context and feature extraction under illumination issues.
- The ...
- research-article, November 2024
Enhancing low-light images via dehazing principles: Essence and method
Pattern Recognition Letters (PTRL), Volume 185, Issue C, Pages 167–174
https://doi.org/10.1016/j.patrec.2024.07.017
Abstract
Given the visual resemblance between inverted low-light and hazy images, dehazing principles are borrowed to enhance low-light images. However, the essence of such methods remains unclear, and they are susceptible to over-enhancement. Regarding ...
Highlights
- Low-light images appear visually similar to hazy images when their intensities are inverted.
- The image dehazing technique is a branch of the low-light image enhancement field.
- Image dehazing techniques lack clear physical ...