Dynamic Dense Graph Convolutional Network for Skeleton-Based Human Motion Prediction
Graph Convolutional Networks (GCNs), which typically follow a neural message passing framework to model dependencies among skeletal joints, have achieved great success in the skeleton-based human motion prediction task. Nevertheless, how to construct a graph from ...
A Discrete-Mapping-Based Cross-Component Prediction Paradigm for Screen Content Coding
Cross-component prediction is an important intra-prediction tool in modern video coders. Existing prediction methods that exploit cross-component correlation include the cross-component linear model and its extension, the multi-model linear model. These ...
Robust Remote Photoplethysmography Estimation With Environmental Noise Disentanglement
Remote Photoplethysmography (rPPG) has been attracting increasing attention due to its potential in a wide range of application scenarios such as physical training, clinical monitoring, and face anti-spoofing. On top of conventional solutions, deep-...
A Study of Subjective and Objective Quality Assessment of HDR Videos
- Zaixi Shang,
- Joshua P. Ebenezer,
- Abhinau K. Venkataramanan,
- Yongjun Wu,
- Hai Wei,
- Sriram Sethuraman,
- Alan C. Bovik
As compared to standard dynamic range (SDR) videos, high dynamic range (HDR) content is able to represent and display much wider and more accurate ranges of brightness and color, leading to more engaging and enjoyable visual experiences. HDR also implies ...
ADMNet: Adaptive-Weighting Dual Mapping for Online Tracking With Respiratory Motion Estimation in Contrast-Enhanced Ultrasound
- Ming-De Li,
- Hang-Tong Hu,
- Si-Min Ruan,
- Mei-Qing Cheng,
- Li-Da Chen,
- Ze-Rong Huang,
- Wei Li,
- Peng Lin,
- Hong Yang,
- Ming Kuang,
- Ming-De Lu,
- Qing-Hua Huang,
- Wei Wang
Lesion localization and tracking are critical for accurate, automated medical imaging analysis. Contrast-enhanced ultrasound (CEUS) significantly enriches traditional B-mode ultrasound with contrast agents to provide high-resolution, real-time images of ...
An Efficient Single Image De-Raining Model With Decoupled Deep Networks
Single image de-raining is an emerging paradigm for many outdoor computer vision applications, since rain streaks can significantly degrade visibility and compromise their function. The introduction of deep learning (DL) has brought about ...
KBStyle: Fast Style Transfer Using a 200 KB Network With Symmetric Knowledge Distillation
Convolutional Neural Networks (CNNs) have achieved remarkable progress in arbitrary artistic style transfer. However, the model size of existing state-of-the-art (SOTA) style transfer algorithms is immense, leading to enormous computational costs and ...
Texture-Guided Transfer Learning for Low-Quality Face Recognition
Although many advanced works have achieved significant progress in face recognition with deep learning and large-scale face datasets, low-quality face recognition remains a challenging problem in real-world applications, especially for unconstrained ...
Interpretable Neural Networks for Video Separation: Deep Unfolding RPCA With Foreground Masking
We present two deep unfolding neural networks for the simultaneous tasks of background subtraction and foreground detection in video. Unlike conventional neural networks based on deep feature extraction, we incorporate domain-knowledge models by ...
Deep Supervised Multi-View Learning With Graph Priors
This paper presents a novel method for supervised multi-view representation learning, which projects multiple views into a latent common space while preserving the discrimination and intrinsic structure of each view. Specifically, an a priori discriminant ...
Joint Learning of Fully Connected Network Models in Lifting Based Image Coders
The optimization of prediction and update operators plays a prominent role in lifting-based image coding schemes. In this paper, we focus on learning the prediction and update models involved in a recent Fully Connected Neural Network (FCNN)-based lifting ...
Low Complexity Coding Unit Decision for Video-Based Point Cloud Compression
With the growing demand for point cloud coding, Video-based Point Cloud Compression (V-PCC) was released for dynamic point clouds, relying on mature 2D video coding techniques. However, the huge computational complexity of the 2D video codec is inherited by V-PCC, ...
VGSG: Vision-Guided Semantic-Group Network for Text-Based Person Search
Text-based Person Search (TBPS) aims to retrieve images of the target pedestrian indicated by textual descriptions. It is essential for TBPS to extract fine-grained local features and align them across modalities. Existing methods utilize external tools or ...
Click-Pixel Cognition Fusion Network With Balanced Cut for Interactive Image Segmentation
Interactive image segmentation (IIS) has been widely used in various fields, such as medicine, industry, etc. However, some core issues, such as pixel imbalance, remain unresolved so far. Different from existing methods based on pre-processing or post-...
Multi-Scale Fusion and Decomposition Network for Single Image Deraining
Convolutional neural networks (CNNs) and self-attention (SA) have demonstrated remarkable success in low-level vision tasks, such as image super-resolution, deraining, and dehazing. The former excels in acquiring local connections with translation ...
LGCOAMix: Local and Global Context-and-Object-Part-Aware Superpixel-Based Data Augmentation for Deep Visual Recognition
Cutmix-based data augmentation, which uses a cut-and-paste strategy, has shown remarkable generalization capabilities in deep learning. However, existing methods primarily consider global semantics with image-level constraints, which excessively reduces ...
Robust Least Squares Regression for Subspace Clustering: A Multi-View Clustering Perspective
Recently, under the assumption that samples can be reconstructed by themselves, subspace clustering (SC) methods have achieved great success. Generally, SC methods contain some parameters to be tuned, and different affinity matrices can be obtained with ...
Efficient Dynamic Correspondence Network
We tackle the problem of establishing dense correspondences between a pair of images in an efficient way. Most existing dense matching methods use 4D convolutions to filter incorrect matches, but 4D convolutions are highly inefficient due to their ...
Neural Graph Refinement for Robust Recognition of Nuclei Communities in Histopathological Landscape
Accurate classification of nuclei communities is an important step toward the timely treatment of cancer spread. Graph theory provides an elegant way to represent and analyze nuclei communities within the histopathological landscape in order to perform ...
ITER: Image-to-Pixel Representation for Weakly Supervised HSI Classification
Recent years have witnessed the superiority of deep learning-based algorithms in the field of HSI classification. However, a prerequisite for the favorable performance of these methods is a large number of refined pixel-level annotations. Due to ...
Efficient Multi-View $K$-Means for Image Clustering
Nowadays, data in the real world often comes from multiple sources, but most existing multi-view $K$-Means perform poorly on linearly non-separable data and require initializing ...
Tracking With Saliency Region Transformer
Transformers show a great impact on visual tracking thanks to their powerful representation learning capabilities. As the capacity of the model grows, the speed of the tracker tends to decrease gradually. Our work focuses on dealing with massively ...
Semantic-Disentangled Transformer With Noun-Verb Embedding for Compositional Action Recognition
Recognizing actions performed on unseen objects, known as Compositional Action Recognition (CAR), has attracted increasing attention in recent years. The main challenge is to overcome the distribution shift of “action-object” pairs between ...
Learning Diverse Tone Styles for Image Retouching
Image retouching, which aims to regenerate visually pleasing renditions of given images, is a subjective task in which users have different aesthetic preferences. Most existing methods adopt a deterministic model to learn the retouching style from a ...
Coarse- and Fine-Grained Fusion Hierarchical Network for Hole Filling in View Synthesis
Depth image-based rendering (DIBR) techniques play an essential role in free-viewpoint videos (FVVs), generating virtual views from a reference 2D texture video and its associated depth information. However, the background regions occluded by the ...
Rethinking Object Saliency Ranking: A Novel Whole-Flow Processing Paradigm
Existing salient object detection methods are capable of predicting binary maps that highlight visually salient regions. However, these methods are limited in their ability to differentiate the relative importance of multiple objects and the relationships ...
Active Disparity Sampling for Stereo Matching With Adjoint Network
The sparse signals provided by external sources have been leveraged as guidance for improving dense disparity estimation. However, previous methods assume depth measurements to be randomly sampled, which restricts performance improvements due to under-...
A Dataset and Model for the Visual Quality Assessment of Inversely Tone-Mapped HDR Videos
To enhance the viewer experience of standard dynamic range (SDR) video content on high dynamic range (HDR) displays, inverse tone mapping (ITM) is employed. Objective visual quality assessment (VQA) models are needed for effective evaluation of ITM ...
Cylin-Painting: Seamless 360° Panoramic Image Outpainting and Beyond
Image outpainting has gained increasing attention since it can generate a complete scene from a partial view, providing a valuable solution for constructing 360° panoramic images. As image outpainting suffers from the intrinsic issue of unidirectional ...
DMMG: Dual Min-Max Games for Self-Supervised Skeleton-Based Action Recognition
In this work, we propose a new Dual Min-Max Games (DMMG) based self-supervised skeleton action recognition method by augmenting unlabeled data in a contrastive learning framework. Our DMMG consists of a viewpoint variation min-max game and an edge ...