PTRL: Vol 166, No C

Volume 166, Issue C, Feb 2023

Publisher:

Elsevier Science Inc.
655 Avenue of the Americas New York, NY
United States

ISSN:0167-8655

editorial

Editorial Board

Page ii https://doi.org/10.1016/S0167-8655(23)00037-5

research-article

Reversible attack based on adversarial perturbation and reversible data hiding in YUV colorspace

Pages 1–7 https://doi.org/10.1016/j.patrec.2022.12.018

Highlights

Reversible adversarial attack can prevent illegal image access while ensuring legal use.
Embed adversarial perturbations of Y channel into UV channel by reversible data hiding.
Reduce adversarial perturbations by class activation map ...

Abstract

Recent research on using adversarial perturbation to prevent malicious models from accessing image data has led to the corruption of image data, making images useless in other fields, especially in digital forensics. To prevent malicious models ...

research-article

Adaptive dynamic networks for object detection in aerial images

Pages 8–15 https://doi.org/10.1016/j.patrec.2022.12.022

Highlights

Adaptively allocate computing resource to input regions for better network inference.
Patch sampling algorithm reduces redundant calculation costs in overlapping regions.
Comparable performance is achieved on two datasets by ...

Abstract

In this paper, we propose an entropy-dynamic resolution detection (EDRdet) method for object detection in aerial images. Most conventional object detection methods usually detect each region in aerial images directly with a fixed resolution, so ...

research-article

Wavelet detail perception network for single image super-resolution

Pages 16–23 https://doi.org/10.1016/j.patrec.2022.12.021

Highlights

A novel WDPNet is proposed to effectively solve the smooth problem of SR image details with novel L2HID and DPE mechanisms.
Low- and high-frequency branches recover low-frequency structure and high-frequency details, respectively.
...

Abstract

Single image super-resolution (SR) is an important topic in computer vision because of its ability to generate high-resolution (HR) images. Traditional SR methods do not pay attention to high-frequency detail perception in the reconstruction ...

research-article

Dual quaternion ambisonics array for six-degree-of-freedom acoustic representation

Pages 24–30 https://doi.org/10.1016/j.patrec.2022.12.006

Highlights

Immersive audio experiences are attracting growing interest.
3D audio is often acquired through first order Ambisonics microphones.
Quaternion algebra is appropriate to process quaternion Ambisonics signals.
Dual quaternion ...

Abstract

Spatial audio methods are gaining a growing interest due to the spread of immersive audio experiences and applications, such as virtual and augmented reality. For these purposes, 3D audio signals are often acquired through arrays of Ambisonics ...

research-article

Confidence Estimation for Object Detection in Document Images

Pages 31–37 https://doi.org/10.1016/j.patrec.2022.12.024

Highlights

Deep neural networks always require more labelled data to be trained.
Annotating is time-consuming, so learning on a limited amount of data becomes necessary.
The data to annotate must be correctly chosen to optimize performance.
...

Abstract

Deep neural networks are becoming increasingly powerful and large and always require more labelled data to be trained. However, since annotating data is time-consuming, it is now necessary to develop systems that show good performance while ...

research-article

Exploiting enhanced and robust RGB-D face representation via progressive multi-modal learning

Pages 38–45 https://doi.org/10.1016/j.patrec.2022.12.027

Highlights

Face-specific depth enhancement to refine the low-quality depth.
Iterative inter-modal feature interaction to fully exploit complementary information.
Feature re-calibration and weighted complementary feature aggregation to ...

Abstract

Existing RGB-based 2D face recognition approaches are sensitive to facial variations, posture, occlusions, and illumination. Current depth-based methods have been proved to alleviate the above sensitivity by introducing geometric information but ...

research-article

Multi-scale self-attention-based feature enhancement for detection of targets with small image sizes

Pages 46–52 https://doi.org/10.1016/j.patrec.2022.12.026

Highlights

The feature pyramids obtained by the top-down and bottom-up combinations are integrated into one feature map.
The integrated feature map is input into a self-attention module to yield the learnt feature map.
Mutual combinations ...

Abstract

In this paper, we propose a feature enhancement method based on multi-scale self-attention, mainly including a multi-scale feature combination module and a self-attention module. The multi-scale feature combination module integrates the multi-...

research-article

Jigsaw-ViT: Learning jigsaw puzzles in vision transformer

Pages 53–60 https://doi.org/10.1016/j.patrec.2022.12.023

Highlights

Introduce jigsaw puzzle solving auxiliary loss into vision transformer-based models.
Remove positional embeddings and randomly mask patches as supporting techniques.
Improve vision transformers’ generalization on large-scale image ...

Abstract

The success of Vision Transformer (ViT) in various computer vision tasks has promoted the ever-increasing prevalence of this convolution-free network. The fact that ViT works on image patches makes it potentially relevant to the problem of jigsaw ...

research-article

A novel dual-channel graph convolutional neural network for facial action unit recognition

Pages 61–68 https://doi.org/10.1016/j.patrec.2023.01.001

Highlights

The FACS-GCN is built to model the inherent AU relations referring to FACS.
The DLR-GCN is built to complement individual differences ignored by FACS-GCN.
The DGCN is proposed to integrate two types of factors impacting AU ...

Abstract

Facial Action Unit (AU) recognition is a challenging problem, where the subtle muscle movement brings diverse AU representations. Recently, AU relations are utilized to assist AU recognition and improve the understanding of AUs. Nevertheless, ...

research-article

A graphical approach for filter pruning by exploring the similarity relation between feature maps

Pages 69–75 https://doi.org/10.1016/j.patrec.2022.12.028

Highlights

A “one-shot” pruning method based on the redundancy of filters for CNNs is proposed.
Graphs are established to represent the similarity relations between feature maps.
The proposed method is tested on VGGNet, ResNet, and DenseNet ...

Abstract

The vast majority of repetitive pruning and retraining techniques used on CNNs require multi-stage optimization, which undermines the potential computing savings from pruning. The similarity relationship between the output feature maps of the ...

research-article

Reflections of an ancient document processor

George Nagy

Pages 76–79 https://doi.org/10.1016/j.patrec.2023.01.006

Abstract

The bulk of the documents that affect our lives are digital or born digital. Our laborious investigations of layout, script, font and graphics, are turning into mere exercises with little influence on pursuits outside the Document Analysis and ...

research-article

DBPNet: A dual-branch pyramid network for document super-resolution

Pages 80–88 https://doi.org/10.1016/j.patrec.2022.12.013

Research highlights

Propose a dual-branch pyramid network (DBPNet) based on differences in feature distributions of document images.
Design a text edge restoration module based on laplacian pyramid structure to improve the edge details.
...

Abstract

Convolutional neural networks (CNNs), aiming to preserve the structural and texture information lost in the initial low-resolution (LR) images, have been widely used to improve the quality of LR images. Most of the existing super-resolution ...

research-article

Image classification using graph neural network and multiscale wavelet superpixels

Pages 89–96 https://doi.org/10.1016/j.patrec.2023.01.003

Highlights

We fill the gap in the literature by investigating image classification using multiscale superpixels.
WaveMesh, a novel wavelet-based superpixeling algorithm, is proposed.
In WaveMesh, the number and size of superpixels are computed ...

Abstract

Prior studies using graph neural networks (GNNs) for image classification have focused on graphs generated from a regular grid of pixels or similar-sized superpixels. In the latter, a single target number of superpixels is defined for an entire ...

research-article

SIT-SR 3D: Self-supervised slice interpolation via transfer learning for 3D volume super-resolution

Pages 97–104 https://doi.org/10.1016/j.patrec.2023.01.008

Abstract

We present SIT-SR 3D, a novel self-supervised method for 3D single image super-resolution (SISR). Scaling 2D SISR networks to 3D SISR requires code redesign, high computing resources, and 3D ground-truth. However, we circumvent this by (1) using ...

research-article

Weighted distances in the Cairo pattern

Pages 105–111 https://doi.org/10.1016/j.patrec.2023.01.004

Highlights

Cairo pattern is a tiling by pentagons and is a dual of a semi-regular tiling.
Four orientations of pentagons result from the intersection of two hexagonal grids.
There are 4 types of neighbors, thus there are 4 weights depending ...

Abstract

The Cairo Pattern is an interesting tiling of the plane. It consists of pentagons which completely cover the plane without overlapping. The Cairo Pattern is the dual of a semi-regular grid. It is assumed that one can walk on the pentagons such ...

research-article

Improving edit-based unsupervised sentence simplification using fine-tuned BERT

Pages 112–118 https://doi.org/10.1016/j.patrec.2023.01.009

Highlights

The proposed model is a modernized version of Edit-Unsup-TS that uses masked language models.
The idea of fine-tuning BERT on simpler language is explored.
It improves performance with a lower amount of training data.

Abstract

Word suggestion in unsupervised sentence simplification aims to replace complex words of a given sentence with their simpler alternatives. This is mostly done without considering their context within the input sentence. In this paper, we propose ...

research-article

Elastic-band transform for visualization and detection

Pages 119–125 https://doi.org/10.1016/j.patrec.2023.01.010

Highlights

The proposed method adopts a concept motivated by human cognition, looking through objects at different intervals.
The proposed method captures local fluctuations and global trends in data using different intervals.
The proposed ...

Abstract

This paper presents a new multiscale transformation for statistical analysis of one-dimensional data such as time series under the concept of the scale-space approach. The proposed method uses regular observations (eye scanning) with a range of ...

research-article

Generating diverse augmented attributes for generalized zero shot learning

Pages 126–133 https://doi.org/10.1016/j.patrec.2023.01.005

Highlights

A generator is learned to augment semantic attributes to solve the projection problem of “many to one”.
The reconstructed features and the original features are concatenated to train the classifier.
The proposed method can achieve ...

Abstract

Generalized Zero-Shot Learning (GZSL) has become an important research topic due to its powerful ability to recognize unseen objects. Generative methods, converting conventional GZSL to fully supervised learning, can achieve competitive performance, ...

research-article

A semi-automatic data integration process of heterogeneous databases

Pages 134–142 https://doi.org/10.1016/j.patrec.2023.01.007

Highlights

Data Integration of two or more heterogeneous databases.
Syntactic and semantic analysis of textual data.
Semi-automatic process.

Abstract

One of the most difficult issues today is the integration of data from various sources. Thus arises the need for automatic Data Integration (DI) methods. However, in the literature there are fully automatic or semi-automatic DI techniques, ...

research-article

Zero-shot ear cross-dataset transfer for person recognition on mobile devices

Pages 143–150 https://doi.org/10.1016/j.patrec.2023.01.012

Highlights

A zero-shot cross-dataset transfer protocol.
Competitive cross-dataset results.
The leverage of a pipeline built on top of a pre-trained backbone.

Abstract

Smartphones contain personal and private data to be protected, such as everyday communications or bank accounts. Several biometric techniques have been developed to unlock smartphones, among which ear biometrics represents a natural and promising ...

research-article

An angular shrinkage BERT model for few-shot relation extraction with none-of-the-above detection

Pages 151–158 https://doi.org/10.1016/j.patrec.2023.01.002

Highlights

A novel loss is proposed for few-shot RE to enlarge the inter-class margin.
Two-step training improves the model performance in few-shot RE with NOTA detection.
A SOTA sentence-pair model for few-shot RE with NOTA detection.

Abstract

Few-shot relation extraction aims to solve the problem of insufficient annotated data in relation extraction tasks. Through the comparison between samples, few-shot relation extraction achieves lower-cost relation classification. However, most ...

research-article

Differentiable Mean Opinion Score Regularization for Perceptual Speech Enhancement

Pages 159–163 https://doi.org/10.1016/j.patrec.2023.01.011

Highlights

We proposed a deep-learning-based model for perceptual speech quality assessment.
We presented an application of this model as perceptual regularization for speech enhancement.
Experimental results show significant improvement in ...

Abstract

Many speech enhancement methods require perceptual quality metrics for evaluation. The “holy grail” of perceptual speech quality assessment is human subjective ratings, known as the mean opinion score. However, acquiring human ratings is time-...

research-article

A too-good-to-be-true prior to reduce shortcut reliance

Pages 164–171 https://doi.org/10.1016/j.patrec.2022.12.010

Highlights

Challenging machine learning problems are unlikely to have trivial solutions.
Solutions from low-capacity models are likely shortcuts that won’t generalize.
One inductive bias for robust generalization is to avoid overly simple ...

Abstract

Despite their impressive performance in object recognition and other tasks under standard testing conditions, deep networks often fail to generalize to out-of-distribution (o.o.d.) samples. One cause for this shortcoming is that modern ...

Special Section on SIBGRAPI2021 - 4th Conference on Graphics, Patterns and Images

Special Section on Pattern Recognition for Recent and Future Advances On Intelligent Systems (ISPR2022)

Pattern Recognition Letters
