CVI2: Vol 18, No 1

Volume 18, Issue 1

February 2024

Publisher:

John Wiley & Sons, Inc.
605 Third Ave. New York, NY
United States

EISSN: 1751-9640

research-article

Open Access

CAGAN: Classifier‐augmented generative adversarial networks for weakly‐supervised COVID‐19 lung lesion localisation

Pages 1–14, https://doi.org/10.1049/cvi2.12216

Abstract

The Coronavirus Disease 2019 (COVID‐19) epidemic has constituted a Public Health Emergency of International Concern. Chest computed tomography (CT) can help reveal abnormalities indicative of lung disease at an early stage. Thus, accurate and automatic ...

(1) The authors propose an effective classifier‐augmented generative adversarial network framework for COVID‐19 lung lesion localisation, which provides a more accurate feature map indicating the lesion regions. The proposed framework incorporating the ...

research-article

Open Access

Mirror complementary transformer network for RGB‐thermal salient object detection

Pages 15–32, https://doi.org/10.1049/cvi2.12221

Abstract

Conventional RGB‐T salient object detection treats RGB and thermal modalities equally to locate the common salient regions. However, the authors observed that the rich colour and texture information of the RGB modality makes the objects more ...

We design a mirror complementary RGB‐T SOD network (MCNet) with Transformer and CNN hybrid architecture, aiming to improve the detection performance under challenging scenes, such as low illumination and thermal crossover. The proposed model outperforms ...

research-article

Open Access

Dynamic facial expression recognition with pseudo‐label guided multi‐modal pre‐training

Pages 33–45, https://doi.org/10.1049/cvi2.12217

Abstract

Due to the huge cost of manual annotations, the labelled data may not be sufficient to train a dynamic expression recognition (DER) model with good performance. To address this, the authors propose a multi‐modal pre‐training method with a pseudo‐...

To reduce the dependency on expensive manual annotations for the dynamic expression recognition (DER) task, the authors propose a multi‐modal pre‐training method with a pseudo‐label guidance mechanism to make full use of unlabelled video data for ...

research-article

Open Access

Robust object tracking via ensembling semantic‐aware network and redetection

Pages 46–59, https://doi.org/10.1049/cvi2.12219

Abstract

Most Siamese‐based trackers use classification and regression to determine the target bounding box, which can be formulated as a linear matching process of the template and search region. However, this only takes into account the similarity of ...

We propose object tracking using semantic‐aware ensemble learning for Siamese networks. We also propose, for the first time, a semantic tag redetection method that rescores the tracker's bounding boxes and replaces the inaccurate ones. ...

research-article

Open Access

Enhancing human parsing with region‐level learning

Pages 60–71, https://doi.org/10.1049/cvi2.12222

Abstract

Human parsing is very important in a diverse range of industrial applications. Despite the considerable progress that has been achieved, the performance of existing methods is still less than satisfactory, since these methods learn the shared ...

We propose a Region‐level Parsing Refiner (RPR) that enhances human parsing performance by introducing region‐level parsing learning.

research-article

Open Access

Lite‐weight semantic segmentation with AG self‐attention

Pages 72–83, https://doi.org/10.1049/cvi2.12225

Abstract

Due to the high computational and GPU memory cost of semantic segmentation, some works focus on designing a lightweight model to achieve a good trade‐off between computational cost and accuracy. A common method is to combine CNN and vision ...

We propose AG Self‐Attention, Enhanced Atrous Self‐Attention, and Gate Attention to address the neglect of multi‐field context and to inject detailed information.

research-article

Open Access

Improved triplet loss for domain adaptation

Pages 84–96, https://doi.org/10.1049/cvi2.12226

Abstract

A technique known as domain adaptation is utilised to address classification challenges in an unlabelled target domain by leveraging labelled source domains. Previous domain adaptation approaches have predominantly focussed on global domain ...

In recent years, a considerable number of researchers have explored class‐level domain adaptation, aiming to precisely align the distribution of diverse domains. Nevertheless, existing research on class‐level alignment tends to align domain features ...

research-article

Open Access

Improving object detection by enhancing the effect of localisation quality evaluation on detection confidence

Pages 97–109, https://doi.org/10.1049/cvi2.12227

Abstract

The one‐stage object detector has been widely applied in many computer vision applications due to its high detection efficiency and simple framework. However, one‐stage detectors heavily rely on Non‐maximum Suppression to remove the duplicated ...

A lightweight subnetwork, named Quality Prediction Block, is proposed by the authors to strengthen the role of localisation quality evaluation in detection confidence. It is discovered that the features of predicted bounding boxes are closely related to ...

research-article

Open Access

Visual privacy behaviour recognition for social robots based on an improved generative adversarial network

Pages 110–123, https://doi.org/10.1049/cvi2.12231

Abstract

Although social robots equipped with visual devices may leak user information, countermeasures for ensuring privacy are not readily available, making visual privacy protection problematic. In this article, a semi‐supervised learning algorithm is ...

We propose a semi‐supervised learning algorithm for visual privacy behaviour recognition based on an improved generative adversarial network for social robots (PBR‐GAN). We implement a social robot platform and architecture for visual privacy recognition ...

research-article

Open Access

Low‐rank preserving embedding regression for robust image feature extraction

Pages 124–140, https://doi.org/10.1049/cvi2.12228

Abstract

Although low‐rank representation (LRR)‐based subspace learning has been widely applied for feature extraction in computer vision, how to enhance the discriminability of the low‐dimensional features extracted by LRR‐based subspace learning ...

The manuscript constructs a robust feature extraction model named low‐rank preserving embedding regression (LRPER) for noisy data in the field of computer vision. An alternating iteration algorithm is designed to optimise the LRPER model, and its ...

research-article

Open Access

Determining the proper number of proposals for individual images

Pages 141–149, https://doi.org/10.1049/cvi2.12230

Abstract

The region proposal network is indispensable to two‐stage object detection methods. It generates a fixed number of proposals that are to be classified and regressed by detection heads to produce detection boxes. However, the fixed number of ...

The RPN generates a fixed number of proposals for every image. The authors designed a module that estimates the number of objects in an image and determines the proper number of proposals from that estimate, improving the performance and decreasing the FLOPs for ...

research-article

Open Access

Feature fusion over hyperbolic graph convolution networks for video summarisation

Pages 150–164, https://doi.org/10.1049/cvi2.12232

Abstract

A novel video summarisation method called the Hyperbolic Graph Convolutional Network (HVSN) is proposed, which addresses the challenges of summarising edited videos and capturing the semantic consistency of video shots at different time points. ...

For user‐generated video editing, HGCN is used to extract video features and achieve good results on the dataset. Based on the attention mechanism, feature fusion technology is used to adapt to different types of videos.

research-article

Open Access

STFT: Spatial and temporal feature fusion for transformer tracker

Pages 165–176, https://doi.org/10.1049/cvi2.12233

Abstract

Siamese‐based trackers have demonstrated robust performance in object tracking, while Transformers have achieved widespread success in object detection. Currently, many researchers use a hybrid structure of convolutional neural networks and ...

The authors are committed to novel approaches to solving the obscuration problem in target tracking. A full transformer tracker with state‐of‐the‐art results in multiple benchmark datasets is proposed.

research-article

Open Access

IoUNet++: Spatial cross‐layer interaction‐based bounding box regression for visual tracking

Pages 177–189, https://doi.org/10.1049/cvi2.12235

Abstract

Accurate target prediction, especially bounding box estimation, is a key problem in visual tracking. Many recently proposed trackers adopt the refinement module called IoU predictor by designing a high‐level modulation vector to achieve bounding ...

This paper proposes a novel IoU predictor (IoUNet++) for visual tracking that uses multi‐layer fused spatial template features and depthwise separable convolutional correlations to achieve more accurate bounding box estimation. The tracker improved by ...

IET Computer Vision
