TIP: Vol 28, No 11

Volume 28, Issue 11, Nov. 2019

Publisher: IEEE Press

ISSN: 1057-7149

research-article

Self-Guiding Multimodal LSTM—When We Do Not Have a Perfect Training Dataset for Image Captioning

Pages 5241–5252 https://doi.org/10.1109/TIP.2019.2917229

In this paper, a self-guiding multimodal LSTM (sgLSTM) image captioning model is proposed to handle an uncontrolled imbalanced real-world image-sentence dataset. We collect a FlickrNYC dataset from Flickr as our testbed with 306,165 images and the ...

research-article

Visual Attention Prediction for Stereoscopic Video by Multi-Module Fully Convolutional Network

Pages 5253–5265 https://doi.org/10.1109/TIP.2019.2916766

Visual attention is an important mechanism in the human visual system (HVS) and there have been numerous saliency detection algorithms designed for 2D images/video recently. However, the research for fixation detection of stereoscopic video is still ...

research-article

Occlusion-Aware Depth Map Coding Optimization Using Allowable Depth Map Distortions

Pages 5266–5280 https://doi.org/10.1109/TIP.2019.2919198

In depth map coding, rate-distortion optimization for those pixels that will cause occlusion in view synthesis is a rather challenging task, since the synthesis distortion estimation is complicated by the warping competition and the occlusion order can be ...

research-article

Sample Fusion Network: An End-to-End Data Augmentation Network for Skeleton-Based Human Action Recognition

Pages 5281–5295 https://doi.org/10.1109/TIP.2019.2913544

Data augmentation is a widely used technique for enhancing the generalization ability of deep neural networks for skeleton-based human action recognition (HAR) tasks. Most existing data augmentation methods generate new samples by means of handcrafted ...

research-article

Saliency From Growing Neural Gas: Learning Pre-Attentional Structures for a Flexible Attention System

Pages 5296–5307 https://doi.org/10.1109/TIP.2019.2913549

Artificial visual attention has been an active research area for over two decades. Especially, the concept of saliency has been implemented in many different ways. Early approaches aimed at closely modeling saliency processing with concepts from ...

research-article

Exploiting Images for Video Recognition: Heterogeneous Feature Augmentation via Symmetric Adversarial Learning

Pages 5308–5321 https://doi.org/10.1109/TIP.2019.2917867

Training deep models of video recognition usually requires sufficient labeled videos in order to achieve good performance without over-fitting. However, it is quite labor-intensive and time-consuming to collect and annotate a large amount of videos. ...

research-article

A Work Efficient Parallel Algorithm for Exact Euclidean Distance Transform

Pages 5322–5335 https://doi.org/10.1109/TIP.2019.2916741

A fully-parallelized work-time optimal algorithm is presented for computing the exact Euclidean Distance Transform (EDT) of a 2D binary image with the size of $n\times n$. Unlike ...

research-article

Reference-Free Quality Assessment of Sonar Images via Contour Degradation Measurement

Pages 5336–5351 https://doi.org/10.1109/TIP.2019.2910666

Sonar imagery plays a significant role in oceanic applications since there is little natural light underwater, and light is irrelevant to sonar imaging. Sonar images are very likely to be affected by various distortions during the process of transmission ...

research-article

Multi-View Linear Discriminant Analysis Network

Pages 5352–5365 https://doi.org/10.1109/TIP.2019.2913511

In many real-world applications, an object can be described from multiple views or styles, leading to the emerging multi-view analysis. To eliminate the complicated (usually highly nonlinear) view discrepancy for favorable cross-view recognition and ...

research-article

O2O Method for Fast 2D Shape Retrieval

Pages 5366–5378 https://doi.org/10.1109/TIP.2019.2919195

A novel post-processing method, online to offline (O2O), to improve the efficiency of shape retrieval is proposed in this paper. The essence of this proposed method is to move more work that requires a lot of computation to offline. Based on this approach,...

research-article

Point Cloud Saliency Detection by Local and Global Feature Fusion

Pages 5379–5393 https://doi.org/10.1109/TIP.2019.2918735

Inspired by the characteristics of the human visual system, a novel method is proposed for detecting the visually salient regions on 3D point clouds. First, the local distinctness of each point is evaluated based on the difference with its local ...

research-article

Learning to Find Unpaired Cross-Spectral Correspondences

Pages 5394–5406 https://doi.org/10.1109/TIP.2019.2917864

We present a deep architecture and learning framework for establishing correspondences across cross-spectral visible and infrared images in an unpaired setting. To overcome the unpaired cross-spectral data problem, we design the unified image translation ...

research-article

Robust Adaptive Median Binary Pattern for Noisy Texture Classification and Retrieval

Pages 5407–5418 https://doi.org/10.1109/TIP.2019.2916742

Texture is an important characteristic for different computer vision tasks and applications. Local binary pattern (LBP) is considered one of the most efficient texture descriptors yet. However, LBP has some notable limitations, in particular its ...

research-article

Optimal Adaptive Quantization Based on Temporal Distortion Propagation Model for HEVC

Pages 5419–5434 https://doi.org/10.1109/TIP.2019.2919180

Optimal adaptive quantization is one of the key points to optimize the coding efficiency of video encoders. The latest block-based video compression standards, such as high-efficiency video coding (HEVC), extensively use ...

research-article

Weakly Supervised Salient Object Detection by Learning A Classifier-Driven Map Generator

Pages 5435–5449 https://doi.org/10.1109/TIP.2019.2917224

Top-down saliency detection aims to highlight the regions of a specific object category, and typically relies on pixel-wise annotated training data. In this paper, we address the high cost of collecting such training data by a weakly supervised approach ...

research-article

Learning Deep Features for One-Class Classification

Pages 5450–5463 https://doi.org/10.1109/TIP.2019.2917862

We present a novel deep-learning-based approach for one-class transfer learning in which labeled data from an unrelated task is used for feature learning in one-class classification. The proposed method operates on top of a convolutional ...

research-article

AttGAN: Facial Attribute Editing by Only Changing What You Want

Pages 5464–5478 https://doi.org/10.1109/TIP.2019.2916751

Facial attribute editing aims to manipulate single or multiple attributes on a given face image, i.e., to generate a new face image with desired attributes while preserving other details. Recently, the generative adversarial net (GAN) and encoder–...

research-article

Reconstruction of Stochastic 3D Signals With Symmetric Statistics From 2D Projection Images Motivated by Cryo-Electron Microscopy

Pages 5479–5494 https://doi.org/10.1109/TIP.2019.2915631

Cryo-electron microscopy provides 2D projection images of the 3D electron scattering intensity of many instances of the particle under study (e.g., a virus). Both symmetry (rotational point groups) and heterogeneity are important aspects of biological ...

research-article

Multiple Pyramids Based Image Inpainting Using Local Patch Statistics and Steering Kernel Feature

Pages 5495–5509 https://doi.org/10.1109/TIP.2019.2920528

In this paper, we propose a novel multiple pyramids based image inpainting method using local patch statistics and geometric feature-based sparse representation to maintain texture consistency and structure coherence. First, we approximate each patch in ...

research-article

Open Access

Adaptive Morphological Reconstruction for Seeded Image Segmentation

Pages 5510–5523 https://doi.org/10.1109/TIP.2019.2920514

Morphological reconstruction (MR) is often employed by seeded image segmentation algorithms such as watershed transform and power watershed, as it is able to filter out seeds (regional minima) to reduce over-segmentation. However, the MR might mistakenly ...

research-article

Fast Blind Quality Assessment of DIBR-Synthesized Video Based on High-High Wavelet Subband

Pages 5524–5536 https://doi.org/10.1109/TIP.2019.2919416

Free-viewpoint video, as the development direction of the next-generation video technologies, uses the depth-image-based rendering (DIBR) technique for the synthesis of video sequences at viewpoints, where real captured videos are missing. As reference ...

research-article

Texture Variation Adaptive Image Denoising With Nonlocal PCA

Pages 5537–5551 https://doi.org/10.1109/TIP.2019.2916976

Image textures, as a kind of local variations, provide important information for the human visual system. Many image textures, especially the small-scale or stochastic textures, are rich in high-frequency variations, and are difficult to be preserved. ...

research-article

CAM-RNN: Co-Attention Model Based RNN for Video Captioning

Pages 5552–5565 https://doi.org/10.1109/TIP.2019.2916757

Video captioning is a technique that bridges vision and language together, for which both visual information and text information are quite important. Typical approaches are based on the recurrent neural network (RNN), where the video caption is generated ...

research-article

TextField: Learning a Deep Direction Field for Irregular Scene Text Detection

Pages 5566–5579 https://doi.org/10.1109/TIP.2019.2900589

Scene text detection is an important step in the scene text reading system. The main challenges lie in significantly varied sizes and aspect ratios, arbitrary orientations, and shapes. Driven by the recent progress in deep learning, impressive ...

research-article

Underwater Image Enhancement Using Adaptive Retinal Mechanisms

Pages 5580–5595 https://doi.org/10.1109/TIP.2019.2919947

We propose an underwater image enhancement model inspired by the morphology and function of the teleost fish retina. We aim to solve the problems of underwater image degradation raised by the blurring and nonuniform color biasing. In particular, the ...

research-article

Learning Adaptive Discriminative Correlation Filters via Temporal Consistency Preserving Spatial Feature Selection for Robust Visual Object Tracking

Pages 5596–5609 https://doi.org/10.1109/TIP.2019.2919201

With efficient appearance learning models, discriminative correlation filter (DCF) has been proven to be very successful in recent video object tracking benchmarks and competitions. However, the existing DCF paradigm suffers from two major issues, i.e., ...

research-article

Efficient Bandwidth Estimation in 2D Filtered Backprojection Reconstruction

Ranjan Maitra

Pages 5610–5619 https://doi.org/10.1109/TIP.2019.2919428

A generalized cross-validation approach to estimate the reconstruction filter bandwidth in 2D filtered backprojection is presented. The method writes the reconstruction equation in equivalent backprojected filtering form, derives results on ...

research-article

Subjective and Objective Quality Assessment of Stitched Images for Virtual Reality

Pages 5620–5635 https://doi.org/10.1109/TIP.2019.2921858

We consider the problem of quality assessment (QA) of image stitching algorithms used to generate panoramic images for virtual reality applications. Our contributions are two-fold. We design the Indian Institute of Science Stitched Image QA (ISIQA) ...

research-article

Conditional Random Field Model for Robust Multi-Focus Image Fusion

Pages 5636–5648 https://doi.org/10.1109/TIP.2019.2922097

In this paper, a novel multi-focus image fusion algorithm based on conditional random field optimization (mf-CRF) is proposed. It is based on a unary term that includes the combined activity estimation of both high and low frequencies of the input images,...

research-article

Channel Splitting Network for Single MR Image Super-Resolution

Pages 5649–5662 https://doi.org/10.1109/TIP.2019.2921882

High resolution magnetic resonance (MR) imaging is desirable in many clinical applications due to its contribution to more accurate subsequent analyses and early clinical diagnoses. Single image super-resolution (SISR) is an effective and cost efficient ...

IEEE Transactions on Image Processing
