Volume 28, Issue 11, Nov. 2019
Publisher:
  • IEEE Press
ISSN:1057-7149
research-article
Self-Guiding Multimodal LSTM—When We Do Not Have a Perfect Training Dataset for Image Captioning

In this paper, a self-guiding multimodal LSTM (sgLSTM) image captioning model is proposed to handle an uncontrolled imbalanced real-world image-sentence dataset. We collect a FlickrNYC dataset from Flickr as our testbed with 306,165 images and the ...

research-article
Visual Attention Prediction for Stereoscopic Video by Multi-Module Fully Convolutional Network

Visual attention is an important mechanism in the human visual system (HVS) and there have been numerous saliency detection algorithms designed for 2D images/video recently. However, the research for fixation detection of stereoscopic video is still ...

research-article
Occlusion-Aware Depth Map Coding Optimization Using Allowable Depth Map Distortions

In depth map coding, rate-distortion optimization for those pixels that will cause occlusion in view synthesis is a rather challenging task, since the synthesis distortion estimation is complicated by the warping competition and the occlusion order can be ...

research-article
Sample Fusion Network: An End-to-End Data Augmentation Network for Skeleton-Based Human Action Recognition

Data augmentation is a widely used technique for enhancing the generalization ability of deep neural networks for skeleton-based human action recognition (HAR) tasks. Most existing data augmentation methods generate new samples by means of handcrafted ...

research-article
Saliency From Growing Neural Gas: Learning Pre-Attentional Structures for a Flexible Attention System

Artificial visual attention has been an active research area for over two decades. Especially, the concept of saliency has been implemented in many different ways. Early approaches aimed at closely modeling saliency processing with concepts from ...

research-article
Exploiting Images for Video Recognition: Heterogeneous Feature Augmentation via Symmetric Adversarial Learning

Training deep models of video recognition usually requires sufficient labeled videos in order to achieve good performance without over-fitting. However, it is quite labor-intensive and time-consuming to collect and annotate a large amount of videos. ...

research-article
A Work Efficient Parallel Algorithm for Exact Euclidean Distance Transform

A fully-parallelized work-time optimal algorithm is presented for computing the exact Euclidean Distance Transform (EDT) of a 2D binary image with the size of $n\times n$. Unlike ...

research-article
Reference-Free Quality Assessment of Sonar Images via Contour Degradation Measurement

Sonar imagery plays a significant role in oceanic applications since there is little natural light underwater, and light is irrelevant to sonar imaging. Sonar images are very likely to be affected by various distortions during the process of transmission ...

research-article
Multi-View Linear Discriminant Analysis Network

In many real-world applications, an object can be described from multiple views or styles, leading to the emerging multi-view analysis. To eliminate the complicated (usually highly nonlinear) view discrepancy for favorable cross-view recognition and ...

research-article
O2O Method for Fast 2D Shape Retrieval

A novel post-processing method, online to offline (O2O), to improve the efficiency of shape retrieval is proposed in this paper. The essence of this proposed method is to move more work that requires a lot of computation to offline. Based on this approach,...

research-article
Point Cloud Saliency Detection by Local and Global Feature Fusion

Inspired by the characteristics of the human visual system, a novel method is proposed for detecting the visually salient regions on 3D point clouds. First, the local distinctness of each point is evaluated based on the difference with its local ...

research-article
Learning to Find Unpaired Cross-Spectral Correspondences

We present a deep architecture and learning framework for establishing correspondences across cross-spectral visible and infrared images in an unpaired setting. To overcome the unpaired cross-spectral data problem, we design the unified image translation ...

research-article
Robust Adaptive Median Binary Pattern for Noisy Texture Classification and Retrieval

Texture is an important characteristic for different computer vision tasks and applications. Local binary pattern (LBP) is considered one of the most efficient texture descriptors yet. However, LBP has some notable limitations, in particular its ...

research-article
Optimal Adaptive Quantization Based on Temporal Distortion Propagation Model for HEVC

Optimal adaptive quantization is one of the key points to optimize the coding efficiency of video encoders. The latest block-based video compression standards, such as high-efficiency video coding (HEVC), extensively use ...

research-article
Weakly Supervised Salient Object Detection by Learning A Classifier-Driven Map Generator

Top-down saliency detection aims to highlight the regions of a specific object category, and typically relies on pixel-wise annotated training data. In this paper, we address the high cost of collecting such training data by a weakly supervised approach ...

research-article
Learning Deep Features for One-Class Classification

We present a novel deep-learning-based approach for one-class transfer learning in which labeled data from an unrelated task is used for feature learning in one-class classification. The proposed method operates on top of a convolutional ...

research-article
AttGAN: Facial Attribute Editing by Only Changing What You Want

Facial attribute editing aims to manipulate single or multiple attributes on a given face image, i.e., to generate a new face image with desired attributes while preserving other details. Recently, the generative adversarial net (GAN) and encoder–...

research-article
Reconstruction of Stochastic 3D Signals With Symmetric Statistics From 2D Projection Images Motivated by Cryo-Electron Microscopy

Cryo-electron microscopy provides 2D projection images of the 3D electron scattering intensity of many instances of the particle under study (e.g., a virus). Both symmetry (rotational point groups) and heterogeneity are important aspects of biological ...

research-article
Multiple Pyramids Based Image Inpainting Using Local Patch Statistics and Steering Kernel Feature

In this paper, we propose a novel multiple pyramids based image inpainting method using local patch statistics and geometric feature-based sparse representation to maintain texture consistency and structure coherence. First, we approximate each patch in ...

research-article
Open Access
Adaptive Morphological Reconstruction for Seeded Image Segmentation

Morphological reconstruction (MR) is often employed by seeded image segmentation algorithms such as watershed transform and power watershed, as it is able to filter out seeds (regional minima) to reduce over-segmentation. However, the MR might mistakenly ...

research-article
Fast Blind Quality Assessment of DIBR-Synthesized Video Based on High-High Wavelet Subband

Free-viewpoint video, as the development direction of the next-generation video technologies, uses the depth-image-based rendering (DIBR) technique for the synthesis of video sequences at viewpoints, where real captured videos are missing. As reference ...

research-article
Texture Variation Adaptive Image Denoising With Nonlocal PCA

Image textures, as a kind of local variations, provide important information for the human visual system. Many image textures, especially the small-scale or stochastic textures, are rich in high-frequency variations, and are difficult to be preserved. ...

research-article
CAM-RNN: Co-Attention Model Based RNN for Video Captioning

Video captioning is a technique that bridges vision and language together, for which both visual information and text information are quite important. Typical approaches are based on the recurrent neural network (RNN), where the video caption is generated ...

research-article
TextField: Learning a Deep Direction Field for Irregular Scene Text Detection

Scene text detection is an important step in the scene text reading system. The main challenges lie in significantly varied sizes and aspect ratios, arbitrary orientations, and shapes. Driven by the recent progress in deep learning, impressive ...

research-article
Underwater Image Enhancement Using Adaptive Retinal Mechanisms

We propose an underwater image enhancement model inspired by the morphology and function of the teleost fish retina. We aim to solve the problems of underwater image degradation raised by the blurring and nonuniform color biasing. In particular, the ...

research-article
Learning Adaptive Discriminative Correlation Filters via Temporal Consistency Preserving Spatial Feature Selection for Robust Visual Object Tracking

With efficient appearance learning models, discriminative correlation filter (DCF) has been proven to be very successful in recent video object tracking benchmarks and competitions. However, the existing DCF paradigm suffers from two major issues, i.e., ...

research-article
Efficient Bandwidth Estimation in 2D Filtered Backprojection Reconstruction

A generalized cross-validation approach to estimate the reconstruction filter bandwidth in 2D filtered backprojection is presented. The method writes the reconstruction equation in equivalent backprojected filtering form, derives results on ...

research-article
Subjective and Objective Quality Assessment of Stitched Images for Virtual Reality

We consider the problem of quality assessment (QA) of image stitching algorithms used to generate panoramic images for virtual reality applications. Our contributions are two-fold. We design the Indian Institute of Science Stitched Image QA (ISIQA) ...

research-article
Conditional Random Field Model for Robust Multi-Focus Image Fusion

In this paper, a novel multi-focus image fusion algorithm based on conditional random field optimization (mf-CRF) is proposed. It is based on a unary term that includes the combined activity estimation of both high and low frequencies of the input images,...

research-article
Channel Splitting Network for Single MR Image Super-Resolution

High resolution magnetic resonance (MR) imaging is desirable in many clinical applications due to its contribution to more accurate subsequent analyses and early clinical diagnoses. Single image super-resolution (SISR) is an effective and cost efficient ...
