-
Learning Binary Color Filter Arrays with Trainable Hard Thresholding
Authors:
Cemre Omer Ayna,
Bahadir Kursat Gunturk,
Ali Cafer Gurbuz
Abstract:
Color filter arrays (CFAs) are optical filters in digital cameras that capture specific color channels. Current commercial CFAs are hand-crafted patterns designed around different physical and application-specific considerations. This study proposes a binary CFA learning module based on hard thresholding, trained jointly with a deep learning-based demosaicing network. Unlike most existing learnable CFAs, which learn a channel from the whole color spectrum or linearly combine the available digital colors, this method learns a binary channel selection, resulting in CFAs that are practical and physically implementable in digital cameras. The binary selection adapts the hard thresholding operation to neural networks via a straight-through estimator; the method is therefore named HardMax. This paper covers the background of the CFA design problem, the description of the HardMax method, and the performance evaluation results. The evaluation includes tests with different demosaicing models, color configurations, and filter sizes, and a comparison with existing methods on various reconstruction metrics. The proposed approach is tested on the Kodak and BSDS500 datasets and provides higher reconstruction performance than hand-crafted or alternative learned binary filters.
Submitted 20 June, 2024;
originally announced June 2024.
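The binary channel selection described in the abstract above can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function names (`hardmax_forward`, `hardmax_backward_ste`, `apply_cfa`), the 2x2 RGB pattern size, and the use of an identity surrogate in the backward pass are all assumptions made for illustration. The abstract only states that hard thresholding is combined with a straight-through estimator.

```python
import numpy as np

def hardmax_forward(logits):
    """Forward pass: one-hot channel selection per filter position.

    logits has shape (ph, pw, C); the output has the same shape, with a 1
    at the channel holding the largest logit and 0 elsewhere.
    """
    idx = np.argmax(logits, axis=-1)
    return np.eye(logits.shape[-1], dtype=logits.dtype)[idx]

def hardmax_backward_ste(grad_output):
    """Backward pass under an assumed straight-through estimator.

    The non-differentiable selection is treated as the identity, so the
    upstream gradient is passed to the logits unchanged.
    """
    return grad_output

def apply_cfa(image, pattern):
    """Tile the binary pattern over an (H, W, C) image and keep one channel per pixel."""
    h, w, _ = image.shape
    ph, pw, _ = pattern.shape
    reps = (-(-h // ph), -(-w // pw), 1)  # ceiling division for the tile counts
    mask = np.tile(pattern, reps)[:h, :w]
    return (image * mask).sum(axis=-1)

rng = np.random.default_rng(0)
logits = rng.normal(size=(2, 2, 3))    # hypothetical learnable 2x2 RGB pattern
pattern = hardmax_forward(logits)      # binary, exactly one channel per position
image = rng.uniform(size=(4, 6, 3))    # stand-in for an RGB image
mosaic = apply_cfa(image, pattern)     # single-channel sensor reading, shape (4, 6)
```

In a joint architecture, `mosaic` would be fed to the demosaicing network, and the reconstruction loss gradient would reach `logits` through the surrogate backward pass. An autodiff framework would normally handle that step rather than a hand-written function.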
-
Deep learning-based blind image super-resolution with iterative kernel reconstruction and noise estimation
Authors:
Hasan F. Ates,
Suleyman Yildirim,
Bahadir K. Gunturk
Abstract:
Blind single image super-resolution (SISR) is a challenging task in image processing due to the ill-posed nature of the inverse problem. Complex degradations present in real-life images make it difficult to solve this problem using naïve deep learning approaches, where models are often trained on synthetically generated image pairs. Most of the effort so far has focused on solving the inverse problem under constraints, such as a limited space of blur kernels and/or noise-free input images. Yet the literature still lacks a well-generalized deep learning-based solution that performs well on images with unknown and highly complex degradations. In this paper, we propose IKR-Net (Iterative Kernel Reconstruction Network) for blind SISR. In the proposed approach, kernel and noise estimation and high-resolution image reconstruction are carried out iteratively using dedicated deep models. The iterative refinement significantly improves both the reconstructed image and the estimated blur kernel, even for noisy inputs. IKR-Net provides a generalized solution that can handle any type of blur and level of noise in the input low-resolution image. IKR-Net achieves state-of-the-art results in blind SISR, especially for noisy images with motion blur.
Submitted 25 April, 2024;
originally announced April 2024.
-
HSTR-Net: High Spatio-Temporal Resolution Video Generation For Wide Area Surveillance
Authors:
H. Umut Suluhan,
Hasan F. Ates,
Bahadir K. Gunturk
Abstract:
Wide area surveillance has many applications, and tracking objects under observation is an important task that often needs high spatio-temporal resolution (HSTR) video for better precision. This paper presents the use of multiple video feeds to generate HSTR video as an extension of reference-based super-resolution (RefSR). One feed captures video at high spatial resolution with a low frame rate (HSLF), while the other simultaneously captures low spatial resolution, high frame rate (LSHF) video of the same scene. The main purpose is to create an HSTR video by fusing the HSLF and LSHF videos. In this paper, we propose an end-to-end trainable deep network that performs optical flow estimation and frame reconstruction by combining inputs from both video feeds. The proposed architecture provides significant improvement over existing video frame interpolation and RefSR techniques in terms of the objective PSNR and SSIM metrics.
Submitted 9 April, 2022;
originally announced April 2022.
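For readers unfamiliar with the PSNR metric cited in the abstract above, the following is a minimal sketch of the standard definition, PSNR = 10 log10(peak^2 / MSE). The function name `psnr` and the default `peak=1.0` (images scaled to [0, 1]) are assumptions for illustration; the abstract does not say how the authors compute the metric.

```python
import numpy as np

def psnr(reference, estimate, peak=1.0):
    """Peak signal-to-noise ratio in dB between two same-shaped arrays.

    peak is the maximum possible pixel value (1.0 for images in [0, 1],
    255 for 8-bit images).
    """
    reference = np.asarray(reference, dtype=np.float64)
    estimate = np.asarray(estimate, dtype=np.float64)
    mse = np.mean((reference - estimate) ** 2)
    if mse == 0:
        return float("inf")  # identical frames
    return 10.0 * np.log10(peak ** 2 / mse)

# A reconstructed frame that is off by 0.1 everywhere has MSE = 0.01.
reference_frame = np.zeros((2, 2))
reconstructed_frame = np.full((2, 2), 0.1)
score = psnr(reference_frame, reconstructed_frame)
```

For video, a common convention is to average the per-frame values, though other aggregation choices exist.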