Search | arXiv e-print repository

arXiv:2009.03118 [pdf, other]

Interpretable Deep Multimodal Image Super-Resolution

Authors: Iman Marivani, Evaggelia Tsiligianni, Bruno Cornelis, Nikos Deligiannis

Abstract: Multimodal image super-resolution (SR) is the reconstruction of a high resolution image given a low-resolution observation with the aid of another image modality. While existing deep multimodal models do not incorporate domain knowledge about image SR, we present a multimodal deep network design that integrates coupled sparse priors and allows the effective fusion of information from another modal… ▽ More Multimodal image super-resolution (SR) is the reconstruction of a high resolution image given a low-resolution observation with the aid of another image modality. While existing deep multimodal models do not incorporate domain knowledge about image SR, we present a multimodal deep network design that integrates coupled sparse priors and allows the effective fusion of information from another modality into the reconstruction process. Our method is inspired by a novel iterative algorithm for coupled convolutional sparse coding, resulting in an interpretable network by design. We apply our model to the super-resolution of near-infrared image guided by RGB images. Experimental results show that our model outperforms state-of-the-art methods. △ Less

Submitted 7 September, 2020; originally announced September 2020.

Comments: in Proceedings of iTWIST'20, Paper-ID: 41, Nantes, France, December, 2-4, 2020

arXiv:2001.07575 [pdf, other]

doi 10.1109/TIP.2020.3014729

Multimodal Deep Unfolding for Guided Image Super-Resolution

Authors: Iman Marivani, Evaggelia Tsiligianni, Bruno Cornelis, Nikos Deligiannis

Abstract: The reconstruction of a high resolution image given a low resolution observation is an ill-posed inverse problem in imaging. Deep learning methods rely on training data to learn an end-to-end mapping from a low-resolution input to a high-resolution output. Unlike existing deep multimodal models that do not incorporate domain knowledge about the problem, we propose a multimodal deep learning design… ▽ More The reconstruction of a high resolution image given a low resolution observation is an ill-posed inverse problem in imaging. Deep learning methods rely on training data to learn an end-to-end mapping from a low-resolution input to a high-resolution output. Unlike existing deep multimodal models that do not incorporate domain knowledge about the problem, we propose a multimodal deep learning design that incorporates sparse priors and allows the effective integration of information from another image modality into the network architecture. Our solution relies on a novel deep unfolding operator, performing steps similar to an iterative algorithm for convolutional sparse coding with side information; therefore, the proposed neural network is interpretable by design. The deep unfolding architecture is used as a core component of a multimodal framework for guided image super-resolution. An alternative multimodal design is investigated by employing residual learning to improve the training efficiency. The presented multimodal approach is applied to super-resolution of near-infrared and multi-spectral images as well as depth upsampling using RGB images as side information. Experimental results show that our model outperforms state-of-the-art methods. △ Less

Submitted 21 January, 2020; originally announced January 2020.

arXiv:1910.08320 [pdf, other]

Multimodal Image Super-resolution via Deep Unfolding with Side Information

Authors: Iman Marivani, Evaggelia Tsiligianni, Bruno Cornelis, Nikos Deligiannis

Abstract: Deep learning methods have been successfully applied to various computer vision tasks. However, existing neural network architectures do not per se incorporate domain knowledge about the addressed problem, thus, understanding what the model has learned is an open research topic. In this paper, we rely on the unfolding of an iterative algorithm for sparse approximation with side information, and de… ▽ More Deep learning methods have been successfully applied to various computer vision tasks. However, existing neural network architectures do not per se incorporate domain knowledge about the addressed problem, thus, understanding what the model has learned is an open research topic. In this paper, we rely on the unfolding of an iterative algorithm for sparse approximation with side information, and design a deep learning architecture for multimodal image super-resolution that incorporates sparse priors and effectively utilizes information from another image modality. We develop two deep models performing reconstruction of a high-resolution image of a target image modality from its low-resolution variant with the aid of a high-resolution image from a second modality. We apply the proposed models to super-resolve near-infrared images using as side information high-resolution RGB\ images. Experimental results demonstrate the superior performance of the proposed models against state-of-the-art methods including unimodal and multimodal approaches. △ Less

Submitted 18 October, 2019; originally announced October 2019.

Comments: 5 pages, 5 figures, 3 tables, EUSIPCO 2019

arXiv:1907.02511 [pdf, other]

doi 10.1109/LSP.2019.2929869

Deep Coupled-Representation Learning for Sparse Linear Inverse Problems with Side Information

Authors: Evaggelia Tsiligianni, Nikos Deligiannis

Abstract: In linear inverse problems, the goal is to recover a target signal from undersampled, incomplete or noisy linear measurements. Typically, the recovery relies on complex numerical optimization methods; recent approaches perform an unfolding of a numerical algorithm into a neural network form, resulting in a substantial reduction of the computational complexity. In this paper, we consider the recove… ▽ More In linear inverse problems, the goal is to recover a target signal from undersampled, incomplete or noisy linear measurements. Typically, the recovery relies on complex numerical optimization methods; recent approaches perform an unfolding of a numerical algorithm into a neural network form, resulting in a substantial reduction of the computational complexity. In this paper, we consider the recovery of a target signal with the aid of a correlated signal, the so-called side information (SI), and propose a deep unfolding model that incorporates SI. The proposed model is used to learn coupled representations of correlated signals from different modalities, enabling the recovery of multimodal data at a low computational cost. As such, our work introduces the first deep unfolding method with SI, which actually comes from a different modality. We apply our model to reconstruct near-infrared images from undersampled measurements given RGB images as SI. Experimental results demonstrate the superior performance of the proposed framework against single-modal deep learning methods that do not use SI, multimodal deep learning designs, and optimization algorithms. △ Less

Submitted 4 July, 2019; originally announced July 2019.

Comments: 6 pages, 4 figures

arXiv:1812.01478 [pdf, other]

Matrix Factorization via Deep Learning

Authors: Duc Minh Nguyen, Evaggelia Tsiligianni, Nikos Deligiannis

Abstract: Matrix completion is one of the key problems in signal processing and machine learning. In recent years, deep-learning-based models have achieved state-of-the-art results in matrix completion. Nevertheless, they suffer from two drawbacks: (i) they can not be extended easily to rows or columns unseen during training; and (ii) their results are often degraded in case discrete predictions are require… ▽ More Matrix completion is one of the key problems in signal processing and machine learning. In recent years, deep-learning-based models have achieved state-of-the-art results in matrix completion. Nevertheless, they suffer from two drawbacks: (i) they can not be extended easily to rows or columns unseen during training; and (ii) their results are often degraded in case discrete predictions are required. This paper addresses these two drawbacks by presenting a deep matrix factorization model and a generic method to allow joint training of the factorization model and the discretization operator. Experiments on a real movie rating dataset show the efficacy of the proposed models. △ Less

Submitted 4 December, 2018; originally announced December 2018.

Comments: in Proceedings of iTWIST'18, Paper-ID: 27, Marseille, France, November, 21-23, 2018

arXiv:1811.01662 [pdf, other]

Matrix Completion With Variational Graph Autoencoders: Application in Hyperlocal Air Quality Inference

Authors: Tien Huu Do, Duc Minh Nguyen, Evaggelia Tsiligianni, Angel Lopez Aguirre, Valerio Panzica La Manna, Frank Pasveer, Wilfried Philips, Nikos Deligiannis

Abstract: Inferring air quality from a limited number of observations is an essential task for monitoring and controlling air pollution. Existing inference methods typically use low spatial resolution data collected by fixed monitoring stations and infer the concentration of air pollutants using additional types of data, e.g., meteorological and traffic information. In this work, we focus on street-level ai… ▽ More Inferring air quality from a limited number of observations is an essential task for monitoring and controlling air pollution. Existing inference methods typically use low spatial resolution data collected by fixed monitoring stations and infer the concentration of air pollutants using additional types of data, e.g., meteorological and traffic information. In this work, we focus on street-level air quality inference by utilizing data collected by mobile stations. We formulate air quality inference in this setting as a graph-based matrix completion problem and propose a novel variational model based on graph convolutional autoencoders. Our model captures effectively the spatio-temporal correlation of the measurements and does not depend on the availability of additional information apart from the street-network topology. Experiments on a real air quality dataset, collected with mobile stations, shows that the proposed model outperforms state-of-the-art approaches. △ Less

Submitted 5 November, 2018; originally announced November 2018.

arXiv:1807.01798 [pdf, ps, other]

Regularizing Autoencoder-Based Matrix Completion Models via Manifold Learning

Authors: Duc Minh Nguyen, Evaggelia Tsiligianni, Robert Calderbank, Nikos Deligiannis

Abstract: Autoencoders are popular among neural-network-based matrix completion models due to their ability to retrieve potential latent factors from the partially observed matrices. Nevertheless, when training data is scarce their performance is significantly degraded due to overfitting. In this paper, we mit- igate overfitting with a data-dependent regularization technique that relies on the principles of… ▽ More Autoencoders are popular among neural-network-based matrix completion models due to their ability to retrieve potential latent factors from the partially observed matrices. Nevertheless, when training data is scarce their performance is significantly degraded due to overfitting. In this paper, we mit- igate overfitting with a data-dependent regularization technique that relies on the principles of multi-task learning. Specifically, we propose an autoencoder-based matrix completion model that performs prediction of the unknown matrix values as a main task, and manifold learning as an auxiliary task. The latter acts as an inductive bias, leading to solutions that generalize better. The proposed model outperforms the existing autoencoder-based models designed for matrix completion, achieving high reconstruction accuracy in well-known datasets. △ Less

Submitted 4 July, 2018; originally announced July 2018.

Comments: 5 pages, Eusipco 2018

arXiv:1805.04912 [pdf, other]

Extendable Neural Matrix Completion

Authors: Duc Minh Nguyen, Evaggelia Tsiligianni, Nikos Deligiannis

Abstract: Matrix completion is one of the key problems in signal processing and machine learning, with applications ranging from image pro- cessing and data gathering to classification and recommender sys- tems. Recently, deep neural networks have been proposed as la- tent factor models for matrix completion and have achieved state- of-the-art performance. Nevertheless, a major problem with existing neural-… ▽ More Matrix completion is one of the key problems in signal processing and machine learning, with applications ranging from image pro- cessing and data gathering to classification and recommender sys- tems. Recently, deep neural networks have been proposed as la- tent factor models for matrix completion and have achieved state- of-the-art performance. Nevertheless, a major problem with existing neural-network-based models is their limited capabilities to extend to samples unavailable at the training stage. In this paper, we propose a deep two-branch neural network model for matrix completion. The proposed model not only inherits the predictive power of neural net- works, but is also capable of extending to partially observed samples outside the training set, without the need of retraining or fine-tuning. Experimental studies on popular movie rating datasets prove the ef- fectiveness of our model compared to the state of the art, in terms of both accuracy and extendability. △ Less

Submitted 13 May, 2018; originally announced May 2018.

Comments: 5 pages, 2 figures, ICASSP 2018

arXiv:1805.04612 [pdf, other]

Twitter User Geolocation using Deep Multiview Learning

Authors: Tien Huu Do, Duc Minh Nguyen, Evaggelia Tsiligianni, Bruno Cornelis, Nikos Deligiannis

Abstract: Predicting the geographical location of users on social networks like Twitter is an active research topic with plenty of methods proposed so far. Most of the existing work follows either a content-based or a network-based approach. The former is based on user-generated content while the latter exploits the structure of the network of users. In this paper, we propose a more generic approach, which… ▽ More Predicting the geographical location of users on social networks like Twitter is an active research topic with plenty of methods proposed so far. Most of the existing work follows either a content-based or a network-based approach. The former is based on user-generated content while the latter exploits the structure of the network of users. In this paper, we propose a more generic approach, which incorporates not only both content-based and network-based features, but also other available information into a unified model. Our approach, named Multi-Entry Neural Network (MENET), leverages the latest advances in deep learning and multiview learning. A realization of MENET with textual, network and metadata features results in an effective method for Twitter user geolocation, achieving the state of the art on two well-known datasets. △ Less

Submitted 11 May, 2018; originally announced May 2018.

Comments: Presented at IEEE International Conference on Acoustics, Speech and Signal Processing, 2018

arXiv:1712.08091 [pdf, other]

Multiview Deep Learning for Predicting Twitter Users' Location

Authors: Tien Huu Do, Duc Minh Nguyen, Evaggelia Tsiligianni, Bruno Cornelis, Nikos Deligiannis

Abstract: The problem of predicting the location of users on large social networks like Twitter has emerged from real-life applications such as social unrest detection and online marketing. Twitter user geolocation is a difficult and active research topic with a vast literature. Most of the proposed methods follow either a content-based or a network-based approach. The former exploits user-generated content… ▽ More The problem of predicting the location of users on large social networks like Twitter has emerged from real-life applications such as social unrest detection and online marketing. Twitter user geolocation is a difficult and active research topic with a vast literature. Most of the proposed methods follow either a content-based or a network-based approach. The former exploits user-generated content while the latter utilizes the connection or interaction between Twitter users. In this paper, we introduce a novel method combining the strength of both approaches. Concretely, we propose a multi-entry neural network architecture named MENET leveraging the advances in deep learning and multiview learning. The generalizability of MENET enables the integration of multiple data representations. In the context of Twitter user geolocation, we realize MENET with textual, network, and metadata features. Considering the natural distribution of Twitter users across the concerned geographical area, we subdivide the surface of the earth into multi-scale cells and train MENET with the labels of the cells. We show that our method outperforms the state of the art by a large margin on three benchmark datasets. △ Less

Submitted 21 December, 2017; originally announced December 2017.

Comments: Submitted to the IEEE Transactions on Big Data

arXiv:1708.08311 [pdf, other]

Deep Learning Sparse Ternary Projections for Compressed Sensing of Images

Authors: Duc Minh Nguyen, Evaggelia Tsiligianni, Nikos Deligiannis

Abstract: Compressed sensing (CS) is a sampling theory that allows reconstruction of sparse (or compressible) signals from an incomplete number of measurements, using of a sensing mechanism implemented by an appropriate projection matrix. The CS theory is based on random Gaussian projection matrices, which satisfy recovery guarantees with high probability; however, sparse ternary {0, -1, +1} projections are… ▽ More Compressed sensing (CS) is a sampling theory that allows reconstruction of sparse (or compressible) signals from an incomplete number of measurements, using of a sensing mechanism implemented by an appropriate projection matrix. The CS theory is based on random Gaussian projection matrices, which satisfy recovery guarantees with high probability; however, sparse ternary {0, -1, +1} projections are more suitable for hardware implementation. In this paper, we present a deep learning approach to obtain very sparse ternary projections for compressed sensing. Our deep learning architecture jointly learns a pair of a projection matrix and a reconstruction operator in an end-to-end fashion. The experimental results on real images demonstrate the effectiveness of the proposed approach compared to state-of-the-art methods, with significant advantage in terms of complexity. △ Less

Submitted 28 August, 2017; originally announced August 2017.

Comments: To appear in GlobalSIP 2017

Showing 1–11 of 11 results for author: Tsiligianni, E