MotionCraft: Physics-based Zero-Shot Video Generation

Authors: Luca Savant Aira, Antonio Montanaro, Emanuele Aiello, Diego Valsesia, Enrico Magli

Abstract: Generating videos with realistic and physically plausible motion is one of the main recent challenges in computer vision. While diffusion models are achieving compelling results in image generation, video diffusion models are limited by heavy training and huge models, resulting in videos that are still biased to the training dataset. In this work we propose MotionCraft, a new zero-shot video generator to craft physics-based and realistic videos. MotionCraft is able to warp the noise latent space of an image diffusion model, such as Stable Diffusion, by applying an optical flow derived from a physics simulation. We show that warping the noise latent space results in coherent application of the desired motion while allowing the model to generate missing elements consistent with the scene evolution, which would otherwise result in artefacts or missing content if the flow was applied in the pixel space. We compare our method with the state-of-the-art Text2Video-Zero reporting qualitative and quantitative improvements, demonstrating the effectiveness of our approach to generate videos with finely-prescribed complex motion dynamics. Project page: https://mezzelfo.github.io/MotionCraft/

Submitted 22 May, 2024; originally announced May 2024.

Journal ref: NeurIPS 2024

arXiv:2403.18476

Modeling uncertainty for Gaussian Splatting

Authors: Luca Savant, Diego Valsesia, Enrico Magli

Abstract: We present Stochastic Gaussian Splatting (SGS): the first framework for uncertainty estimation using Gaussian Splatting (GS). GS recently advanced the novel-view synthesis field by achieving impressive reconstruction quality at a fraction of the computational cost of Neural Radiance Fields (NeRF). However, contrary to the latter, it still lacks the ability to provide information about the confidence associated with its outputs. To address this limitation, in this paper, we introduce a Variational Inference-based approach that seamlessly integrates uncertainty prediction into the common rendering pipeline of GS. Additionally, we introduce the Area Under Sparsification Error (AUSE) as a new term in the loss function, enabling optimization of uncertainty estimation alongside image reconstruction. Experimental results on the LLFF dataset demonstrate that our method outperforms existing approaches in terms of both image rendering quality and uncertainty estimation accuracy. Overall, our framework equips practitioners with valuable insights into the reliability of synthesized views, facilitating safer decision-making in real-world applications.

Submitted 27 March, 2024; originally announced March 2024.

arXiv:2403.17677

Onboard deep lossless and near-lossless predictive coding of hyperspectral images with line-based attention

Authors: Diego Valsesia, Tiziano Bianchi, Enrico Magli

Abstract: Deep learning methods have traditionally been difficult to apply to compression of hyperspectral images onboard spacecraft, due to the large computational complexity needed to achieve adequate representational power, as well as the lack of suitable datasets for training and testing. In this paper, we depart from the traditional autoencoder approach and we design a predictive neural network, called LineRWKV, that works recursively line-by-line to limit memory consumption. In order to achieve that, we adopt a novel hybrid attentive-recursive operation that combines the representational advantages of Transformers with the linear complexity and recursive implementation of recurrent neural networks. The compression algorithm performs prediction of each pixel using LineRWKV, followed by entropy coding of the residual. Experiments on the HySpecNet-11k dataset and PRISMA images show that LineRWKV is the first deep-learning method to outperform CCSDS-123.0-B-2 at lossless and near-lossless compression. Throughput is also evaluated on a 7 W embedded system, with promising results.

Submitted 26 March, 2024; originally announced March 2024.

arXiv:2401.16972

Deep 3D World Models for Multi-Image Super-Resolution Beyond Optical Flow

Authors: Luca Savant Aira, Diego Valsesia, Andrea Bordone Molini, Giulia Fracastoro, Enrico Magli, Andrea Mirabile

Abstract: Multi-image super-resolution (MISR) increases the spatial resolution of a low-resolution (LR) acquisition by combining multiple images carrying complementary information in the form of sub-pixel offsets in the scene sampling, and can be significantly more effective than its single-image counterpart. Its main difficulty lies in accurately registering and fusing the multi-image information. Currently studied settings, such as burst photography, typically involve assumptions of small geometric disparity between the LR images and rely on optical flow for image registration. We study a MISR method that can increase the resolution of sets of images acquired with arbitrary, and potentially wildly different, camera positions and orientations, generalizing the currently studied MISR settings. Our proposed model, called EpiMISR, moves away from optical flow and explicitly uses the epipolar geometry of the acquisition process, together with transformer-based processing of radiance feature fields, to substantially improve over state-of-the-art MISR methods in the presence of large disparities in the LR images.

Submitted 30 January, 2024; originally announced January 2024.

arXiv:2301.07969

doi 10.1109/ACCESS.2024.3436698

Fast Inference in Denoising Diffusion Models via MMD Finetuning

Authors: Emanuele Aiello, Diego Valsesia, Enrico Magli

Abstract: Denoising Diffusion Models (DDMs) have become a popular tool for generating high-quality samples from complex data distributions. These models are able to capture sophisticated patterns and structures in the data, and can generate samples that are highly diverse and representative of the underlying distribution. However, one of the main limitations of diffusion models is the complexity of sample generation, since a large number of inference timesteps is required to faithfully capture the data distribution. In this paper, we present MMD-DDM, a novel method for fast sampling of diffusion models. Our approach is based on the idea of using the Maximum Mean Discrepancy (MMD) to finetune the learned distribution with a given budget of timesteps. This allows the finetuned model to significantly improve the speed-quality trade-off, by substantially increasing fidelity in inference regimes with few steps or, equivalently, by reducing the required number of steps to reach a target fidelity, thus paving the way for a more practical adoption of diffusion models in a wide range of applications. We evaluate our approach on unconditional image generation with extensive experiments across the CIFAR-10, CelebA, ImageNet and LSUN-Church datasets. Our findings show that the proposed method is able to produce high-quality samples in a fraction of the time required by widely-used diffusion models, and outperforms state-of-the-art techniques for accelerated sampling. Code is available at: https://github.com/diegovalsesia/MMD-DDM.

Submitted 19 January, 2023; originally announced January 2023.
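
The abstract above relies on the Maximum Mean Discrepancy as a measure of how far generated samples are from real data. As a rough illustration only, the sketch below computes a biased estimate of the squared MMD between two small sample sets. The Gaussian kernel, its bandwidth, the 2-D toy data and all function names are assumptions made for this example; it is not the authors' finetuning procedure, which is described in the paper and the linked repository.

```python
import math
import random


def gaussian_kernel(x, y, bandwidth=1.0):
    """Gaussian (RBF) kernel between two equal-length vectors."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-sq_dist / (2.0 * bandwidth ** 2))


def mmd_squared(xs, ys, bandwidth=1.0):
    """Biased (V-statistic) estimate of the squared MMD between two sample sets."""

    def mean_kernel(a, b):
        total = sum(gaussian_kernel(p, q, bandwidth) for p in a for q in b)
        return total / (len(a) * len(b))

    return mean_kernel(xs, xs) + mean_kernel(ys, ys) - 2.0 * mean_kernel(xs, ys)


def sample_gaussian(rng, mean, count):
    """Draw `count` 2-D points from an isotropic unit-variance Gaussian."""
    return [(rng.gauss(mean, 1.0), rng.gauss(mean, 1.0)) for _ in range(count)]


rng = random.Random(0)
reference = sample_gaussian(rng, 0.0, 100)  # stand-in for "real" data (toy assumption)
matched = sample_gaussian(rng, 0.0, 100)    # drawn from the same distribution
shifted = sample_gaussian(rng, 3.0, 100)    # drawn from a clearly different one

print("MMD^2, same distribution:   ", mmd_squared(reference, matched))
print("MMD^2, shifted distribution:", mmd_squared(reference, shifted))
```

The estimate is close to zero when both sets come from the same distribution and grows as they diverge, which is the property that makes it usable as a training signal.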

arXiv:2209.10318

Rethinking the compositionality of point clouds through regularization in the hyperbolic space

Authors: Antonio Montanaro, Diego Valsesia, Enrico Magli

Abstract: Point clouds of 3D objects exhibit an inherent compositional nature where simple parts can be assembled into progressively more complex shapes to form whole objects. Explicitly capturing such part-whole hierarchy is a long-sought objective in order to build effective models, but its tree-like nature has made the task elusive. In this paper, we propose to embed the features of a point cloud classifier into the hyperbolic space and explicitly regularize the space to account for the part-whole hierarchy. The hyperbolic space is the only space that can successfully embed the tree-like nature of the hierarchy. This leads to substantial improvements in the performance of state-of-the-art supervised models for point cloud classification.

Submitted 21 September, 2022; originally announced September 2022.

Comments: NeurIPS 2022

arXiv:2209.09552

Cross-modal Learning for Image-Guided Point Cloud Shape Completion

Authors: Emanuele Aiello, Diego Valsesia, Enrico Magli

Abstract: In this paper we explore the recent topic of point cloud completion, guided by an auxiliary image. We show how it is possible to effectively combine the information from the two modalities in a localized latent space, thus avoiding the need for complex point cloud reconstruction methods from single views used by the state-of-the-art. We also investigate a novel weakly-supervised setting where the auxiliary image provides a supervisory signal to the training process by using a differentiable renderer on the completed point cloud to measure fidelity in the image space. Experiments show significant improvements over state-of-the-art supervised methods for both unimodal and multimodal completion. We also show the effectiveness of the weakly-supervised approach which outperforms a number of supervised methods and is competitive with the latest supervised models only exploiting point cloud information.

Submitted 20 September, 2022; originally announced September 2022.

Comments: NeurIPS 2022

arXiv:2207.00460

Exploring the solution space of linear inverse problems with GAN latent geometry

Authors: Antonio Montanaro, Diego Valsesia, Enrico Magli

Abstract: Inverse problems consist in reconstructing signals from incomplete sets of measurements and their performance is highly dependent on the quality of the prior knowledge encoded via regularization. While traditional approaches focus on obtaining a unique solution, an emerging trend considers exploring multiple feasible solutions. In this paper, we propose a method to generate multiple reconstructions that fit both the measurements and a data-driven prior learned by a generative adversarial network. In particular, we show that, starting from an initial solution, it is possible to find directions in the latent space of the generative model that are null to the forward operator, and thus keep consistency with the measurements, while inducing significant perceptual change. Our exploration approach allows generating multiple solutions to the inverse problem an order of magnitude faster than existing approaches; we show results on image super-resolution and inpainting problems.

Submitted 1 July, 2022; originally announced July 2022.

Comments: ICIP 2022

arXiv:2204.02631

Super-resolved multi-temporal segmentation with deep permutation-invariant networks

Authors: Diego Valsesia, Enrico Magli

Abstract: Multi-image super-resolution from multi-temporal satellite acquisitions of a scene has recently enjoyed great success thanks to new deep learning models. In this paper, we go beyond classic image reconstruction at a higher resolution by studying a super-resolved inference problem, namely semantic segmentation at a spatial resolution higher than that of the sensing platform. We expand upon recently proposed models exploiting temporal permutation invariance with a multi-resolution fusion module able to infer the rich semantic information needed by the segmentation task. The model presented in this paper has recently won the AI4EO challenge on Enhanced Sentinel 2 Agriculture.

Submitted 6 April, 2022; originally announced April 2022.

Comments: IGARSS 2022

arXiv:2108.09075

doi 10.1109/LGRS.2022.3195259

Semi-supervised learning for joint SAR and multispectral land cover classification

Authors: Antonio Montanaro, Diego Valsesia, Giulia Fracastoro, Enrico Magli

Abstract: Semi-supervised learning techniques are gaining popularity due to their capability of building models that are effective, even when scarce amounts of labeled data are available. In this paper, we present a framework and specific tasks for self-supervised pretraining of multichannel models, such as the fusion of multispectral and synthetic aperture radar images. We show that the proposed self-supervised approach is highly effective at learning features that correlate with the labels for land cover classification. This is enabled by an explicit design of pretraining tasks which promotes bridging the gaps between sensing modalities and exploiting the spectral characteristics of the input. In a semi-supervised setting, when limited labels are available, using the proposed self-supervised pretraining, followed by supervised finetuning for land cover classification with SAR and multispectral data, outperforms conventional approaches such as purely supervised learning, initialization from training on ImageNet and other recent self-supervised approaches.

Submitted 28 July, 2022; v1 submitted 20 August, 2021; originally announced August 2021.

Comments: IEEE Geoscience and Remote Sensing Letters

arXiv:2105.12409

doi 10.1109/TGRS.2021.3130673

Permutation invariance and uncertainty in multitemporal image super-resolution

Authors: Diego Valsesia, Enrico Magli

Abstract: Recent advances have shown how deep neural networks can be extremely effective at super-resolving remote sensing imagery, starting from a multitemporal collection of low-resolution images. However, existing models have neglected the issue of temporal permutation, whereby the temporal ordering of the input images does not carry any relevant information for the super-resolution task and causes such models to be inefficient with the often scarce ground truth data that is available for training. Thus, models ought not to learn feature extractors that rely on temporal ordering. In this paper, we show how building a model that is fully invariant to temporal permutation significantly improves performance and data efficiency. Moreover, we study how to quantify the uncertainty of the super-resolved image so that the final user is informed on the local quality of the product. We show how uncertainty correlates with temporal variation in the series, and how quantifying it further improves model performance. Experiments on the Proba-V challenge dataset show significant improvements over the state of the art without the need for self-ensembling, as well as improved data efficiency, reaching the performance of the challenge winner with just 25% of the training data.

Submitted 26 May, 2021; originally announced May 2021.
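
To make the notion of temporal permutation invariance in the abstract above concrete, here is a minimal sketch of one standard way to obtain it: fusing the frames with a symmetric operation such as the mean. The `fuse_frames` helper and the toy data are hypothetical and chosen for illustration; the paper's actual architecture is not reproduced here.

```python
import math
import random


def fuse_frames(frames):
    """Average a set of equally sized frames pixel by pixel.

    math.fsum tracks exact partial sums, so the result should not depend
    on the order in which the frames are listed.
    """
    count = len(frames)
    return [math.fsum(pixels) / count for pixels in zip(*frames)]


rng = random.Random(0)
# Five hypothetical low-resolution acquisitions, each flattened to 8 pixels.
frames = [[rng.random() for _ in range(8)] for _ in range(5)]

# Present the same acquisitions in a different temporal order.
shuffled = frames[:]
rng.shuffle(shuffled)

print("Order-independent fusion:", fuse_frames(frames) == fuse_frames(shuffled))
```

Any network whose only interaction across the temporal axis goes through such symmetric operations produces the same output for every ordering of its inputs, so no training data is spent learning that the ordering is irrelevant.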

arXiv:2103.16671

Denoise and Contrast for Category Agnostic Shape Completion

Authors: Antonio Alliegro, Diego Valsesia, Giulia Fracastoro, Enrico Magli, Tatiana Tommasi

Abstract: In this paper, we present a deep learning model that exploits the power of self-supervision to perform 3D point cloud completion, estimating the missing part and a context region around it. Local and global information are encoded in a combined embedding. A denoising pretext task provides the network with the needed local cues, decoupled from the high-level semantics and naturally shared over multiple classes. On the other hand, contrastive learning maximizes the agreement between variants of the same shape with different missing portions, thus producing a representation which captures the global appearance of the shape. The combined embedding inherits category-agnostic properties from the chosen pretext tasks. Differently from existing approaches, this allows better generalization of the completion properties to new categories unseen at training time. Moreover, while decoding the obtained joint representation, we better blend the reconstructed missing part with the partial shape by paying attention to its known surrounding region and reconstructing this frame as an auxiliary objective. Our extensive experiments and detailed ablation on the ShapeNet dataset show the effectiveness of each part of the method with new state of the art results. Our quantitative and qualitative analysis confirms how our approach is able to work on novel categories without relying on either classification and shape symmetry priors or adversarial training procedures.

Submitted 30 March, 2021; originally announced March 2021.

Comments: Accepted at CVPR 2021

arXiv:2103.15565

RAN-GNNs: breaking the capacity limits of graph neural networks

Authors: Diego Valsesia, Giulia Fracastoro, Enrico Magli

Abstract: Graph neural networks have become a staple in problems addressing learning and analysis of data defined over graphs. However, several results suggest an inherent difficulty in extracting better performance by increasing the number of layers. Recent works attribute this to a phenomenon peculiar to the extraction of node features in graph-based tasks, i.e., the need to consider multiple neighborhood sizes at the same time and adaptively tune them. In this paper, we investigate the recently proposed randomly wired architectures in the context of graph neural networks. Instead of building deeper networks by stacking many layers, we prove that employing a randomly-wired architecture can be a more effective way to increase the capacity of the network and obtain richer representations. We show that such architectures behave like an ensemble of paths, which are able to merge contributions from receptive fields of varied size. Moreover, these receptive fields can also be modulated to be wider or narrower through the trainable weights over the paths. We also provide extensive experimental evidence of the superior performance of randomly wired architectures over multiple tasks and four graph convolution definitions, using recent benchmarking frameworks that address the reliability of previous testing methodologies.

Submitted 29 March, 2021; originally announced March 2021.

arXiv:2012.05508

doi 10.1109/MGRS.2021.3070956

Deep Learning Methods For Synthetic Aperture Radar Image Despeckling: An Overview Of Trends And Perspectives

Authors: Giulia Fracastoro, Enrico Magli, Giovanni Poggi, Giuseppe Scarpa, Diego Valsesia, Luisa Verdoliva

Abstract: Synthetic aperture radar (SAR) images are affected by a spatially-correlated and signal-dependent noise called speckle, which is very severe and may hinder image exploitation. Despeckling is an important task that aims at removing such noise, so as to improve the accuracy of all downstream image processing tasks. The first despeckling methods date back to the 1970s, and several model-based algorithms have been developed in the subsequent years. The field has received growing attention, sparked by the availability of powerful deep learning models that have yielded excellent performance for inverse problems in image processing. This paper surveys the literature on deep learning methods applied to SAR despeckling, covering both the supervised and the more recent self-supervised approaches. We provide a critical analysis of existing methods with the objective to recognize the most promising research lines, to identify the factors that have limited the success of deep models, and to propose ways forward in an attempt to fully exploit the potential of deep learning for SAR despeckling.

Submitted 2 May, 2021; v1 submitted 10 December, 2020; originally announced December 2020.

arXiv:2010.15487

Beyond cross-entropy: learning highly separable feature distributions for robust and accurate classification

Authors: Arslan Ali, Andrea Migliorati, Tiziano Bianchi, Enrico Magli

Abstract: Deep learning has shown outstanding performance in several applications including image classification. However, deep classifiers are known to be highly vulnerable to adversarial attacks, in that a minor perturbation of the input can easily lead to an error. Providing robustness to adversarial attacks is a very challenging task especially in problems involving a large number of classes, as it typically comes at the expense of accuracy. In this work, we propose the Gaussian class-conditional simplex (GCCS) loss: a novel approach for training deep robust multiclass classifiers that provides adversarial robustness while at the same time achieving or even surpassing the classification accuracy of state-of-the-art methods. Differently from other frameworks, the proposed method learns a mapping of the input classes onto target distributions in a latent space such that the classes are linearly separable. Instead of maximizing the likelihood of target labels for individual samples, our objective function pushes the network to produce feature distributions yielding high inter-class separation. The mean values of the distributions are centered on the vertices of a simplex such that each class is at the same distance from every other class. We show that the regularization of the latent space based on our approach yields excellent classification accuracy and inherently provides robustness to multiple adversarial attacks, both targeted and untargeted, outperforming state-of-the-art approaches over challenging datasets.

Submitted 29 October, 2020; originally announced October 2020.
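
The abstract above places the class means on the vertices of a simplex so that every class is equidistant from every other. The sketch below shows one simple way to build such a set of target means and checks the equidistance property. The construction (centered, scaled one-hot vectors), the latent dimension equal to the number of classes, and the `scale` parameter are assumptions for illustration; the paper may use a different parameterization.

```python
import math
from itertools import combinations


def simplex_vertices(num_classes, scale=1.0):
    """Return `num_classes` mutually equidistant points centered at the origin.

    Each vertex is a one-hot vector minus the common centroid, times `scale`.
    """
    centroid = 1.0 / num_classes
    return [
        [scale * ((1.0 if i == j else 0.0) - centroid) for j in range(num_classes)]
        for i in range(num_classes)
    ]


def distance(u, v):
    """Euclidean distance between two vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


# Hypothetical setting: 5 classes, target means spread out by a factor of 10.
means = simplex_vertices(5, scale=10.0)
pairwise = [distance(u, v) for u, v in combinations(means, 2)]

print("min pairwise distance:", min(pairwise))
print("max pairwise distance:", max(pairwise))
```

With this construction every pair of means is separated by `scale * sqrt(2)`, so no class is systematically closer to another and the inter-class margin is controlled by a single parameter.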

arXiv:2008.06021

BioMetricNet: deep unconstrained face verification through learning of metrics regularized onto Gaussian distributions

Authors: Arslan Ali, Matteo Testa, Tiziano Bianchi, Enrico Magli

Abstract: We present BioMetricNet: a novel framework for deep unconstrained face verification which learns a regularized metric to compare facial features. Differently from popular methods such as FaceNet, the proposed approach does not impose any specific metric on facial features; instead, it shapes the decision space by learning a latent representation in which matching and non-matching pairs are mapped onto clearly separated and well-behaved target distributions. In particular, the network jointly learns the best feature representation, and the best metric that follows the target distributions, to be used to discriminate face images. In this paper we present this general framework, first of its kind for facial verification, and tailor it to Gaussian distributions. This choice enables the use of a simple linear decision boundary that can be tuned to achieve the desired trade-off between false alarm and genuine acceptance rate, and leads to a loss function that can be written in closed form. Extensive analysis and experimentation on publicly available datasets such as Labeled Faces in the Wild (LFW), YouTube Faces (YTF), Celebrities in Frontal-Profile in the Wild (CFP), and challenging datasets like cross-age LFW (CALFW), cross-pose LFW (CPLFW), In-the-wild Age Dataset (AgeDB) show a significant performance improvement and confirm the effectiveness and superiority of BioMetricNet over existing state-of-the-art methods.

Submitted 13 August, 2020; originally announced August 2020.

Comments: Accepted at ECCV20

arXiv:2007.02578

Learning Graph-Convolutional Representations for Point Cloud Denoising

Authors: Francesca Pistilli, Giulia Fracastoro, Diego Valsesia, Enrico Magli

Abstract: Point clouds are an increasingly relevant data type but they are often corrupted by noise. We propose a deep neural network based on graph-convolutional layers that can elegantly deal with the permutation-invariance problem encountered by learning-based point cloud processing methods. The network is fully-convolutional and can build complex hierarchies of features by dynamically constructing neighborhood graphs from similarity among the high-dimensional feature representations of the points. When coupled with a loss promoting proximity to the ideal surface, the proposed approach significantly outperforms state-of-the-art methods on a variety of metrics. In particular, it is able to improve in terms of Chamfer measure and of quality of the surface normals that can be estimated from the denoised data. We also show that it is especially robust both at high noise levels and in the presence of structured noise such as the one encountered in real LiDAR scans.

Submitted 6 July, 2020; originally announced July 2020.

Comments: European Conference on Computer Vision (ECCV) 2020
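
The abstract above reports results in terms of the Chamfer measure between point clouds. As a hedged illustration, the sketch below implements one common variant: the mean squared nearest-neighbor distance, summed over both directions. Conventions differ (squared versus unsquared distances, sum versus mean), and the exact definition used in the paper is not stated in this listing, so treat the function and the toy clouds as assumptions.

```python
def squared_distance(p, q):
    """Squared Euclidean distance between two points."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def chamfer_distance(cloud_a, cloud_b):
    """Symmetric Chamfer distance using mean squared nearest-neighbor distances."""
    a_to_b = sum(
        min(squared_distance(p, q) for q in cloud_b) for p in cloud_a
    ) / len(cloud_a)
    b_to_a = sum(
        min(squared_distance(q, p) for p in cloud_a) for q in cloud_b
    ) / len(cloud_b)
    return a_to_b + b_to_a


# Toy example: a 3x3 planar grid and a copy displaced by 0.1 along z.
clean = [(float(x), float(y), 0.0) for x in range(3) for y in range(3)]
noisy = [(x, y, z + 0.1) for x, y, z in clean]

print("Chamfer(clean, clean):", chamfer_distance(clean, clean))
print("Chamfer(clean, noisy):", chamfer_distance(clean, noisy))
```

The brute-force nearest-neighbor search is quadratic in the number of points; it is adequate for a toy example, while real evaluations typically rely on a spatial index.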

arXiv:2007.02075

Speckle2Void: Deep Self-Supervised SAR Despeckling with Blind-Spot Convolutional Neural Networks

Authors: Andrea Bordone Molini, Diego Valsesia, Giulia Fracastoro, Enrico Magli

Abstract: Information extraction from synthetic aperture radar (SAR) images is heavily impaired by speckle noise, hence despeckling is a crucial preliminary step in scene analysis algorithms. The recent success of deep learning envisions a new generation of despeckling techniques that could outperform classical model-based methods. However, current deep learning approaches to despeckling require supervision for training, whereas clean SAR images are impossible to obtain. In the literature, this issue is tackled by resorting to either synthetically speckled optical images, which exhibit different properties with respect to true SAR images, or multi-temporal SAR images, which are difficult to acquire or fuse accurately. In this paper, inspired by recent works on blind-spot denoising networks, we propose a self-supervised Bayesian despeckling method. The proposed method is trained employing only noisy SAR images and can therefore learn features of real SAR images rather than synthetic data. Experiments show that the performance of the proposed approach is very close to the supervised training approach on synthetic data and superior on real data in both quantitative and visual assessments.

Submitted 4 July, 2020; originally announced July 2020.

arXiv:2001.06342 [pdf, other]

DeepSUM++: Non-local Deep Neural Network for Super-Resolution of Unregistered Multitemporal Images

Authors: Andrea Bordone Molini, Diego Valsesia, Giulia Fracastoro, Enrico Magli

Abstract: Deep learning methods for super-resolution of a remote sensing scene from multiple unregistered low-resolution images have recently gained attention thanks to a challenge proposed by the European Space Agency. This paper presents an evolution of the winner of the challenge, showing how incorporating non-local information in a convolutional neural network makes it possible to exploit self-similar patterns that provide enhanced regularization of the super-resolution problem. Experiments on the dataset of the challenge show improved performance over the state-of-the-art, which does not exploit non-local information.

Submitted 15 January, 2020; originally announced January 2020.

Comments: arXiv admin note: text overlap with arXiv:1907.06490

arXiv:2001.05264 [pdf, other]

Towards Deep Unsupervised SAR Despeckling with Blind-Spot Convolutional Neural Networks

Authors: Andrea Bordone Molini, Diego Valsesia, Giulia Fracastoro, Enrico Magli

Abstract: SAR despeckling is a problem of paramount importance in remote sensing, since it represents the first step of many scene analysis algorithms. Recently, deep learning techniques have outperformed classical model-based despeckling algorithms. However, such methods require clean ground truth images for training, thus resorting to synthetically speckled optical images since clean SAR images cannot be acquired. In this paper, inspired by recent works on blind-spot denoising networks, we propose a self-supervised Bayesian despeckling method. The proposed method is trained employing only noisy images and can therefore learn features of real SAR images rather than synthetic data. We show that the performance of the proposed network is very close to the supervised training approach on synthetic data and competitive on real data.

Submitted 15 January, 2020; originally announced January 2020.

arXiv:1911.08764 [pdf, other]

Learning mappings onto regularized latent spaces for biometric authentication

Authors: Matteo Testa, Arslan Ali, Tiziano Bianchi, Enrico Magli

Abstract: We propose a novel architecture for generic biometric authentication based on deep neural networks: RegNet. Unlike other methods, RegNet learns a mapping of the input biometric traits onto a target distribution in a well-behaved space in which users can be separated by means of simple and tunable boundaries. More specifically, authorized and unauthorized users are mapped onto two different and well-behaved Gaussian distributions. The novel approach of learning the mapping instead of the boundaries further avoids the problem encountered in typical classifiers for which the learnt boundaries may be complex and difficult to analyze. RegNet achieves high performance in terms of security metrics such as Equal Error Rate (EER), False Acceptance Rate (FAR) and Genuine Acceptance Rate (GAR). The experiments we conducted on publicly available datasets of face and fingerprint confirm the effectiveness of the proposed system.

Submitted 20 November, 2019; originally announced November 2019.

Comments: Accepted at IEEE MMSP 2019

arXiv:1909.01802 [pdf, other]

Analysis of SparseHash: an efficient embedding of set-similarity via sparse projections

Authors: Diego Valsesia, Sophie Marie Fosson, Chiara Ravazzi, Tiziano Bianchi, Enrico Magli

Abstract: Embeddings provide compact representations of signals in order to perform efficient inference in a wide variety of tasks. In particular, random projections are common tools to construct Euclidean distance-preserving embeddings, while hashing techniques are extensively used to embed set-similarity metrics, such as the Jaccard coefficient. In this letter, we theoretically prove that a class of random projections based on sparse matrices, called SparseHash, can preserve the Jaccard coefficient between the supports of sparse signals, which can be used to estimate set similarities. Moreover, besides the analysis, we provide an efficient implementation and we test the performance in several numerical experiments, both on synthetic and real datasets.

Submitted 2 September, 2019; originally announced September 2019.

Comments: 25 pages, 6 figures

arXiv:1907.08448 [pdf, other]

doi 10.1109/TIP.2020.3013166

Deep Graph-Convolutional Image Denoising

Authors: Diego Valsesia, Giulia Fracastoro, Enrico Magli

Abstract: Non-local self-similarity is well-known to be an effective prior for the image denoising problem. However, little work has been done to incorporate it in convolutional neural networks, which surpass non-local model-based methods despite only exploiting local information. In this paper, we propose a novel end-to-end trainable neural network architecture employing layers based on graph convolution operations, thereby creating neurons with non-local receptive fields. The graph convolution operation generalizes the classic convolution to arbitrary graphs. In this work, the graph is dynamically computed from similarities among the hidden features of the network, so that the powerful representation learning capabilities of the network are exploited to uncover self-similar patterns. We introduce a lightweight Edge-Conditioned Convolution which addresses vanishing gradient and over-parameterization issues of this particular graph convolution. Extensive experiments show state-of-the-art performance with improved qualitative and quantitative results on both synthetic Gaussian noise and real noise.

Submitted 19 July, 2019; originally announced July 2019.

arXiv:1907.06490 [pdf, other]

doi 10.1109/TGRS.2019.2959248

DeepSUM: Deep neural network for Super-resolution of Unregistered Multitemporal images

Authors: Andrea Bordone Molini, Diego Valsesia, Giulia Fracastoro, Enrico Magli

Abstract: Recently, convolutional neural networks (CNN) have been successfully applied to many remote sensing problems. However, deep learning techniques for multi-image super-resolution from multitemporal unregistered imagery have received little attention so far. This work proposes a novel CNN-based technique that exploits both spatial and temporal correlations to combine multiple images. This novel framework integrates the spatial registration task directly inside the CNN, and makes it possible to exploit the representation learning capabilities of the network to enhance registration accuracy. The entire super-resolution process relies on a single CNN with three main stages: shared 2D convolutions to extract high-dimensional features from the input images; a subnetwork proposing registration filters derived from the high-dimensional feature representations; 3D convolutions for slow fusion of the features from multiple images. The whole network can be trained end-to-end to recover a single high resolution image from multiple unregistered low resolution images. The method presented in this paper is the winner of the PROBA-V super-resolution challenge issued by the European Space Agency.

Submitted 15 January, 2020; v1 submitted 15 July, 2019; originally announced July 2019.

arXiv:1907.02959 [pdf, other]

doi 10.1109/TGRS.2019.2927434

High-throughput Onboard Hyperspectral Image Compression with Ground-based CNN Reconstruction

Authors: Diego Valsesia, Enrico Magli

Abstract: Compression of hyperspectral images onboard spacecraft is a tradeoff between the limited computational resources and the ever-growing spatial and spectral resolution of the optical instruments. As such, it requires low-complexity algorithms with good rate-distortion performance and high throughput. In recent years, the Consultative Committee for Space Data Systems (CCSDS) has focused on lossless and near-lossless compression approaches based on predictive coding, resulting in the recently published CCSDS 123.0-B-2 recommended standard. While the in-loop reconstruction of quantized prediction residuals provides excellent rate-distortion performance for the near-lossless operating mode, it significantly constrains the achievable throughput due to data dependencies. In this paper, we study the performance of a faster method based on prequantization of the image followed by a lossless predictive compressor. While this is well known to be suboptimal, one can exploit powerful signal models to reconstruct the image at the ground segment, recovering part of the suboptimality. In particular, we show that convolutional neural networks can be used for this task and that they can recover the whole SNR drop incurred at a bitrate of 2 bits per pixel.

Submitted 5 July, 2019; originally announced July 2019.

arXiv:1905.12281 [pdf, other]

Image Denoising with Graph-Convolutional Neural Networks

Authors: Diego Valsesia, Giulia Fracastoro, Enrico Magli

Abstract: Recovering an image from a noisy observation is a key problem in signal processing. Recently, it has been shown that data-driven approaches employing convolutional neural networks can outperform classical model-based techniques, because they can capture more powerful and discriminative features. However, since these methods are based on convolutional operations, they are only capable of exploiting local similarities without taking into account non-local self-similarities. In this paper we propose a convolutional neural network that employs graph-convolutional layers in order to exploit both local and non-local similarities. The graph-convolutional layers dynamically construct neighborhoods in the feature space to detect latent correlations in the feature maps produced by the hidden layers. The experimental results show that the proposed architecture outperforms classical convolutional neural networks for the denoising task.

Submitted 29 May, 2019; originally announced May 2019.

Comments: IEEE International Conference on Image Processing (ICIP) 2019

arXiv:1804.06182 [pdf, ps, other]

doi 10.1109/TSIPN.2018.2869354

Sampling of graph signals via randomized local aggregations

Authors: Diego Valsesia, Giulia Fracastoro, Enrico Magli

Abstract: Sampling of signals defined over the nodes of a graph is one of the crucial problems in graph signal processing. While in classical signal processing sampling is a well defined operation, when we consider a graph signal many new challenges arise and defining an efficient sampling strategy is not straightforward. Recently, several works have addressed this problem. The most common techniques select a subset of nodes to reconstruct the entire signal. However, such methods often require the knowledge of the signal support and the computation of the sparsity basis before sampling. Instead, in this paper we propose a new approach to this issue. We introduce a novel technique that combines localized sampling with compressed sensing. We first choose a subset of nodes and then, for each node of the subset, we compute random linear combinations of signal coefficients localized at the node itself and its neighborhood. The proposed method provides theoretical guarantees in terms of reconstruction and stability to noise for any graph and any orthonormal basis, even when the support is not known.

Submitted 29 May, 2019; v1 submitted 17 April, 2018; originally announced April 2018.

Comments: IEEE Transactions on Signal and Information Processing over Networks, 2019

arXiv:1707.02244 [pdf, other]

doi 10.1080/01431161.2017.1356489

GPU-Accelerated Algorithms for Compressed Signals Recovery with Application to Astronomical Imagery Deblurring

Authors: Attilio Fiandrotti, Sophie M. Fosson, Chiara Ravazzi, Enrico Magli

Abstract: Compressive sensing promises to enable bandwidth-efficient on-board compression of astronomical data by lifting the encoding complexity from the source to the receiver. The signal is recovered off-line, exploiting the parallel computation capabilities of GPUs to speed up the reconstruction process. However, inherent GPU hardware constraints limit the size of the recoverable signal and the speedup practically achievable. In this work, we design parallel algorithms that exploit the properties of circulant matrices for efficient GPU-accelerated sparse signal recovery. Our approach reduces the memory requirements, allowing us to recover very large signals with limited memory. In addition, it achieves a tenfold signal recovery speedup thanks to ad-hoc parallelization of matrix-vector multiplications and matrix inversions. Finally, we practically demonstrate our algorithms in a typical application of circulant matrices: deblurring a sparse astronomical image in the compressed domain.

Submitted 7 July, 2017; originally announced July 2017.

arXiv:1703.05022 [pdf, other]

doi 10.1109/LSP.2017.2657889

Steerable Discrete Fourier Transform

Authors: Giulia Fracastoro, Enrico Magli

Abstract: Directional transforms have recently raised a lot of interest thanks to their numerous applications in signal compression and analysis. In this letter, we introduce a generalization of the discrete Fourier transform, called steerable DFT (SDFT). Since the DFT is used in numerous fields, it may be of interest in a wide range of applications. Moreover, we also show that the SDFT is highly related to other well-known transforms, such as the Fourier sine and cosine transforms and the Hilbert transforms.

Submitted 15 March, 2017; originally announced March 2017.

arXiv:1701.08513 [pdf, other]

doi 10.1109/LGRS.2016.2644726

Fast and Lightweight Rate Control for Onboard Predictive Coding of Hyperspectral Images

Authors: Diego Valsesia, Enrico Magli

Abstract: Predictive coding is attractive for compression of hyperspectral images onboard spacecraft in light of the excellent rate-distortion performance and low complexity of recent schemes. In this letter we propose a rate control algorithm and integrate it in a lossy extension to the CCSDS-123 lossless compression recommendation. The proposed rate algorithm overhauls our previous scheme by being orders of magnitude faster and simpler to implement, while still providing the same accuracy in terms of output rate and comparable or better image quality.

Submitted 30 January, 2017; originally announced January 2017.

arXiv:1701.08511 [pdf, other]

doi 10.1109/LSP.2016.2639036

Binary adaptive embeddings from order statistics of random projections

Authors: Diego Valsesia, Enrico Magli

Abstract: We use some of the largest order statistics of the random projections of a reference signal to construct a binary embedding that is adapted to signals correlated with such signal. The embedding is characterized from the analytical standpoint and shown to provide improved performance on tasks such as classification in a reduced-dimensionality space.

Submitted 30 January, 2017; originally announced January 2017.

arXiv:1611.02431 [pdf, other]

doi 10.1109/TSP.2016.2548990

Distributed recovery of jointly sparse signals under communication constraints

Authors: Sophie M. Fosson, Javier Matamoros, Carles Anton-Haro, Enrico Magli

Abstract: The problem of the distributed recovery of jointly sparse signals has attracted much attention recently. Let us assume that the nodes of a network observe different sparse signals with common support; starting from linear, compressed measurements, and exploiting network communication, each node aims at reconstructing the support and the non-zero values of its observed signal. In the literature, distributed greedy algorithms have been proposed to tackle this problem, among which the most reliable ones require a large amount of transmitted data, which barely adapts to realistic network communication constraints. In this work, we address the problem through a reweighted $\ell_1$ soft thresholding technique, in which the threshold is iteratively tuned based on the current estimate of the support. The proposed method adapts to constrained networks, as it requires only local communication among neighbors, and the transmitted messages are indices from a finite set. We analytically prove the convergence of the proposed algorithm and we show that it outperforms the state-of-the-art greedy methods in terms of balance between recovery accuracy and communication load.

Submitted 8 November, 2016; originally announced November 2016.

arXiv:1610.09152 [pdf, other]

doi 10.1109/TIP.2016.2623489

Steerable Discrete Cosine Transform

Authors: Giulia Fracastoro, Sophie Marie Fosson, Enrico Magli

Abstract: In image compression, classical block-based separable transforms tend to be inefficient when image blocks contain arbitrarily shaped discontinuities. For this reason, transforms incorporating directional information are an appealing alternative. In this paper, we propose a new approach to this problem, namely a discrete cosine transform (DCT) that can be steered in any chosen direction. This transform, called steerable DCT (SDCT), allows pairs of basis vectors to be rotated in a flexible way, and enables precise matching of directionality in each image block, achieving improved coding efficiency. The optimal rotation angles for SDCT can be represented as the solution of a suitable rate-distortion (RD) problem. We propose iterative methods to search for such a solution, and we develop a fully fledged image encoder to practically compare our techniques with other competing transforms. Analytical and numerical results prove that SDCT outperforms both DCT and state-of-the-art directional transforms.

Submitted 23 October, 2018; v1 submitted 28 October, 2016; originally announced October 2016.

arXiv:1610.03623 [pdf, other]

Fast Training of Convolutional Neural Networks via Kernel Rescaling

Authors: Pedro Porto Buarque de Gusmão, Gianluca Francini, Skjalg Lepsøy, Enrico Magli

Abstract: Training deep Convolutional Neural Networks (CNN) is a time consuming task that may take weeks to complete. In this article we propose a novel, theoretically founded method for reducing CNN training time without incurring any loss in accuracy. The basic idea is to begin training with a pre-train network using lower-resolution kernels and input images, and then refine the results at the full resolution by exploiting the spatial scaling property of convolutions. We apply our method to the ImageNet winner OverFeat and to the more recent ResNet architecture and show a reduction in training time of nearly 20% while test set accuracy is preserved in both cases.

Submitted 12 October, 2016; originally announced October 2016.

arXiv:1503.08696 [pdf, ps, other]

doi 10.1109/TCOMM.2015.2413405

Graded quantization for multiple description coding of compressive measurements

Authors: Diego Valsesia, Giulio Coluccia, Enrico Magli

Abstract: Compressed sensing (CS) is an emerging paradigm for acquisition of compressed representations of a sparse signal. Its low complexity is appealing for resource-constrained scenarios like sensor networks. However, such scenarios are often coupled with unreliable communication channels and providing robust transmission of the acquired data to a receiver is an issue. Multiple description coding (MDC) effectively combats channel losses for systems without feedback, thus raising the interest in developing MDC methods explicitly designed for the CS framework, and exploiting its properties. We propose a method called Graded Quantization (CS-GQ) that leverages the democratic property of compressive measurements to effectively implement MDC, and we provide methods to optimize its performance. A novel decoding algorithm based on the alternating directions method of multipliers is derived to reconstruct signals from a limited number of received descriptions. Simulations are performed to assess the performance of CS-GQ against other methods in presence of packet losses. The proposed method is successful at providing robust coding of CS measurements and outperforms other schemes for the considered test metrics.

Submitted 30 March, 2015; originally announced March 2015.

Journal ref: IEEE Transactions on Communications, 2015

arXiv:1404.1237 [pdf, other]

doi 10.1109/TCOMM.2014.2316176

Operational Rate-Distortion Performance of Single-source and Distributed Compressed Sensing

Authors: Giulio Coluccia, Aline Roumy, Enrico Magli

Abstract: We consider correlated and distributed sources without cooperation at the encoder. For these sources, we derive the best achievable performance in the rate-distortion sense of any distributed compressed sensing scheme, under the constraint of high-rate quantization. Moreover, under this model we derive a closed-form expression of the rate gain achieved by taking into account the correlation of the sources at the receiver and a closed-form expression of the average performance of the oracle receiver for independent and joint reconstruction. Finally, we show experimentally that the exploitation of the correlation between the sources performs close to optimal and that the only penalty is due to the missing knowledge of the sparsity support as in (non distributed) compressed sensing. Even if the derivation is performed in the large system regime, where signal and system parameters tend to infinity, numerical results show that the equations match simulations for parameter values of practical interest.

Submitted 4 April, 2014; originally announced April 2014.

Comments: To appear in IEEE Transactions on Communications

arXiv:1403.2835 [pdf, other]

Compressive Signal Processing with Circulant Sensing Matrices

Authors: Diego Valsesia, Enrico Magli

Abstract: Compressive sensing achieves effective dimensionality reduction of signals, under a sparsity constraint, by means of a small number of random measurements acquired through a sensing matrix. In a signal processing system, the problem arises of processing the random projections directly, without first reconstructing the signal. In this paper, we show that circulant sensing matrices make it possible to perform a variety of classical signal processing tasks such as filtering, interpolation, registration, transforms, and so forth, directly in the compressed domain and in an exact fashion, i.e., without relying on estimators as proposed in the existing literature. The advantage of the techniques presented in this paper is to enable direct measurement-to-measurement transformations, without the need for costly recovery procedures.

Submitted 12 March, 2014; originally announced March 2014.

arXiv:1403.1697 [pdf, other]

Compressive Hyperspectral Imaging Using Progressive Total Variation

Authors: Simeon Kamdem Kuiteing, Giulio Coluccia, Alessandro Barducci, Mauro Barni, Enrico Magli

Abstract: Compressed Sensing (CS) is suitable for remote acquisition of hyperspectral images for earth observation, since it could exploit the strong spatial and spectral correlations, allowing to simplify the architecture of the onboard sensors. Solutions proposed so far tend to decouple spatial and spectral dimensions to reduce the complexity of the reconstruction, not taking into account that onboard sensors progressively acquire spectral rows rather than acquiring spectral channels. For this reason, we propose a novel progressive CS architecture based on separate sensing of spectral rows and joint reconstruction employing Total Variation. Experimental results run on raw AVIRIS and AIRS images confirm the validity of the proposed system.

Submitted 7 March, 2014; originally announced March 2014.

Comments: To be published on ICASSP 2014 proceedings

arXiv:1403.1696 [pdf, other]

Exact Performance Analysis of the Oracle Receiver for Compressed Sensing Reconstruction

Authors: Giulio Coluccia, Aline Roumy, Enrico Magli

Abstract: A sparse or compressible signal can be recovered from a certain number of noisy random projections, smaller than what is dictated by classic Shannon/Nyquist theory. In this paper, we derive the closed-form expression of the mean square error performance of the oracle receiver, knowing the sparsity pattern of the signal. With respect to existing bounds, our result is exact and does not depend on a particular realization of the sensing matrix. Moreover, our result holds irrespective of whether the noise affecting the measurements is white or correlated. Numerical results show a perfect match between equations and simulations, confirming the validity of the result.

Submitted 7 March, 2014; originally announced March 2014.

Comments: To be published in ICASSP 2014 proceedings

arXiv:1401.3277 [pdf, ps, other]

doi 10.1109/TGRS.2013.2296329

A Novel Rate Control Algorithm for Onboard Predictive Coding of Multispectral and Hyperspectral Images

Authors: Diego Valsesia, Enrico Magli

Abstract: Predictive coding is attractive for compression onboard of spacecrafts thanks to its low computational complexity, modest memory requirements and the ability to accurately control quality on a pixel-by-pixel basis. Traditionally, predictive compression focused on the lossless and near-lossless modes of operation where the maximum error can be bounded but the rate of the compressed image is variable. Rate control is considered a challenging problem for predictive encoders due to the dependencies between quantization and prediction in the feedback loop, and the lack of a signal representation that packs the signal's energy into few coefficients. In this paper, we show that it is possible to design a rate control scheme intended for onboard implementation. In particular, we propose a general framework to select quantizers in each spatial and spectral region of an image so as to achieve the desired target rate while minimizing distortion. The rate control algorithm allows to achieve lossy, near-lossless compression, and any in-between type of compression, e.g., lossy compression with a near-lossless constraint. While this framework is independent of the specific predictor used, in order to show its performance, in this paper we tailor it to the predictor adopted by the CCSDS-123 lossless compression standard, obtaining an extension that allows to perform lossless, near-lossless and lossy compression in a single package. We show that the rate controller has excellent performance in terms of accuracy in the output rate, rate-distortion characteristics and is extremely competitive with respect to state-of-the-art transform coding.

Submitted 14 January, 2014; originally announced January 2014.

arXiv:1311.2433 [pdf, other]

doi 10.1109/ACSSC.2013.6810309

Joint recovery algorithms using difference of innovations for distributed compressed sensing

Authors: Diego Valsesia, Giulio Coluccia, Enrico Magli

Abstract: Distributed compressed sensing is concerned with representing an ensemble of jointly sparse signals using as few linear measurements as possible. Two novel joint reconstruction algorithms for distributed compressed sensing are presented in this paper. These algorithms are based on the idea of using one of the signals as side information; this allows to exploit joint sparsity in a more effective way with respect to existing schemes. They provide gains in reconstruction quality, especially when the nodes acquire few measurements, so that the system is able to operate with fewer measurements than is required by other existing schemes. We show that the algorithms achieve better performance with respect to the state-of-the-art.

Submitted 11 November, 2013; originally announced November 2013.

Comments: Conference Record of the Forty Seventh Asilomar Conference on Signals, Systems and Computers (ASILOMAR), 2013

arXiv:1311.0646 [pdf, other]

A Parallel Compressive Imaging Architecture for One-Shot Acquisition

Authors: Tomas Björklund, Enrico Magli

Abstract: A limitation of many compressive imaging architectures lies in the sequential nature of the sensing process, which leads to long sensing times. In this paper we present a novel architecture that uses fewer detectors than the number of reconstructed pixels and is able to acquire the image in a single acquisition. This paves the way for the development of video architectures that acquire several frames per second. We specifically address the diffraction problem, showing that deconvolution normally used to recover diffraction blur can be replaced by convolution of the sensing matrix, and how measurements of a 0/1 physical sensing matrix can be converted to -1/1 compressive sensing matrix without any extra acquisitions. Simulations of our architecture show that the image quality is comparable to that of a classic Compressive Imaging camera, whereas the proposed architecture avoids long acquisition times due to sequential sensing. This one-shot procedure also allows to employ a fixed sensing matrix instead of a complex device such as a Digital Micro Mirror array or Spatial Light Modulator. It also enables imaging at bandwidths where these are not efficient.

Submitted 4 November, 2013; originally announced November 2013.
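
Note on the 0/1 to -1/1 conversion mentioned in the abstract above: the listing does not say how the authors perform it, so the following is only a minimal sketch of the underlying algebraic identity, not the paper's procedure. If B is a 0/1 matrix and A = 2B - 1, then each measurement satisfies a·x = 2(b·x) - sum(x). The sketch assumes the total intensity sum(x) is already available; how it is obtained without extra acquisitions is not stated here. All function and variable names are hypothetical.

```python
# Illustrative sketch only; names are hypothetical and not taken from the paper.
def measure(matrix, x):
    """Return the matrix-vector product of `matrix` and `x` using plain lists."""
    return [sum(a * v for a, v in zip(row, x)) for row in matrix]


def binary_to_bipolar(y_binary, total):
    """Map measurements taken with a 0/1 matrix B to those of A = 2B - 1.

    Each row a of A satisfies a.x = 2 * (b.x) - sum(x), so only the total
    intensity `total` = sum(x) is needed (assumed known in this sketch).
    """
    return [2 * y - total for y in y_binary]


B = [[1, 0, 1, 1], [0, 0, 1, 0], [1, 1, 0, 0]]  # toy 0/1 physical mask
A = [[2 * b - 1 for b in row] for row in B]      # equivalent -1/1 matrix
x = [3.0, 1.0, 4.0, 1.5]                         # toy scene intensities

y_binary = measure(B, x)
y_bipolar = binary_to_bipolar(y_binary, sum(x))
```

In this toy setup `y_bipolar` matches `measure(A, x)` entry by entry, which is the point of the identity: the -1/1 measurements follow from the 0/1 ones by post-processing alone.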

arXiv:1310.7813 [pdf, other]

doi 10.1109/MMSP.2013.6659276

Smoothness-Constrained Image Recovery from Block-Based Random Projections

Authors: Giulio Coluccia, Diego Valsesia, Enrico Magli

Abstract: In this paper we address the problem of visual quality of images reconstructed from block-wise random projections. Independent reconstruction of the blocks can severely affect visual quality, by displaying artifacts along block borders. We propose a method to enforce smoothness across block borders by modifying the sensing and reconstruction process so as to employ partially overlapping blocks. The proposed algorithm accomplishes this by computing a fast preview from the blocks, whose purpose is twofold. On one hand, it allows to enforce a set of constraints to drive the reconstruction algorithm towards a smooth solution, imposing the similarity of block borders. On the other hand, the preview is used as a predictor of the entire block, allowing to recover the prediction error, only. The quality improvement over the result of independent reconstruction can be easily assessed both visually and in terms of PSNR and SSIM index.

Submitted 8 October, 2013; originally announced October 2013.

Journal ref: Proceedings of the 15th International Workshop on Multimedia Signal Processing (MMSP), Pula (Sardinia), Italy, September 30 - October 2, 2013, pp. 129-134

arXiv:1310.1266 [pdf, other]

doi 10.1109/JETCAS.2012.2214891

Progressive Compressed Sensing and Reconstruction of Multidimensional Signals Using Hybrid Transform/Prediction Sparsity Model

Authors: Giulio Coluccia, Simeon Kamden-Kuiteng, Andrea Abrardo, Mauro Barni, Enrico Magli

Abstract: Compressed sensing (CS) is an innovative technique allowing to represent signals through a small number of their linear projections. Hence, CS can be thought of as a natural candidate for acquisition of multidimensional signals, as the amount of data acquired and processed by conventional sensors could create problems in terms of computational complexity. In this paper, we propose a framework for the acquisition and reconstruction of multidimensional correlated signals. The approach is general and can be applied to D dimensional signals, even if the algorithms we propose to practically implement such architectures apply to 2-D and 3-D signals. The proposed architectures employ iterative local signal reconstruction based on a hybrid transform/prediction correlation model, coupled with a proper initialization strategy.

Submitted 4 October, 2013; originally announced October 2013.

Journal ref: IEEE Journal on Emerging and Selected Topics in Circuits and Systems, vol.2, no.3, pp.340,352, Sept. 2012

arXiv:1310.1259 [pdf, other]

doi 10.1109/ICME.2012.71

A Novel Progressive Image Scanning and Reconstruction Scheme based on Compressed Sensing and Linear Prediction

Authors: Giulio Coluccia, Enrico Magli

Abstract: Compressed sensing (CS) is an innovative technique allowing to represent signals through a small number of their linear projections. In this paper we address the application of CS to the scenario of progressive acquisition of 2D visual signals in a line-by-line fashion. This is an important setting which encompasses diverse systems such as flatbed scanners and remote sensing imagers. The use of CS in such setting raises the problem of reconstructing a very high number of samples, as are contained in an image, from their linear projections. Conventional reconstruction algorithms, whose complexity is cubic in the number of samples, are computationally intractable. In this paper we develop an iterative reconstruction algorithm that reconstructs an image by iteratively estimating a row, and correlating adjacent rows by means of linear prediction. We develop suitable predictors and test the proposed algorithm in the context of flatbed scanners and remote sensing imaging systems. We show that this approach can significantly improve the results of separate reconstruction of each row, providing very good reconstruction quality with reasonable complexity.

Submitted 4 October, 2013; originally announced October 2013.

Comments: 2012 IEEE International Conference on Multimedia and Expo (ICME), Melbourne, Australia, 9-13 July 2012, pp.866-871

arXiv:1310.1221 [pdf, other]

Spatially Scalable Compressed Image Sensing with Hybrid Transform and Inter-layer Prediction Model

Authors: Diego Valsesia, Enrico Magli

Abstract: Compressive imaging is an emerging application of compressed sensing, devoted to acquisition, encoding and reconstruction of images using random projections as measurements. In this paper we propose a novel method to provide a scalable encoding of an image acquired by means of compressed sensing techniques. Two bit-streams are generated to provide two distinct quality levels: a low-resolution base layer and full-resolution enhancement layer. In the proposed method we exploit a fast preview of the image at the encoder in order to perform inter-layer prediction and encode the prediction residuals only. The proposed method successfully provides resolution and quality scalability with modest complexity and it provides gains in the quality of the reconstructed images with respect to separate encoding of the quality layers. Remarkably, we also show that the scheme can also provide significant gains with respect to a direct, non-scalable system, thus accomplishing two features at once: scalability and improved reconstruction performance.

Submitted 4 October, 2013; originally announced October 2013.

Comments: Proceedings of the 15th International Workshop on Multimedia Signal Processing, September 30-October 3, 2013, Pula, Italy

arXiv:1310.1217 [pdf, ps, other]

doi 10.1109/ICASSP.2013.6638781

Graded Quantization: Democracy for Multiple Descriptions in Compressed Sensing

Authors: Diego Valsesia, Giulio Coluccia, Enrico Magli

Abstract: The compressed sensing paradigm allows to efficiently represent sparse signals by means of their linear measurements. However, the problem of transmitting these measurements to a receiver over a channel potentially prone to packet losses has received little attention so far. In this paper, we propose novel methods to generate multiple descriptions from compressed sensing measurements to increase the robustness over unreliable channels. In particular, we exploit the democracy property of compressive measurements to generate descriptions in a simple manner by partitioning the measurement vector and properly allocating bit-rate, outperforming classical methods like the multiple description scalar quantizer. In addition, we propose a modified version of the Basis Pursuit Denoising recovery procedure that is specifically tailored to the proposed methods. Experimental results show significant performance gains with respect to existing methods.

Submitted 4 October, 2013; originally announced October 2013.

Journal ref: Proceedings of the 38th International Conference on Acoustics, Speech, and Signal Processing (ICASSP), Vancouver, Canada, May 26 - 31, 2013, pp. 5825-5829

arXiv:1309.0316 [pdf, ps, other]

doi 10.1109/TMM.2013.2285518

Band Codes for Energy-Efficient Network Coding with Application to P2P Mobile Streaming

Authors: Attilio Fiandrotti, Valerio Bioglio, Marco Grangetto, Rossano Gaeta, Enrico Magli

Abstract: A key problem in random network coding (NC) lies in the complexity and energy consumption associated with the packet decoding processes, which hinder its application in mobile environments. Controlling and hence limiting such factors has always been an important but elusive research goal, since the packet degree distribution, which is the main factor driving the complexity, is altered in a non-deterministic way by the random recombinations at the network nodes. In this paper we tackle this problem proposing Band Codes (BC), a novel class of network codes specifically designed to preserve the packet degree distribution during packet encoding, recombination and decoding. BC are random codes over GF(2) that exhibit low decoding complexity, feature limited and controlled degree distribution by construction, and hence allow to effectively apply NC even in energy-constrained scenarios. In particular, in this paper we motivate and describe our new design and provide a thorough analysis of its performance. We provide numerical simulations of the performance of BC in order to validate the analysis and assess the overhead of BC with respect to a conventional NC scheme. Moreover, peer-to-peer media streaming experiments with a random-push protocol show that BC reduce the decoding complexity by a factor of two, to a point where NC-based mobile streaming to mobile devices becomes practically feasible.

Submitted 2 September, 2013; originally announced September 2013.

Comments: To be published in IEEE Transactions on Multimedia

ACM Class: H.5.1

arXiv:1301.2130 [pdf, other]

Distributed soft thresholding for sparse signal recovery

Authors: Chiara Ravazzi, Sophie M. Fosson, Enrico Magli

Abstract: In this paper, we address the problem of distributed sparse recovery of signals acquired via compressed measurements in a sensor network. We propose a new class of distributed algorithms to solve Lasso regression problems, when the communication to a fusion center is not possible, e.g., due to communication cost or privacy reasons. More precisely, we introduce a distributed iterative soft thresholding algorithm (DISTA) that consists of three steps: an averaging step, a gradient step, and a soft thresholding operation. We prove the convergence of DISTA in networks represented by regular graphs, and we compare it with existing methods in terms of performance, memory, and complexity.

Submitted 14 October, 2013; v1 submitted 10 January, 2013; originally announced January 2013.

Comments: Revised version. Main improvements: extension of the convergence theorem to regular graphs; new numerical results and comparisons with other algorithms
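
Note on the three steps named in the DISTA abstract above (averaging, gradient, soft thresholding): the listing does not give the update rule, so the sketch below is a generic iteration of that shape and should not be read as the paper's exact algorithm. The step size, threshold, neighborhood structure (assumed to include the node itself), and all names are assumptions chosen for illustration.

```python
# Illustrative sketch only: a generic averaging / gradient / soft-thresholding
# iteration, NOT the exact DISTA update from the paper. Names are hypothetical.
def soft_threshold(v, t):
    """Shrink each entry of v toward zero by t (proximal operator of t * L1 norm)."""
    return [0.0 if abs(a) <= t else (a - t if a > 0 else a + t) for a in v]


def matvec(M, v):
    """Matrix-vector product with plain lists."""
    return [sum(m * a for m, a in zip(row, v)) for row in M]


def transpose(M):
    """Transpose a list-of-rows matrix."""
    return [list(col) for col in zip(*M)]


def node_update(i, estimates, neighbors, A, y, step, threshold):
    """One update at node i using its own data and its neighbors' estimates."""
    group = neighbors[i]  # assumed to contain i itself
    n = len(estimates[i])
    # 1) averaging step over the node's neighborhood
    avg = [sum(estimates[j][k] for j in group) / len(group) for k in range(n)]
    # 2) gradient step on the local least-squares term 0.5 * ||y_i - A_i x||^2
    residual = [yi - p for yi, p in zip(y[i], matvec(A[i], estimates[i]))]
    direction = matvec(transpose(A[i]), residual)
    moved = [a + step * g for a, g in zip(avg, direction)]
    # 3) soft thresholding to promote sparsity
    return soft_threshold(moved, threshold)


# Toy single-node example with an identity sensing matrix.
A = [[[1.0, 0.0], [0.0, 1.0]]]
y = [[2.0, 0.1]]
estimates = [[0.0, 0.0]]
neighbors = {0: [0]}
updated = node_update(0, estimates, neighbors, A, y, step=1.0, threshold=0.5)
```

In the toy example the small second entry is zeroed by the threshold while the large first entry is shrunk, which is the sparsity-promoting behavior associated with Lasso-type problems.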

arXiv:1211.4206 [pdf, other]

doi 10.1109/TMM.2013.2241415

Network Coding Meets Multimedia: a Review

Authors: Enrico Magli, Mea Wang, Pascal Frossard, Athina Markopoulou

Abstract: While every network node only relays messages in a traditional communication system, the recent network coding (NC) paradigm proposes to implement simple in-network processing with packet combinations in the nodes. NC extends the concept of "encoding" a message beyond source coding (for compression) and channel coding (for protection against errors and losses). It has been shown to increase network throughput compared to traditional networks implementation, to reduce delay and to provide robustness to transmission errors and network dynamics. These features are so appealing for multimedia applications that they have spurred a large research effort towards the development of multimedia-specific NC techniques. This paper reviews the recent work in NC for multimedia applications and focuses on the techniques that fill the gap between NC theory and practical applications. It outlines the benefits of NC and presents the open challenges in this area. The paper initially focuses on multimedia-specific aspects of network coding, in particular delay, in-network error control, and media-specific error control. These aspects permit to handle varying network conditions as well as client heterogeneity, which are critical to the design and deployment of multimedia systems. After introducing these general concepts, the paper reviews in detail two applications that lend themselves naturally to NC via the cooperation and broadcast models, namely peer-to-peer multimedia streaming and wireless networking.

Submitted 18 November, 2012; originally announced November 2012.

Comments: Part of this work is under publication in IEEE Transactions on Multimedia

Showing 1–50 of 51 results for author: Magli, E