Showing 1–50 of 61 results for author: Favaro, P

  1. arXiv:2405.19572  [pdf, other]

    cs.CV

    Blind Image Restoration via Fast Diffusion Inversion

    Authors: Hamadi Chihaoui, Abdelhak Lemkhenter, Paolo Favaro

    Abstract: Recently, various methods have been proposed to solve Image Restoration (IR) tasks using a pre-trained diffusion model leading to state-of-the-art performance. However, most of these methods assume that the degradation operator in the IR task is completely known. Furthermore, a common characteristic among these approaches is that they alter the diffusion sampling process in order to satisfy the co…

    Submitted 29 May, 2024; originally announced May 2024.

  2. arXiv:2404.18065  [pdf, other]

    cs.CV cs.AI

    Grounded Compositional and Diverse Text-to-3D with Pretrained Multi-View Diffusion Model

    Authors: Xiaolong Li, Jiawei Mo, Ying Wang, Chethan Parameshwara, Xiaohan Fei, Ashwin Swaminathan, CJ Taylor, Zhuowen Tu, Paolo Favaro, Stefano Soatto

    Abstract: In this paper, we propose an effective two-stage approach named Grounded-Dreamer to generate 3D assets that can accurately follow complex, compositional text prompts while achieving high fidelity by using a pre-trained multi-view diffusion model. Multi-view diffusion models, such as MVDream, have shown to generate high-fidelity 3D assets using score distillation sampling (SDS). However, applied na…

    Submitted 28 April, 2024; originally announced April 2024.

    Comments: 9 pages, 10 figures

  3. arXiv:2404.09389  [pdf, other]

    cs.CV cs.LG

    Masked and Shuffled Blind Spot Denoising for Real-World Images

    Authors: Hamadi Chihaoui, Paolo Favaro

    Abstract: We introduce a novel approach to single image denoising based on the Blind Spot Denoising principle, which we call MAsked and SHuffled Blind Spot Denoising (MASH). We focus on the case of correlated noise, which often plagues real images. MASH is the result of a careful analysis to determine the relationships between the level of blindness (masking) of the input and the (unknown) noise correlation…

    Submitted 14 April, 2024; originally announced April 2024.

  4. arXiv:2404.03392  [pdf, other]

    cs.CV

    Two Tricks to Improve Unsupervised Segmentation Learning

    Authors: Alp Eren Sari, Francesco Locatello, Paolo Favaro

    Abstract: We present two practical improvement techniques for unsupervised segmentation learning. These techniques address limitations in the resolution and accuracy of predicted segmentation maps of recent state-of-the-art methods. Firstly, we leverage image post-processing techniques such as guided filtering to refine the output masks, improving accuracy while avoiding substantial computational costs. Sec…

    Submitted 8 April, 2024; v1 submitted 4 April, 2024; originally announced April 2024.

  5. arXiv:2403.14368  [pdf, other]

    cs.CV

    Enabling Visual Composition and Animation in Unsupervised Video Generation

    Authors: Aram Davtyan, Sepehr Sameni, Björn Ommer, Paolo Favaro

    Abstract: In this work we propose a novel method for unsupervised controllable video generation. Once trained on a dataset of unannotated videos, at inference our model is capable of both composing scenes of predefined object parts and animating them in a plausible and controlled way. This is achieved by conditioning video generation on a randomly selected subset of local pre-trained self-supervised feature…

    Submitted 21 March, 2024; originally announced March 2024.

    Comments: Project website: https://araachie.github.io/cage

  6. arXiv:2402.18780  [pdf, other]

    cs.CV

    A Quantitative Evaluation of Score Distillation Sampling Based Text-to-3D

    Authors: Xiaohan Fei, Chethan Parameshwara, Jiawei Mo, Xiaolong Li, Ashwin Swaminathan, CJ Taylor, Paolo Favaro, Stefano Soatto

    Abstract: The development of generative models that create 3D content from a text prompt has made considerable strides thanks to the use of the score distillation sampling (SDS) method on pre-trained diffusion models for image generation. However, the SDS method is also the source of several artifacts, such as the Janus problem, the misalignment between the text prompt and the generated 3D model, and 3D mod…

    Submitted 28 February, 2024; originally announced February 2024.

  7. arXiv:2312.04337  [pdf, other]

    cs.CV

    Multi-View Unsupervised Image Generation with Cross Attention Guidance

    Authors: Llukman Cerkezi, Aram Davtyan, Sepehr Sameni, Paolo Favaro

    Abstract: The growing interest in novel view synthesis, driven by Neural Radiance Field (NeRF) models, is hindered by scalability issues due to their reliance on precisely annotated multi-view images. Recent models address this by fine-tuning large text2image diffusion models on synthetic multi-view data. Despite robust zero-shot generalization, they may need post-processing and can face quality issues due…

    Submitted 7 December, 2023; originally announced December 2023.

  8. arXiv:2311.01646  [pdf, other]

    cs.CV cs.LG

    SemiGPC: Distribution-Aware Label Refinement for Imbalanced Semi-Supervised Learning Using Gaussian Processes

    Authors: Abdelhak Lemkhenter, Manchen Wang, Luca Zancato, Gurumurthy Swaminathan, Paolo Favaro, Davide Modolo

    Abstract: In this paper we introduce SemiGPC, a distribution-aware label refinement strategy based on Gaussian Processes where the predictions of the model are derived from the labels posterior distribution. Differently from other buffer-based semi-supervised methods such as CoMatch and SimMatch, our SemiGPC includes a normalization term that addresses imbalances in the global data distribution while mainta…

    Submitted 2 November, 2023; originally announced November 2023.

  9. arXiv:2310.00099  [pdf, other]

    cs.CV

    Denoising and Selecting Pseudo-Heatmaps for Semi-Supervised Human Pose Estimation

    Authors: Zhuoran Yu, Manchen Wang, Yanbei Chen, Paolo Favaro, Davide Modolo

    Abstract: We propose a new semi-supervised learning design for human pose estimation that revisits the popular dual-student framework and enhances it two ways. First, we introduce a denoising scheme to generate reliable pseudo-heatmaps as targets for learning from unlabeled data. This uses multi-view augmentations and a threshold-and-refine procedure to produce a pool of pseudo-heatmaps. Second, we select t…

    Submitted 29 September, 2023; originally announced October 2023.

  10. arXiv:2309.03008  [pdf, other]

    cs.CV

    Sparse 3D Reconstruction via Object-Centric Ray Sampling

    Authors: Llukman Cerkezi, Paolo Favaro

    Abstract: We propose a novel method for 3D object reconstruction from a sparse set of views captured from a 360-degree calibrated camera rig. We represent the object surface through a hybrid model that uses both an MLP-based neural representation and a triangle mesh. A key contribution in our work is a novel object-centric sampling scheme of the neural representation, where rays are shared among all views.…

    Submitted 28 March, 2024; v1 submitted 6 September, 2023; originally announced September 2023.

  11. arXiv:2306.16048  [pdf, other]

    cs.CV cs.AI

    Benchmarking Zero-Shot Recognition with Vision-Language Models: Challenges on Granularity and Specificity

    Authors: Zhenlin Xu, Yi Zhu, Tiffany Deng, Abhay Mittal, Yanbei Chen, Manchen Wang, Paolo Favaro, Joseph Tighe, Davide Modolo

    Abstract: This paper presents novel benchmarks for evaluating vision-language models (VLMs) in zero-shot recognition, focusing on granularity and specificity. Although VLMs excel in tasks like image captioning, they face challenges in open-world settings. Our benchmarks test VLMs' consistency in understanding concepts across semantic granularity levels and their response to varying text specificity. Finding…

    Submitted 18 June, 2024; v1 submitted 28 June, 2023; originally announced June 2023.

    Comments: CVPR2024 MMFM workshop

  12. arXiv:2306.06408  [pdf, other]

    eess.IV cs.CV cs.LG q-bio.NC

    Fast light-field 3D microscopy with out-of-distribution detection and adaptation through Conditional Normalizing Flows

    Authors: Josué Page Vizcaíno, Panagiotis Symvoulidis, Zeguan Wang, Jonas Jelten, Paolo Favaro, Edward S. Boyden, Tobias Lasser

    Abstract: Real-time 3D fluorescence microscopy is crucial for the spatiotemporal analysis of live organisms, such as neural activity monitoring. The eXtended field-of-view light field microscope (XLFM), also known as Fourier light field microscope, is a straightforward, single snapshot solution to achieve this. The XLFM acquires spatial-angular information in a single camera exposure. In a subsequent step,…

    Submitted 14 June, 2023; v1 submitted 10 June, 2023; originally announced June 2023.

  13. arXiv:2306.04849  [pdf, other]

    cs.CV

    ScaleDet: A Scalable Multi-Dataset Object Detector

    Authors: Yanbei Chen, Manchen Wang, Abhay Mittal, Zhenlin Xu, Paolo Favaro, Joseph Tighe, Davide Modolo

    Abstract: Multi-dataset training provides a viable solution for exploiting heterogeneous large-scale datasets without extra annotation cost. In this work, we propose a scalable multi-dataset detector (ScaleDet) that can scale up its generalization across datasets when increasing the number of training datasets. Unlike existing multi-dataset learners that mostly rely on manual relabelling efforts or sophisti…

    Submitted 7 June, 2023; originally announced June 2023.

    Comments: CVPR 2023

  14. arXiv:2306.03988  [pdf, other]

    cs.CV cs.AI

    Learn the Force We Can: Enabling Sparse Motion Control in Multi-Object Video Generation

    Authors: Aram Davtyan, Paolo Favaro

    Abstract: We propose a novel unsupervised method to autoregressively generate videos from a single frame and a sparse motion input. Our trained model can generate unseen realistic object-to-object interactions. Although our model has never been given the explicit segmentation and motion of each object in the scene during training, it is able to implicitly separate their dynamics and extents. Key components…

    Submitted 13 January, 2024; v1 submitted 6 June, 2023; originally announced June 2023.

    Comments: Accepted to AAAI 2024. Project website: https://araachie.github.io/yoda

  15. arXiv:2303.01598  [pdf, other]

    cs.CV cs.LG

    A Meta-Learning Approach to Predicting Performance and Data Requirements

    Authors: Achin Jain, Gurumurthy Swaminathan, Paolo Favaro, Hao Yang, Avinash Ravichandran, Hrayr Harutyunyan, Alessandro Achille, Onkar Dabeer, Bernt Schiele, Ashwin Swaminathan, Stefano Soatto

    Abstract: We propose an approach to estimate the number of samples required for a model to reach a target performance. We find that the power law, the de facto principle to estimate model performance, leads to large error when using a small dataset (e.g., 5 samples per class) for extrapolation. This is because the log-performance error against the log-dataset size follows a nonlinear progression in the few-…

    Submitted 2 March, 2023; originally announced March 2023.

    Comments: CVPR 2023

  16. arXiv:2211.17042  [pdf, other]

    cs.CV

    Spatio-Temporal Crop Aggregation for Video Representation Learning

    Authors: Sepehr Sameni, Simon Jenni, Paolo Favaro

    Abstract: We propose Spatio-temporal Crop Aggregation for video representation LEarning (SCALE), a novel method that enjoys high scalability at both training and inference time. Our model builds long-range video features by learning from sets of video clip-level features extracted with a pre-trained backbone. To train the model, we propose a self-supervised objective consisting of masked clip feature predic…

    Submitted 13 March, 2023; v1 submitted 30 November, 2022; originally announced November 2022.

  17. arXiv:2211.14575  [pdf, other]

    cs.CV

    Efficient Video Prediction via Sparsely Conditioned Flow Matching

    Authors: Aram Davtyan, Sepehr Sameni, Paolo Favaro

    Abstract: We introduce a novel generative model for video prediction based on latent flow matching, an efficient alternative to diffusion-based models. In contrast to prior work, we keep the high costs of modeling the past during training and inference at bay by conditioning only on a small random set of past frames at each integration step of the image generation process. Moreover, to enable the generation…

    Submitted 24 August, 2023; v1 submitted 26 November, 2022; originally announced November 2022.

    Comments: Accepted to ICCV 2023. Project page: https://araachie.github.io/river

  18. arXiv:2210.07920  [pdf, other]

    cs.CV cs.AI cs.LG

    MOVE: Unsupervised Movable Object Segmentation and Detection

    Authors: Adam Bielski, Paolo Favaro

    Abstract: We introduce MOVE, a novel method to segment objects without any form of supervision. MOVE exploits the fact that foreground objects can be shifted locally relative to their initial position and result in realistic (undistorted) new images. This property allows us to train a segmentation model on a dataset of images without annotation and to achieve state of the art (SotA) performance on several e…

    Submitted 20 October, 2022; v1 submitted 14 October, 2022; originally announced October 2022.

    Comments: 36th Conference on Neural Information Processing Systems (NeurIPS 2022)

  19. U-Sleep's resilience to AASM guidelines

    Authors: Luigi Fiorillo, Giuliana Monachino, Julia van der Meer, Marco Pesce, Jan D. Warncke, Markus H. Schmidt, Claudio L. A. Bassetti, Athina Tzovara, Paolo Favaro, Francesca D. Faraci

    Abstract: AASM guidelines are the result of decades of efforts aiming at standardizing sleep scoring procedure, with the final goal of sharing a worldwide common methodology. The guidelines cover several aspects from the technical/digital specifications, e.g., recommended EEG derivations, to detailed sleep scoring rules accordingly to age. Automated sleep scoring systems have always largely exploited the sta…

    Submitted 13 March, 2023; v1 submitted 19 September, 2022; originally announced September 2022.

    Journal ref: npj Digital Medicine (2023)

  20. arXiv:2208.05688  [pdf, other]

    cs.CV cs.AI cs.LG

    Semi-supervised Vision Transformers at Scale

    Authors: Zhaowei Cai, Avinash Ravichandran, Paolo Favaro, Manchen Wang, Davide Modolo, Rahul Bhotika, Zhuowen Tu, Stefano Soatto

    Abstract: We study semi-supervised learning (SSL) for vision transformers (ViT), an under-explored topic despite the wide adoption of the ViT architectures to different tasks. To tackle this problem, we propose a new SSL pipeline, consisting of first un/self-supervised pre-training, followed by supervised fine-tuning, and finally semi-supervised fine-tuning. At the semi-supervised fine-tuning stage, we adop…

    Submitted 11 August, 2022; originally announced August 2022.

  21. arXiv:2207.13801  [pdf, ps, other]

    cs.LG

    Towards Sleep Scoring Generalization Through Self-Supervised Meta-Learning

    Authors: Abdelhak Lemkhenter, Paolo Favaro

    Abstract: In this work we introduce a novel meta-learning method for sleep scoring based on self-supervised learning. Our approach aims at building models for sleep scoring that can generalize across different patients and recording facilities, but do not require a further adaptation step to the target data. Towards this goal, we build our method on top of the Model Agnostic Meta-Learning (MAML) framework b…

    Submitted 27 July, 2022; originally announced July 2022.

    Comments: EMBC 2022

  22. Multi-Scored Sleep Databases: How to Exploit the Multiple-Labels in Automated Sleep Scoring

    Authors: Luigi Fiorillo, Davide Pedroncelli, Valentina Agostini, Paolo Favaro, Francesca Dalia Faraci

    Abstract: Study Objectives: Inter-scorer variability in scoring polysomnograms is a well-known problem. Most of the existing automated sleep scoring systems are trained using labels annotated by a single scorer, whose subjective evaluation is transferred to the model. When annotations from two or more scorers are available, the scoring models are usually trained on the scorer consensus. The averaged scorer'…

    Submitted 12 February, 2023; v1 submitted 5 July, 2022; originally announced July 2022.

    Journal ref: Sleep, 2023

  23. arXiv:2204.06558  [pdf, other]

    cs.CV cs.AI

    Controllable Video Generation through Global and Local Motion Dynamics

    Authors: Aram Davtyan, Paolo Favaro

    Abstract: We present GLASS, a method for Global and Local Action-driven Sequence Synthesis. GLASS is a generative model that is trained on video sequences in an unsupervised manner and that can animate an input image at test time. The method learns to segment frames into foreground-background layers and to generate transitions of the foregrounds over time through a global and local action representation. Gl…

    Submitted 13 April, 2022; originally announced April 2022.

  24. arXiv:2204.04788  [pdf, other]

    cs.CV cs.AI cs.LG

    Representation Learning by Detecting Incorrect Location Embeddings

    Authors: Sepehr Sameni, Simon Jenni, Paolo Favaro

    Abstract: In this paper, we introduce a novel self-supervised learning (SSL) loss for image representation learning. There is a growing belief that generalization in deep neural networks is linked to their ability to discriminate object shapes. Since object shape is related to the location of its parts, we propose to detect those that have been artificially misplaced. We represent object parts with image to…

    Submitted 13 March, 2023; v1 submitted 10 April, 2022; originally announced April 2022.

    Comments: accepted at AAAI2023, https://github.com/Separius/DILEMMA

  25. arXiv:2112.07599  [pdf, other]

    cs.CV cs.AI cs.GR

    Learning to Deblur and Rotate Motion-Blurred Faces

    Authors: Givi Meishvili, Attila Szabó, Simon Jenni, Paolo Favaro

    Abstract: We propose a solution to the novel task of rendering sharp videos from new viewpoints from a single motion-blurred image of a face. Our method handles the complexity of face blur by implicitly learning the geometry and motion of faces through the joint training on three large datasets: FFHQ and 300VW, which are publicly available, and a new Bern Multi-View Face Dataset (BMFD) that we built. The fi…

    Submitted 14 December, 2021; originally announced December 2021.

    Comments: British Machine Vision Conference 2021

  26. DeepSleepNet-Lite: A Simplified Automatic Sleep Stage Scoring Model with Uncertainty Estimates

    Authors: Luigi Fiorillo, Paolo Favaro, Francesca Dalia Faraci

    Abstract: Deep learning is widely used in the most recent automatic sleep scoring algorithms. Its popularity stems from its excellent performance and from its ability to directly process raw signals and to learn feature from the data. Most of the existing scoring algorithms exploit very computationally demanding architectures, due to their high number of training parameters, and process lengthy time sequenc…

    Submitted 24 August, 2021; originally announced August 2021.

    Journal ref: IEEE Transactions on Neural Systems and Rehabilitation Engineering, vol. 29, pp. 2076-2085

  27. arXiv:2107.06197  [pdf, other]

    cs.LG cs.CV

    Generative Adversarial Learning via Kernel Density Discrimination

    Authors: Abdelhak Lemkhenter, Adam Bielski, Alp Eren Sari, Paolo Favaro

    Abstract: We introduce Kernel Density Discrimination GAN (KDD GAN), a novel method for generative adversarial learning. KDD GAN formulates the training as a likelihood ratio optimization problem where the data distributions are written explicitly via (local) Kernel Density Estimates (KDE). This is inspired by the recent progress in contrastive learning and its relation to KDE. We define the KDEs directly in…

    Submitted 13 July, 2021; originally announced July 2021.

  28. arXiv:2107.03331  [pdf, ps, other]

    cs.LG cs.CV math.OC stat.ML

    KOALA: A Kalman Optimization Algorithm with Loss Adaptivity

    Authors: Aram Davtyan, Sepehr Sameni, Llukman Cerkezi, Givi Meishvilli, Adam Bielski, Paolo Favaro

    Abstract: Optimization is often cast as a deterministic problem, where the solution is found through some iterative procedure such as gradient descent. However, when training neural networks the loss function changes over (iteration) time due to the randomized selection of a subset of the samples. This randomization turns the optimization problem into a stochastic one. We propose to consider the loss as a n…

    Submitted 16 December, 2021; v1 submitted 7 July, 2021; originally announced July 2021.

    Comments: Accepted to AAAI2022

  29. arXiv:2106.09914  [pdf, other]

    cs.LG cs.CV

    A Unified Generative Adversarial Network Training via Self-Labeling and Self-Attention

    Authors: Tomoki Watanabe, Paolo Favaro

    Abstract: We propose a novel GAN training scheme that can handle any level of labeling in a unified manner. Our scheme introduces a form of artificial labeling that can incorporate manually defined labels, when available, and induce an alignment between them. To define the artificial labels, we exploit the assumption that neural network generators can be trained more easily to map nearby latent vectors to d…

    Submitted 18 June, 2021; originally announced June 2021.

  30. arXiv:2104.02615  [pdf, other]

    cs.CV

    Optical Flow Dataset Synthesis from Unpaired Images

    Authors: Adrian Wälchli, Paolo Favaro

    Abstract: The estimation of optical flow is an ambiguous task due to the lack of correspondence at occlusions, shadows, reflections, lack of texture and changes in illumination over time. Thus, unsupervised methods face major challenges as they need to tune complex cost functions with several terms designed to handle each of these sources of ambiguity. In contrast, supervised methods avoid these challenges…

    Submitted 2 April, 2021; originally announced April 2021.

  31. arXiv:2012.09259  [pdf, other]

    cs.CV

    ISD: Self-Supervised Learning by Iterative Similarity Distillation

    Authors: Ajinkya Tejankar, Soroush Abbasi Koohpayegani, Vipin Pillai, Paolo Favaro, Hamed Pirsiavash

    Abstract: Recently, contrastive learning has achieved great results in self-supervised learning, where the main idea is to push two augmentations of an image (positive pairs) closer compared to other random images (negative pairs). We argue that not all random images are equal. Hence, we introduce a self supervised learning algorithm where we use a soft similarity for the negative images rather than a binar…

    Submitted 10 September, 2021; v1 submitted 16 December, 2020; originally announced December 2020.

  32. arXiv:2010.06218  [pdf, other]

    cs.CV

    Self-Supervised Multi-View Synchronization Learning for 3D Pose Estimation

    Authors: Simon Jenni, Paolo Favaro

    Abstract: Current state-of-the-art methods cast monocular 3D human pose estimation as a learning problem by training neural networks on large data sets of images and corresponding skeleton poses. In contrast, we propose an approach that can exploit small annotated data sets by fine-tuning networks pre-trained via self-supervised learning on (large) unlabeled data sets. To drive such networks towards support…

    Submitted 13 October, 2020; originally announced October 2020.

    Comments: ACCV 2020 (oral)

  33. arXiv:2009.07664  [pdf, other]

    cs.LG cs.AI stat.ML

    Boosting Generalization in Bio-Signal Classification by Learning the Phase-Amplitude Coupling

    Authors: Abdelhak Lemkhenter, Paolo Favaro

    Abstract: Various hand-crafted features representations of bio-signals rely primarily on the amplitude or power of the signal in specific frequency bands. The phase component is often discarded as it is more sample specific, and thus more sensitive to noise, than the amplitude. However, in general, the phase component also carries information relevant to the underlying biological processes. In fact, in this…

    Submitted 16 October, 2020; v1 submitted 16 September, 2020; originally announced September 2020.

    Comments: Accepted at GCPR 2020

  34. arXiv:2007.10730  [pdf, other]

    cs.CV

    Video Representation Learning by Recognizing Temporal Transformations

    Authors: Simon Jenni, Givi Meishvili, Paolo Favaro

    Abstract: We introduce a novel self-supervised learning approach to learn representations of videos that are responsive to changes in the motion dynamics. Our representations can be learned from data without human annotation and provide a substantial boost to the training of neural networks on small labeled data sets for tasks such as action recognition, which require to accurately distinguish the motion of…

    Submitted 21 July, 2020; originally announced July 2020.

    Comments: ECCV 2020

  35. Learning to Model and Calibrate Optics via a Differentiable Wave Optics Simulator

    Authors: Josue Page, Paolo Favaro

    Abstract: We present a novel learning-based method to build a differentiable computational model of a real fluorescence microscope. Our model can be used to calibrate a real optical setup directly from data samples and to engineer point spread functions by specifying the desired input-output data. This approach is poised to drastically improve the design of microscopes, because the parameters of current mod…

    Submitted 18 May, 2020; originally announced May 2020.

    Comments: 6 pages, 3 figures, for source code see https://github.com/pvjosue/WaveBlocks, to be published in IEEE 2020 International Conference on Image Processing (ICIP 2020)

    Journal ref: 2020 IEEE International Conference on Image Processing (ICIP)

  36. arXiv:2004.02331  [pdf, other]

    cs.CV

    Steering Self-Supervised Feature Learning Beyond Local Pixel Statistics

    Authors: Simon Jenni, Hailin Jin, Paolo Favaro

    Abstract: We introduce a novel principle for self-supervised feature learning based on the discrimination of specific transformations of an image. We argue that the generalization capability of learned features depends on what image neighborhood size is sufficient to discriminate different image transformations: The larger the required neighborhood size and the more global the image statistics that the feat…

    Submitted 5 April, 2020; originally announced April 2020.

    Comments: CVPR 2020 (oral)

  37. arXiv:2003.11004  [pdf, other]

    eess.IV cs.CV physics.optics

    Learning to Reconstruct Confocal Microscopy Stacks from Single Light Field Images

    Authors: Josue Page, Federico Saltarin, Yury Belyaev, Ruth Lyck, Paolo Favaro

    Abstract: We present a novel deep learning approach to reconstruct confocal microscopy stacks from single light field images. To perform the reconstruction, we introduce the LFMNet, a novel neural network architecture inspired by the U-Net design. It is able to reconstruct with high-accuracy a 112x112x57.6$μm^3$ volume (1287x1287x64 voxels) in 50ms given a single light field image of 1287x1287 pixels, thus…

    Submitted 24 March, 2020; originally announced March 2020.

    Comments: 22 pages, 12 figures

  38. arXiv:1910.00287  [pdf, other]

    cs.CV

    Unsupervised Generative 3D Shape Learning from Natural Images

    Authors: Attila Szabó, Givi Meishvili, Paolo Favaro

    Abstract: In this paper we present, to the best of our knowledge, the first method to learn a generative model of 3D shapes from natural images in a fully unsupervised way. For example, we do not use any ground truth 3D or 2D annotations, stereo video, and ego-motion during the training. Our approach follows the general strategy of Generative Adversarial Networks, where an image generator network learns to…

    Submitted 1 October, 2019; originally announced October 2019.

    Comments: The paper is under review

  39. arXiv:1909.12780  [pdf, other]

    cs.CV eess.AS eess.IV

    Learning to Have an Ear for Face Super-Resolution

    Authors: Givi Meishvili, Simon Jenni, Paolo Favaro

    Abstract: We propose a novel method to use both audio and a low-resolution image to perform extreme face super-resolution (a 16x increase of the input size). When the resolution of the input image is very low (e.g., 8x8 pixels), the loss of information is so dire that important details of the original identity have been lost and audio can aid the recovery of a plausible high-resolution image. In fact, audio…

    Submitted 2 April, 2020; v1 submitted 27 September, 2019; originally announced September 2019.

  40. arXiv:1906.04612  [pdf, other]

    cs.CV cs.LG

    On Stabilizing Generative Adversarial Training with Noise

    Authors: Simon Jenni, Paolo Favaro

    Abstract: We present a novel method and analysis to train generative adversarial networks (GAN) in a stable manner. As shown in recent analysis, training is often undermined by the probability distribution of the data being zero on neighborhoods of the data space. We notice that the distributions of real and generated data should match even when they undergo the same filtering. Therefore, to address the lim…

    Submitted 17 September, 2019; v1 submitted 11 June, 2019; originally announced June 2019.

    Comments: CVPR 2019

  41. arXiv:1905.12663  [pdf, other]

    cs.CV cs.AI cs.LG

    Emergence of Object Segmentation in Perturbed Generative Models

    Authors: Adam Bielski, Paolo Favaro

    Abstract: We introduce a novel framework to build a model that can learn how to segment objects from a collection of images without any human annotation. Our method builds on the observation that the location of object segments can be perturbed locally relative to a given background without affecting the realism of a scene. Our approach is to first train a generative model of a layered scene. The layered re…

    Submitted 2 November, 2019; v1 submitted 29 May, 2019; originally announced May 2019.

    Comments: 33rd Conference on Neural Information Processing Systems (NeurIPS 2019), Spotlight presentation

  42. arXiv:1903.06763  [pdf, other]

    cs.GR cs.CV

    Smart, Deep Copy-Paste

    Authors: Tiziano Portenier, Qiyang Hu, Paolo Favaro, Matthias Zwicker

    Abstract: In this work, we propose a novel system for smart copy-paste, enabling the synthesis of high-quality results given a masked source image content and a target image context as input. Our system naturally resolves both shading and geometric inconsistencies between source and target image, resulting in a merged result image that features the content from the pasted source image, seamlessly pasted int…

    Submitted 15 March, 2019; originally announced March 2019.

    Comments: 12 pages, 9 figures

  43. arXiv:1812.01874  [pdf, other]

    eess.IV cs.CV

    Learning to Take Directions One Step at a Time

    Authors: Qiyang Hu, Adrian Wälchli, Tiziano Portenier, Matthias Zwicker, Paolo Favaro

    Abstract: We present a method to generate a video sequence given a single image. Because items in an image can be animated in arbitrarily many different ways, we introduce as control signal a sequence of motion strokes. Such control signal can be automatically transferred from other videos, e.g., via bounding box tracking. Each motion stroke provides the direction to the moving object in the input image and…

    Submitted 14 August, 2020; v1 submitted 5 December, 2018; originally announced December 2018.

  44. arXiv:1811.10519  [pdf, other]

    cs.CV

    Unsupervised 3D Shape Learning from Image Collections in the Wild

    Authors: Attila Szabó, Paolo Favaro

    Abstract: We present a method to learn the 3D surface of objects directly from a collection of images. Previous work achieved this capability by exploiting additional manual annotation, such as object pose, 3D surface templates, temporal continuity of videos, manually selected landmarks, and foreground/background masks. In contrast, our method does not make use of any such annotation. Rather, it builds a ge…

    Submitted 26 November, 2018; v1 submitted 26 November, 2018; originally announced November 2018.

  45. arXiv:1809.01465  [pdf, other]

    cs.CV cs.LG stat.ML

    Deep Bilevel Learning

    Authors: Simon Jenni, Paolo Favaro

    Abstract: We present a novel regularization approach to train neural networks that enjoys better generalization and test error than standard stochastic gradient descent. Our approach is based on the principles of cross-validation, where a validation set is used to limit the model overfitting. We formulate such principles as a bilevel optimization problem. This formulation allows us to define the optimizatio…

    Submitted 5 September, 2018; originally announced September 2018.

    Comments: ECCV 2018

  46. arXiv:1806.05024  [pdf, other]

    cs.CV

    Self-Supervised Feature Learning by Learning to Spot Artifacts

    Authors: Simon Jenni, Paolo Favaro

    Abstract: We introduce a novel self-supervised learning method based on adversarial training. Our objective is to train a discriminator network to distinguish real images from images with synthetic artifacts, and then to extract features from its intermediate layers that can be transferred to other data domains and tasks. To generate images with artifacts, we pre-train a high-capacity autoencoder and then w…

    Submitted 13 June, 2018; originally announced June 2018.

    Comments: CVPR 2018 (spotlight)

  47. arXiv:1805.00385  [pdf, other]

    cs.CV

    Boosting Self-Supervised Learning via Knowledge Transfer

    Authors: Mehdi Noroozi, Ananth Vinjimoor, Paolo Favaro, Hamed Pirsiavash

    Abstract: In self-supervised learning, one trains a model to solve a so-called pretext task on a dataset without the need for human annotation. The main objective, however, is to transfer this model to a target domain and task. Currently, the most effective transfer strategy is fine-tuning, which restricts one to use the same model or parts thereof for both pretext and target tasks. In this paper, we presen…

    Submitted 1 May, 2018; originally announced May 2018.

  48. arXiv:1804.08972  [pdf, other]

    cs.CV cs.GR

    FaceShop: Deep Sketch-based Face Image Editing

    Authors: Tiziano Portenier, Qiyang Hu, Attila Szabó, Siavash Arjomand Bigdeli, Paolo Favaro, Matthias Zwicker

    Abstract: We present a novel system for sketch-based face image editing, enabling users to edit images intuitively by sketching a few strokes on a region of interest. Our interface features tools to express a desired image manipulation by providing both geometry and color constraints as user-drawn strokes. As an alternative to the direct user input, our proposed system naturally supports a copy-paste mode,…

    Submitted 7 June, 2018; v1 submitted 24 April, 2018; originally announced April 2018.

    Comments: 13 pages, 20 figures

  49. arXiv:1804.04065  [pdf, other]

    cs.CV

    Learning to Extract a Video Sequence from a Single Motion-Blurred Image

    Authors: Meiguang Jin, Givi Meishvili, Paolo Favaro

    Abstract: We present a method to extract a video sequence from a single motion-blurred image. Motion-blurred images are the result of an averaging process, where instant frames are accumulated over time during the exposure of the sensor. Unfortunately, reversing this process is nontrivial. Firstly, averaging destroys the temporal ordering of the frames. Secondly, the recovery of a single frame is a blind de…

    Submitted 11 April, 2018; originally announced April 2018.

  50. arXiv:1803.03330  [pdf, other]

    cs.CV

    Motion deblurring of faces

    Authors: Grigorios G. Chrysos, Paolo Favaro, Stefanos Zafeiriou

    Abstract: Face analysis is a core part of computer vision, in which remarkable progress has been observed in the past decades. Current methods achieve recognition and tracking with invariance to fundamental modes of variation such as illumination, 3D pose, expressions. Notwithstanding, a much less standing mode of variation is motion deblurring, which however presents substantial challenges in face analysis…

    Submitted 8 March, 2018; originally announced March 2018.