
Showing 1–50 of 79 results for author: Rupprecht, C

  1. arXiv:2407.20034  [pdf, other]

    cs.CV

    MaskInversion: Localized Embeddings via Optimization of Explainability Maps

    Authors: Walid Bousselham, Sofian Chaybouti, Christian Rupprecht, Vittorio Ferrari, Hilde Kuehne

    Abstract: Vision-language foundation models such as CLIP have achieved tremendous results in global vision-language alignment, but still show some limitations in creating representations for specific image regions. To address this problem, we propose MaskInversion, a method that leverages the feature representations of pre-trained foundation models, such as CLIP, to generate a context-aware embedding for…

    Submitted 29 July, 2024; originally announced July 2024.

    Comments: Project page: https://walidbousselham.com/MaskInversion

  2. arXiv:2407.18907  [pdf, other]

    cs.CV

    SHIC: Shape-Image Correspondences with no Keypoint Supervision

    Authors: Aleksandar Shtedritski, Christian Rupprecht, Andrea Vedaldi

    Abstract: Canonical surface mapping generalizes keypoint detection by assigning each pixel of an object to a corresponding point in a 3D template. Popularised by DensePose for the analysis of humans, authors have since attempted to apply the concept to more categories, but with limited success due to the high cost of manual supervision. In this work, we introduce SHIC, a method to learn canonical maps witho…

    Submitted 26 July, 2024; originally announced July 2024.

    Comments: ECCV 2024. Project website https://www.robots.ox.ac.uk/~vgg/research/shic/

  3. arXiv:2407.18245  [pdf, other]

    cs.CV cs.LG

    VGGHeads: A Large-Scale Synthetic Dataset for 3D Human Heads

    Authors: Orest Kupyn, Eugene Khvedchenia, Christian Rupprecht

    Abstract: Human head detection, keypoint estimation, and 3D head model fitting are important tasks with many applications. However, traditional real-world datasets often suffer from bias, privacy, and ethical concerns, and they have been recorded in laboratory environments, which makes it difficult for trained models to generalize. Here, we introduce VGGHeads -- a large scale synthetic dataset generated wit…

    Submitted 25 July, 2024; originally announced July 2024.

  4. arXiv:2406.08249  [pdf, other]

    cs.CV cs.LG

    Dataset Enhancement with Instance-Level Augmentations

    Authors: Orest Kupyn, Christian Rupprecht

    Abstract: We present a method for expanding a dataset by incorporating knowledge from the wide distribution of pre-trained latent diffusion models. Data augmentations typically incorporate inductive biases about the image formation process into the training (e.g. translation, scaling, colour changes, etc.). Here, we go beyond simple pixel transformations and introduce the concept of instance-level data augm…

    Submitted 12 June, 2024; originally announced June 2024.

  5. arXiv:2406.04343  [pdf, other]

    cs.CV

    Flash3D: Feed-Forward Generalisable 3D Scene Reconstruction from a Single Image

    Authors: Stanislaw Szymanowicz, Eldar Insafutdinov, Chuanxia Zheng, Dylan Campbell, João F. Henriques, Christian Rupprecht, Andrea Vedaldi

    Abstract: In this paper, we propose Flash3D, a method for scene reconstruction and novel view synthesis from a single image which is both very generalisable and efficient. For generalisability, we start from a "foundation" model for monocular depth estimation and extend it to a full 3D shape and appearance reconstructor. For efficiency, we base this extension on feed-forward Gaussian Splatting. Specifically…

    Submitted 6 June, 2024; originally announced June 2024.

    Comments: Project page: https://www.robots.ox.ac.uk/~vgg/research/flash3d/

  6. arXiv:2404.19758  [pdf, other]

    cs.CV

    Invisible Stitch: Generating Smooth 3D Scenes with Depth Inpainting

    Authors: Paul Engstler, Andrea Vedaldi, Iro Laina, Christian Rupprecht

    Abstract: 3D scene generation has quickly become a challenging new research direction, fueled by consistent improvements of 2D generative diffusion models. Most prior work in this area generates scenes by iteratively stitching newly generated frames with existing geometry. These works often depend on pre-trained monocular depth estimators to lift the generated images into 3D, fusing them with the existing s…

    Submitted 30 April, 2024; originally announced April 2024.

    Comments: Project page: https://research.paulengstler.com/invisible-stitch/

  7. arXiv:2403.15382  [pdf, other]

    cs.CV

    DragAPart: Learning a Part-Level Motion Prior for Articulated Objects

    Authors: Ruining Li, Chuanxia Zheng, Christian Rupprecht, Andrea Vedaldi

    Abstract: We introduce DragAPart, a method that, given an image and a set of drags as input, generates a new image of the same object that responds to the action of the drags. Unlike prior works that focused on repositioning objects, DragAPart predicts part-level interactions, such as opening and closing a drawer. We study this problem as a proxy for learning a generalist motion model, not restric…

    Submitted 28 July, 2024; v1 submitted 22 March, 2024; originally announced March 2024.

    Comments: Project page: https://dragapart.github.io/

  8. arXiv:2403.15064  [pdf, other]

    cs.CV cs.GR

    Recent Trends in 3D Reconstruction of General Non-Rigid Scenes

    Authors: Raza Yunus, Jan Eric Lenssen, Michael Niemeyer, Yiyi Liao, Christian Rupprecht, Christian Theobalt, Gerard Pons-Moll, Jia-Bin Huang, Vladislav Golyanik, Eddy Ilg

    Abstract: Reconstructing models of the real world, including 3D geometry, appearance, and motion of real scenes, is essential for computer graphics and computer vision. It enables the synthesizing of photorealistic novel views, useful for the movie industry and AR/VR applications. It also facilitates the content creation necessary in computer games and AR/VR by avoiding laborious manual design processes. Fu…

    Submitted 6 May, 2024; v1 submitted 22 March, 2024; originally announced March 2024.

    Comments: 42 pages, 18 figures, 5 tables; State-of-the-Art Report at EUROGRAPHICS 2024. Project page: https://razayunus.github.io/non-rigid-star

  9. arXiv:2402.08682  [pdf, other]

    cs.CV cs.AI cs.LG

    IM-3D: Iterative Multiview Diffusion and Reconstruction for High-Quality 3D Generation

    Authors: Luke Melas-Kyriazi, Iro Laina, Christian Rupprecht, Natalia Neverova, Andrea Vedaldi, Oran Gafni, Filippos Kokkinos

    Abstract: Most text-to-3D generators build upon off-the-shelf text-to-image models trained on billions of images. They use variants of Score Distillation Sampling (SDS), which is slow, somewhat unstable, and prone to artifacts. A mitigation is to fine-tune the 2D generator to be multi-view aware, which can help distillation or can be combined with reconstruction networks to output 3D objects directly. In th…

    Submitted 13 February, 2024; originally announced February 2024.

  10. arXiv:2401.02400  [pdf, other]

    cs.CV

    Learning the 3D Fauna of the Web

    Authors: Zizhang Li, Dor Litvak, Ruining Li, Yunzhi Zhang, Tomas Jakab, Christian Rupprecht, Shangzhe Wu, Andrea Vedaldi, Jiajun Wu

    Abstract: Learning 3D models of all animals on the Earth requires massively scaling up existing solutions. With this ultimate goal in mind, we develop 3D-Fauna, an approach that learns a pan-category deformable 3D animal model for more than 100 animal species jointly. One crucial bottleneck of modeling animals is the limited availability of training data, which we overcome by simply learning from 2D Interne…

    Submitted 1 April, 2024; v1 submitted 4 January, 2024; originally announced January 2024.

    Comments: The first two authors contributed equally to this work. The last three authors contributed equally. Project page: https://kyleleey.github.io/3DFauna/

  11. arXiv:2312.13150  [pdf, other]

    cs.CV

    Splatter Image: Ultra-Fast Single-View 3D Reconstruction

    Authors: Stanislaw Szymanowicz, Christian Rupprecht, Andrea Vedaldi

    Abstract: We introduce the Splatter Image, an ultra-efficient approach for monocular 3D object reconstruction. Splatter Image is based on Gaussian Splatting, which allows fast and high-quality reconstruction of 3D scenes from multiple images. We apply Gaussian Splatting to monocular reconstruction by learning a neural network that, at test time, performs reconstruction in a feed-forward manner, at 38 FPS. Our main…

    Submitted 16 April, 2024; v1 submitted 20 December, 2023; originally announced December 2023.

    Comments: CVPR 2024. Project page: https://szymanowiczs.github.io/splatter-image.html . Code: https://github.com/szymanowiczs/splatter-image , Demo: https://huggingface.co/spaces/szymanowiczs/splatter_image

  12. arXiv:2312.12419  [pdf, other]

    cs.CV

    Scene-Conditional 3D Object Stylization and Composition

    Authors: Jinghao Zhou, Tomas Jakab, Philip Torr, Christian Rupprecht

    Abstract: Recently, 3D generative models have made impressive progress, enabling the generation of almost arbitrary 3D assets from text or image inputs. However, these approaches generate objects in isolation without any consideration for the scene where they will eventually be placed. In this paper, we propose a framework that allows for the stylization of an existing 3D asset to fit into a given 2D scene,…

    Submitted 19 December, 2023; originally announced December 2023.

  13. arXiv:2312.04563  [pdf, other]

    cs.CV cs.RO

    Visual Geometry Grounded Deep Structure From Motion

    Authors: Jianyuan Wang, Nikita Karaev, Christian Rupprecht, David Novotny

    Abstract: Structure-from-motion (SfM) is a long-standing problem in the computer vision community, which aims to reconstruct the camera poses and 3D structure of a scene from a set of unconstrained 2D images. Classical frameworks solve this problem in an incremental manner by detecting and matching keypoints, registering images, triangulating 3D points, and conducting bundle adjustment. Recent research effo…

    Submitted 7 December, 2023; originally announced December 2023.

    Comments: 8 figures. Project page: https://vggsfm.github.io/

  14. arXiv:2312.03209  [pdf, other]

    cs.CV

    Cache Me if You Can: Accelerating Diffusion Models through Block Caching

    Authors: Felix Wimbauer, Bichen Wu, Edgar Schoenfeld, Xiaoliang Dai, Ji Hou, Zijian He, Artsiom Sanakoyeu, Peizhao Zhang, Sam Tsai, Jonas Kohler, Christian Rupprecht, Daniel Cremers, Peter Vajda, Jialiang Wang

    Abstract: Diffusion models have recently revolutionized the field of image synthesis due to their ability to generate photorealistic images. However, one of the major drawbacks of diffusion models is that the image generation process is costly. A large image-to-image network has to be applied many times to iteratively refine an image from random noise. While many recent works propose techniques to reduce th…

    Submitted 12 January, 2024; v1 submitted 5 December, 2023; originally announced December 2023.

    Comments: Project page: https://fwmb.github.io/blockcaching/

  15. arXiv:2311.14665  [pdf, other]

    cs.CV

    Understanding Self-Supervised Features for Learning Unsupervised Instance Segmentation

    Authors: Paul Engstler, Luke Melas-Kyriazi, Christian Rupprecht, Iro Laina

    Abstract: Self-supervised learning (SSL) can be used to solve complex visual tasks without human labels. Self-supervised representations encode useful semantic information about images, and as a result, they have already been used for tasks such as unsupervised semantic segmentation. In this paper, we investigate self-supervised representations for instance segmentation without any manual annotations. We fi…

    Submitted 24 November, 2023; originally announced November 2023.

  16. arXiv:2310.07522  [pdf, other]

    cs.CV

    S4C: Self-Supervised Semantic Scene Completion with Neural Fields

    Authors: Adrian Hayler, Felix Wimbauer, Dominik Muhle, Christian Rupprecht, Daniel Cremers

    Abstract: 3D semantic scene understanding is a fundamental challenge in computer vision. It enables mobile agents to autonomously plan and navigate arbitrary environments. SSC formalizes this challenge as jointly estimating dense geometry and semantic information from sparse observations of a scene. Current methods for SSC are generally trained on 3D ground truth based on aggregated LiDAR scans. This proces…

    Submitted 12 October, 2023; v1 submitted 11 October, 2023; originally announced October 2023.

  17. arXiv:2310.04900  [pdf, other]

    cs.CV

    HowToCaption: Prompting LLMs to Transform Video Annotations at Scale

    Authors: Nina Shvetsova, Anna Kukleva, Xudong Hong, Christian Rupprecht, Bernt Schiele, Hilde Kuehne

    Abstract: Instructional videos are an excellent source for learning multimodal representations by leveraging video-subtitle pairs extracted with automatic speech recognition systems (ASR) from the audio signal in the videos. However, in contrast to human-annotated captions, both speech and subtitles naturally differ from the visual content of the videos and thus provide only noisy supervision for multimodal…

    Submitted 7 October, 2023; originally announced October 2023.

    Comments: https://github.com/ninatu/howtocaption

  18. arXiv:2307.07635  [pdf, other]

    cs.CV

    CoTracker: It is Better to Track Together

    Authors: Nikita Karaev, Ignacio Rocco, Benjamin Graham, Natalia Neverova, Andrea Vedaldi, Christian Rupprecht

    Abstract: We introduce CoTracker, a transformer-based model that tracks dense points in a frame jointly across a video sequence. This differs from most existing state-of-the-art approaches that track points independently, ignoring their correlation. We show that joint tracking results in a significantly higher tracking accuracy and robustness. We also provide several technical innovations, including the con…

    Submitted 26 December, 2023; v1 submitted 14 July, 2023; originally announced July 2023.

    Comments: Code and model weights are available at: https://co-tracker.github.io/

  19. arXiv:2306.15667  [pdf, other]

    cs.CV

    PoseDiffusion: Solving Pose Estimation via Diffusion-aided Bundle Adjustment

    Authors: Jianyuan Wang, Christian Rupprecht, David Novotny

    Abstract: Camera pose estimation is a long-standing computer vision problem that to date often relies on classical methods, such as handcrafted keypoint matching, RANSAC and bundle adjustment. In this paper, we propose to formulate the Structure from Motion (SfM) problem inside a probabilistic diffusion framework, modelling the conditional distribution of camera poses given input images. This novel view of…

    Submitted 24 January, 2024; v1 submitted 27 June, 2023; originally announced June 2023.

    Comments: ICCV Camera Ready: revised Introduction and Related work, added a metric mAA (AUC), added some quantitative results, and added Appendix

  20. arXiv:2306.09316  [pdf, other]

    cs.CV

    Diffusion Models for Zero-Shot Open-Vocabulary Segmentation

    Authors: Laurynas Karazija, Iro Laina, Andrea Vedaldi, Christian Rupprecht

    Abstract: The variety of objects in the real world is nearly unlimited and is thus impossible to capture using models trained on a fixed set of categories. As a result, in recent years, open-vocabulary methods have attracted the interest of the community. This paper proposes a new method for zero-shot open-vocabulary segmentation. Prior work largely relies on contrastive training using image-text pairs, lev…

    Submitted 15 June, 2023; originally announced June 2023.

    Comments: Project page https://www.robots.ox.ac.uk/~vgg/research/ovdiff

  21. arXiv:2306.07881  [pdf, other]

    cs.CV

    Viewset Diffusion: (0-)Image-Conditioned 3D Generative Models from 2D Data

    Authors: Stanislaw Szymanowicz, Christian Rupprecht, Andrea Vedaldi

    Abstract: We present Viewset Diffusion, a diffusion-based generator that outputs 3D objects while only using multi-view 2D data for supervision. We note that there exists a one-to-one mapping between viewsets, i.e., collections of several 2D views of an object, and 3D models. Hence, we train a diffusion model to generate viewsets, but design the neural network generator to reconstruct internally correspondi…

    Submitted 1 September, 2023; v1 submitted 13 June, 2023; originally announced June 2023.

    Comments: International Conference on Computer Vision 2023

  22. arXiv:2305.02296  [pdf, other]

    cs.CV cs.AI

    DynamicStereo: Consistent Dynamic Depth from Stereo Videos

    Authors: Nikita Karaev, Ignacio Rocco, Benjamin Graham, Natalia Neverova, Andrea Vedaldi, Christian Rupprecht

    Abstract: We consider the problem of reconstructing a dynamic scene observed from a stereo camera. Most existing methods for depth from stereo treat different stereo frames independently, leading to temporally inconsistent depth predictions. Temporal consistency is especially important for immersive AR or VR scenarios, where flickering greatly diminishes the user experience. We propose DynamicStereo, a nove…

    Submitted 3 May, 2023; originally announced May 2023.

    Comments: CVPR 2023; project page available at https://dynamic-stereo.github.io/

  23. arXiv:2304.10535  [pdf, other]

    cs.CV

    Farm3D: Learning Articulated 3D Animals by Distilling 2D Diffusion

    Authors: Tomas Jakab, Ruining Li, Shangzhe Wu, Christian Rupprecht, Andrea Vedaldi

    Abstract: We present Farm3D, a method for learning category-specific 3D reconstructors for articulated objects, relying solely on "free" virtual supervision from a pre-trained 2D diffusion-based image generator. Recent approaches can learn a monocular network that predicts the 3D shape, albedo, illumination, and viewpoint of any object occurrence, given a collection of single-view images of an object catego…

    Submitted 14 May, 2024; v1 submitted 20 April, 2023; originally announced April 2023.

    Comments: In 3DV 2024, Project page: http://farm3d.github.io

  24. arXiv:2304.06712  [pdf, other]

    cs.CV

    What does CLIP know about a red circle? Visual prompt engineering for VLMs

    Authors: Aleksandar Shtedritski, Christian Rupprecht, Andrea Vedaldi

    Abstract: Large-scale Vision-Language Models, such as CLIP, learn powerful image-text representations that have found numerous applications, from zero-shot classification to text-to-image generation. Despite that, their capabilities for solving novel discriminative tasks via prompting fall behind those of large language models, such as GPT-3. Here we explore the idea of visual prompt engineering for solving…

    Submitted 18 August, 2023; v1 submitted 13 April, 2023; originally announced April 2023.

    Comments: ICCV 2023 Oral

  25. arXiv:2304.03110  [pdf, other]

    cs.CV

    Continual Detection Transformer for Incremental Object Detection

    Authors: Yaoyao Liu, Bernt Schiele, Andrea Vedaldi, Christian Rupprecht

    Abstract: Incremental object detection (IOD) aims to train an object detector in phases, each with annotations for new object categories. Like other incremental settings, IOD is subject to catastrophic forgetting, which is often addressed by techniques such as knowledge distillation (KD) and exemplar replay (ER). However, KD and ER do not work well if applied directly to state-of-the-art transformer-based obj…

    Submitted 6 April, 2023; originally announced April 2023.

    Comments: Accepted to CVPR 2023

  26. arXiv:2303.13664  [pdf, other]

    cs.CV cs.LG

    Temperature Schedules for Self-Supervised Contrastive Methods on Long-Tail Data

    Authors: Anna Kukleva, Moritz Böhle, Bernt Schiele, Hilde Kuehne, Christian Rupprecht

    Abstract: Most approaches for self-supervised learning (SSL) are optimised on curated balanced datasets, e.g. ImageNet, despite the fact that natural data usually exhibits long-tail distributions. In this paper, we analyse the behaviour of one of the most popular variants of SSL, i.e. contrastive methods, on long-tail data. In particular, we investigate the role of the temperature parameter $τ$ in the contr…

    Submitted 23 March, 2023; originally announced March 2023.

    Comments: ICLR 2023

  27. arXiv:2302.10668  [pdf, other]

    cs.CV cs.AI cs.LG

    $PC^2$: Projection-Conditioned Point Cloud Diffusion for Single-Image 3D Reconstruction

    Authors: Luke Melas-Kyriazi, Christian Rupprecht, Andrea Vedaldi

    Abstract: Reconstructing the 3D shape of an object from a single RGB image is a long-standing and highly challenging problem in computer vision. In this paper, we propose a novel method for single-image 3D reconstruction which generates a sparse point cloud via a conditional denoising diffusion process. Our method takes as input a single RGB image along with its camera pose and gradually denoises a set of 3…

    Submitted 23 February, 2023; v1 submitted 21 February, 2023; originally announced February 2023.

    Comments: Project page: https://lukemelas.github.io/projection-conditioned-point-cloud-diffusion

  28. arXiv:2302.10663  [pdf, other]

    cs.CV cs.AI cs.LG

    RealFusion: 360° Reconstruction of Any Object from a Single Image

    Authors: Luke Melas-Kyriazi, Christian Rupprecht, Iro Laina, Andrea Vedaldi

    Abstract: We consider the problem of reconstructing a full 360° photographic model of an object from a single image of it. We do so by fitting a neural radiance field to the image, but find this problem to be severely ill-posed. We thus take an off-the-shelf conditional image generator based on diffusion and engineer a prompt that encourages it to "dream up" novel views of the object. Using an approach inspi…

    Submitted 23 February, 2023; v1 submitted 21 February, 2023; originally announced February 2023.

    Comments: Project page: https://lukemelas.github.io/realfusion

  29. arXiv:2301.07668  [pdf, other]

    cs.CV

    Behind the Scenes: Density Fields for Single View Reconstruction

    Authors: Felix Wimbauer, Nan Yang, Christian Rupprecht, Daniel Cremers

    Abstract: Inferring a meaningful geometric scene representation from a single image is a fundamental problem in computer vision. Approaches based on traditional depth map prediction can only reason about areas that are visible in the image. Currently, neural radiance fields (NeRFs) can capture true 3D including color, but are too complex to be generated from a single image. As an alternative, we propose to…

    Submitted 19 April, 2023; v1 submitted 18 January, 2023; originally announced January 2023.

    Comments: Project Page: https://fwmb.github.io/bts/

  30. arXiv:2211.12497  [pdf, other]

    cs.CV

    MagicPony: Learning Articulated 3D Animals in the Wild

    Authors: Shangzhe Wu, Ruining Li, Tomas Jakab, Christian Rupprecht, Andrea Vedaldi

    Abstract: We consider the problem of predicting the 3D shape, articulation, viewpoint, texture, and lighting of an articulated animal like a horse given a single test image as input. We present a new method, dubbed MagicPony, that learns this predictor purely from in-the-wild single-view images of the object category, with minimal assumptions about the topology of deformation. At its core is an implicit-exp…

    Submitted 3 April, 2023; v1 submitted 22 November, 2022; originally announced November 2022.

    Comments: CVPR 2023. Project Page: https://3dmagicpony.github.io/

  31. Nondestructive thermographic detection of internal defects using pixel-pattern based laser excitation and photothermal super resolution reconstruction

    Authors: Julien Lecompagnon, Philipp Daniel Hirsch, Christian Rupprecht, Mathias Ziegler

    Abstract: In this work, we present a novel approach to photothermal super resolution based thermographic resolution of internal defects using two-dimensional pixel pattern-based active photothermal laser heating in conjunction with subsequent numerical reconstruction to achieve a high-resolution reconstruction of internal defect structures. With the proposed adoption of pixelated patterns generated using la…

    Submitted 3 January, 2023; v1 submitted 8 November, 2022; originally announced November 2022.

    Comments: 14 pages, 9 figures

  32. arXiv:2210.12148  [pdf, other]

    cs.CV cs.AI cs.LG

    Unsupervised Multi-object Segmentation by Predicting Probable Motion Patterns

    Authors: Laurynas Karazija, Subhabrata Choudhury, Iro Laina, Christian Rupprecht, Andrea Vedaldi

    Abstract: We propose a new approach to learn to segment multiple image objects without manual supervision. The method can extract objects from still images, but uses videos for supervision. While prior works have considered motion for segmentation, a key insight is that, while motion can be used to identify objects, not all objects are necessarily in motion: the absence of motion does not imply the absence…

    Submitted 21 October, 2022; originally announced October 2022.

  33. arXiv:2210.10820  [pdf, other]

    cs.CV cs.CL cs.IR cs.LG

    VTC: Improving Video-Text Retrieval with User Comments

    Authors: Laura Hanu, James Thewlis, Yuki M. Asano, Christian Rupprecht

    Abstract: Multi-modal retrieval is an important problem for many applications, such as recommendation and search. Current benchmarks and even datasets are often manually constructed and consist of mostly clean samples where all modalities are well-correlated with the content. Thus, current video-text retrieval literature largely focuses on video titles or audio transcripts, while ignoring user comments, sin…

    Submitted 19 October, 2022; originally announced October 2022.

    Comments: Accepted paper at the European Conference on Computer Vision (ECCV) 2022

  34. arXiv:2205.07844  [pdf, other]

    cs.CV

    Guess What Moves: Unsupervised Video and Image Segmentation by Anticipating Motion

    Authors: Subhabrata Choudhury, Laurynas Karazija, Iro Laina, Andrea Vedaldi, Christian Rupprecht

    Abstract: Motion, measured via optical flow, provides a powerful cue to discover and learn objects in images and videos. However, compared to using appearance, it has some blind spots, such as the fact that objects become invisible if they do not move. In this work, we propose an approach that combines the strengths of motion-based and appearance-based segmentation. We propose to supervise an image segmenta…

    Submitted 13 October, 2022; v1 submitted 16 May, 2022; originally announced May 2022.

    Comments: BMVC 2022

  35. arXiv:2205.07839  [pdf, other]

    cs.CV cs.AI

    Deep Spectral Methods: A Surprisingly Strong Baseline for Unsupervised Semantic Segmentation and Localization

    Authors: Luke Melas-Kyriazi, Christian Rupprecht, Iro Laina, Andrea Vedaldi

    Abstract: Unsupervised localization and segmentation are long-standing computer vision challenges that involve decomposing an image into semantically-meaningful segments without any labeled data. These tasks are particularly interesting in an unsupervised setting due to the difficulty and cost of obtaining dense image annotations, but existing unsupervised approaches struggle with complex scenes containing…

    Submitted 16 May, 2022; originally announced May 2022.

    Comments: Published at CVPR 2022. Project Page: https://lukemelas.github.io/deep-spectral-segmentation

  36. Brightening of a dark monolayer semiconductor via strong light-matter coupling in a cavity

    Authors: Hangyong Shan, Ivan Iorsh, Bo Han, Christoph Rupprecht, Heiko Knopf, Falk Eilenberger, Martin Esmann, Kentaro Yumigeta, Kenji Watanabe, Takashi Taniguchi, Sebastian Klembt, Sven Höfling, Sefaattin Tongay, Carlos Antón-Solanas, Ivan A. Shelykh, Christian Schneider

    Abstract: Engineering the properties of quantum materials via strong light-matter coupling is a compelling research direction with a multiplicity of modern applications. These range from modifying charge transport in organic molecules and steering particle correlation and interactions to controlling chemical reactions. Here, we study the modification of the material properties via strong coupling and de…

    Submitted 26 April, 2022; originally announced April 2022.

  37. arXiv:2203.02060  [pdf, other]

    eess.IV physics.app-ph physics.optics

    Thermographic detection of internal defects using 2D photothermal super resolution reconstruction with sequential laser heating

    Authors: Julien Lecompagnon, Samim Ahmadi, Philipp Hirsch, Christian Rupprecht, Mathias Ziegler

    Abstract: Thermographic photothermal super resolution reconstruction enables the resolution of internal defects/inhomogeneities below the classical limit which is governed by the diffusion properties of thermal wave propagation. Based on a combination of the application of special sampling strategies and a subsequent numerical optimization step in post-processing, thermographic super resolution has already…

    Submitted 24 April, 2022; v1 submitted 1 March, 2022; originally announced March 2022.

    Comments: 11 Pages, 9 Figures. The following article has been accepted by the Journal of Applied Physics. After it is published, it will be found at https://aip.scitation.org/toc/jap/collection/10.1063/jap.2022.P1NDM2022.issue-1

  38. arXiv:2201.02279  [pdf, other]

    cs.CV

    De-rendering 3D Objects in the Wild

    Authors: Felix Wimbauer, Shangzhe Wu, Christian Rupprecht

    Abstract: With increasing focus on augmented and virtual reality applications (XR) comes the demand for algorithms that can lift objects from images and videos into representations that are suitable for a wide variety of related 3D tasks. Large-scale deployment of XR devices and applications means that we cannot solely rely on supervised learning, as collecting and annotating data for the unlimited variety…

    Submitted 27 September, 2022; v1 submitted 6 January, 2022; originally announced January 2022.

    Journal ref: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2022, pp. 18490-18499

  39. arXiv:2111.10265  [pdf, other]

    cs.CV cs.LG

    ClevrTex: A Texture-Rich Benchmark for Unsupervised Multi-Object Segmentation

    Authors: Laurynas Karazija, Iro Laina, Christian Rupprecht

    Abstract: There has been a recent surge in methods that aim to decompose and segment scenes into multiple objects in an unsupervised manner, i.e., unsupervised multi-object segmentation. Performing such a task is a long-standing goal of computer vision, offering to unlock object-level reasoning without requiring dense annotations to train segmentation models. Despite significant progress, current models are…

    Submitted 19 November, 2021; originally announced November 2021.

    Comments: NeurIPS 2021 Datasets and Benchmarks

  40. arXiv:2111.06349  [pdf, other]

    cs.CV cs.LG

    Unsupervised Part Discovery from Contrastive Reconstruction

    Authors: Subhabrata Choudhury, Iro Laina, Christian Rupprecht, Andrea Vedaldi

    Abstract: The goal of self-supervised visual representation learning is to learn strong, transferable image representations, with the majority of research focusing on object or scene level. On the other hand, representation learning at part level has received significantly less attention. In this paper, we propose an unsupervised approach to object part discovery and segmentation and make three contribution…

    Submitted 21 March, 2022; v1 submitted 11 November, 2021; originally announced November 2021.

    Comments: NeurIPS 2021. Project page: https://www.robots.ox.ac.uk/~vgg/research/unsup-parts/

  41. arXiv:2111.03651  [pdf, other

    cs.CV cs.CL

    The Curious Layperson: Fine-Grained Image Recognition without Expert Labels

    Authors: Subhabrata Choudhury, Iro Laina, Christian Rupprecht, Andrea Vedaldi

    Abstract: Most of us are not experts in specific fields, such as ornithology. Nonetheless, we do have general image and language understanding capabilities that we use to match what we see to expert resources. This allows us to expand our knowledge and perform novel tasks without ad-hoc external supervision. On the contrary, machines have a much harder time consulting expert-curated knowledge bases unless t…

    Submitted 5 November, 2021; originally announced November 2021.

    Comments: To appear in BMVC 2021 (Oral). Project page: https://www.robots.ox.ac.uk/~vgg/research/clever/

  42. arXiv:2109.13228  [pdf, other

    cs.CV cs.CY

    PASS: An ImageNet replacement for self-supervised pretraining without humans

    Authors: Yuki M. Asano, Christian Rupprecht, Andrew Zisserman, Andrea Vedaldi

    Abstract: Computer vision has long relied on ImageNet and other large datasets of images sampled from the Internet for pretraining models. However, these datasets have ethical and technical shortcomings, such as containing personal information taken without consent, unclear license usage, biases, and, in some cases, even problematic image content. On the other hand, state-of-the-art pretraining is nowadays…

    Submitted 27 September, 2021; originally announced September 2021.

    Comments: Accepted to NeurIPS Track on Datasets and Benchmarks 2021. Webpage: https://www.robots.ox.ac.uk/~vgg/research/pass/

  43. arXiv:2107.10844  [pdf, other

    cs.CV

    DOVE: Learning Deformable 3D Objects by Watching Videos

    Authors: Shangzhe Wu, Tomas Jakab, Christian Rupprecht, Andrea Vedaldi

    Abstract: Learning deformable 3D objects from 2D images is often an ill-posed problem. Existing methods rely on explicit supervision to establish multi-view correspondences, such as template shape models and keypoint annotations, which restricts their applicability on objects "in the wild". A more natural way of establishing correspondences is by watching videos of objects moving around. In this paper, we p…

    Submitted 29 June, 2022; v1 submitted 22 July, 2021; originally announced July 2021.

    Comments: Project Page: https://dove3d.github.io/

  44. arXiv:2105.08127  [pdf, other

    cs.CV cs.AI

    Finding an Unsupervised Image Segmenter in Each of Your Deep Generative Models

    Authors: Luke Melas-Kyriazi, Christian Rupprecht, Iro Laina, Andrea Vedaldi

    Abstract: Recent research has shown that numerous human-interpretable directions exist in the latent space of GANs. In this paper, we develop an automatic procedure for finding directions that lead to foreground-background image separation, and we use these directions to train an image segmentation model without human supervision. Our method is generator-agnostic, producing strong segmentation results with…

    Submitted 17 May, 2021; originally announced May 2021.

    Comments: Project page and GitHub link: https://lukemelas.github.io/unsupervised-image-segmentation & https://github.com/lukemelas/unsupervised-image-segmentation

  45. arXiv:2103.16886  [pdf, other

    cs.CV cs.LG

    Neural Response Interpretation through the Lens of Critical Pathways

    Authors: Ashkan Khakzar, Soroosh Baselizadeh, Saurabh Khanduja, Christian Rupprecht, Seong Tae Kim, Nassir Navab

    Abstract: Is critical input information encoded in specific sparse pathways within the neural network? In this work, we discuss the problem of identifying these critical pathways and subsequently leverage them for interpreting the network's response to an input. The pruning objective -- selecting the smallest group of neurons for which the response remains equivalent to the original network -- has been prev…

    Submitted 31 March, 2021; originally announced March 2021.

    Comments: Accepted at CVPR 2021 (IEEE/CVF Conference on Computer Vision and Pattern Recognition)

  46. arXiv:2103.10459  [pdf

    cond-mat.mes-hall physics.optics

    Spatial coherence of room-temperature monolayer WSe$_2$ exciton-polaritons in a trap

    Authors: Hangyong Shan, Lukas Lackner, Bo Han, Evgeny Sedov, Christoph Rupprecht, Heiko Knopf, Falk Eilenberger, Johannes Beierlein, Nils Kunte, Martin Esmann, Kentaro Yumigeta, Kenji Watanabe, Takashi Taniguchi, Sebastian Klembt, Sven Höfling, Alexey V. Kavokin, Sefaattin Tongay, Christian Schneider, Carlos Antón-Solanas

    Abstract: The emergence of spatial and temporal coherence of light emitted from solid-state systems is a fundamental phenomenon, rooting in a plethora of microscopic processes. It is intrinsically aligned with the control of light-matter coupling, and canonical for laser oscillation. However, it also emerges in the superradiance of multiple, phase-locked emitters, and more recently, coherence and long-range…

    Submitted 9 November, 2021; v1 submitted 18 March, 2021; originally announced March 2021.

    Comments: 13 pages, 4 figures

    Journal ref: Shan, H. et al., Nature Communications 12, 6406 (2021)

  47. arXiv:2009.08196  [pdf

    cond-mat.mtrl-sci

    Micro-Mechanical assembly of high-quality Fabry-Perot microcavities for the integration with two-dimensional materials

    Authors: Christoph Rupprecht, Nils Lundt, Sven Höfling, Christian Schneider

    Abstract: Integrating monolayers of two-dimensional semiconductors in planar, and potentially microstructured microcavities is challenging because of the few, available approaches to overgrow the monolayers without damaging them. Some strategies have been developed, but they either rely on complicated experimental settings, expensive technologies or compromise the available quality factors. As a result, hig…

    Submitted 17 September, 2020; originally announced September 2020.

    Comments: 8 pages, 4 figures

  48. arXiv:2007.14818  [pdf

    cond-mat.mes-hall cond-mat.mtrl-sci

    Demonstration of a polariton step potential by local variation of light-matter coupling in a van-der-Waals heterostructure

    Authors: C. Rupprecht, M. Klaas, H. Knopf, T. Taniguchi, K. Watanabe, Y. Qin, S. Tongay, S. Schröder, F. Eilenberger, S. Höfling, C. Schneider

    Abstract: The large oscillator strength of excitons in transition metal dichalcogenide layers facilitates the formation of exciton-polariton resonances for monolayers and van-der-Waals heterostructures embedded in optical microcavities. Here, we show that locally changing the number of layers in a WSe2/hBN/WSe2 van-der-Waals heterostructure embedded in a monolithic, high-quality-factor cavity gives rise to…

    Submitted 29 July, 2020; originally announced July 2020.

    Journal ref: Optics Express 28.13 (2020): 18649-18657

  49. arXiv:2007.12022  [pdf

    cond-mat.mes-hall cond-mat.mtrl-sci

    Manipulation of room-temperature Valley-Coherent Exciton-Polaritons in atomically thin crystals by real and artificial magnetic fields

    Authors: Christoph Rupprecht, Evgeny Sedov, Martin Klaas, Heiko Knopf, Mark Blei, Nils Lundt, Sefaattin Tongay, Takashi Taniguchi, Kenji Watanabe, Ulrike Schulz, Alexey Kavokin, Falk Eilenberger, Sven Höfling, Christian Schneider

    Abstract: Strong spin-orbit coupling and inversion symmetry breaking in transition metal dichalcogenide monolayers yield the intriguing effects of valley-dependent optical selection rules. As such, it is possible to substantially polarize valley excitons with chiral light and furthermore create coherent superpositions of K and K- polarized states. Yet, at ambient conditions dephasing usually becomes too dom…

    Submitted 23 July, 2020; originally announced July 2020.

    Comments: 12 Pages, 5 Figures

    Journal ref: 2D Mater. 7 035025 (2020)

  50. arXiv:2006.13662  [pdf, other

    cs.CV cs.LG

    Labelling unlabelled videos from scratch with multi-modal self-supervision

    Authors: Yuki M. Asano, Mandela Patrick, Christian Rupprecht, Andrea Vedaldi

    Abstract: A large part of the current success of deep learning lies in the effectiveness of data -- more precisely: labelled data. Yet, labelling a dataset with human annotation continues to carry high costs, especially for videos. While in the image domain, recent methods have allowed to generate meaningful (pseudo-) labels for unlabelled datasets without supervision, this development is missing for the vi…

    Submitted 28 February, 2021; v1 submitted 24 June, 2020; originally announced June 2020.

    Comments: Accepted to NeurIPS 2020. Project page: https://www.robots.ox.ac.uk/~vgg/research/selavi, code: https://github.com/facebookresearch/selavi