Showing 1–11 of 11 results for author: Bachmann, R

  1. arXiv:2407.17365  [pdf, other]

    cs.CV

    ViPer: Visual Personalization of Generative Models via Individual Preference Learning

    Authors: Sogand Salehi, Mahdi Shafiei, Teresa Yeo, Roman Bachmann, Amir Zamir

    Abstract: Different users find different images generated for the same prompt desirable. This gives rise to personalized image generation which involves creating images aligned with an individual's visual preference. Current generative models are, however, unpersonalized, as they are tuned to produce outputs that appeal to a broad audience. Using them to generate images aligned with individual users relies…

    Submitted 24 July, 2024; originally announced July 2024.

    Comments: Project page at https://viper.epfl.ch/

  2. arXiv:2406.09406  [pdf, other]

    cs.CV cs.AI cs.LG

    4M-21: An Any-to-Any Vision Model for Tens of Tasks and Modalities

    Authors: Roman Bachmann, Oğuzhan Fatih Kar, David Mizrahi, Ali Garjani, Mingfei Gao, David Griffiths, Jiaming Hu, Afshin Dehghan, Amir Zamir

    Abstract: Current multimodal and multitask foundation models like 4M or UnifiedIO show promising results, but in practice their out-of-the-box abilities to accept diverse inputs and perform diverse tasks are limited by the (usually rather small) number of modalities and tasks they are trained on. In this paper, we expand upon their capabilities by training a single model on tens of highly diverse moda…

    Submitted 14 June, 2024; v1 submitted 13 June, 2024; originally announced June 2024.

    Comments: Project page at 4m.epfl.ch

  3. EUSO-SPB1 Mission and Science

    Authors: JEM-EUSO Collaboration, G. Abdellaoui, S. Abe, J. H. Adams Jr., D. Allard, G. Alonso, L. Anchordoqui, A. Anzalone, E. Arnone, K. Asano, R. Attallah, H. Attoui, M. Ave Pernas, R. Bachmann, S. Bacholle, M. Bagheri, M. Bakiri, J. Baláz, D. Barghini, S. Bartocci, M. Battisti, J. Bayer, B. Beldjilali, T. Belenguer, et al. (271 additional authors not shown)

    Abstract: The Extreme Universe Space Observatory on a Super Pressure Balloon 1 (EUSO-SPB1) was launched in 2017 April from Wanaka, New Zealand. The plan of this mission of opportunity on a NASA super pressure balloon test flight was to circle the southern hemisphere. The primary scientific goal was to make the first observations of ultra-high-energy cosmic-ray extensive air showers (EASs) by looking down on…

    Submitted 12 January, 2024; originally announced January 2024.

    Comments: 18 pages, 19 figures

    Journal ref: Astropart Phys 154 (2024) 102891

  4. arXiv:2312.06647  [pdf, other]

    cs.CV cs.AI cs.LG

    4M: Massively Multimodal Masked Modeling

    Authors: David Mizrahi, Roman Bachmann, Oğuzhan Fatih Kar, Teresa Yeo, Mingfei Gao, Afshin Dehghan, Amir Zamir

    Abstract: Current machine learning models for vision are often highly specialized and limited to a single modality and task. In contrast, recent large language models exhibit a wide range of capabilities, hinting at a possibility for similarly versatile models in computer vision. In this paper, we take a step in this direction and propose a multimodal training scheme called 4M. It consists of training a sin…

    Submitted 11 December, 2023; originally announced December 2023.

    Comments: NeurIPS 2023 Spotlight. Project page at https://4m.epfl.ch/

  5. arXiv:2305.00348  [pdf, other]

    cs.CV cs.RO

    Modality-invariant Visual Odometry for Embodied Vision

    Authors: Marius Memmel, Roman Bachmann, Amir Zamir

    Abstract: Effectively localizing an agent in a realistic, noisy setting is crucial for many embodied vision tasks. Visual Odometry (VO) is a practical substitute for unreliable GPS and compass sensors, especially in indoor environments. While SLAM-based methods show a solid performance without large data requirements, they are less flexible and robust w.r.t. noise and changes in the sensor suite compared…

    Submitted 29 April, 2023; originally announced May 2023.

  6. arXiv:2204.01678  [pdf, other]

    cs.CV cs.LG

    MultiMAE: Multi-modal Multi-task Masked Autoencoders

    Authors: Roman Bachmann, David Mizrahi, Andrei Atanov, Amir Zamir

    Abstract: We propose a pre-training strategy called Multi-modal Multi-task Masked Autoencoders (MultiMAE). It differs from standard Masked Autoencoding in two key aspects: I) it can optionally accept additional modalities of information in the input besides the RGB image (hence "multi-modal"), and II) its training objective accordingly includes predicting multiple outputs besides the RGB image (hence "multi…

    Submitted 4 April, 2022; originally announced April 2022.

    Comments: Project page at https://multimae.epfl.ch

  7. arXiv:2202.05822  [pdf, other]

    cs.GR cs.AI cs.CV

    CLIPasso: Semantically-Aware Object Sketching

    Authors: Yael Vinker, Ehsan Pajouheshgar, Jessica Y. Bo, Roman Christian Bachmann, Amit Haim Bermano, Daniel Cohen-Or, Amir Zamir, Ariel Shamir

    Abstract: Abstraction is at the heart of sketching due to the simple and minimal nature of line drawings. Abstraction entails identifying the essential visual properties of an object or scene, which requires semantic understanding and prior knowledge of high-level concepts. Abstract depictions are therefore challenging for artists, and even more so for machines. We present CLIPasso, an object sketching meth…

    Submitted 16 May, 2022; v1 submitted 11 February, 2022; originally announced February 2022.

    Comments: https://clipasso.github.io/clipasso/

  8. arXiv:2110.04994  [pdf, other]

    cs.CV cs.AI cs.GR cs.RO

    Omnidata: A Scalable Pipeline for Making Multi-Task Mid-Level Vision Datasets from 3D Scans

    Authors: Ainaz Eftekhar, Alexander Sax, Roman Bachmann, Jitendra Malik, Amir Zamir

    Abstract: This paper introduces a pipeline to parametrically sample and render multi-task vision datasets from comprehensive 3D scans from the real world. Changing the sampling parameters allows one to "steer" the generated datasets to emphasize specific information. In addition to enabling interesting lines of research, we show the tooling and generated data suffice to train robust vision models. Common…

    Submitted 11 October, 2021; originally announced October 2021.

    Comments: ICCV 2021: See project website https://omnidata.vision

  9. arXiv:2002.10778  [pdf, other]

    cs.LG stat.ML

    Training Binary Neural Networks using the Bayesian Learning Rule

    Authors: Xiangming Meng, Roman Bachmann, Mohammad Emtiyaz Khan

    Abstract: Neural networks with binary weights are computation-efficient and hardware-friendly, but their training is challenging because it involves a discrete optimization problem. Surprisingly, ignoring the discrete nature of the problem and using gradient-based methods, such as the Straight-Through Estimator, still works well in practice. This raises the question: are there principled approaches which ju…

    Submitted 17 August, 2020; v1 submitted 25 February, 2020; originally announced February 2020.

    Comments: accepted by ICML 2020, the camera-ready version

  10. arXiv:1908.11676  [pdf, other]

    cs.CV

    Motion Capture from Pan-Tilt Cameras with Unknown Orientation

    Authors: Roman Bachmann, Jörg Spörri, Pascal Fua, Helge Rhodin

    Abstract: In sports, such as alpine skiing, coaches would like to know the speed and various biomechanical variables of their athletes and competitors. Existing methods use either body-worn sensors, which are cumbersome to set up, or manual image annotation, which is time-consuming. We propose a method for estimating an athlete's global 3D position and articulated pose using multiple cameras. By contrast to…

    Submitted 30 August, 2019; originally announced August 2019.

    Comments: International Conference on 3D Vision 2019

  11. Driving forces for Ag-induced periodic faceting of vicinal Cu(111)

    Authors: A. R. Bachmann, A. Mugarza, S. Speller, J. E. Ortega

    Abstract: Adsorption of submonolayer amounts of Ag on vicinal Cu(111) induces periodic faceting. The equilibrium structure is characterized by Ag-covered facets that alternate with clean Cu stripes. At the atomic scale, the driving force is the matching of Ag(111)-like packed rows with Cu(111) terraces underneath. This determines the preference for the facet orientation and the evolution of different phas…

    Submitted 2 October, 2002; originally announced October 2002.

    Comments: 1 text, 4 figures