
Showing 1–28 of 28 results for author: Greff, K

  1. arXiv:2310.13018  [pdf, other]

    q-bio.NC cs.AI cs.LG cs.NE

    Getting aligned on representational alignment

    Authors: Ilia Sucholutsky, Lukas Muttenthaler, Adrian Weller, Andi Peng, Andreea Bobu, Been Kim, Bradley C. Love, Erin Grant, Iris Groen, Jascha Achterberg, Joshua B. Tenenbaum, Katherine M. Collins, Katherine L. Hermann, Kerem Oktar, Klaus Greff, Martin N. Hebart, Nori Jacoby, Qiuyi Zhang, Raja Marjieh, Robert Geirhos, Sherol Chen, Simon Kornblith, Sunayana Rane, Talia Konkle, Thomas P. O'Connell, et al. (5 additional authors not shown)

    Abstract: Biological and artificial information processing systems form representations that they can use to categorize, reason, plan, navigate, and make decisions. How can we measure the extent to which the representations formed by these diverse systems agree? Do similarities in representations then translate into similar behavior? How can a system's representations be modified to better match those of an…

    Submitted 2 November, 2023; v1 submitted 18 October, 2023; originally announced October 2023.

    Comments: Working paper, changes to be made in upcoming revisions

  2. arXiv:2310.06020  [pdf, other]

    cs.CV cs.AI cs.GR cs.LG cs.RO

    DyST: Towards Dynamic Neural Scene Representations on Real-World Videos

    Authors: Maximilian Seitzer, Sjoerd van Steenkiste, Thomas Kipf, Klaus Greff, Mehdi S. M. Sajjadi

    Abstract: Visual understanding of the world goes beyond the semantics and flat structure of individual images. In this work, we aim to capture both the 3D structure and dynamics of real-world scenes from monocular real-world videos. Our Dynamic Scene Transformer (DyST) model leverages recent work in neural scene representation to learn a latent decomposition of monocular real-world videos into scene content…

    Submitted 15 March, 2024; v1 submitted 9 October, 2023; originally announced October 2023.

    Comments: ICLR 2024 spotlight. Project website: https://dyst-paper.github.io/

  3. arXiv:2305.18890  [pdf, other]

    cs.CV cs.LG

    Sensitivity of Slot-Based Object-Centric Models to their Number of Slots

    Authors: Roland S. Zimmermann, Sjoerd van Steenkiste, Mehdi S. M. Sajjadi, Thomas Kipf, Klaus Greff

    Abstract: Self-supervised methods for learning object-centric representations have recently been applied successfully to various datasets. This progress is largely fueled by slot-based methods, whose ability to cluster visual scenes into meaningful objects holds great promise for compositional generalization and downstream learning. In these methods, the number of slots (clusters) $K$ is typically chosen to…

    Submitted 30 May, 2023; originally announced May 2023.

  4. arXiv:2305.05591  [pdf, other]

    cs.SD cs.CV eess.AS

    AudioSlots: A slot-centric generative model for audio separation

    Authors: Pradyumna Reddy, Scott Wisdom, Klaus Greff, John R. Hershey, Thomas Kipf

    Abstract: In a range of recent works, object-centric architectures have been shown to be suitable for unsupervised scene decomposition in the vision domain. Inspired by these methods we present AudioSlots, a slot-centric generative model for blind source separation in the audio domain. AudioSlots is built using permutation-equivariant encoder and decoder networks. The encoder network based on the Transforme…

    Submitted 9 May, 2023; originally announced May 2023.

    Comments: Accepted at the Self-supervision in Audio, Speech and Beyond (SASB) Workshop at ICASSP 2023

  5. arXiv:2303.03378  [pdf, other]

    cs.LG cs.AI cs.RO

    PaLM-E: An Embodied Multimodal Language Model

    Authors: Danny Driess, Fei Xia, Mehdi S. M. Sajjadi, Corey Lynch, Aakanksha Chowdhery, Brian Ichter, Ayzaan Wahid, Jonathan Tompson, Quan Vuong, Tianhe Yu, Wenlong Huang, Yevgen Chebotar, Pierre Sermanet, Daniel Duckworth, Sergey Levine, Vincent Vanhoucke, Karol Hausman, Marc Toussaint, Klaus Greff, Andy Zeng, Igor Mordatch, Pete Florence

    Abstract: Large language models excel at a wide range of complex tasks. However, enabling general inference in the real world, e.g., for robotics problems, raises the challenge of grounding. We propose embodied language models to directly incorporate real-world continuous sensor modalities into language models and thereby establish the link between words and percepts. Input to our embodied language model ar…

    Submitted 6 March, 2023; originally announced March 2023.

  6. arXiv:2211.14306  [pdf, other]

    cs.CV cs.GR cs.LG eess.IV

    RUST: Latent Neural Scene Representations from Unposed Imagery

    Authors: Mehdi S. M. Sajjadi, Aravindh Mahendran, Thomas Kipf, Etienne Pot, Daniel Duckworth, Mario Lucic, Klaus Greff

    Abstract: Inferring the structure of 3D scenes from 2D observations is a fundamental challenge in computer vision. Recently popularized approaches based on neural scene representations have achieved tremendous impact and have been applied across a variety of applications. One of the major remaining challenges in this space is training a single model which can provide latent representations which effectively…

    Submitted 24 March, 2023; v1 submitted 25 November, 2022; originally announced November 2022.

    Comments: CVPR 2023 Highlight. Project website: https://rust-paper.github.io/

  7. arXiv:2210.05861  [pdf, other]

    cs.CV cs.AI cs.LG

    SlotFormer: Unsupervised Visual Dynamics Simulation with Object-Centric Models

    Authors: Ziyi Wu, Nikita Dvornik, Klaus Greff, Thomas Kipf, Animesh Garg

    Abstract: Understanding dynamics from visual observations is a challenging problem that requires disentangling individual objects from the scene and learning their interactions. While recent object-centric models can successfully decompose a scene into objects, modeling their dynamics effectively still remains a challenge. We address this problem by introducing SlotFormer -- a Transformer-based autoregressi…

    Submitted 20 January, 2023; v1 submitted 11 October, 2022; originally announced October 2022.

    Comments: Accepted by ICLR 2023. Project page: https://slotformer.github.io/

  8. arXiv:2206.07764  [pdf, other]

    cs.CV cs.LG

    SAVi++: Towards End-to-End Object-Centric Learning from Real-World Videos

    Authors: Gamaleldin F. Elsayed, Aravindh Mahendran, Sjoerd van Steenkiste, Klaus Greff, Michael C. Mozer, Thomas Kipf

    Abstract: The visual world can be parsimoniously characterized in terms of distinct entities with sparse interactions. Discovering this compositional structure in dynamic visual scenes has proven challenging for end-to-end computer vision approaches unless explicit instance-level supervision is provided. Slot-based models leveraging motion cues have recently shown great promise in learning to represent, seg…

    Submitted 23 December, 2022; v1 submitted 15 June, 2022; originally announced June 2022.

    Comments: Project page at https://slot-attention-video.github.io/savi++/

  9. arXiv:2206.06922  [pdf, other]

    cs.CV cs.AI cs.LG

    Object Scene Representation Transformer

    Authors: Mehdi S. M. Sajjadi, Daniel Duckworth, Aravindh Mahendran, Sjoerd van Steenkiste, Filip Pavetić, Mario Lučić, Leonidas J. Guibas, Klaus Greff, Thomas Kipf

    Abstract: A compositional understanding of the world in terms of objects and their geometry in 3D space is considered a cornerstone of human cognition. Facilitating the learning of such a representation in neural networks holds promise for substantially improving labeled data efficiency. As a key step in this direction, we make progress on the problem of learning 3D-consistent decompositions of complex scen…

    Submitted 12 October, 2022; v1 submitted 14 June, 2022; originally announced June 2022.

    Comments: Accepted at NeurIPS '22. Project page: https://osrt-paper.github.io/

  10. arXiv:2203.03570  [pdf, other]

    cs.CV cs.GR cs.LG

    Kubric: A scalable dataset generator

    Authors: Klaus Greff, Francois Belletti, Lucas Beyer, Carl Doersch, Yilun Du, Daniel Duckworth, David J. Fleet, Dan Gnanapragasam, Florian Golemo, Charles Herrmann, Thomas Kipf, Abhijit Kundu, Dmitry Lagun, Issam Laradji, Hsueh-Ti Liu, Henning Meyer, Yishu Miao, Derek Nowrouzezahrai, Cengiz Oztireli, Etienne Pot, Noha Radwan, Daniel Rebain, Sara Sabour, Mehdi S. M. Sajjadi, et al. (10 additional authors not shown)

    Abstract: Data is the driving force of machine learning, with the amount and quality of training data often being more important for the performance of a system than architecture and training details. But collecting, processing and annotating real data at scale is difficult, expensive, and frequently raises additional privacy, fairness and legal concerns. Synthetic data is a powerful tool with the potential…

    Submitted 7 March, 2022; originally announced March 2022.

    Comments: 21 pages, CVPR2022

  11. arXiv:2111.13260  [pdf, other]

    cs.CV cs.RO

    NeSF: Neural Semantic Fields for Generalizable Semantic Segmentation of 3D Scenes

    Authors: Suhani Vora, Noha Radwan, Klaus Greff, Henning Meyer, Kyle Genova, Mehdi S. M. Sajjadi, Etienne Pot, Andrea Tagliasacchi, Daniel Duckworth

    Abstract: We present NeSF, a method for producing 3D semantic fields from posed RGB images alone. In place of classical 3D representations, our method builds on recent work in implicit neural scene representations wherein 3D structure is captured by point-wise functions. We leverage this methodology to recover 3D density fields upon which we then train a 3D semantic segmentation model supervised by posed 2D…

    Submitted 2 December, 2021; v1 submitted 25 November, 2021; originally announced November 2021.

    Comments: Project website: https://nesf3d.github.io/. Updated with minor edits to text

  12. arXiv:2111.13152  [pdf, other]

    cs.CV cs.AI cs.GR cs.LG cs.RO

    Scene Representation Transformer: Geometry-Free Novel View Synthesis Through Set-Latent Scene Representations

    Authors: Mehdi S. M. Sajjadi, Henning Meyer, Etienne Pot, Urs Bergmann, Klaus Greff, Noha Radwan, Suhani Vora, Mario Lucic, Daniel Duckworth, Alexey Dosovitskiy, Jakob Uszkoreit, Thomas Funkhouser, Andrea Tagliasacchi

    Abstract: A classical problem in computer vision is to infer a 3D scene representation from few images that can be used to render novel views at interactive rates. Previous work focuses on reconstructing pre-defined 3D representations, e.g. textured meshes, or implicit representations, e.g. radiance fields, and often requires input images with precise camera poses and long processing times for each novel sc…

    Submitted 29 March, 2022; v1 submitted 25 November, 2021; originally announced November 2021.

    Comments: Accepted to CVPR 2022, Project website: https://srt-paper.github.io/

    Journal ref: CVPR 2022

  13. arXiv:2111.12594  [pdf, other]

    cs.CV cs.LG stat.ML

    Conditional Object-Centric Learning from Video

    Authors: Thomas Kipf, Gamaleldin F. Elsayed, Aravindh Mahendran, Austin Stone, Sara Sabour, Georg Heigold, Rico Jonschkowski, Alexey Dosovitskiy, Klaus Greff

    Abstract: Object-centric representations are a promising path toward more systematic generalization by providing flexible abstractions upon which compositional world models can be built. Recent work on simple 2D and 3D datasets has shown that models with object-centric inductive biases can learn to segment and represent meaningful objects from the statistical structure of the data alone without the need for…

    Submitted 15 March, 2022; v1 submitted 24 November, 2021; originally announced November 2021.

    Comments: Published at ICLR 2022. Project page at https://slot-attention-video.github.io/

  14. arXiv:2012.05208  [pdf, other]

    cs.NE cs.AI cs.LG

    On the Binding Problem in Artificial Neural Networks

    Authors: Klaus Greff, Sjoerd van Steenkiste, Jürgen Schmidhuber

    Abstract: Contemporary neural networks still fall short of human-level generalization, which extends far beyond our direct experiences. In this paper, we argue that the underlying cause for this shortcoming is their inability to dynamically and flexibly bind information that is distributed throughout the network. This binding problem affects their capacity to acquire a compositional understanding of the wor…

    Submitted 9 December, 2020; originally announced December 2020.

    ACM Class: I.2.6

  15. arXiv:2011.10287  [pdf, other]

    cs.CV cs.LG

    Learning Object-Centric Video Models by Contrasting Sets

    Authors: Sindy Löwe, Klaus Greff, Rico Jonschkowski, Alexey Dosovitskiy, Thomas Kipf

    Abstract: Contrastive, self-supervised learning of object representations recently emerged as an attractive alternative to reconstruction-based training. Prior approaches focus on contrasting individual object representations (slots) against one another. However, a fundamental problem with this approach is that the overall contrastive loss is the same for (i) representing a different object in each slot, as…

    Submitted 20 November, 2020; originally announced November 2020.

    Comments: NeurIPS 2020 Workshop on Object Representations for Learning and Reasoning

  16. arXiv:1906.01035  [pdf, other]

    cs.LG cs.AI cs.NE stat.ML

    A Perspective on Objects and Systematic Generalization in Model-Based RL

    Authors: Sjoerd van Steenkiste, Klaus Greff, Jürgen Schmidhuber

    Abstract: In order to meet the diverse challenges in solving many real-world problems, an intelligent agent has to be able to dynamically construct a model of its environment. Objects facilitate the modular reuse of prior knowledge and the combinatorial construction of such models. In this work, we argue that dynamically bound features (objects) do not simply emerge in connectionist models of the world. We…

    Submitted 3 June, 2019; originally announced June 2019.

    Comments: Accepted to the ICML 2019 Workshop on Generative Modeling and Model-Based Reasoning for Robotics and AI

    ACM Class: I.2.6

  17. arXiv:1903.00450  [pdf, other]

    cs.LG cs.CV stat.ML

    Multi-Object Representation Learning with Iterative Variational Inference

    Authors: Klaus Greff, Raphaël Lopez Kaufman, Rishabh Kabra, Nick Watters, Chris Burgess, Daniel Zoran, Loic Matthey, Matthew Botvinick, Alexander Lerchner

    Abstract: Human perception is structured around objects which form the basis for our higher-level cognition and impressive systematic generalization abilities. Yet most work on representation learning focuses on feature learning without even considering multiple objects, or treats segmentation as an (often supervised) preprocessing step. Instead, we argue for the importance of learning to segment and repres…

    Submitted 27 July, 2020; v1 submitted 1 March, 2019; originally announced March 2019.

    Journal ref: ICML 2019 (PMLR 97:2424-2433)

  18. arXiv:1802.10353  [pdf, other]

    cs.LG cs.AI cs.NE

    Relational Neural Expectation Maximization: Unsupervised Discovery of Objects and their Interactions

    Authors: Sjoerd van Steenkiste, Michael Chang, Klaus Greff, Jürgen Schmidhuber

    Abstract: Common-sense physical reasoning is an essential ingredient for any intelligent agent operating in the real-world. For example, it can be used to simulate the environment, or to infer the state of parts of the world that are currently unobserved. In order to match real-world conditions this causal knowledge must be learned without access to supervised data. To address this problem we present a nove…

    Submitted 28 February, 2018; originally announced February 2018.

    Comments: Accepted to ICLR 2018

    ACM Class: I.2.6

  19. arXiv:1708.03498  [pdf, other]

    cs.LG cs.NE stat.ML

    Neural Expectation Maximization

    Authors: Klaus Greff, Sjoerd van Steenkiste, Jürgen Schmidhuber

    Abstract: Many real world tasks such as reasoning and physical interaction require identification and manipulation of conceptual entities. A first step towards solving these tasks is the automated discovery of distributed symbol-like representations. In this paper, we explicitly formalize this problem as inference in a spatial mixture model where each component is parametrized by a neural network. Based on…

    Submitted 4 November, 2017; v1 submitted 11 August, 2017; originally announced August 2017.

    Comments: Accepted to NIPS 2017

    ACM Class: I.2.6

  20. arXiv:1612.07771  [pdf, other]

    cs.NE cs.AI cs.LG

    Highway and Residual Networks learn Unrolled Iterative Estimation

    Authors: Klaus Greff, Rupesh K. Srivastava, Jürgen Schmidhuber

    Abstract: The past year saw the introduction of new architectures such as Highway networks and Residual networks which, for the first time, enabled the training of feedforward networks with dozens to hundreds of layers using simple gradient descent. While depth of representation has been posited as a primary reason for their success, there are indications that these architectures defy a popular view of deep…

    Submitted 14 March, 2017; v1 submitted 22 December, 2016; originally announced December 2016.

    Comments: 10 + 4 pages, accepted for ICLR 2017

    ACM Class: I.2.6; I.5.1

  21. arXiv:1607.02168  [pdf, other]

    cs.ET

    Discovering Boolean Gates in Slime Mould

    Authors: Simon Harding, Jan Koutnik, Klaus Greff, Jurgen Schmidhuber, Andy Adamatzky

    Abstract: Slime mould of Physarum polycephalum is a large cell exhibiting rich spatial non-linear electrical characteristics. We exploit the electrical properties of the slime mould to implement logic gates using a flexible hardware platform designed for investigating the electrical properties of a substrate (MECOBO). We apply arbitrary electrical signals to 'configure' the slime mould, i.e. change shape of…

    Submitted 7 July, 2016; originally announced July 2016.

  22. arXiv:1606.06724  [pdf, other]

    cs.CV cs.NE

    Tagger: Deep Unsupervised Perceptual Grouping

    Authors: Klaus Greff, Antti Rasmus, Mathias Berglund, Tele Hotloo Hao, Jürgen Schmidhuber, Harri Valpola

    Abstract: We present a framework for efficient perceptual inference that explicitly reasons about the segmentation of its inputs and features. Rather than being trained for any specific segmentation, our framework learns the grouping process in an unsupervised manner or alongside any supervised task. By enriching the representations of a neural network, we enable it to group the representations of different…

    Submitted 28 November, 2016; v1 submitted 21 June, 2016; originally announced June 2016.

    Comments: 14 pages + 5 pages supplementary, accepted at NIPS 2016

    MSC Class: 97R40

  23. arXiv:1511.06727  [pdf, other]

    cs.LG

    Scalable Gradient-Based Tuning of Continuous Regularization Hyperparameters

    Authors: Jelena Luketina, Mathias Berglund, Klaus Greff, Tapani Raiko

    Abstract: Hyperparameter selection generally relies on running multiple full training trials, with selection based on validation set performance. We propose a gradient-based approach for locally adjusting hyperparameters during training of the model. Hyperparameters are adjusted so as to make the model parameter gradients, and hence updates, more advantageous for the validation cost. We explore the approach…

    Submitted 17 June, 2016; v1 submitted 20 November, 2015; originally announced November 2015.

    Comments: 9 pages, 7 figures. Accepted at ICML 2016

  24. arXiv:1511.06418  [pdf, other]

    cs.LG cs.NE

    Binding via Reconstruction Clustering

    Authors: Klaus Greff, Rupesh Kumar Srivastava, Jürgen Schmidhuber

    Abstract: Disentangled distributed representations of data are desirable for machine learning, since they are more expressive and can generalize from fewer examples. However, for complex data, the distributed representations of multiple objects present in the same input can interfere and lead to ambiguities, which is commonly referred to as the binding problem. We argue for the importance of the binding pro…

    Submitted 20 January, 2016; v1 submitted 19 November, 2015; originally announced November 2015.

    Comments: 12 pages, plus 12 pages Appendix

  25. arXiv:1507.06228  [pdf, other]

    cs.LG cs.NE

    Training Very Deep Networks

    Authors: Rupesh Kumar Srivastava, Klaus Greff, Jürgen Schmidhuber

    Abstract: Theoretical and empirical evidence indicates that the depth of neural networks is crucial for their success. However, training becomes more difficult as depth increases, and training of very deep networks remains an open problem. Here we introduce a new architecture designed to overcome this. Our so-called highway networks allow unimpeded information flow across many layers on information highways…

    Submitted 23 November, 2015; v1 submitted 22 July, 2015; originally announced July 2015.

    Comments: 11 pages. Extends arXiv:1505.00387. Project webpage is at http://people.idsia.ch/~rupesh/very_deep_learning/. In Advances in Neural Information Processing Systems 2015

    MSC Class: 68T01 ACM Class: I.2.6; G.1.6

  26. arXiv:1505.00387  [pdf, other]

    cs.LG cs.NE

    Highway Networks

    Authors: Rupesh Kumar Srivastava, Klaus Greff, Jürgen Schmidhuber

    Abstract: There is plenty of theoretical and empirical evidence that depth of neural networks is a crucial ingredient for their success. However, network training becomes more difficult with increasing depth and training of very deep networks remains an open problem. In this extended abstract, we introduce a new architecture designed to ease gradient-based training of very deep networks. We refer to network…

    Submitted 3 November, 2015; v1 submitted 2 May, 2015; originally announced May 2015.

    Comments: 6 pages, 2 figures. Presented at ICML 2015 Deep Learning workshop. Full paper is at arXiv:1507.06228

    MSC Class: 68T01 ACM Class: I.2.6; G.1.6

  27. LSTM: A Search Space Odyssey

    Authors: Klaus Greff, Rupesh Kumar Srivastava, Jan Koutník, Bas R. Steunebrink, Jürgen Schmidhuber

    Abstract: Several variants of the Long Short-Term Memory (LSTM) architecture for recurrent neural networks have been proposed since its inception in 1995. In recent years, these networks have become the state-of-the-art models for a variety of machine learning problems. This has led to a renewed interest in understanding the role and utility of various computational components of typical LSTM variants. In t…

    Submitted 4 October, 2017; v1 submitted 13 March, 2015; originally announced March 2015.

    Comments: 12 pages, 6 figures

    MSC Class: 68T10 ACM Class: I.2.6; I.2.7; I.5.1; H.5.5

    Journal ref: IEEE Transactions on Neural Networks and Learning Systems (Volume: 28, Issue: 10, Oct. 2017), Pages: 2222-2232

  28. arXiv:1402.3511  [pdf, other]

    cs.NE cs.LG

    A Clockwork RNN

    Authors: Jan Koutník, Klaus Greff, Faustino Gomez, Jürgen Schmidhuber

    Abstract: Sequence prediction and classification are ubiquitous and challenging problems in machine learning that can require identifying complex dependencies between temporally distant inputs. Recurrent Neural Networks (RNNs) have the ability, in theory, to cope with these temporal dependencies by virtue of the short-term memory implemented by their recurrent (feedback) connections. However, in practice th…

    Submitted 14 February, 2014; originally announced February 2014.