
Showing 1–50 of 74 results for author: Mandt, S

Searching in archive cs.
  1. arXiv:2406.16308  [pdf, other]

    cs.LG cs.AI cs.CL

    Anomaly Detection of Tabular Data Using LLMs

    Authors: Aodong Li, Yunhan Zhao, Chen Qiu, Marius Kloft, Padhraic Smyth, Maja Rudolph, Stephan Mandt

    Abstract: Large language models (LLMs) have shown their potential in long-context understanding and mathematical reasoning. In this paper, we study the problem of using LLMs to detect tabular anomalies and show that pre-trained LLMs are zero-shot batch-level anomaly detectors. That is, without extra distribution-specific model fitting, they can discover hidden outliers in a batch of data, demonstrating thei…

    Submitted 24 June, 2024; originally announced June 2024.

    Comments: accepted at the Anomaly Detection with Foundation Models workshop

  2. arXiv:2406.08953  [pdf, other]

    cs.CV cs.LG

    Preserving Identity with Variational Score for General-purpose 3D Editing

    Authors: Duong H. Le, Tuan Pham, Aniruddha Kembhavi, Stephan Mandt, Wei-Chiu Ma, Jiasen Lu

    Abstract: We present Piva (Preserving Identity with Variational Score Distillation), a novel optimization-based method for editing images and 3D models based on diffusion models. Specifically, our approach is inspired by the recently proposed method for 2D image editing - Delta Denoising Score (DDS). We pinpoint the limitations in DDS for 2D and 3D editing, which causes detail loss and over-saturation. To a…

    Submitted 13 June, 2024; originally announced June 2024.

    Comments: 22 pages, 14 figures

  3. arXiv:2406.08943  [pdf, other]

    cs.CV cs.LG

    Neural NeRF Compression

    Authors: Tuan Pham, Stephan Mandt

    Abstract: Neural Radiance Fields (NeRFs) have emerged as powerful tools for capturing detailed 3D scenes through continuous volumetric representations. Recent NeRFs utilize feature grids to improve rendering quality and speed; however, these representations introduce significant storage overhead. This paper presents a novel method for efficiently compressing a grid-based NeRF model, addressing the storage o…

    Submitted 13 June, 2024; originally announced June 2024.

    Comments: Accepted to ICML 2024

  4. arXiv:2405.17673  [pdf, other]

    cs.CV cs.LG stat.ML

    Fast Samplers for Inverse Problems in Iterative Refinement Models

    Authors: Kushagra Pandey, Ruihan Yang, Stephan Mandt

    Abstract: Constructing fast samplers for unconditional diffusion and flow-matching models has received much attention recently; however, existing methods for solving inverse problems, such as super-resolution, inpainting, or deblurring, still require hundreds to thousands of iterative steps to obtain high-quality results. We propose a plug-and-play framework for constructing efficient samplers for inverse p…

    Submitted 27 May, 2024; originally announced May 2024.

  5. arXiv:2403.05300  [pdf, other]

    cs.LG cs.AI

    Unity by Diversity: Improved Representation Learning in Multimodal VAEs

    Authors: Thomas M. Sutter, Yang Meng, Andrea Agostini, Daphné Chopard, Norbert Fortin, Julia E. Vogt, Bahbak Shahbaba, Stephan Mandt

    Abstract: Variational Autoencoders for multimodal data hold promise for many tasks in data analysis, such as representation learning, conditional generation, and imputation. Current architectures either share the encoder output, decoder input, or both across modalities to learn a shared representation. Such architectures impose hard constraints on the model. In this work, we show that a better latent repres…

    Submitted 31 May, 2024; v1 submitted 8 March, 2024; originally announced March 2024.

  6. arXiv:2403.00025  [pdf, ps, other]

    cs.LG cs.AI

    On the Challenges and Opportunities in Generative AI

    Authors: Laura Manduchi, Kushagra Pandey, Robert Bamler, Ryan Cotterell, Sina Däubener, Sophie Fellenz, Asja Fischer, Thomas Gärtner, Matthias Kirchler, Marius Kloft, Yingzhen Li, Christoph Lippert, Gerard de Melo, Eric Nalisnick, Björn Ommer, Rajesh Ranganath, Maja Rudolph, Karen Ullrich, Guy Van den Broeck, Julia E Vogt, Yixin Wang, Florian Wenzel, Frank Wood, Stephan Mandt, Vincent Fortuin

    Abstract: The field of deep generative modeling has grown rapidly and consistently over the years. With the availability of massive amounts of training data coupled with advances in scalable unsupervised learning paradigms, recent large-scale generative models show tremendous promise in synthesizing high-resolution images and text, as well as structured data such as videos and molecules. However, we argue t…

    Submitted 28 February, 2024; originally announced March 2024.

  7. arXiv:2402.07211  [pdf, other]

    cs.LG stat.ML

    Towards Fast Stochastic Sampling in Diffusion Generative Models

    Authors: Kushagra Pandey, Maja Rudolph, Stephan Mandt

    Abstract: Diffusion models suffer from slow sample generation at inference time. Despite recent efforts, improving the sampling efficiency of stochastic samplers for diffusion models remains a promising direction. We propose Splitting Integrators for fast stochastic sampling in pre-trained diffusion models in augmented spaces. Commonly used in molecular dynamics, splitting-based integrators attempt to impro…

    Submitted 13 February, 2024; v1 submitted 11 February, 2024; originally announced February 2024.

    Comments: Accepted in the NeurIPS'23 Workshop on Diffusion Models. Full version of this work can be found at arXiv:2310.07894

  8. arXiv:2402.00809  [pdf, other]

    cs.LG stat.ML

    Position: Bayesian Deep Learning is Needed in the Age of Large-Scale AI

    Authors: Theodore Papamarkou, Maria Skoularidou, Konstantina Palla, Laurence Aitchison, Julyan Arbel, David Dunson, Maurizio Filippone, Vincent Fortuin, Philipp Hennig, José Miguel Hernández-Lobato, Aliaksandr Hubin, Alexander Immer, Theofanis Karaletsos, Mohammad Emtiyaz Khan, Agustinus Kristiadi, Yingzhen Li, Stephan Mandt, Christopher Nemeth, Michael A. Osborne, Tim G. J. Rudner, David Rügamer, Yee Whye Teh, Max Welling, Andrew Gordon Wilson, Ruqi Zhang

    Abstract: In the current landscape of deep learning research, there is a predominant emphasis on achieving high predictive accuracy in supervised tasks involving large image and language datasets. However, a broader perspective reveals a multitude of overlooked metrics, tasks, and data types, such as uncertainty, active and continual learning, and scientific data, that demand attention. Bayesian deep learni…

    Submitted 2 June, 2024; v1 submitted 1 February, 2024; originally announced February 2024.

  9. arXiv:2312.06071  [pdf, other]

    cs.CV cs.LG physics.ao-ph stat.ML

    Precipitation Downscaling with Spatiotemporal Video Diffusion

    Authors: Prakhar Srivastava, Ruihan Yang, Gavin Kerrigan, Gideon Dresdner, Jeremy McGibbon, Christopher Bretherton, Stephan Mandt

    Abstract: In climate science and meteorology, high-resolution local precipitation (rain and snowfall) predictions are limited by the computational costs of simulation-based methods. Statistical downscaling, or super-resolution, is a common workaround where a low-resolution prediction is improved using statistical approaches. Unlike traditional computer vision tasks, weather and climate applications require…

    Submitted 20 June, 2024; v1 submitted 10 December, 2023; originally announced December 2023.

  10. arXiv:2311.05931  [pdf, other]

    cs.LG cs.AI stat.ML

    Early-Exit Neural Networks with Nested Prediction Sets

    Authors: Metod Jazbec, Patrick Forré, Stephan Mandt, Dan Zhang, Eric Nalisnick

    Abstract: Early-exit neural networks (EENNs) enable adaptive and efficient inference by providing predictions at multiple stages during the forward pass. In safety-critical applications, these predictions are meaningful only when accompanied by reliable uncertainty estimates. A popular method for quantifying the uncertainty of predictive models is the use of prediction sets. However, we demonstrate that sta…

    Submitted 2 June, 2024; v1 submitted 10 November, 2023; originally announced November 2023.

    Comments: UAI 2024

  11. arXiv:2310.20168  [pdf, other]

    cs.LG physics.ao-ph physics.flu-dyn

    Understanding and Visualizing Droplet Distributions in Simulations of Shallow Clouds

    Authors: Justus C. Will, Andrea M. Jenney, Kara D. Lamb, Michael S. Pritchard, Colleen Kaul, Po-Lun Ma, Kyle Pressel, Jacob Shpund, Marcus van Lier-Walqui, Stephan Mandt

    Abstract: Thorough analysis of local droplet-level interactions is crucial to better understand the microphysical processes in clouds and their effect on the global climate. High-accuracy simulations of relevant droplet size distributions from Large Eddy Simulations (LES) of bin microphysics challenge current analysis techniques due to their high dimensionality involving three spatial dimensions, time, and…

    Submitted 31 October, 2023; originally announced October 2023.

    Comments: 4 pages, 3 figures, accepted at NeurIPS 2023 (Machine Learning and the Physical Sciences Workshop)

  12. arXiv:2310.18908  [pdf, other]

    cs.IT cs.LG stat.AP stat.ML

    Estimating the Rate-Distortion Function by Wasserstein Gradient Descent

    Authors: Yibo Yang, Stephan Eckstein, Marcel Nutz, Stephan Mandt

    Abstract: In the theory of lossy compression, the rate-distortion (R-D) function $R(D)$ describes how much a data source can be compressed (in bit-rate) at any given level of fidelity (distortion). Obtaining $R(D)$ for a given data source establishes the fundamental performance limit for all compression algorithms. We propose a new method to estimate $R(D)$ from the perspective of optimal transport. Unlike…

    Submitted 29 October, 2023; originally announced October 2023.

    Comments: Accepted as conference paper at NeurIPS 2023

  13. arXiv:2310.07894  [pdf, other]

    cs.LG cs.AI cs.CV stat.ML

    Efficient Integrators for Diffusion Generative Models

    Authors: Kushagra Pandey, Maja Rudolph, Stephan Mandt

    Abstract: Diffusion models suffer from slow sample generation at inference time. Therefore, developing a principled framework for fast deterministic/stochastic sampling for a broader class of diffusion models is a promising direction. We propose two complementary frameworks for accelerating sample generation in pre-trained models: Conjugate Integrators and Splitting Integrators. Conjugate integrators genera…

    Submitted 11 October, 2023; originally announced October 2023.

  14. arXiv:2306.16717  [pdf, other]

    stat.ML cs.LG

    Understanding Pathologies of Deep Heteroskedastic Regression

    Authors: Eliot Wong-Toi, Alex Boyd, Vincent Fortuin, Stephan Mandt

    Abstract: Deep, overparameterized regression models are notorious for their tendency to overfit. This problem is exacerbated in heteroskedastic models, which predict both mean and residual noise for each data point. At one extreme, these models fit all training data perfectly, eliminating residual noise entirely; at the other, they overfit the residual noise while predicting a constant, uninformative mean.…

    Submitted 13 February, 2024; v1 submitted 29 June, 2023; originally announced June 2023.

    Comments: 20 pages, 8 figures

  15. arXiv:2306.08754  [pdf, other]

    cs.LG physics.ao-ph

    ClimSim: A large multi-scale dataset for hybrid physics-ML climate emulation

    Authors: Sungduk Yu, Walter Hannah, Liran Peng, Jerry Lin, Mohamed Aziz Bhouri, Ritwik Gupta, Björn Lütjens, Justus Christopher Will, Gunnar Behrens, Julius Busecke, Nora Loose, Charles I Stern, Tom Beucler, Bryce Harrop, Benjamin R Hillman, Andrea Jenney, Savannah Ferretti, Nana Liu, Anima Anandkumar, Noah D Brenowitz, Veronika Eyring, Nicholas Geneva, Pierre Gentine, Stephan Mandt, Jaideep Pathak, et al. (31 additional authors not shown)

    Abstract: Modern climate projections lack adequate spatial and temporal resolution due to computational constraints. A consequence is inaccurate and imprecise predictions of critical processes such as storms. Hybrid methods that combine physics with machine learning (ML) have introduced a new generation of higher fidelity climate simulators that can sidestep Moore's Law by outsourcing compute-hungry, short,…

    Submitted 6 February, 2024; v1 submitted 14 June, 2023; originally announced June 2023.

    Comments: NeurIPS 2023 Outstanding Datasets and Benchmarks Track Paper

  16. arXiv:2304.06244  [pdf, other]

    eess.IV cs.CV cs.LG

    Computationally-Efficient Neural Image Compression with Shallow Decoders

    Authors: Yibo Yang, Stephan Mandt

    Abstract: Neural image compression methods have seen increasingly strong performance in recent years. However, they suffer orders of magnitude higher computational complexity compared to traditional codecs, which hinders their real-world deployment. This paper takes a step forward towards closing this gap in decoding complexity by using a shallow or even linear decoding transform resembling that of JPEG. To…

    Submitted 10 November, 2023; v1 submitted 12 April, 2023; originally announced April 2023.

    Comments: Updated version of the ICCV 2023 paper. Previously titled "Asymmetrically-powered Neural Image Compression with Shallow Decoders" on arXiv

  17. arXiv:2303.05904  [pdf, ps, other]


    Deep Anomaly Detection on Tennessee Eastman Process Data

    Authors: Fabian Hartung, Billy Joe Franks, Tobias Michels, Dennis Wagner, Philipp Liznerski, Steffen Reithermann, Sophie Fellenz, Fabian Jirasek, Maja Rudolph, Daniel Neider, Heike Leitte, Chen Song, Benjamin Kloepper, Stephan Mandt, Michael Bortz, Jakob Burger, Hans Hasse, Marius Kloft

    Abstract: This paper provides the first comprehensive evaluation and analysis of modern (deep-learning) unsupervised anomaly detection methods for chemical process data. We focus on the Tennessee Eastman process dataset, which has been a standard litmus test to benchmark anomaly detection methods for nearly three decades. Our extensive study will facilitate choosing appropriate anomaly detection methods in…

    Submitted 10 March, 2023; originally announced March 2023.

  18. arXiv:2303.01748  [pdf, other]

    cs.LG cs.CV stat.ML

    A Complete Recipe for Diffusion Generative Models

    Authors: Kushagra Pandey, Stephan Mandt

    Abstract: Score-based Generative Models (SGMs) have demonstrated exceptional synthesis outcomes across various tasks. However, the current design landscape of the forward diffusion process remains largely untapped and often relies on physical heuristics or simplifying assumptions. Utilizing insights from the development of scalable Bayesian posterior samplers, we present a complete recipe for formulating fo…

    Submitted 11 October, 2023; v1 submitted 3 March, 2023; originally announced March 2023.

    Comments: Accepted in ICCV'23 (Oral Presentation)

  19. arXiv:2302.07849  [pdf, other]

    cs.LG cs.AI stat.ML

    Zero-Shot Anomaly Detection via Batch Normalization

    Authors: Aodong Li, Chen Qiu, Marius Kloft, Padhraic Smyth, Maja Rudolph, Stephan Mandt

    Abstract: Anomaly detection (AD) plays a crucial role in many safety-critical application domains. The challenge of adapting an anomaly detector to drift in the normal data distribution, especially when no training data is available for the "new normal," has led to the development of zero-shot AD techniques. In this paper, we propose a simple yet effective method called Adaptive Centered Representations (AC…

    Submitted 7 November, 2023; v1 submitted 15 February, 2023; originally announced February 2023.

    Comments: accepted at NeurIPS 2023

  20. arXiv:2302.07832  [pdf, other]

    cs.LG cs.AI

    Deep Anomaly Detection under Labeling Budget Constraints

    Authors: Aodong Li, Chen Qiu, Marius Kloft, Padhraic Smyth, Stephan Mandt, Maja Rudolph

    Abstract: Selecting informative data points for expert feedback can significantly improve the performance of anomaly detection (AD) in various contexts, such as medical diagnostics or fraud detection. In this paper, we determine a set of theoretical conditions under which anomaly scores generalize from labeled queries to unlabeled data. Motivated by these results, we propose a data labeling strategy with op…

    Submitted 4 July, 2023; v1 submitted 15 February, 2023; originally announced February 2023.

    Comments: ICML 2023

  21. arXiv:2302.04534  [pdf, other]

    cs.LG stat.ML

    Fully Bayesian Autoencoders with Latent Sparse Gaussian Processes

    Authors: Ba-Hien Tran, Babak Shahbaba, Stephan Mandt, Maurizio Filippone

    Abstract: Autoencoders and their variants are among the most widely used models in representation learning and generative modeling. However, autoencoder-based models usually assume that the learned representations are i.i.d. and fail to capture the correlations between the data samples. To address this issue, we propose a novel Sparse Gaussian Process Bayesian Autoencoder (SGPBAE) model in which we impose f…

    Submitted 9 February, 2023; originally announced February 2023.

  22. arXiv:2211.08499  [pdf, other]

    stat.ML cs.LG

    Probabilistic Querying of Continuous-Time Event Sequences

    Authors: Alex Boyd, Yuxin Chang, Stephan Mandt, Padhraic Smyth

    Abstract: Continuous-time event sequences, i.e., sequences consisting of continuous time stamps and associated event types ("marks"), are an important type of sequential data with many applications, e.g., in clinical medicine or user behavior modeling. Since these data are typically modeled autoregressively (e.g., using neural Hawkes processes or their classical counterparts), it is natural to ask questions…

    Submitted 15 November, 2022; originally announced November 2022.

  23. arXiv:2210.06464  [pdf, other]

    cs.LG cs.AI

    Predictive Querying for Autoregressive Neural Sequence Models

    Authors: Alex Boyd, Sam Showalter, Stephan Mandt, Padhraic Smyth

    Abstract: In reasoning about sequential events it is natural to pose probabilistic queries such as "when will event A occur next" or "what is the probability of A occurring before B", with applications in areas such as user modeling, medicine, and finance. However, with machine learning shifting towards neural autoregressive models such as RNNs and transformers, probabilistic querying has been largely restr…

    Submitted 4 November, 2022; v1 submitted 12 October, 2022; originally announced October 2022.

    Comments: Oral Presentation at the Intl. Conference on Neural Information Processing Systems (NeurIPS 2022)

  24. arXiv:2209.06950  [pdf, other]

    eess.IV cs.CV cs.LG stat.ML

    Lossy Image Compression with Conditional Diffusion Models

    Authors: Ruihan Yang, Stephan Mandt

    Abstract: This paper outlines an end-to-end optimized lossy image compression framework using diffusion generative models. The approach relies on the transform coding paradigm, where an image is mapped into a latent space for entropy coding and, from there, mapped back to the data space for reconstruction. In contrast to VAE-based neural compression, where the (mean) decoder is a deterministic neural networ…

    Submitted 2 January, 2024; v1 submitted 14 September, 2022; originally announced September 2022.

  25. arXiv:2205.13845  [pdf, other]

    cs.LG cs.AI

    Raising the Bar in Graph-level Anomaly Detection

    Authors: Chen Qiu, Marius Kloft, Stephan Mandt, Maja Rudolph

    Abstract: Graph-level anomaly detection has become a critical topic in diverse areas, such as financial fraud detection and detecting anomalous activities in social networks. While most research has focused on anomaly detection for visual data such as images, where high detection accuracies have been obtained, existing deep learning approaches for graphs currently show considerably worse performance. This p…

    Submitted 27 May, 2022; originally announced May 2022.

    Comments: To appear in IJCAI-ECAI 2022

    Journal ref: Proceedings of the Thirty-First International Joint Conference on Artificial Intelligence (IJCAI-22), 2022

  26. arXiv:2203.09481  [pdf, other]

    cs.CV cs.LG stat.ML

    Diffusion Probabilistic Modeling for Video Generation

    Authors: Ruihan Yang, Prakhar Srivastava, Stephan Mandt

    Abstract: Denoising diffusion probabilistic models are a promising new class of generative models that mark a milestone in high-quality image generation. This paper showcases their ability to sequentially generate video, surpassing prior methods in perceptual and probabilistic forecasting metrics. We propose an autoregressive, end-to-end optimized video diffusion model inspired by recent advances in neural…

    Submitted 7 December, 2022; v1 submitted 15 March, 2022; originally announced March 2022.

  27. arXiv:2203.08875  [pdf, other]

    cs.LG cs.CV eess.IV

    SC2 Benchmark: Supervised Compression for Split Computing

    Authors: Yoshitomo Matsubara, Ruihan Yang, Marco Levorato, Stephan Mandt

    Abstract: With the increasing demand for deep learning models on mobile devices, splitting neural network computation between the device and a more powerful edge server has become an attractive solution. However, existing split computing approaches often underperform compared to a naive baseline of remote computation on compressed data. Recent studies propose learning compressed representations that contain…

    Submitted 14 June, 2023; v1 submitted 16 March, 2022; originally announced March 2022.

    Comments: Accepted at TMLR. Code and models are available at https://github.com/yoshitomo-matsubara/sc2-benchmark

  28. arXiv:2202.08088  [pdf, other]

    cs.LG cs.AI

    Latent Outlier Exposure for Anomaly Detection with Contaminated Data

    Authors: Chen Qiu, Aodong Li, Marius Kloft, Maja Rudolph, Stephan Mandt

    Abstract: Anomaly detection aims at identifying data points that show systematic deviations from the majority of data in an unlabeled dataset. A common assumption is that clean training data (free of anomalies) is available, which is often violated in practice. We propose a strategy for training an anomaly detector in the presence of unlabeled anomalies that is compatible with a broad class of models. The i…

    Submitted 26 June, 2022; v1 submitted 16 February, 2022; originally announced February 2022.

    Comments: To appear in ICML 2022

    Journal ref: Proceedings of the 39th International Conference on Machine Learning, 2022, volume:162, pages:18153--18167

  29. arXiv:2202.06533  [pdf, other]

    cs.LG cs.IT eess.IV

    An Introduction to Neural Data Compression

    Authors: Yibo Yang, Stephan Mandt, Lucas Theis

    Abstract: Neural compression is the application of neural networks and other machine learning methods to data compression. Recent advances in statistical machine learning have opened up new possibilities for data compression, allowing compression algorithms to be learned end-to-end from data using powerful generative models such as normalizing flows, variational autoencoders, diffusion probabilistic models,…

    Submitted 16 August, 2023; v1 submitted 14 February, 2022; originally announced February 2022.

    Comments: Published in Foundations and Trends in Computer Graphics and Vision: Vol. 15, No. 2, pp 113-200. https://www.nowpublishers.com/article/Details/CGV-107

  30. arXiv:2202.03944  [pdf, other]

    cs.LG cs.AI

    Detecting Anomalies within Time Series using Local Neural Transformations

    Authors: Tim Schneider, Chen Qiu, Marius Kloft, Decky Aspandi Latif, Steffen Staab, Stephan Mandt, Maja Rudolph

    Abstract: We develop a new method to detect anomalies within time series, which is essential in many application domains, reaching from self-driving cars, finance, and marketing to medical diagnosis and epidemiology. The method is based on self-supervised deep learning that has played a key role in facilitating deep anomaly detection on images, where powerful image transformations are available. However, su…

    Submitted 20 February, 2022; v1 submitted 8 February, 2022; originally announced February 2022.

  31. arXiv:2112.01221  [pdf, other]

    physics.ao-ph cs.LG

    Analyzing High-Resolution Clouds and Convection using Multi-Channel VAEs

    Authors: Harshini Mangipudi, Griffin Mooers, Mike Pritchard, Tom Beucler, Stephan Mandt

    Abstract: Understanding the details of small-scale convection and storm formation is crucial to accurately represent the larger-scale planetary dynamics. Presently, atmospheric scientists run high-resolution, storm-resolving simulations to capture these kilometer-scale weather details. However, because they contain abundant information, these simulations can be overwhelming to analyze using conventional app…

    Submitted 1 December, 2021; originally announced December 2021.

    Comments: 4 Pages, 3 Figures. Accepted to NeurIPS 2021 (Machine Learning and Physical Sciences)

  32. arXiv:2111.12166  [pdf, other]

    cs.IT cs.LG stat.ML

    Towards Empirical Sandwich Bounds on the Rate-Distortion Function

    Authors: Yibo Yang, Stephan Mandt

    Abstract: Rate-distortion (R-D) function, a key quantity in information theory, characterizes the fundamental limit of how much a data source can be compressed subject to a fidelity criterion, by any compression algorithm. As researchers push for ever-improving compression performance, establishing the R-D function of a given data source is not only of scientific interest, but also sheds light on the possib…

    Submitted 11 March, 2022; v1 submitted 23 November, 2021; originally announced November 2021.

    Comments: ICLR 2022 camera-ready version

  33. arXiv:2111.11632  [pdf, other]

    cs.LG cs.IT

    Lossless Compression with Probabilistic Circuits

    Authors: Anji Liu, Stephan Mandt, Guy Van den Broeck

    Abstract: Despite extensive progress on image generation, common deep generative model architectures are not easily applied to lossless compression. For example, VAEs suffer from a compression cost overhead due to their latent variables. This overhead can only be partially eliminated with elaborate schemes such as bits-back coding, often resulting in poor single-sample compression rates. To overcome such pr…

    Submitted 16 March, 2022; v1 submitted 22 November, 2021; originally announced November 2021.

  34. Supervised Compression for Resource-Constrained Edge Computing Systems

    Authors: Yoshitomo Matsubara, Ruihan Yang, Marco Levorato, Stephan Mandt

    Abstract: There has been much interest in deploying deep learning algorithms on low-powered devices, including smartphones, drones, and medical sensors. However, full-scale deep neural networks are often too resource-intensive in terms of energy and storage. As a result, the bulk part of the machine learning operation is therefore often carried out on an edge server, where the data is compressed and transmi…

    Submitted 21 October, 2021; v1 submitted 21 August, 2021; originally announced August 2021.

    Comments: Accepted to WACV 2022. Code and models are available at https://github.com/yoshitomo-matsubara/supervised-compression

    Journal ref: IEEE/CVF Winter Conference on Applications of Computer Vision (WACV) 2022

  35. arXiv:2107.13136  [pdf, other]

    eess.IV cs.CV cs.LG

    Insights from Generative Modeling for Neural Video Compression

    Authors: Ruihan Yang, Yibo Yang, Joseph Marino, Stephan Mandt

    Abstract: While recent machine learning research has revealed connections between deep generative models such as VAEs and rate-distortion losses used in learned compression, most of this work has focused on images. In a similar spirit, we view recently proposed neural video coding algorithms through the lens of deep autoregressive and latent variable modeling. We present these codecs as instances of a gener…

    Submitted 9 July, 2023; v1 submitted 27 July, 2021; originally announced July 2021.

    Comments: This work has been submitted to the IEEE for publication as an extension work of arXiv:2010.10258. Copyright may be transferred without notice, after which this version may no longer be accessible. arXiv admin note: text overlap with arXiv:2010.10258

  36. arXiv:2107.09028  [pdf, other]

    cs.LG stat.ML

    Structured Stochastic Gradient MCMC

    Authors: Antonios Alexos, Alex Boyd, Stephan Mandt

    Abstract: Stochastic gradient Markov Chain Monte Carlo (SGMCMC) is considered the gold standard for Bayesian inference in large-scale models, such as Bayesian neural networks. Since practitioners face speed versus accuracy tradeoffs in these models, variational inference (VI) is often the preferable option. Unfortunately, VI makes strong assumptions on both the factorization and functional form of the poste…

    Submitted 17 July, 2022; v1 submitted 19 July, 2021; originally announced July 2021.

    Comments: paper accepted in ICML2022. Code can be found here https://github.com/ajboyd2/pytorch_lvi

  37. arXiv:2103.16440  [pdf, other]

    cs.LG cs.AI

    Neural Transformation Learning for Deep Anomaly Detection Beyond Images

    Authors: Chen Qiu, Timo Pfrommer, Marius Kloft, Stephan Mandt, Maja Rudolph

    Abstract: Data transformations (e.g. rotations, reflections, and cropping) play an important role in self-supervised learning. Typically, images are transformed into different views, and neural networks trained on tasks involving these views produce useful feature representations for downstream tasks, including anomaly detection. However, for anomaly detection beyond image data, it is often unclear which tr…

    Submitted 3 February, 2022; v1 submitted 30 March, 2021; originally announced March 2021.

    Journal ref: Proceedings of the 38th International Conference on Machine Learning, 2021, volume:139, pages:8703--8714

  38. arXiv:2012.08101  [pdf, other]

    stat.ML cs.LG

    Detecting and Adapting to Irregular Distribution Shifts in Bayesian Online Learning

    Authors: Aodong Li, Alex Boyd, Padhraic Smyth, Stephan Mandt

    Abstract: We consider the problem of online learning in the presence of distribution shifts that occur at an unknown rate and of unknown intensity. We derive a new Bayesian online inference approach to simultaneously infer these distribution shifts and adapt the model to the detected changes by integrating ideas from change point detection, switching dynamical systems, and Bayesian online learning. Using a…

    Submitted 26 October, 2021; v1 submitted 15 December, 2020; originally announced December 2020.

    Comments: Published version, Neural Information Processing Systems 2021

  39. arXiv:2011.03231  [pdf, other]

    stat.ML cs.LG

    User-Dependent Neural Sequence Models for Continuous-Time Event Data

    Authors: Alex Boyd, Robert Bamler, Stephan Mandt, Padhraic Smyth

    Abstract: Continuous-time event data are common in applications such as individual behavior data, financial transactions, and medical health records. Modeling such data can be very challenging, in particular for applications with many different types of events, since it requires a model to predict the event types as well as the time of occurrence. Recurrent neural networks that parameterize time-varying int…

    Submitted 6 November, 2020; originally announced November 2020.

    Comments: Accepted at NeurIPS 2020

  40. arXiv:2010.13472  [pdf, other]

    stat.ML cs.LG

    Scalable Gaussian Process Variational Autoencoders

    Authors: Metod Jazbec, Matthew Ashman, Vincent Fortuin, Michael Pearce, Stephan Mandt, Gunnar Rätsch

    Abstract: Conventional variational autoencoders fail in modeling correlations between data points due to their use of factorized priors. Amortized Gaussian process inference through GP-VAEs has led to significant improvements in this regard, but is still inhibited by the intrinsic complexity of exact GP inference. We improve the scalability of these methods through principled sparse inference approaches. We…

    Submitted 24 February, 2021; v1 submitted 26 October, 2020; originally announced October 2020.

    Comments: Published at AISTATS 2021

  41. arXiv:2010.10403  [pdf, other]


    Variational Dynamic Mixtures

    Authors: Chen Qiu, Stephan Mandt, Maja Rudolph

    Abstract: Deep probabilistic time series forecasting models have become an integral part of machine learning. While several powerful generative models have been proposed, we provide evidence that their associated inference models are oftentimes too limited and cause the generative model to predict mode-averaged dynamics. Mode-averaging is problematic since many real-world sequences are highly multi-modal, an…

    Submitted 4 December, 2020; v1 submitted 20 October, 2020; originally announced October 2020.

  42. arXiv:2010.10258  [pdf, other]

    eess.IV cs.LG

    Hierarchical Autoregressive Modeling for Neural Video Compression

    Authors: Ruihan Yang, Yibo Yang, Joseph Marino, Stephan Mandt

    Abstract: Recent work by Marino et al. (2020) showed improved performance in sequential density estimation by combining masked autoregressive flows with hierarchical latent variable models. We draw a connection between such autoregressive generative models and the task of lossy video compression. Specifically, we view recent neural video compression methods (Lu et al., 2019; Yang et al., 2020b; Agustsson et…

    Submitted 19 December, 2023; v1 submitted 18 October, 2020; originally announced October 2020.

    Comments: Published as a conference paper at ICLR 2021

  43. Improving Sequential Latent Variable Models with Autoregressive Flows

    Authors: Joseph Marino, Lei Chen, Jiawei He, Stephan Mandt

    Abstract: We propose an approach for improving sequence modeling based on autoregressive normalizing flows. Each autoregressive transform, acting across time, serves as a moving frame of reference, removing temporal correlations, and simplifying the modeling of higher-level dynamics. This technique provides a simple, general-purpose method for improving sequence modeling, with connections to existing and cl…

    Submitted 8 March, 2022; v1 submitted 7 October, 2020; originally announced October 2020.

    Comments: Published at Machine Learning Journal

    Journal ref: Mach Learn (2021)

  44. arXiv:2007.01444  [pdf, other]

    physics.ao-ph cs.LG physics.comp-ph

    Generative Modeling for Atmospheric Convection

    Authors: Griffin Mooers, Jens Tuyls, Stephan Mandt, Michael Pritchard, Tom Beucler

    Abstract: While cloud-resolving models can explicitly simulate the details of small-scale storm formation and morphology, these details are often ignored by climate models for lack of computational resources. Here, we explore the potential of generative modeling to cheaply recreate small-scale storms by designing and implementing a Variational Autoencoder (VAE) that performs structural replication, dimensio…

    Submitted 24 October, 2020; v1 submitted 2 July, 2020; originally announced July 2020.

    Comments: 8 Pages, 6 Figures. Accepted into ACM International Conference Proceedings Series

  45. arXiv:2006.04240  [pdf, other]

    eess.IV cs.LG stat.ML

    Improving Inference for Neural Image Compression

    Authors: Yibo Yang, Robert Bamler, Stephan Mandt

    Abstract: We consider the problem of lossy image compression with deep latent variable models. State-of-the-art methods build on hierarchical variational autoencoders (VAEs) and learn inference networks to predict a compressible latent representation of each data point. Drawing on the variational inference perspective on compression, we identify three approximation gaps which limit performance in the conven…

    Submitted 8 January, 2021; v1 submitted 7 June, 2020; originally announced June 2020.

    Comments: 9 pages + detailed supplement with additional results; various typos corrected. Camera-ready version paper at NeurIPS 2020

  46. arXiv:2002.08158  [pdf, other]

    eess.IV cs.CV cs.LG stat.ML

    Variational Bayesian Quantization

    Authors: Yibo Yang, Robert Bamler, Stephan Mandt

    Abstract: We propose a novel algorithm for quantizing continuous latent representations in trained models. Our approach applies to deep probabilistic models, such as variational autoencoders (VAEs), and enables both data and model compression. Unlike current end-to-end neural compression methods that cater the model to a fixed quantization scheme, our algorithm separates model design and training from quant…

    Submitted 7 September, 2020; v1 submitted 17 February, 2020; originally announced February 2020.

    Comments: 9 pages + detailed supplement with additional full resolution reconstructed images; ICML 2020 final camera-ready version, title changed to "Variational Bayesian Quantization" following reviewer feedback

  47. arXiv:2002.06298  [pdf, other]

    stat.ML cs.LG

    Extreme Classification via Adversarial Softmax Approximation

    Authors: Robert Bamler, Stephan Mandt

    Abstract: Training a classifier over a large number of classes, known as 'extreme classification', has become a topic of major interest with applications in technology, science, and e-commerce. Traditional softmax regression induces a gradient cost proportional to the number of classes $C$, which often is prohibitively expensive. A popular scalable softmax approximation relies on uniform negative sampling,…

    Submitted 14 February, 2020; originally announced February 2020.

    Comments: Accepted for presentation at the Eighth International Conference on Learning Representations (ICLR 2020), https://openreview.net/forum?id=rJxe3xSYDS

  48. arXiv:2002.02655  [pdf, other]

    cs.LG stat.ML

    The k-tied Normal Distribution: A Compact Parameterization of Gaussian Mean Field Posteriors in Bayesian Neural Networks

    Authors: Jakub Swiatkowski, Kevin Roth, Bastiaan S. Veeling, Linh Tran, Joshua V. Dillon, Jasper Snoek, Stephan Mandt, Tim Salimans, Rodolphe Jenatton, Sebastian Nowozin

    Abstract: Variational Bayesian Inference is a popular methodology for approximating posterior distributions over Bayesian neural network weights. Recent work developing this class of methods has explored ever richer parameterizations of the approximate posterior in the hope of improving performance. In contrast, here we share a curious experimental finding that suggests instead restricting the variational d…

    Submitted 5 July, 2020; v1 submitted 7 February, 2020; originally announced February 2020.

  49. arXiv:2002.02405  [pdf, other]

    stat.ML cs.LG stat.CO

    How Good is the Bayes Posterior in Deep Neural Networks Really?

    Authors: Florian Wenzel, Kevin Roth, Bastiaan S. Veeling, Jakub Świątkowski, Linh Tran, Stephan Mandt, Jasper Snoek, Tim Salimans, Rodolphe Jenatton, Sebastian Nowozin

    Abstract: During the past five years the Bayesian deep learning community has developed increasingly accurate and efficient approximate inference procedures that allow for Bayesian inference in deep neural networks. However, despite this algorithmic progress and the promise of improved uncertainty quantification and sample efficiency there are---as of early 2020---no publicized deployments of Bayesian neura…

    Submitted 2 July, 2020; v1 submitted 6 February, 2020; originally announced February 2020.

    Comments: Full version (main paper and appendix) of the ICML 2020 publication

  50. arXiv:2001.04694  [pdf, other]

    cs.LG stat.ML

    Hydra: Preserving Ensemble Diversity for Model Distillation

    Authors: Linh Tran, Bastiaan S. Veeling, Kevin Roth, Jakub Swiatkowski, Joshua V. Dillon, Jasper Snoek, Stephan Mandt, Tim Salimans, Sebastian Nowozin, Rodolphe Jenatton

    Abstract: Ensembles of models have been empirically shown to improve predictive performance and to yield robust measures of uncertainty. However, they are expensive in computation and memory. Therefore, recent research has focused on distilling ensembles into a single compact model, reducing the computational and memory burden of the ensemble while trying to preserve its predictive behavior. Most existing d…

    Submitted 19 March, 2021; v1 submitted 14 January, 2020; originally announced January 2020.

    Comments: Accepted to ICML 2020 Workshop on Uncertainty and Robustness in Deep Learning