Showing 1–50 of 171 results for author: Wilson, A

Searching in archive cs.
  1. arXiv:2406.11463  [pdf, other]

    cs.LG stat.ML

    Just How Flexible are Neural Networks in Practice?

    Authors: Ravid Shwartz-Ziv, Micah Goldblum, Arpit Bansal, C. Bayan Bruss, Yann LeCun, Andrew Gordon Wilson

    Abstract: It is widely believed that a neural network can fit a training set containing at least as many samples as it has parameters, underpinning notions of overparameterized and underparameterized models. In practice, however, we only find solutions accessible via our training procedure, including the optimizer and regularizers, limiting flexibility. Moreover, the exact parameterization of the function c…

    Submitted 17 June, 2024; originally announced June 2024.

  2. arXiv:2406.09177  [pdf, other]

    stat.ML cs.LG

    Scalable and Flexible Causal Discovery with an Efficient Test for Adjacency

    Authors: Alan Nawzad Amin, Andrew Gordon Wilson

    Abstract: To make accurate predictions, understand mechanisms, and design interventions in systems of many variables, we wish to learn causal graphs from large scale data. Unfortunately the space of all possible causal graphs is enormous so scalably and accurately searching for the best fit to the data is a challenge. In principle we could substantially decrease the search space, or learn the graph entirely…

    Submitted 18 June, 2024; v1 submitted 13 June, 2024; originally announced June 2024.

    Comments: ICML 2024; Code at https://github.com/AlanNawzadAmin/DAT-graph

  3. arXiv:2406.08391  [pdf, other]

    cs.LG cs.AI cs.CL stat.ML

    Large Language Models Must Be Taught to Know What They Don't Know

    Authors: Sanyam Kapoor, Nate Gruver, Manley Roberts, Katherine Collins, Arka Pal, Umang Bhatt, Adrian Weller, Samuel Dooley, Micah Goldblum, Andrew Gordon Wilson

    Abstract: When using large language models (LLMs) in high-stakes applications, we need to know when we can trust their predictions. Some works argue that prompting high-performance LLMs is sufficient to produce calibrated uncertainties, while others introduce sampling methods that can be prohibitively expensive. In this work, we first argue that prompting on its own is insufficient to achieve good calibrati…

    Submitted 12 June, 2024; originally announced June 2024.

    Comments: Code available at: https://github.com/activatedgeek/calibration-tuning

  4. arXiv:2406.07337  [pdf, other]

    cs.LG

    Transferring Knowledge from Large Foundation Models to Small Downstream Models

    Authors: Shikai Qiu, Boran Han, Danielle C. Maddix, Shuai Zhang, Yuyang Wang, Andrew Gordon Wilson

    Abstract: How do we transfer the relevant knowledge from ever larger foundation models into small, task-specific downstream models that can run at much lower costs? Standard transfer learning using pre-trained weights as the initialization transfers limited information and commits us to often massive pre-trained architectures. This procedure also precludes combining multiple pre-trained models that learn co…

    Submitted 11 June, 2024; originally announced June 2024.

    Comments: ICML 2024. Code available at https://github.com/amazon-science/adaptive-feature-transfer

  5. arXiv:2406.06248  [pdf, other]

    cs.LG

    Compute Better Spent: Replacing Dense Layers with Structured Matrices

    Authors: Shikai Qiu, Andres Potapczynski, Marc Finzi, Micah Goldblum, Andrew Gordon Wilson

    Abstract: Dense linear layers are the dominant computational bottleneck in foundation models. Identifying more efficient alternatives to dense matrices has enormous potential for building more compute-efficient models, as exemplified by the success of convolutional networks in the image domain. In this work, we systematically explore structured matrices as replacements for dense matrices. We show that diffe…

    Submitted 10 June, 2024; originally announced June 2024.

    Comments: ICML 24. Code available at https://github.com/shikaiqiu/compute-better-spent

  6. arXiv:2405.14812  [pdf, other]

    cs.CY

    As an AI Language Model, "Yes I Would Recommend Calling the Police": Norm Inconsistency in LLM Decision-Making

    Authors: Shomik Jain, D Calacci, Ashia Wilson

    Abstract: We investigate the phenomenon of norm inconsistency: where LLMs apply different norms in similar situations. Specifically, we focus on the high-risk application of deciding whether to call the police in Amazon Ring home surveillance videos. We evaluate the decisions of three state-of-the-art LLMs -- GPT-4, Gemini 1.0, and Claude 3 Sonnet -- in relation to the activities portrayed in the videos, th…

    Submitted 23 May, 2024; originally announced May 2024.

  7. arXiv:2405.00740  [pdf, other]

    cs.CV cs.AI cs.CL cs.LG

    Modeling Caption Diversity in Contrastive Vision-Language Pretraining

    Authors: Samuel Lavoie, Polina Kirichenko, Mark Ibrahim, Mahmoud Assran, Andrew Gordon Wilson, Aaron Courville, Nicolas Ballas

    Abstract: There are a thousand ways to caption an image. Contrastive Language Pretraining (CLIP) on the other hand, works by mapping an image and its caption to a single vector -- limiting how well CLIP-like models can represent the diverse ways to describe an image. In this work, we introduce Llip, Latent Language Image Pretraining, which models the diversity of captions that could match an image. Llip's v…

    Submitted 14 May, 2024; v1 submitted 29 April, 2024; originally announced May 2024.

    Comments: 14 pages, 8 figures, 7 tables, to be published at ICML2024

  8. arXiv:2404.14952  [pdf, other]

    cs.CV cs.AI

    Leveraging Speech for Gesture Detection in Multimodal Communication

    Authors: Esam Ghaleb, Ilya Burenko, Marlou Rasenberg, Wim Pouw, Ivan Toni, Peter Uhrig, Anna Wilson, Judith Holler, Aslı Özyürek, Raquel Fernández

    Abstract: Gestures are inherent to human interaction and often complement speech in face-to-face communication, forming a multimodal communication system. An important task in gesture analysis is detecting a gesture's beginning and end. Research on automatic gesture detection has primarily focused on visual and kinematic information to detect a limited set of isolated or silent gestures with low variability…

    Submitted 23 April, 2024; originally announced April 2024.

  9. arXiv:2404.08592  [pdf, other]

    cs.CY

    Scarce Resource Allocations That Rely On Machine Learning Should Be Randomized

    Authors: Shomik Jain, Kathleen Creel, Ashia Wilson

    Abstract: Contrary to traditional deterministic notions of algorithmic fairness, this paper argues that fairly allocating scarce resources using machine learning often requires randomness. We address why, when, and how to randomize by proposing stochastic procedures that more adequately account for all of the claims that individuals have to allocations of social goods or opportunities.

    Submitted 19 June, 2024; v1 submitted 12 April, 2024; originally announced April 2024.

    Comments: To appear in the proceedings of the International Conference on Machine Learning (ICML 2024)

    ACM Class: K.4.0

  10. arXiv:2403.16365  [pdf, other]

    cs.LG cs.CR cs.CV

    Generating Potent Poisons and Backdoors from Scratch with Guided Diffusion

    Authors: Hossein Souri, Arpit Bansal, Hamid Kazemi, Liam Fowl, Aniruddha Saha, Jonas Geiping, Andrew Gordon Wilson, Rama Chellappa, Tom Goldstein, Micah Goldblum

    Abstract: Modern neural networks are often trained on massive datasets that are web scraped with minimal human inspection. As a result of this insecure curation pipeline, an adversary can poison or backdoor the resulting model by uploading malicious data to the internet and waiting for a victim to scrape and train on it. Existing approaches for creating poisons and backdoors start with randomly sampled clea…

    Submitted 24 March, 2024; originally announced March 2024.

  11. arXiv:2403.14029  [pdf, other]

    cs.RO

    Quadcopter Team Configurable Motion Guided by a Quadruped

    Authors: Mohammad Ghufran, Sourish Tetakayala, Jack Hughes, Aron Wilson, Hossein Rastgoftar

    Abstract: The paper focuses on modeling and experimental evaluation of a quadcopter team configurable coordination guided by a single quadruped robot. We consider the quadcopter team as particles of a two-dimensional deformable body and propose a two-dimensional affine transformation model for safe and collision-free configurable coordination of this heterogeneous robotic system. The proposed affine transfo…

    Submitted 20 March, 2024; originally announced March 2024.

  12. arXiv:2403.13947  [pdf, other]

    cs.HC cs.AI

    BlendScape: Enabling Unified and Personalized Video-Conferencing Environments through Generative AI

    Authors: Shwetha Rajaram, Nels Numan, Balasaravanan Thoravi Kumaravel, Nicolai Marquardt, Andrew D. Wilson

    Abstract: Today's video-conferencing tools support a rich range of professional and social activities, but their generic, grid-based environments cannot be easily adapted to meet the varying needs of distributed collaborators. To enable end-user customization, we developed BlendScape, a system for meeting participants to compose video-conferencing environments tailored to their collaboration context by leve…

    Submitted 20 March, 2024; originally announced March 2024.

  13. arXiv:2403.09869  [pdf, other]

    stat.ML cs.AI cs.LG stat.ME

    Mind the GAP: Improving Robustness to Subpopulation Shifts with Group-Aware Priors

    Authors: Tim G. J. Rudner, Ya Shi Zhang, Andrew Gordon Wilson, Julia Kempe

    Abstract: Machine learning models often perform poorly under subpopulation shifts in the data distribution. Developing methods that allow machine learning models to better generalize to such shifts is crucial for safe deployment in real-world settings. In this paper, we develop a family of group-aware prior (GAP) distributions over neural network parameters that explicitly favor models that generalize well…

    Submitted 14 March, 2024; originally announced March 2024.

    Comments: Published in Proceedings of the 27th International Conference on Artificial Intelligence and Statistics (AISTATS 2024)

  14. arXiv:2403.07815  [pdf, other]

    cs.LG cs.AI

    Chronos: Learning the Language of Time Series

    Authors: Abdul Fatir Ansari, Lorenzo Stella, Caner Turkmen, Xiyuan Zhang, Pedro Mercado, Huibin Shen, Oleksandr Shchur, Syama Sundar Rangapuram, Sebastian Pineda Arango, Shubham Kapoor, Jasper Zschiegner, Danielle C. Maddix, Hao Wang, Michael W. Mahoney, Kari Torkkola, Andrew Gordon Wilson, Michael Bohlke-Schneider, Yuyang Wang

    Abstract: We introduce Chronos, a simple yet effective framework for pretrained probabilistic time series models. Chronos tokenizes time series values using scaling and quantization into a fixed vocabulary and trains existing transformer-based language model architectures on these tokenized time series via the cross-entropy loss. We pretrained Chronos models based on the T5 family (ranging from 20M to 710M…

    Submitted 2 May, 2024; v1 submitted 12 March, 2024; originally announced March 2024.

    Comments: Code and model checkpoints available at https://github.com/amazon-science/chronos-forecasting

  15. arXiv:2403.02695  [pdf, other]

    cs.LG

    Controllable Prompt Tuning For Balancing Group Distributional Robustness

    Authors: Hoang Phan, Andrew Gordon Wilson, Qi Lei

    Abstract: Models trained on data composed of different groups or domains can suffer from severe performance degradation under distribution shifts. While recent methods have largely focused on optimizing the worst-group objective, this often comes at the expense of good performance on other groups. To address this problem, we introduce an optimization scheme to achieve good performance across groups and find…

    Submitted 4 June, 2024; v1 submitted 5 March, 2024; originally announced March 2024.

    Comments: Proceedings of the 41st International Conference on Machine Learning

  16. arXiv:2402.05980  [pdf, other]

    cs.SE cs.AI cs.LG cs.PL

    Do Large Code Models Understand Programming Concepts? A Black-box Approach

    Authors: Ashish Hooda, Mihai Christodorescu, Miltiadis Allamanis, Aaron Wilson, Kassem Fawaz, Somesh Jha

    Abstract: Large Language Models' success on text generation has also made them better at code generation and coding tasks. While a lot of work has demonstrated their remarkable performance on tasks such as code completion and editing, it is still unclear as to why. We help bridge this gap by exploring to what degree auto-regressive models understand the logical constructs of the underlying programs. We prop…

    Submitted 23 February, 2024; v1 submitted 8 February, 2024; originally announced February 2024.

  17. arXiv:2402.04379  [pdf, other]

    cs.LG cond-mat.mtrl-sci

    Fine-Tuned Language Models Generate Stable Inorganic Materials as Text

    Authors: Nate Gruver, Anuroop Sriram, Andrea Madotto, Andrew Gordon Wilson, C. Lawrence Zitnick, Zachary Ulissi

    Abstract: We propose fine-tuning large language models for generation of stable materials. While unorthodox, fine-tuning large language models on text-encoded atomistic data is simple to implement yet reliable, with around 90% of sampled structures obeying physical constraints on atom positions and charges. Using energy above hull calculations from both learned ML potentials and gold-standard DFT calculatio…

    Submitted 6 February, 2024; originally announced February 2024.

    Comments: ICLR 2024. Code available at: https://github.com/facebookresearch/crystal-llm

  18. arXiv:2402.00809  [pdf, other]

    cs.LG stat.ML

    Position: Bayesian Deep Learning is Needed in the Age of Large-Scale AI

    Authors: Theodore Papamarkou, Maria Skoularidou, Konstantina Palla, Laurence Aitchison, Julyan Arbel, David Dunson, Maurizio Filippone, Vincent Fortuin, Philipp Hennig, José Miguel Hernández-Lobato, Aliaksandr Hubin, Alexander Immer, Theofanis Karaletsos, Mohammad Emtiyaz Khan, Agustinus Kristiadi, Yingzhen Li, Stephan Mandt, Christopher Nemeth, Michael A. Osborne, Tim G. J. Rudner, David Rügamer, Yee Whye Teh, Max Welling, Andrew Gordon Wilson, Ruqi Zhang

    Abstract: In the current landscape of deep learning research, there is a predominant emphasis on achieving high predictive accuracy in supervised tasks involving large image and language datasets. However, a broader perspective reveals a multitude of overlooked metrics, tasks, and data types, such as uncertainty, active and continual learning, and scientific data, that demand attention. Bayesian deep learni…

    Submitted 2 June, 2024; v1 submitted 1 February, 2024; originally announced February 2024.

  19. arXiv:2401.10149  [pdf, other]

    cs.LG cs.CR cs.MA

    Multi-Agent Reinforcement Learning for Maritime Operational Technology Cyber Security

    Authors: Alec Wilson, Ryan Menzies, Neela Morarji, David Foster, Marco Casassa Mont, Esin Turkbeyler, Lisa Gralewski

    Abstract: This paper demonstrates the potential for autonomous cyber defence to be applied on industrial control systems and provides a baseline environment to further explore Multi-Agent Reinforcement Learning's (MARL) application to this problem domain. It introduces a simulation environment, IPMSRL, of a generic Integrated Platform Management System (IPMS) and explores the use of MARL for autonomous cybe…

    Submitted 18 January, 2024; originally announced January 2024.

    Comments: 13 pages, 7 figures, Proceedings of the Conference on Applied Machine Learning in Information Security 2023 (CAMLIS)

    Journal ref: Proceedings of the Conference on Applied Machine Learning in Information Security 2023 (CAMLIS), Arlington VA, USA, October 19-20, 2023, CEUR-WS.org, online CEUR-WS.org/Vol-3652/paper3.pdf

  20. arXiv:2401.04349  [pdf]

    cs.CR cs.AR

    WebGPU-SPY: Finding Fingerprints in the Sandbox through GPU Cache Attacks

    Authors: Ethan Ferguson, Adam Wilson, Hoda Naghibijouybari

    Abstract: Microarchitectural attacks on CPU structures have been studied in native applications, as well as in web browsers. These attacks continue to be a substantial threat to computing systems at all scales. With the proliferation of heterogeneous systems and integration of hardware accelerators in every computing system, modern web browsers provide the support of GPU-based acceleration for the graphic…

    Submitted 8 January, 2024; originally announced January 2024.

  21. arXiv:2401.01764  [pdf, other]

    cs.CV cs.LG

    Understanding the Detrimental Class-level Effects of Data Augmentation

    Authors: Polina Kirichenko, Mark Ibrahim, Randall Balestriero, Diane Bouchacourt, Ramakrishna Vedantam, Hamed Firooz, Andrew Gordon Wilson

    Abstract: Data augmentation (DA) encodes invariance and provides implicit regularization critical to a model's performance in image classification tasks. However, while DA improves average accuracy, recent studies have shown that its impact can be highly class dependent: achieving optimal average accuracy comes at the cost of significantly hurting individual class accuracy by as much as 20% on ImageNet. The…

    Submitted 7 December, 2023; originally announced January 2024.

    Comments: Neural Information Processing Systems (NeurIPS), 2023

  22. arXiv:2312.17174  [pdf, other]

    cs.CV cs.AI cs.LG

    Visual Explanations of Image-Text Representations via Multi-Modal Information Bottleneck Attribution

    Authors: Ying Wang, Tim G. J. Rudner, Andrew Gordon Wilson

    Abstract: Vision-language pretrained models have seen remarkable success, but their application to safety-critical settings is limited by their lack of interpretability. To improve the interpretability of vision-language models such as CLIP, we propose a multi-modal information bottleneck (M2IB) approach that learns latent representations that compress irrelevant information while preserving relevant visual…

    Submitted 22 June, 2024; v1 submitted 28 December, 2023; originally announced December 2023.

    Comments: Published in Advances in Neural Information Processing Systems 36 (NeurIPS 2023)

  23. arXiv:2312.17173  [pdf, other]

    stat.ML cs.LG

    Non-Vacuous Generalization Bounds for Large Language Models

    Authors: Sanae Lotfi, Marc Finzi, Yilun Kuang, Tim G. J. Rudner, Micah Goldblum, Andrew Gordon Wilson

    Abstract: Modern language models can contain billions of parameters, raising the question of whether they can generalize beyond the training data or simply regurgitate their training corpora. We provide the first non-vacuous generalization bounds for pretrained large language models (LLMs), indicating that language models are capable of discovering regularities that generalize to unseen data. In particular,…

    Submitted 12 February, 2024; v1 submitted 28 December, 2023; originally announced December 2023.

  24. arXiv:2312.17162  [pdf, other]

    stat.ML cs.AI cs.LG

    Function-Space Regularization in Neural Networks: A Probabilistic Perspective

    Authors: Tim G. J. Rudner, Sanyam Kapoor, Shikai Qiu, Andrew Gordon Wilson

    Abstract: Parameter-space regularization in neural network optimization is a fundamental tool for improving generalization. However, standard parameter-space regularization methods make it challenging to encode explicit preferences about desired predictive functions into neural network training. In this work, we approach regularization in neural networks from a probabilistic perspective and show that by vie…

    Submitted 28 December, 2023; originally announced December 2023.

    Comments: Published in Proceedings of the 40th International Conference on Machine Learning (ICML 2023)

  25. arXiv:2312.09323  [pdf, other]

    cs.AI cs.LG

    Perspectives on the State and Future of Deep Learning - 2023

    Authors: Micah Goldblum, Anima Anandkumar, Richard Baraniuk, Tom Goldstein, Kyunghyun Cho, Zachary C Lipton, Melanie Mitchell, Preetum Nakkiran, Max Welling, Andrew Gordon Wilson

    Abstract: The goal of this series is to chronicle opinions and issues in the field of machine learning as they stand today and as they change over time. The plan is to host this survey periodically until the AI singularity paperclip-frenzy-driven doomsday, keeping an updated list of topical questions and interviewing new community members for each edition. In this issue, we probed people's opinions on inter…

    Submitted 18 December, 2023; v1 submitted 7 December, 2023; originally announced December 2023.

  26. arXiv:2312.08823  [pdf, other]

    stat.CO cs.DS cs.LG math.ST stat.ML

    Fast sampling from constrained spaces using the Metropolis-adjusted Mirror Langevin algorithm

    Authors: Vishwak Srinivasan, Andre Wibisono, Ashia Wilson

    Abstract: We propose a new method called the Metropolis-adjusted Mirror Langevin algorithm for approximate sampling from distributions whose support is a compact and convex set. This algorithm adds an accept-reject filter to the Markov chain induced by a single step of the Mirror Langevin algorithm (Zhang et al., 2020), which is a basic discretisation of the Mirror Langevin dynamics. Due to the inclusion of…

    Submitted 21 June, 2024; v1 submitted 14 December, 2023; originally announced December 2023.

    Comments: 49 pages, 6 figures, 2 tables. Shorter version without experiments accepted to COLT 2024

  27. arXiv:2312.05879  [pdf, other]

    cs.CV

    Wild Motion Unleashed: Markerless 3D Kinematics and Force Estimation in Cheetahs

    Authors: Zico da Silva, Stacy Shield, Penny E. Hudson, Alan M. Wilson, Fred Nicolls, Amir Patel

    Abstract: The complex dynamics of animal manoeuvrability in the wild is extremely challenging to study. The cheetah ($\textit{Acinonyx jubatus}$) is a perfect example: despite great interest in its unmatched speed and manoeuvrability, obtaining complete whole-body motion data from these animals remains an unsolved problem. This is especially difficult in wild cheetahs, where it is essential that the methods…

    Submitted 10 December, 2023; originally announced December 2023.

  28. arXiv:2312.02796  [pdf, other]

    cond-mat.mtrl-sci cond-mat.str-el cs.LG physics.data-an

    Materials Expert-Artificial Intelligence for Materials Discovery

    Authors: Yanjun Liu, Milena Jovanovic, Krishnanand Mallayya, Wesley J. Maddox, Andrew Gordon Wilson, Sebastian Klemenz, Leslie M. Schoop, Eun-Ah Kim

    Abstract: The advent of material databases provides an unprecedented opportunity to uncover predictive descriptors for emergent material properties from vast data space. However, common reliance on high-throughput ab initio data necessarily inherits limitations of such data: mismatch with experiments. On the other hand, experimental decisions are often guided by an expert's intuition honed from experiences…

    Submitted 5 December, 2023; originally announced December 2023.

    Comments: 8 pages main text, 4 figs, 8 pages Supplementary material

  29. arXiv:2312.02517  [pdf, other]

    cs.LG cs.AI

    Simplifying Neural Network Training Under Class Imbalance

    Authors: Ravid Shwartz-Ziv, Micah Goldblum, Yucen Lily Li, C. Bayan Bruss, Andrew Gordon Wilson

    Abstract: Real-world datasets are often highly class-imbalanced, which can adversely impact the performance of deep learning models. The majority of research on training neural networks under class imbalance has focused on specialized loss functions, sampling techniques, or two-stage training procedures. Notably, we demonstrate that simply tuning existing components of standard deep learning pipelines, such…

    Submitted 5 December, 2023; originally announced December 2023.

    Comments: NeurIPS 2023. Code available at https://github.com/ravidziv/SimplifyingImbalancedTraining

  30. arXiv:2311.15990  [pdf, other]

    cs.LG stat.ML

    Should We Learn Most Likely Functions or Parameters?

    Authors: Shikai Qiu, Tim G. J. Rudner, Sanyam Kapoor, Andrew Gordon Wilson

    Abstract: Standard regularized training procedures correspond to maximizing a posterior distribution over parameters, known as maximum a posteriori (MAP) estimation. However, model parameters are of interest only insomuch as they combine with the functional form of a model to provide a function that can make good predictions. Moreover, the most likely parameters under the parameter posterior do not generall…

    Submitted 27 November, 2023; originally announced November 2023.

    Comments: NeurIPS 2023. Code available at https://github.com/activatedgeek/function-space-map

  31. arXiv:2311.05877  [pdf, other]

    cs.LG cs.AI

    A Performance-Driven Benchmark for Feature Selection in Tabular Deep Learning

    Authors: Valeriia Cherepanova, Roman Levin, Gowthami Somepalli, Jonas Geiping, C. Bayan Bruss, Andrew Gordon Wilson, Tom Goldstein, Micah Goldblum

    Abstract: Academic tabular benchmarks often contain small sets of curated features. In contrast, data scientists typically collect as many features as possible into their datasets, and even engineer new features from existing ones. To prevent overfitting in subsequent downstream modeling, practitioners commonly use automated feature selection methods that identify a reduced subset of informative features. E…

    Submitted 10 November, 2023; originally announced November 2023.

    Journal ref: Conference on Neural Information Processing Systems 2023

  32. arXiv:2310.19909  [pdf, other]

    cs.CV cs.LG

    Battle of the Backbones: A Large-Scale Comparison of Pretrained Models across Computer Vision Tasks

    Authors: Micah Goldblum, Hossein Souri, Renkun Ni, Manli Shu, Viraj Prabhu, Gowthami Somepalli, Prithvijit Chattopadhyay, Mark Ibrahim, Adrien Bardes, Judy Hoffman, Rama Chellappa, Andrew Gordon Wilson, Tom Goldstein

    Abstract: Neural network based computer vision systems are typically built on a backbone, a pretrained or randomly initialized feature extractor. Several years ago, the default option was an ImageNet-trained convolutional neural network. However, the recent past has seen the emergence of countless backbones pretrained using various algorithms and datasets. While this abundance of choice has led to performan…

    Submitted 19 November, 2023; v1 submitted 30 October, 2023; originally announced October 2023.

    Comments: Accepted to NeurIPS 2023

  33. arXiv:2310.16139  [pdf, other]

    eess.IV cs.CV

    Pix2HDR -- A pixel-wise acquisition and deep learning-based synthesis approach for high-speed HDR videos

    Authors: Caixin Wang, Jie Zhang, Matthew A. Wilson, Ralph Etienne-Cummings

    Abstract: Accurately capturing dynamic scenes with wide-ranging motion and light intensity is crucial for many vision applications. However, acquiring high-speed high dynamic range (HDR) video is challenging because the camera's frame rate restricts its dynamic range. Existing methods sacrifice speed to acquire multi-exposure frames. Yet, misaligned motion in these frames can still pose complications for HD…

    Submitted 25 April, 2024; v1 submitted 24 October, 2023; originally announced October 2023.

    Comments: 17 pages, 18 figures

  34. arXiv:2310.07820  [pdf, other]

    cs.LG

    Large Language Models Are Zero-Shot Time Series Forecasters

    Authors: Nate Gruver, Marc Finzi, Shikai Qiu, Andrew Gordon Wilson

    Abstract: By encoding time series as a string of numerical digits, we can frame time series forecasting as next-token prediction in text. Developing this approach, we find that large language models (LLMs) such as GPT-3 and LLaMA-2 can surprisingly zero-shot extrapolate time series at a level comparable to or exceeding the performance of purpose-built time series models trained on the downstream tasks. To f…

    Submitted 18 June, 2024; v1 submitted 11 October, 2023; originally announced October 2023.

    Comments: NeurIPS 2023. Code available at: https://github.com/ngruver/llmtime

  35. arXiv:2309.09944  [pdf, other]

    cs.LG cs.AI cs.CV cs.CY

    DiffusionWorldViewer: Exposing and Broadening the Worldview Reflected by Generative Text-to-Image Models

    Authors: Zoe De Simone, Angie Boggust, Arvind Satyanarayan, Ashia Wilson

    Abstract: Generative text-to-image (TTI) models produce high-quality images from short textual descriptions and are widely used in academic and creative domains. Like humans, TTI models have a worldview, a conception of the world learned from their training data and task that influences the images they generate for a given prompt. However, the worldviews of TTI models are often hidden from users, making it…

    Submitted 5 February, 2024; v1 submitted 18 September, 2023; originally announced September 2023.

    Comments: 20 pages, 8 figures

  36. arXiv:2309.03060  [pdf, other]

    cs.LG math.NA stat.ML

    CoLA: Exploiting Compositional Structure for Automatic and Efficient Numerical Linear Algebra

    Authors: Andres Potapczynski, Marc Finzi, Geoff Pleiss, Andrew Gordon Wilson

    Abstract: Many areas of machine learning and science involve large linear algebra problems, such as eigendecompositions, solving linear systems, computing matrix exponentials, and trace estimation. The matrices involved often have Kronecker, convolutional, block diagonal, sum, or product structure. In this paper, we propose a simple but general framework for large-scale linear algebra problems in machine le…

    Submitted 29 November, 2023; v1 submitted 6 September, 2023; originally announced September 2023.

    Comments: Code available at https://github.com/wilson-labs/cola. NeurIPS 2023

  37. arXiv:2306.13564  [pdf, other]

    cs.CV eess.IV

    Estimating Residential Solar Potential Using Aerial Data

    Authors: Ross Goroshin, Alex Wilson, Andrew Lamb, Betty Peng, Brandon Ewonus, Cornelius Ratsch, Jordan Raisher, Marisa Leung, Max Burq, Thomas Colthurst, William Rucklidge, Carl Elkin

    Abstract: Project Sunroof estimates the solar potential of residential buildings using high quality aerial data. That is, it estimates the potential solar energy (and associated financial savings) that can be captured by buildings if solar panels were to be installed on their roofs. Unfortunately its coverage is limited by the lack of high resolution digital surface map (DSM) data. We present a deep learnin…

    Submitted 23 June, 2023; originally announced June 2023.

    Journal ref: ICLR 2023 - Tackling Climate Change with Machine Learning Workshop

  38. arXiv:2306.11074  [pdf, other]

    cs.LG stat.ML

    Simple and Fast Group Robustness by Automatic Feature Reweighting

    Authors: Shikai Qiu, Andres Potapczynski, Pavel Izmailov, Andrew Gordon Wilson

    Abstract: A major challenge to out-of-distribution generalization is reliance on spurious features -- patterns that are predictive of the class label in the training data distribution, but not causally related to the target. Standard methods for reducing the reliance on spurious features typically assume that we know what the spurious feature is, which is rarely true in the real world. Methods that attempt…

    Submitted 19 June, 2023; originally announced June 2023.

    Comments: ICML 23. Code available at https://github.com/AndPotap/afr

    Journal ref: 40th International Conference on Machine Learning 2023

  39. arXiv:2306.07526  [pdf, other]

    cs.LG cs.AI

    User-defined Event Sampling and Uncertainty Quantification in Diffusion Models for Physical Dynamical Systems

    Authors: Marc Finzi, Anudhyan Boral, Andrew Gordon Wilson, Fei Sha, Leonardo Zepeda-Núñez

    Abstract: Diffusion models are a class of probabilistic generative models that have been widely used as a prior for image processing tasks like text conditional generation and inpainting. We demonstrate that these models can be adapted to make predictions and provide uncertainty quantification for chaotic dynamical systems. In these applications, diffusion models can implicitly represent knowledge about out…

    Submitted 12 June, 2023; originally announced June 2023.

    Comments: ICML 2023 Conference

  40. arXiv:2306.05854  [pdf, other]

    cs.DC cs.LO

    Partitioning Strategies for Distributed SMT Solving

    Authors: Amalee Wilson, Andres Noetzli, Andrew Reynolds, Byron Cook, Cesare Tinelli, Clark Barrett

    Abstract: For many users of Satisfiability Modulo Theories (SMT) solvers, the solver's performance is the main bottleneck in their application. One promising approach for improving performance is to leverage the increasing availability of parallel and cloud computing. However, despite many efforts, the best parallel approach to date consists of running a portfolio of solvers, meaning that performance is sti…

    Submitted 8 June, 2023; originally announced June 2023.

    Comments: Submitted to FMCAD 2023

  41. arXiv:2305.20028  [pdf, other]

    cs.LG stat.ML

    A Study of Bayesian Neural Network Surrogates for Bayesian Optimization

    Authors: Yucen Lily Li, Tim G. J. Rudner, Andrew Gordon Wilson

    Abstract: Bayesian optimization is a highly efficient approach to optimizing objective functions which are expensive to query. These objectives are typically represented by Gaussian process (GP) surrogate models which are easy to optimize and support exact inference. While standard GP surrogates have been well-established in Bayesian optimization, Bayesian neural networks (BNNs) have recently become practic…

    Submitted 8 May, 2024; v1 submitted 31 May, 2023; originally announced May 2023.

    Comments: ICLR 2024. Code available at https://github.com/yucenli/bnn-bo

  42. arXiv:2305.20009  [pdf, other]

    cs.LG q-bio.BM

    Protein Design with Guided Discrete Diffusion

    Authors: Nate Gruver, Samuel Stanton, Nathan C. Frey, Tim G. J. Rudner, Isidro Hotzel, Julien Lafrance-Vanasse, Arvind Rajpal, Kyunghyun Cho, Andrew Gordon Wilson

    Abstract: A popular approach to protein design is to combine a generative model with a discriminative model for conditional sampling. The generative model samples plausible sequences while the discriminative model guides a search for sequences with high fitness. Given its broad success in conditional sampling, classifier-guided diffusion modeling is a promising foundation for protein design, leading many to…

    Submitted 12 December, 2023; v1 submitted 31 May, 2023; originally announced May 2023.

    Journal ref: Advances in Neural Information Processing Systems 36, December 10-16, 2023

  43. arXiv:2305.12576  [pdf, other]

    cs.CL

    Automated Few-shot Classification with Instruction-Finetuned Language Models

    Authors: Rami Aly, Xingjian Shi, Kaixiang Lin, Aston Zhang, Andrew Gordon Wilson

    Abstract: A particularly successful class of approaches for few-shot learning combines language models with prompts -- hand-crafted task descriptions that complement data samples. However, designing prompts by hand for each task commonly requires domain knowledge and substantial guesswork. We observe, in the context of classification tasks, that instruction finetuned language models exhibit remarkable promp…

    Submitted 21 October, 2023; v1 submitted 21 May, 2023; originally announced May 2023.

    Comments: EMNLP2023 Findings

  44. Algorithmic Pluralism: A Structural Approach To Equal Opportunity

    Authors: Shomik Jain, Vinith Suriyakumar, Kathleen Creel, Ashia Wilson

    Abstract: We present a structural approach toward achieving equal opportunity in systems of algorithmic decision-making called algorithmic pluralism. Algorithmic pluralism describes a state of affairs in which no set of algorithms severely limits access to opportunity, allowing individuals the freedom to pursue a diverse range of life paths. To argue for algorithmic pluralism, we adopt Joseph Fishkin's theo…

    Submitted 15 May, 2024; v1 submitted 14 May, 2023; originally announced May 2023.

    Comments: To appear in the proceedings of the ACM Conference on Fairness, Accountability, and Transparency (FAccT 2024)

    MSC Class: 68-06 ACM Class: K.4.0

  45. arXiv:2304.14994  [pdf, other]

    cs.LG math.NA stat.ML

    A Stable and Scalable Method for Solving Initial Value PDEs with Neural Networks

    Authors: Marc Finzi, Andres Potapczynski, Matthew Choptuik, Andrew Gordon Wilson

    Abstract: Unlike conventional grid and mesh based methods for solving partial differential equations (PDEs), neural networks have the potential to break the curse of dimensionality, providing approximate solutions to problems where using classical solvers is difficult or impossible. While global minimization of the PDE residual over the network parameters works well for boundary value problems, catastrophic…

    Submitted 30 August, 2023; v1 submitted 28 April, 2023; originally announced April 2023.

    Comments: ICLR 2023. Code available at https://github.com/mfinzi/neural-ivp

  46. arXiv:2304.12210  [pdf, other]

    cs.LG cs.CV

    A Cookbook of Self-Supervised Learning

    Authors: Randall Balestriero, Mark Ibrahim, Vlad Sobal, Ari Morcos, Shashank Shekhar, Tom Goldstein, Florian Bordes, Adrien Bardes, Gregoire Mialon, Yuandong Tian, Avi Schwarzschild, Andrew Gordon Wilson, Jonas Geiping, Quentin Garrido, Pierre Fernandez, Amir Bar, Hamed Pirsiavash, Yann LeCun, Micah Goldblum

    Abstract: Self-supervised learning, dubbed the dark matter of intelligence, is a promising path to advance machine learning. Yet, much like cooking, training SSL methods is a delicate art with a high barrier to entry. While many components are familiar, successfully training an SSL method involves a dizzying set of choices from the pretext tasks to training hyper-parameters. Our goal is to lower the barrier…

    Submitted 28 June, 2023; v1 submitted 24 April, 2023; originally announced April 2023.

  47. arXiv:2304.05366  [pdf, other]

    cs.LG stat.ML

    The No Free Lunch Theorem, Kolmogorov Complexity, and the Role of Inductive Biases in Machine Learning

    Authors: Micah Goldblum, Marc Finzi, Keefer Rowan, Andrew Gordon Wilson

    Abstract: No free lunch theorems for supervised learning state that no learner can solve all problems or that all learners achieve exactly the same accuracy on average over a uniform distribution on learning problems. Accordingly, these theorems are often referenced in support of the notion that individual problems require specially tailored inductive biases. While virtually all uniformly sampled datasets h…

    Submitted 7 June, 2024; v1 submitted 11 April, 2023; originally announced April 2023.

    Comments: Published at the International Conference on Machine Learning (ICML) 2024

  48. arXiv:2303.11765  [pdf, other]

    cs.RO

    Cable Routing and Assembly using Tactile-driven Motion Primitives

    Authors: Achu Wilson, Helen Jiang, Wenzhao Lian, Wenzhen Yuan

    Abstract: Manipulating cables is challenging for robots because of the infinite degrees of freedom of the cables and frequent occlusion by the gripper and the environment. These challenges are further complicated by the dexterous nature of the operations required for cable routing and assembly, such as weaving and inserting, hampering common solutions with vision-only sensing. In this paper, we propose to i…

    Submitted 21 March, 2023; originally announced March 2023.

  49. arXiv:2303.01513  [pdf]

    cs.LG cs.AI

    Safe AI for health and beyond -- Monitoring to transform a health service

    Authors: Mahed Abroshan, Michael Burkhart, Oscar Giles, Sam Greenbury, Zoe Kourtzi, Jack Roberts, Mihaela van der Schaar, Jannetta S Steyn, Alan Wilson, May Yong

    Abstract: Machine learning techniques are effective for building predictive models because they identify patterns in large datasets. Development of a model for complex real-life problems often stops at the point of publication, proof of concept or when made accessible through some mode of deployment. However, a model in the medical domain risks becoming obsolete as patient demographics, systems and clinical…

    Submitted 6 June, 2023; v1 submitted 2 March, 2023; originally announced March 2023.

    Comments: 12 pages, 3 figures

    ACM Class: I.2.1

  50. arXiv:2302.04019  [pdf, other]

    cs.LG stat.ML

    Fortuna: A Library for Uncertainty Quantification in Deep Learning

    Authors: Gianluca Detommaso, Alberto Gasparin, Michele Donini, Matthias Seeger, Andrew Gordon Wilson, Cedric Archambeau

    Abstract: We present Fortuna, an open-source library for uncertainty quantification in deep learning. Fortuna supports a range of calibration techniques, such as conformal prediction that can be applied to any trained neural network to generate reliable uncertainty estimates, and scalable Bayesian inference methods that can be applied to Flax-based deep neural networks trained from scratch for improved unce…

    Submitted 8 February, 2023; originally announced February 2023.