
Showing 1–20 of 20 results for author: Setlur, A

Searching in archive cs.
  1. arXiv:2406.14532

    cs.LG cs.CL

    RL on Incorrect Synthetic Data Scales the Efficiency of LLM Math Reasoning by Eight-Fold

    Authors: Amrith Setlur, Saurabh Garg, Xinyang Geng, Naman Garg, Virginia Smith, Aviral Kumar

    Abstract: Training on model-generated synthetic data is a promising approach for finetuning LLMs, but it remains unclear when it helps or hurts. In this paper, we investigate this question for math reasoning via an empirical study, followed by building a conceptual understanding of our observations. First, we find that while the typical approach of finetuning a model on synthetic correct or positive problem…

    Submitted 20 June, 2024; originally announced June 2024.

  2. arXiv:2312.15551

    cs.LG cs.CR stat.ML

    On the Benefits of Public Representations for Private Transfer Learning under Distribution Shift

    Authors: Pratiksha Thaker, Amrith Setlur, Zhiwei Steven Wu, Virginia Smith

    Abstract: Public pretraining is a promising approach to improve differentially private model training. However, recent work has noted that many positive research results studying this paradigm only consider in-distribution tasks, and may not apply to settings where there is distribution shift between the pretraining and finetuning data -- a scenario that is likely when finetuning private tasks due to the se…

    Submitted 11 June, 2024; v1 submitted 24 December, 2023; originally announced December 2023.

  3. arXiv:2312.03318

    cs.LG cs.CV stat.ML

    Complementary Benefits of Contrastive Learning and Self-Training Under Distribution Shift

    Authors: Saurabh Garg, Amrith Setlur, Zachary Chase Lipton, Sivaraman Balakrishnan, Virginia Smith, Aditi Raghunathan

    Abstract: Self-training and contrastive learning have emerged as leading techniques for incorporating unlabeled data, both under distribution shift (unsupervised domain adaptation) and when it is absent (semi-supervised learning). However, despite the popularity and compatibility of these techniques, their efficacy in combination remains unexplored. In this paper, we undertake a systematic empirical investi…

    Submitted 6 December, 2023; originally announced December 2023.

    Comments: NeurIPS 2023

  4. arXiv:2312.03151

    cs.LG

    Multitask Learning Can Improve Worst-Group Outcomes

    Authors: Atharva Kulkarni, Lucio Dery, Amrith Setlur, Aditi Raghunathan, Ameet Talwalkar, Graham Neubig

    Abstract: In order to create machine learning systems that serve a variety of users well, it is vital to not only achieve high average performance but also ensure equitable outcomes across diverse groups. However, most machine learning methods are designed to improve a model's average performance on a chosen end task without consideration for their impact on worst group error. Multitask learning (MTL) is on…

    Submitted 28 February, 2024; v1 submitted 5 December, 2023; originally announced December 2023.

    Comments: 20 pages, 7 tables, 6 Figures

  5. arXiv:2310.00873

    cs.LG

    Deep Neural Networks Tend To Extrapolate Predictably

    Authors: Katie Kang, Amrith Setlur, Claire Tomlin, Sergey Levine

    Abstract: Conventional wisdom suggests that neural network predictions tend to be unpredictable and overconfident when faced with out-of-distribution (OOD) inputs. Our work reassesses this assumption for neural networks with high-dimensional inputs. Rather than extrapolating in arbitrary ways, we observe that neural network predictions often tend towards a constant value as input data becomes increasingly O…

    Submitted 15 March, 2024; v1 submitted 1 October, 2023; originally announced October 2023.

  6. arXiv:2307.10026

    cs.LG

    Contextual Reliability: When Different Features Matter in Different Contexts

    Authors: Gaurav Ghosal, Amrith Setlur, Daniel S. Brown, Anca D. Dragan, Aditi Raghunathan

    Abstract: Deep neural networks often fail catastrophically by relying on spurious correlations. Most prior work assumes a clear dichotomy into spurious and reliable features; however, this is often unrealistic. For example, most of the time we do not want an autonomous car to simply copy the speed of surrounding cars -- we don't want our car to run a red light if a neighboring car does so. However, we canno…

    Submitted 19 July, 2023; originally announced July 2023.

    Comments: ICML 2023 Camera Ready Version

  7. arXiv:2306.11120

    cs.LG cs.AI

    Confidence-Based Model Selection: When to Take Shortcuts for Subpopulation Shifts

    Authors: Annie S. Chen, Yoonho Lee, Amrith Setlur, Sergey Levine, Chelsea Finn

    Abstract: Effective machine learning models learn both robust features that directly determine the outcome of interest (e.g., an object with wheels is more likely to be a car), and shortcut features (e.g., an object on a road is more likely to be a car). The latter can be a source of error under distributional shift, when the correlations change at test-time. The prevailing sentiment in the robustness liter…

    Submitted 19 June, 2023; originally announced June 2023.

    Comments: 15 pages, 5 figures

  8. arXiv:2302.05441

    cs.LG cs.AI

    Project and Probe: Sample-Efficient Domain Adaptation by Interpolating Orthogonal Features

    Authors: Annie S. Chen, Yoonho Lee, Amrith Setlur, Sergey Levine, Chelsea Finn

    Abstract: Transfer learning with a small amount of target data is an effective and common approach to adapting a pre-trained model to distribution shifts. In some situations, target data labels may be expensive to obtain, so we may only have access to a limited number of target data points. To make the most of a very small target dataset, we propose a lightweight, sample-efficient approach that learns a div…

    Submitted 25 May, 2023; v1 submitted 10 February, 2023; originally announced February 2023.

    Comments: 22 pages, 9 figures

  9. arXiv:2302.02931

    cs.LG

    Bitrate-Constrained DRO: Beyond Worst Case Robustness To Unknown Group Shifts

    Authors: Amrith Setlur, Don Dennis, Benjamin Eysenbach, Aditi Raghunathan, Chelsea Finn, Virginia Smith, Sergey Levine

    Abstract: Training machine learning models robust to distribution shifts is critical for real-world applications. Some robust training algorithms (e.g., Group DRO) specialize to group shifts and require group information on all training points. Other methods (e.g., CVaR DRO) that do not need group annotations can be overly conservative, since they naively upweight high loss points which may form a contrived…

    Submitted 11 October, 2023; v1 submitted 6 February, 2023; originally announced February 2023.

    Journal ref: ICLR 2023

  10. arXiv:2206.01367

    cs.LG cs.CR

    Adversarial Unlearning: Reducing Confidence Along Adversarial Directions

    Authors: Amrith Setlur, Benjamin Eysenbach, Virginia Smith, Sergey Levine

    Abstract: Supervised learning methods trained with maximum likelihood objectives often overfit on training data. Most regularizers that prevent overfitting look to increase confidence on additional examples (e.g., data augmentation, adversarial training), or reduce it on training data (e.g., label smoothing). In this work we propose a complementary regularization strategy that reduces confidence on self-gen…

    Submitted 2 June, 2022; originally announced June 2022.

  11. arXiv:2102.11503

    cs.LG

    Two Sides of Meta-Learning Evaluation: In vs. Out of Distribution

    Authors: Amrith Setlur, Oscar Li, Virginia Smith

    Abstract: We categorize meta-learning evaluation into two settings: $\textit{in-distribution}$ [ID], in which the train and test tasks are sampled $\textit{iid}$ from the same underlying task distribution, and $\textit{out-of-distribution}$ [OOD], in which they are not. While most meta-learning theory and some FSL applications follow the ID setting, we identify that most existing few-shot classification ben…

    Submitted 27 October, 2021; v1 submitted 23 February, 2021; originally announced February 2021.

  12. arXiv:2011.14048

    cs.LG stat.ML

    Is Support Set Diversity Necessary for Meta-Learning?

    Authors: Amrith Setlur, Oscar Li, Virginia Smith

    Abstract: Meta-learning is a popular framework for learning with limited data in which an algorithm is produced by training over multiple few-shot learning tasks. For classification problems, these tasks are typically constructed by sampling a small number of support and query examples from a subset of the classes. While conventional wisdom is that task diversity should improve the performance of meta-learn…

    Submitted 7 October, 2021; v1 submitted 27 November, 2020; originally announced November 2020.

    Journal ref: NeurIPS 2020 Workshop on Meta-learning

  13. arXiv:2010.02114

    cs.CL cs.AI cs.LG stat.ML

    Explaining The Efficacy of Counterfactually Augmented Data

    Authors: Divyansh Kaushik, Amrith Setlur, Eduard Hovy, Zachary C. Lipton

    Abstract: In attempts to produce ML models less reliant on spurious patterns in NLP datasets, researchers have recently proposed curating counterfactually augmented data (CAD) via a human-in-the-loop process in which given some documents and their (initial) labels, humans must revise the text to make a counterfactual label applicable. Importantly, edits that are not necessary to flip the applicable label ar…

    Submitted 23 March, 2021; v1 submitted 5 October, 2020; originally announced October 2020.

    Comments: Published at ICLR 2021

  14. arXiv:2008.08148

    cs.CV cs.LG

    Robust Handwriting Recognition with Limited and Noisy Data

    Authors: Hai Pham, Amrith Setlur, Saket Dingliwal, Tzu-Hsiang Lin, Barnabas Poczos, Kang Huang, Zhuo Li, Jae Lim, Collin McCormack, Tam Vu

    Abstract: Despite the advent of deep learning in computer vision, the general handwriting recognition problem is far from solved. Most existing approaches focus on handwriting datasets that have clearly written text and carefully segmented labels. In this paper, we instead focus on learning handwritten characters from maintenance logs, a constrained setting where data is very limited and noisy. We break the…

    Submitted 18 August, 2020; originally announced August 2020.

    Comments: ICFHR 2020

  15. arXiv:2007.12948

    eess.AS cs.LG cs.SD stat.ML

    Nonlinear ISA with Auxiliary Variables for Learning Speech Representations

    Authors: Amrith Setlur, Barnabas Poczos, Alan W Black

    Abstract: This paper extends recent work on nonlinear Independent Component Analysis (ICA) by introducing a theoretical framework for nonlinear Independent Subspace Analysis (ISA) in the presence of auxiliary variables. Observed high dimensional acoustic features like log Mel spectrograms can be considered as surface level manifestations of nonlinear transformations over individual multivariate sources of i…

    Submitted 25 July, 2020; originally announced July 2020.

    Comments: To be presented at Interspeech 2020

  16. arXiv:2007.02523

    cs.LG stat.ML

    Covariate Distribution Aware Meta-learning

    Authors: Amrith Setlur, Saket Dingliwal, Barnabas Poczos

    Abstract: Meta-learning has proven to be successful for few-shot learning across the regression, classification, and reinforcement learning paradigms. Recent approaches have adopted Bayesian interpretations to improve gradient-based meta-learners by quantifying the uncertainty of the post-adaptation estimates. Most of these works almost completely ignore the latent relationship between the covariate distrib…

    Submitted 27 November, 2020; v1 submitted 6 July, 2020; originally announced July 2020.

    Journal ref: ICML 2020 Lifelong Learning Workshop

  17. arXiv:2004.14257

    cs.CL

    Politeness Transfer: A Tag and Generate Approach

    Authors: Aman Madaan, Amrith Setlur, Tanmay Parekh, Barnabas Poczos, Graham Neubig, Yiming Yang, Ruslan Salakhutdinov, Alan W Black, Shrimai Prabhumoye

    Abstract: This paper introduces a new task of politeness transfer which involves converting non-polite sentences to polite sentences while preserving the meaning. We also provide a dataset of more than 1.39 million instances automatically labeled for politeness to encourage benchmark evaluations on this new task. We design a tag and generate pipeline that identifies stylistic attributes and subsequently generates a…

    Submitted 1 May, 2020; v1 submitted 29 April, 2020; originally announced April 2020.

    Comments: To appear at ACL 2020

  18. arXiv:1910.10211

    cs.LG stat.ML

    Better Approximate Inference for Partial Likelihood Models with a Latent Structure

    Authors: Amrith Setlur, Barnabás Póczós

    Abstract: Temporal Point Processes (TPP) with partial likelihoods involving a latent structure often entail an intractable marginalization, thus making inference hard. We propose a novel approach to Maximum Likelihood Estimation (MLE) involving approximate inference over the latent variables by minimizing a tight upper bound on the approximation gap. Given a discrete latent variable $Z$, the proposed approx…

    Submitted 19 December, 2019; v1 submitted 22 October, 2019; originally announced October 2019.

    Journal ref: NeurIPS 2019 Workshop on Learning with Temporal Point Processes

  19. arXiv:1811.01355

    cs.CL

    Semi-Supervised Confidence Network aided Gated Attention based Recurrent Neural Network for Clickbait Detection

    Authors: Amrith Rajagopal Setlur

    Abstract: Clickbaits are catchy headlines that are frequently used by social media outlets in order to allure its viewers into clicking them and thus leading them to dubious content. Such venal schemes thrive on exploiting the curiosity of naive social media users, directing traffic to web pages that won't be visited otherwise. In this paper, we propose a novel, semi-supervised classification based approach…

    Submitted 4 November, 2018; originally announced November 2018.

    Comments: Oral Presentation, ICON 2018 [Proceedings in ACL Anthology]

  20. An Efficient Fault Tolerant Workflow Scheduling Approach using Replication Heuristics and Checkpointing in the Cloud

    Authors: S. Jaya Nirmala, Amrith Rajagopal Setlur, Har Simrat Singh, Sudhanshu Khoriya

    Abstract: Scientific workflows have been predominantly used for complex and large scale data analysis and scientific computation/automation and the need for robust workflow scheduling techniques has grown considerably. But, most of the existing workflow scheduling algorithms do not provide the required reliability and robustness. In this paper, a new fault tolerant workflow scheduling algorithm that learns…

    Submitted 31 October, 2019; v1 submitted 15 October, 2018; originally announced October 2018.

    Comments: 35 pages, 9 figures

    Journal ref: Journal of Parallel and Distributed Computing 2020