Showing 1–50 of 157 results for author: Roth, A

Searching in archive cs.
  1. arXiv:2407.12206  [pdf, other]

    cs.CL cs.SD eess.AS

    A Language Modeling Approach to Diacritic-Free Hebrew TTS

    Authors: Amit Roth, Arnon Turetzky, Yossi Adi

    Abstract: We tackle the task of text-to-speech (TTS) in Hebrew. Traditional Hebrew contains Diacritics, which dictate the way individuals should pronounce given words, however, modern Hebrew rarely uses them. The lack of diacritics in modern Hebrew results in readers expected to conclude the correct pronunciation and understand which phonemes to use based on the context. This imposes a fundamental challenge…

    Submitted 16 July, 2024; originally announced July 2024.

    Comments: Accepted at Interspeech24

  2. arXiv:2407.11876  [pdf, other]

    cs.LG

    Simplifying the Theory on Over-Smoothing

    Authors: Andreas Roth

    Abstract: Graph convolutions have gained popularity due to their ability to efficiently operate on data with an irregular geometric structure. However, graph convolutions cause over-smoothing, which refers to representations becoming more similar with increased depth. However, many different definitions and intuitions currently coexist, leading to research efforts focusing on incompatible directions. This p…

    Submitted 16 July, 2024; originally announced July 2024.

  3. arXiv:2407.07566  [pdf, other]

    cs.CL cs.SD eess.AS

    HebDB: a Weakly Supervised Dataset for Hebrew Speech Processing

    Authors: Arnon Turetzky, Or Tal, Yael Segal-Feldman, Yehoshua Dissen, Ella Zeldes, Amit Roth, Eyal Cohen, Yosi Shrem, Bronya R. Chernyak, Olga Seleznova, Joseph Keshet, Yossi Adi

    Abstract: We present HebDB, a weakly supervised dataset for spoken language processing in the Hebrew language. HebDB offers roughly 2500 hours of natural and spontaneous speech recordings in the Hebrew language, consisting of a large variety of speakers and topics. We provide raw recordings together with a pre-processed, weakly supervised, and filtered version. The goal of HebDB is to further enhance resear…

    Submitted 10 July, 2024; originally announced July 2024.

    Comments: Accepted at Interspeech2024

  4. arXiv:2405.20272  [pdf, other]

    cs.LG cs.CR

    Reconstruction Attacks on Machine Unlearning: Simple Models are Vulnerable

    Authors: Martin Bertran, Shuai Tang, Michael Kearns, Jamie Morgenstern, Aaron Roth, Zhiwei Steven Wu

    Abstract: Machine unlearning is motivated by desire for data autonomy: a person can request to have their data's influence removed from deployed models, and those models should be updated as if they were retrained without the person's data. We show that, counter-intuitively, these updates expose individuals to high-accuracy reconstruction attacks which allow the attacker to recover their data in its entiret…

    Submitted 30 May, 2024; originally announced May 2024.

  5. arXiv:2405.16752  [pdf, other]

    cs.LG cs.AI

    Model Ensembling for Constrained Optimization

    Authors: Ira Globus-Harris, Varun Gupta, Michael Kearns, Aaron Roth

    Abstract: There is a long history in machine learning of model ensembling, beginning with boosting and bagging and continuing to the present day. Much of this history has focused on combining models for classification and regression, but recently there is interest in more complex settings such as ensembling policies in reinforcement learning. Strong connections have also emerged between ensembling and multi…

    Submitted 26 May, 2024; originally announced May 2024.

  6. arXiv:2405.16739  [pdf, other]

    cs.LG cs.AI eess.SY

    Oracle-Efficient Reinforcement Learning for Max Value Ensembles

    Authors: Marcel Hussing, Michael Kearns, Aaron Roth, Sikata Bela Sengupta, Jessica Sorrell

    Abstract: Reinforcement learning (RL) in large or infinite state spaces is notoriously challenging, both theoretically (where worst-case sample and computational complexities must scale with state space cardinality) and experimentally (where function approximation and policy gradient techniques often scale poorly and suffer from instability and high variance). One line of research attempting to address thes…

    Submitted 26 May, 2024; originally announced May 2024.

  7. arXiv:2405.02225  [pdf, other]

    stat.ML cs.AI cs.CY cs.LG stat.ME

    Fair Risk Control: A Generalized Framework for Calibrating Multi-group Fairness Risks

    Authors: Lujing Zhang, Aaron Roth, Linjun Zhang

    Abstract: This paper introduces a framework for post-processing machine learning models so that their predictions satisfy multi-group fairness guarantees. Based on the celebrated notion of multicalibration, we introduce $(\mathbf{s},\mathcal{G}, α)-$GMC (Generalized Multi-Dimensional Multicalibration) for multi-dimensional mappings $\mathbf{s}$, constraint set $\mathcal{G}$, and a pre-specified threshold le…

    Submitted 3 May, 2024; originally announced May 2024.

    Comments: 28 pages, 8 figures, accepted by ICML2024

  8. arXiv:2404.04689  [pdf, other]

    stat.ML cs.CL cs.LG

    Multicalibration for Confidence Scoring in LLMs

    Authors: Gianluca Detommaso, Martin Bertran, Riccardo Fogliato, Aaron Roth

    Abstract: This paper proposes the use of "multicalibration" to yield interpretable and reliable confidence scores for outputs generated by large language models (LLMs). Multicalibration asks for calibration not just marginally, but simultaneously across various intersecting groupings of the data. We show how to form groupings for prompt/completion pairs that are correlated with the probability of correctnes…

    Submitted 6 April, 2024; originally announced April 2024.

  9. arXiv:2402.17108  [pdf, ps, other]

    cs.GT cs.DS cs.LG

    Repeated Contracting with Multiple Non-Myopic Agents: Policy Regret and Limited Liability

    Authors: Natalie Collina, Varun Gupta, Aaron Roth

    Abstract: We study a repeated contracting setting in which a Principal adaptively chooses amongst $k$ Agents at each of $T$ rounds. The Agents are non-myopic, and so a mechanism for the Principal induces a $T$-round extensive form game amongst the Agents. We give several results aimed at understanding an under-explored aspect of contract theory -- the game induced when choosing an Agent to contract with. Fi…

    Submitted 26 February, 2024; originally announced February 2024.

  10. arXiv:2402.11410  [pdf, ps, other]

    cs.LG cs.DS stat.ML

    An Elementary Predictor Obtaining $2\sqrt{T}$ Distance to Calibration

    Authors: Eshwar Ram Arunachaleswaran, Natalie Collina, Aaron Roth, Mirah Shi

    Abstract: Blasiok et al. [2023] proposed distance to calibration as a natural measure of calibration error that unlike expected calibration error (ECE) is continuous. Recently, Qiao and Zheng [2024] gave a non-constructive argument establishing the existence of an online predictor that can obtain $O(\sqrt{T})$ distance to calibration in the adversarial setting, which is known to be impossible for ECE. They…

    Submitted 17 February, 2024; originally announced February 2024.

  11. arXiv:2402.10795  [pdf, other]

    cs.LG cs.CY cs.HC

    Diversified Ensembling: An Experiment in Crowdsourced Machine Learning

    Authors: Ira Globus-Harris, Declan Harrison, Michael Kearns, Pietro Perona, Aaron Roth

    Abstract: Crowdsourced machine learning on competition platforms such as Kaggle is a popular and often effective method for generating accurate models. Typically, teams vie for the most accurate model, as measured by overall error on a holdout set, and it is common towards the end of such competitions for teams at the top of the leaderboard to ensemble or average their models outside the platform mechanism…

    Submitted 16 February, 2024; originally announced February 2024.

  12. arXiv:2402.08753  [pdf, ps, other]

    cs.GT cs.LG

    Forecasting for Swap Regret for All Downstream Agents

    Authors: Aaron Roth, Mirah Shi

    Abstract: We study the problem of making predictions so that downstream agents who best respond to them will be guaranteed diminishing swap regret, no matter what their utility functions are. It has been known since Foster and Vohra (1997) that agents who best-respond to calibrated forecasts have no swap regret. Unfortunately, the best known algorithms for guaranteeing calibrated forecasts in sequential adv…

    Submitted 15 June, 2024; v1 submitted 13 February, 2024; originally announced February 2024.

  13. arXiv:2312.05140  [pdf, other]

    cs.LG cs.CR

    Membership Inference Attacks on Diffusion Models via Quantile Regression

    Authors: Shuai Tang, Zhiwei Steven Wu, Sergul Aydore, Michael Kearns, Aaron Roth

    Abstract: Recently, diffusion models have become popular tools for image synthesis because of their high-quality outputs. However, like other large-scale models, they may leak private information about their training data. Here, we demonstrate a privacy vulnerability of diffusion models through a \emph{membership inference (MI) attack}, which aims to identify whether a target example belongs to the training…

    Submitted 8 December, 2023; originally announced December 2023.

  14. arXiv:2311.07754  [pdf, other]

    cs.GT cs.DS econ.TH

    Efficient Prior-Free Mechanisms for No-Regret Agents

    Authors: Natalie Collina, Aaron Roth, Han Shao

    Abstract: We study a repeated Principal Agent problem between a long lived Principal and Agent pair in a prior free setting. In our setting, the sequence of realized states of nature may be adversarially chosen, the Agent is non-myopic, and the Principal aims for a strong form of policy regret. Following Camara, Hartline, and Johnson, we model the Agent's long-run behavior with behavioral assumptions that r…

    Submitted 13 November, 2023; originally announced November 2023.

  15. arXiv:2310.17651  [pdf, other]

    cs.LG

    High-Dimensional Prediction for Sequential Decision Making

    Authors: Georgy Noarov, Ramya Ramalingam, Aaron Roth, Stephan Xie

    Abstract: We study the problem of making predictions of an adversarially chosen high-dimensional state that are unbiased subject to an arbitrary collection of conditioning events, with the goal of tailoring these events to downstream decision makers. We give efficient algorithms for solving this problem, as well as a number of applications that stem from choosing an appropriate set of conditioning events.…

    Submitted 27 October, 2023; v1 submitted 26 October, 2023; originally announced October 2023.

    Comments: Added references, Arxiv abstract edited

  16. arXiv:2310.04652  [pdf, other]

    cs.LG

    Oracle Efficient Algorithms for Groupwise Regret

    Authors: Krishna Acharya, Eshwar Ram Arunachaleswaran, Sampath Kannan, Aaron Roth, Juba Ziani

    Abstract: We study the problem of online prediction, in which at each time step $t$, an individual $x_t$ arrives, whose label we must predict. Each individual is associated with various groups, defined based on their features such as age, sex, race etc., which may intersect. Our goal is to make predictions that have regret guarantees not just overall but also simultaneously on each sub-sequence comprised of…

    Submitted 6 October, 2023; originally announced October 2023.

  17. arXiv:2310.00946  [pdf, other]

    cs.LG cs.AI

    Distilling Influences to Mitigate Prediction Churn in Graph Neural Networks

    Authors: Andreas Roth, Thomas Liebig

    Abstract: Models with similar performances exhibit significant disagreement in the predictions of individual samples, referred to as prediction churn. Our work explores this phenomenon in graph neural networks by investigating differences between models differing only in their initializations in their utilized features for predictions. We propose a novel metric called Influence Difference (ID) to quantify t…

    Submitted 2 October, 2023; originally announced October 2023.

    Comments: Accepted at ACML 2023

  18. arXiv:2309.06000  [pdf, other]

    cs.RO

    Gait Design of a Novel Arboreal Concertina Locomotion for Snake-like Robots

    Authors: Shuoqi Chen, Aaron Roth

    Abstract: In this paper, we propose a novel strategy for a snake robot to move straight up a cylindrical surface. Prior works on pole-climbing for a snake robot mainly utilized a rolling helix gait, and although proven to be efficient, it does not reassemble movements made by a natural snake. We take inspiration from nature and seek to imitate the Arboreal Concertina Locomotion (ACL) from real-life serpents…

    Submitted 12 September, 2023; originally announced September 2023.

    Comments: 4 pages, 3 figures

  19. arXiv:2308.16800  [pdf, other]

    cs.LG cs.AI

    Rank Collapse Causes Over-Smoothing and Over-Correlation in Graph Neural Networks

    Authors: Andreas Roth, Thomas Liebig

    Abstract: Our study reveals new theoretical insights into over-smoothing and feature over-correlation in deep graph neural networks. We show the prevalence of invariant subspaces, demonstrating a fixed relative behavior that is unaffected by feature transformations. Our work clarifies recent observations related to convergence to a constant state and a potential over-separation of node states, as the amplif…

    Submitted 21 February, 2024; v1 submitted 31 August, 2023; originally announced August 2023.

    Comments: Published at LoG 2023

  20. arXiv:2308.16516  [pdf, other]

    cs.LG cs.AI

    Curvature-based Pooling within Graph Neural Networks

    Authors: Cedric Sanders, Andreas Roth, Thomas Liebig

    Abstract: Over-squashing and over-smoothing are two critical issues, that limit the capabilities of graph neural networks (GNNs). While over-smoothing eliminates the differences between nodes making them indistinguishable, over-squashing refers to the inability of GNNs to propagate information over long distances, as exponentially many node states are squashed into fixed-size representations. Both phenomena…

    Submitted 31 August, 2023; originally announced August 2023.

    Comments: ECMLPKDD 2023 - Workshop on Mining and Learning with Graphs

  21. arXiv:2307.08999  [pdf, ps, other]

    cs.LG stat.ML

    Oracle Efficient Online Multicalibration and Omniprediction

    Authors: Sumegha Garg, Christopher Jung, Omer Reingold, Aaron Roth

    Abstract: A recent line of work has shown a surprising connection between multicalibration, a multi-group fairness notion, and omniprediction, a learning paradigm that provides simultaneous loss minimization guarantees for a large family of loss functions. Prior work studies omniprediction in the batch setting. We initiate the study of omniprediction in the online adversarial setting. Although there exist a…

    Submitted 18 July, 2023; originally announced July 2023.

  22. arXiv:2307.03694  [pdf, other]

    cs.LG cs.AI cs.CR

    Scalable Membership Inference Attacks via Quantile Regression

    Authors: Martin Bertran, Shuai Tang, Michael Kearns, Jamie Morgenstern, Aaron Roth, Zhiwei Steven Wu

    Abstract: Membership inference attacks are designed to determine, using black box access to trained models, whether a particular example was used in training or not. Membership inference can be formalized as a hypothesis testing problem. The most effective existing attacks estimate the distribution of some test statistic (usually the model's confidence on the true label) on points that were (and were not) u…

    Submitted 7 July, 2023; originally announced July 2023.

  23. Balanced Filtering via Disclosure-Controlled Proxies

    Authors: Siqi Deng, Emily Diana, Michael Kearns, Aaron Roth

    Abstract: We study the problem of collecting a cohort or set that is balanced with respect to sensitive groups when group membership is unavailable or prohibited from use at deployment time. Specifically, our deployment-time collection mechanism does not reveal significantly more about the group membership of any individual sample than can be ascertained from base rates alone. To do this, we study a learner…

    Submitted 17 June, 2024; v1 submitted 26 June, 2023; originally announced June 2023.

    Journal ref: 5th Symposium on Foundations of Responsible Computing (FORC 2024)

  24. arXiv:2303.03451  [pdf, other]

    cs.LG cs.CR

    Improved Differentially Private Regression via Gradient Boosting

    Authors: Shuai Tang, Sergul Aydore, Michael Kearns, Saeyoung Rho, Aaron Roth, Yichen Wang, Yu-Xiang Wang, Zhiwei Steven Wu

    Abstract: We revisit the problem of differentially private squared error linear regression. We observe that existing state-of-the-art methods are sensitive to the choice of hyperparameters -- including the ``clipping threshold'' that cannot be set optimally in a data-independent way. We give a new algorithm for private linear regression based on gradient boosting. We show that our method consistently improv…

    Submitted 20 May, 2023; v1 submitted 6 March, 2023; originally announced March 2023.

  25. arXiv:2302.08507  [pdf, ps, other]

    cs.LG cs.DS math.ST

    The Scope of Multicalibration: Characterizing Multicalibration via Property Elicitation

    Authors: Georgy Noarov, Aaron Roth

    Abstract: We make a connection between multicalibration and property elicitation and show that (under mild technical conditions) it is possible to produce a multicalibrated predictor for a continuous scalar distributional property $Γ$ if and only if $Γ$ is elicitable. On the negative side, we show that for non-elicitable continuous properties there exist simple data distributions on which even the true di…

    Submitted 16 February, 2023; originally announced February 2023.

  26. arXiv:2301.13767  [pdf, other]

    cs.LG cs.DS

    Multicalibration as Boosting for Regression

    Authors: Ira Globus-Harris, Declan Harrison, Michael Kearns, Aaron Roth, Jessica Sorrell

    Abstract: We study the connection between multicalibration and boosting for squared error regression. First we prove a useful characterization of multicalibration in terms of a ``swap regret'' like condition on squared error. Using this characterization, we give an exceedingly simple algorithm that can be analyzed both as a boosting algorithm for regression and as a multicalibration algorithm for a class H…

    Submitted 31 January, 2023; originally announced January 2023.

    Comments: Code available here: https://github.com/Declancharrison/Level-Set-Boosting

  27. arXiv:2211.11596  [pdf, other]

    cs.LG

    Forecasting Unobserved Node States with spatio-temporal Graph Neural Networks

    Authors: Andreas Roth, Thomas Liebig

    Abstract: Forecasting future states of sensors is key to solving tasks like weather prediction, route planning, and many others when dealing with networks of sensors. But complete spatial coverage of sensors is generally unavailable and would practically be infeasible due to limitations in budget and other resources during deployment and maintenance. Currently existing approaches using machine learning are…

    Submitted 21 November, 2022; originally announced November 2022.

  28. arXiv:2211.03128  [pdf, other]

    cs.CY cs.CR cs.LG

    Confidence-Ranked Reconstruction of Census Microdata from Published Statistics

    Authors: Travis Dick, Cynthia Dwork, Michael Kearns, Terrance Liu, Aaron Roth, Giuseppe Vietri, Zhiwei Steven Wu

    Abstract: A reconstruction attack on a private dataset $D$ takes as input some publicly accessible information about the dataset and produces a list of candidate elements of $D$. We introduce a new class of data reconstruction attacks based on randomized methods for non-convex optimization. We empirically demonstrate that our attacks can not only reconstruct full rows of $D$ from aggregate query statistics…

    Submitted 6 February, 2023; v1 submitted 6 November, 2022; originally announced November 2022.

  29. arXiv:2209.15145  [pdf, other]

    cs.LG math.ST

    Batch Multivalid Conformal Prediction

    Authors: Christopher Jung, Georgy Noarov, Ramya Ramalingam, Aaron Roth

    Abstract: We develop fast distribution-free conformal prediction algorithms for obtaining multivalid coverage on exchangeable data in the batch setting. Multivalid coverage guarantees are stronger than marginal coverage guarantees in two ways: (1) They hold even conditional on group membership -- that is, the target coverage level $1-α$ holds conditionally on membership in each of an arbitrary (potentially…

    Submitted 29 September, 2022; originally announced September 2022.

    Comments: Code to replicate all of our experiments can be found at https://github.com/ProgBelarus/BatchMultivalidConformal

  30. arXiv:2209.09079  [pdf, other]

    cs.RO cs.AI cs.HC cs.LG

    MSVIPER: Improved Policy Distillation for Reinforcement-Learning-Based Robot Navigation

    Authors: Aaron M. Roth, Jing Liang, Ram Sriram, Elham Tabassi, Dinesh Manocha

    Abstract: We present Multiple Scenario Verifiable Reinforcement Learning via Policy Extraction (MSVIPER), a new method for policy distillation to decision trees for improved robot navigation. MSVIPER learns an "expert" policy using any Reinforcement Learning (RL) technique involving learning a state-action mapping and then uses imitation learning to learn a decision-tree policy from it. We demonstrate that…

    Submitted 19 September, 2022; originally announced September 2022.

    Comments: 6 pages main paper, 2 pages of references, 5 page appendix (13 pages total) 5 tables, 9 algorithms, 4 figures

  31. arXiv:2209.07400  [pdf, other]

    cs.LG

    Private Synthetic Data for Multitask Learning and Marginal Queries

    Authors: Giuseppe Vietri, Cedric Archambeau, Sergul Aydore, William Brown, Michael Kearns, Aaron Roth, Ankit Siva, Shuai Tang, Zhiwei Steven Wu

    Abstract: We provide a differentially private algorithm for producing synthetic data simultaneously useful for multiple tasks: marginal queries and multitask machine learning (ML). A key innovation in our algorithm is the ability to directly handle numerical features, in contrast to a number of related prior approaches which require numerical features to be first converted into {high cardinality} categorica…

    Submitted 15 September, 2022; originally announced September 2022.

    Comments: The short version of this paper appears in the proceedings of NeurIPS-22

  32. arXiv:2209.07375  [pdf, other]

    cs.GT

    Wealth Dynamics Over Generations: Analysis and Interventions

    Authors: Krishna Acharya, Eshwar Ram Arunachaleswaran, Sampath Kannan, Aaron Roth, Juba Ziani

    Abstract: We present a stylized model with feedback loops for the evolution of a population's wealth over generations. Individuals have both talent and wealth: talent is a random variable distributed identically for everyone, but wealth is a random variable that is dependent on the population one is born into. Individuals then apply to a downstream agent, which we treat as a university throughout the paper…

    Submitted 15 September, 2022; originally announced September 2022.

  33. arXiv:2209.07312  [pdf, other]

    cs.LG cs.DS

    Multicalibrated Regression for Downstream Fairness

    Authors: Ira Globus-Harris, Varun Gupta, Christopher Jung, Michael Kearns, Jamie Morgenstern, Aaron Roth

    Abstract: We show how to take a regression function $\hat{f}$ that is appropriately ``multicalibrated'' and efficiently post-process it into an approximately error minimizing classifier satisfying a large variety of fairness constraints. The post-processing requires no labeled data, and only a modest amount of unlabeled data and computation. The computational and sample complexity requirements of computing…

    Submitted 15 September, 2022; originally announced September 2022.

  34. arXiv:2209.01687  [pdf, ps, other]

    cs.LG cs.DS math.ST

    Reconciling Individual Probability Forecasts

    Authors: Aaron Roth, Alexander Tolbert, Scott Weinstein

    Abstract: Individual probabilities refer to the probabilities of outcomes that are realized only once: the probability that it will rain tomorrow, the probability that Alice will die within the next 12 months, the probability that Bob will be arrested for a violent crime in the next 18 months, etc. Individual probabilities are fundamentally unknowable. Nevertheless, we show that two parties who agree on the…

    Submitted 6 May, 2023; v1 submitted 4 September, 2022; originally announced September 2022.

    Comments: This is the full version of a paper that appears in the proceedings of FAccT 2023: The Sixth Annual ACM Conference on Fairness, Accountability, and Transparency, 2023

  35. arXiv:2207.00684  [pdf, other]

    cs.LG

    Transforming PageRank into an Infinite-Depth Graph Neural Network

    Authors: Andreas Roth, Thomas Liebig

    Abstract: Popular graph neural networks are shallow models, despite the success of very deep architectures in other application domains of deep learning. This reduces the modeling capacity and leaves models unable to capture long-range relationships. The primary reason for the shallow design results from over-smoothing, which leads node states to become more similar with increased depth. We build on the clo…

    Submitted 1 July, 2022; originally announced July 2022.

    Comments: Accepted at ECML-PKDD 2022

    ACM Class: I.2.6; I.0

  36. arXiv:2206.04475  [pdf, ps, other]

    cs.LG stat.ML

    Individually Fair Learning with One-Sided Feedback

    Authors: Yahav Bechavod, Aaron Roth

    Abstract: We consider an online learning problem with one-sided feedback, in which the learner is able to observe the true label only for positively predicted instances. On each round, $k$ instances arrive and receive classification outcomes according to a randomized policy deployed by the learner, whose goal is to maximize accuracy while deploying individually fair policies. We first extend the framework o…

    Submitted 9 June, 2022; originally announced June 2022.

  37. arXiv:2206.01067  [pdf, other]

    cs.LG

    Practical Adversarial Multivalid Conformal Prediction

    Authors: Osbert Bastani, Varun Gupta, Christopher Jung, Georgy Noarov, Ramya Ramalingam, Aaron Roth

    Abstract: We give a simple, generic conformal prediction method for sequential prediction that achieves target empirical coverage guarantees against adversarially chosen data. It is computationally lightweight -- comparable to split conformal prediction -- but does not require having a held-out validation set, and so all data can be used for training models from which to derive a conformal score. It gives s…

    Submitted 2 June, 2022; originally announced June 2022.

    Comments: Code for our experiments can be found at: https://github.com/ProgBelarus/MultiValidPrediction

  38. arXiv:2203.11481  [pdf, other]

    cs.CV cs.CR

    Mixed Differential Privacy in Computer Vision

    Authors: Aditya Golatkar, Alessandro Achille, Yu-Xiang Wang, Aaron Roth, Michael Kearns, Stefano Soatto

    Abstract: We introduce AdaMix, an adaptive differentially private algorithm for training deep neural network classifiers using both private and public image data. While pre-training language models on large public datasets has enabled strong differential privacy (DP) guarantees with minor loss of accuracy, a similar practice yields punishing trade-offs in vision tasks. A few-shot or even zero-shot learning…

    Submitted 28 March, 2022; v1 submitted 22 March, 2022; originally announced March 2022.

    Comments: Accepted at CVPR 2022

  39. arXiv:2201.10408  [pdf, other]

    cs.LG cs.CY cs.DS

    An Algorithmic Framework for Bias Bounties

    Authors: Ira Globus-Harris, Michael Kearns, Aaron Roth

    Abstract: We propose and analyze an algorithmic framework for "bias bounties": events in which external participants are invited to propose improvements to a trained model, akin to bug bounty events in software and security. Our framework allows participants to submit arbitrary subgroup improvements, which are then algorithmically incorporated into an updated model. Our algorithm has the property that there…

    Submitted 9 May, 2022; v1 submitted 25 January, 2022; originally announced January 2022.

  40. arXiv:2108.03837  [pdf, ps, other]

    cs.LG cs.DS cs.GT

    Online Minimax Multiobjective Optimization: Multicalibeating and Other Applications

    Authors: Daniel Lee, Georgy Noarov, Mallesh Pai, Aaron Roth

    Abstract: We introduce a simple but general online learning framework in which a learner plays against an adversary in a vector-valued game that changes every round. Even though the learner's objective is not convex-concave (and so the minimax theorem does not apply), we give a simple algorithm that can compete with the setting in which the adversary must announce their action first, with optimally diminish…

    Submitted 13 October, 2022; v1 submitted 9 August, 2021; originally announced August 2021.

    Comments: Appears in NeurIPS 2022

  41. arXiv:2107.04423  [pdf, other]

    cs.LG cs.DS

    Multiaccurate Proxies for Downstream Fairness

    Authors: Emily Diana, Wesley Gill, Michael Kearns, Krishnaram Kenthapadi, Aaron Roth, Saeed Sharifi-Malvajerdi

    Abstract: We study the problem of training a model that must obey demographic fairness conditions when the sensitive features are not available at training time -- in other words, how can we train a model to be fair by race when we don't have data about race? We adopt a fairness pipeline perspective, in which an "upstream" learner that does have access to the sensitive features will learn a proxy model for…

    Submitted 25 January, 2022; v1 submitted 9 July, 2021; originally announced July 2021.

  42. arXiv:2106.16207  [pdf, other]

    cs.SI cs.CY

    When the Echo Chamber Shatters: Examining the Use of Community-Specific Language Post-Subreddit Ban

    Authors: Milo Z. Trujillo, Samuel F. Rosenblatt, Guillermo de Anda Jáuregui, Emily Moog, Briane Paul V. Samson, Laurent Hébert-Dufresne, Allison M. Roth

    Abstract: Community-level bans are a common tool against groups that enable online harassment and harmful speech. Unfortunately, the efficacy of community bans has only been partially studied and with mixed results. Here, we provide a flexible unsupervised methodology to identify in-group language and track user activity on Reddit both before and after the ban of a community (subreddit). We use a simple wor…

    Submitted 30 June, 2021; originally announced June 2021.

    Comments: 15 pages (including references and appendix), 5 figures

  43. arXiv:2106.04378  [pdf, other]

    cs.LG stat.ML

    Adaptive Machine Unlearning

    Authors: Varun Gupta, Christopher Jung, Seth Neel, Aaron Roth, Saeed Sharifi-Malvajerdi, Chris Waites

    Abstract: Data deletion algorithms aim to remove the influence of deleted data points from trained models at a cheaper computational cost than fully retraining those models. However, for sequences of deletions, most prior work in the non-convex setting gives valid guarantees only for sequences that are chosen independently of the models that are published. If people choose to delete their data as a function…

    Submitted 8 June, 2021; originally announced June 2021.

  44. arXiv:2104.10818  [pdf, other]

    cs.RO cs.AI cs.LG

    XAI-N: Sensor-based Robot Navigation using Expert Policies and Decision Trees

    Authors: Aaron M. Roth, Jing Liang, Dinesh Manocha

    Abstract: We present a novel sensor-based learning navigation algorithm to compute a collision-free trajectory for a robot in dense and dynamic environments with moving obstacles or targets. Our approach uses deep reinforcement learning-based expert policy that is trained using a sim2real paradigm. In order to increase the reliability and handle the failure cases of the expert policy, we combine with a poli…

    Submitted 18 July, 2021; v1 submitted 21 April, 2021; originally announced April 2021.

    Journal ref: 2021 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS)

  45. arXiv:2104.01987  [pdf, ps, other]

    cs.CR cs.LG math.ST stat.ML

    Rejoinder: Gaussian Differential Privacy

    Authors: Jinshuo Dong, Aaron Roth, Weijie J. Su

    Abstract: In this rejoinder, we aim to address two broad issues that cover most comments made in the discussion. First, we discuss some theoretical aspects of our work and comment on how this work might impact the theoretical foundation of privacy-preserving data analysis. Taking a practical viewpoint, we next discuss how f-differential privacy (f-DP) and Gaussian differential privacy (GDP) can make a diffe…

    Submitted 25 June, 2021; v1 submitted 5 April, 2021; originally announced April 2021.

    Comments: Updated the references. Rejoinder to discussions on Gaussian Differential Privacy, read to the Royal Statistical Society in December 2020

  46. arXiv:2103.06641  [pdf, other]

    cs.LG cs.CR

    Differentially Private Query Release Through Adaptive Projection

    Authors: Sergul Aydore, William Brown, Michael Kearns, Krishnaram Kenthapadi, Luca Melis, Aaron Roth, Ankit Siva

    Abstract: We propose, implement, and evaluate a new algorithm for releasing answers to very large numbers of statistical queries like $k$-way marginals, subject to differential privacy. Our algorithm makes adaptive use of a continuous relaxation of the Projection Mechanism, which answers queries on the private dataset using simple perturbation, and then attempts to find the synthetic dataset that most close…

    Submitted 23 June, 2021; v1 submitted 11 March, 2021; originally announced March 2021.

  47. arXiv:2102.08454  [pdf, ps, other]

    cs.LG cs.DS stat.ML

    Lexicographically Fair Learning: Algorithms and Generalization

    Authors: Emily Diana, Wesley Gill, Ira Globus-Harris, Michael Kearns, Aaron Roth, Saeed Sharifi-Malvajerdi

    Abstract: We extend the notion of minimax fairness in supervised learning problems to its natural conclusion: lexicographic minimax fairness (or lexifairness for short). Informally, given a collection of demographic groups of interest, minimax fairness asks that the error of the group with the highest error be minimized. Lexifairness goes further and asks that amongst all minimax fair solutions, the error o…

    Submitted 16 February, 2021; originally announced February 2021.

  48. arXiv:2102.07809  [pdf, ps, other]

    cs.GT econ.TH physics.soc-ph

    Best vs. All: Equity and Accuracy of Standardized Test Score Reporting

    Authors: Sampath Kannan, Mingzi Niu, Aaron Roth, Rakesh Vohra

    Abstract: We study a game-theoretic model of standardized testing for college admissions. Students are of two types: High and Low. There is a college that would like to admit the High type students. Students take a potentially costly standardized exam which provides a noisy signal of their type. The students come from two populations, which are identical in talent (i.e. the type distribution is the same),…

    Submitted 15 February, 2021; originally announced February 2021.

  49. arXiv:2101.01739  [pdf, ps, other]

    cs.LG cs.DS cs.GT econ.EM

    Online Multivalid Learning: Means, Moments, and Prediction Intervals

    Authors: Varun Gupta, Christopher Jung, Georgy Noarov, Mallesh M. Pai, Aaron Roth

    Abstract: We present a general, efficient technique for providing contextual predictions that are "multivalid" in various senses, against an online sequence of adversarially chosen examples $(x,y)$. This means that the resulting estimates correctly predict various statistics of the labels $y$ not just marginally -- as averaged over the sequence of examples -- but also conditionally on $x \in G$ for any $G$…

    Submitted 5 January, 2021; originally announced January 2021.

  50. arXiv:2011.03108  [pdf, other]

    cs.LG

    Minimax Group Fairness: Algorithms and Experiments

    Authors: Emily Diana, Wesley Gill, Michael Kearns, Krishnaram Kenthapadi, Aaron Roth

    Abstract: We consider a recently introduced framework in which fairness is measured by worst-case outcomes across groups, rather than by the more standard differences between group outcomes. In this framework we provide provably convergent oracle-efficient learning algorithms (or equivalently, reductions to non-fair learning) for minimax group fairness. Here the goal is that of minimizing the maximum loss a…

    Submitted 7 March, 2021; v1 submitted 5 November, 2020; originally announced November 2020.