
Showing 1–38 of 38 results for author: Karimireddy, S P

Searching in archive cs.
  1. arXiv:2406.15898  [pdf, other]

    cs.GT cs.LG

    Defection-Free Collaboration between Competitors in a Learning System

    Authors: Mariel Werner, Sai Praneeth Karimireddy, Michael I. Jordan

    Abstract: We study collaborative learning systems in which the participants are competitors who will defect from the system if they lose revenue by collaborating. As such, we frame the system as a duopoly of competitive firms who are each engaged in training machine-learning models and selling their predictions to a market of consumers. We first examine a fully collaborative scheme in which both firms share…

    Submitted 22 June, 2024; originally announced June 2024.

  2. arXiv:2404.15746  [pdf, other]

    stat.ML cs.CR cs.LG

    Collaborative Heterogeneous Causal Inference Beyond Meta-analysis

    Authors: Tianyu Guo, Sai Praneeth Karimireddy, Michael I. Jordan

    Abstract: Collaboration between different data centers is often challenged by heterogeneity across sites. To account for the heterogeneity, the state-of-the-art method is to re-weight the covariate distributions in each site to match the distribution of the target population. Nevertheless, this method could easily fail when a certain site couldn't cover the entire population. Moreover, it still relies on th…

    Submitted 24 April, 2024; originally announced April 2024.

    Comments: submitted to ICML

  3. arXiv:2404.10767  [pdf, other]

    cs.GT

    Privacy Can Arise Endogenously in an Economic System with Learning Agents

    Authors: Nivasini Ananthakrishnan, Tiffany Ding, Mariel Werner, Sai Praneeth Karimireddy, Michael I. Jordan

    Abstract: We study price-discrimination games between buyers and a seller where privacy arises endogenously--that is, utility maximization yields equilibrium strategies where privacy occurs naturally. In this game, buyers with a high valuation for a good have an incentive to keep their valuation private, lest the seller charge them a higher price. This yields an equilibrium where some buyers will send a sig…

    Submitted 16 April, 2024; originally announced April 2024.

    Comments: To appear in Symposium on Foundations of Responsible Computing (FORC 2024)

  4. arXiv:2403.13893  [pdf, other]

    cs.LG

    Data Acquisition via Experimental Design for Decentralized Data Markets

    Authors: Charles Lu, Baihe Huang, Sai Praneeth Karimireddy, Praneeth Vepakomma, Michael Jordan, Ramesh Raskar

    Abstract: Acquiring high-quality training data is essential for current machine learning models. Data markets provide a way to increase the supply of data, particularly in data-scarce domains such as healthcare, by incentivizing potential data sellers to join the market. A major challenge for a data buyer in such a market is selecting the most valuable data points from a data seller. Unlike prior work in da…

    Submitted 20 March, 2024; originally announced March 2024.

    Comments: 26 pages, 20 figures

  5. arXiv:2307.13381  [pdf, other]

    cs.LG cs.DC math.OC stat.ML

    Scaff-PD: Communication Efficient Fair and Robust Federated Learning

    Authors: Yaodong Yu, Sai Praneeth Karimireddy, Yi Ma, Michael I. Jordan

    Abstract: We present Scaff-PD, a fast and communication-efficient algorithm for distributionally robust federated learning. Our approach improves fairness by optimizing a family of distributionally robust objectives tailored to heterogeneous clients. We leverage the special structure of these objectives, and design an accelerated primal dual (APD) algorithm which uses bias corrected local steps (as in Scaff…

    Submitted 25 July, 2023; originally announced July 2023.

    MSC Class: 68W40; 68W15; 90C25; 90C06 ACM Class: G.1.6; F.2.1; E.4

  6. arXiv:2306.08393  [pdf, other]

    cs.LG cs.DC

    Provably Personalized and Robust Federated Learning

    Authors: Mariel Werner, Lie He, Michael Jordan, Martin Jaggi, Sai Praneeth Karimireddy

    Abstract: Identifying clients with similar objectives and learning a model-per-cluster is an intuitive and interpretable approach to personalization in federated learning. However, doing so with provable and optimal guarantees has remained an open challenge. We formalize this problem as a stochastic optimization problem, achieving optimal convergence rates for a large class of loss functions. We propose sim…

    Submitted 18 December, 2023; v1 submitted 14 June, 2023; originally announced June 2023.

  7. arXiv:2306.05592  [pdf, other]

    cs.GT cs.CY cs.DC cs.LG econ.TH

    Evaluating and Incentivizing Diverse Data Contributions in Collaborative Learning

    Authors: Baihe Huang, Sai Praneeth Karimireddy, Michael I. Jordan

    Abstract: For a federated learning model to perform well, it is crucial to have a diverse and representative dataset. However, the data contributors may only be concerned with the performance on a specific subset of the population, which may not reflect the diversity of the wider population. This creates a tension between the principal (the FL platform designer) who cares about global performance and the ag…

    Submitted 8 June, 2023; originally announced June 2023.

  8. arXiv:2305.17564  [pdf, other]

    cs.LG

    Federated Conformal Predictors for Distributed Uncertainty Quantification

    Authors: Charles Lu, Yaodong Yu, Sai Praneeth Karimireddy, Michael I. Jordan, Ramesh Raskar

    Abstract: Conformal prediction is emerging as a popular paradigm for providing rigorous uncertainty quantification in machine learning since it can be easily applied as a post-processing step to already trained models. In this paper, we extend conformal prediction to the federated learning setting. The main challenge we face is data heterogeneity across the clients - this violates the fundamental tenet of e…

    Submitted 1 June, 2023; v1 submitted 27 May, 2023; originally announced May 2023.

    Comments: 23 pages, 18 figures, accepted to International Conference on Machine Learning (ICML 2023)

  9. arXiv:2305.11381  [pdf, ps, other]

    cs.GT cs.CY cs.IR cs.LG econ.TH

    Online Learning in a Creator Economy

    Authors: Banghua Zhu, Sai Praneeth Karimireddy, Jiantao Jiao, Michael I. Jordan

    Abstract: The creator economy has revolutionized the way individuals can profit through online platforms. In this paper, we initiate the study of online learning in the creator economy by modeling the creator economy as a three-party game between the users, platform, and content creators, with the platform interacting with the content creator under a principal-agent model through contracts to encourage bett…

    Submitted 18 May, 2023; originally announced May 2023.

  10. arXiv:2301.12407  [pdf, other]

    cs.LG

    FedEBA+: Towards Fair and Effective Federated Learning via Entropy-Based Model

    Authors: Lin Wang, Zhichao Wang, Sai Praneeth Karimireddy, Xiaoying Tang

    Abstract: Ensuring fairness is a crucial aspect of Federated Learning (FL), which enables the model to perform consistently across all clients. However, designing an FL algorithm that simultaneously improves global model performance and promotes fairness remains a formidable challenge, as achieving the latter often necessitates a trade-off with the former. To address this challenge, we propose a new FL algo…

    Submitted 5 February, 2024; v1 submitted 29 January, 2023; originally announced January 2023.

  11. arXiv:2210.04620  [pdf, other]

    cs.LG cs.CV

    FLamby: Datasets and Benchmarks for Cross-Silo Federated Learning in Realistic Healthcare Settings

    Authors: Jean Ogier du Terrail, Samy-Safwan Ayed, Edwige Cyffers, Felix Grimberg, Chaoyang He, Regis Loeb, Paul Mangold, Tanguy Marchand, Othmane Marfoq, Erum Mushtaq, Boris Muzellec, Constantin Philippenko, Santiago Silva, Maria Teleńczuk, Shadi Albarqouni, Salman Avestimehr, Aurélien Bellet, Aymeric Dieuleveut, Martin Jaggi, Sai Praneeth Karimireddy, Marco Lorenzi, Giovanni Neglia, Marc Tommasi, Mathieu Andreux

    Abstract: Federated Learning (FL) is a novel approach enabling several clients holding sensitive data to collaboratively train machine learning models, without centralizing data. The cross-silo FL setting corresponds to the case of few ($2$--$50$) reliable clients, each holding medium to large datasets, and is typically found in applications such as healthcare, finance, or industry. While previous works hav…

    Submitted 5 May, 2023; v1 submitted 10 October, 2022; originally announced October 2022.

    Comments: Accepted to NeurIPS, Datasets and Benchmarks Track, this version fixes typos in the datasets' table and the appendix

  12. arXiv:2207.06343  [pdf, other]

    cs.LG cs.DC math.OC stat.ML

    TCT: Convexifying Federated Learning using Bootstrapped Neural Tangent Kernels

    Authors: Yaodong Yu, Alexander Wei, Sai Praneeth Karimireddy, Yi Ma, Michael I. Jordan

    Abstract: State-of-the-art federated learning methods can perform far worse than their centralized counterparts when clients have dissimilar data distributions. For neural networks, even when centralized SGD easily finds a solution that is simultaneously performant for all clients, current federated optimization methods fail to converge to a comparable solution. We show that this performance disparity can l…

    Submitted 5 October, 2022; v1 submitted 13 July, 2022; originally announced July 2022.

    Comments: Accepted at Neural Information Processing Systems (NeurIPS) 2022. V2 releases code

    MSC Class: 68W40; 68W15; 90C25; 90C06 ACM Class: G.1.6; F.2.1; E.4

  13. arXiv:2207.04557  [pdf, other]

    cs.GT cs.CY cs.DC cs.LG econ.TH

    Mechanisms that Incentivize Data Sharing in Federated Learning

    Authors: Sai Praneeth Karimireddy, Wenshuo Guo, Michael I. Jordan

    Abstract: Federated learning is typically considered a beneficial technology which allows multiple agents to collaborate with each other, improve the accuracy of their models, and solve problems which are otherwise too data-intensive / expensive to be solved individually. However, under the expectation that other agents will share their data, rational agents may be tempted to engage in detrimental behavior…

    Submitted 10 July, 2022; originally announced July 2022.

  14. arXiv:2206.00395  [pdf, other]

    cs.LG math.OC

    Optimization with Access to Auxiliary Information

    Authors: El Mahdi Chayti, Sai Praneeth Karimireddy

    Abstract: We investigate the fundamental optimization question of minimizing a target function $f$, whose gradients are expensive to compute or have limited availability, given access to some auxiliary side function $h$ whose gradients are cheap or more available. This formulation captures many settings of practical relevance, such as i) re-using batches in SGD, ii) transfer learning, iii) federated learnin…

    Submitted 24 February, 2024; v1 submitted 1 June, 2022; originally announced June 2022.

    Comments: Published at TMLR

  15. arXiv:2205.11518  [pdf, other]

    cs.CR cs.AI cs.LG

    LIA: Privacy-Preserving Data Quality Evaluation in Federated Learning Using a Lazy Influence Approximation

    Authors: Ljubomir Rokvic, Panayiotis Danassis, Sai Praneeth Karimireddy, Boi Faltings

    Abstract: In Federated Learning, it is crucial to handle low-quality, corrupted, or malicious data. However, traditional data valuation methods are not suitable due to privacy concerns. To address this, we propose a simple yet effective approach that utilizes a new influence approximation called "lazy influence" to filter and score data while preserving privacy. To do this, each participant uses their own d…

    Submitted 30 May, 2024; v1 submitted 23 May, 2022; originally announced May 2022.

    Comments: A preliminary version of this work received the Best Paper Award at the International Workshop on Trustworthy Federated Learning at IJCAI (FL-IJCAI) 2023

  16. arXiv:2202.04414  [pdf, other]

    cs.LG

    Agree to Disagree: Diversity through Disagreement for Better Transferability

    Authors: Matteo Pagliardini, Martin Jaggi, François Fleuret, Sai Praneeth Karimireddy

    Abstract: Gradient-based learning algorithms have an implicit simplicity bias which in effect can limit the diversity of predictors being sampled by the learning procedure. This behavior can hinder the transferability of trained models by (i) favoring the learning of simpler but spurious features -- present in the training data but absent from the test data -- and (ii) by only leveraging a small subset of p…

    Submitted 23 November, 2022; v1 submitted 9 February, 2022; originally announced February 2022.

    Comments: 23 pages, 17 figures

  17. arXiv:2202.01545  [pdf, other]

    cs.LG cs.DC math.OC stat.ML

    Byzantine-Robust Decentralized Learning via ClippedGossip

    Authors: Lie He, Sai Praneeth Karimireddy, Martin Jaggi

    Abstract: In this paper, we study the challenging task of Byzantine-robust decentralized training on arbitrary communication graphs. Unlike federated learning where workers communicate through a server, workers in the decentralized environment can only talk to their neighbors, making it harder to reach consensus and benefit from collaborative training. To address these issues, we propose a ClippedGossip alg…

    Submitted 20 April, 2023; v1 submitted 3 February, 2022; originally announced February 2022.

  18. arXiv:2111.05968  [pdf, other]

    cs.LG

    Linear Speedup in Personalized Collaborative Learning

    Authors: El Mahdi Chayti, Sai Praneeth Karimireddy, Sebastian U. Stich, Nicolas Flammarion, Martin Jaggi

    Abstract: Collaborative training can improve the accuracy of a model for a user by trading off the model's bias (introduced by using data from other users who are potentially different) against its variance (due to the limited amount of data on any single user). In this work, we formalize the personalized collaborative learning problem as a stochastic optimization of a task 0 while giving access to N relate…

    Submitted 22 June, 2022; v1 submitted 10 November, 2021; originally announced November 2021.

  19. arXiv:2110.15210  [pdf, other]

    cs.LG cs.DC math.OC stat.ML

    Towards Model Agnostic Federated Learning Using Knowledge Distillation

    Authors: Andrei Afonin, Sai Praneeth Karimireddy

    Abstract: Is it possible to design a universal API for federated learning using which an ad-hoc group of data-holders (agents) collaborate with each other and perform federated learning? Such an API would necessarily need to be model-agnostic i.e. make no assumption about the model architecture being used by the agents, and also cannot rely on having representative public data at hand. Knowledge distillati…

    Submitted 10 May, 2022; v1 submitted 28 October, 2021; originally announced October 2021.

    Comments: Published at ICLR 2022

    MSC Class: 68W40; 68W15; 90C25; 90C06 ACM Class: G.1.6; F.2.1; E.4

  20. arXiv:2110.12946  [pdf, other]

    cs.LG cs.IR stat.ML

    Optimal Model Averaging: Towards Personalized Collaborative Learning

    Authors: Felix Grimberg, Mary-Anne Hartley, Sai P. Karimireddy, Martin Jaggi

    Abstract: In federated learning, differences in the data or objectives between the participating nodes motivate approaches to train a personalized machine learning model for each node. One such approach is weighted averaging between a locally trained model and the global model. In this theoretical work, we study weighted model averaging for arbitrary scalar mean estimation problems under minimal assumptions…

    Submitted 25 October, 2021; originally announced October 2021.

    Comments: 9 pages (12 pages incl. references and appendix), 1 figure, Best Paper at International Workshop on Federated Learning for User Privacy and Data Confidentiality in Conjunction with ICML 2021 (FL-ICML'21) ( https://web.archive.org/web/20210908135923/http://federated-learning.org/fl-icml-2021/ICML%202021%20Best%20Paper.pdf )

  21. arXiv:2110.04175  [pdf, other]

    cs.LG cs.DC math.OC stat.ML

    RelaySum for Decentralized Deep Learning on Heterogeneous Data

    Authors: Thijs Vogels, Lie He, Anastasia Koloskova, Tao Lin, Sai Praneeth Karimireddy, Sebastian U. Stich, Martin Jaggi

    Abstract: In decentralized machine learning, workers compute model updates on their local data. Because the workers only communicate with few neighbors without central coordination, these updates propagate progressively over the network. This paradigm enables distributed training on networks without all-to-all connectivity, helping to protect data privacy as well as to reduce the communication cost of distr…

    Submitted 31 January, 2022; v1 submitted 8 October, 2021; originally announced October 2021.

    Comments: Presented at NeurIPS 2021

    Journal ref: Advances in Neural Information Processing Systems 34, 2021

  22. arXiv:2107.06917  [pdf, other]

    cs.LG

    A Field Guide to Federated Optimization

    Authors: Jianyu Wang, Zachary Charles, Zheng Xu, Gauri Joshi, H. Brendan McMahan, Blaise Aguera y Arcas, Maruan Al-Shedivat, Galen Andrew, Salman Avestimehr, Katharine Daly, Deepesh Data, Suhas Diggavi, Hubert Eichner, Advait Gadhikar, Zachary Garrett, Antonious M. Girgis, Filip Hanzely, Andrew Hard, Chaoyang He, Samuel Horvath, Zhouyuan Huo, Alex Ingerman, Martin Jaggi, Tara Javidi, Peter Kairouz, et al. (28 additional authors not shown)

    Abstract: Federated learning and analytics are a distributed approach for collaboratively learning models (or statistics) from decentralized data, motivated by and designed for privacy protection. The distributed learning process can be formulated as solving federated optimization problems, which emphasize communication efficiency, data heterogeneity, compatibility with privacy and system requirements, and…

    Submitted 14 July, 2021; originally announced July 2021.

  23. arXiv:2102.04761  [pdf, other]

    cs.LG

    Quasi-Global Momentum: Accelerating Decentralized Deep Learning on Heterogeneous Data

    Authors: Tao Lin, Sai Praneeth Karimireddy, Sebastian U. Stich, Martin Jaggi

    Abstract: Decentralized training of deep learning models is a key element for enabling data privacy and on-device learning over networks. In realistic learning scenarios, the presence of heterogeneity across different clients' local datasets poses an optimization challenge and may severely deteriorate the generalization performance. In this paper, we investigate and identify the limitation of several decent…

    Submitted 18 June, 2021; v1 submitted 9 February, 2021; originally announced February 2021.

  24. arXiv:2012.10333  [pdf, other]

    cs.LG cs.DC math.OC stat.ML

    Learning from History for Byzantine Robust Optimization

    Authors: Sai Praneeth Karimireddy, Lie He, Martin Jaggi

    Abstract: Byzantine robustness has received significant attention recently given its importance for distributed and federated learning. In spite of this, we identify severe flaws in existing algorithms even when the data across the participants is identically distributed. First, we show realistic examples where current state of the art robust aggregation rules fail to converge even in the absence of any Byz…

    Submitted 29 June, 2021; v1 submitted 18 December, 2020; originally announced December 2020.

    Comments: ICML 2021. v2 contains stronger theory; v3 fixes some errors in the proof

    ACM Class: I.2.6; I.5.1

  25. arXiv:2008.03606  [pdf, other]

    cs.LG cs.DC math.OC stat.ML

    Mime: Mimicking Centralized Stochastic Algorithms in Federated Learning

    Authors: Sai Praneeth Karimireddy, Martin Jaggi, Satyen Kale, Mehryar Mohri, Sashank J. Reddi, Sebastian U. Stich, Ananda Theertha Suresh

    Abstract: Federated learning (FL) is a challenging setting for optimization due to the heterogeneity of the data across different clients which gives rise to the client drift phenomenon. In fact, obtaining an algorithm for FL which is uniformly better than simple centralized training has been a major open problem thus far. In this work, we propose a general algorithmic framework, Mime, which i) mitigates cl…

    Submitted 8 June, 2021; v1 submitted 8 August, 2020; originally announced August 2020.

    Comments: Version 2 provides stronger theoretical results and more thorough experiments

    MSC Class: 68W40; 68W15; 90C25; 90C06 ACM Class: G.1.6; F.2.1; E.4

  26. arXiv:2008.01425  [pdf, other]

    cs.LG cs.DC math.OC stat.ML

    PowerGossip: Practical Low-Rank Communication Compression in Decentralized Deep Learning

    Authors: Thijs Vogels, Sai Praneeth Karimireddy, Martin Jaggi

    Abstract: Lossy gradient compression has become a practical tool to overcome the communication bottleneck in centrally coordinated distributed training of machine learning models. However, algorithms for decentralized training with compressed communication over arbitrary connected networks have been more complicated, requiring additional memory and hyperparameters. We introduce a simple algorithm that direc…

    Submitted 19 October, 2020; v1 submitted 4 August, 2020; originally announced August 2020.

    Comments: To appear in NeurIPS 2020

  27. arXiv:2006.09365  [pdf, other]

    cs.LG stat.ML

    Byzantine-Robust Learning on Heterogeneous Datasets via Bucketing

    Authors: Sai Praneeth Karimireddy, Lie He, Martin Jaggi

    Abstract: In Byzantine robust distributed or federated learning, a central server wants to train a machine learning model over data distributed across multiple workers. However, a fraction of these workers may deviate from the prescribed algorithm and send arbitrary messages. While this problem has received significant attention recently, most current defenses assume that the workers have identical data. Fo…

    Submitted 22 November, 2023; v1 submitted 16 June, 2020; originally announced June 2020.

    Comments: v5 is the camera-ready version of this paper on ICLR 2022

    ACM Class: I.2.6; I.5.1

  28. arXiv:2006.04747  [pdf, other]

    cs.LG cs.CR stat.ML

    Secure Byzantine-Robust Machine Learning

    Authors: Lie He, Sai Praneeth Karimireddy, Martin Jaggi

    Abstract: Increasingly, machine learning systems are being deployed to edge servers and devices (e.g. mobile phones) and trained in a collaborative manner. Such distributed/federated/decentralized training raises a number of concerns about the robustness, privacy, and security of the procedure. While extensive work has been done in tackling robustness, privacy, or security individually, their combinatio…

    Submitted 18 October, 2020; v1 submitted 8 June, 2020; originally announced June 2020.

  29. arXiv:1912.03194  [pdf, other]

    math.OC cs.LG

    Why are Adaptive Methods Good for Attention Models?

    Authors: Jingzhao Zhang, Sai Praneeth Karimireddy, Andreas Veit, Seungyeon Kim, Sashank J Reddi, Sanjiv Kumar, Suvrit Sra

    Abstract: While stochastic gradient descent (SGD) is still the de facto algorithm in deep learning, adaptive methods like Clipped SGD/Adam have been observed to outperform SGD across important tasks, such as attention models. The settings under which SGD performs poorly in comparison to adaptive methods are not well understood yet. In this paper, we provide empirical and theoretical evidence that a h…

    Submitted 23 October, 2020; v1 submitted 6 December, 2019; originally announced December 2019.

  30. arXiv:1910.06378  [pdf, other]

    cs.LG cs.DC math.OC stat.ML

    SCAFFOLD: Stochastic Controlled Averaging for Federated Learning

    Authors: Sai Praneeth Karimireddy, Satyen Kale, Mehryar Mohri, Sashank J. Reddi, Sebastian U. Stich, Ananda Theertha Suresh

    Abstract: Federated Averaging (FedAvg) has emerged as the algorithm of choice for federated learning due to its simplicity and low communication cost. However, in spite of recent research efforts, its performance is not fully understood. We obtain tight convergence rates for FedAvg and prove that it suffers from 'client-drift' when the data is heterogeneous (non-iid), resulting in unstable and slow converge…

    Submitted 9 April, 2021; v1 submitted 14 October, 2019; originally announced October 2019.

    Comments: v2 contains analysis of FedAvg, non-convex rates of Scaffold, and experimental evaluation. v3 fixes typos, ICML version. v4 slightly improves rate of SCAFFOLD for general convex functions

    MSC Class: 68W40; 68W15; 90C25; 90C06 ACM Class: G.1.6; F.2.1; E.4

  31. arXiv:1909.05350  [pdf, ps, other]

    cs.LG cs.DC math.OC stat.ML

    The Error-Feedback Framework: Better Rates for SGD with Delayed Gradients and Compressed Communication

    Authors: Sebastian U. Stich, Sai Praneeth Karimireddy

    Abstract: We analyze (stochastic) gradient descent (SGD) with delayed updates on smooth quasi-convex and non-convex functions and derive concise, non-asymptotic, convergence rates. We show that the rate of convergence in all cases consists of two terms: (i) a stochastic term which is not affected by the delay, and (ii) a higher order deterministic term which is only linearly slowed down by the delay. Thus,…

    Submitted 16 June, 2021; v1 submitted 11 September, 2019; originally announced September 2019.

    Comments: Submitted 9/19, Published 9/20

    MSC Class: 68W40; 68W15; 90C25; 90C06 ACM Class: G.1.6; F.2.1; E.4

    Journal ref: Journal of Machine Learning Research (JMLR), 21(237):1-36, 2020

  32. arXiv:1907.05156   

    cs.LG cs.CR stat.ML

    Amplifying Rényi Differential Privacy via Shuffling

    Authors: Eloïse Berthier, Sai Praneeth Karimireddy

    Abstract: Differential privacy is a useful tool to build machine learning models which do not release too much information about the training data. We study the Rényi differential privacy of stochastic gradient descent when each training example is sampled without replacement (also known as cyclic SGD). Cyclic SGD is typically faster than traditional SGD and is the algorithm of choice in large-scale impleme…

    Submitted 17 February, 2020; v1 submitted 11 July, 2019; originally announced July 2019.

    Comments: This version has incorrect proofs! We are currently working on fixing these

  33. arXiv:1905.13727  [pdf, other]

    cs.LG cs.DC math.OC stat.ML

    PowerSGD: Practical Low-Rank Gradient Compression for Distributed Optimization

    Authors: Thijs Vogels, Sai Praneeth Karimireddy, Martin Jaggi

    Abstract: We study gradient compression methods to alleviate the communication bottleneck in data-parallel distributed optimization. Despite the significant attention received, current compression schemes either do not scale well or fail to achieve the target test accuracy. We propose a new low-rank gradient compressor based on power iteration that can i) compress gradients rapidly, ii) efficiently aggregat…

    Submitted 18 February, 2020; v1 submitted 31 May, 2019; originally announced May 2019.

    Comments: Presented at NeurIPS 2019

    ACM Class: I.2.6; I.5.1

    Journal ref: NeurIPS 2019

  34. arXiv:1903.08708  [pdf, other]

    cs.LG stat.ML

    Accelerating Gradient Boosting Machine

    Authors: Haihao Lu, Sai Praneeth Karimireddy, Natalia Ponomareva, Vahab Mirrokni

    Abstract: Gradient Boosting Machine (GBM) is an extremely powerful supervised learning algorithm that is widely used in practice. GBM routinely features as a leading algorithm in machine learning competitions such as Kaggle and the KDDCup. In this work, we propose Accelerated Gradient Boosting Machine (AGBM) by incorporating Nesterov's acceleration techniques into the design of GBM. The difficulty in accele…

    Submitted 27 August, 2020; v1 submitted 20 March, 2019; originally announced March 2019.

  35. arXiv:1901.09847  [pdf, other]

    cs.LG math.OC stat.ML

    Error Feedback Fixes SignSGD and other Gradient Compression Schemes

    Authors: Sai Praneeth Karimireddy, Quentin Rebjock, Sebastian U. Stich, Martin Jaggi

    Abstract: Sign-based algorithms (e.g. signSGD) have been proposed as a biased gradient compression technique to alleviate the communication bottleneck in training large neural networks across multiple workers. We show simple convex counter-examples where signSGD does not converge to the optimum. Further, even when it does converge, signSGD may generalize poorly when compared with SGD. These issues arise bec…

    Submitted 29 May, 2019; v1 submitted 28 January, 2019; originally announced January 2019.

    Comments: ICML 2019 (long talk)

    ACM Class: I.2.6; I.5.1

  36. arXiv:1810.06999  [pdf, other]

    math.OC cs.LG stat.CO stat.ML

    Efficient Greedy Coordinate Descent for Composite Problems

    Authors: Sai Praneeth Karimireddy, Anastasia Koloskova, Sebastian U. Stich, Martin Jaggi

    Abstract: Coordinate descent with random coordinate selection is the current state of the art for many large scale optimization problems. However, greedy selection of the steepest coordinate on smooth problems can yield convergence rates independent of the dimension $n$, and requiring up to $n$ times fewer iterations. In this paper, we consider greedy updates that are based on subgradients for a class of n…

    Submitted 16 October, 2018; originally announced October 2018.

    Comments: 44 pages, 17 figures, 3 tables

    MSC Class: 90C25; 68Q25 ACM Class: G.1.6

  37. arXiv:1806.00413  [pdf, ps, other]

    cs.LG math.OC stat.ML

    Global linear convergence of Newton's method without strong-convexity or Lipschitz gradients

    Authors: Sai Praneeth Karimireddy, Sebastian U. Stich, Martin Jaggi

    Abstract: We show that Newton's method converges globally at a linear rate for objective functions whose Hessians are stable. This class of problems includes many functions which are not strongly convex, such as logistic regression. Our linear convergence result is (i) affine-invariant, and holds even if an (ii) approximate Hessian is used, and if the subproblems are (iii) only solved approximately. Thus we…

    Submitted 1 June, 2018; originally announced June 2018.

    Comments: 19 pages

    MSC Class: 90C25; 68Q25 ACM Class: G.1.6

  38. arXiv:1803.09539  [pdf, other]

    stat.ML cs.LG math.OC

    On Matching Pursuit and Coordinate Descent

    Authors: Francesco Locatello, Anant Raj, Sai Praneeth Karimireddy, Gunnar Rätsch, Bernhard Schölkopf, Sebastian U. Stich, Martin Jaggi

    Abstract: Two popular examples of first-order optimization methods over linear spaces are coordinate descent and matching pursuit algorithms, with their randomized variants. While the former targets the optimization by moving along coordinates, the latter considers a generalized notion of directions. Exploiting the connection between the two algorithms, we present a unified analysis of both, providing affin…

    Submitted 31 May, 2019; v1 submitted 26 March, 2018; originally announced March 2018.

    Journal ref: ICML 2018 - Proceedings of the 35th International Conference on Machine Learning