Showing 1–50 of 65 results for author: Smith, V

Searching in archive cs.
  1. arXiv:2407.02348  [pdf, other]

    cs.LG

    Revisiting Cascaded Ensembles for Efficient Inference

    Authors: Steven Kolawole, Don Dennis, Ameet Talwalkar, Virginia Smith

    Abstract: A common approach to make machine learning inference more efficient is to use example-specific adaptive schemes, which route or select models for each example at inference time. In this work we study a simple scheme for adaptive inference. We build a cascade of ensembles (CoE), beginning with resource-efficient models and growing to larger, more expressive models, where ensemble agreement serves a… ▽ More

    Submitted 2 July, 2024; originally announced July 2024.

    Comments: ES-FOMO, ICML 2024

  2. arXiv:2406.17660  [pdf, other]

    cs.LG

    Grass: Compute Efficient Low-Memory LLM Training with Structured Sparse Gradients

    Authors: Aashiq Muhamed, Oscar Li, David Woodruff, Mona Diab, Virginia Smith

    Abstract: Large language model (LLM) training and finetuning are often bottlenecked by limited GPU memory. While existing projection-based optimization methods address this by projecting gradients into a lower-dimensional subspace to reduce optimizer state memory, they typically rely on dense projection matrices, which can introduce computational and memory overheads. In this work, we propose Grass (GRAdien… ▽ More

    Submitted 25 June, 2024; originally announced June 2024.

  3. arXiv:2406.14532  [pdf, other]

    cs.LG cs.CL

    RL on Incorrect Synthetic Data Scales the Efficiency of LLM Math Reasoning by Eight-Fold

    Authors: Amrith Setlur, Saurabh Garg, Xinyang Geng, Naman Garg, Virginia Smith, Aviral Kumar

    Abstract: Training on model-generated synthetic data is a promising approach for finetuning LLMs, but it remains unclear when it helps or hurts. In this paper, we investigate this question for math reasoning via an empirical study, followed by building a conceptual understanding of our observations. First, we find that while the typical approach of finetuning a model on synthetic correct or positive problem… ▽ More

    Submitted 20 June, 2024; originally announced June 2024.

  4. arXiv:2406.13356  [pdf, other]

    cs.LG

    Jogging the Memory of Unlearned Model Through Targeted Relearning Attack

    Authors: Shengyuan Hu, Yiwei Fu, Zhiwei Steven Wu, Virginia Smith

    Abstract: Machine unlearning is a promising approach to mitigate undesirable memorization of training data in ML models. However, in this work we show that existing approaches for unlearning in LLMs are surprisingly susceptible to a simple set of targeted relearning attacks. With access to only a small and potentially loosely related set of data, we find that we can 'jog' the memory of unlearned models to r… ▽ More

    Submitted 19 June, 2024; originally announced June 2024.

    Comments: 17 pages, 8 figures, 12 tables

  5. arXiv:2406.05233  [pdf, other]

    cs.LG cs.DC

    Federated LoRA with Sparse Communication

    Authors: Kevin Kuo, Arian Raje, Kousik Rajesh, Virginia Smith

    Abstract: Low-rank adaptation (LoRA) is a natural method for finetuning in communication-constrained machine learning settings such as cross-device federated learning. Prior work that has studied LoRA in the context of federated learning has focused on improving LoRA's robustness to heterogeneity and privacy. In this work, we instead consider techniques for further improving communication-efficiency in fede… ▽ More

    Submitted 7 June, 2024; originally announced June 2024.

    Comments: 12 pages (excluding references), 8 figures

  6. arXiv:2403.05598  [pdf, other]

    cs.CR cs.LG

    Privacy Amplification for the Gaussian Mechanism via Bounded Support

    Authors: Shengyuan Hu, Saeed Mahloujifar, Virginia Smith, Kamalika Chaudhuri, Chuan Guo

    Abstract: Data-dependent privacy accounting frameworks such as per-instance differential privacy (pDP) and Fisher information loss (FIL) confer fine-grained privacy guarantees for individuals in a fixed training dataset. These guarantees can be desirable compared to vanilla DP in real world settings as they tightly upper-bound the privacy leakage for a $\textit{specific}$ individual in an $\textit{actual}$… ▽ More

    Submitted 7 March, 2024; originally announced March 2024.

    Comments: 23 pages, 4 figures

  7. arXiv:2403.04099  [pdf, other]

    cs.LG

    Many-Objective Multi-Solution Transport

    Authors: Ziyue Li, Tian Li, Virginia Smith, Jeff Bilmes, Tianyi Zhou

    Abstract: Optimizing the performance of many objectives (instantiated by tasks or clients) jointly with a few Pareto stationary solutions (models) is critical in machine learning. However, previous multi-objective optimization methods often focus on a small number of objectives and cannot scale to many objectives that outnumber the solutions, leading to either subpar performance or ignored objectives. We intr… ▽ More

    Submitted 6 March, 2024; originally announced March 2024.

  8. arXiv:2403.03329  [pdf, other]

    cs.CL

    Guardrail Baselines for Unlearning in LLMs

    Authors: Pratiksha Thaker, Yash Maurya, Shengyuan Hu, Zhiwei Steven Wu, Virginia Smith

    Abstract: Recent work has demonstrated that finetuning is a promising approach to 'unlearn' concepts from large language models. However, finetuning can be expensive, as it requires both generating a set of examples and running iterations of finetuning to update the model. In this work, we show that simple guardrail-based approaches such as prompting and filtering can achieve unlearning results comparable t… ▽ More

    Submitted 11 June, 2024; v1 submitted 5 March, 2024; originally announced March 2024.

    Comments: Preliminary work, accepted to ICLR workshop SeT-LLM 2024

  9. arXiv:2402.16187  [pdf, other]

    cs.CR cs.CL cs.LG

    No Free Lunch in LLM Watermarking: Trade-offs in Watermarking Design Choices

    Authors: Qi Pang, Shengyuan Hu, Wenting Zheng, Virginia Smith

    Abstract: Advances in generative models have made it possible for AI-generated text, code, and images to mirror human-generated content in many applications. Watermarking, a technique that aims to embed information in the output of a model to verify its source, is useful for mitigating the misuse of such AI-generated content. However, we show that common design choices in LLM watermarking schemes make the r… ▽ More

    Submitted 25 May, 2024; v1 submitted 25 February, 2024; originally announced February 2024.

  10. arXiv:2402.05406  [pdf, other]

    cs.LG cs.CL

    Everybody Prune Now: Structured Pruning of LLMs with only Forward Passes

    Authors: Lucio Dery, Steven Kolawole, Jean-François Kagy, Virginia Smith, Graham Neubig, Ameet Talwalkar

    Abstract: Given the generational gap in available hardware between lay practitioners and the most endowed institutions, LLMs are becoming increasingly inaccessible as they grow in size. Whilst many approaches have been proposed to compress LLMs to make their resource consumption manageable, these methods themselves tend to be resource intensive, putting them out of the reach of the very user groups they tar… ▽ More

    Submitted 9 February, 2024; v1 submitted 7 February, 2024; originally announced February 2024.

    Comments: 15 pages, 4 figures, 15 tables

  11. arXiv:2312.15551  [pdf, other]

    cs.LG cs.CR stat.ML

    On the Benefits of Public Representations for Private Transfer Learning under Distribution Shift

    Authors: Pratiksha Thaker, Amrith Setlur, Zhiwei Steven Wu, Virginia Smith

    Abstract: Public pretraining is a promising approach to improve differentially private model training. However, recent work has noted that many positive research results studying this paradigm only consider in-distribution tasks, and may not apply to settings where there is distribution shift between the pretraining and finetuning data -- a scenario that is likely when finetuning private tasks due to the se… ▽ More

    Submitted 11 June, 2024; v1 submitted 24 December, 2023; originally announced December 2023.

  12. arXiv:2312.03318  [pdf, other]

    cs.LG cs.CV stat.ML

    Complementary Benefits of Contrastive Learning and Self-Training Under Distribution Shift

    Authors: Saurabh Garg, Amrith Setlur, Zachary Chase Lipton, Sivaraman Balakrishnan, Virginia Smith, Aditi Raghunathan

    Abstract: Self-training and contrastive learning have emerged as leading techniques for incorporating unlabeled data, both under distribution shift (unsupervised domain adaptation) and when it is absent (semi-supervised learning). However, despite the popularity and compatibility of these techniques, their efficacy in combination remains unexplored. In this paper, we undertake a systematic empirical investi… ▽ More

    Submitted 6 December, 2023; originally announced December 2023.

    Comments: NeurIPS 2023

  13. arXiv:2310.01424  [pdf, ps, other]

    cs.CL cs.AI

    Identifying and Mitigating Privacy Risks Stemming from Language Models: A Survey

    Authors: Victoria Smith, Ali Shahin Shamsabadi, Carolyn Ashurst, Adrian Weller

    Abstract: Large Language Models (LLMs) have shown greatly enhanced performance in recent years, attributed to increased size and extensive training data. This advancement has led to widespread interest and adoption across industries and the public. However, training data memorization in Machine Learning models scales with model size, particularly concerning for LLMs. Memorized text sequences have the potent… ▽ More

    Submitted 18 June, 2024; v1 submitted 27 September, 2023; originally announced October 2023.

    Comments: 15 pages

  14. arXiv:2304.12180  [pdf, other]

    cs.NE cs.AI cs.LG

    Variance-Reduced Gradient Estimation via Noise-Reuse in Online Evolution Strategies

    Authors: Oscar Li, James Harrison, Jascha Sohl-Dickstein, Virginia Smith, Luke Metz

    Abstract: Unrolled computation graphs are prevalent throughout machine learning but present challenges to automatic differentiation (AD) gradient estimation methods when their loss functions exhibit extreme local sensitivity, discontinuity, or blackbox characteristics. In such scenarios, online evolution strategies methods are a more capable alternative, while being more parallelizable than vanilla evolutio… ▽ More

    Submitted 9 December, 2023; v1 submitted 21 April, 2023; originally announced April 2023.

    Comments: NeurIPS 2023. 41 pages. Code available at https://github.com/OscarcarLi/Noise-Reuse-Evolution-Strategies

  15. arXiv:2302.10093  [pdf, other]

    cs.LG

    Progressive Ensemble Distillation: Building Ensembles for Efficient Inference

    Authors: Don Kurian Dennis, Abhishek Shetty, Anish Sevekari, Kazuhito Koishida, Virginia Smith

    Abstract: We study the problem of progressive ensemble distillation: Given a large, pretrained teacher model $g$, we seek to decompose the model into smaller, low-inference cost student models $f_i$, such that progressively evaluating additional models in this ensemble leads to improved predictions. The resulting ensemble allows for flexibly tuning accuracy vs. inference cost at runtime, which is useful for… ▽ More

    Submitted 9 November, 2023; v1 submitted 20 February, 2023; originally announced February 2023.

  16. arXiv:2302.08533  [pdf, other]

    cs.LG cs.DC

    Federated Learning as a Network Effects Game

    Authors: Shengyuan Hu, Dung Daniel Ngo, Shuran Zheng, Virginia Smith, Zhiwei Steven Wu

    Abstract: Federated Learning (FL) aims to foster collaboration among a population of clients to improve the accuracy of machine learning without directly sharing local data. Although there has been rich literature on designing federated learning algorithms, most prior works implicitly assume that all clients are willing to participate in a FL scheme. In practice, clients may not benefit from joining in FL,… ▽ More

    Submitted 16 February, 2023; originally announced February 2023.

    Comments: 14 pages of main text, 26 pages in total

  17. arXiv:2302.02931  [pdf, other]

    cs.LG

    Bitrate-Constrained DRO: Beyond Worst Case Robustness To Unknown Group Shifts

    Authors: Amrith Setlur, Don Dennis, Benjamin Eysenbach, Aditi Raghunathan, Chelsea Finn, Virginia Smith, Sergey Levine

    Abstract: Training machine learning models robust to distribution shifts is critical for real-world applications. Some robust training algorithms (e.g., Group DRO) specialize to group shifts and require group information on all training points. Other methods (e.g., CVaR DRO) that do not need group annotations can be overly conservative, since they naively upweight high loss points which may form a contrived… ▽ More

    Submitted 11 October, 2023; v1 submitted 6 February, 2023; originally announced February 2023.

    Journal ref: ICLR 2023

  18. arXiv:2212.08930  [pdf, other]

    cs.LG

    On Noisy Evaluation in Federated Hyperparameter Tuning

    Authors: Kevin Kuo, Pratiksha Thaker, Mikhail Khodak, John Nguyen, Daniel Jiang, Ameet Talwalkar, Virginia Smith

    Abstract: Hyperparameter tuning is critical to the success of federated learning applications. Unfortunately, appropriately selecting hyperparameters is challenging in federated networks. Issues of scale, privacy, and heterogeneity introduce noise in the tuning process and make it difficult to evaluate the performance of various hyperparameters. In this work, we perform the first systematic study on the eff… ▽ More

    Submitted 15 May, 2023; v1 submitted 17 December, 2022; originally announced December 2022.

    Comments: v1: 19 pages, 15 figures, submitted to MLSys2023; v2: Fixed citation formatting; v3: Fixed typo, update acks; v4: MLSys2023 camera-ready

  19. arXiv:2212.00309  [pdf, other]

    cs.LG cs.CR

    Differentially Private Adaptive Optimization with Delayed Preconditioners

    Authors: Tian Li, Manzil Zaheer, Ken Ziyu Liu, Sashank J. Reddi, H. Brendan McMahan, Virginia Smith

    Abstract: Privacy noise may negate the benefits of using adaptive optimizers in differentially private model training. Prior works typically address this issue by using auxiliary information (e.g., public data) to boost the effectiveness of adaptive optimization. In this work, we explore techniques to estimate and efficiently adapt to gradient geometry in private adaptive optimization without auxiliary data… ▽ More

    Submitted 7 June, 2023; v1 submitted 1 December, 2022; originally announced December 2022.

    Comments: Accepted by ICLR 2023

  20. arXiv:2211.15458  [pdf, other]

    cs.LG cs.CL

    Validating Large Language Models with ReLM

    Authors: Michael Kuchnik, Virginia Smith, George Amvrosiadis

    Abstract: Although large language models (LLMs) have been touted for their ability to generate natural-sounding text, there are growing concerns around possible negative effects of LLMs such as data memorization, bias, and inappropriate language. Unfortunately, the complexity and generation capacities of LLMs make validating (and correcting) such concerns difficult. In this work, we introduce ReLM, a system… ▽ More

    Submitted 8 May, 2023; v1 submitted 21 November, 2022; originally announced November 2022.

  21. arXiv:2208.00467  [pdf, other]

    cs.CV cs.LG

    COCOA: Cross Modality Contrastive Learning for Sensor Data

    Authors: Shohreh Deldari, Hao Xue, Aaqib Saeed, Daniel V. Smith, Flora D. Salim

    Abstract: Self-Supervised Learning (SSL) is a new paradigm for learning discriminative representations without labelled data and has reached comparable or even state-of-the-art results in comparison to supervised counterparts. Contrastive Learning (CL) is one of the most well-known approaches in SSL that attempts to learn general, informative representations of data. CL methods have been mostly developed fo… ▽ More

    Submitted 3 August, 2022; v1 submitted 31 July, 2022; originally announced August 2022.

    Comments: 27 pages, 10 figures, 6 tables, Accepted with minor revision at IMWUT Vol. 6 No. 3

  22. arXiv:2206.09262  [pdf, other]

    cs.LG cs.DC

    Motley: Benchmarking Heterogeneity and Personalization in Federated Learning

    Authors: Shanshan Wu, Tian Li, Zachary Charles, Yu Xiao, Ziyu Liu, Zheng Xu, Virginia Smith

    Abstract: Personalized federated learning considers learning models unique to each client in a heterogeneous network. The resulting client-specific models have been purported to improve metrics such as accuracy, fairness, and robustness in federated networks. However, despite a plethora of work in this area, it remains unclear: (1) which personalization techniques are most effective in various settings, and… ▽ More

    Submitted 26 September, 2022; v1 submitted 18 June, 2022; originally announced June 2022.

    Comments: 40 pages, 10 figures, 7 tables. EMNIST and Landmarks fine-tuning results are corrected in (and after) v5. Code: https://github.com/google-research/federated/tree/master/personalization_benchmark

  23. arXiv:2206.07902  [pdf, other]

    cs.LG cs.CR stat.ML

    On Privacy and Personalization in Cross-Silo Federated Learning

    Authors: Ziyu Liu, Shengyuan Hu, Zhiwei Steven Wu, Virginia Smith

    Abstract: While the application of differential privacy (DP) has been well-studied in cross-device federated learning (FL), there is a lack of work considering DP and its implications for cross-silo FL, a setting characterized by a limited number of clients each containing many data subjects. In cross-silo FL, usual notions of client-level DP are less suitable as real-world privacy regulations typically con… ▽ More

    Submitted 17 October, 2022; v1 submitted 15 June, 2022; originally announced June 2022.

    Comments: NeurIPS 2022, 37 pages

  24. arXiv:2206.02353  [pdf, other]

    cs.LG cs.CV

    Beyond Just Vision: A Review on Self-Supervised Representation Learning on Multimodal and Temporal Data

    Authors: Shohreh Deldari, Hao Xue, Aaqib Saeed, Jiayuan He, Daniel V. Smith, Flora D. Salim

    Abstract: Recently, Self-Supervised Representation Learning (SSRL) has attracted much attention in the field of computer vision, speech, natural language processing (NLP), and recently, with other types of modalities, including time series from sensors. The popularity of self-supervised learning is driven by the fact that traditional models typically require a huge amount of well-annotated data for training… ▽ More

    Submitted 7 June, 2022; v1 submitted 6 June, 2022; originally announced June 2022.

    Comments: 36 pages, 5 figures, 9 tables, Survey paper

  25. arXiv:2206.01367  [pdf, other]

    cs.LG cs.CR

    Adversarial Unlearning: Reducing Confidence Along Adversarial Directions

    Authors: Amrith Setlur, Benjamin Eysenbach, Virginia Smith, Sergey Levine

    Abstract: Supervised learning methods trained with maximum likelihood objectives often overfit on training data. Most regularizers that prevent overfitting look to increase confidence on additional examples (e.g., data augmentation, adversarial training), or reduce it on training data (e.g., label smoothing). In this work we propose a complementary regularization strategy that reduces confidence on self-gen… ▽ More

    Submitted 2 June, 2022; originally announced June 2022.

  26. arXiv:2205.14840  [pdf, other]

    cs.LG

    Maximizing Global Model Appeal in Federated Learning

    Authors: Yae Jee Cho, Divyansh Jhunjhunwala, Tian Li, Virginia Smith, Gauri Joshi

    Abstract: Federated learning typically considers collaboratively training a global model using local data at edge clients. Clients may have their own individual requirements, such as having a minimal training loss threshold, which they expect to be met by the global model. However, due to client heterogeneity, the global model may not meet each client's requirements, and only a small subset may find the glo… ▽ More

    Submitted 4 February, 2023; v1 submitted 30 May, 2022; originally announced May 2022.

  27. arXiv:2203.10190  [pdf, other]

    cs.LG cs.CY

    Fair Federated Learning via Bounded Group Loss

    Authors: Shengyuan Hu, Zhiwei Steven Wu, Virginia Smith

    Abstract: Fair prediction across protected groups is an important constraint for many federated learning applications. However, prior work studying group fair federated learning lacks formal convergence or fairness guarantees. In this work we propose a general framework for provably fair federated learning. In particular, we explore and extend the notion of Bounded Group Loss as a theoretically-grounded app… ▽ More

    Submitted 12 October, 2022; v1 submitted 18 March, 2022; originally announced March 2022.

    Comments: 19 pages

  28. arXiv:2202.05963  [pdf, other]

    cs.LG cs.CR stat.ML

    Private Adaptive Optimization with Side Information

    Authors: Tian Li, Manzil Zaheer, Sashank J. Reddi, Virginia Smith

    Abstract: Adaptive optimization methods have become the default solvers for many machine learning tasks. Unfortunately, the benefits of adaptivity may degrade when training with differential privacy, as the noise added to ensure privacy reduces the effectiveness of the adaptive preconditioner. To this end, we propose AdaDPS, a general framework that uses non-sensitive side information to precondition the gr… ▽ More

    Submitted 24 June, 2022; v1 submitted 11 February, 2022; originally announced February 2022.

    Comments: ICML 2022

  29. arXiv:2111.04131  [pdf, other]

    cs.LG cs.PF

    Plumber: Diagnosing and Removing Performance Bottlenecks in Machine Learning Data Pipelines

    Authors: Michael Kuchnik, Ana Klimovic, Jiri Simsa, Virginia Smith, George Amvrosiadis

    Abstract: Input pipelines, which ingest and transform input data, are an essential part of training Machine Learning (ML) models. However, it is challenging to implement efficient input pipelines, as it requires reasoning about parallelism, asynchrony, and variability in fine-grained profiling information. Our analysis of over two million ML jobs in Google datacenters reveals that a significant fraction of… ▽ More

    Submitted 21 March, 2022; v1 submitted 7 November, 2021; originally announced November 2021.

  30. arXiv:2109.06141  [pdf, other]

    cs.LG cs.IT math.OC stat.ML

    On Tilted Losses in Machine Learning: Theory and Applications

    Authors: Tian Li, Ahmad Beirami, Maziar Sanjabi, Virginia Smith

    Abstract: Exponential tilting is a technique commonly used in fields such as statistics, probability, information theory, and optimization to create parametric distribution shifts. Despite its prevalence in related fields, tilting has not seen widespread use in machine learning. In this work, we aim to bridge this gap by exploring the use of tilting in risk minimization. We study a simple extension to ERM -… ▽ More

    Submitted 1 June, 2023; v1 submitted 13 September, 2021; originally announced September 2021.

    Comments: arXiv admin note: substantial text overlap with arXiv:2007.01162

  31. arXiv:2108.12978  [pdf, other]

    cs.LG cs.CR

    Private Multi-Task Learning: Formulation and Applications to Federated Learning

    Authors: Shengyuan Hu, Zhiwei Steven Wu, Virginia Smith

    Abstract: Many problems in machine learning rely on multi-task learning (MTL), in which the goal is to solve multiple related machine learning tasks simultaneously. MTL is particularly relevant for privacy-sensitive applications in areas such as healthcare, finance, and IoT computing, where sensitive data from multiple, varied sources are shared for the purpose of learning. In this work, we formalize notion… ▽ More

    Submitted 17 October, 2023; v1 submitted 29 August, 2021; originally announced August 2021.

    Comments: Accepted to TMLR. Transactions on Machine Learning Research (2022)

  32. arXiv:2107.06917  [pdf, other]

    cs.LG

    A Field Guide to Federated Optimization

    Authors: Jianyu Wang, Zachary Charles, Zheng Xu, Gauri Joshi, H. Brendan McMahan, Blaise Aguera y Arcas, Maruan Al-Shedivat, Galen Andrew, Salman Avestimehr, Katharine Daly, Deepesh Data, Suhas Diggavi, Hubert Eichner, Advait Gadhikar, Zachary Garrett, Antonious M. Girgis, Filip Hanzely, Andrew Hard, Chaoyang He, Samuel Horvath, Zhouyuan Huo, Alex Ingerman, Martin Jaggi, Tara Javidi, Peter Kairouz, et al. (28 additional authors not shown)

    Abstract: Federated learning and analytics are a distributed approach for collaboratively learning models (or statistics) from decentralized data, motivated by and designed for privacy protection. The distributed learning process can be formulated as solving federated optimization problems, which emphasize communication efficiency, data heterogeneity, compatibility with privacy and system requirements, and… ▽ More

    Submitted 14 July, 2021; originally announced July 2021.

  33. arXiv:2106.07820  [pdf, other]

    cs.LG cs.DC

    On Large-Cohort Training for Federated Learning

    Authors: Zachary Charles, Zachary Garrett, Zhouyuan Huo, Sergei Shmulyian, Virginia Smith

    Abstract: Federated learning methods typically learn a model by iteratively sampling updates from a population of clients. In this work, we explore how the number of clients sampled at each round (the cohort size) impacts the quality of the learned model and the training dynamics of federated learning algorithms. Our work poses three fundamental questions. First, what challenges arise when trying to scale f… ▽ More

    Submitted 14 June, 2021; originally announced June 2021.

  34. arXiv:2106.04502  [pdf, other]

    cs.LG cs.AI cs.DC stat.ML

    Federated Hyperparameter Tuning: Challenges, Baselines, and Connections to Weight-Sharing

    Authors: Mikhail Khodak, Renbo Tu, Tian Li, Liam Li, Maria-Florina Balcan, Virginia Smith, Ameet Talwalkar

    Abstract: Tuning hyperparameters is a crucial but arduous part of the machine learning pipeline. Hyperparameter optimization is even more challenging in federated learning, where models are learned over a distributed network of heterogeneous devices; here, the need to keep data on device and perform local training makes it difficult to efficiently train and evaluate configurations. In this work, we investig… ▽ More

    Submitted 4 November, 2021; v1 submitted 8 June, 2021; originally announced June 2021.

    Comments: NeurIPS 2021

  35. arXiv:2103.00697  [pdf, other]

    cs.LG

    Heterogeneity for the Win: One-Shot Federated Clustering

    Authors: Don Kurian Dennis, Tian Li, Virginia Smith

    Abstract: In this work, we explore the unique challenges -- and opportunities -- of unsupervised federated learning (FL). We develop and analyze a one-shot federated clustering scheme, $k$-FED, based on the widely-used Lloyd's method for $k$-means clustering. In contrast to many supervised problems, we show that the issue of statistical heterogeneity in federated networks can in fact benefit our analysis. W… ▽ More

    Submitted 5 October, 2021; v1 submitted 28 February, 2021; originally announced March 2021.

  36. arXiv:2102.11503  [pdf, other]

    cs.LG

    Two Sides of Meta-Learning Evaluation: In vs. Out of Distribution

    Authors: Amrith Setlur, Oscar Li, Virginia Smith

    Abstract: We categorize meta-learning evaluation into two settings: $\textit{in-distribution}$ [ID], in which the train and test tasks are sampled $\textit{iid}$ from the same underlying task distribution, and $\textit{out-of-distribution}$ [OOD], in which they are not. While most meta-learning theory and some FSL applications follow the ID setting, we identify that most existing few-shot classification ben… ▽ More

    Submitted 27 October, 2021; v1 submitted 23 February, 2021; originally announced February 2021.

  37. arXiv:2102.08504  [pdf, other]

    cs.LG cs.CR

    Label Leakage and Protection in Two-party Split Learning

    Authors: Oscar Li, Jiankai Sun, Xin Yang, Weihao Gao, Hongyi Zhang, Junyuan Xie, Virginia Smith, Chong Wang

    Abstract: Two-party split learning is a popular technique for learning a model across feature-partitioned data. In this work, we explore whether it is possible for one party to steal the private label information from the other party during split training, and whether there are methods that can protect against such attacks. Specifically, we first formulate a realistic threat model and propose a privacy loss… ▽ More

    Submitted 24 May, 2022; v1 submitted 16 February, 2021; originally announced February 2021.

    Comments: Accepted to ICLR 2022 (https://openreview.net/forum?id=cOtBRgsf2fO)

  38. arXiv:2012.04221  [pdf, other]

    cs.LG stat.ML

    Ditto: Fair and Robust Federated Learning Through Personalization

    Authors: Tian Li, Shengyuan Hu, Ahmad Beirami, Virginia Smith

    Abstract: Fairness and robustness are two important concerns for federated learning systems. In this work, we identify that robustness to data and model poisoning attacks and fairness, measured as the uniformity of performance across devices, are competing constraints in statistically heterogeneous networks. To address these constraints, we propose employing a simple, general framework for personalized fede… ▽ More

    Submitted 15 June, 2021; v1 submitted 8 December, 2020; originally announced December 2020.

    Comments: Accepted by ICML 2021

  39. arXiv:2011.14097  [pdf, other]

    cs.LG cs.AI cs.CV

    Time Series Change Point Detection with Self-Supervised Contrastive Predictive Coding

    Authors: Shohreh Deldari, Daniel V. Smith, Hao Xue, Flora D. Salim

    Abstract: Change Point Detection (CPD) methods identify the times associated with changes in the trends and properties of time series data in order to describe the underlying behaviour of the system. For instance, detecting the changes and anomalies associated with web service usage, application usage or human behaviour can provide valuable insights for downstream modelling tasks. We propose a novel approac… ▽ More

    Submitted 4 March, 2021; v1 submitted 28 November, 2020; originally announced November 2020.

    Comments: Accepted at The WEB Conference 2021 (WWW'21)

  40. arXiv:2011.14048  [pdf, other]

    cs.LG stat.ML

    Is Support Set Diversity Necessary for Meta-Learning?

    Authors: Amrith Setlur, Oscar Li, Virginia Smith

    Abstract: Meta-learning is a popular framework for learning with limited data in which an algorithm is produced by training over multiple few-shot learning tasks. For classification problems, these tasks are typically constructed by sampling a small number of support and query examples from a subset of the classes. While conventional wisdom is that task diversity should improve the performance of meta-learn… ▽ More

    Submitted 7 October, 2021; v1 submitted 27 November, 2020; originally announced November 2020.

    Journal ref: NeurIPS 2020 Workshop on Meta-learning

  41. arXiv:2008.03230  [pdf, other]

    cs.LG cs.CV cs.DB cs.IT eess.SP stat.ML

    ESPRESSO: Entropy and ShaPe awaRe timE-Series SegmentatiOn for processing heterogeneous sensor data

    Authors: Shohreh Deldari, Daniel V. Smith, Amin Sadri, Flora D. Salim

    Abstract: Extracting informative and meaningful temporal segments from high-dimensional wearable sensor data, smart devices, or IoT data is a vital preprocessing step in applications such as Human Activity Recognition (HAR), trajectory prediction, gesture recognition, and lifelogging. In this paper, we propose ESPRESSO (Entropy and ShaPe awaRe timE-Series SegmentatiOn), a hybrid segmentation model for multi… ▽ More

    Submitted 24 July, 2020; originally announced August 2020.

    Comments: 23 pages, 11 figures, accepted at IMWUT Volume(4) issue(3)

  42. arXiv:2007.01162  [pdf, other]

    cs.LG cs.IT stat.ML

    Tilted Empirical Risk Minimization

    Authors: Tian Li, Ahmad Beirami, Maziar Sanjabi, Virginia Smith

    Abstract: Empirical risk minimization (ERM) is typically designed to perform well on the average loss, which can result in estimators that are sensitive to outliers, generalize poorly, or treat subgroups unfairly. While many methods aim to address these problems individually, in this work, we explore them through a unified framework -- tilted empirical risk minimization (TERM). In particular, we show that i…

    Submitted 17 March, 2021; v1 submitted 2 July, 2020; originally announced July 2020.

    Comments: Accepted by ICLR 2021

  43. arXiv:2001.01920  [pdf, other]

    cs.LG stat.ML

    FedDANE: A Federated Newton-Type Method

    Authors: Tian Li, Anit Kumar Sahu, Manzil Zaheer, Maziar Sanjabi, Ameet Talwalkar, Virginia Smith

    Abstract: Federated learning aims to jointly learn statistical models over massively distributed remote devices. In this work, we propose FedDANE, an optimization method that we adapt from DANE, a method for classical distributed optimization, to handle the practical constraints of federated learning. We provide convergence guarantees for this method when learning over both convex and non-convex functions.…

    Submitted 7 January, 2020; originally announced January 2020.

    Comments: Asilomar Conference on Signals, Systems, and Computers 2019

  44. arXiv:1911.01812  [pdf, other]

    cs.LG cs.CR cs.NI stat.ML

    Enhancing the Privacy of Federated Learning with Sketching

    Authors: Zaoxing Liu, Tian Li, Virginia Smith, Vyas Sekar

    Abstract: In response to growing concerns about user privacy, federated learning has emerged as a promising tool to train statistical models over networks of devices while keeping data localized. Federated learning methods run training tasks directly on user devices and do not share the raw user data with third parties. However, current methods still share model updates, which may contain private informatio…

    Submitted 5 November, 2019; originally announced November 2019.

  45. arXiv:1911.00972  [pdf, other]

    cs.LG cs.CR stat.ML

    Privacy for Free: Communication-Efficient Learning with Differential Privacy Using Sketches

    Authors: Tian Li, Zaoxing Liu, Vyas Sekar, Virginia Smith

    Abstract: Communication and privacy are two critical concerns in distributed learning. Many existing works treat these concerns separately. In this work, we argue that a natural connection exists between methods for communication reduction and privacy preservation in the context of distributed machine learning. In particular, we prove that Count Sketch, a simple method for data stream summarization, has inh…

    Submitted 6 December, 2019; v1 submitted 3 November, 2019; originally announced November 2019.

  46. arXiv:1911.00472  [pdf, other]

    cs.LG stat.ML

    Progressive Compressed Records: Taking a Byte out of Deep Learning Data

    Authors: Michael Kuchnik, George Amvrosiadis, Virginia Smith

    Abstract: Deep learning accelerators efficiently train over vast and growing amounts of data, placing a newfound burden on commodity networks and storage devices. A common approach to conserve bandwidth involves resizing or compressing data prior to training. We introduce Progressive Compressed Records (PCRs), a data format that uses compression to reduce the overhead of fetching and transporting data, effe…

    Submitted 11 August, 2021; v1 submitted 1 November, 2019; originally announced November 2019.

  47. arXiv:1908.07873  [pdf, other]

    cs.LG cs.DC stat.ML

    Federated Learning: Challenges, Methods, and Future Directions

    Authors: Tian Li, Anit Kumar Sahu, Ameet Talwalkar, Virginia Smith

    Abstract: Federated learning involves training statistical models over remote devices or siloed data centers, such as mobile phones or hospitals, while keeping data localized. Training in heterogeneous and potentially massive networks introduces novel challenges that require a fundamental departure from standard approaches for large-scale machine learning, distributed optimization, and privacy-preserving da…

    Submitted 21 August, 2019; originally announced August 2019.

  48. arXiv:1907.11304  [pdf, ps, other]

    cs.CR

    On The Fly Diffie Hellman for IoT

    Authors: J. Díaz Arancibia, V. Ferrari Smith, J. López Fenner

    Abstract: The Internet of Things (IoT) is a fast growing field of devices being added to an interconnected environment in an abstract heterogeneous array of servers and other devices, called smart environments, ranging from private local (home) environments to nation-wide infrastructures, often accessible via unsecured wireless communications and information technologies, hence, massively open to attacks. I…

    Submitted 25 July, 2019; originally announced July 2019.

    Comments: 6 pages, 1 figure, 1 table, preprint

    MSC Class: 94A60; 94A62

  49. arXiv:1905.10497  [pdf, other]

    cs.LG stat.ML

    Fair Resource Allocation in Federated Learning

    Authors: Tian Li, Maziar Sanjabi, Ahmad Beirami, Virginia Smith

    Abstract: Federated learning involves training statistical models in massive, heterogeneous networks. Naively minimizing an aggregate loss function in such a network may disproportionately advantage or disadvantage some of the devices. In this work, we propose q-Fair Federated Learning (q-FFL), a novel optimization objective inspired by fair resource allocation in wireless networks that encourages a more fa…

    Submitted 14 February, 2020; v1 submitted 24 May, 2019; originally announced May 2019.

    Comments: ICLR 2020

  50. arXiv:1904.03257  [pdf, ps, other]

    cs.LG cs.DB cs.DC cs.SE stat.ML

    MLSys: The New Frontier of Machine Learning Systems

    Authors: Alexander Ratner, Dan Alistarh, Gustavo Alonso, David G. Andersen, Peter Bailis, Sarah Bird, Nicholas Carlini, Bryan Catanzaro, Jennifer Chayes, Eric Chung, Bill Dally, Jeff Dean, Inderjit S. Dhillon, Alexandros Dimakis, Pradeep Dubey, Charles Elkan, Grigori Fursin, Gregory R. Ganger, Lise Getoor, Phillip B. Gibbons, Garth A. Gibson, Joseph E. Gonzalez, Justin Gottschlich, Song Han, Kim Hazelwood, et al. (44 additional authors not shown)

    Abstract: Machine learning (ML) techniques are enjoying rapidly increasing adoption. However, designing and implementing the systems that support ML models in real-world deployments remains a significant obstacle, in large part due to the radically different development and deployment profile of modern ML methods, and the range of practical concerns that come with broader adoption. We propose to foster a ne…

    Submitted 1 December, 2019; v1 submitted 29 March, 2019; originally announced April 2019.