
Showing 1–22 of 22 results for author: Balle, B

Searching in archive stat.
  1. arXiv:2406.08918  [pdf, other]

    cs.CR cs.AI cs.LG math.ST stat.ML

    Beyond the Calibration Point: Mechanism Comparison in Differential Privacy

    Authors: Georgios Kaissis, Stefan Kolek, Borja Balle, Jamie Hayes, Daniel Rueckert

    Abstract: In differentially private (DP) machine learning, the privacy guarantees of DP mechanisms are often reported and compared on the basis of a single $(\varepsilon, \delta)$-pair. This practice overlooks that DP guarantees can vary substantially even between mechanisms sharing a given $(\varepsilon, \delta)$, and potentially introduces privacy vulnerabilities which can remain undetected. This motivates the need…

    Submitted 10 July, 2024; v1 submitted 13 June, 2024; originally announced June 2024.

    Comments: ICML 2024

  2. arXiv:2302.13861  [pdf, other]

    cs.LG cs.CR cs.CV stat.ML

    Differentially Private Diffusion Models Generate Useful Synthetic Images

    Authors: Sahra Ghalebikesabi, Leonard Berrada, Sven Gowal, Ira Ktena, Robert Stanforth, Jamie Hayes, Soham De, Samuel L. Smith, Olivia Wiles, Borja Balle

    Abstract: The ability to generate privacy-preserving synthetic versions of sensitive image datasets could unlock numerous ML applications currently constrained by data availability. Due to their astonishing image generation quality, diffusion models are a prime candidate for generating high-quality synthetic data. However, recent studies have found that, by default, the outputs of some diffusion models do n…

    Submitted 27 February, 2023; originally announced February 2023.

  3. arXiv:2204.13650  [pdf, other]

    cs.LG cs.CR cs.CV stat.ML

    Unlocking High-Accuracy Differentially Private Image Classification through Scale

    Authors: Soham De, Leonard Berrada, Jamie Hayes, Samuel L. Smith, Borja Balle

    Abstract: Differential Privacy (DP) provides a formal privacy guarantee preventing adversaries with access to a machine learning model from extracting information about individual training points. Differentially Private Stochastic Gradient Descent (DP-SGD), the most popular DP training method for deep learning, realizes this protection by injecting noise during training. However, previous works have found th…

    Submitted 16 June, 2022; v1 submitted 28 April, 2022; originally announced April 2022.

  4. arXiv:2102.08093   

    stat.ML cs.LG

    A Law of Robustness for Weight-bounded Neural Networks

    Authors: Hisham Husain, Borja Balle

    Abstract: Robustness of deep neural networks against adversarial perturbations is a pressing concern motivated by recent findings showing the pervasive nature of such vulnerabilities. One method of characterizing the robustness of a neural network model is through its Lipschitz constant, which forms a robustness certificate. A natural question to ask is, for a fixed model class (such as neural networks) and…

    Submitted 12 March, 2021; v1 submitted 16 February, 2021; originally announced February 2021.

    Comments: The main result does not resolve the conjecture as claimed. However the proof technique can be used to obtain a weaker result. The manuscript will be updated at a later date

  5. arXiv:2009.09052  [pdf, ps, other]

    cs.LG cs.CR stat.ML

    Private Reinforcement Learning with PAC and Regret Guarantees

    Authors: Giuseppe Vietri, Borja Balle, Akshay Krishnamurthy, Zhiwei Steven Wu

    Abstract: Motivated by high-stakes decision-making domains like personalized medicine where user information is inherently sensitive, we design privacy preserving exploration policies for episodic reinforcement learning (RL). We first provide a meaningful privacy formulation using the notion of joint differential privacy (JDP)--a strong variant of differential privacy for settings where each user receives t…

    Submitted 18 September, 2020; originally announced September 2020.

  6. arXiv:2007.06605  [pdf, other]

    cs.LG cs.CR stat.ML

    Privacy Amplification via Random Check-Ins

    Authors: Borja Balle, Peter Kairouz, H. Brendan McMahan, Om Thakkar, Abhradeep Thakurta

    Abstract: Differentially Private Stochastic Gradient Descent (DP-SGD) forms a fundamental building block in many applications for learning over sensitive data. Two standard approaches, privacy amplification by subsampling, and privacy amplification by shuffling, permit adding lower noise in DP-SGD than via naïve schemes. A key assumption in both these approaches is that the elements in the data set can be u…

    Submitted 30 July, 2020; v1 submitted 13 July, 2020; originally announced July 2020.

    Comments: Updated proof for $(\varepsilon_0, \delta_0)$-DP local randomizers

  7. arXiv:1910.08902  [pdf, ps, other]

    cs.LG cs.CL cs.CR stat.ML

    Privacy- and Utility-Preserving Textual Analysis via Calibrated Multivariate Perturbations

    Authors: Oluwaseyi Feyisetan, Borja Balle, Thomas Drake, Tom Diethe

    Abstract: Accurately learning from user data while providing quantifiable privacy guarantees provides an opportunity to build better ML models while maintaining user trust. This paper presents a formal approach to carrying out privacy preserving text perturbation using the notion of dx-privacy designed to achieve geo-indistinguishability in location data. Our approach applies carefully calibrated noise to v…

    Submitted 20 October, 2019; originally announced October 2019.

    Comments: Accepted at WSDM 2020

  8. arXiv:1910.05876  [pdf, other]

    cs.LG stat.ML

    Actor Critic with Differentially Private Critic

    Authors: Jonathan Lebensold, William Hamilton, Borja Balle, Doina Precup

    Abstract: Reinforcement learning algorithms are known to be sample inefficient, and often performance on one task can be substantially improved by leveraging information (e.g., via pre-training) on other related tasks. In this work, we propose a technique to achieve such knowledge transfer in cases where agent trajectories contain sensitive or private information, such as in the healthcare domain. Our appro…

    Submitted 13 October, 2019; originally announced October 2019.

    Comments: 6 Pages, Presented at the Privacy in Machine Learning Workshop, NeurIPS 2019

  9. arXiv:1906.09116  [pdf, ps, other]

    cs.CR stat.ML

    Differentially Private Summation with Multi-Message Shuffling

    Authors: Borja Balle, James Bell, Adria Gascon, Kobbi Nissim

    Abstract: In recent work, Cheu et al. (Eurocrypt 2019) proposed a protocol for $n$-party real summation in the shuffle model of differential privacy with $O_{\varepsilon, \delta}(1)$ error and $\Theta(\varepsilon\sqrt{n})$ one-bit messages per party. In contrast, every local model protocol for real summation must incur error $\Omega(1/\sqrt{n})$, and there exist protocols matching this lower bound which require just one bit of communication…

    Submitted 21 August, 2019; v1 submitted 20 June, 2019; originally announced June 2019.

  10. arXiv:1905.12264  [pdf, ps, other]

    cs.LG cs.CR math.PR stat.ML

    Privacy Amplification by Mixing and Diffusion Mechanisms

    Authors: Borja Balle, Gilles Barthe, Marco Gaboardi, Joseph Geumlek

    Abstract: A fundamental result in differential privacy states that the privacy guarantees of a mechanism are preserved by any post-processing of its output. In this paper we investigate under what conditions stochastic post-processing can amplify the privacy of a mechanism. By interpreting post-processing as the application of a Markov operator, we first give a series of amplification results in terms of un…

    Submitted 27 October, 2019; v1 submitted 29 May, 2019; originally announced May 2019.

  11. arXiv:1905.11190  [pdf, other]

    cs.LG cs.AI cs.LO stat.ML

    Model-Agnostic Counterfactual Explanations for Consequential Decisions

    Authors: Amir-Hossein Karimi, Gilles Barthe, Borja Balle, Isabel Valera

    Abstract: Predictive models are being increasingly used to support consequential decision making at the individual level in contexts such as pretrial bail and loan approval. As a result, there is increasing social and legal pressure to provide explanations that help the affected individuals not only to understand why a prediction was output, but also how to act to obtain a desired outcome. To this end, seve…

    Submitted 28 February, 2020; v1 submitted 27 May, 2019; originally announced May 2019.

  12. arXiv:1905.10862  [pdf, other]

    stat.ML cs.LG

    Automatic Discovery of Privacy-Utility Pareto Fronts

    Authors: Brendan Avent, Javier Gonzalez, Tom Diethe, Andrei Paleyes, Borja Balle

    Abstract: Differential privacy is a mathematical framework for privacy-preserving data analysis. Changing the hyperparameters of a differentially private algorithm allows one to trade off privacy and utility in a principled way. Quantifying this trade-off in advance is essential to decision-makers tasked with deciding how much privacy can be provided in a particular application while maintaining acceptable…

    Submitted 21 July, 2020; v1 submitted 26 May, 2019; originally announced May 2019.

    Comments: Proceedings on Privacy Enhancing Technologies 2020

  13. arXiv:1905.09982  [pdf, other]

    cs.LG stat.ML

    Hypothesis Testing Interpretations and Renyi Differential Privacy

    Authors: Borja Balle, Gilles Barthe, Marco Gaboardi, Justin Hsu, Tetsuya Sato

    Abstract: Differential privacy is a de facto standard in data privacy, with applications in the public and private sectors. A way to explain differential privacy, which is particularly appealing to statisticians and social scientists, is by means of its statistical hypothesis testing interpretation. Informally, one cannot effectively test whether a specific individual has contributed her data by observing the…

    Submitted 8 October, 2019; v1 submitted 23 May, 2019; originally announced May 2019.

    Journal ref: Proceedings of the Twenty Third International Conference on Artificial Intelligence and Statistics, PMLR 108:2496-2506, 2020

  14. arXiv:1903.11112  [pdf, other]

    cs.LG cs.CL stat.ML

    Privacy-preserving Active Learning on Sensitive Data for User Intent Classification

    Authors: Oluwaseyi Feyisetan, Thomas Drake, Borja Balle, Tom Diethe

    Abstract: Active learning holds promise of significantly reducing data annotation costs while maintaining reasonable model performance. However, it requires sending data to annotators for labeling. This presents a possible privacy leak when the training set includes sensitive user data. In this paper, we describe an approach for carrying out privacy preserving active learning with quantifiable guarantees. W…

    Submitted 26 March, 2019; originally announced March 2019.

    Comments: To appear at PAL: Privacy-Enhancing Artificial Intelligence and Language Technologies as part of the AAAI Spring Symposium Series (AAAI-SSS 2019)

  15. arXiv:1903.05202  [pdf, other]

    stat.ML cs.LG

    Continual Learning in Practice

    Authors: Tom Diethe, Tom Borchert, Eno Thereska, Borja Balle, Neil Lawrence

    Abstract: This paper describes a reference architecture for self-maintaining systems that can learn continually, as data arrives. In environments where data evolves, we need architectures that manage Machine Learning (ML) models in production, adapt to shifting data distributions, cope with outliers, retrain when necessary, and adapt to new tasks. This represents continual AutoML or Automatically Adaptive M…

    Submitted 18 March, 2019; v1 submitted 12 March, 2019; originally announced March 2019.

    Comments: Presented at the NeurIPS 2018 workshop on Continual Learning https://sites.google.com/view/continual2018/home

  16. arXiv:1903.02837  [pdf, other]

    cs.LG cs.CR stat.ML

    The Privacy Blanket of the Shuffle Model

    Authors: Borja Balle, James Bell, Adria Gascon, Kobbi Nissim

    Abstract: This work studies differential privacy in the context of the recently proposed shuffle model. Unlike in the local model, where the server collecting privatized data from users can track back an input to a specific user, in the shuffle model users submit their privatized inputs to a server anonymously. This setup yields a trust model which sits in between the classical curator and local models for…

    Submitted 2 June, 2019; v1 submitted 7 March, 2019; originally announced March 2019.

  17. arXiv:1810.07468  [pdf, other]

    stat.ML cs.LG

    Hierarchical Methods of Moments

    Authors: Matteo Ruffini, Guillaume Rabusseau, Borja Balle

    Abstract: Spectral methods of moments provide a powerful tool for learning the parameters of latent variable models. Despite their theoretical appeal, the applicability of these methods to real data is still limited due to a lack of robustness to model misspecification. In this paper we present a hierarchical approach to methods of moments to circumvent such limitations. Our method is based on replacing the…

    Submitted 17 October, 2018; originally announced October 2018.

    Comments: NIPS 2017

  18. arXiv:1808.00087  [pdf, other]

    cs.LG cs.CR stat.ML

    Subsampled Rényi Differential Privacy and Analytical Moments Accountant

    Authors: Yu-Xiang Wang, Borja Balle, Shiva Kasiviswanathan

    Abstract: We study the problem of subsampling in differential privacy (DP), a question that is the centerpiece behind many successful differentially private machine learning algorithms. Specifically, we provide a tight upper bound on the Rényi Differential Privacy (RDP) (Mironov, 2017) parameters for algorithms that: (1) subsample the dataset, and then (2) apply a randomized mechanism M to the subsample,…

    Submitted 4 December, 2018; v1 submitted 31 July, 2018; originally announced August 2018.

  19. arXiv:1807.01647  [pdf, other]

    cs.LG cs.CR stat.ML

    Privacy Amplification by Subsampling: Tight Analyses via Couplings and Divergences

    Authors: Borja Balle, Gilles Barthe, Marco Gaboardi

    Abstract: Differential privacy comes equipped with multiple analytical tools for the design of private data analyses. One important tool is the so-called "privacy amplification by subsampling" principle, which ensures that a differentially private mechanism run on a random subsample of a population provides higher privacy guarantees than when run on the entire population. Several instances of this principle…

    Submitted 23 November, 2018; v1 submitted 4 July, 2018; originally announced July 2018.

    Comments: To appear in NeurIPS 2018

  20. arXiv:1805.06530  [pdf, other]

    cs.LG stat.ML

    Improving the Gaussian Mechanism for Differential Privacy: Analytical Calibration and Optimal Denoising

    Authors: Borja Balle, Yu-Xiang Wang

    Abstract: The Gaussian mechanism is an essential building block used in a multitude of differentially private data analysis algorithms. In this paper we revisit the Gaussian mechanism and show that the original analysis has several important limitations. Our analysis reveals that the variance formula for the original mechanism is far from tight in the high privacy regime ($\varepsilon \to 0$) and it cannot be…

    Submitted 7 June, 2018; v1 submitted 16 May, 2018; originally announced May 2018.

    Comments: To appear at the 35th International Conference on Machine Learning (ICML), 2018

  21. arXiv:1603.02010  [pdf, other]

    cs.LG stat.ML

    Differentially Private Policy Evaluation

    Authors: Borja Balle, Maziar Gomrokchi, Doina Precup

    Abstract: We present the first differentially private algorithms for reinforcement learning, which apply to the task of evaluating a fixed policy. We establish two approaches for achieving differential privacy, provide a theoretical analysis of the privacy and utility of the two algorithms, and show promising results on simple empirical examples.

    Submitted 7 March, 2016; originally announced March 2016.

  22. arXiv:1206.6393  [pdf]

    cs.LG stat.ML

    Local Loss Optimization in Operator Models: A New Insight into Spectral Learning

    Authors: Borja Balle, Ariadna Quattoni, Xavier Carreras

    Abstract: This paper re-visits the spectral method for learning latent variable models defined in terms of observable operators. We give a new perspective on the method, showing that operators can be recovered by minimizing a loss defined on a finite subset of the domain. A non-convex optimization similar to the spectral method is derived. We also propose a regularized convex relaxation of this optimization…

    Submitted 27 June, 2012; originally announced June 2012.

    Comments: Appears in Proceedings of the 29th International Conference on Machine Learning (ICML 2012)