
Smoothed Analysis with Adaptive Adversaries

Published: 11 June 2024

Abstract

We prove novel algorithmic guarantees for several online problems in the smoothed analysis model. In this model, at each time step an adversary chooses an input distribution with density function bounded above pointwise by \(\tfrac{1}{\sigma }\) times that of the uniform distribution; nature then samples an input from this distribution. Here, σ is a parameter that interpolates between the extremes of worst-case and average-case analysis. Crucially, our results hold for adaptive adversaries that can base their choice of input distribution on the decisions of the algorithm and the realizations of the inputs in the previous time steps. An adaptive adversary can nontrivially correlate inputs at different time steps with each other and with the algorithm’s current state; this appears to rule out the standard proof approaches in smoothed analysis.
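To make the model concrete: a σ-smooth distribution on [0, 1] is any distribution whose density is pointwise at most 1/σ. The sketch below uses one canonical family of such distributions (uniform on an adversarially chosen window of width σ) and an illustrative adaptation rule; both are assumptions for illustration, not constructions from the paper.

```python
import random

def sigma_smooth_sample(sigma, center, rng):
    """Draw from an example sigma-smooth distribution on [0, 1]:
    uniform on a window of width sigma around `center`. Its density
    is 1/sigma on the window and 0 elsewhere, so it is pointwise at
    most (1/sigma) times the uniform density on [0, 1]."""
    lo = min(max(center - sigma / 2, 0.0), 1.0 - sigma)
    return lo + sigma * rng.random()

# An adaptive adversary may choose each window based on the
# realizations of earlier samples:
rng = random.Random(0)
history = []
for t in range(5):
    center = history[-1] if history else 0.5  # adapts to the last draw
    history.append(sigma_smooth_sample(0.1, center, rng))
print(history)
```

Each draw is confined to a width-0.1 window, yet the adversary can still correlate consecutive inputs by re-centering the window on the previous realization, which is exactly the adaptivity the abstract describes.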
This paper presents a general technique for proving smoothed algorithmic guarantees against adaptive adversaries, in effect reducing the setting of an adaptive adversary to the much simpler case of an oblivious adversary (i.e., an adversary that commits in advance to the entire sequence of input distributions). We apply this technique to prove strong smoothed guarantees for three different problems:
(1)
Online learning: We consider the online prediction problem, where instances are generated from an adaptive sequence of σ-smooth distributions and the hypothesis class has VC dimension d. We bound the regret by \(\tilde{O}(\sqrt {T d\ln (1/\sigma)} + d\ln (T/\sigma))\) and provide a near-matching lower bound. Our result shows that under smoothed analysis, learnability against adaptive adversaries is characterized by the finiteness of the VC dimension. This contrasts with worst-case analysis, where online learnability is characterized by the Littlestone dimension (which is infinite even in the extremely restricted case of one-dimensional threshold functions). Our results fully answer an open question of Rakhlin et al. [64].
(2)
Online discrepancy minimization: We consider the setting of the online Komlós problem, where the input is generated from an adaptive sequence of σ-smooth and isotropic distributions on the ℓ2 unit ball. We bound the ℓ∞ norm of the discrepancy vector by \(\tilde{O}(\ln ^2(\frac{nT}{\sigma }))\). This contrasts with worst-case analysis, where the tight discrepancy bound is \(\Theta (\sqrt {T/n})\). We show that such \(\mathrm{polylog}(nT/\sigma)\) discrepancy guarantees are not achievable for non-isotropic σ-smooth distributions.
(3)
Dispersion in online optimization: We consider online optimization with piecewise Lipschitz functions where functions with ℓ discontinuities are chosen by a smoothed adaptive adversary and show that the resulting sequence is \(({\sigma }/{\sqrt {T\ell }}, \tilde{O}(\sqrt {T\ell }))\)-dispersed. That is, every ball of radius \({\sigma }/{\sqrt {T\ell }}\) is split by \(\tilde{O}(\sqrt {T\ell })\) of the partitions made by these functions. This result matches the dispersion parameters of Balcan et al. [13] for oblivious smooth adversaries, up to logarithmic factors. On the other hand, worst-case sequences are trivially (0, T)-dispersed.
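The gap in item (2) can be seen empirically with a toy simulation of online vector balancing: unit vectors arrive one at a time, the algorithm assigns each a sign, and we track the ℓ∞ norm of the signed sum. This sketch uses a plain greedy signing rule and uniform-sphere inputs as illustrative stand-ins for the paper's algorithm and for a σ-smooth isotropic adversary; it is not the method analyzed in the paper.

```python
import math
import random

def greedy_sign(d, v):
    # Choose the sign that does not increase <d, v> -- a standard
    # greedy rule for online vector balancing (illustrative only).
    dot = sum(di * vi for di, vi in zip(d, v))
    return -1.0 if dot > 0 else 1.0

def unit_sphere_sample(n, rng):
    # Uniform on the l2 unit sphere: one simple isotropic input model.
    g = [rng.gauss(0.0, 1.0) for _ in range(n)]
    norm = math.sqrt(sum(x * x for x in g))
    return [x / norm for x in g]

rng = random.Random(0)
n, T = 10, 2000
d = [0.0] * n  # running discrepancy vector
for _ in range(T):
    v = unit_sphere_sample(n, rng)
    s = greedy_sign(d, v)
    d = [di + s * vi for di, vi in zip(d, v)]
linf = max(abs(x) for x in d)
print(f"l_inf discrepancy after {T} steps: {linf:.2f}")
```

On such isotropic random inputs even this naive rule keeps the ℓ∞ discrepancy far below the worst-case \(\Theta(\sqrt{T/n}) \approx 14\) for these parameters, illustrating why beyond-worst-case input models admit much stronger discrepancy guarantees.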

References

[1]
Naman Agarwal, Nataly Brukhim, Elad Hazan, and Zhou Lu. 2020. Boosting for control of dynamical systems. In Proceedings of the 37th International Conference on Machine Learning, (ICML’20), Vol. 119. PMLR, 96–103.
[2]
Naman Agarwal, Elad Hazan, Anirudha Majumdar, and Karan Singh. 2021. A regret minimization approach to iterative learning control. In Proceedings of the 38th International Conference on Machine Learning (ICML’21), Vol. 139. PMLR, 100–109.
[3]
Zeyuan Allen-Zhu, Zhenyu Liao, and Lorenzo Orecchia. 2015. Spectral sparsification and regret minimization beyond matrix multiplicative updates. In Proceedings of the 47th Annual ACM on Symposium on Theory of Computing (STOC’15). 237–245.
[4]
Noga Alon, Omri Ben-Eliezer, Yuval Dagan, Shay Moran, Moni Naor, and Eylon Yogev. 2021. Adversarial laws of large numbers and optimal regret in online classification. In Proceedings of the 53rd Annual ACM Symposium on Theory of Computing (STOC’21). 447–455.
[5]
Ryan Alweiss, Yang P. Liu, and Mehtaab Sawhney. 2021. Discrepancy minimization via a self-balancing walk. In 53rd Annual ACM Symposium on Theory of Computing (STOC’21). 14–20.
[6]
Sanjeev Arora, Rong Ge, and Ankur Moitra. 2012. Learning topic models – going beyond SVD. In Proceedings of the 53rd Annual Symposium on Foundations of Computer Science (FOCS’12). 1–10.
[7]
Sanjeev Arora, Elad Hazan, and Satyen Kale. 2012. The multiplicative weights update method: A meta-algorithm and applications. Theory of Computing 8, 6 (2012), 121–164.
[8]
David Arthur and Sergei Vassilvitskii. 2006. How slow is the k-means method? In Proceedings of the 22nd Symposium on Computational Geometry (SoCG’06). 144–153.
[9]
Pranjal Awasthi, Maria-Florina Balcan, Nika Haghtalab, and Ruth Urner. 2015. Efficient learning of linear separators under bounded noise. In Conference on Learning Theory (COLT’15). 167–190.
[10]
Pranjal Awasthi, Maria-Florina Balcan, Nika Haghtalab, and Hongyang Zhang. 2016. Learning and 1-bit compressed sensing under asymmetric noise. In Conference on Learning Theory (COLT’16). 152–192.
[11]
Pranjal Awasthi, Avrim Blum, Nika Haghtalab, and Yishay Mansour. 2017. Efficient PAC learning from the crowd. In Conference on Learning Theory (COLT’17). 127–150.
[12]
Maria-Florina Balcan, Avrim Blum, and Anupam Gupta. 2013. Clustering under approximation stability. J. ACM 60, 2, Article 8 (May 2013), 34 pages.
[13]
Maria-Florina Balcan, Travis Dick, and Ellen Vitercik. 2018. Dispersion for data-driven algorithm design, online learning, and private optimization. In Proceedings of the 59th Annual Symposium on Foundations of Computer Science (FOCS’18). 603–614.
[14]
Maria-Florina Balcan, Nika Haghtalab, and Colin White. 2020. K-center clustering under perturbation resilience. ACM Trans. Algorithms 16, 2, Article 22 (March 2020), 39 pages.
[15]
Nikhil Bansal. 2010. Constructive algorithms for discrepancy minimization. In Proceedings of the 51st Annual Symposium on Foundations of Computer Science (FOCS’10). 3–10.
[16]
Nikhil Bansal, Daniel Dadush, Shashwat Garg, and Shachar Lovett. 2018. The Gram-Schmidt walk: A cure for the Banaszczyk blues. In Proceedings of the 50th Annual ACM Symposium on Theory of Computing (STOC’18). 587–597.
[17]
Nikhil Bansal and Shashwat Garg. 2017. Algorithmic discrepancy beyond partial coloring. In Proceedings of the 49th Annual ACM Symposium on Theory of Computing (STOC’17). 914–926.
[18]
Nikhil Bansal, Haotian Jiang, Raghu Meka, Sahil Singla, and Makrand Sinha. 2021. Online discrepancy minimization for stochastic arrivals. In Proceedings of the 2021 ACM-SIAM Symposium on Discrete Algorithms (SODA’21). 2842–2861.
[19]
Nikhil Bansal, Haotian Jiang, Raghu Meka, Sahil Singla, and Makrand Sinha. 2022. Prefix discrepancy, smoothed analysis, and combinatorial vector balancing. In 13th Innovations in Theoretical Computer Science Conference (ITCS’22) (LIPIcs, Vol. 215). Schloss Dagstuhl - Leibniz-Zentrum für Informatik, 13:1–13:22.
[20]
Nikhil Bansal, Haotian Jiang, Raghu Meka, Sahil Singla, and Makrand Sinha. 2022. Smoothed analysis of the Komlós conjecture. In 49th International Colloquium on Automata, Languages, and Programming (ICALP’22) (LIPIcs, Vol. 229). 14:1–14:12.
[21]
Nikhil Bansal, Haotian Jiang, Sahil Singla, and Makrand Sinha. 2020. Online vector balancing and geometric discrepancy. In Proceedings of the 52nd Annual ACM Symposium on Theory of Computing (STOC’20). 1139–1152.
[22]
Nikhil Bansal and Joel H. Spencer. 2020. On-line balancing of random inputs. Random Struct. Algorithms 57, 4 (2020), 879–891.
[23]
Shai Ben-David, Dávid Pál, and Shai Shalev-Shwartz. 2009. Agnostic online learning. In Proceedings of the 22nd Annual Conference on Learning Theory (COLT’09).
[24]
Aditya Bhaskara, Aidao Chen, Aidan Perreault, and Aravindan Vijayaraghavan. 2019. Smoothed analysis in unsupervised learning via decoupling. In Proceedings of the 60th Annual Symposium on Foundations of Computer Science (FOCS’19). 582–610.
[25]
Yonatan Bilu and Nathan Linial. 2012. Are stable instances easy? Combinatorics, Probability and Computing 21, 5 (2012), 643–660.
[26]
Adam Block, Yuval Dagan, Noah Golowich, and Alexander Rakhlin. 2022. Smoothed online learning is as easy as statistical learning. In Conference on Learning Theory (COLT’22), Vol. 178. PMLR, 1716–1786.
[27]
Adam Block and Yury Polyanskiy. 2023. The sample complexity of approximate rejection sampling with applications to smoothed online learning. In Conference on Learning Theory (COLT’23), Vol. 195. PMLR, 228–273.
[28]
Adam Block and Max Simchowitz. 2022. Efficient and near-optimal smoothed online learning for generalized linear functions. In Advances in Neural Information Processing Systems (NeurIPS’22). 36, 7477–7489.
[29]
Avrim Blum and Yishay Mansour. 2007. Learning, regret minimization, and equilibria. In Algorithmic Game Theory, Noam Nisan, Tim Roughgarden, Eva Tardos, and Vijay V. Vazirani (Eds.). Cambridge University Press, 79–102.
[30]
Stéphane Boucheron, Gábor Lugosi, and Pascal Massart. 2013. Concentration Inequalities: A Nonasymptotic Theory of Independence. Oxford University Press.
[31]
Olivier Bousquet, Stéphane Boucheron, and Gábor Lugosi. 2003. Introduction to statistical learning theory. In Summer School on Machine Learning. Springer, 169–207.
[32]
Mark Bun, Roi Livni, and Shay Moran. 2020. An equivalence between private classification and online prediction. In Proceedings of the 61st Annual Symposium on Foundations of Computer Science (FOCS’20). 389–402.
[33]
Nicolò Cesa-Bianchi and Gábor Lugosi. 2006. Prediction, Learning, and Games. Cambridge University Press.
[34]
Bernard Chazelle. 2000. The Discrepancy Method: Randomness and Complexity. Cambridge University Press.
[35]
Vincent Cohen-Addad and Varun Kanade. 2017. Online optimization of smoothed piecewise constant functions. In Proceedings of the 20th International Conference on Artificial Intelligence and Statistics (AISTATS’17). 412–420.
[36]
Ofer Dekel, Arthur Flajolet, Nika Haghtalab, and Patrick Jaillet. 2017. Online learning with a hint. In Advances in Neural Information Processing Systems (NeurIPS’17). 30, 5299–5308.
[37]
Ilias Diakonikolas, Themis Gouleakis, and Christos Tzamos. 2019. Distribution-independent PAC learning of halfspaces with Massart noise. In Advances in Neural Information Processing Systems (NeurIPS’19). 32, 4749–4760.
[38]
Rishi Gupta and Tim Roughgarden. 2017. A PAC approach to application-specific algorithm selection. SIAM J. Comput. 46, 3 (2017), 992–1017.
[39]
Nika Haghtalab. 2018. Foundation of Machine Learning, by the People, for the People. Ph. D. Dissertation. Carnegie Mellon University.
[40]
Nika Haghtalab, Yanjun Han, Abhishek Shetty, and Kunhe Yang. 2022. Oracle-efficient online learning for beyond worst-case adversaries. In Advances in Neural Information Processing Systems (NeurIPS’22). 36, 4072–4084.
[41]
Nika Haghtalab, Tim Roughgarden, and Abhishek Shetty. 2020. Smoothed analysis of online and differentially private learning. In Advances in Neural Information Processing Systems (NeurIPS’20). 34, 9203–9215.
[42]
Nika Haghtalab, Tim Roughgarden, and Abhishek Shetty. 2021. Smoothed analysis with adaptive adversaries. In Proceedings of the 62nd Annual Symposium on Foundations of Computer Science (FOCS’21). 942–953.
[43]
Nika Haghtalab, Tim Roughgarden, and Abhishek Shetty. 2021. Smoothed analysis with adaptive adversaries. CoRR abs/2102.08446 (2021).
[44]
Moritz Hardt, Katrina Ligett, and Frank McSherry. 2012. A simple and practical algorithm for differentially private data release. In Advances in Neural Information Processing Systems (NeurIPS’12). 25, 2348–2356.
[45]
Moritz Hardt and Aaron Roth. 2013. Beyond worst-case analysis in private singular vector computation. In Proceedings of the 45th Annual ACM Symposium on Theory of Computing (STOC’13). 331–340.
[46]
Moritz Hardt and Guy N. Rothblum. 2010. A multiplicative weights mechanism for privacy-preserving data analysis. In Proceedings of the 51st Annual Symposium on Foundations of Computer Science (FOCS’10). 61–70.
[47]
Christopher Harshaw, Fredrik Sävje, Daniel A. Spielman, and Peng Zhang. 2019. Balancing covariates in randomized experiments using the Gram-Schmidt walk. CoRR abs/1911.03071 (2019). arxiv:1911.03071
[48]
David Haussler. 1995. Sphere packing numbers for subsets of the Boolean n-cube with bounded Vapnik-Chervonenkis dimension. Journal of Combinatorial Theory, Series A 69, 2 (1995), 217–232.
[49]
Elad Hazan and Tomer Koren. 2016. The computational power of optimization in online learning. In Proceedings of the 48th Annual ACM Symposium on Theory of Computing (STOC’16). 128–141.
[50]
Samuel B. Hopkins, Jerry Li, and Fred Zhang. 2020. Robust and heavy-tailed mean estimation made simple, via regret minimization. In Advances in Neural Information Processing Systems (NeurIPS’20). 30, 11902–11912.
[51]
Janardhan Kulkarni, Victor Reis, and Thomas Rothvoss. 2024. Optimal online discrepancy minimization. In Proceedings of the 56th Annual ACM Symposium on Theory of Computing (STOC’24).
[52]
Haotian Jiang, Janardhan Kulkarni, and Sahil Singla. 2019. Online geometric discrepancy for stochastic arrivals with applications to envy minimization. CoRR abs/1910.01073 (2019). arxiv:1910.01073
[53]
Adam Tauman Kalai, Alex Samorodnitsky, and Shang-Hua Teng. 2009. Learning and smoothed analysis. In Proceedings of the 50th Symposium on Foundations of Computer Science (FOCS’09). 395–404.
[54]
Adam Tauman Kalai and Shang-Hua Teng. 2008. Decision trees are PAC-learnable from most product distributions: A smoothed analysis. CoRR abs/0812.0933 (2008). arxiv:0812.0933
[55]
Sampath Kannan, Jamie H. Morgenstern, Aaron Roth, Bo Waggoner, and Zhiwei Steven Wu. 2018. A smoothed analysis of the greedy algorithm for the linear contextual bandit problem. In Advances in Neural Information Processing Systems (NeurIPS’18). 31, 2227–2236.
[56]
Victor Klee and George J. Minty. 1972. How good is the simplex algorithm? Inequalities 3, 3 (1972), 159–175.
[57]
Akshay Krishnamurthy, Alekh Agarwal, Tzu-Kuo Huang, Hal Daumé III, and John Langford. 2019. Active learning for cost-sensitive classification. J. Mach. Learn. Res. 20 (2019), 65:1–65:50.
[58]
Shachar Lovett and Raghu Meka. 2015. Constructive discrepancy minimization by walking on the edges. SIAM J. Comput. 44, 5 (2015), 1573–1582.
[59]
Konstantin Makarychev, Yury Makarychev, and Aravindan Vijayaraghavan. 2014. Bilu–Linial stable instances of max cut and minimum multiway cut. In Proceedings of the 25th Annual ACM-SIAM Symposium on Discrete Algorithms (SODA’14). 890–906.
[60]
Bodo Manthey. 2021. Smoothed Analysis of Local Search. Cambridge University Press, 285–308.
[61]
Rafail Ostrovsky, Yuval Rabani, Leonard J. Schulman, and Chaitanya Swamy. 2013. The effectiveness of Lloyd-type methods for the k-means problem. J. ACM 59, 6, Article 28 (2013).
[62]
Manish Raghavan, Aleksandrs Slivkins, Jennifer Wortman Vaughan, and Zhiwei Steven Wu. 2018. The externalities of exploration and how data diversity helps exploitation. In Proceedings of the 31st Conference on Learning Theory (COLT’18). 1724–1738.
[63]
Alexander Rakhlin and Karthik Sridharan. 2013. Optimization, learning, and games with predictable sequences. In Advances in Neural Information Processing Systems (NeurIPS’13). 26, 3066–3074.
[64]
Alexander Rakhlin, Karthik Sridharan, and Ambuj Tewari. 2011. Online learning: Stochastic, constrained, and smoothed adversaries. In Advances in Neural Information Processing Systems (NeurIPS’11). 24, 1764–1772.
[65]
Thomas Rothvoss. 2017. Constructive discrepancy minimization for convex sets. SIAM J. Comput. 46, 1 (2017), 224–234.
[66]
Tim Roughgarden. 2020. Beyond the Worst-Case Analysis of Algorithms. Cambridge University Press.
[67]
Alejandro A. Schäffer. 1991. Simple local search problems that are hard to solve. SIAM J. Comput. 20, 1 (1991), 56–87.
[68]
Joel Spencer. 1994. Ten Lectures on the Probabilistic Method (2nd edition). Society for Industrial and Applied Mathematics.
[69]
Daniel A. Spielman and Shang-Hua Teng. 2004. Smoothed analysis: Why the simplex algorithm usually takes polynomial time. J. ACM 51, 3 (2004), 385–463.
[70]
Aravindan Vijayaraghavan, Abhratanu Dutta, and Alex Wang. 2017. Clustering stable instances of Euclidean k-means. In Advances in Neural Information Processing Systems (NeurIPS’17). 30, 6500–6509.

Published In

Journal of the ACM  Volume 71, Issue 3
June 2024
323 pages
EISSN:1557-735X
DOI:10.1145/3613558

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 11 June 2024
Online AM: 13 April 2024
Accepted: 21 February 2024
Revised: 12 November 2023
Received: 01 December 2022
Published in JACM Volume 71, Issue 3


Author Tags

  1. Smoothed analysis
  2. online learning
  3. regret bounds
  4. online convex optimization
  5. data driven algorithm design
  6. online discrepancy minimization

Qualifiers

  • Research-article

Funding Sources

  • NSF
  • NSF
  • ONR
  • C3.AI Digital Transformation Institute grant, a Berkeley AI Research and Microsoft Research Commons award, a JP Morgan Chase Faculty Fellowship, a Google research scholar faculty award, a Schmidt Sciences AI2050 award, and an Apple AI Ph.D. fellowship
