DOI: 10.1145/130385.130412
Article
Free access

Learning with a slowly changing distribution

Published: 01 July 1992

Abstract

In this paper, we consider the problem of learning a subset of a domain from randomly chosen examples when the probability distribution of the examples changes slowly but continually throughout the learning process. We give upper and lower bounds on the best achievable probability of misclassification after a given number of examples. If d is the VC-dimension of the target function class, t is the number of examples, and Υ is the amount by which the distribution is allowed to change (measured by the largest change in the probability of a subset of the domain), the upper bound decreases as d/t initially, and settles to O(d^(2/3) Υ^(1/2)) for large t. These bounds give necessary and sufficient conditions on Υ, the rate of change of the distribution of examples, to ensure that some learning algorithm can produce an acceptably small probability of misclassification. We also consider the case of learning a near-optimal subset of the domain when the examples and their labels are generated by a joint probability distribution on the example and label spaces. We give an upper bound on Υ that ensures learning is possible from a finite number of examples.
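The shape of the upper bound described above can be sketched numerically: the achievable misclassification probability decays roughly like d/t while examples accumulate, then levels off at a drift floor of order d^(2/3) Υ^(1/2). The sketch below takes the maximum of the two regimes and assumes all hidden constants are 1, which is an illustrative simplification, not the paper's actual bound.

```python
def error_bound(d: int, t: int, gamma: float) -> float:
    """Illustrative bound shape: the transient term d/t versus the
    drift floor d**(2/3) * gamma**(1/2). Constants assumed to be 1."""
    transient = d / t
    drift_floor = d ** (2 / 3) * gamma ** 0.5
    return max(transient, drift_floor)

# With VC-dimension d = 10 and drift rate gamma = 1e-6, the bound
# follows d/t for small t and flattens out once t is large enough
# that the drift floor (about 4.64e-3 here) dominates.
for t in (10, 100, 1_000, 10_000, 100_000):
    print(t, error_bound(10, t, 1e-6))
```

Note how no amount of additional data pushes the bound below the drift floor, matching the abstract's claim that the bound "settles" for large t.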

References

[1] M. Anthony, N. Biggs, and J. Shawe-Taylor. Learnability and formal concept analysis. Technical Report CSD-TR-624, UCL, 1990.
[2] M. Anthony and J. Shawe-Taylor. A result of Vapnik with applications. Technical Report CSD-TR-628, UCL, 1990.
[3] A. Blumer, A. Ehrenfeucht, D. Haussler, and M. K. Warmuth. Learnability and the Vapnik-Chervonenkis dimension. Journal of the ACM, 36(4):929-965, 1989.
[4] P. R. Halmos. Measure Theory. Van Nostrand, 1950.
[5] D. Haussler, M. Kearns, N. Littlestone, and M. K. Warmuth. Equivalence of models for polynomial learnability. In Proceedings of the 1988 Workshop on Computational Learning Theory, pages 42-55. Morgan Kaufmann, San Mateo, CA, 1988.
[6] D. P. Helmbold and P. M. Long. Tracking drifting concepts using random examples. In Proceedings of the Fourth Annual Workshop on Computational Learning Theory, pages 13-23. Morgan Kaufmann, San Mateo, CA, 1991.
[7] D. Haussler, N. Littlestone, and M. K. Warmuth. Predicting {0, 1}-functions on randomly drawn points. Technical Report UCSC-CRL-90-54, Baskin Center for Computer Engineering and Information Sciences, University of California, Santa Cruz, 1990.
[8] A. H. Kramer. Learning despite distribution drift. In Proceedings of the Connectionist Models Summer School, pages 201-210. Morgan Kaufmann, San Mateo, CA, 1988.
[9] S. Kullback. A lower bound for discrimination information in terms of variation. IEEE Transactions on Information Theory, IT-13:126-127, 1967.
[10] A. Rényi. On measures of entropy and information. In Proceedings of the Fourth Berkeley Symposium on Mathematical Statistics and Probability, volume 1, pages 547-561. University of California Press, 1961.
[11] L. G. Valiant. A theory of the learnable. Communications of the ACM, 27(11):1134-1143, 1984.
[12] V. Vapnik. Estimation of Dependences Based on Empirical Data. Springer-Verlag, 1982.
[13] V. N. Vapnik and A. Ya. Chervonenkis. On the uniform convergence of relative frequencies of events to their probabilities. Theory of Probability and its Applications, XVI(2):264-280, 1971.


Published In

COLT '92: Proceedings of the fifth annual workshop on Computational learning theory
July 1992
452 pages
ISBN:089791497X
DOI:10.1145/130385
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States

Qualifiers

  • Article

Conference

COLT92
Sponsor:
COLT92: 5th Annual Workshop on Computational Learning Theory
July 27 - 29, 1992
Pittsburgh, Pennsylvania, USA

Acceptance Rates

Overall Acceptance Rate 35 of 71 submissions, 49%


Article Metrics

  • Downloads (Last 12 months)118
  • Downloads (Last 6 weeks)18
Reflects downloads up to 10 Nov 2024


Cited By

  • (2023) A definition of continual reinforcement learning. Proceedings of the 37th International Conference on Neural Information Processing Systems, pages 50377-50407. DOI: 10.5555/3666122.3668314. Online publication date: 10-Dec-2023.
  • (2023) An adaptive algorithm for learning with unknown distribution drift. Proceedings of the 37th International Conference on Neural Information Processing Systems, pages 10068-10087. DOI: 10.5555/3666122.3666562. Online publication date: 10-Dec-2023.
  • (2023) Nonparametric density estimation under distribution drift. Proceedings of the 40th International Conference on Machine Learning, pages 24251-24270. DOI: 10.5555/3618408.3619417. Online publication date: 23-Jul-2023.
  • (2023) The value of out-of-distribution data. Proceedings of the 40th International Conference on Machine Learning, pages 7366-7389. DOI: 10.5555/3618408.3618700. Online publication date: 23-Jul-2023.
  • (2023) Multi-agent Performative Prediction: From Global Stability and Optimality to Chaos. Proceedings of the 24th ACM Conference on Economics and Computation, pages 1047-1074. DOI: 10.1145/3580507.3597759. Online publication date: 9-Jul-2023.
  • (2022) A no-free-lunch theorem for multitask learning. The Annals of Statistics, 50(6). DOI: 10.1214/22-AOS2189. Online publication date: 1-Dec-2022.
  • (2022) Lightweight Conditional Model Extrapolation for Streaming Data under Class-Prior Shift. 2022 26th International Conference on Pattern Recognition (ICPR), pages 2128-2134. DOI: 10.1109/ICPR56361.2022.9956195. Online publication date: 21-Aug-2022.
  • (2020) Performative prediction. Proceedings of the 37th International Conference on Machine Learning, pages 7599-7609. DOI: 10.5555/3524938.3525642. Online publication date: 13-Jul-2020.
  • (2020) Understanding self-training for gradual domain adaptation. Proceedings of the 37th International Conference on Machine Learning, pages 5468-5479. DOI: 10.5555/3524938.3525445. Online publication date: 13-Jul-2020.
  • (2020) Stochastic optimization for performative prediction. Proceedings of the 34th International Conference on Neural Information Processing Systems, pages 4929-4939. DOI: 10.5555/3495724.3496138. Online publication date: 6-Dec-2020.
