
Improved Lower Bounds for Learning from Noisy Examples

Published: 01 May 2001

    Abstract

    This paper presents a general information-theoretic approach for obtaining lower bounds on the number of examples required for Probably Approximately Correct (PAC) learning in the presence of noise. This approach deals directly with the fundamental information quantities, avoiding a Bayesian analysis. The technique is applied to several different models, illustrating its generality and power. The resulting bounds add logarithmic factors to (or improve the constants in) previously known lower bounds.
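
    For orientation, the following is a generic sketch (in standard PAC notation, not a statement of the paper's own results) of the shape such lower bounds typically take: d denotes the VC dimension of the concept class, ε the accuracy parameter, δ the confidence parameter, and η < 1/2 the random classification noise rate.

    % Sketch only, under the notational assumptions above; the paper's exact
    % statements, constants, and logarithmic factors are not reproduced here.
    % Classical noise-free PAC lower bound:
    \[
      m \;=\; \Omega\!\left( \frac{d}{\varepsilon} + \frac{1}{\varepsilon}\log\frac{1}{\delta} \right).
    \]
    % With random classification noise at rate \eta, previously known lower
    % bounds pick up (roughly) a 1/(1-2\eta)^2 dependence:
    \[
      m \;=\; \Omega\!\left( \frac{d}{\varepsilon\,(1-2\eta)^{2}} + \frac{\log(1/\delta)}{\varepsilon\,(1-2\eta)^{2}} \right).
    \]

    The contribution described in the abstract is to strengthen bounds of this general form, adding logarithmic factors or improving the constants.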


    Cited By

    • (2018) Optimal quantum sample complexity of learning algorithms. The Journal of Machine Learning Research, 19:1, 2879-2878. DOI: 10.5555/3291125.3309633. Online publication date: 1-Jan-2018.
    • (2017) Optimal quantum sample complexity of learning algorithms. Proceedings of the 32nd Computational Complexity Conference, 1-31. DOI: 10.5555/3135595.3135620. Online publication date: 9-Jul-2017.
    • (2011) Lower bounds for passive and active learning. Proceedings of the 24th International Conference on Neural Information Processing Systems, 1026-1034. DOI: 10.5555/2986459.2986574. Online publication date: 12-Dec-2011.



      Published In

      Information and Computation, Volume 166, Issue 2
      May 1, 2001
      76 pages
      ISSN:0890-5401

      Publisher

      Academic Press, Inc.

      United States


      Author Tags

      1. PAC learning
      2. entropy
      3. lower bounds
      4. mutual information
      5. noisy examples

      Qualifiers

      • Research-article
