
Improved Lower Bounds for Learning from Noisy Examples

Published: 01 May 2001

    Abstract

    This paper presents a general information-theoretic approach for obtaining lower bounds on the number of examples required for Probably Approximately Correct (PAC) learning in the presence of noise. This approach deals directly with the fundamental information quantities, avoiding a Bayesian analysis. The technique is applied to several different models, illustrating its generality and power. The resulting bounds add logarithmic factors to (or improve the constants in) previously known lower bounds.
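
    For orientation, the following is a generic sketch (in standard PAC notation, not a statement of the paper's own results) of the shape such lower bounds typically take: d denotes the VC dimension of the concept class, ε the accuracy parameter, δ the confidence parameter, and η < 1/2 the random classification noise rate.

    % Sketch only, under the notational assumptions above; the paper's exact
    % statements, constants, and logarithmic factors are not reproduced here.
    % Classical noise-free PAC lower bound:
    \[
      m \;=\; \Omega\!\left( \frac{d}{\varepsilon} + \frac{1}{\varepsilon}\log\frac{1}{\delta} \right).
    \]
    % With random classification noise at rate \eta, previously known lower
    % bounds pick up (roughly) a 1/(1-2\eta)^2 dependence:
    \[
      m \;=\; \Omega\!\left( \frac{d}{\varepsilon\,(1-2\eta)^{2}} + \frac{\log(1/\delta)}{\varepsilon\,(1-2\eta)^{2}} \right).
    \]

    The contribution described in the abstract is to strengthen bounds of this general form, adding logarithmic factors or improving the constants.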


    Cited By

    • (2018) Optimal quantum sample complexity of learning algorithms. The Journal of Machine Learning Research, 19:1, 2879-2878. DOI: 10.5555/3291125.3309633. Online publication date: 1-Jan-2018.
    • (2017) Optimal quantum sample complexity of learning algorithms. Proceedings of the 32nd Computational Complexity Conference, 1-31. DOI: 10.5555/3135595.3135620. Online publication date: 9-Jul-2017.
    • (2011) Lower bounds for passive and active learning. Proceedings of the 24th International Conference on Neural Information Processing Systems, 1026-1034. DOI: 10.5555/2986459.2986574. Online publication date: 12-Dec-2011.



      Published In

      Information and Computation, Volume 166, Issue 2
      May 1, 2001
      76 pages
      ISSN:0890-5401

      Publisher

      Academic Press, Inc.

      United States


      Author Tags

      1. PAC learning
      2. entropy
      3. lower bounds
      4. mutual information
      5. noisy examples

      Qualifiers

      • Research-article
