DOI: 10.1145/180139.181016

Efficient agnostic PAC-learning with simple hypothesis

Published: 16 July 1994

Abstract

We exhibit efficient algorithms for agnostic PAC-learning with rectangles, unions of two rectangles, and unions of k intervals as hypotheses. These hypothesis classes are of some interest from the point of view of applied machine learning, because empirical studies show that hypotheses of this simple type (in just one or two of the attributes) provide good prediction rules for various real-world classification problems. In addition, optimal hypotheses of this type may provide valuable heuristic insight into the structure of a real-world classification problem.
The algorithms introduced in this paper make it feasible to compute optimal hypotheses of this type for a training set of several hundred examples. We also exhibit an approximation algorithm that can compute nearly optimal hypotheses for much larger datasets.
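To make the learning task concrete: for the single-rectangle class, agnostic learning on a training set reduces to finding the axis-aligned rectangle that disagrees with the fewest labeled examples. The sketch below is a naive brute-force search, not the paper's algorithm (which is substantially faster); it only illustrates the minimum-disagreement problem, using the standard observation that candidate boundaries can be restricted to coordinates occurring in the sample.

```python
def best_rectangle(points, labels):
    """Brute-force minimum-disagreement search over axis-aligned rectangles.

    points: list of (x, y) pairs; labels: 0/1 per point.
    Candidate boundaries are restricted to coordinates occurring in the
    training set. O(n^5) time -- a demo only; the paper's algorithms
    are far more efficient.
    """
    xs = sorted({px for px, _ in points})
    ys = sorted({py for _, py in points})
    # Baseline: the empty hypothesis, which classifies every point as 0.
    best, best_err = None, sum(lab == 1 for lab in labels)
    for i, x1 in enumerate(xs):
        for x2 in xs[i:]:
            for j, y1 in enumerate(ys):
                for y2 in ys[j:]:
                    err = sum(
                        (x1 <= px <= x2 and y1 <= py <= y2) != (lab == 1)
                        for (px, py), lab in zip(points, labels)
                    )
                    if err < best_err:
                        best, best_err = (x1, x2, y1, y2), err
    return best, best_err

# Three positives on the diagonal and one distant negative: the search
# recovers the rectangle [0,2] x [0,2] with zero training errors.
rect, err = best_rectangle([(0, 0), (1, 1), (2, 2), (5, 5)], [1, 1, 1, 0])
```

Note that in the agnostic setting no rectangle need fit the data perfectly; the search simply returns the hypothesis of minimum empirical error, which is exactly what the agnostic PAC framework asks of the learner.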

References

[1] D. Angluin, P. Laird, Learning from noisy examples, Machine Learning, vol. 2, 1988, 343-370
[2] W. Buntine, T. Niblett, A further comparison of splitting rules for decision-tree induction, Machine Learning, vol. 8, 1992, 75-82
[3] S.E. Decatur, Statistical queries and faulty PAC oracles, Proc. of the 6th ACM Conference on Computational Learning Theory, 1993, 262-268
[4] D.P. Dobkin, D. Gunopulos, Computing the rectangle discrepancy (video), 3rd Annual Video Review of Computational Geometry
[5] D.P. Dobkin, D. Gunopulos, Computing the rectangle discrepancy, Tech. Report 443-94, Princeton University
[6] D. Haussler, Decision theoretic generalizations of the PAC-model for neural nets and other learning applications, Inf. and Comp., vol. 100, 1992, 78-150
[7] K.U. Hoeffgen, H.U. Simon, K.S. Van Horn, Robust trainability of single neurons, preprint, 1993
[8] R.C. Holte, Very simple classification rules perform well on most commonly used datasets, Machine Learning, vol. 11, 1993, 63-91
[9] M. Kearns, Efficient noise-tolerant learning from statistical queries, Proc. of the 25th ACM Symp. on the Theory of Computing, 1993, 392-401
[10] M. Kearns, M. Li, Learning in the presence of malicious errors, SIAM J. Comput., vol. 22, 1993, 807-837
[11] M.J. Kearns, R.E. Schapire, Efficient distribution-free learning of probabilistic concepts, Proc. of the 31st Annual IEEE Symp. on Foundations of Computer Science, 1990, 382-391
[12] M.J. Kearns, R.E. Schapire, L.M. Sellie, Toward efficient agnostic learning, Proc. of the 5th ACM Workshop on Computational Learning Theory, 1992, 341-352
[13] J. Mingers, An empirical comparison of pruning methods for decision tree induction, Machine Learning, vol. 4, 1989, 227-243
[14] K. Mehlhorn, Multi-Dimensional Searching and Computational Geometry, Springer, 1984
[15] F.P. Preparata, M.I. Shamos, Computational Geometry, Springer, 1985
[16] J.R. Quinlan, C4.5: Programs for Machine Learning, Morgan Kaufmann, 1992
[17] M. Talagrand, Sharper bounds for empirical processes, to appear in Annals of Probability and its Applications
[18] L.G. Valiant, A theory of the learnable, Comm. of the ACM, vol. 27, 1984, 1134-1142
[19] L.G. Valiant, Learning disjunctions of conjunctions, Proc. of the 9th Intern. Joint Conf. on Art. Int., 1985, 560-566
[20] S.M. Weiss, R. Galen, P.V. Tadepalli, Maximizing the predictive value of production rules, Art. Int., vol. 45, 1990, 47-71
[21] S.M. Weiss, I. Kapouleas, An empirical comparison of pattern recognition, neural nets, and machine learning classification methods, Proc. of the 11th Int. Joint Conf. on Art. Int., 1990, Morgan Kaufmann, 781-787
[22] S.M. Weiss, C.A. Kulikowski, Computer Systems that Learn, Morgan Kaufmann, 1991



    Reviews

    Sally Goldman

Most work in computational learning theory assumes the learner knows a set of classification rules (or hypotheses), one of which correctly classifies most examples of interest. No such knowledge is available for most real-life problems; still, when designing a learning algorithm for such a problem, one fixes a set of classification rules from which the algorithm selects its final hypothesis. The agnostic probably approximately correct (PAC) learning model, defined by Haussler, does not require the learner to find a hypothesis having low error with respect to the correct classification; instead, it requires the learner to output a hypothesis h from the set H of classification rules whose error is near that of a best rule from H.

This paper gives efficient algorithms for agnostic PAC learning when the hypothesis class H is the set of rectangles in the plane, the set of unions of two rectangles in the plane, or the set of unions of k intervals over the real line. Thus, for any classification problem, if you restrict your attention to rules from one such hypothesis class H, the algorithm will find a hypothesis from H whose error rate is nearly as good as that of the best hypothesis in H. While the hypothesis classes studied are fairly simple, the paper discusses empirical studies showing that simple prediction rules work well for many real-world classification problems. The paper defines the formal models clearly, and should thus be accessible to readers from the learning theory and machine learning communities who have some familiarity with the PAC model.
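For the one-dimensional class of unions of k intervals, the minimum-disagreement error on a training set can be computed by a simple dynamic program over the points in sorted order. The sketch below is an illustration of that idea under the stated assumptions (0/1 labels, error count only), not the paper's own algorithm:

```python
def best_k_intervals_error(xs, labels, k):
    """Minimum training error over unions of at most k intervals.

    xs: real-valued points; labels: 0/1 per point.
    Dynamic program over the points in sorted order. State: (number of
    intervals opened so far, whether we are currently inside one).
    O(n*k) time after sorting; returns the optimal disagreement count.
    """
    INF = float("inf")
    order = sorted(range(len(xs)), key=lambda i: xs[i])
    # dp[j][s]: min errors so far with j intervals opened, s = 1 iff inside one
    dp = [[INF, INF] for _ in range(k + 1)]
    dp[0][0] = 0
    for i in order:
        lab = labels[i]
        new = [[INF, INF] for _ in range(k + 1)]
        for j in range(k + 1):
            for s in (0, 1):
                if dp[j][s] == INF:
                    continue
                # Stay in the current state; pay 1 if the label disagrees.
                new[j][s] = min(new[j][s], dp[j][s] + (lab != s))
                if s == 0 and j < k:
                    # Open a new interval just before this point.
                    new[j + 1][1] = min(new[j + 1][1], dp[j][s] + (lab != 1))
                if s == 1:
                    # Close the current interval just before this point.
                    new[j][0] = min(new[j][0], dp[j][s] + (lab != 0))
        dp = new
    return min(min(row) for row in dp)
```

For example, six sorted points labeled 1, 1, 0, 0, 1, 1 admit a perfect hypothesis with two intervals, while any single interval must misclassify at least two points; the dynamic program returns exactly these optimal error counts.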



Published In

COLT '94: Proceedings of the Seventh Annual Conference on Computational Learning Theory
July 1994, 376 pages
ISBN: 0897916557
DOI: 10.1145/180139

Publisher

Association for Computing Machinery, New York, NY, United States


Conference

COLT '94: 7th Annual Conference on Computational Learning Theory
July 12-15, 1994
New Brunswick, New Jersey, USA

Acceptance Rates

Overall acceptance rate: 35 of 71 submissions (49%)


    Article Metrics

• Downloads (last 12 months): 74
• Downloads (last 6 weeks): 14
Reflects downloads up to 12 Sep 2024

    Cited By

• (2024) Optimal Rates for Agnostic Distributed Learning. IEEE Transactions on Information Theory 70(4):2759-2778. DOI: 10.1109/TIT.2023.3344656. Online: Apr 2024
• (2024) Data Collection and Analysis: The Foundation of Evidence-Based Research in Various Disciplines. In Intelligent Signal Processing and RF Energy Harvesting for State of art 5G and B5G Networks, 147-165. DOI: 10.1007/978-981-99-8771-9_9. Online: 25 Feb 2024
• (2023) Optimal convergence rates for agnostic Nyström kernel learning. Proceedings of the 40th International Conference on Machine Learning, 19811-19836. DOI: 10.5555/3618408.3619226. Online: 23 Jul 2023
• (2021) Geometric Heuristics for Transfer Learning in Decision Trees. Proceedings of the 30th ACM International Conference on Information & Knowledge Management, 151-160. DOI: 10.1145/3459637.3482259. Online: 26 Oct 2021
• (2020) A general method for robust learning from batches. Proceedings of the 34th International Conference on Neural Information Processing Systems, 21775-21785. DOI: 10.5555/3495724.3497551. Online: 6 Dec 2020
• (2018) An Investigation into the Relationship among Psychiatric, Demographic and Socio-Economic Variables with Bayesian Network Modeling. Entropy 20(3):189. DOI: 10.3390/e20030189. Online: 12 Mar 2018
• (2016) Rules-6: A Simple Rule Induction Algorithm for Handling Large Data Sets. Proceedings of the Institution of Mechanical Engineers, Part C: Journal of Mechanical Engineering Science 219(10):1119-1137. DOI: 10.1243/095440605X31931. Online: 11 Aug 2016
• (2016) Online Discretization of Continuous-Valued Attributes in Rule Induction. Proceedings of the Institution of Mechanical Engineers, Part C: Journal of Mechanical Engineering Science 219(8):829-842. DOI: 10.1243/095440605X31571. Online: 11 Aug 2016
• (2015) Hierarchical Design of Fast Minimum Disagreement Algorithms. Proceedings of the 26th International Conference on Algorithmic Learning Theory, Volume 9355, 134-148. DOI: 10.1007/978-3-319-24486-0_9. Online: 4 Oct 2015
• (2013) SkyDiver. Proceedings of the 16th International Conference on Extending Database Technology, 406-417. DOI: 10.1145/2452376.2452424. Online: 18 Mar 2013
