DOI: 10.5555/777092.777117
Article

Progressive Rademacher sampling

Published: 28 July 2002

Abstract

Sampling can make processing of large training example databases more efficient, but without knowing all of the data, or the process producing the examples, it is impossible to know in advance what sample size to choose in order to guarantee good performance. Progressive sampling has been suggested to circumvent this problem: the sample size is increased according to some schedule until an accuracy close to that obtainable from all of the data is reached. How to determine this stopping time efficiently and accurately is a central difficulty in progressive sampling. We study stopping time determination by approximating the generalization error of the hypothesis, rather than by assuming the often observed shape of the learning curve and trying to detect whether its final plateau has been reached. We use data-dependent generalization error bounds. Instead of the common cross-validation approach, we use the recently introduced Rademacher penalties, which have been observed to give good results on simple concept classes. We experiment with two-level decision trees built by the learning algorithm T2, which finds a hypothesis with minimal error with respect to the sample. The theoretically well-motivated stopping time determination based on Rademacher penalties gives results much closer to those attained using heuristics based on assumed learning curve shapes than distribution-independent estimates based on the VC dimension do.
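To make the stopping rule concrete, here is a minimal sketch, assuming a toy one-dimensional data source and a finite class of threshold classifiers in place of T2's two-level decision trees; the schedule parameters, tolerance, and the exact form of the confidence term are illustrative assumptions rather than the paper's construction. The sample grows geometrically, an empirical risk minimizer is fit at each stage, and sampling stops once the data-dependent slack (twice the Rademacher penalty plus a confidence term) certifies that the hypothesis generalizes to within the chosen tolerance.

```python
# A minimal sketch of progressive sampling with a Rademacher stopping rule.
# The data source, the finite class of threshold classifiers (standing in for
# T2's two-level decision trees), and the bound constants are illustrative
# assumptions, not the paper's exact construction.
import numpy as np

rng = np.random.default_rng(0)

def draw_sample(n):
    """Toy example generator: one feature, threshold concept at 0.5, 10% label noise."""
    x = rng.uniform(0.0, 1.0, size=n)
    y = (x > 0.5).astype(int)
    flip = rng.uniform(size=n) < 0.10
    return x, np.where(flip, 1 - y, y)

def loss_matrix(x, y, thresholds):
    """0/1 losses of each classifier h_t(x) = [x > t] on the sample; shape (|H|, n)."""
    preds = (x[None, :] > thresholds[:, None]).astype(int)
    return (preds != y[None, :]).astype(float)

def rademacher_penalty(L):
    """Data-dependent penalty: sup over hypotheses of (1/n) * sum_i sigma_i * loss_i,
    for one draw of i.i.d. Rademacher signs sigma_i in {-1, +1}."""
    n = L.shape[1]
    sigma = rng.choice([-1.0, 1.0], size=n)
    return float(np.max(L @ sigma)) / n

def progressive_rademacher_sampling(n0=100, factor=2, tol=0.05, delta=0.05, max_n=100_000):
    """Grow the sample geometrically; stop once the data-dependent slack
    (2 * penalty + confidence term) falls below `tol` or the budget is spent."""
    thresholds = np.linspace(0.0, 1.0, 101)          # finite hypothesis class H
    n = n0
    while True:
        x, y = draw_sample(n)
        L = loss_matrix(x, y, thresholds)
        emp_risks = L.mean(axis=1)
        best = int(np.argmin(emp_risks))             # empirical risk minimizer
        slack = 2 * rademacher_penalty(L) + np.sqrt(np.log(2 / delta) / (2 * n))
        print(f"n={n:6d}  empirical risk={emp_risks[best]:.3f}  slack={slack:.3f}")
        if slack <= tol or n * factor > max_n:       # bound tight enough, or budget spent
            return thresholds[best], emp_risks[best] + slack
        n *= factor                                  # geometric sampling schedule

if __name__ == "__main__":
    h, bound = progressive_rademacher_sampling()
    print(f"chosen threshold: {h:.2f}, generalization error bound: {bound:.3f}")
```

The point of this design is that the stopping decision depends only on quantities computable from the current sample, so no assumption about the shape of the learning curve and no distribution-independent VC-dimension bound is needed.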


Published In

Eighteenth National Conference on Artificial Intelligence
July 2002
1068 pages
ISBN: 0262511290

Sponsors

  • NSF: National Science Foundation
  • Alberta Informatics Circle of Research Excellence (iCORE)
  • SIGAI: ACM Special Interest Group on Artificial Intelligence
  • Naval Research Laboratory
  • AAAI: American Association for Artificial Intelligence
  • NASA Ames Research Center
  • DARPA: Defense Advanced Research Projects Agency

Publisher

American Association for Artificial Intelligence

United States


Cited By

  • MiSoSouP. Proceedings of the 24th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining (2018), 2130-2139. DOI: 10.1145/3219819.3219989
  • ABRA. ACM Transactions on Knowledge Discovery from Data 12(5) (2018), 1-38. DOI: 10.1145/3208351
  • A Session-Based Approach to Fast-But-Approximate Interactive Data Cube Exploration. ACM Transactions on Knowledge Discovery from Data 12(1) (2018), 1-26. DOI: 10.1145/3070648
  • ABRA. Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (2016), 1145-1154. DOI: 10.1145/2939672.2939770
  • Mining Frequent Itemsets through Progressive Sampling with Rademacher Averages. Proceedings of the 21st ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (2015), 1005-1014. DOI: 10.1145/2783258.2783265
  • A dynamic adaptive sampling algorithm (DASA) for real world applications. Proceedings of the 15th International Conference on Foundations of Intelligent Systems (2005), 631-640. DOI: 10.1007/11425274_65
  • Selective Rademacher Penalization and Reduced Error Pruning of Decision Trees. The Journal of Machine Learning Research 5 (2004), 1107-1126. DOI: 10.5555/1005332.1044696
