DOI: 10.1007/978-3-540-74958-5_42

Analyzing Co-training Style Algorithms

Published: 17 September 2007

Abstract

Co-training is a semi-supervised learning paradigm that trains two learners from two different views and lets the learners label some unlabeled examples for each other. In this paper, we present a new PAC analysis of co-training style algorithms. We show that the co-training process can succeed even without two views, provided that the two learners have a large difference, which explains the success of some co-training style algorithms that do not require two views. Moreover, we theoretically explain why the co-training process cannot improve performance further after a number of rounds, and we present a rough estimate of the appropriate round at which to terminate co-training so as to avoid wasteful learning rounds.
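
The co-training procedure described in the abstract can be sketched briefly in code. The following is a minimal, hypothetical Python sketch (not the authors' implementation), assuming scikit-learn-style classifiers and two pre-split feature views X1 and X2 over the same examples; y is assumed to be a full-length label array in which only the entries at labeled_idx are trusted, and names such as co_train, rounds, and per_round are illustrative choices, not taken from the paper.

import numpy as np
from sklearn.base import clone
from sklearn.naive_bayes import GaussianNB

def co_train(X1, X2, y, labeled_idx, rounds=30, per_round=5, base=GaussianNB()):
    # Hypothetical helper: each round, every learner labels its most confident
    # unlabeled examples and hands them to the other learner's training set.
    # y entries outside labeled_idx are placeholders and are never used as truth.
    views = [np.asarray(X1), np.asarray(X2)]
    y = np.asarray(y)
    pseudo = {i: y[i] for i in labeled_idx}          # index -> (pseudo-)label
    pools = [set(labeled_idx), set(labeled_idx)]     # per-learner labeled pools
    learners = [clone(base), clone(base)]

    for _ in range(rounds):
        newly = [[], []]
        for v in (0, 1):
            idx = sorted(pools[v])
            learners[v].fit(views[v][idx], np.array([pseudo[i] for i in idx]))
            unlabeled = [i for i in range(len(y)) if i not in pools[0] | pools[1]]
            if not unlabeled:
                break
            proba = learners[v].predict_proba(views[v][unlabeled])
            conf = proba.max(axis=1)
            for p in np.argsort(conf)[::-1][:per_round]:        # most confident picks
                label = learners[v].classes_[proba[p].argmax()]
                newly[1 - v].append((unlabeled[p], label))      # hand to the other learner
        added = 0
        for v in (0, 1):
            for i, lab in newly[v]:
                if i not in pseudo:
                    pseudo[i] = lab
                    pools[v].add(i)
                    added += 1
        if added == 0:   # no new confident examples: further rounds would be wasteful
            break
    return learners

The early-exit test at the end loosely reflects the abstract's point that continuing co-training past a certain round brings no further improvement; the paper's actual termination estimate is derived from its PAC analysis rather than from this simple heuristic.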


Published In

ECML '07: Proceedings of the 18th European Conference on Machine Learning
September 2007
805 pages

Publisher

Springer-Verlag, Berlin, Heidelberg

Qualifiers

  • Article

