DOI: 10.1145/2093698.2093871
research-article

Combining active learning and semi-supervised for improving learning performance

Published: 26 October 2011

Abstract

In many learning tasks, unlabeled samples are abundant but labeled training samples are scarce, because labeling requires the effort and expertise of human annotators. There are three major techniques for exploiting unlabeled samples: semi-supervised learning, transductive learning and active learning. Semi-supervised and transductive learning automatically exploit unlabeled samples, in addition to the labeled ones, to improve learning performance. Active learning assumes that the learner has control over the input space, i.e., it can choose which samples are labeled. Combining the advantages of semi-supervised learning and active learning is therefore a practical way to improve learning performance. In this paper, a general framework for combining active learning (AL) and semi-supervised learning (SSL) algorithms is proposed. Then ensemble learning for combining AL and SSL algorithms is introduced, denoted ASC (AL and SSL by Committee). Finally, the ensemble learning and the confidence measure of ASC are discussed.
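The abstract does not spell out the ASC algorithm, so the following is only an illustrative sketch of the general idea of combining AL and SSL through a committee, not the authors' method. It assumes a committee of nearest-centroid classifiers trained on stratified bootstrap resamples, committee agreement as the confidence measure, and a hypothetical `oracle` callable standing in for the human annotator. All function names and parameters (`combined_round`, `n_queries`, `min_agreement`, and so on) are invented for this example.

```python
import random
from collections import Counter

def train_centroid(samples):
    """Nearest-centroid model: maps each label to the mean of its points."""
    groups = {}
    for x, y in samples:
        groups.setdefault(y, []).append(x)
    return {y: tuple(sum(col) / len(xs) for col in zip(*xs))
            for y, xs in groups.items()}

def predict(model, x):
    """Return the label whose centroid is closest to x (squared distance)."""
    return min(model,
               key=lambda y: sum((a - b) ** 2 for a, b in zip(x, model[y])))

def stratified_bootstrap(labeled, rng):
    """Resample with replacement inside each class so no class disappears."""
    groups = {}
    for x, y in labeled:
        groups.setdefault(y, []).append((x, y))
    return [rng.choice(g) for g in groups.values() for _ in g]

def committee_vote(models, x):
    """Majority label and the fraction of members that voted for it."""
    votes = Counter(predict(m, x) for m in models)
    label, count = votes.most_common(1)[0]
    return label, count / len(models)

def combined_round(labeled, unlabeled, oracle, n_members=5, n_queries=1,
                   min_agreement=1.0, seed=0):
    """One round: query the least-agreed samples, self-label the confident ones."""
    rng = random.Random(seed)
    models = [train_centroid(stratified_bootstrap(labeled, rng))
              for _ in range(n_members)]
    # Sort by committee agreement, lowest (most uncertain) first.
    scored = sorted(((committee_vote(models, x), x) for x in unlabeled),
                    key=lambda item: item[0][1])
    new_labeled, still_unlabeled = list(labeled), []
    for rank, ((label, agreement), x) in enumerate(scored):
        if rank < n_queries:               # AL step: ask the (assumed) oracle
            new_labeled.append((x, oracle(x)))
        elif agreement >= min_agreement:   # SSL step: trust the committee label
            new_labeled.append((x, label))
        else:                              # neither queried nor confident
            still_unlabeled.append(x)
    return new_labeled, still_unlabeled

# Toy usage: two well-separated 1-D clusters, one labeled point per class.
toy_truth = {(0.0,): "a", (0.5,): "a", (1.0,): "a",
             (10.0,): "b", (10.5,): "b", (11.0,): "b"}
toy_labeled = [((0.0,), "a"), ((11.0,), "b")]
toy_unlabeled = [(0.5,), (1.0,), (10.0,), (10.5,)]
grown, left = combined_round(toy_labeled, toy_unlabeled, toy_truth.__getitem__)
print(len(grown), len(left))
```

In this sketch the oracle budget per round is `n_queries`, and `min_agreement` controls how readily committee labels are accepted without human input. Lowering it labels more samples automatically at the risk of propagating errors.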

Cited By

  • (2024) Mapping the diversity of land uses following deforestation across Africa. Scientific Reports 14:1. DOI: 10.1038/s41598-024-52138-9. Online publication date: 19-Jan-2024
  • (2020) A Load Identification Method Based on Active Deep Learning and Discrete Wavelet Transform. IEEE Access (1-1). DOI: 10.1109/ACCESS.2020.3003778. Online publication date: 2020

Published In

ISABEL '11: Proceedings of the 4th International Symposium on Applied Sciences in Biomedical and Communication Technologies
October 2011
949 pages
ISBN: 9781450309134
DOI: 10.1145/2093698

Sponsors

  • Universitat Pompeu Fabra
  • IEEE
  • Technical University of Catalonia (UPC), Spain
  • River Publishers
  • CTTC: Technological Center for Telecommunications of Catalonia
  • CTIF: Kyranova Ltd, Center for TeleInFrastruktur

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. active learning
  2. combination
  3. general framework
  4. semi-supervised learning

Qualifiers

  • Research-article

Conference

ISABEL '11
Sponsor:
  • Technical University of Catalonia Spain
  • River Publishers
  • CTTC
  • CTIF

Article Metrics

  • Downloads (last 12 months): 4
  • Downloads (last 6 weeks): 0
Reflects downloads up to 16 Feb 2025
