Improving Text Classification Performance with Incremental Background Knowledge

Silva, Catarina; Ribeiro, Bernardete

doi:10.1007/978-3-642-04274-4_95

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 5768)

Included in the following conference series:

International Conference on Artificial Neural Networks

Abstract

Text classification is generally the process of extracting interesting and non-trivial information and knowledge from text. One of the main problems with text classification systems is the scarcity of labeled data, together with the cost of labeling unlabeled data. There is therefore growing interest in using unlabeled data to improve classification performance. The ready availability of such data in most applications makes it an appealing source of information.

In this work we propose an Incremental Background Knowledge (IBK) technique that introduces unlabeled data into the training set, expanding it by using initial classifiers to deliver oracle decisions. The resulting incremental, margin-based SVM method was tested on the Reuters-21578 benchmark, showing promising results.
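
The abstract does not give the algorithm's details, so the following is only a minimal sketch of the general idea: a classifier trained on the labeled set acts as an oracle for unlabeled examples, and those it scores confidently are added to the training set before retraining. The selection rule (`abs(score) >= threshold`), the parameter names, and the function `incremental_background_knowledge` are assumptions made for illustration, not the authors' method. To stay dependency-free, the sketch uses a hypothetical stand-in trainer, `train_centroid`, which is not an SVM; in practice one would pass a trainer that wraps a linear SVM's decision function.

```python
from typing import Callable, List, Sequence, Tuple

Vector = Sequence[float]
Scorer = Callable[[Vector], float]  # signed score; the sign is the predicted class
Trainer = Callable[[List[Vector], List[int]], Scorer]


def incremental_background_knowledge(
    train: Trainer,
    labeled_x: List[Vector],
    labeled_y: List[int],
    unlabeled_x: List[Vector],
    threshold: float = 1.0,
    max_rounds: int = 10,
) -> Tuple[Scorer, List[Vector], List[int], List[Vector]]:
    """Grow the training set with confidently classified unlabeled examples.

    Assumed rule (not taken from the paper): an unlabeled example is added,
    with the current classifier's predicted label, when |score| >= threshold.
    """
    x, y = list(labeled_x), list(labeled_y)
    pool = list(unlabeled_x)
    scorer = train(x, y)
    for _ in range(max_rounds):
        confident, remaining = [], []
        for example in pool:
            score = scorer(example)
            if abs(score) >= threshold:
                confident.append((example, 1 if score > 0 else -1))
            else:
                remaining.append(example)
        if not confident:
            break  # no example clears the threshold, so stop early
        for example, label in confident:
            x.append(example)
            y.append(label)
        pool = remaining
        scorer = train(x, y)  # retrain on the expanded training set
    return scorer, x, y, pool


def train_centroid(x: List[Vector], y: List[int]) -> Scorer:
    """Stand-in linear trainer (NOT an SVM): projects onto the difference
    of the class means, measured from the midpoint between them."""
    dims = len(x[0])
    pos = [v for v, label in zip(x, y) if label == 1]
    neg = [v for v, label in zip(x, y) if label == -1]
    mean_pos = [sum(v[d] for v in pos) / len(pos) for d in range(dims)]
    mean_neg = [sum(v[d] for v in neg) / len(neg) for d in range(dims)]
    w = [p - n for p, n in zip(mean_pos, mean_neg)]
    mid = [(p + n) / 2 for p, n in zip(mean_pos, mean_neg)]
    return lambda v: sum(wi * (vi - mi) for wi, vi, mi in zip(w, v, mid))
```

Labels are assumed to be +1 and -1, and both classes must be present in the initial labeled set. Examples that never reach the threshold stay in the returned pool, so a stricter threshold trades a smaller expansion for fewer mislabeled additions.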

Author information

Authors and Affiliations

School of Technology and Management, Polytechnic Institute of Leiria, Portugal
Catarina Silva
Dep. Informatics Eng., Center Informatics and Systems, Univ. of Coimbra, Portugal
Catarina Silva & Bernardete Ribeiro

Editor information

Editors and Affiliations

Dipartimento di Elettronica, Politecnico di Milano, Piazza L. da Vinci 32, 20133, Milano, Italy
Cesare Alippi
Department of Electrical and Computer Engineering, University of Cyprus, 75 Kallipoleos Street, 1678, Nicosia, Cyprus
Marios Polycarpou, Christos Panayiotou & Georgios Ellinas

About this paper

Cite this paper

Silva, C., Ribeiro, B. (2009). Improving Text Classification Performance with Incremental Background Knowledge. In: Alippi, C., Polycarpou, M., Panayiotou, C., Ellinas, G. (eds) Artificial Neural Networks – ICANN 2009. ICANN 2009. Lecture Notes in Computer Science, vol 5768. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-04274-4_95

DOI: https://doi.org/10.1007/978-3-642-04274-4_95
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-04273-7
Online ISBN: 978-3-642-04274-4
eBook Packages: Computer Science, Computer Science (R0)
