A Survey of Text Classification Algorithms

Aggarwal, Charu C.; Zhai, ChengXiang

doi:10.1007/978-1-4614-3223-4_6



Abstract

The problem of classification has been widely studied in the data mining, machine learning, database, and information retrieval communities, with applications in diverse domains such as target marketing, medical diagnosis, newsgroup filtering, and document organization. This chapter surveys a wide variety of text classification algorithms.



Author information

Authors and Affiliations

IBM T. J. Watson Research Center, Yorktown Heights, NY, USA
Charu C. Aggarwal
University of Illinois at Urbana-Champaign, Urbana, IL, USA
ChengXiang Zhai


Corresponding author

Correspondence to Charu C. Aggarwal.

Editor information

Editors and Affiliations

Thomas J. Watson Research Center, IBM, Skyline Drive 19, Hawthorne, 10532, New York, USA
Charu C. Aggarwal
University of Illinois at Urbana-Champaign, Urbana, 61801, Illinois, USA
ChengXiang Zhai


About this chapter

Cite this chapter

Aggarwal, C.C., Zhai, C. (2012). A Survey of Text Classification Algorithms. In: Aggarwal, C., Zhai, C. (eds) Mining Text Data. Springer, Boston, MA. https://doi.org/10.1007/978-1-4614-3223-4_6


DOI: https://doi.org/10.1007/978-1-4614-3223-4_6
Published: 07 January 2012
Publisher Name: Springer, Boston, MA
Print ISBN: 978-1-4614-3222-7
Online ISBN: 978-1-4614-3223-4
eBook Packages: Computer Science, Computer Science (R0)
