
Text classification using ESC-based stochastic decision lists

Published: 01 November 1999

Abstract

We propose a new method of text classification using stochastic decision lists. A stochastic decision list is an ordered sequence of IF-THEN rules, so our method can be viewed as a rule-based method for text classification, with the advantage that the acquired knowledge is readable and refinable. Our method is unique in that decision lists are constructed automatically on the basis of the principle of minimizing Extended Stochastic Complexity (ESC), which allows us to construct decision lists that make fewer classification errors. The classification accuracy achieved with our method appears better than or comparable to that of existing rule-based methods.
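To make the idea of an ordered sequence of IF-THEN rules concrete, the following is a minimal Python sketch of how a decision list might be applied to a document. It is an illustration only, not the ESC-based construction procedure from the paper: the `Rule` class, the `classify` function, the single-term conditions, and the example rules and probabilities are all hypothetical, and it assumes each rule attaches a probability to the label it assigns.

```python
from dataclasses import dataclass
from typing import Optional, Sequence, Set, Tuple


@dataclass(frozen=True)
class Rule:
    """One IF-THEN rule: IF `term` occurs in the document THEN `label` with `prob`.

    Hypothetical structure for illustration; a `term` of None marks the
    final default rule, which always fires.
    """
    term: Optional[str]
    label: str
    prob: float

    def matches(self, words: Set[str]) -> bool:
        return self.term is None or self.term in words


def classify(rules: Sequence[Rule], text: str) -> Tuple[str, float]:
    """Return the (label, probability) of the first rule whose condition holds."""
    words = set(text.lower().split())
    for rule in rules:  # order matters: earlier rules take precedence
        if rule.matches(words):
            return rule.label, rule.prob
    raise ValueError("decision list needs a final default rule")


# Made-up rules and probabilities, not learned from any data.
RULES = [
    Rule("wheat", "grain", 0.95),
    Rule("tonnes", "grain", 0.70),
    Rule(None, "not-grain", 0.90),  # default rule
]

if __name__ == "__main__":
    print(classify(RULES, "Wheat exports rose"))
    print(classify(RULES, "bank raises rates"))
```

Because the rules are checked in order and each one is a plain condition, a reader can inspect the list directly and edit or reorder individual rules, which is the readability and refinability the abstract refers to.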



Published In

CIKM '99: Proceedings of the eighth international conference on Information and knowledge management
November 1999
564 pages
ISBN:1581131461
DOI:10.1145/319950

Publisher

Association for Computing Machinery

New York, NY, United States


Conference

CIKM99: Conference on Information and Knowledge Management
November 2 - 6, 1999
Kansas City, Missouri, USA

Acceptance Rates

Overall Acceptance Rate 1,861 of 8,427 submissions, 22%
