DOI: 10.1007/11925231_96
Article

Fast text categorization based on a novel class space model

Published: 13 November 2006

    Abstract

    Automatic categorization, in which documents are processed and automatically assigned to pre-defined categories, has been shown to be an accurate alternative to manual categorization. The accuracy of different categorization methods has been studied extensively, but their efficiency has seldom been addressed. Aiming to maintain effectiveness while improving efficiency, we propose a fast algorithm for text categorization and a compressed document vector representation method, both based on a novel class space model. Experiments show that our methods are more efficient while keeping effectiveness at a tolerable level.

    Published In

    MICAI'06: Proceedings of the 5th Mexican international conference on Artificial Intelligence
    November 2006
    1232 pages
    ISBN: 3540490264
    Editors: Alexander Gelbukh, Carlos Alberto Reyes-Garcia

    Publisher

    Springer-Verlag, Berlin, Heidelberg
