Latent Semantic Kernels

Nello Cristianini, John Shawe-Taylor and Huma Lodhi

Abstract

Kernel methods like support vector machines have successfully been used for text categorization. A standard choice of kernel function has been the inner product between the vector-space representations of two documents, in analogy with classical information retrieval (IR) approaches.
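
As a concrete illustration (not taken from the paper), here is a minimal sketch of this vector-space kernel in Python, assuming a plain term-frequency representation with no weighting scheme such as tf-idf:

    # Minimal sketch of the vector-space ("bag-of-words") kernel:
    # k(d1, d2) = <phi(d1), phi(d2)>, the inner product of term-frequency vectors.
    from collections import Counter

    def term_frequencies(document: str) -> Counter:
        """Map a document to its bag-of-words feature vector phi(d)."""
        return Counter(document.lower().split())

    def vector_space_kernel(doc1: str, doc2: str) -> float:
        """Inner product between the term-frequency vectors of two documents."""
        tf1, tf2 = term_frequencies(doc1), term_frequencies(doc2)
        return float(sum(tf1[t] * tf2[t] for t in tf1.keys() & tf2.keys()))

    print(vector_space_kernel("the cat sat on the mat", "a dog sat on the mat"))

Because such a kernel matches only identical terms, two documents that discuss the same topic with different vocabulary can receive a similarity of zero; this is the gap that LSI, discussed next, is meant to close.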

Latent semantic indexing (LSI) has been successfully used for IR purposes as a technique for capturing semantic relations between terms and inserting them into the similarity measure between two documents. One of its main drawbacks, in IR, is its computational cost.
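
For orientation, here is a small sketch of classical LSI on an explicit term-by-document matrix; the toy matrix and the choice of k are illustrative assumptions, and the singular value decomposition it relies on is exactly the step whose cost becomes prohibitive for large collections:

    import numpy as np

    def lsi_similarities(D: np.ndarray, k: int) -> np.ndarray:
        """Pairwise document similarities after projection onto k latent dimensions."""
        # Truncated SVD of the term-by-document matrix D (rows = terms, cols = documents).
        U, s, Vt = np.linalg.svd(D, full_matrices=False)
        docs_k = np.diag(s[:k]) @ Vt[:k, :]   # documents in the k-dimensional LSI space
        return docs_k.T @ docs_k              # inner products in that space

    # Toy matrix: 5 terms x 4 documents, with arbitrary term counts.
    D = np.array([[1., 0., 1., 0.],
                  [1., 1., 0., 0.],
                  [0., 1., 0., 1.],
                  [0., 0., 1., 1.],
                  [1., 0., 0., 1.]])
    print(lsi_similarities(D, k=2))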

In this paper we describe how the LSI approach can be implemented in a kernel-defined feature space.
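
A hedged sketch of that idea (an illustrative reconstruction, not the authors' exact algorithm): eigendecompose the kernel (Gram) matrix of the training documents and keep only the leading k eigendirections, which plays the role of the LSI projection in the kernel-defined feature space:

    import numpy as np

    def latent_semantic_kernel_matrix(K: np.ndarray, k: int) -> np.ndarray:
        """Rank-k 'semantic' approximation of an n x n kernel matrix K."""
        eigvals, eigvecs = np.linalg.eigh(K)      # eigenvalues in ascending order
        top = np.argsort(eigvals)[::-1][:k]       # indices of the k largest eigenvalues
        V, L = eigvecs[:, top], eigvals[top]
        # Projecting the (implicit) feature vectors onto these top-k eigendirections
        # of feature space yields the new Gram matrix V diag(L) V^T.
        return V @ np.diag(L) @ V.T

    # Example: start from a plain linear (vector-space) kernel on synthetic data.
    X = np.random.default_rng(0).random((6, 20))  # 6 documents, 20 terms
    K_semantic = latent_semantic_kernel_matrix(X @ X.T, k=3)
    print(K_semantic.shape)

Because only the kernel matrix is touched, the same construction applies to any kernel, not just the linear vector-space one; that is the point of carrying out the LSI step in a kernel-defined feature space.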

We provide experimental results demonstrating that the approach can significantly improve performance without impairing it.




Cite this article

Cristianini, N., Shawe-Taylor, J. & Lodhi, H. Latent Semantic Kernels. Journal of Intelligent Information Systems 18, 127–152 (2002). https://doi.org/10.1023/A:1013625426931
