Abstract
Kernel methods such as support vector machines have been used successfully for text categorization. A standard choice of kernel function has been the inner product between the vector-space representations of two documents, in analogy with classical information retrieval (IR) approaches.
Latent semantic indexing (LSI) has been successfully used for IR purposes as a technique for capturing semantic relations between terms and inserting them into the similarity measure between two documents. One of its main drawbacks, in IR, is its computational cost.
In this paper we describe how the LSI approach can be implemented in a kernel-defined feature space.
We provide experimental results demonstrating that the approach can significantly improve performance and that, where it does not improve performance, it does not impair it.
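The idea summarized above can be illustrated with a minimal sketch. It is not the authors' implementation; it relies only on the standard relationship between the SVD of a document-term matrix and the eigen-decomposition of its Gram (kernel) matrix. The function name `latent_semantic_kernel`, the toy matrix `D`, and the choice `k=2` are assumptions made for illustration. A linear kernel is used so that the result can be checked against LSI computed explicitly in term space.

```python
import numpy as np

def latent_semantic_kernel(K, k):
    """Rank-k latent semantic kernel built from a Gram matrix K (n x n).

    Hypothetical helper for illustration. Assumes K is symmetric and has
    at least k strictly positive eigenvalues.
    """
    # eigh handles symmetric matrices and returns ascending eigenvalues
    eigvals, eigvecs = np.linalg.eigh(K)
    top = np.argsort(eigvals)[::-1][:k]      # indices of the k largest
    lam, V = eigvals[top], eigvecs[:, top]   # lam: (k,), V: (n, k)

    # Kernel matrix restricted to the k leading latent directions
    K_k = (V * lam) @ V.T

    def project(kernel_columns):
        # kernel_columns[i, j] = kernel(training document i, document j);
        # returns k-dimensional latent coordinates, one column per document
        return (V / np.sqrt(lam)).T @ kernel_columns

    return K_k, project

# Toy document-term matrix: rows are documents, columns are term counts
D = np.array([
    [2.0, 1.0, 0.0, 0.0],
    [1.0, 2.0, 0.0, 0.0],
    [0.0, 0.0, 1.0, 2.0],
    [0.0, 1.0, 2.0, 1.0],
])

K = D @ D.T                                  # standard inner-product kernel
K_k, project = latent_semantic_kernel(K, k=2)

# Explicit LSI in term space: truncate the SVD of D to rank 2
U, s, Vt = np.linalg.svd(D, full_matrices=False)
D_k = (U[:, :2] * s[:2]) @ Vt[:2]
K_explicit = D_k @ D_k.T

# Both routes give the same document similarities
print(np.allclose(K_k, K_explicit))
```

Only the kernel matrix enters `latent_semantic_kernel`, so under the same assumptions the sketch would apply unchanged to a Gram matrix produced by a non-linear kernel, where the feature vectors are not available explicitly.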
Cite this article
Cristianini, N., Shawe-Taylor, J. & Lodhi, H. Latent Semantic Kernels. Journal of Intelligent Information Systems 18, 127–152 (2002). https://doi.org/10.1023/A:1013625426931
DOI: https://doi.org/10.1023/A:1013625426931