Learning Linearly Separable Languages

  • Conference paper
Algorithmic Learning Theory (ALT 2006)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 4264)

Abstract

This paper presents a novel paradigm for learning languages that consists of mapping strings to an appropriate high-dimensional feature space and learning a separating hyperplane in that space. It initiates the study of the linear separability of automata and languages by examining the rich class of piecewise-testable languages. It introduces a high-dimensional feature map and proves that piecewise-testable languages are linearly separable in that space. The proof makes use of word-combinatorial results relating to subsequences. The paper also shows that the positive definite kernel associated with this embedding can be computed in quadratic time, and it examines the use of support vector machines in combination with this kernel to determine a separating hyperplane, together with the corresponding learning guarantees. Finally, it proves that all languages linearly separable under a regular finite cover embedding, a generalization of the embedding used here, are regular.
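
As an illustration only, the following sketch shows one way a subsequence-based kernel of the kind described above might be computed. It rests on an assumption the abstract does not spell out: that the feature map sends a string x to a binary vector indexed by all strings u, with entry 1 exactly when u is a subsequence of x. Under that assumption the kernel K(x, y) is the number of distinct common subsequences of x and y, the empty string included. The function names `next_occurrence` and `subsequence_kernel` are hypothetical, and the dynamic program is a generic one rather than the paper's algorithm; its cost is proportional to |x|·|y| times the alphabet size, so it is quadratic for a fixed alphabet.

```python
def next_occurrence(s):
    """Return a table nxt where nxt[i][c] is the smallest k >= i with s[k] == c.

    Characters that do not occur at or after position i are absent from nxt[i].
    """
    nxt = [dict() for _ in range(len(s) + 1)]
    for i in range(len(s) - 1, -1, -1):
        nxt[i] = dict(nxt[i + 1])  # inherit later occurrences
        nxt[i][s[i]] = i           # s[i] itself is now the nearest one
    return nxt


def subsequence_kernel(x, y):
    """Count the distinct common subsequences of x and y (empty string included).

    ASSUMPTION: binary subsequence-indicator features, so the inner product
    of two feature vectors equals this count.
    """
    nx, ny = next_occurrence(x), next_occurrence(y)
    n, m = len(x), len(y)
    # count[i][j] = number of distinct common subsequences of x[i:] and y[j:].
    # When either suffix is empty, only the empty string is common, hence 1.
    count = [[1] * (m + 1) for _ in range(n + 1)]
    for i in range(n - 1, -1, -1):
        for j in range(m - 1, -1, -1):
            total = 1  # the empty subsequence
            # Each distinct subsequence starting with c has a unique leftmost
            # embedding, so branching on the first character never double counts.
            for c, k in nx[i].items():
                l = ny[j].get(c)
                if l is not None:
                    total += count[k + 1][l + 1]
            count[i][j] = total
    return count[0][0]


if __name__ == "__main__":
    # "ab" and "ba" share the empty string, "a" and "b".
    print(subsequence_kernel("ab", "ba"))
```

A Gram matrix built from such a function could then be handed to any support vector machine implementation that accepts precomputed kernels; that step is omitted here.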

Copyright information

© 2006 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Kontorovich, L., Cortes, C., Mohri, M. (2006). Learning Linearly Separable Languages. In: Balcázar, J.L., Long, P.M., Stephan, F. (eds) Algorithmic Learning Theory. ALT 2006. Lecture Notes in Computer Science, vol 4264. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11894841_24

  • DOI: https://doi.org/10.1007/11894841_24

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-46649-9

  • Online ISBN: 978-3-540-46650-5

  • eBook Packages: Computer Science, Computer Science (R0)
