Abstract
This paper describes novel methods of learning general context free grammars (CFGs) from sample strings, which are implemented in the Synapse system. The main features of the system are incremental learning, rule generation based on bottom-up parsing of positive samples, and search for rule sets. From the results of parsing, a rule generation process called “bridging” synthesizes the production rules that supply the missing parts of an incomplete derivation tree for each positive string. To address the fundamental complexity problem of learning CFGs, we employ methods that search for non-minimum, semi-optimum rule sets, together with incremental learning based on related grammars. One of these methods is a search strategy called “serial search,” which finds additional rules for each positive sample in turn rather than seeking the minimum rule set for all positive samples, as global search does. The other methods are to forgo minimizing the number of nonterminal symbols during rule generation and to restrict the form of the generated rules. The paper presents experimental results and compares the various synthesis methods.
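The interplay of bottom-up parsing, bridging, and serial search can be illustrated with a small sketch. It is not the Synapse implementation: it assumes rules in Chomsky normal form, uses a plain CYK chart, and its "bridging" step only proposes a single rule at the top of the tree without creating new nonterminals, whereas the bridging described above fills in any missing part of the derivation tree. All function names (`cyk_chart`, `accepts`, `bridge_candidates`, `serial_search`) and the sample strings are hypothetical.

```python
from itertools import product

def cyk_chart(rules, s):
    """chart[(i, j)] = set of nonterminals deriving s[i:j], for CNF rules."""
    n = len(s)
    chart = {(i, j): set() for i in range(n) for j in range(i + 1, n + 1)}
    for i, ch in enumerate(s):
        for lhs, rhs in rules:
            if rhs == (ch,):
                chart[(i, i + 1)].add(lhs)
    for span in range(2, n + 1):
        for i in range(n - span + 1):
            j = i + span
            for k in range(i + 1, j):
                for lhs, rhs in rules:
                    if (len(rhs) == 2 and rhs[0] in chart[(i, k)]
                            and rhs[1] in chart[(k, j)]):
                        chart[(i, j)].add(lhs)
    return chart

def accepts(rules, s, start="S"):
    """True if the start symbol derives the whole non-empty string s."""
    return bool(s) and start in cyk_chart(rules, s)[(0, len(s))]

def bridge_candidates(rules, s, start="S"):
    """Simplified bridging: rules start -> B C that close the gap at the root."""
    chart = cyk_chart(rules, s)
    n = len(s)
    cands = set()
    for k in range(1, n):
        for b, c in product(chart[(0, k)], chart[(k, n)]):
            cands.add((start, (b, c)))
    return cands

def serial_search(rules, positives, negatives, start="S"):
    """Handle positive samples one at a time, keeping the first candidate
    rule that covers the sample and accepts no negative sample. The result
    is consistent with the samples but not guaranteed to be minimal."""
    rules = set(rules)
    for s in positives:
        if accepts(rules, s, start):
            continue
        for cand in sorted(bridge_candidates(rules, s, start)):
            trial = rules | {cand}
            if not any(accepts(trial, neg, start) for neg in negatives):
                rules = trial
                break
        else:
            raise ValueError(f"no single bridging rule found for {s!r}")
    return rules

# Toy target language (ab)+, starting from terminal rules only.
initial = {("A", ("a",)), ("B", ("b",))}
learned = serial_search(initial, ["ab", "abab"],
                        ["a", "b", "ba", "aab", "abb", "abba"])
for lhs, rhs in sorted(learned):
    print(lhs, "->", " ".join(rhs))
```

Because each positive sample is processed separately and earlier rules are never revisited, the search stays small at the cost of possibly returning a larger rule set than a global search would. The single-rule restriction also means this sketch fails on samples that need several new rules at once, such as strings of the form a^n b^n in Chomsky normal form.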
Copyright information
© 2006 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Nakamura, K. (2006). Incremental Learning of Context Free Grammars by Bridging Rule Generation and Search for Semi-optimum Rule Sets. In: Sakakibara, Y., Kobayashi, S., Sato, K., Nishino, T., Tomita, E. (eds) Grammatical Inference: Algorithms and Applications. ICGI 2006. Lecture Notes in Computer Science, vol 4201. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11872436_7
DOI: https://doi.org/10.1007/11872436_7
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-45264-5
Online ISBN: 978-3-540-45265-2
eBook Packages: Computer Science, Computer Science (R0)