Abstract
We consider the problem of restoring regular expressions from good examples. We describe a natural learning algorithm that obtains a “plausible” regular expression from a single example. The algorithm is based on finding the longest substring that can be matched by some part of the expression obtained so far. We believe that, to a certain extent, the algorithm mimics how humans guess regular expressions from the same kind of examples. We show that, for regular expressions of bounded length, successful learning takes time linear in the length of the example, provided that the example is “good”. Under certain natural restrictions, the run time of the learning algorithm is also polynomial in unsuccessful cases. Finally, we discuss a computer experiment in learning regular expressions with the described algorithm, which suggests that the proposed learning method is quite practical.
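The abstract does not spell out the algorithm, so the following is only a simplified illustration of the general idea of generalising a single example by spotting repetition; it is not the paper's method. It collapses the longest run of adjacent repeats into a starred group and recurses on the text to the left and right. The function names `longest_adjacent_repeat` and `guess_expression` are hypothetical, and the output uses Python `re` syntax as an assumption.

```python
import re


def longest_adjacent_repeat(s):
    """Find the adjacent repetition (block repeated >= 2 times in a row)
    that covers the most characters.

    Returns (start, unit_length, count), or None if nothing repeats.
    Brute force; written for clarity, not for the linear-time bound
    mentioned in the abstract.
    """
    best = None  # (covered_chars, start, unit_length, count)
    n = len(s)
    for unit in range(1, n // 2 + 1):
        for start in range(0, n - 2 * unit + 1):
            block = s[start:start + unit]
            count = 1
            # Extend the run while the next slice equals the block.
            while s[start + count * unit:start + (count + 1) * unit] == block:
                count += 1
            if count >= 2:
                covered = unit * count
                if best is None or covered > best[0]:
                    best = (covered, start, unit, count)
    if best is None:
        return None
    return best[1], best[2], best[3]


def guess_expression(example):
    """Illustrative only: turn one example string into a regex by
    replacing the longest adjacent repetition with (block)* and
    recursing on both sides."""
    found = longest_adjacent_repeat(example)
    if found is None:
        # No repetition left: keep the text as a literal.
        return re.escape(example)
    start, unit, count = found
    block = example[start:start + unit]
    end = start + unit * count
    return (guess_expression(example[:start])
            + "(" + re.escape(block) + ")*"
            + guess_expression(example[end:]))


if __name__ == "__main__":
    print(guess_expression("abababcc"))
```

By construction the guessed expression always matches the example it was built from; whether it is the "plausible" expression a human would guess depends on how good the example is, which is the question the paper addresses.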
Copyright information
© 1995 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Brāzma, A. (1995). Learning of regular expressions by pattern matching. In: Vitányi, P. (eds) Computational Learning Theory. EuroCOLT 1995. Lecture Notes in Computer Science, vol 904. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-59119-2_194
DOI: https://doi.org/10.1007/3-540-59119-2_194
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-59119-1
Online ISBN: 978-3-540-49195-8
eBook Packages: Springer Book Archive