Positions and derivatives are two essential notions in the conversion methods from regular expressions to equivalent finite automata. Partial derivative based methods have recently been extended to regular expressions with intersection (semi-extended). In this paper, we present a position automaton construction for those expressions. This construction generalizes the notion of position, making it compatible with intersection. The resulting automaton is homogeneous and has the partial derivative automaton as a quotient.
In this paper, the relation between the Glushkov automaton (Apos) and the partial derivative automaton (Apd) of a given regular expression, in terms of transition complexity, is studied. The average transition complexity of Apos was proved by Nicaud to be linear in the size of the corresponding expression. This result was obtained using an upper bound of the number of transitions of Apos. Here we present a new quadratic construction of Apos that leads to a more elegant and straightforward implementation, and that allows the exact counting of the number of transitions. Based on that, a better estimation of the average size is presented. Asymptotically, and as the alphabet size grows, the number of transitions per state is on average 2. Broda et al. computed an upper bound for the ratio of the number of states of Apd to the number of states of Apos, which is about 12 for large alphabet sizes. Here we show how to obtain an upper bound for the number of transitions in Apd, which we then...
International Journal of Foundations of Computer Science, 2019
For regular expressions in (strong) star normal form a large set of efficient algorithms is known, from conversions into finite automata to characterisations of unambiguity. In this paper we study the average complexity of this class of expressions using analytic combinatorics. As it is not always feasible to obtain explicit expressions for the generating functions involved, here we show how to get the required information for the asymptotic estimates with an indirect use of the existence of Puiseux expansions at singularities. We study, asymptotically and on average, the alphabetic size, the size of the ε-follow automaton and of the position automaton, as well as the ratio and the size of these expressions to standard regular expressions.
The state complexity of basic operations on finite languages (considering complete DFAs) has been studied in the literature. In this paper we study the incomplete (deterministic) state and transition complexity on finite languages of Boolean operations, concatenation, star, and reversal. For all operations we give tight upper bounds for both descriptional measures. We correct the published state complexity of concatenation for complete DFAs and provide a tight upper bound for the case when the right automaton is larger than the left one. For all binary operations the tightness is proved using families of languages with a variable alphabet size. In general the operational complexities depend not only on the complexities of the operands but also on other refined measures.
Y. Gao et al. studied for the first time the transition complexity of Boolean operations on regular languages based on not necessarily complete DFAs. For the intersection and the complementation, tight bounds were presented, but for the union operation the upper and ...
Kleene algebra with tests (KAT) is an equational system for program verification that combines Boolean algebra (BA) and Kleene algebra (KA), the algebra of regular expressions. In particular, KAT subsumes the propositional fragment of Hoare logic (PHL), a formal system for the specification and verification of programs that is currently the base of most tools for checking program correctness. Both the equational theory of KAT and the encoding of PHL in KAT are known to be decidable. In this paper we present a new decision procedure for the equivalence of two KAT expressions based on the notion of partial derivatives. We also introduce the notion of derivative modulo particular sets of equations. With this we extend the previous procedure for deciding PHL. Some experimental results are also presented.
Context-free languages are highly important in computer language processing technology as well as in formal language theory. The Pumping Lemma is a property that is valid for all context-free languages, and is used to show the existence of non-context-free languages. This paper presents a formalization, using the Coq proof assistant, of the Pumping Lemma for context-free languages.
International Journal of Foundations of Computer Science, 2020
We are interested in regular expressions and transducers that represent word relations in an alphabet-invariant way: for example, the set of all word pairs [Formula: see text] where [Formula: see text] is a prefix of [Formula: see text] independently of what the alphabet is. Current software systems of formal language objects do not have a mechanism to define such objects. We define transducers in which transition labels involve what we call set specifications, some of which are alphabet invariant. In fact, we give a broader definition of automata-type objects, called labelled graphs, where each transition label can be any string, as long as that string represents a subset of a certain monoid. Then, the behavior of the labelled graph is a subset of that monoid. We do the same for regular expressions. We obtain extensions of a few classic algorithmic constructions on ordinary regular expressions and transducers at the broad level of labelled graphs and in such a way that the comp...