Abstract. GUItar is a visualization software tool for various types of automata (standard, weighted, pushdown, transducers, Turing machines, etc.). It provides interactive manipulation of diagrams, comprehensive graphic style creation, multiple export/import ...
Recently the descriptional complexity of formal languages has been extensively researched. One of the most studied complexity measures for regular languages is the number of states of its minimal automaton (state complexity of the language). Other measures can be related to other structural components and other models of computation. The complexity of a language operation is the complexity of the resulting language seen as a function of the complexities of the operation arguments. This prolific research has given rise to a multitude of results scattered over a few hundred articles, with the inevitable lack of unified terminology and notation. This makes it very difficult for an interested researcher to gain a global perspective of the field, to assess the coverage achieved so far, and to decide where to direct further research effort. In this paper we present a first step towards the development of a knowledge base and a Web interface where descriptional complexity resu...
We introduce the concept of an f-maximal error-detecting block code, for some parameter f between 0 and 1, in order to formalize the situation where a block code is close to maximal with respect to being error-detecting. Our motivation for this is that constructing a maximal error-detecting code is a computationally hard problem. We present a randomized algorithm that takes as input two positive integers N, ℓ, a probability value f, and a specification of the errors permitted in some application, and generates an error-detecting, or error-correcting, block code having up to N codewords of length ℓ. If the algorithm finds fewer than N codewords, then those codewords constitute a code that is f-maximal with high probability. The error specification is modelled as a (nondeterministic) transducer, which allows one to model any rational combination of substitution and synchronization errors. We also present some elements of our implementation of various error-detecting properties and t...
Extended regular expressions (with complement and intersection) are used in many applications due to their succinctness. In particular, regular expressions extended with intersection only (also called semi-extended) can already be exponentially smaller than standard regular expressions or equivalent nondeterministic finite automata. For practical purposes it is important to study the average behaviour of conversions between these models. In this paper, we focus on the conversion of regular expressions with intersection to nondeterministic finite automata, using partial derivatives and the notion of support. We give a tight upper bound of $2^{O(n)}$ for the worst-case number of states of the resulting partial derivative automaton, where $n$ is the size of the expression. Using the framework of analytic combinatorics, we establish an upper bound of $(1.056+o(1))^n$ for its asymptotic average-state complexity, which is significantly smaller than the one for the worst case. Some experimental resu...
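As a rough illustration of the conversion this abstract refers to, the following Python sketch computes partial derivatives for regular expressions with intersection and builds the reachable partial derivative automaton. It assumes the usual Antimirov-style rules, with the intersection case taken as the pairwise intersection of the partial derivatives of both operands. The tuple encoding and all function names (`pd`, `pd_automaton`, `accepts`, and so on) are hypothetical choices made for this example, not taken from the paper or from any library.

```python
from itertools import product

# Expressions are nested tuples; this encoding is an assumption of the sketch.
EPS = ("eps",)

def sym(a): return ("sym", a)
def alt(r, s): return ("alt", r, s)
def star(r): return ("star", r)
def inter(r, s): return ("and", r, s)

def cat(r, s):
    # Simplify eps.s to s so that derivative states stay small.
    return s if r == EPS else ("cat", r, s)

def nullable(r):
    """True if the empty word belongs to the language of r."""
    tag = r[0]
    if tag in ("eps", "star"):
        return True
    if tag == "sym":
        return False
    if tag == "alt":
        return nullable(r[1]) or nullable(r[2])
    return nullable(r[1]) and nullable(r[2])  # "cat" and "and"

def pd(r, a):
    """Set of partial derivatives of r with respect to the letter a."""
    tag = r[0]
    if tag == "eps":
        return set()
    if tag == "sym":
        return {EPS} if r[1] == a else set()
    if tag == "alt":
        return pd(r[1], a) | pd(r[2], a)
    if tag == "cat":
        left = {cat(x, r[2]) for x in pd(r[1], a)}
        return left | pd(r[2], a) if nullable(r[1]) else left
    if tag == "star":
        return {cat(x, r) for x in pd(r[1], a)}
    # Intersection: combine every derivative of the left operand
    # with every derivative of the right operand.
    return {inter(x, y) for x, y in product(pd(r[1], a), pd(r[2], a))}

def pd_automaton(r, alphabet):
    """Reachable partial derivative NFA: (states, transitions, finals)."""
    states, todo, delta = {r}, [r], {}
    while todo:
        q = todo.pop()
        for a in alphabet:
            targets = pd(q, a)
            delta[(q, a)] = targets
            for t in targets - states:
                states.add(t)
                todo.append(t)
    return states, delta, {q for q in states if nullable(q)}

def accepts(r, word):
    """Membership test by iterating partial derivatives over the word."""
    current = {r}
    for a in word:
        current = set().union(*(pd(q, a) for q in current))
    return any(nullable(q) for q in current)

# Example: words over {a, b} that both end and start with 'a'.
any_word = star(alt(sym("a"), sym("b")))
example = inter(cat(any_word, sym("a")), cat(sym("a"), any_word))
```

With this encoding, `pd_automaton(example, "ab")` explores only the derivatives reachable from `example`, so the number of states it returns is a quantity of the kind the bounds above are about, measured here on a single small expression.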
The distinguishability language of a regular language L is the set of words distinguishing between pairs of words under the Myhill-Nerode equivalence induced by L, i.e., between pairs of distinct left quotients of L. The similarity relation induced by a language L is inspired by the Myhill-Nerode equivalence and was used to obtain compact representations of automata for a finite language L, i.e., deterministic finite cover automata, which are deterministic finite automata accepting all the words of L and possibly some other words that are longer than any word of L. The dissimilarity language of a finite language L is defined as the set of words that separate a pair of words which are not similar w.r.t. the (finite) language L. In this paper we extend the study of the distinguishability operation on regular languages to l-dissimilarity, for l ∈ N, and the dissimilarity operation on finite languages. We examine their properties, the state complexity, and relations...
There are many different constructions for converting regular expressions to finite automata. In this paper we focus on the prefix automaton, $\apre$, introduced by Yamamoto in 2014. We present two different methods for the construction of $\apre$. The first is inductive, based on a system of expression equations; the second uses an iterative function to compute the states and transitions. We establish relationships between $\apre$ and other constructions, such as the position automaton, the partial derivative automaton and their double reversal (dual) counterparts. We study the average size of these constructions, both experimentally and from an analytic combinatorics point of view. Finally, we extend the construction of the prefix automaton to regular expressions with intersection and show that the relationships with the other automaton constructions also hold for these expressions.
Descriptional complexity is the study of the conciseness of the various models representing formal languages. The state complexity of a regular language is the size, measured by the number of states, of the smallest finite automaton (deterministic or nondeterministic) that recognises it. Operational state complexity is the study of the state complexity of operations over languages. In this survey, we review the state complexities of individual regularity-preserving language operations on regular and some subregular languages. Then we revisit the state complexities of the combination of individual operations. We also review methods of estimation and approximation of state complexity of more complex combined operations.
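To make the notion of operational state complexity concrete, here is a minimal Python sketch that measures the deterministic state complexity of a language by minimising a complete DFA (Moore-style partition refinement) and then applies it to an intersection built by the product construction. The DFA encoding, the helper names (`mod_counter`, `product_dfa`, `state_complexity`) and the choice of witness languages are assumptions made for this example only; they are not taken from the survey.

```python
from itertools import product

def mod_counter(letter, modulus, alphabet):
    """Complete DFA accepting words whose number of `letter` is 0 mod `modulus`."""
    delta = {(q, a): (q + 1) % modulus if a == letter else q
             for q in range(modulus) for a in alphabet}
    return delta, 0, {0}

def product_dfa(dfa1, dfa2, alphabet):
    """Product construction recognising the intersection of two complete DFAs."""
    (d1, s1, f1), (d2, s2, f2) = dfa1, dfa2
    states1 = {q for q, _ in d1}
    states2 = {q for q, _ in d2}
    delta = {((p, q), a): (d1[(p, a)], d2[(q, a)])
             for p, q in product(states1, states2) for a in alphabet}
    return delta, (s1, s2), set(product(f1, f2))

def state_complexity(dfa, alphabet):
    """Number of states of the minimal complete DFA equivalent to `dfa`."""
    delta, start, finals = dfa
    # Keep only the states reachable from the initial state.
    states, todo = {start}, [start]
    while todo:
        q = todo.pop()
        for a in alphabet:
            t = delta[(q, a)]
            if t not in states:
                states.add(t)
                todo.append(t)
    # Start from the final / non-final partition and refine until stable.
    block = {q: int(q in finals) for q in states}
    while True:
        sig = {q: (block[q],) + tuple(block[delta[(q, a)]] for a in alphabet)
               for q in states}
        ids = {s: i for i, s in enumerate(sorted(set(sig.values())))}
        if len(ids) == len(set(block.values())):
            return len(ids)  # no block was split, so the partition is stable
        block = {q: ids[sig[q]] for q in states}

alphabet = "ab"
count_a = mod_counter("a", 3, alphabet)   # 3-state DFA
count_b = mod_counter("b", 4, alphabet)   # 4-state DFA
both = product_dfa(count_a, count_b, alphabet)
```

For these two languages, `state_complexity(both, alphabet)` evaluates to 12, the product of the operand complexities 3 and 4, which is the well-known worst case for the intersection of two regular languages given by DFAs.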
IFIP Advances in Information and Communication Technology, 2021
Part: TC 1: Foundations of Computer Science. International audience. Descriptional complexity has historically been a multidisciplinary area of study, with contributions from automata theory, computational complexity, cryptography, information theory, probability, statistics, pattern recognition, machine learning, computational learning theory, computer vision, neural networks, formal languages and other fields. Some basic questions are: How succinctly can a descriptional system represent objects (for example, encoded as formal languages) in comparison with other descriptional systems? What is the maximal size trade-off when changing from one system to another, and can it be achieved
Positions and derivatives are two essential notions in the conversion methods from regular expressions to equivalent finite automata. Partial-derivative-based methods have recently been extended to regular expressions with intersection (semi-extended). In this paper, we present a position automaton construction for those expressions. This construction generalizes the notion of position, making it compatible with intersection. The resulting automaton is homogeneous and has the partial derivative automaton as a quotient.
FAdo is an ongoing project whose goal is the development of a Python environment for manipulation of finite automata and regular expressions. Currently it provides most standard automata operations including conversion from deterministic to nondeterministic, minimisation, boolean operations, concatenation, conversion between automata and regular expressions, and word recognition. It also includes an innovative method for testing non-equivalence of two automata (or regular expressions) using a DFA canonical form and a witness generator of the difference of two automata. Our main