We initiate a study of languages of partial words related to regular languages of full words. First, we study the possibility of expressing a regular language of full words as the image of a partial-words-language through a substitution that only replaces the hole symbols of the partial words with a finite set of letters. Results regarding the structure, uniqueness and succinctness of such a representation, as well as a series of related decidability and computational-hardness results, are presented.
Erdős raised the question whether there exist infinite abelian square-free words over a given alphabet, that is, words in which no two adjacent subwords are permutations of each other. It can easily be checked that no such word exists over a three-letter alphabet. However, infinite abelian square-free words have been constructed over alphabets of sizes as small as four. In this paper, we investigate the problem of avoiding abelian squares in partial words, or sequences that may contain some holes.
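The definition above can be illustrated with a minimal Python sketch for full words (no holes). The function names `has_abelian_square` and `all_words_have_abelian_square` are hypothetical, and the brute-force search is only an illustration of the "easily checked" claim, not the method used in the paper.

```python
from collections import Counter
from itertools import product


def has_abelian_square(word: str) -> bool:
    """Return True if two adjacent factors of `word` are permutations of each other."""
    n = len(word)
    for start in range(n):
        # `half` is the common length of the two adjacent factors.
        for half in range(1, (n - start) // 2 + 1):
            left = word[start:start + half]
            right = word[start + half:start + 2 * half]
            if Counter(left) == Counter(right):
                return True
    return False


def all_words_have_abelian_square(alphabet: str, length: int) -> bool:
    """Exhaustively check every word of the given length over `alphabet`."""
    return all(
        has_abelian_square("".join(letters))
        for letters in product(alphabet, repeat=length)
    )
```

For instance, `has_abelian_square("abcacb")` is true because `abc` and `acb` are adjacent permutations of each other, while `abcbabc` contains no abelian square. Calling `all_words_have_abelian_square("abc", 8)` checks all 6561 ternary words of length 8.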
The notion of repetition of factors in words is central to considerations on sequences. One of the recent generalizations regarding this concept was introduced by Czeizler et al. (2010) and investigates a restricted version of that notion in the context of DNA computing and bioinformatics. It considers a word to be a pseudo-repetition if it is the iterated concatenation of one of its prefixes and the image of this prefix through an involution.
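A minimal Python sketch of this definition follows. It assumes, purely for illustration, that the involution is the Watson-Crick reverse complement on the DNA alphabet and that a pseudo-repetition needs at least two blocks; the names `watson_crick` and `pseudo_repetition_root` are hypothetical, and the naive search is not the algorithm of any of the papers listed here.

```python
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}


def watson_crick(word: str) -> str:
    """Antimorphic involution: reverse the word and complement each letter."""
    return "".join(COMPLEMENT[c] for c in reversed(word))


def pseudo_repetition_root(word: str, involution=watson_crick):
    """Return the shortest prefix t such that `word` is a concatenation of at
    least two blocks, each equal to t or involution(t); None if none exists.

    Assumes the involution preserves length, so all blocks have length |t|.
    """
    n = len(word)
    for length in range(1, n // 2 + 1):
        if n % length:
            continue
        root = word[:length]
        image = involution(root)
        blocks = (word[i:i + length] for i in range(0, n, length))
        if all(block in (root, image) for block in blocks):
            return root
    return None
```

Under these assumptions, `ACGT` is a pseudo-repetition with root `AC`, since the reverse complement of `AC` is `GT`, whereas `ACGA` is not.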
A k-abelian cube is a word uvw, where u, v, w are either equal or have the same factors of length k with the same multiplicities and the same prefixes and suffixes of length k − 1. It was previously known that k-abelian cubes are avoidable over a binary alphabet for k ≥ 8. Here it is proved that this holds for k ≥ 5.
The problem of classifying all the avoidable binary patterns in (full) words has been completely solved (see Chapter 3 of M. Lothaire, Algebraic Combinatorics on Words, Cambridge University Press, 2002). In this paper, we classify all the avoidable binary patterns in partial words, or sequences that may have some undefined positions called holes.
Pseudo-repetitions are a natural generalization of the classical notion of repetitions in sequences. We solve fundamental algorithmic questions on pseudo-repetitions by application of insightful combinatorial results on words. More precisely, we efficiently decide whether a word is a pseudo-repetition and find all the pseudo-repetitive factors of a word.
We consider here two formal operations on words inspired by DNA biochemistry: hairpin lengthening, introduced in [15], and its inverse, called hairpin shortening. We study the closure of the class of regular languages under the non-iterated and iterated variants of the two operations. The main results are: although any finite number of applications of the hairpin lengthening to a regular language may lead to non-regular languages, the iterated hairpin lengthening of a regular language is always regular.
A partial word, a sequence over a finite alphabet that may have some undefined positions or holes, is bordered if one of its proper prefixes is compatible with one of its suffixes. The number theoretical problem of enumerating all bordered full words (the ones without holes) of a fixed length n over an alphabet of a fixed size k is well
Pseudopalindromes are words that are fixed points for some antimorphic involution. In this paper we discuss a newer word operation, that of pseudopalindromic completion, in which symbols are added to either side of the word such that the newly obtained words are pseudopalindromes. This notion represents a particular type of hairpin completion, where the length of the hairpin is at most one.
Extending the general undecidability result concerning the absoluteness of inequalities between subword histories, in this paper we show that the question whether such inequalities hold for all words is undecidable already over a binary alphabet and bounded number of blocks, and even in very simple cases an answer requires an intractable computation.
We propose an algorithm that, given as input a full word w of length n and positive integers p and d, outputs, if any exists, a maximal p-periodic partial word contained in w with the property that no two holes are within distance d (so-called d-valid). Our algorithm runs in O(nd) time and is used for the study of repetition-freeness of partial words.
The subject of this thesis is repetitions in partial words, that is, words that, in addition to regular letters, may contain a number of unknown symbols, called "hole" or "do-not-know" symbols. More specifically, an extension of the notion of repetition established by Axel Thue is presented and solved.
Partial words, or sequences over a finite alphabet that may have do-not-know symbols or holes, have recently been the subject of much investigation. Several interesting combinatorial properties have been studied, such as the periodic behavior and the counting of distinct squares in partial words. In this paper, we extend the three-squares lemma on words to partial words with one hole.
The avoidability of binary patterns by binary cube-free words is investigated and the exact bound between unavoidable and avoidable patterns is found. All avoidable patterns are shown to be D0L-avoidable. For avoidable patterns, the growth rates of the avoiding languages are studied. All such languages, except for the overlap-free language, are proved to have exponential growth.
In this paper we investigate several periodicity-related algorithms for partial words. First, we show that all periods of a partial word of length n are determined in $\mathcal{O}(n \log n)$ time, and provide algorithms and data structures that help us answer in constant time queries regarding the periodicity of their factors. For this we need $\mathcal{O}(n^2)$ preprocessing time and $\mathcal{O}(n)$ updating time whenever the words are extended by adding a letter. In the second part we show that substituting letters of a word w with holes, with the property that no two holes are too close to each other, to make it periodic can be done in optimal time $\mathcal{O}(|w|)$. Moreover, we show that inserting the minimum number of holes such that the word keeps the property can be done as fast.
Partial words are sequences over a finite alphabet that may have some undefined positions, or "holes," that are denoted by $\diamond$'s. A nonempty partial word is called bordered if one of its proper prefixes is compatible with one of its suffixes (here $\diamond$ is compatible with every letter in the alphabet); it is called unbordered otherwise. In this paper, we investigate the problem of computing the maximum number of holes a partial word of a fixed length can have and still fail to be bordered.
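The bordered/unbordered definition above can be sketched in a few lines of Python. The sketch assumes that holes are written as `?` in an ordinary string and that the prefix and suffix being compared are nonempty and of equal length; the names `HOLE`, `compatible`, and `is_bordered` are hypothetical and chosen only for this illustration.

```python
HOLE = "?"  # assumed stand-in for the hole symbol


def compatible(u: str, v: str) -> bool:
    """Two partial words are compatible if they have equal length and agree
    on every position where both are defined (a hole matches any letter)."""
    return len(u) == len(v) and all(
        a == b or HOLE in (a, b) for a, b in zip(u, v)
    )


def is_bordered(word: str) -> bool:
    """Return True if some nonempty proper prefix is compatible with the
    suffix of the same length."""
    n = len(word)
    return any(compatible(word[:k], word[n - k:]) for k in range(1, n))
```

For example, `aab` is unbordered, while `a?b` is bordered because its prefix `a?` is compatible with its suffix `?b`.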
Well-known results on the avoidance of large squares in (full) words include the following: (1) Fraenkel and Simpson showed that we can construct an infinite binary word containing at most three distinct squares; (2) Entringer, Jackson and Schatz showed that there exists an infinite binary word avoiding all squares of the form xx such that |x| ≥ 3, and that the bound 3 is optimal; (3) Dekking showed that there exists an infinite cube-free binary word that avoids all squares xx with |x| ≥ 4, and that the bound of 4 is best possible. In this paper, we investigate these avoidance results in the context of partial words, or sequences that may have some undefined symbols called holes. Here, a square has the form uv with u and v compatible, and consequently, such a square is compatible with a number of full words that are squares over the given alphabet. We show that (1) holds for partial words with at most two holes. We prove that (2) extends to partial words having infinitely many holes. Regarding (3), we show that there exist binary partial words with infinitely many holes that avoid cubes and have only eleven full word squares compatible with their factors. Moreover, this number is optimal, and all such squares xx satisfy |x| ≤ 4.
Papers by Robert Mercas