Theoretical Computer Science: Joel D. Day, Daniel Reidenbach, Johannes C. Schneider
Article info
Article history: Available online 28 August 2015
Keywords: Equality sets; Morphisms; Dual post correspondence problem; Periodicity forcing sets; Periodicity forcing words; Ambiguity of morphisms

Abstract
The Dual Post Correspondence Problem asks, for a given word $\alpha$, if there exists a non-periodic morphism $g$ and an arbitrary morphism $h$ such that $g(\alpha) = h(\alpha)$. Thus $\alpha$ satisfies the Dual PCP if and only if it belongs to a non-trivial equality set. Words which do not satisfy the Dual PCP are called periodicity forcing, and are important to the study of word equations, equality sets and ambiguity of morphisms. In this paper, a prime subset of periodicity forcing words is presented. It is shown that when combined with a particular type of morphism it generates exactly the full set of periodicity forcing words. Furthermore, it is shown that there exist examples of periodicity forcing words which contain any given factor/prefix/suffix. Finally, an alternative class of mechanisms for generating periodicity forcing words is developed, resulting in a class of examples which contrast those known already.
© 2015 Published by Elsevier B.V.
1. Introduction
The Dual Post Correspondence Problem (Dual PCP) is a decidable variation of the famous Post Correspondence Problem (see Post [12]). It was introduced by Culik II and Karhumäki in [1], where the authors make progress towards a characterisation of binary equality sets. A word $\alpha$ is said to satisfy the Dual PCP if it belongs to an equality set $E(g, h)$ for two morphisms $g, h$ where at least one morphism is non-periodic. For example, the word $abba$ belongs to $E(g, h)$ where $g, h : \{a, b\}^* \to \{a, b\}^*$ are the morphisms given by:
$$g(x) := \begin{cases} aba & \text{if } x = a, \\ b & \text{if } x = b, \end{cases} \qquad \text{and} \qquad h(x) := \begin{cases} a & \text{if } x = a, \\ bab & \text{if } x = b. \end{cases}$$
Thus $abba$ satisfies the Dual PCP; in other words, it is a non-trivial equality word. In contrast, the word $abaab$ does not satisfy the Dual PCP, but this claim is much harder to verify. The latter is called a periodicity forcing word since it forces each pair of morphisms which agree on it to be periodic.
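The example above can be checked mechanically. The following is a minimal sketch, assuming words are represented as Python strings and morphisms as dictionaries from letters to images; the helper name `apply_morphism` is illustrative and not taken from the paper.

```python
def apply_morphism(images, word):
    """Apply the morphism given by `images` (letter -> image) to `word`."""
    return "".join(images[letter] for letter in word)


# The two morphisms g and h from the example.
g = {"a": "aba", "b": "b"}
h = {"a": "a", "b": "bab"}

image_g = apply_morphism(g, "abba")
image_h = apply_morphism(h, "abba")

# g and h agree on abba, so abba lies in the equality set E(g, h).
agree_on_abba = image_g == image_h

# g is non-periodic: its two images do not commute.
g_is_non_periodic = g["a"] + g["b"] != g["b"] + g["a"]
```

Both images equal `ababbaba`, so `abba` is a non-trivial equality word. No comparably short check exists for `abaab`, since being periodicity forcing is a statement about all pairs of morphisms.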
Identifying which words belong to non-trivial equality sets and which do not is of immediate significance to the Post Correspondence Problem, which is simply the emptiness problem for equality sets. It is well known that although the PCP is undecidable in general, it is decidable, even in polynomial time, in the binary case (see Halava and Holub [6]). It is therefore no surprise that, for binary words, the Dual PCP is relatively well understood.
This work was supported by the London Mathematical Society, grant SC7-1112-02.
A preliminary version [4] of this work was presented at the conference WORDS 2013.
* Corresponding authors.
E-mail addresses: J.Day@lboro.ac.uk (J.D. Day), D.Reidenbach@lboro.ac.uk (D. Reidenbach), johannes.schneider@dialogika.de (J.C. Schneider).
http://dx.doi.org/10.1016/j.tcs.2015.08.033
0304-3975/© 2015 Published by Elsevier B.V.
J.D. Day et al. / Theoretical Computer Science 601 (2015) 214 3
This is due both to the original research by Culik II and Karhumäki [1] and to results on equality sets (e.g., Holub [7], Hadravová and Holub [5]) and word equations (e.g., Czeizler et al. [2], Karhumäki and Petre [9]). Much less, however, is known about the Dual PCP for larger alphabets.
One reason for this is that although the Dual PCP is known to be decidable, the proof (given by Culik II and Karhumäki [1]) relies on Makanin's algorithm for solving word equations [11]. While this algorithm demonstrates that the problem is computable in principle, the complexity is extremely high, and it provides little insight into the nature of words which do or do not satisfy the Dual PCP. It is worth noting that the decidability of the PCP for alphabet sizes 3 to 6 is a long-standing open problem, and therefore equality words over these alphabets are of particular interest.
In the present paper, we investigate the Dual PCP in the general case, specifically looking at periodicity forcing words. While examples of equality words are easily found, deciding whether a word is periodicity forcing can be a particularly intricate task, and becomes even more so as the alphabet size increases. In [3], we overcome this problem by using morphisms to generate periodicity forcing words over arbitrary alphabets. Since it can be shown that many simple morphisms (such as $\varphi : \{a, b\}^* \to \{a, b\}^*$ given by $\varphi(a) := a$ and $\varphi(b) := ab$) preserve the property of being periodicity forcing, it is possible to span large parts of the set of periodicity forcing words (denoted by $\mathrm{DPCP}^{\neg}$) by applying such morphisms to existing examples.
In Section 3 of the present paper, we explore this phenomenon further. Specifically, $\mathrm{DPCP}^{\neg}$ is divided into those words which may be reached by a non-trivial morphism from other elements of the set, and those which cannot. The latter form a prime subset of $\mathrm{DPCP}^{\neg}$ from which all periodicity forcing words may be generated using a specific class of morphisms characterised in [3]. In order to find examples of these prime words, and thereby demonstrate that the subset is non-empty, it makes sense to consider the shortest periodicity forcing words. Thus, we also give bounds on the length of the shortest periodicity forcing words for any alphabet.
In Section 4, it is shown that there exist periodicity forcing words with arbitrary factors. This not only further demonstrates the complexity of the Dual PCP, but also provides another large, previously unknown class of periodicity forcing words and with it, further insight into their structure.
Finally, motivated by Section 3, we employ some alternative techniques for finding periodicity forcing words over larger alphabets, yielding insights into the set of prime words.
An alphabet $\Sigma$ is a set of symbols, or letters. A word over $\Sigma$ is a concatenation of symbols from $\Sigma$. The empty word consisting of no symbols is $\varepsilon$. We denote by $\Sigma^*$ the set of all words over $\Sigma$ (including $\varepsilon$). $\Sigma^+$ is $\Sigma^* \setminus \{\varepsilon\}$. Let $\Sigma$ be an alphabet. Let $u, v \in \Sigma^*$. Then $v$ is a factor of $u$ if there exist $w_1, w_2 \in \Sigma^*$ such that $u = w_1 v w_2$. A word $u$ is primitive if $u = v^n$ for some $v$ implies $n = 1$, otherwise $u$ is imprimitive. If $u = v^n$ for some $n \in \mathbb{N}$ and $v$ is primitive, then $v$ is a primitive root of $u$; it is unique if and only if $u \neq \varepsilon$. Two words $u, v$ commute if $uv = vu$. More generally, a set of words $\{u_1, u_2, \ldots, u_n\}$ commutes if for every $i, j$, $u_i u_j = u_j u_i$. For a set $X$, the notation $|X|$ refers to the cardinality of $X$, and for a word $u$, $|u|$ stands for the length of $u$. By $|u|_a$, we denote the number of occurrences of the letter $a$ in the word $u$. Let $u \in \{a_1, a_2, \ldots, a_n\}^*$ be a word. The Parikh vector of $u$, denoted by $P(u)$, is the vector $(|u|_{a_1}, |u|_{a_2}, \ldots, |u|_{a_n})$. The result of dividing the Parikh vector by the greatest common divisor of its components is called the basic Parikh vector. A word $u$ is ratio-imprimitive if there exist $v, w \in \Sigma^+$ such that $u = vw$ and $v, w$ have the same basic Parikh vector. Otherwise $u$ is ratio-primitive.
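These notions are all directly computable. The sketch below is one possible rendering, assuming words are non-empty Python strings and the alphabet is passed as a string of letters; the function names are illustrative and do not come from the paper.

```python
from functools import reduce
from math import gcd


def primitive_root(u):
    """Return the shortest v such that u is a power of v (u must be non-empty)."""
    for length in range(1, len(u) + 1):
        if len(u) % length == 0 and u[:length] * (len(u) // length) == u:
            return u[:length]


def parikh_vector(u, alphabet):
    """Number of occurrences of each letter of `alphabet` in u."""
    return tuple(u.count(letter) for letter in alphabet)


def basic_parikh_vector(u, alphabet):
    """Parikh vector divided by the gcd of its components (u must be non-empty)."""
    vector = parikh_vector(u, alphabet)
    divisor = reduce(gcd, vector)
    return tuple(component // divisor for component in vector)


def is_ratio_primitive(u, alphabet):
    """True if no split u = v w into non-empty parts has equal basic Parikh vectors."""
    return not any(
        basic_parikh_vector(u[:i], alphabet) == basic_parikh_vector(u[i:], alphabet)
        for i in range(1, len(u))
    )
```

For instance, `abaab` is ratio-primitive, whereas `abaabb` splits as `ab` followed by `aabb`, both with basic Parikh vector (1, 1).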
Let $\mathbb{N} := \{1, 2, \ldots\}$ be the set of natural numbers, and let $\mathbb{N}_0 := \mathbb{N} \cup \{0\}$. We often use $\mathbb{N}$ as an infinite alphabet of symbols. In order to distinguish between a word over $\mathbb{N}$ and a word over a (possibly finite) alphabet $\Sigma$, we call the former a pattern. Given a pattern $\alpha \in \mathbb{N}^*$, we call symbols occurring in $\alpha$ variables and denote the set of variables in $\alpha$ by $\mathrm{var}(\alpha)$. Hence, $\mathrm{var}(\alpha) \subseteq \mathbb{N}$. Sometimes, for convenience, we will also use $\{x_1, x_2, \ldots\}$ to denote (possibly unknown) variables in $\mathbb{N}$. We use the symbol $\cdot$ to separate the variables in a pattern, so that, for instance, $1 \cdot 1 \cdot 2$ is not confused with $11 \cdot 2$. Given patterns $\alpha$ and $\beta$, if $\beta$ may be obtained from $\alpha$ by deleting all occurrences of some variables in $\alpha$, then $\beta$ is a subpattern of $\alpha$. If $\mathrm{var}(\alpha) = \{1, 2, \ldots, n\}$ and the leftmost occurrence of each variable $x \in \mathbb{N}$ appears to the left of any variable $y$ with $y > x$, then $\alpha$ is in canonical form.
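Every pattern has a renaming in canonical form, obtained by numbering the variables in order of first occurrence. A minimal sketch, assuming patterns are represented as tuples of positive integers (the function names are illustrative):

```python
def canonical_form(pattern):
    """Rename variables so that the k-th distinct variable from the left becomes k."""
    renaming = {}
    for variable in pattern:
        # Assign the next unused number the first time a variable is seen.
        renaming.setdefault(variable, len(renaming) + 1)
    return tuple(renaming[variable] for variable in pattern)


def is_canonical(pattern):
    """True if the pattern is already in canonical form."""
    return tuple(pattern) == canonical_form(pattern)
```

For example, `canonical_form((3, 1, 3, 3, 1))` yields `(1, 2, 1, 1, 2)`.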
Given arbitrary alphabets $A, B$, a morphism is a mapping $h : A^* \to B^*$ that is compatible with concatenation, i.e., for all $v, w \in A^*$, $h(vw) = h(v)h(w)$. Hence, $h$ is fully defined for all $v \in A^*$ as soon as it is defined for all symbols in $A$. A morphism $h$ is called periodic if and only if there exists a $v \in B^*$ such that $h(a) \in \{v\}^*$ for every $a \in A$. The morphisms $g, h : A^* \to B^*$ are distinct if and only if there exists an $a \in A$ such that $g(a) \neq h(a)$. For the composition of two morphisms $g, h : A^* \to A^*$, we write $g \circ h$, i.e., for every $w \in A^*$, $g \circ h(w) = g(h(w))$. If $g(v) = h(v)$ for some $v \in A^+$, then $g$ and $h$ agree on $v$. The set of all words on which $g$ and $h$ agree is called the equality set of $g$ and $h$. A morphism $g : A^* \to B^*$ is called a renaming morphism if it is injective, and $|g(a)| = 1$ for every $a \in A$. For words $u, v \in A^+$, if there exists a renaming morphism $g$ such that $v = g(u)$, then $v$ is simply said to be a renaming of $u$.
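A short sketch of these definitions, assuming a morphism is stored as a dictionary from letters to their images (as strings). The periodicity test relies on the standard fact that a finite set of words consists of powers of a common word exactly when the words pairwise commute; the function names are illustrative.

```python
def apply_morphism(images, word):
    """Extend the letter-to-image map `images` to a whole word."""
    return "".join(images[letter] for letter in word)


def is_periodic(images):
    """True if all images are powers of one word, i.e. they pairwise commute."""
    values = list(images.values())
    return all(u + v == v + u for u in values for v in values)


def is_renaming(images):
    """True if every image is a single letter and the map on letters is injective."""
    values = list(images.values())
    return all(len(v) == 1 for v in values) and len(set(values)) == len(values)


def agree_on(g, h, word):
    """True if the morphisms g and h agree on `word`."""
    return apply_morphism(g, word) == apply_morphism(h, word)
```

For example, the morphism with images `a -> a`, `b -> ab` is neither periodic nor a renaming, and maps `abaab` to `aabaaab`.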
Two words $u \in A^+$, $v \in B^+$ are morphically coincident if there exist morphisms $g : A^* \to B^*$ and $h : B^* \to A^*$ such that $g(u) = v$ and $h(v) = u$. A pattern $\alpha \in \mathbb{N}^+$ is morphically imprimitive if it is morphically coincident to some pattern $\beta$ with $|\beta| < |\alpha|$. Otherwise $\alpha$ is morphically primitive. It is shown in [13] that if two patterns are morphically coincident, then they are either renamings of each other, or at least one is morphically imprimitive.
A morphism $g$ is said to be ambiguous with respect to a pattern $\alpha$ if there exists another morphism $h$ such that $g(\alpha) = h(\alpha)$ and $g, h$ are distinct. Thus a pattern $\alpha$ satisfies the Dual PCP (see Section 1) if there exists an ambiguous non-periodic morphism with respect to $\alpha$. In order to remain consistent with the notation in [3], we will often use $\sigma$ and $\tau$ to denote morphisms when considering ambiguity and the Dual PCP, especially if they map patterns in $\mathbb{N}^*$ to words in $\Sigma^*$. We will normally use $\varphi$ and $\psi$ if we are mapping patterns to other patterns.
It is convenient, particularly in Section 3, to refer to the set of patterns which satisfy the Dual PCP and its complement. Thus we define the set: DPCP $:= \{\alpha \in \mathbb{N}^+ \mid$ there exists a non-periodic morphism $\sigma$ and an arbitrary morphism $\tau$ such that $\sigma(\alpha) = \tau(\alpha)$ and $\sigma(x) \neq \tau(x)$ for some $x \in \mathrm{var}(\alpha)\}$. We denote the complement of DPCP by $\mathrm{DPCP}^{\neg}$. Note that $\mathrm{DPCP}^{\neg}$ is exactly the set of periodicity forcing words (see Section 1).
We can extend periodicity forcing words to periodicity forcing sets in the natural way: a set of patterns is periodicity
forcing if, whenever two distinct morphisms agree on all patterns in the set, they are periodic. A set of patterns T is said
to be a test set of another set of patterns S if any two morphisms which agree on every pattern in T also agree on every
pattern in S. Note that this means any test set of a periodicity forcing set must also be periodicity forcing.
For a set of unknowns $X := \{x_1, x_2, \ldots, x_n\}$, a word equation is an equation $\alpha = \beta$ for some words $\alpha, \beta \in X^+$. Its solutions, over some given alphabet $\Sigma$, are words $w_1, w_2, \ldots, w_n \in \Sigma^*$ such that substituting each $w_i$ for $x_i$ resolves the equation (it is equal on both sides). Thus solutions to the word equation may be expressed as morphisms $\sigma : X^* \to \Sigma^*$ such that $\sigma(\alpha)$ and $\sigma(\beta)$ are equal. Unless otherwise specified, $X$ is usually a set of variables, while $\Sigma$ is a set of letters. As a result, word equations equate patterns, and their solutions are substitutions to terminal words (words which are not patterns). We will say that a set of words satisfies an equation if the associated substitution/morphism is a solution. We will use the following well known and fundamental result on word equations throughout the rest of the paper.
Lemma 1. (See Lothaire [10].) Non-trivial word equations in two unknowns have only periodic solutions.
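Lemma 1 can be observed experimentally on small instances. The sketch below is only an illustration, not a proof: it enumerates all substitutions of words of length at most 3 over {a, b} for the two unknowns of the (arbitrarily chosen) equation `xyx = yxx` and checks that every solution found commutes. The unknown names `x`, `y`, the bound, and the function names are assumptions made for the example.

```python
from itertools import product


def words_up_to(max_length, alphabet="ab"):
    """Yield all words over `alphabet` of length 0..max_length."""
    for length in range(max_length + 1):
        for letters in product(alphabet, repeat=length):
            yield "".join(letters)


def substitute(pattern, assignment):
    """Replace each unknown in `pattern` by its assigned word."""
    return "".join(assignment[unknown] for unknown in pattern)


def solutions(lhs, rhs, max_length=3):
    """All pairs (x, y) of short words solving the equation lhs = rhs."""
    found = []
    for x in words_up_to(max_length):
        for y in words_up_to(max_length):
            assignment = {"x": x, "y": y}
            if substitute(lhs, assignment) == substitute(rhs, assignment):
                found.append((x, y))
    return found


found = solutions("xyx", "yxx")
# Every solution is periodic: the two substituted words commute.
all_periodic = all(x + y == y + x for x, y in found)
```

The search finds solutions such as `x = a`, `y = aa`, and no solution in which the two words fail to commute.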
Thus, one obtains the following. For the third statement in the corollary to hold, it must be assumed that every primitive word is a primitive root of the empty word. Note that this fits with our definition given above.
In our investigation into the use of morphisms to generate periodicity forcing words in [3], we provide the following criterion. Any morphism which satisfies the criterion preserves the property of being periodicity forcing, and thus can be used to obtain new periodicity forcing words from known ones.
Lemma 3. (See [3].) Let $\Delta_1, \Delta_2$ be sets of variables. Let $\varphi : \Delta_1^* \to \Delta_2^*$ be a morphism such that, for every $x \in \Delta_2$, there exists a $y \in \Delta_1$ satisfying $x \in \mathrm{var}(\varphi(y))$, and
Characterisations of morphisms which satisfy conditions (i) and (ii) of Lemma 3 are given in [3]. In particular, condition (ii) is satisfied if and only if the set $S := \{\varphi(x) \mid x \in \Delta_1\}$ is periodicity forcing.
Since a set of patterns commutes if and only if each pair of patterns in the set commutes, by Corollary 2, the morphism $\sigma \circ \varphi : \Delta_1^* \to \{a, b\}^*$ is periodic if and only if the set $\{\sigma(\varphi(x)) \mid x \in \Delta_1\}$ commutes. Hence, condition (i) is satisfied if and only if the set $S$ is commutativity forcing, that is, for every morphism $\sigma$ for which the set $\{\sigma(\beta) \mid \beta \in S\}$ commutes, all images $\sigma(x)$, $x \in \Delta_2$ commute. This implies that $\sigma$ is periodic.
Note that it follows from basic properties of morphisms that, for any periodicity forcing (resp. commutativity forcing) set $\{\beta_1, \beta_2, \ldots, \beta_n\}$, if a new pattern $\beta_{n+1}$ is added which does not contain any new variables (i.e., variables which do not appear in any $\beta_i$, $1 \leq i \leq n$), then the resulting set remains periodicity forcing (resp. commutativity forcing).
A method of obtaining periodicity forcing words as the morphic images of previously known examples is developed
in [3]. One consequence of the constructions given is the following:
Corollary 4. (See [3].) Let $\alpha \in \mathrm{DPCP}^{\neg}$. Then there exists a morphism $\varphi : \mathbb{N}^* \to \mathbb{N}^*$ which is not a renaming morphism, such that $\varphi(\alpha) \in \mathrm{DPCP}^{\neg}$.
Although this statement is itself fairly easily obtained, and comes as no surprise, it is worth noting the richness and variety in such morphisms $\varphi$ (which are characterised in [3]), and therefore also in the subsequent patterns $\varphi(\alpha)$ which can be obtained through the application of morphisms. Thus an obvious question arises: is every periodicity forcing word the morphic image of another?
Of course the answer is trivially affirmative if $\varphi$ is permitted to be a renaming morphism (such as the identity), or if $\alpha$ can be unary (every pattern is a morphic image of $\alpha := 1$). However, if we restrict $\varphi$ and $\alpha$ to avoid these trivial instances, the answer is no longer clear. In fact, a negative answer is provided by Proposition 9 below. Hence, the partition of periodicity forcing words into those which are morphic images of another, and those which are not, is non-trivial. We will call the latter prime. Moreover, it is reasonable to expect that these prime periodicity forcing words are sufficient, given the appropriate set of morphisms, to generate the full set. This is confirmed later by Theorem 10.
The proofs of these results rely on a lower bound for the length of periodicity forcing words, given relative to the alphabet size. This bound is achieved by considering patterns belonging to the equality sets of (pairs of) nearly periodic morphisms of the form
$$\sigma(x) := \begin{cases} a^{r} b a^{s} & \text{if } x = y, \\ a^{p_x} & \text{otherwise,} \end{cases}$$
where $y$ is some fixed variable, and $r, s, p_x \in \mathbb{N}_0$. It is apparent that the equality set of two morphisms $\sigma_1$ and $\sigma_2$ of this type is determined by a system of linear Diophantine equations, and in the case that $y$ is the same for both morphisms, it is possible to infer a strong sufficient condition for a pattern to belong to such an equality set. Since the morphisms are non-periodic, any such pattern is not periodicity forcing.
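Proposition 5. Let $\alpha$ be a pattern, and let $n := |\mathrm{var}(\alpha)|$. Suppose that $|\alpha|_x < n$ for some $x \in \mathrm{var}(\alpha)$. Then $\alpha \in$ DPCP.
Proof. Consider a pattern $\alpha$ such that $\mathrm{var}(\alpha) = \{x_1, x_2, \ldots, x_n\}$, and $|\alpha|_{x_i} < n$ for some $i \leq n$. W.l.o.g. let $i := n$. Then there exists a $k \in \mathbb{N}$ such that $|\alpha|_{x_n} = n - k$, and $\alpha$ can be written as $\beta_1 \cdot x_n \cdot \beta_2 \cdot x_n \cdot \ldots \cdot \beta_{n-k} \cdot x_n \cdot \beta_{n-k+1}$ for some patterns $\beta_1, \beta_2, \ldots, \beta_{n-k+1} \in \{x_1, x_2, \ldots, x_{n-1}\}^*$.
Consider the morphisms $\sigma, \tau : \{x_1, x_2, \ldots, x_n\}^* \to \{a, b\}^*$ given by
$$\sigma(x_i) := \begin{cases} a^{r_1} b a^{s_1} & \text{if } i = n, \\ a^{p_i} & \text{otherwise,} \end{cases} \qquad \text{and} \qquad \tau(x_i) := \begin{cases} a^{r_2} b a^{s_2} & \text{if } i = n, \\ a^{q_i} & \text{otherwise,} \end{cases}$$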
It follows that, for a periodicity forcing word with n variables, each variable must occur at least n times, implying the
next corollary which provides a lower bound on the length of the shortest periodicity forcing word for any alphabet size.
Corollary 6. Let $\alpha \notin$ DPCP, and let $n := |\mathrm{var}(\alpha)|$. Then $|\alpha| \geq n^2$.
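To make Proposition 5 concrete, the following sketch searches by brute force for two distinct nearly periodic morphisms of the above shape that agree on a given pattern. It is a naive illustration and not the construction used in the proof: the exponent bound, the representation of patterns as tuples of integers, and the function names are all assumptions.

```python
from itertools import product


def nearly_periodic(pattern, y, r, s, exponents):
    """Image of `pattern` under x -> a^r b a^s if x == y, else a^exponents[x]."""
    return "".join(
        "a" * r + "b" + "a" * s if x == y else "a" * exponents[x]
        for x in pattern
    )


def find_agreeing_pair(pattern, bound=2):
    """Search for two distinct nearly periodic morphisms agreeing on `pattern`.

    The distinguished variable y is a least frequent variable; all exponents
    range over 0..bound. Returns (y, params1, params2) or None.
    """
    variables = sorted(set(pattern))
    y = min(variables, key=pattern.count)
    others = [x for x in variables if x != y]
    candidates = []
    for r, s in product(range(bound + 1), repeat=2):
        for values in product(range(bound + 1), repeat=len(others)):
            exponents = dict(zip(others, values))
            image = nearly_periodic(pattern, y, r, s, exponents)
            candidates.append(((r, s, exponents), image))
    for (params1, image1), (params2, image2) in product(candidates, repeat=2):
        # Different parameters give different morphisms; equal images mean agreement.
        if params1 != params2 and image1 == image2:
            return y, params1, params2
    return None


example = find_agreeing_pair((1, 2, 2))
```

For the pattern `1·2·2`, in which variable 1 occurs fewer than n = 2 times, the search succeeds. For `1·2·1·1·2`, where every variable occurs at least twice, it finds nothing within this family, which is consistent with (but of course does not prove) that pattern being periodicity forcing.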
Since periodicity forcing words can be obtained as concatenations of words in a particular type of periodicity forcing set (see Section 5), it is possible to infer a corresponding upper bound from results by Holub and Kortelainen [8]. The authors provide a concise test set (containing at most $5n$ words, each of length $n$) for the set $S_n$ consisting of all permutations of the word $x_1 \cdot x_2 \cdot \ldots \cdot x_n$. Although it is stated in [8] that $S_n$ itself is not periodicity forcing, it can be verified using results from [8] and [1] that the augmented set $S'_n := S_n \cup \{x_1 \cdot x_1 \cdot x_2 \cdot x_2 \cdot \ldots \cdot x_n \cdot x_n\}$ is. Given a test set $T_n$ for $S_n$, a test set for $S'_n$ is clearly $T_n \cup \{x_1 \cdot x_1 \cdot x_2 \cdot x_2 \cdot \ldots \cdot x_n \cdot x_n\}$. Thus, there exists a test set for $S'_n$ containing at most $5n$ words of length $n$ and one word of length $2n$. The periodicity forcing word resulting from concatenating these words is at most $5n^2 + 2n$ letters long.
Proposition 7. Let $n \in \mathbb{N}$, and let $\alpha$ be a shortest pattern not in DPCP such that $|\mathrm{var}(\alpha)| = n$. Then $n^2 \leq |\alpha| \leq 5n^2 + 2n$.
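As a quick worked check of where the two bounds come from (the lower bound is Corollary 6, the upper bound counts the letters of the concatenated test set), with two small numerical instances added purely for illustration:

```latex
\[
  n^2 \;\le\; |\alpha| \;\le\;
  \underbrace{5n \cdot n}_{\text{at most } 5n \text{ words of length } n}
  \;+\; \underbrace{2n}_{x_1 x_1 x_2 x_2 \cdots x_n x_n}
  \;=\; 5n^2 + 2n .
\]
% n = 2:  4 <= |alpha| <= 24
% n = 3:  9 <= |alpha| <= 51
```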
The lower bounds are particularly useful when considering prime elements of $\mathrm{DPCP}^{\neg}$, which we define formally below.
Definition 8. Let $\alpha \in \mathrm{DPCP}^{\neg}$ be a pattern with $|\mathrm{var}(\alpha)| \geq 2$. Then $\alpha$ is said to be a prime element of $\mathrm{DPCP}^{\neg}$ (or simply prime) if for every pattern $\beta \in \mathrm{DPCP}^{\neg}$ with $|\mathrm{var}(\beta)| > 1$, and every morphism $\varphi : \mathrm{var}(\beta)^* \to \mathrm{var}(\alpha)^*$, $\varphi(\beta) = \alpha$ implies that $\varphi$ is a renaming morphism.
Showing that a pattern satisfies Definition 8 is, in general, a highly non-trivial task, since all morphisms must be accounted for with respect to every pattern $\beta \in \mathrm{DPCP}^{\neg}$. However, due to Proposition 5, it is possible to provide a relatively simple example:
Proof. It is known from Culik II and Karhumäki [1] that $\alpha$ is periodicity forcing. Assume that $\beta \in \mathrm{DPCP}^{\neg}$ is a pattern, and that $\varphi : \mathrm{var}(\beta)^* \to \mathrm{var}(\alpha)^*$ is a morphism such that $\varphi(\beta) = \alpha$. Due to the fact that $|\alpha|_2 = 2$, there exists a variable $x \in \mathrm{var}(\beta)$ such that $|\beta|_x \leq 2$. Hence, by Proposition 5, $|\mathrm{var}(\beta)| = 2$. Since $\alpha$ is primitive, $\varphi$ is non-erasing and thus $|\beta| \leq 5$. Furthermore, all periodicity forcing words of length at most 5 are given by Culik II and Karhumäki [1], so it is possible to determine by inspection that no non-renaming morphism exists which maps any of these patterns to $\alpha$, and thus Definition 8 is satisfied. □
$$\alpha_i \to \alpha_{i+1} \to \alpha_{i+2} \to \cdots \to \alpha_{i+n}$$
where each $\alpha_i$ is the morphic image of $\alpha_{i-1}$. By Corollary 4, all such chains can continue indefinitely in one direction. Theorem 10 below confirms that any such chain must terminate in the other. Note that for convenience when proving the theorem, the order of the indices of the patterns $\alpha_i$ has been reversed.
Theorem 10. There does not exist an infinite sequence of periodicity forcing words $S := \alpha_0, \alpha_1, \alpha_2, \ldots$ such that for every $i > 1$,
Proof. Assume to the contrary that such a sequence $S$ exists which is infinite. For any $i, j \in \mathbb{N}_0$ with $i > j$, let $\varphi_{i,j} := \varphi_{j+1} \circ \varphi_{j+2} \circ \cdots \circ \varphi_i$, so that $\varphi_{i,j}(\alpha_i) = \alpha_j$. We will need to use the following results from [13]: firstly that if two patterns are morphically coincident, then they are either the same (up to renaming) or at least one is morphically imprimitive and therefore not periodicity forcing, and secondly that if a pattern is fixed by a non-trivial morphism (not the identity), it is morphically imprimitive. We now prove some further preliminary claims.
Fig. 1. Depiction of the first 5 patterns of the sequence $S_k$. Each pattern $\alpha_i^{(k)}$ has its subpatterns $\beta_j^{(k)}$ listed below. Solid arrows indicate the morphisms which are explicitly given in the definition of the sequence, while the dashed arrows represent the implicit non-erasing morphisms from the subpatterns. Note that for clarity, the dotted arrows are omitted for all but the leftmost occurrence of each $\beta_i^{(k)}$.
Proof of Claim 1. Assume to the contrary that, for some $i, j \in \mathbb{N}_0$ with $i > j$, $\alpha_i$ is a renaming of $\alpha_j$. Let $\psi$ be the renaming morphism such that $\psi(\alpha_j) = \alpha_i$. If $i = j + 1$, then $\varphi_i(\alpha_i) = \alpha_j$. Thus, $\psi \circ \varphi_i(\alpha_i) = \alpha_i$. However, since $\varphi_i$ is not a renaming morphism, $\psi \circ \varphi_i$ is not the identity, and $\alpha_i$ is morphically imprimitive. If $i > j + 1$, then $\varphi_i(\alpha_i) = \alpha_{i-1}$, and $\varphi_{i-1,j}(\alpha_{i-1}) = \alpha_j$. This implies $\psi \circ \varphi_{i-1,j}(\alpha_{i-1}) = \alpha_i$. Thus, at least one of $\alpha_i$, $\alpha_{i-1}$ is morphically imprimitive. □
Our second claim provides a bound on the number of variables occurring in the patterns $\alpha_i$.
Claim 2. There exists $n \in \mathbb{N}$ such that every pattern in $S$ has at most $n$ variables.
Proof of Claim 2. Let $n := |\alpha_0|$. Let $i \in \mathbb{N}$ be arbitrary and consider the morphism $\varphi_{i,0}$ mapping $\alpha_i$ to $\alpha_0$. In particular, consider the subset of $\mathrm{var}(\alpha_i)$ of variables which are not erased by $\varphi_{i,0}$. Clearly the subset contains at least one variable $x$. Furthermore, $|\alpha_i|_x \leq n$. By Proposition 5, it follows that $|\mathrm{var}(\alpha_i)| \leq n$. □
Note that we can replace any $\alpha_i$ with one of its renamings, and $S$ will still satisfy the criteria of the theorem. Thus, by assuming that the patterns of the sequence are in canonical form, we can assume that there exists a finite alphabet $\Delta$ such that each $\alpha_i \in \Delta^*$. We now give our final preliminary claim.
Claim 3. Any infinite subsequence of $S$ also satisfies the conditions of the theorem.
Proof of Claim 3. Let $S' = \alpha_{p_0}, \alpha_{p_1}, \alpha_{p_2}, \ldots$ be an infinite subsequence of $S$. Then, for every $p_i > 1$, there exists a morphism $\varphi'_{p_i}$ satisfying $\alpha_{p_{i-1}} = \varphi'_{p_i}(\alpha_{p_i})$ (simply take $\varphi'_{p_i} = \varphi_{p_i, p_{i-1}}$). Furthermore, by Claim 1, each $\varphi'_{p_i}$ cannot be a renaming morphism. Thus $S'$ satisfies the conditions of the theorem. □
We are now ready to prove the theorem, which we do by deriving from $S$ an infinite subsequence $S_k$ which satisfies the conditions for the theorem whenever $S$ does. Thus, by showing $S_k$ does not satisfy the conditions, we obtain a contradiction and our assumption that $S$ is infinite cannot hold.
Let $\beta_{i,0}$ be the subpattern of $\alpha_i$ whose variables are not erased by $\varphi_{i,0}$. Since each $\beta_{i,0}$ contains only variables from a finite alphabet $\Delta$, and must have length at most $|\alpha_0|$, the set $\{\beta_{i,0} \mid i \in \mathbb{N}\}$ contains only finitely many different patterns. In particular, at least one such pattern $\beta_{i,0}$ must occur as a subpattern of infinitely many different patterns $\alpha_j$. Let this pattern be $\beta_0$. By Claim 3, the sequence $S_0$ obtained by removing all patterns after $\alpha_0$ which do not have $\beta_0$ as a subpattern still satisfies the criteria of the theorem. Note that $S_0$ is also still infinite. We will call the patterns of the modified sequence $\alpha_0^{(0)}, \alpha_1^{(0)}, \alpha_2^{(0)}$ etc., and define the morphisms $\varphi_i^{(0)}$ and $\varphi_{i,j}^{(0)}$ accordingly.
Similarly let $\beta_{i,1}^{(0)}$ be the subpattern of $\alpha_i^{(0)}$ whose variables are not erased by $\varphi_{i,1}^{(0)}$. By the same reasoning as above, there exists some infinitely occurring subpattern $\beta_1^{(0)}$, so we can produce an infinite subsequence $S_1$ of $S_0$ containing only the patterns $\alpha_0^{(0)}$, $\alpha_1^{(0)}$ and $\alpha_i^{(0)}$ with $\beta_1^{(0)}$ as a subpattern when $i > 1$.
By repeating this process $k > 2^{|\Delta|+1}$ times, we have an infinite sequence $S_k$ for which each pattern $\alpha_i^{(k)}$, $i > k$ contains $\beta_0^{(k)}, \beta_1^{(k)}, \ldots, \beta_k^{(k)}$ as subpatterns (see Fig. 1). Note that by definition, each $\alpha_i^{(k)}$ is a (non-erasing) morphic image of $\beta_i^{(k)}$. However, $\alpha_i^{(k)}$ can only have finitely many (at most $2^{|\Delta|} - 1$) different, non-empty subpatterns. Thus there exist $p, q, r$ such that $\beta_p^{(k)} = \beta_r^{(k)}$ for some $p > q > r$. Note that $\beta_r^{(k)}$ is a subpattern of $\alpha_q^{(k)}$, since $q \geq r + 1$. Furthermore, there exists a morphism $\varphi_{p,q}^{(k)}$ from $\alpha_p^{(k)}$ to $\alpha_q^{(k)}$. However, since $\beta_r^{(k)}$ ($= \beta_p^{(k)}$) is a subpattern of $\alpha_q^{(k)}$, there exists a morphism from $\alpha_q^{(k)}$ to $\alpha_p^{(k)}$ (see Fig. 2). This implies they are morphically coincident, and since, by Claim 1, they are not renamings of each other, at least one must be morphically imprimitive. This contradicts the assumption that all patterns are periodicity forcing, and thus completes the proof. □
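The pigeonhole step in the proof rests on the fact that a pattern has only finitely many non-empty subpatterns, one for each non-empty set of variables that is kept. A minimal sketch of this count, assuming patterns are tuples of integers (the function name is illustrative):

```python
from itertools import combinations


def subpatterns(pattern):
    """All non-empty subpatterns, obtained by deleting every occurrence of some variables."""
    variables = sorted(set(pattern))
    result = set()
    for size in range(1, len(variables) + 1):
        for kept in combinations(variables, size):
            # Keep exactly the variables in `kept`, in their original order.
            result.add(tuple(x for x in pattern if x in kept))
    return result


count_two_variables = len(subpatterns((1, 2, 1, 1, 2)))
count_three_variables = len(subpatterns((1, 2, 3, 1)))
```

A pattern with m variables therefore has exactly 2^m - 1 non-empty subpatterns (the pattern itself included), which is bounded by 2^|Δ| - 1 when all variables come from Δ.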
Fig. 2. Diagram showing morphic coincidence of $\alpha_p^{(k)}$ and $\alpha_q^{(k)}$. Morphisms are indicated by arrows, where the solid arrows indicate which morphisms are responsible for the coincidence loop.
Consequently every periodicity forcing word is either a prime element of $\mathrm{DPCP}^{\neg}$ or the morphic image of a prime element of $\mathrm{DPCP}^{\neg}$, and the set $\mathrm{DPCP}^{\neg}$ is spanned by one-sided infinite chains of the form
$$\alpha_0 \to \alpha_1 \to \cdots \to \alpha_n \to \cdots$$
where each $\alpha_i$ is the morphic image of $\alpha_{i-1}$ and $\alpha_0$ is prime.
Corollary 11. Let $\alpha$ be a periodicity forcing word. Then $\alpha$ is either prime, or the morphic image of a prime periodicity forcing word.
Since a characterisation of morphisms which map periodicity forcing words to periodicity forcing words is given in [3], Theorem 10 provides a strong insight into the structure of $\mathrm{DPCP}^{\neg}$.
By definition, it is not possible to use morphisms to generate prime periodicity forcing words, so alternative methods must be used to find them. This is investigated in Section 5, where some additional insights are gained.
Section 3 and [3] present constructions for periodicity forcing words over any given alphabet. An immediate consequence is that we are also able to construct, for any pattern $\beta$, a periodicity forcing set containing $\beta$. For example, if $\alpha \in \mathrm{DPCP}^{\neg}$ and $\mathrm{var}(\beta) = \mathrm{var}(\alpha)$, then $\{\beta, \alpha\}$ is periodicity forcing. More generally, the addition of a periodicity forcing word over an appropriate alphabet is sufficient to turn any finite set of patterns into a periodicity forcing set. Thus we have a high degree of freedom when producing sets which are periodicity forcing, and therefore also morphisms satisfying Lemma 3. In particular, we are able to construct, for any given pattern $\beta$, a morphism $\varphi$ and pre-image $\alpha$ such that the pattern $\gamma := \varphi(\alpha)$ is periodicity forcing and contains $\beta$ as a factor, prefix or suffix.
In order to guarantee that $\varphi$ satisfies the conditions given in Lemma 3, the set $\{\varphi(x) \mid x \in \mathrm{var}(\alpha)\}$ must not only be periodicity forcing, but also commutativity forcing, i.e. every morphism $\sigma$ such that the words $\sigma(\varphi(x))$, $x \in \mathrm{var}(\alpha)$ commute is periodic. A construction satisfying this condition is given in the next proposition.
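Proposition 12. Let $\beta_0$ be a pattern, and let $n := \lceil \log_2(|\mathrm{var}(\beta_0)|) \rceil$. There exist patterns $\beta_1, \beta_2, \ldots, \beta_n$ with $P(\beta_0) = P(\beta_1) = \cdots = P(\beta_n)$ such that $\{\beta_0, \beta_1, \ldots, \beta_n\}$ is commutativity forcing.
Proof. Consider the case that $|\mathrm{var}(\beta_0)| = 2^n$. The argument is easily adapted to the case that this is not true. W.l.o.g. let $\beta_0$ be in canonical form, and note that this implies that $\beta_0$ can be expressed as $\alpha_1 \cdot \alpha_2 \cdot \ldots \cdot \alpha_m$ where $m = |\mathrm{var}(\beta_0)|$, and $\alpha_i := i \cdot \alpha'_i$ for some pattern $\alpha'_i \in \{1, 2, \ldots, i\}^*$. For $i \leq k$, let $\beta_i$ be the pattern obtained from $\beta_0$ by swapping adjacent factors consisting of $2^{i-1}$ consecutive patterns $\alpha_j$, i.e.,
$$\begin{aligned} \beta_1 &= \alpha_2 \cdot \alpha_1 \cdot \alpha_4 \cdot \alpha_3 \cdot \ldots \cdot \alpha_m \cdot \alpha_{m-1} \\ \beta_2 &= \alpha_3 \cdot \alpha_4 \cdot \alpha_1 \cdot \alpha_2 \cdot \ldots \cdot \alpha_{m-1} \cdot \alpha_m \cdot \alpha_{m-3} \cdot \alpha_{m-2} \\ &\;\;\vdots \\ \beta_k &= \alpha_{\frac{m}{2}+1} \cdot \alpha_{\frac{m}{2}+2} \cdot \ldots \cdot \alpha_m \cdot \alpha_1 \cdot \alpha_2 \cdot \ldots \cdot \alpha_{\frac{m}{2}} \end{aligned}$$
Note that $P(\beta_0) = P(\beta_1) = \cdots = P(\beta_k)$, so for any morphism $\sigma$, we have that
$$|\sigma(\beta_0)| = |\sigma(\beta_1)| = \cdots = |\sigma(\beta_n)|.$$
Thus, the system of word equations
$$\beta_i \cdot \beta_j = \beta_j \cdot \beta_i$$
for all $i, j$ with $0 \leq i < j \leq n$ is equivalent to the simpler system
$$\beta_0 = \beta_1 = \cdots = \beta_n.$$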
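The block-swapping construction can be sketched as follows. This is a hedged illustration under several assumptions: patterns are tuples of integers in canonical form, the number of variables is a power of two, each factor is taken to start at the first occurrence of a new variable, and the function names are invented for the example.

```python
def factorise(pattern):
    """Split a canonical-form pattern at the first occurrence of each variable."""
    starts = [pattern.index(x) for x in sorted(set(pattern))]
    bounds = starts + [len(pattern)]
    return [pattern[bounds[i]:bounds[i + 1]] for i in range(len(starts))]


def swap_blocks(factors, block):
    """Swap each pair of adjacent blocks consisting of `block` consecutive factors."""
    swapped = []
    for start in range(0, len(factors), 2 * block):
        swapped += factors[start + block:start + 2 * block] + factors[start:start + block]
    return swapped


def commutativity_patterns(pattern):
    """Patterns obtained by swapping blocks of 1, 2, 4, ... factors.

    Assumes the number of variables of `pattern` is a power of two.
    """
    factors = factorise(pattern)
    patterns, block = [], 1
    while block < len(factors):
        swapped = swap_blocks(factors, block)
        patterns.append(tuple(x for factor in swapped for x in factor))
        block *= 2
    return patterns


example_patterns = commutativity_patterns((1, 1, 2, 1, 3, 2, 4))
```

Since each resulting pattern is a rearrangement of the same factors, all of them share the Parikh vector of the original pattern, which is the property used to pass from the commutation equations to the simpler system of equalities.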
It is now shown that all solutions to the above system of word equations are periodic. Let $\sigma : \{1, 2, \ldots, n\}^* \to \{a, b\}^*$ be an arbitrary solution, and consider the equality $\beta_0 = \beta_1$. This is equivalent to
$$\sigma(\alpha_1)\,\sigma(\alpha_2) \cdots \sigma(\alpha_m) = \sigma(\alpha_2)\,\sigma(\alpha_1) \cdots \sigma(\alpha_m)\,\sigma(\alpha_{m-1}).$$
By comparing the prefix of length $|\sigma(\alpha_1)| + |\sigma(\alpha_2)|$ on either side, $\sigma(\alpha_1)\,\sigma(\alpha_2) = \sigma(\alpha_2)\,\sigma(\alpha_1)$. By Corollary 2, it follows that there exists a primitive word $w_1 \in \{a, b\}^*$ such that $\sigma(\alpha_1), \sigma(\alpha_2) \in \{w_1\}^*$. A similar argument may be made for the next, and indeed every pair of patterns $\alpha_j, \alpha_{j+1}$ where $j < m$ is odd. Thus, for $1 \leq i \leq \frac{m}{2}$, there exists a primitive word $w_i \in \{a, b\}^*$ such that $\sigma(\alpha_{2i-1}), \sigma(\alpha_{2i}) \in \{w_i\}^*$. Moreover, by the equation $\beta_1 = \beta_2$, it is possible to employ the same argument to determine that for $1 \leq i \leq \frac{m}{4}$, the words $w_{2i-1}$ and $w_{2i}$ are equal. By continuing this argument for each successive equality $\beta_j = \beta_{j+1}$, it follows that $w_1 = w_2 = \cdots = w_{\frac{m}{2}}$, so there exists a primitive word $w \in \{a, b\}^*$ such that $\sigma(\alpha_i) \in \{w\}^*$ for all $1 \leq i \leq m$.
Since $\alpha_1 \in 1^+$, this implies $\sigma(1) \in \{w\}^*$. Assume that $\sigma(1), \sigma(2), \ldots, \sigma(r) \in \{w\}^*$ for some $1 \leq r < m$. Then since
It is now possible to show that for any given pattern $\beta$, there exists a periodicity forcing word $\gamma$ with $\beta$ as a factor.
Proof. It is known from [3] that there exists a pattern $\beta_1 \notin$ DPCP such that $\mathrm{var}(\beta) = \mathrm{var}(\beta_1)$. By Proposition 12, there exist patterns $\beta_2, \beta_3, \ldots, \beta_n$ with $P(\beta) = P(\beta_2) = \cdots = P(\beta_n)$ such that the set $\{\beta, \beta_2, \ldots, \beta_n\}$ is commutativity forcing. Since $\mathrm{var}(\beta) = \mathrm{var}(\beta_i)$ for $1 \leq i \leq n$, it follows that the augmented set $\{\beta, \beta_1, \beta_2, \ldots, \beta_n\}$ is commutativity forcing. Furthermore, since $\beta_1$ is periodicity forcing, the set is also periodicity forcing. Thus the morphism $\varphi : \{1, 2, \ldots, n + 1\}^* \to \mathrm{var}(\beta)^*$ given by $\varphi(i) := \beta_i$ for $1 \leq i \leq n$ and $\varphi(n + 1) := \beta$ satisfies both conditions of Lemma 3. From [3], there exists a pattern $\alpha \notin$ DPCP such that $\mathrm{var}(\alpha) = \{1, 2, \ldots, n + 1\}$, and by Lemma 3, $\gamma := \varphi(\alpha) \notin$ DPCP. Since $\beta = \varphi(n + 1)$ and $n + 1 \in \mathrm{var}(\alpha)$, $\beta$ is a factor of $\gamma$ as required. The case that $\beta$ is a prefix (resp. suffix) of $\gamma$ can be shown simply by using renamings of $\alpha$ for which $n + 1$ occurs as a prefix (resp. suffix). □
Example 14 demonstrates how $\varphi$, and therefore $\gamma$, may be constructed in the case that $\beta = 1 \cdot 1 \cdot 2 \cdot 3$.
While Section 3 provides motivation for the further study of generating periodicity forcing words with morphisms, it also demonstrates the need for other methods, since prime patterns can clearly not be obtained in this way. In [1], Culik II and Karhumäki show that this may be done using periodicity forcing sets. Indeed, patterns not in DPCP are essentially periodicity forcing sets with a cardinality of 1. However, it is generally easier to construct periodicity forcing sets with higher cardinalities, as more patterns result in a more restricted class of pairs of morphisms which agree on every pattern. This is precisely the advantage gained when using morphisms to generate periodicity forcing words.
It follows from their basic properties that the agreement of two morphisms on a ratio-imprimitive pattern can be reduced to the agreement of those morphisms on a set of two (or more) shorter patterns. In particular, if $\alpha = \beta_1 \cdot \beta_2 \cdot \ldots \cdot \beta_n$, where $P(\beta_1) = P(\beta_2) = \cdots = P(\beta_n)$, then $\alpha \notin$ DPCP if and only if $\{\beta_1, \beta_2, \ldots, \beta_n\}$ is a periodicity forcing set.
Hence, given a periodicity forcing set of patterns with the same basic Parikh vector, it is possible to construct periodicity forcing words by concatenating all the patterns in the set. It is the focus of the present section to investigate periodicity forcing sets which have this additional property and use them to obtain periodicity forcing words which may be prime.
We will give constructions (Theorem 17 and Theorem 21) which allow new periodicity forcing sets to be formed from existing ones. In particular, since strong sufficient conditions are known for a set of patterns over two variables to be periodicity forcing (see, e.g., Holub [7]), we will provide constructions which increase the alphabet size. We take the following concise example from [1] which will be used later on.
Lemma 15. (See Culik II and Karhumäki [1].) The set $\{1 \cdot 2,\ 1 \cdot 1 \cdot 2 \cdot 2\}$ is periodicity forcing.
Note that by the reasoning above, we can infer that the patterns $1 \cdot 2 \cdot 1 \cdot 1 \cdot 2 \cdot 2$ and $1 \cdot 1 \cdot 2 \cdot 2 \cdot 1 \cdot 2$ are periodicity forcing.
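The reduction from a ratio-imprimitive pattern to a set of shorter patterns can be computed by cutting wherever the part read so far has the same basic Parikh vector as the whole pattern. The sketch below is one way to do this, assuming patterns are non-empty tuples of integers; the function names are illustrative.

```python
from functools import reduce
from math import gcd


def basic_parikh(pattern, variables):
    """Parikh vector of a non-empty pattern divided by the gcd of its entries."""
    counts = [pattern.count(x) for x in variables]
    divisor = reduce(gcd, counts)
    return tuple(count // divisor for count in counts)


def ratio_factors(pattern):
    """Split `pattern` into the shortest consecutive factors that all share
    the basic Parikh vector of the whole pattern."""
    variables = sorted(set(pattern))
    target = basic_parikh(pattern, variables)
    factors, start = [], 0
    for end in range(1, len(pattern) + 1):
        if basic_parikh(pattern[start:end], variables) == target:
            factors.append(pattern[start:end])
            start = end
    return factors
```

Applied to `1·2·1·1·2·2`, this recovers the two patterns of Lemma 15, while a ratio-primitive pattern such as `1·2·1·1·2` is returned as a single factor.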
Our constructions are based on the substitution of individual variables with patterns. For example, consider the set $\{\alpha \cdot \beta,\ \alpha \cdot \alpha \cdot \beta \cdot \beta\}$ for some patterns $\alpha$, $\beta$. We can immediately conclude for any $\sigma$, $\tau$ which agree on both patterns of the set, that they are either identical over $\alpha$ and $\beta$ (i.e. $\sigma(\alpha) = \tau(\alpha)$ and $\sigma(\beta) = \tau(\beta)$), or they are periodic over $\alpha$ and $\beta$ (i.e., $\sigma(\alpha), \sigma(\beta), \tau(\alpha), \tau(\beta) \in \{w\}^*$ for some word $w$). Since any morphic image of $\alpha$ (resp. $\beta$) is also a morphic image of 1 (resp. 2), the existence of $\sigma$ and $\tau$ not adhering to one of these cases would be in direct contradiction to Lemma 15.
Note however that the set $\{\alpha \cdot \beta,\ \alpha \cdot \alpha \cdot \beta \cdot \beta\}$ is not necessarily periodicity forcing. For example, it may be the case that a morphism is periodic over $\alpha$ and $\beta$, but not their individual variables. In general, additional patterns will be required in order to turn the original set into a periodicity forcing one. These additional patterns will be formed by splitting a pattern $\beta = \beta_1 \cdot \beta_2$ and inserting some other pattern $\gamma$, obtaining $\beta_1 \cdot \gamma \cdot \beta_2$. Thus in the case described above, we have that $\sigma(\beta_1 \cdot \gamma \cdot \beta_2)$ is of the form $w^{k_1} u w^{q} v w^{k_2}$ where $uv = w$. Thus, we will use the following technical lemma when considering the agreement of two such morphisms on $\beta_1 \cdot \gamma \cdot \beta_2$.
Lemma 16. Let $w$ be a primitive word, and let $u, u', v, v'$ be words such that $u', v' \neq \varepsilon$ and $uv = u'v' = w$. Then for any
$k_1, k_2, k_3, k_4, q_1, q_2 \in \mathbb{N}_0$ with $q_1 \neq 0$ or $q_2 \neq 0$, the equation
\[ w^{k_1} u w^{q_1} v w^{k_2} = w^{k_3} u' w^{q_2} v' w^{k_4} \qquad (1) \]
only has solutions in the case that $k_1 = k_3$, $k_2 = k_4$, $q_1 = q_2$, $u = u'$ and $v = v'$.
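Equation (1) can be explored by brute force on small instances. The sketch below enumerates primitive binary words w up to an assumed length bound, all factorisations of w into two non-empty factors on each side, and all exponents up to an assumed bound, and collects every solution of (1) that is not of the trivial form described in the lemma. The function names and both bounds are illustrative assumptions, and all four factors are taken to be non-empty, as in the main part of the proof.

```python
from itertools import product


def is_primitive(w):
    """A non-empty word is primitive if it is not a power of a shorter word."""
    n = len(w)
    return n > 0 and all(w[:d] * (n // d) != w
                         for d in range(1, n) if n % d == 0)


def splits(w):
    """All factorisations w = u v with u and v both non-empty."""
    return [(w[:i], w[i:]) for i in range(1, len(w))]


def side(w, k_left, u, q, v, k_right):
    """One side of equation (1): w^k_left u w^q v w^k_right."""
    return w * k_left + u + w * q + v + w * k_right


def lemma16_violations(max_word_len=4, max_exp=2):
    """Non-trivial solutions of equation (1) within the given bounds."""
    exps = range(max_exp + 1)
    bad = []
    for n in range(2, max_word_len + 1):
        for letters in product("ab", repeat=n):
            w = "".join(letters)
            if not is_primitive(w):
                continue
            for (u, v), (u2, v2) in product(splits(w), repeat=2):
                for k1, k2, k3, k4, q1, q2 in product(exps, repeat=6):
                    if q1 == 0 and q2 == 0:
                        continue  # excluded by the hypothesis of the lemma
                    if side(w, k1, u, q1, v, k2) != side(w, k3, u2, q2, v2, k4):
                        continue
                    if (k1, k2, q1, u, v) != (k3, k4, q2, u2, v2):
                        bad.append((w, u, v, u2, v2, k1, k2, k3, k4, q1, q2))
    return bad


print(lemma16_violations())
```

If w is allowed to be imprimitive, non-trivial solutions appear immediately (for example w = abab with u = ab, v = ab), which is why primitivity is essential in the statement.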
Proof. Firstly, suppose that $q_1 = 0$. Then equality (1) can be reduced to $w^{(k_1 + k_2 + 1) - (k_3 + k_4)} = u' w^{q_2} v'$. In this case it is well
known and easily proved that $u'$, $v'$ and $w$ commute and thus that the statement of the lemma holds. Hence we assume
$q_1 \neq 0$. Symmetrically, we can also assume that $q_2 \neq 0$, and by the same reasoning, that $u, v \neq \varepsilon$.
W.l.o.g. let $|u'| \leq |u|$. Then since $uv = u'v'$, there exist words $c, d, e$ such that $u = cd$, $v = e$, $u' = c$ and $v' = de$. Note
that this implies $w = cde$. Hence equality (1) can be expressed as
\[ (cde)^{k_1}\, cd\, (cde)^{q_1}\, e\, (cde)^{k_2} = (cde)^{k_3}\, c\, (cde)^{q_2}\, de\, (cde)^{k_4}. \]
If $d = \varepsilon$, then unless $k_1 = k_3$, $k_2 = k_4$ and $q_1 = q_2$, the equation is non-trivial and in two unknowns, namely $c$ and $e$, so by
Lemma 1, $c$ and $e$ commute and $w$ is imprimitive. Hence $c \neq \varepsilon$, $d \neq \varepsilon$ and $e \neq \varepsilon$.
The equation can be divided into three distinct cases, according to the sign of $k_1 - k_3$. In each case, it is shown that
whenever the equation is non-trivial, $w$ must be imprimitive, which is a contradiction.
If $k_1 > k_3$, by comparing the prefix of each side of length $(k_3 + 1)|cde| + |c|$,
\[ (cde)\, c = c\, (cde), \]
so $c$ and $cde$ commute. Since $c, d, e \neq \varepsilon$, $|w| > |c|$. Thus, $w$ is imprimitive, which is a contradiction.
If $k_1 < k_3$, by comparing the prefix of length $(k_3 + q_2)|cde| + |c| + |d|$, there exist $n, m \in \mathbb{N}_0$ such that
We now present the first of our two constructions for producing new periodicity forcing sets from existing ones. Note that
both constructions can easily be used to produce sets of patterns which share the same basic Parikh vector. Thus we can
use the following theorems to generate periodicity forcing words which are not necessarily obtainable using the methods
from [3]. The construction relies on splitting one variable $y$ into two (so each occurrence of $y$ becomes, e.g., $y_1 \cdot y_2$) in
each pattern. New patterns are then introduced to force the periodicity of $y_1$ and $y_2$. Although the theorem appears very
technical, it is relatively simple to apply, as Example 18 shall demonstrate.
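(i) $\beta_1, \beta_2, \ldots, \beta_t$ are patterns such that $\mathrm{var}(\beta_1) = \mathrm{var}(\beta_2) = \cdots = \mathrm{var}(\beta_t) = \Delta \setminus \{x_n\}$, and
(ii) the set $\{\beta_1, \beta_2, \ldots, \beta_t\}$ is commutativity forcing,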
Proof. Let , : ( { y }) {a, b} be two distinct morphisms which agree on the set { (2 ), . . . , (m )}. Then since
{1 , 2 , . . . , m } is a periodicity forcing set, we have one of the following cases:
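Consider first Case 1. It follows from the definition of $\varphi$ that $\sigma(x_n y) = \tau(x_n y)$, and $\sigma(x_i) = \tau(x_i)$ for $1 \leq i < n$.
Furthermore, $\sigma(\gamma_{t+1}) = \tau(\gamma_{t+1})$. Then $\sigma$ and $\tau$ must agree on $x_n x_n y y$. However, by Lemma 15, $\{x_n y,\ x_n x_n y y\}$ is a
periodicity forcing set, so there exist a $w \in \{a, b\}^*$ and $k_1, k_2, k_3, k_4 \in \mathbb{N}_0$ such that $\sigma(x_n) = w^{k_1}$, $\tau(x_n) = w^{k_3}$, $\sigma(y) = w^{k_2}$,
$\tau(y) = w^{k_4}$. Due to the fact that $\sigma(\gamma_i) = \tau(\gamma_i)$ for $1 \leq i \leq t$,
\[
\begin{aligned}
w^{k_1} \sigma(\beta_1) w^{k_2} &= w^{k_3} \tau(\beta_1) w^{k_4} \\
w^{k_1} \sigma(\beta_2) w^{k_2} &= w^{k_3} \tau(\beta_2) w^{k_4} \\
&\;\;\vdots \\
w^{k_1} \sigma(\beta_t) w^{k_2} &= w^{k_3} \tau(\beta_t) w^{k_4}.
\end{aligned}
\]
Note that since $\sigma(\varphi(x_i)) = \tau(\varphi(x_i))$ for $1 \leq i \leq n$ and $\beta_i \in \{x_1, x_2, \ldots, x_{n-1}\}^*$ for $1 \leq i \leq t$, it follows that $\sigma(\beta_i) = \tau(\beta_i)$
for $1 \leq i \leq t$. Unless $k_1 = k_3$ and $k_2 = k_4$ (in which case $\sigma$ and $\tau$ are not distinct), each equation is non-trivial and in
two variables ($w$ and $\sigma(\beta_i)$), so by Lemma 1, $\sigma(\beta_i) \in \{w\}^*$ for $1 \leq i \leq t$. Thus the words $\sigma(\beta_i)$ commute. However, by
Condition (ii) of the proposition, this implies that there exists a primitive word $w' \in \{a, b\}^*$ such that $\sigma(x_i) \in \{w'\}^*$ for
$1 \leq i < n$. It follows from Lemma 1 that $w' = w$, so $\sigma$ is periodic. The same holds for $\tau$.
Consider Case 2. Then there exist $k_1, k_2, \ldots, k_n, l_1, l_2, \ldots, l_n \in \mathbb{N}_0$ and a word $w \in \{a, b\}^+$ such that $\sigma(x_i) = w^{k_i}$ and
$\tau(x_i) = w^{l_i}$ for $1 \leq i < n$, and $\sigma(x_n y) = w^{k_n}$, $\tau(x_n y) = w^{l_n}$. If $k_n = l_n = 0$, then $\sigma$ and $\tau$ are periodic. Otherwise there
exist $u, v, u', v'$ and $q_1, q_2, q_3, q_4 \in \mathbb{N}_0$ such that $\sigma(x_n) = w^{q_1} u$, $\sigma(y) = v w^{q_2}$, $\tau(x_n) = w^{q_3} u'$ and $\tau(y) = v' w^{q_4}$, with
$uv = u'v' = w$. Note that if $u = \varepsilon$ or $v = \varepsilon$, $\sigma$ is periodic. Since $\sigma(\gamma_1) = \tau(\gamma_1)$,
\[ \sigma(x_n)\, \sigma(\beta_1)\, \sigma(y) = \tau(x_n)\, \tau(\beta_1)\, \tau(y), \]
so
\[ w^{q_1} u w^{s_1} v w^{q_2} = w^{q_3} u' w^{s_2} v' w^{q_4} \]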
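Example 18. Let $\Delta := \{1, 2\}$, $y := 3$, and $\Pi := \{1 \cdot 2,\ 1 \cdot 1 \cdot 2 \cdot 2\}$. Then $\varphi : \{1, 2\}^* \to \{1, 2, 3\}^*$ is the morphism given by
$\varphi(1) = 1$ and $\varphi(2) = 2 \cdot 3$. Let $\beta_1 := 1$, $\gamma_1 := 2 \cdot 1 \cdot 3$ and $\gamma_2 := 1 \cdot 1 \cdot 2 \cdot 2 \cdot 3 \cdot 3$. Then by Theorem 17, we have that the set
$\Pi' := \{1 \cdot 2 \cdot 3,\ 1 \cdot 1 \cdot 2 \cdot 3 \cdot 2 \cdot 3,\ 2 \cdot 1 \cdot 3,\ 1 \cdot 1 \cdot 2 \cdot 2 \cdot 3 \cdot 3\}$ is periodicity forcing. Since all the patterns have the same basic
Parikh vector, we can conclude that, for example, the pattern $1 \cdot 2 \cdot 3 \cdot 1 \cdot 1 \cdot 2 \cdot 3 \cdot 2 \cdot 3 \cdot 2 \cdot 1 \cdot 3 \cdot 1 \cdot 1 \cdot 2 \cdot 2 \cdot 3 \cdot 3$ is periodicity
forcing.
We can then use $\Pi'$ to again apply the theorem. This time we have $y := 4$ and $\Delta := \{1, 2, 3\}$. By Proposition 12,
possible choices for $\beta_1$ and $\beta_2$ are $1 \cdot 2$ and $2 \cdot 1$. Thus, by applying the theorem, we can conclude that the set $\Pi'' :=
\{1 \cdot 2 \cdot 3 \cdot 4,\ 1 \cdot 1 \cdot 2 \cdot 3 \cdot 4 \cdot 2 \cdot 3 \cdot 4,\ 2 \cdot 1 \cdot 3 \cdot 4,\ 1 \cdot 1 \cdot 2 \cdot 2 \cdot 3 \cdot 4 \cdot 3 \cdot 4,\ 3 \cdot 1 \cdot 2 \cdot 4,\ 3 \cdot 2 \cdot 1 \cdot 4,\ 1 \cdot 1 \cdot 2 \cdot 2 \cdot 3 \cdot 3 \cdot 4 \cdot 4\}$ is periodicity
forcing, and again we can concatenate the patterns to form a periodicity forcing word.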
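The two applications in Example 18 can be reproduced mechanically. The Python sketch below encodes one reading of the construction that is inferred from the example itself (it is an assumption, not a restatement of Theorem 17): every occurrence of the last variable x_n is replaced by x_n · y, one pattern x_n · beta · y is added for each chosen pattern beta, and the pattern consisting of the squares of all variables is appended. The function names are hypothetical.

```python
from functools import reduce
from math import gcd


def split_variable(pattern, x_n, y):
    """Replace every occurrence of x_n by x_n . y; other variables are fixed."""
    out = []
    for v in pattern:
        out.append(v)
        if v == x_n:
            out.append(y)
    return tuple(out)


def basic_parikh(pattern):
    """Occurrence counts of the variables, divided by their gcd."""
    counts = {v: pattern.count(v) for v in set(pattern)}
    g = reduce(gcd, counts.values())
    return {v: c // g for v, c in counts.items()}


def extend(patterns, variables, y, betas):
    """One step of the construction as read off Example 18 (assumed form)."""
    x_n = variables[-1]
    new = [split_variable(p, x_n, y) for p in patterns]
    new += [(x_n,) + tuple(b) + (y,) for b in betas]
    new.append(tuple(v for x in list(variables) + [y] for v in (x, x)))
    return new


pi = [(1, 2), (1, 1, 2, 2)]
pi1 = extend(pi, [1, 2], 3, [(1,)])
pi2 = extend(pi1, [1, 2, 3], 4, [(1, 2), (2, 1)])
for p in pi2:
    print(p, basic_parikh(p))
```

The first call yields the four patterns of the first set in the example and the second call yields the seven patterns of the second set; within each set all patterns share the same basic Parikh vector, which is what permits concatenating them.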
Our second method relies on inserting a new variable repeatedly into occurrences of a single pattern not in DPCP. It is
relatively simple to establish a set of patterns with the same basic Parikh vectors in this way. The following definition is
given to provide a notation for inserting a new variable $x$ at a specified place in a pattern $\alpha$.
Definition 19. Let $\alpha$ be a pattern and let $x \in \mathrm{var}(\alpha)$ be a variable. Let $\mathrm{pre}_x(\alpha)$ be the prefix of $\alpha$ up to, and including, the
first occurrence of $x$. Let $\mathrm{suf}_x(\alpha)$ be the suffix of $\alpha$ starting after (not including) the first occurrence of $x$.
Note that $\mathrm{pre}_x(\alpha) \cdot \mathrm{suf}_x(\alpha) = \alpha$, so the pattern $\mathrm{pre}_x(\alpha) \cdot y \cdot \mathrm{suf}_x(\alpha)$ is the pattern obtained by inserting the variable $y$ into
the pattern $\alpha$ directly after the first occurrence of $x$.
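The notation of Definition 19 translates directly into code. In this minimal Python sketch, patterns are represented as tuples of integers; the function names `pre`, `suf` and `insert_after_first` are chosen for illustration only.

```python
def pre(pattern, x):
    """Prefix of the pattern up to and including the first occurrence of x."""
    return pattern[: pattern.index(x) + 1]


def suf(pattern, x):
    """Suffix of the pattern starting after the first occurrence of x."""
    return pattern[pattern.index(x) + 1:]


def insert_after_first(pattern, x, y):
    """Insert the variable y directly after the first occurrence of x."""
    return pre(pattern, x) + (y,) + suf(pattern, x)


alpha = (1, 2, 1, 1, 2)
print(pre(alpha, 2), suf(alpha, 2))     # (1, 2) (1, 1, 2)
print(insert_after_first(alpha, 2, 3))  # (1, 2, 3, 1, 1, 2)
```

Since `tuple.index` raises `ValueError` when x does not occur, the requirement that x be a variable of the pattern is enforced implicitly.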
The following lemma produces periodicity forcing sets which will form the basis of our construction. Although the
patterns in these sets do not have the same basic Parikh vectors, it is expanded in Theorem 21 to provide a construction
with patterns that do, and thus can be used to produce periodicity forcing words.
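Lemma 20. Let $\alpha \notin \mathrm{DPCP}$ be a pattern, and let $x \notin \mathrm{var}(\alpha)$ be a variable. Let $\alpha_z$ denote the pattern $\mathrm{pre}_z(\alpha) \cdot x \cdot \mathrm{suf}_z(\alpha)$ for any
$z \in \mathrm{var}(\alpha)$. Then the set $\{\alpha, x\} \cup \{\alpha_y \mid y \in \mathrm{var}(\alpha)\}$ is periodicity forcing.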
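Proof. Let $\sigma, \tau : (\mathrm{var}(\alpha) \cup \{x\})^* \to \{a, b\}^*$ be distinct morphisms, let $y$ be arbitrary, and consider the equation $\sigma(\alpha_y) =
\tau(\alpha_y)$. If $\sigma(\alpha) = \tau(\alpha)$, by properties of DPCP, there must exist a word $w \in \{a, b\}^*$ such that $\sigma(z), \tau(z) \in \{w\}^*$ for every $z \in \mathrm{var}(\alpha)$.
Therefore, there exist $p, q, r, s \in \mathbb{N}_0$ such that $\sigma(\alpha_y) = \tau(\alpha_y)$ if and only if
\[ w^{p}\, \sigma(x)\, w^{q} = w^{r}\, \tau(x)\, w^{s}. \]
Note that $y$ can be chosen such that $p \neq r$ whenever $\sigma$, $\tau$ are distinct, by taking the leftmost variable such that $\sigma(y) \neq \tau(y)$.
Furthermore, because $\sigma(x) = \tau(x) = u$ for some word $u \in \{a, b\}^*$, by Lemma 1, $u$ and $w$ must commute, so $\sigma$ and $\tau$ must
be periodic to agree on every pattern in $\{\alpha, x\} \cup \{\alpha_y \mid y \in \mathrm{var}(\alpha)\}$ as required. □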
Note that in the following theorem, the set $\{x, \alpha\}$ from Lemma 20 is replaced with a set containing patterns with the
same basic Parikh vector as the others. More specifically, the new set is formed by substituting the variables 1 and 2 in the
example from Lemma 15 for $x$ and $\alpha$. Using the set from Lemma 15 is not the only possibility, however. The construction
is easily generalised to use any periodicity forcing set of patterns with the appropriate basic Parikh vector.
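Proof. Let $\sigma, \tau : (\mathrm{var}(\alpha) \cup \{x\})^* \to \{a, b\}^*$ be distinct morphisms which agree on every pattern in $\Pi$. Then they agree on
$x \cdot \alpha$ and $x \cdot x \cdot \alpha \cdot \alpha$, so by Lemma 15, either
If Case 1 holds, then $\sigma$ and $\tau$ agree on $\Pi \cup \{\alpha, x\}$. Since this is a superset of the set $\{x, \alpha\} \cup \{\mathrm{pre}_y(\alpha) \cdot x \cdot \mathrm{suf}_y(\alpha) \mid y \in \mathrm{var}(\alpha)\}$,
which by Lemma 20 is periodicity forcing, $\sigma$ and $\tau$ are periodic. Consider Case 2 and assume to the contrary that $\sigma$ is
non-periodic. Then there exists a $y \in \mathrm{var}(\alpha)$ such that $\sigma(y) \notin \{w\}^*$. Let $y$ be the first such variable to occur in $\alpha$, and
consider the equation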
\[ w^{k_1} u w^{q_1} v w^{k_2} = w^{k_3} u' w^{q_2} v' w^{k_4}. \]
Note that if both $q_1$ and $q_2$ are 0, then $\sigma(x) = \tau(x) = \varepsilon$, meaning $\sigma(\alpha) = \tau(\alpha)$; so $\sigma$ must be periodic, which is a
contradiction. Thus it is assumed that $q_1 > 0$ or $q_2 > 0$, and by Lemma 16, $k_1 = k_3$, $k_2 = k_4$, $q_1 = q_2$, $u = u'$, and $v = v'$. Therefore
$\sigma$ and $\tau$ are not distinct, which is a contradiction. A symmetrical argument can be made for when $\tau$ is non-periodic. Thus
$\sigma$ and $\tau$ must be periodic to agree on every element in $\Pi$, so $\Pi$ is a periodicity forcing set. □
By applying Theorem 21 to $\alpha := 1 \cdot 2 \cdot 1 \cdot 1 \cdot 2$ and $x := 3$, and concatenating the patterns in the resulting set, we obtain,
for example, the periodicity forcing word
3 1 2 1 1 2 3 3 1 2 1 1 2 1 2 1 1 2 1 3 2 1 1 2 1 2 3 1 1 2,
which appears to be a good candidate for being prime. We can also conclude the following from Theorem 21:
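Proposition 22. Let $\alpha = \beta^k$ for some pattern $\beta$ and number $k \geq |\mathrm{var}(\beta)| + 3$. Then $\alpha$ is not a prime element of $\mathrm{DPCP}^{\neg}$.
Proof. Let $\beta$ be a pattern and let $x \notin \mathrm{var}(\beta)$. By Theorem 21, the set $\Pi := \{x \cdot \beta,\ x \cdot x \cdot \beta \cdot \beta\} \cup \{\mathrm{pre}_y(\beta) \cdot x \cdot \mathrm{suf}_y(\beta) \mid y \in \mathrm{var}(\beta)\}$
is periodicity forcing. Furthermore, every pattern in $\Pi$ has the same basic Parikh vector. Thus any concatenation of patterns
in $\Pi$ such that every pattern is included at least once is not in DPCP. Let $\gamma = \gamma_1 \cdot \gamma_2 \cdots \gamma_k$ be such a pattern with $\gamma_i \in \Pi$ for
$1 \leq i \leq k$. Notice that $k \geq |\Pi|$, and $|\Pi| = 3 + |\mathrm{var}(\beta)|$. Let $\varphi : (\mathrm{var}(\beta) \cup \{x\})^* \to \mathrm{var}(\beta)^*$ be the morphism given by $\varphi(x) := \varepsilon$
and $\varphi(y) := y$ for every $y \in \mathrm{var}(\beta)$. Clearly $\varphi(\gamma_i) = \beta$ for $1 \leq i \leq k$, so $\varphi(\gamma) = \beta^k$, and $\alpha = \beta^k$ is not prime as required. □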
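The word obtained from Theorem 21 for the pattern 1 · 2 · 1 · 1 · 2 and the new variable 3, together with the erasing morphism used in the proof of Proposition 22, can be checked with a few lines of Python. The set is built exactly as it is written in the proof above; the function names and the tuple representation of patterns are assumptions of this sketch.

```python
def insert_after_first(pattern, y, x):
    """Insert x directly after the first occurrence of y in the pattern."""
    i = pattern.index(y)
    return pattern[: i + 1] + (x,) + pattern[i + 1:]


def theorem21_set(beta, x):
    """{x.beta, x.x.beta.beta} together with one insertion of x per variable."""
    variables = sorted(set(beta))
    head = [(x,) + beta, (x, x) + beta + beta]
    return head + [insert_after_first(beta, y, x) for y in variables]


def erase(pattern, x):
    """The morphism that erases x and fixes every other variable."""
    return tuple(v for v in pattern if v != x)


beta = (1, 2, 1, 1, 2)
patterns = theorem21_set(beta, 3)
gamma = sum(patterns, ())  # concatenate each pattern once, in order

print(" ".join(map(str, gamma)))
print(erase(gamma, 3) == beta * (len(set(beta)) + 3))
```

Concatenating each pattern once gives the 30-letter word displayed above, and erasing the variable 3 maps it to the fifth power of 1 · 2 · 1 · 1 · 2. The exponent is 5 rather than 4 because the pattern x · x · beta · beta contributes two copies of beta, which matches the bound |var(beta)| + 3 in the proposition.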
This is an interesting result since the properties associated with the Dual PCP are, due to the nature of morphisms,
generally consistent for repetitions of the same word. It can also be interpreted that, as a result of the proposition, the
majority of periodicity forcing words are not prime.
6. Conclusion
In a recent paper [3], we began an analysis of the Dual PCP in the context of larger alphabets, complementing the existing
research which has so far been focused on the better-understood binary case. In the present paper, we have continued this
analysis by focusing specifically on those words which do not satisfy the Dual PCP.
In Section 3 we have introduced a prime subset of $\mathrm{DPCP}^{\neg}$, allowing the set to be described as chains of morphic
images. We have shown that this subset is non-empty, and thus that $\mathrm{DPCP}^{\neg}$ can be exactly generated by the set of prime
periodicity forcing words. In Section 4, we have given a construction for periodicity forcing words containing any given
factor/prefix/suffix. This not only produces a rich class of new examples, but demonstrates a previously unknown level of
generality within the seemingly very restrictive set. In Section 5, motivated by the study of the prime periodicity forcing
words introduced earlier, we have examined alternative methods for generating periodicity forcing words. The results give
examples of periodicity forcing words which contrast with those known so far, and provide further insights into the prime words
considered earlier in the paper. As a by-product of results from this paper and existing literature, it has been possible to
give tight bounds on the length of the shortest periodicity forcing word over a given alphabet.
Acknowledgements
The authors wish to thank the anonymous referees of the conference version [4] of this paper for their helpful remarks
and suggestions which have provided a useful additional reference and a construction which has produced a stronger form
of Proposition 7. The helpful suggestions of the referees of the full version of this paper are also gratefully acknowledged.
References
[1] K. Culik II, J. Karhumäki, On the equality sets for homomorphisms on free monoids with two generators, RAIRO Theor. Inform. Appl. 14 (1980) 349–369.
[2] E. Czeizler, Š. Holub, J. Karhumäki, M. Laine, Intricacies of simple word equations: an example, Internat. J. Found. Comput. Sci. 18 (2007) 1167–1175.
[3] J.D. Day, D. Reidenbach, J.C. Schneider, On the dual post correspondence problem, in: Proc. 17th International Conference on Developments in Language
Theory, DLT 2013, 2013.
[4] J.D. Day, D. Reidenbach, J.C. Schneider, Periodicity forcing words, in: Proc. 9th International Conference on Words, WORDS 2013, in: Lecture Notes in
Computer Science, vol. 8079, 2013, pp. 107–118.
[5] J. Hadravová, Š. Holub, Large simple binary equality words, Internat. J. Found. Comput. Sci. 23 (2012) 1385–1403.
[6] V. Halava, Š. Holub, Reduction tree of the binary generalized post correspondence problem, Internat. J. Found. Comput. Sci. 22 (2011) 473–490.
[7] Š. Holub, Binary equality sets are generated by two words, J. Algebra 259 (2003) 1–42.
[8] Š. Holub, J. Kortelainen, Linear size test sets for certain commutative languages, RAIRO Theor. Inform. Appl. 35 (2001) 453–475.
[9] J. Karhumäki, E. Petre, On some special equations on words, technical report 584, Turku Centre for Computer Science, TUCS, 2003.
[10] M. Lothaire, Combinatorics on Words, Addison-Wesley, Reading, MA, 1983.
[11] G.S. Makanin, The problem of solvability of equations in a free semi-group, Sov. Math., Dokl. 18 (1977) 330–334.
[12] E.L. Post, A variant of a recursively unsolvable problem, Bull. Amer. Math. Soc. (N.S.) 52 (1946) 264–268.
[13] D. Reidenbach, J.C. Schneider, Morphically primitive words, Theoret. Comput. Sci. 410 (2009) 2148–2161.