
Graph automata

2008, Theoretical Computer Science

Magmoids satisfying the 15 fundamental equations of graphs, called graphoids, are introduced. Automata on directed hypergraphs are defined by virtue of a relational graphoid. The closure properties of the class so obtained are investigated, and a comparison is made with the class of syntactically recognizable graph languages. © 2007 Elsevier Ltd. All rights reserved.

Theoretical Computer Science 393 (2008) 147–165
www.elsevier.com/locate/tcs

Symeon Bozapalidis, Antonios Kalampakas ∗
Aristotle University of Thessaloniki, Department of Mathematics, 54124, Thessaloniki, Greece

Received 5 July 2006; received in revised form 26 November 2007; accepted 30 November 2007
Communicated by Z. Esik

Keywords: Automata; Directed hypergraphs; Magmoids

1. Introduction

Various models of graph generation have been introduced in the literature and have been investigated in many directions (for an overview see [9]). However, a general theory of graph automata (i.e., automata with a graph as input) is still missing from formal graph language theory. Historically, Arbib and Give'on were the first to extend tree automata to operate on planar directed ordered acyclic graphs (cf. [1]).
In order to model derivations of 0-type grammars, Kamimura and Slutzki studied automata on rooted pdags, i.e., planar, directed, acyclic graphs (cf. [18,19]). The automata considered by Bossut, Dauchet and Warin work in exactly the same way, except that they use pdags with an unbounded number of roots and control the sequences of initial and final states by rational languages of state words (cf. [5]). They have shown that the class of automaton definable languages of pdags is the smallest class containing the finite sets of pdags and closed under union, (iterated) serial composition and (iterated) nondeterministic parallel composition. On the other hand, there is no systematic approach towards recognizing devices for general graph languages. Our aim in this paper is to extend the above automata to general graphs and to compare the corresponding class with the class of syntactically recognizable graph languages (cf. [7]).

To be able to construct graph automata, we have to identify the category in which the graphs over a specified alphabet constitute the free object. In the case of tree automata, the corresponding category is that of algebraic theories (cf. [4]). Recognizability of tree languages in locally finite algebraic theories was considered in [13]. At the level of trees, recognizability defined through congruences is equivalent to recognizability defined via automata.

∗ Corresponding author. Tel.: +30 2310214088. E-mail addresses: bozapali@math.auth.gr (S. Bozapalidis), akalamp@math.auth.gr (A. Kalampakas). doi:10.1016/j.tcs.2007.11.022

More precisely, let $\Gamma$ be a finite ranked alphabet, $X_n = \{x_1, \dots, x_n\}$ ($n \ge 0$), and denote by $T_\Gamma(X_n)$ the set of all $\Gamma$-trees over $X_n$.
Then $T_\Gamma(X) = (T_\Gamma^n(X_m))_{m,n \ge 0}$, with tree substitution as operation, is the free algebraic theory over $\Gamma$. A tree language $F \subseteq T_\Gamma(X)$ is recognizable if there is a locally finite theory $M$ and a theory morphism $h : T_\Gamma(X) \to M$ such that $F = h^{-1}(h(F))$, or equivalently, if $F$ is saturated by a locally finite congruence $\sim$ on $T_\Gamma(X)$. A tree automaton is a system $\mathcal{A} = (\Gamma, Q, h_{\mathcal{A}}, I_{\mathcal{A}}, T_{\mathcal{A}})$, where $Q$ is the finite set of states, $I_{\mathcal{A}}, T_{\mathcal{A}} \subseteq Q$ are the sets of initial and final states and $h_{\mathcal{A}} : \Gamma \to R(Q)$ is the transition function, where $R(Q)$ denotes the theory of relations over $Q$. The behavior of $\mathcal{A}$ is the tree language $|\mathcal{A}| = \bar{h}_{\mathcal{A}}^{-1}(T)$, where $\bar{h}_{\mathcal{A}} : T_\Gamma(X) \to R(Q)$ is the theory morphism uniquely extending $h_{\mathcal{A}}$ and $T \subseteq R(Q)$ is constructed from $I_{\mathcal{A}}, T_{\mathcal{A}}$ in the obvious way. It is well known that $F \subseteq T_\Gamma(X)$ is recognizable if and only if it is the behavior of such an automaton (cf. [14,4]).

The notion of magmoids, introduced by Arnold and Dauchet (cf. [2,3]), is the algebraic structure which plays, at the level of graphs, the role that algebraic theories play for trees. Recall that a magmoid is a doubly ranked set $M = (M_{m,n})_{m,n \ge 0}$ equipped with two operations

$$\circ : M_{m,n} \times M_{n,k} \to M_{m,k}, \quad m, n, k \ge 0,$$
$$\Box : M_{m,n} \times M_{m',n'} \to M_{m+m',n+n'}, \quad m, n, m', n' \ge 0,$$

which are associative, unitary and mutually coherent in a canonical way. It is known that the set of (hyper)graphs $GR(\Sigma)$, with hyperedges labelled over a finite doubly ranked alphabet $\Sigma$, can be organized into such a structure with $\circ$ being the graph product and $\Box$ the graph sum (cf. [6,7]). Moreover, Engelfriet and Vereijken have proved that $GR(\Sigma)$ is finitely generated, that is, any graph can be built from a specific finite set of elementary graphs by using the two operations of product and sum (Theorem 7 of [12]). However, for any given hypergraph, there are infinitely many expressions representing it.
This ambiguity was recently settled by constructing a finite set of equations $E$ with the property that two expressions represent the same hypergraph if and only if one can be transformed into the other through these equations (cf. [6]). Therefore, $GR(\Sigma)$ is characterized as the quotient of the free magmoid generated by $\Sigma$ by the congruence generated by $E$, or equivalently, it is the free object generated by $\Sigma$ within the category of all magmoids that satisfy the equations $E$, called graphoids. In this respect, various algebraic properties of graphs and graph languages can be investigated inside the framework of magmoids.

The aim of the present paper is to study automata on patterns and graphs. A graph automaton is a system $\mathcal{A} = (\Sigma, Q, \delta_{\mathcal{A}}, I_{\mathcal{A}}, T_{\mathcal{A}})$, where $Q$ is the finite set of states, $I_{\mathcal{A}}, T_{\mathcal{A}} \subseteq Q^*$ are the initial and final rational languages, $\delta_{\mathcal{A}} : \Sigma \to Rel(Q)$ is the transition function, and

$$Rel(Q) = (Rel_{m,n}(Q)), \quad Rel_{m,n}(Q) = \{R \mid R \subseteq Q^m \times Q^n\}, \quad m, n \ge 0,$$

is the graphoid of relations (over $Q$) with operations composition and boxing of relations. The behavior of $\mathcal{A}$ is $|\mathcal{A}| = \bar{\delta}_{\mathcal{A}}^{-1}(T)$, where $\bar{\delta}_{\mathcal{A}} : GR(\Sigma) \to Rel(Q)$ is the graphoid morphism uniquely extending $\delta_{\mathcal{A}}$ and $T \subseteq Rel(Q)$ is constructed from $I_{\mathcal{A}}, T_{\mathcal{A}}$ in the obvious way. A graph language is called recognizable if it is obtained as the behavior of a graph automaton. As opposed to the case of context-free graph languages (cf. [11]), the membership problem is decidable in polynomial time for recognizable graph languages.

In [7] graph language recognizability through congruences is considered. To any language $L \subseteq GR(\Sigma)$ a congruence $\sim_L$ (called syntactic) is associated, which constitutes the most economical congruence saturating $L$. Thus $L$ is syntactically recognizable if and only if $\sim_L$ has locally finite index, that is, if the quotient graphoid $GR(\Sigma)/\sim_L$ (called syntactic) is locally finite. In this paper we shall see that the above two notions of graph recognizability are not equivalent, unlike the case of trees.
Actually, we prove that the class of recognizable graph languages is strictly contained in that of syntactically recognizable languages.

The paper is divided into 6 sections. The notion of a (semi-)magmoid, together with some preliminary matter, is presented in Section 2, where examples of this algebraic structure are considered. We particularly insist on the construction of the magmoid of hypergraphs by recalling the definition of hypergraphs introduced in [12] together with the operations product and sum. Moreover, we construct the free magmoid generated by a doubly ranked alphabet and give the definitions of pattern and pattern language. In Section 3 we introduce graphoids. An important example of a graphoid, which will be useful in the construction of graph automata, is presented. Moreover, we prove that the set of all graphs over the doubly ranked alphabet $\Sigma$ is the free graphoid generated by $\Sigma$. Automata on graphs are introduced in Section 4. We prove that the membership problem is decidable in polynomial time for the class of recognizable graph languages. Closure properties are investigated in Section 5. Automaton recognizability is shown to be closed under union, intersection, $\Box$-product and inverse graph homomorphism, but not under complement and $\circ$-product. In Section 6 we compare the notions of recognizability and syntactic recognizability. The class of recognizable graph languages is strictly contained in that of syntactically recognizable graph languages, and the complement of any recognizable graph language is syntactically recognizable.

2. Preliminaries

Recall that a doubly ranked set (or a doubly ranked alphabet) $(A_{m,n})_{m,n \in \mathbb{N}}$ is a set $A$ together with a function $rank : A \to \mathbb{N} \times \mathbb{N}$, where $\mathbb{N}$ is the set of natural numbers. For $m, n \in \mathbb{N}$, $A_{m,n}$ is the set $\{a \in A \mid rank(a) = (m, n)\}$.
In what follows we will drop the subscript $m, n \in \mathbb{N}$ and denote a doubly ranked set simply by $(A_{m,n})$.

A semi-magmoid is a doubly ranked set $M = (M_{m,n})$ equipped with two operations

$$\circ : M_{m,n} \times M_{n,k} \to M_{m,k}, \quad m, n, k \ge 0,$$
$$\Box : M_{m,n} \times M_{m',n'} \to M_{m+m',n+n'}, \quad m, n, m', n' \ge 0,$$

which are associative in the obvious way and satisfy the distributivity law

$$(f \circ g) \,\Box\, (f' \circ g') = (f \,\Box\, f') \circ (g \,\Box\, g')$$

whenever all the above operations are defined. A magmoid is a semi-magmoid $M = (M_{m,n})$ equipped with a sequence of constants $e_n \in M_{n,n}$ ($n \ge 0$), called units, such that

$$e_m \circ f = f = f \circ e_n, \qquad e_0 \,\Box\, f = f = f \,\Box\, e_0$$

for all $f \in M_{m,n}$ and all $m, n \ge 0$, and the additional condition

$$e_m \,\Box\, e_n = e_{m+n}, \quad \text{for all } m, n \ge 0,$$

holds. Notice that, due to the last equation, the element $e_n$ ($n \ge 2$) is uniquely determined by $e_1$. From now on $e_1$ will be simply denoted by $e$.

In other words, a magmoid is nothing but an x-category (cf. [8,16,17]) or strict monoidal category (cf. Chapter VII of [20]) whose set of objects is the set of natural numbers. Subsemi-magmoids, morphisms, congruences and quotients of semi-magmoids (resp. magmoids) are defined in the obvious way.

In order to fix our notation we need the following definitions. Let $M = (M_{m,n})$ be a magmoid. We say that a doubly ranked family $L = (L_{m,n})$ is a subset of $M$ (notation $L \subseteq M$) whenever $L_{m,n} \subseteq M_{m,n}$ for all $m, n \in \mathbb{N}$. The boolean operations on subsets of $M$ are defined in the obvious way. Given subsets $L, L'$ of a magmoid $M$ (with unit sequence $e_n$) we define their $\circ$-product $L \circ L'$ by setting

$$(L \circ L')_{m,n} = \bigcup_{k \ge 0} L_{m,k} \circ L'_{k,n}, \quad m, n \in \mathbb{N},$$

and their $\Box$-product $L \,\Box\, L'$ by setting

$$(L \,\Box\, L')_{m,n} = \bigcup_{\substack{\kappa + \kappa' = m \\ \lambda + \lambda' = n}} L_{\kappa,\lambda} \,\Box\, L'_{\kappa',\lambda'}, \quad m, n \in \mathbb{N}.$$

The subsets $E$ and $F$ of $M$ given by $E_{m,n} = \{e_n\}$ if $m = n$ and $\emptyset$ otherwise, and $F_{m,n} = \{e_0\}$ if $m = n = 0$ and $\emptyset$ otherwise, are the units of the operations $\circ$ and $\Box$ respectively. The $\circ$-star is the union of the successive $\circ$-powers of $L \subseteq M$:

$$L^{\circ} = \bigcup_{k \ge 0} L^{\circ,k},$$

where $L^{\circ,k}$ is inductively given by $L^{\circ,0} = E$, $L^{\circ,1} = L$, ..., $L^{\circ,k+1} = L \circ L^{\circ,k}$. The $\Box$-star $L^{\Box}$ is defined analogously.

Example 1 (Magmoids of Functions and Relations). The sets $Funct_{m,n}(Q)$ of all functions from $Q^m$ to $Q^n$,

$$Funct_{m,n}(Q) = \{f \mid f : Q^m \to Q^n\}, \quad m, n \ge 0,$$

can be structured into a magmoid with $\circ$ being the usual function composition, while the operation $\Box$ is the function boxing defined as follows: for $f \in Funct_{m,n}(Q)$ and $f' \in Funct_{m',n'}(Q)$,

$$(f \,\Box\, f')(uu') = f(u)\, f'(u'), \quad u \in Q^m, \ u' \in Q^{m'}.$$

In a similar way the sets $Rel_{m,n}(Q) = \{R \mid R \subseteq Q^m \times Q^n\}$ of all relations from $Q^m$ to $Q^n$ can be organized into a magmoid, $\circ$ being the relation composition and $\Box$ the relation boxing, defined as above. Its units $e_n \subseteq Q^n \times Q^n$ are given by

$$e_n = \{(u, u) \mid u \in Q^n\}, \quad n > 0, \quad \text{and} \quad e_0 = \{(\varepsilon, \varepsilon)\},$$

with $\varepsilon$ the empty word of $Q^*$. Clearly $Funct(Q) = (Funct_{m,n}(Q))$ is a sub-magmoid of $Rel(Q) = (Rel_{m,n}(Q))$.

Let $\Sigma$ be a doubly ranked alphabet. We denote by $SM(\Sigma) = (SM_{m,n}(\Sigma))$ the smallest doubly ranked set satisfying the following items:
- $\Sigma_{m,n} \subseteq SM_{m,n}(\Sigma)$ for all $m, n \ge 0$,
- if $p \in SM_{m,n}(\Sigma)$ and $q \in SM_{n,k}(\Sigma)$, then their horizontal concatenation $p\,q \in SM_{m,k}(\Sigma)$,
- if $p \in SM_{m,n}(\Sigma)$ and $p' \in SM_{m',n'}(\Sigma)$, then their vertical concatenation $\begin{matrix} p \\ p' \end{matrix} \in SM_{m+m',n+n'}(\Sigma)$.

Let $\sim\, = (\sim_{m,n})$ be the doubly ranked equivalence on $SM(\Sigma)$, compatible with horizontal and vertical concatenation, generated by the relations

$$\begin{matrix} p_1 \\ p_2 \end{matrix}\ \begin{matrix} p_1' \\ p_2' \end{matrix} \ \sim\ \begin{matrix} p_1\, p_1' \\ p_2\, p_2' \end{matrix}$$

for all $p_i, p_i'$ of suitable ranks. The quotient $SM(\Sigma)/\!\sim\, = (SM_{m,n}(\Sigma)/\!\sim_{m,n})$ is denoted $smag(\Sigma)$ and is obviously a semi-magmoid. The elements of $smag_{m,n}(\Sigma)$ are called $(m, n)$-patterns or patterns of rank $(m, n)$. Subsets of $smag(\Sigma)$ are called pattern languages. Our patterns are analogous to the unsorted abstract dags of [18,19,5]. For another formalization see also [15].

Convention.
In order to avoid confusion in the pattern calculus, instead of $\begin{matrix} p \\ p' \end{matrix}$ we write $\begin{pmatrix} p \\ p' \end{pmatrix}$. The associativity law takes the form

$$\begin{pmatrix} \begin{pmatrix} p_1 \\ p_2 \end{pmatrix} \\ p_3 \end{pmatrix} = \begin{pmatrix} p_1 \\ \begin{pmatrix} p_2 \\ p_3 \end{pmatrix} \end{pmatrix}.$$

This common pattern will be denoted

$$\begin{pmatrix} p_1 \\ p_2 \\ p_3 \end{pmatrix}.$$

The distributivity law takes the form

$$\begin{pmatrix} p_1 \\ p_2 \end{pmatrix} \begin{pmatrix} p_1' \\ p_2' \end{pmatrix} = \begin{pmatrix} p_1\, p_1' \\ p_2\, p_2' \end{pmatrix}.$$

Actually $smag(\Sigma)$ is the free semi-magmoid generated by $\Sigma$, as confirmed by the next result.

Proposition 1. For every semi-magmoid $M = (M_{m,n})$ and every doubly ranked function $f : \Sigma \to M$, there exists a unique morphism of semi-magmoids $\bar{f} : smag(\Sigma) \to M$ making the following triangle commutative (that is, $\bar{f}$ extends $f$). [diagram]

Actually, $\bar{f}$ is given by the clauses
- $\bar{f}(x) = f(x)$, for all $x \in \Sigma$,
- $\bar{f}(p\,q) = \bar{f}(p) \circ \bar{f}(q)$, $\quad \bar{f}\begin{pmatrix} p \\ p' \end{pmatrix} = \bar{f}(p) \,\Box\, \bar{f}(p')$,

for all $p, q, p' \in smag(\Sigma)$ of suitable rank.

The construction of the free magmoid follows naturally. Let $(e_n)_{n \ge 0}$ be a sequence of symbols not in $\Sigma$ and denote by $mag(\Sigma)$ the free semi-magmoid $smag(\Sigma \cup \{e_n \mid n \ge 0\})$ divided by the congruence generated by the relations

$$e_m\, p \approx p \approx p\, e_n, \qquad \begin{pmatrix} e_0 \\ p \end{pmatrix} \approx p \approx \begin{pmatrix} p \\ e_0 \end{pmatrix}, \qquad \begin{pmatrix} e_m \\ e_n \end{pmatrix} \approx e_{m+n},$$

for all $m, n \ge 0$ and all patterns $p$ of suitable rank. Then $mag(\Sigma)$ clearly constitutes a magmoid which has a universal property analogous to the one stated in Proposition 1, i.e., $mag(\Sigma)$ is the free magmoid generated by $\Sigma$ (cf. [6]).

Next we introduce the magmoid of hypergraphs, which will be of constant use throughout this paper. Given a finite alphabet $X$, we denote by $X^*$ the set of all words over $X$, and for every word $w \in X^*$, $|w|$ denotes its length.
Formally, a concrete $(m, n)$-graph over a doubly ranked alphabet $\Sigma = (\Sigma_{m,n})$ is a tuple $G = (V, E, s, t, l, begin, end)$, where
- $V$ is the finite set of nodes,
- $E$ is the finite set of hyperedges,
- $s : E \to V^*$ is the source function,
- $t : E \to V^*$ is the target function,
- $l : E \to \Sigma$ is the labelling function such that $rank(l(e)) = (|s(e)|, |t(e)|)$ for every $e \in E$,
- $begin \in V^*$ with $|begin| = m$ is the sequence of begin nodes, and
- $end \in V^*$ with $|end| = n$ is the sequence of end nodes.

Notice that according to this definition vertices can be duplicated in the begin and end sequences of the graph and also in the sources and targets of an edge. For an edge $e$ of a hypergraph $G$ we simply write $rank(e)$ to denote $rank(l(e))$.

The specific sets $V$ and $E$ chosen to define a concrete graph $G$ are actually irrelevant. We shall not distinguish between two isomorphic graphs. Hence we have the following definition of an abstract graph. Two concrete $(m, n)$-graphs $G = (V, E, s, t, l, begin, end)$ and $G' = (V', E', s', t', l', begin', end')$ over $\Sigma$ are isomorphic iff there exist two bijections $h_V : V \to V'$ and $h_E : E \to E'$ commuting with source, target, labelling, begin and end in the usual way. An abstract $(m, n)$-graph is defined to be the equivalence class of a concrete $(m, n)$-graph with respect to isomorphism. We denote by $GR_{m,n}(\Sigma)$ the set of all abstract $(m, n)$-graphs over $\Sigma$. Since we shall mainly be interested in abstract graphs, we simply call them graphs, except when it is necessary to emphasize that they are defined up to isomorphism. Any graph $G \in GR_{m,n}(\Sigma)$ having no edges is called a discrete $(m, n)$-graph.
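The definition of a concrete graph given above can be sketched as a small data structure. The following Python fragment is only an illustration: the class name `ConcreteGraph`, the helper `is_well_formed` and the dictionary-based encoding of $s$, $t$ and $l$ are assumptions made here, not notation from the paper.

```python
from dataclasses import dataclass


@dataclass
class ConcreteGraph:
    """Hypothetical encoding of a concrete (m, n)-graph (V, E, s, t, l, begin, end)."""
    nodes: set      # V
    edges: set      # E
    source: dict    # s: edge -> tuple of nodes (a word over V)
    target: dict    # t: edge -> tuple of nodes (a word over V)
    label: dict     # l: edge -> symbol of the alphabet
    begin: tuple    # sequence of begin nodes, length m
    end: tuple      # sequence of end nodes, length n

    def rank(self):
        """The pair (m, n) = (|begin|, |end|)."""
        return (len(self.begin), len(self.end))

    def is_discrete(self):
        """A graph is discrete when it has no edges."""
        return not self.edges


def is_well_formed(g, symbol_rank):
    """Check rank(l(e)) = (|s(e)|, |t(e)|) for every edge, and that every
    node mentioned by an edge or by begin/end belongs to V."""
    for e in g.edges:
        if symbol_rank[g.label[e]] != (len(g.source[e]), len(g.target[e])):
            return False
        if not (set(g.source[e]) | set(g.target[e])) <= g.nodes:
            return False
    return (set(g.begin) | set(g.end)) <= g.nodes


# A single edge labelled by a symbol "a" of rank (2, 1).
single_edge = ConcreteGraph(
    nodes={"x1", "x2", "y1"}, edges={"e"},
    source={"e": ("x1", "x2")}, target={"e": ("y1",)}, label={"e": "a"},
    begin=("x1", "x2"), end=("y1",),
)

# A discrete (2, 1)-graph with one node repeated in the begin sequence.
one_node = ConcreteGraph(
    nodes={"x"}, edges=set(), source={}, target={}, label={},
    begin=("x", "x"), end=("x",),
)
```

Nodes may repeat inside `begin`, `end`, `source` and `target`, which is why these are tuples rather than sets; an abstract graph would additionally identify instances that differ only by a renaming of nodes and edges.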
If $G$ is an $(m, n)$-graph represented by $(V, E, s, t, l, begin, end)$ and $H$ is an $(n, k)$-graph represented by $(V', E', s', t', l', begin', end')$, then their product $G \circ H$ is the $(m, k)$-graph represented by the concrete graph obtained by taking the disjoint union of $G$ and $H$ and then identifying the $i$th end node of $G$ with the $i$th begin node of $H$, for every $i \in \{1, \dots, n\}$; also, $begin(G \circ H) = begin(G)$ and $end(G \circ H) = end(H)$. The sum $G \,\Box\, H$ of arbitrary graphs $G$ and $H$ is their disjoint union with their sequences of begin nodes concatenated, and similarly for their end nodes.

For instance let $\Sigma = \{a, b, c, d\}$, with $rank(a) = (2, 1)$, $rank(b) = (1, 1)$ and $rank(c) = (1, 2)$. In the following pictures, edges are represented by boxes, nodes by dots, and the sources and targets of an edge by directed lines that enter and leave the corresponding box, respectively. The order of the sources and targets of an edge is the vertical order of the directed lines as drawn in the pictures. We display two graphs $G \in GR_{3,4}(\Sigma)$ and $H \in GR_{4,2}(\Sigma)$, where the $i$th begin node is indicated by $b_i$ and the $i$th end node by $e_i$. [figure] Then their product $G \circ H$ is the $(3, 2)$-graph [figure] and their sum $G \,\Box\, H$ is the $(7, 6)$-graph [figure].

For every $n \in \mathbb{N}$ we denote by $E_n$ the discrete graph of rank $(n, n)$ with nodes $x_1, \dots, x_n$ and $begin(E_n) = end(E_n) = x_1 \cdots x_n$; we write $E$ for $E_1$. Note that $E_0$ is the empty graph. It is straightforward to verify that $GR(\Sigma) = (GR_{m,n}(\Sigma))$ with the operations defined above is a magmoid whose units are the graphs $E_n$, $n \ge 0$; see Lemma 6 of [12]. Subsets of $GR(\Sigma)$ are referred to as graph languages. The discrete graphs of $GR(\Sigma)$ obviously form a sub-magmoid $DISC$ of $GR(\Sigma)$, and the function sending each graph $G \in GR(\Sigma)$ to its underlying discrete graph is an epimorphism of magmoids

$$disc_\Sigma : GR(\Sigma) \to DISC.$$

Engelfriet and Vereijken proved that $GR(\Sigma)$ is finitely generated, that is, any graph can be built from a specific finite set of elementary graphs (cf.
[12]). More precisely, let us denote by $I_{p,q}$ the discrete $(p, q)$-graph having a single node $x$ and whose begin and end sequences are $x \cdots x$ ($p$ times) and $x \cdots x$ ($q$ times) respectively. Note that $I_{1,1}$ is equal to $E$. Let also $\Pi$ be the discrete $(2, 2)$-graph having two nodes $x$ and $y$ and whose begin and end sequences are $xy$ and $yx$, respectively. Finally, for every $\sigma \in \Sigma_{m,n}$, we denote again by $\sigma$ the $(m, n)$-graph having only one edge and $m + n$ nodes $x_1, \dots, x_m, y_1, \dots, y_n$. The edge is labelled by $\sigma$, and the begin (resp. end) sequence of the graph is the sequence of sources (resp. targets) of the edge, viz. $x_1 \cdots x_m$ (resp. $y_1 \cdots y_n$).

Now let us introduce the alphabet $D$, formed by the following five symbols

$$i_{21} : 2 \to 1, \quad i_{01} : 0 \to 1, \quad i_{12} : 1 \to 2, \quad i_{10} : 1 \to 0, \quad \pi : 2 \to 2,$$

where $x : m \to n$ indicates that the symbol $x$ has first rank $m$ and second rank $n$, and denote by $mag(\Sigma \cup D)$ the free magmoid generated by the doubly ranked alphabet $\Sigma \cup D$. We denote by $val_\Sigma : mag(\Sigma \cup D) \to GR(\Sigma)$ the unique magmoid morphism extending the function described by the assignments

$$i_{21} \mapsto I_{2,1}, \quad i_{01} \mapsto I_{0,1}, \quad i_{12} \mapsto I_{1,2}, \quad i_{10} \mapsto I_{1,0}, \quad \pi \mapsto \Pi,$$
$$\sigma \mapsto \sigma \ \text{ for all } \sigma \in \Sigma, \qquad e_n \mapsto E_n \ \text{ for all } n \in \mathbb{N}.$$

Theorem 1 (cf. [12]). The magmoid $GR(\Sigma)$ is generated by the set $\Sigma \cup \{I_{12}, I_{10}, I_{21}, I_{01}, \Pi\}$.

The previous theorem implies that the morphism $val_\Sigma$ is a surjection. However, $val_\Sigma$ is not an injection and in fact, for any given hypergraph, there are infinitely many patterns representing it. This ambiguity was recently settled by constructing a finite set of equations with the property that two patterns represent the same hypergraph if and only if one can be transformed into the other through these equations (cf. [6]). More precisely, we denote by $\pi_{n,1}$ the pattern inductively defined by

$$\pi_{1,0} = e, \qquad \pi_{n,1} = \begin{pmatrix} e_{n-1} \\ \pi \end{pmatrix} \begin{pmatrix} \pi_{n-1,1} \\ e \end{pmatrix}.$$

Notice that for $n = 1$, $\pi_{1,1} = \pi$.
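Given a finite doubly ranked alphabet $\Sigma$, the set of equations $E$:

$$\pi\pi = e_2, \qquad \begin{pmatrix} \pi \\ e \end{pmatrix}\begin{pmatrix} e \\ \pi \end{pmatrix}\begin{pmatrix} \pi \\ e \end{pmatrix} = \begin{pmatrix} e \\ \pi \end{pmatrix}\begin{pmatrix} \pi \\ e \end{pmatrix}\begin{pmatrix} e \\ \pi \end{pmatrix},$$

$$\begin{pmatrix} e \\ i_{21} \end{pmatrix} i_{21} = \begin{pmatrix} i_{21} \\ e \end{pmatrix} i_{21}, \qquad \begin{pmatrix} e \\ i_{01} \end{pmatrix} i_{21} = e, \qquad \pi\, i_{21} = i_{21}, \qquad \begin{pmatrix} e \\ i_{01} \end{pmatrix} \pi = \begin{pmatrix} i_{01} \\ e \end{pmatrix},$$

$$\begin{pmatrix} \pi \\ e \end{pmatrix}\begin{pmatrix} e \\ \pi \end{pmatrix}\begin{pmatrix} i_{21} \\ e \end{pmatrix} = \begin{pmatrix} e \\ i_{21} \end{pmatrix} \pi,$$

$$i_{12}\begin{pmatrix} e \\ i_{12} \end{pmatrix} = i_{12}\begin{pmatrix} i_{12} \\ e \end{pmatrix}, \qquad i_{12}\begin{pmatrix} e \\ i_{10} \end{pmatrix} = e, \qquad i_{12}\, \pi = i_{12}, \qquad \pi\begin{pmatrix} e \\ i_{10} \end{pmatrix} = \begin{pmatrix} i_{10} \\ e \end{pmatrix},$$

$$\begin{pmatrix} i_{12} \\ e \end{pmatrix}\begin{pmatrix} e \\ \pi \end{pmatrix}\begin{pmatrix} \pi \\ e \end{pmatrix} = \pi\begin{pmatrix} e \\ i_{12} \end{pmatrix},$$

$$i_{12}\, i_{21} = e, \qquad \begin{pmatrix} i_{12} \\ e \end{pmatrix}\begin{pmatrix} e \\ i_{21} \end{pmatrix} = i_{21}\, i_{12},$$

$$\begin{pmatrix} e \\ \sigma \end{pmatrix} \pi_{n,1} = \pi_{m,1}\begin{pmatrix} \sigma \\ e \end{pmatrix}, \quad \text{where } \sigma \in \Sigma_{m,n}, \ m, n \ge 0,$$

has the following property: for all patterns $p$ and $q$, $val_\Sigma(p) = val_\Sigma(q)$ if and only if $p \underset{E}{=} q$.

To better understand the internal structure of a graph, a normal form representation is necessary. As usual, $S_m$ stands for the group of all permutations of the set $\{1, 2, \dots, m\}$. Given a permutation $\alpha \in S_m$,

$$\alpha = \begin{pmatrix} 1 & 2 & \dots & m \\ \alpha(1) & \alpha(2) & \dots & \alpha(m) \end{pmatrix},$$

the discrete graph having $\{x_1, x_2, \dots, x_m\}$ as set of nodes, $x_{\alpha(1)} x_{\alpha(2)} \cdots x_{\alpha(m)}$ as begin sequence and $x_1 x_2 \cdots x_m$ as end sequence is denoted $\Pi_\alpha$ and is called the permutation graph associated with $\alpha$. Observe that $\Pi$ is the graph associated with the permutation

$$\alpha = \begin{pmatrix} 1 & 2 \\ 2 & 1 \end{pmatrix},$$

whereas the pattern $\pi_{n,1}$, introduced previously, represents the graph associated with the permutation

$$\begin{pmatrix} 1 & 2 & \dots & n+1 \\ 2 & \dots & n+1 & 1 \end{pmatrix}$$

interchanging the last $n$ numbers with the first one. It is easy to see that for all $\alpha, \beta \in S_m$ and $\alpha' \in S_{m'}$,

$$\Pi_\alpha \circ \Pi_\beta = \Pi_{\alpha \circ \beta} \quad \text{and} \quad \Pi_\alpha \,\Box\, \Pi_{\alpha'} = \Pi_{\alpha \,\Box\, \alpha'},$$

so that the sets $PERM_m = \{\Pi_\alpha \mid \alpha \in S_m\}$ with $m \ge 0$ form a sub-magmoid $PERM = (PERM_{m,n})$ of $DISC$, by setting $PERM_{m,m} = PERM_m$ and $PERM_{m,n} = \emptyset$ if $m \ne n$.

For every $\sigma \in \Sigma_{m,n}$, $BF(\sigma)$ is the same as the graph $\sigma$, except that

$$begin(BF(\sigma)) = begin(\sigma)\, end(\sigma) = x_1 \cdots x_m y_1 \cdots y_n$$

and $end(BF(\sigma)) = \varepsilon$ (where $\varepsilon$ denotes the empty sequence). The $(m + n, 0)$-graph $BF(\sigma)$ is equal to (cf.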
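[6])

$$BF(\sigma) = (\sigma \,\Box\, E_n) \circ \Pi_{\alpha_n} \circ I_{20}^{\,n},$$

where

$$I_{20}^{\,n} = (I_{21} \circ I_{10}) \,\Box\, \cdots \,\Box\, (I_{21} \circ I_{10}), \quad n \text{ times},$$

and $\alpha_n \in S_{2n}$ is defined by $\alpha_n(i) = 2i - 1$, $\alpha_n(n + i) = 2i$, for $1 \le i \le n$.

Now we present the canonical decomposition of a graph (cf. [6]).

Theorem 2.
(i) Every discrete graph $\Delta \in DISC$, $\Delta \ne E_0$, can be written in the following normal form:

$$\Pi_\alpha \circ (I_{p_1,q_1} \,\Box\, \cdots \,\Box\, I_{p_n,q_n}) \circ \Pi_\beta.$$

(ii) Every graph $G \in GR(\Sigma)$, $G \ne 0$, can be written in the following normal form:

$$\Pi_\alpha \circ (I_{p_1,q_1} \,\Box\, \cdots \,\Box\, I_{p_n,q_n}) \circ \Pi_\beta \circ (E_s \,\Box\, BF(\sigma_1) \,\Box\, \cdots \,\Box\, BF(\sigma_k)),$$

where $\Pi_\alpha$, $\Pi_\beta$ are permutation graphs.

The next definition will be used later on. We call size of a pattern $p \in smag(\Sigma)$ the number of symbols of $\Sigma$ occurring in $p$. The size of a graph $G \in GR(\Sigma)$ is then

$$size(G) = \min\{size(p) \mid p \in val_\Sigma^{-1}(G)\}.$$

3. Graphoids

As we have seen in the previous section, the equations $E$ are satisfied in $GR(\Sigma)$ by replacing $\pi$ by $\Pi$ and $i_{\kappa\lambda}$ by $I_{\kappa,\lambda}$. Magmoids with such a property are called graphoids. More precisely, a graphoid $\mathbf{M} = (M, D)$ consists of a semi-magmoid $M$ and a set $D = \{\eta_0, \eta, s, d_{01}, d_{21}, d_{10}, d_{12}\}$, where $\eta_0 \in M_{0,0}$, $\eta \in M_{1,1}$, $s \in M_{2,2}$, $d_{01} \in M_{0,1}$, $d_{21} \in M_{2,1}$, $d_{10} \in M_{1,0}$, $d_{12} \in M_{1,2}$, such that $(M, \eta_0, \eta)$ is a magmoid, i.e.,

(1) $\eta_m \circ f = f = f \circ \eta_n$, $\quad \eta_0 \,\Box\, f = f = f \,\Box\, \eta_0$,

where $\eta_n = \eta \,\Box\, \cdots \,\Box\, \eta$ ($n$ times, $n \ge 0$), $f \in M_{m,n}$ ($m, n \ge 0$), and additionally the following equations hold:

(2) $s \circ s = \eta_2$, $\quad (s \,\Box\, \eta) \circ (\eta \,\Box\, s) \circ (s \,\Box\, \eta) = (\eta \,\Box\, s) \circ (s \,\Box\, \eta) \circ (\eta \,\Box\, s)$,

(3) $(\eta \,\Box\, d_{21}) \circ d_{21} = (d_{21} \,\Box\, \eta) \circ d_{21}$, $\quad (\eta \,\Box\, d_{01}) \circ d_{21} = \eta$, $\quad s \circ d_{21} = d_{21}$, $\quad (\eta \,\Box\, d_{01}) \circ s = (d_{01} \,\Box\, \eta)$, $\quad (s \,\Box\, \eta) \circ (\eta \,\Box\, s) \circ (d_{21} \,\Box\, \eta) = (\eta \,\Box\, d_{21}) \circ s$,

(4) $d_{12} \circ (\eta \,\Box\, d_{12}) = d_{12} \circ (d_{12} \,\Box\, \eta)$, $\quad d_{12} \circ (\eta \,\Box\, d_{10}) = \eta$, $\quad d_{12} \circ s = d_{12}$, $\quad s \circ (\eta \,\Box\, d_{10}) = (d_{10} \,\Box\, \eta)$, $\quad (d_{12} \,\Box\, \eta) \circ (\eta \,\Box\, s) \circ (s \,\Box\, \eta) = s \circ (\eta \,\Box\, d_{12})$,

(5) $d_{12} \circ d_{21} = \eta$, $\quad (d_{12} \,\Box\, \eta) \circ (\eta \,\Box\, d_{21}) = d_{21} \circ d_{12}$.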
(6) $s_{m,1} \circ (f \,\Box\, \eta) = (\eta \,\Box\, f) \circ s_{n,1}$, for all $f \in M_{m,n}$,

where $s_{m,1}$ is defined inductively from $s$ analogously to $\pi_{m,1}$ (see Section 2).

We point out that the last equation holds in $GR(\Sigma)$ since it holds for all the letters of the doubly ranked alphabet $\Sigma$ (cf. [6]). Thus the pair $\mathbf{GR}(\Sigma) = (GR(\Sigma), D)$, where $D = \{E_0, E, \Pi, I_{0,1}, I_{2,1}, I_{1,0}, I_{1,2}\}$, is a graphoid.

The semi-magmoid $Rel(Q)$ of relations over a set $Q$ can be structured into a graphoid by considering the following relations:
- $\eta_0^Q \in Rel_{0,0}(Q)$, with $\eta_0^Q = \{(\varepsilon, \varepsilon)\}$,
- $\eta^Q \in Rel_{1,1}(Q)$, with $\eta^Q = \{(q, q) \mid q \in Q\}$,
- $s^Q \in Rel_{2,2}(Q)$, with $s^Q = \{(q_1 q_2, q_2 q_1) \mid q_1, q_2 \in Q\}$,
- $d_{01}^Q \in Rel_{0,1}(Q)$, with $d_{01}^Q = \{(\varepsilon, q) \mid q \in Q\}$,
- $d_{21}^Q \in Rel_{2,1}(Q)$, with $d_{21}^Q = \{(qq, q) \mid q \in Q\}$,
- $d_{10}^Q \in Rel_{1,0}(Q)$, with $d_{10}^Q = \{(q, \varepsilon) \mid q \in Q\}$,
- $d_{12}^Q \in Rel_{1,2}(Q)$, with $d_{12}^Q = \{(q, qq) \mid q \in Q\}$,

where $\varepsilon$ is the empty word of $Q^*$. The pair $\mathbf{Rel}(Q) = (Rel(Q), D^Q)$, with

$$D^Q = \{\eta_0^Q, \eta^Q, s^Q, d_{01}^Q, d_{21}^Q, d_{10}^Q, d_{12}^Q\},$$

obviously constitutes a graphoid, which will be used in the construction of graph automata.

Given graphoids $(M, D)$ and $(M', D')$, a semi-magmoid morphism $H : M \to M'$ preserving $D$-sets, i.e.,

$$H(\eta_0) = \eta_0', \quad H(\eta) = \eta', \quad H(s) = s' \quad \text{and} \quad H(d_{\kappa\lambda}) = d_{\kappa\lambda}',$$

is called a morphism of graphoids.

We have already discussed how the set $GR(\Sigma)$ can be structured into a graphoid; in fact it is the free graphoid generated by $\Sigma$.

Theorem 3. The doubly ranked function $j : \Sigma \to GR(\Sigma)$, with $j(\sigma) = \sigma$ for all $\sigma \in \Sigma$, has the following universal property: for any graphoid $\mathbf{M} = (M, D)$, $D = \{\eta_0, \eta, s, d_{10}, d_{12}, d_{01}, d_{21}\}$, and any doubly ranked function $f : \Sigma \to M$, there exists a unique morphism of graphoids $\bar{f} : \mathbf{GR}(\Sigma) \to \mathbf{M}$ making the following triangle commutative. [diagram]

The morphism $\bar{f}$ is defined by the clauses
- $\bar{f}(\sigma) = f(\sigma)$, $\sigma \in \Sigma$,
- $\bar{f}(E_0) = \eta_0$, $\bar{f}(E) = \eta$, $\bar{f}(\Pi) = s$, $\bar{f}(I_{ij}) = d_{ij}$,
- $\bar{f}(G_1 \circ G_2) = \bar{f}(G_1) \circ \bar{f}(G_2)$, $\bar{f}(G_1 \,\Box\, G_2) = \bar{f}(G_1) \,\Box\, \bar{f}(G_2)$,

for all graphs $G_1, G_2$ of suitable rank.

Proof. By virtue of Proposition 1 there exists a unique magmoid morphism $\hat{f}$ making commutative the triangle: [diagram]

Since all equations $E = (1) \cup (2) \cup (3) \cup (4) \cup (5) \cup (6)$ are valid in $M$, the kernel of $\hat{f}$ includes $E$, $Ker(\hat{f}) \supseteq E$, and thus $\hat{f}$ induces a unique graphoid morphism

$$\bar{f} : mag(\Sigma \cup D)/E \longrightarrow M$$

rendering commutative the triangle [diagram], where $g$ is the canonical projection sending every element of $mag(\Sigma \cup D)$ to its class with respect to the congruence generated by $E$. The result follows by combining the above two diagrams with the main result of [6], stating that the magmoids $mag(\Sigma \cup D)/E$ and $GR(\Sigma)$ are isomorphic. ∎

A graph homomorphism $H : GR(\Sigma) \to GR(\Sigma')$ is just a morphism of graphoids. Hence, by virtue of the previous theorem, it is completely determined by its values $H(\sigma)$, $\sigma \in \Sigma$. A graph homomorphism $H : GR(\Sigma) \to GR(\Sigma')$ is called a projection whenever $H(\Sigma) \subseteq \Sigma'$.

4. Pattern and graph automata

Pattern automata originated in [10] (see also [21]), and their basic properties are examined in [18,19] and [5] under the name of pdag automata. A pattern automaton is a structure $\mathcal{A} = (\Sigma, Q, \theta_{\mathcal{A}}, I_{\mathcal{A}}, T_{\mathcal{A}})$, where $\Sigma$ and $Q$ are the finite sets of input symbols and states respectively, $I_{\mathcal{A}}, T_{\mathcal{A}} \subseteq Q^*$ are rational languages of initial and final configurations and $\theta_{\mathcal{A}} : \Sigma \to Rel(Q)$ is the move function. Its behavior is the pattern language

$$|\mathcal{A}| = \{p \mid p \in smag_{m,n}(\Sigma), \ \bar{\theta}_{\mathcal{A}}(p) \cap (I_{\mathcal{A}}^{(m)} \times T_{\mathcal{A}}^{(n)}) \ne \emptyset, \ m, n \in \mathbb{N}\},$$

where $I_{\mathcal{A}}^{(m)} = I_{\mathcal{A}} \cap Q^m$, $T_{\mathcal{A}}^{(n)} = T_{\mathcal{A}} \cap Q^n$ and $\bar{\theta}_{\mathcal{A}} : smag(\Sigma) \to Rel(Q)$ is the unique semi-magmoid morphism extending $\theta_{\mathcal{A}}$.

A graph automaton over $\Sigma$ is a structure $\mathcal{A} = (\Sigma, Q, \delta_{\mathcal{A}}, I_{\mathcal{A}}, T_{\mathcal{A}})$, where:
- $Q$ is the finite set of states;
- $\delta_{\mathcal{A}} : \Sigma \to Rel(Q)$ is the doubly ranked transition function;
- $I_{\mathcal{A}}, T_{\mathcal{A}}$ are the initial and final rational subsets of $Q^*$.

According to Theorem 3, the function $\delta_{\mathcal{A}} : \Sigma \to Rel(Q)$ is uniquely extended into a morphism of graphoids $\bar{\delta}_{\mathcal{A}} : \mathbf{GR}(\Sigma) \to \mathbf{Rel}(Q)$. The behavior of $\mathcal{A}$ is given by

$$|\mathcal{A}| = \{F \mid F \in GR_{m,n}(\Sigma), \ \bar{\delta}_{\mathcal{A}}(F) \cap (I_{\mathcal{A}}^{(m)} \times T_{\mathcal{A}}^{(n)}) \ne \emptyset, \ m, n \in \mathbb{N}\},$$

where $I_{\mathcal{A}}^{(m)} = I_{\mathcal{A}} \cap Q^m$ and $T_{\mathcal{A}}^{(n)} = T_{\mathcal{A}} \cap Q^n$. Graph automata are finite machines due to the fact that the set of equations (1)–(6) is finite.

Notice that the difference between the two notions of automata above lies in the way that their moves are defined: $\theta_{\mathcal{A}}$ ranges over the semi-magmoid $Rel(Q)$, whereas $\delta_{\mathcal{A}}$ ranges over the graphoid $\mathbf{Rel}(Q)$.

A graph language is called recognizable whenever it is obtained as the behavior of a graph automaton. The class of all such languages over the doubly ranked alphabet $\Sigma$ is denoted by $Rec(\Sigma)$.

Example 2. We recall that $I_{p,q}$ denotes the discrete $(p, q)$-graph having a single node $x$ and whose begin and end sequences are $x \cdots x$ ($p$ times) and $x \cdots x$ ($q$ times) respectively. Moreover, for all $n \ge 2$ it holds that

$$I_{1,n} = I_{1,2} \circ (I_{1,n-1} \,\Box\, E) = I_{1,2} \circ (E \,\Box\, I_{1,n-1}),$$

and similarly for $I_{n,1}$. Now let $\sigma \in \Sigma_{1,1}$; then the graph automaton $\mathcal{A}_p = (\Sigma, Q, \delta_{\mathcal{A}_p}, I_{\mathcal{A}_p}, T_{\mathcal{A}_p})$ with $Q = \{q_1, q_2\}$, $I = \{q_1\}$, $T = \{q_2\}$ and $\delta(\sigma) = \{(q_1, q_2)\}$ clearly computes the graph language $L_p$ containing the graphs

$$G_n = I_{1,n} \circ (\sigma \,\Box\, \cdots \,\Box\, \sigma) \circ I_{n,1},$$

with $n$ occurrences of $\sigma$. [figure]

Proposition 2. The membership problem is decidable in polynomial time for recognizable graph languages.

Proof. Consider an automaton $\mathcal{A} = (\Sigma, Q, \delta_{\mathcal{A}}, I_{\mathcal{A}}, T_{\mathcal{A}})$ and a graph $F \in GR_{m,n}(\Sigma)$. By inspecting the proof of Theorem 7 of [12], we see that a pattern $f \in smag_{m,n}(\Sigma \cup D)$ can be constructed in polynomial time so that $val_\Sigma(f) = F$. Thus, $\bar{\delta}_{\mathcal{A}}(F)$ can be calculated in polynomial time via $f$.
Moreover, because of the rationality of $I_{\mathcal{A}}$, $T_{\mathcal{A}}$, the nonemptiness conditions

$$I_{\mathcal{A}}^{(m)} = I_{\mathcal{A}} \cap Q^m \ne \emptyset, \qquad T_{\mathcal{A}}^{(n)} = T_{\mathcal{A}} \cap Q^n \ne \emptyset$$

can be checked in polynomial time as well. Hence, the set

$$\mathcal{C} = \{C \mid C \subseteq Q^m \times Q^n, \ C \cap (I_{\mathcal{A}}^{(m)} \times T_{\mathcal{A}}^{(n)}) \ne \emptyset\}$$

is constructible in polynomial time, and so is the condition: $F \in |\mathcal{A}|$ if and only if $\bar{\delta}_{\mathcal{A}}(F) \in \mathcal{C}$. ∎

We say that the pattern automaton $\mathcal{A} = (\Sigma \cup D, Q, \theta_{\mathcal{A}}, I_{\mathcal{A}}, T_{\mathcal{A}})$ has graph structure if

$$\theta_{\mathcal{A}}(e_0) = \eta_0^Q, \quad \theta_{\mathcal{A}}(e) = \eta^Q, \quad \theta_{\mathcal{A}}(\pi) = s^Q, \quad \theta_{\mathcal{A}}(i_{\kappa\lambda}) = d_{\kappa\lambda}^Q.$$

Consider the seven-element alphabet $D = \{e_0, e, \pi, i_{01}, i_{21}, i_{10}, i_{12}\}$ and the mapping $val_\Sigma : smag(\Sigma \cup D) \to GR(\Sigma)$.

Proposition 3. A graph language $L \subseteq GR(\Sigma)$ is recognizable if and only if the language $val_\Sigma^{-1}(L) \subseteq smag(\Sigma \cup D)$ can be recognized by a pattern automaton with graph structure.

Proof. Assume that $L$ is recognized by the graph automaton $\mathcal{A} = (\Sigma, Q, \delta_{\mathcal{A}}, I_{\mathcal{A}}, T_{\mathcal{A}})$; then $val_\Sigma^{-1}(L)$ is recognized by the pattern automaton with graph structure

$$\mathcal{A}' = (\Sigma \cup D, Q, \theta_{\mathcal{A}'}, I_{\mathcal{A}}, T_{\mathcal{A}})$$

with $\theta_{\mathcal{A}'} = \delta_{\mathcal{A}} \circ val_\Sigma$. Conversely, let

$$\mathcal{A} = (\Sigma \cup D, Q, \theta_{\mathcal{A}}, I_{\mathcal{A}}, T_{\mathcal{A}})$$

be a graph structure pattern automaton recognizing $val_\Sigma^{-1}(L)$. Then the function $\delta_{\mathcal{A}'} : \Sigma \to Rel(Q)$ with $\delta_{\mathcal{A}'}(\sigma) = \theta_{\mathcal{A}}(\sigma)$, for all $\sigma \in \Sigma$, is such that the following triangle commutes. [diagram]

Hence, the graph automaton $\mathcal{A}' = (\Sigma, Q, \delta_{\mathcal{A}'}, I_{\mathcal{A}}, T_{\mathcal{A}})$ recognizes the graph language $val_\Sigma(val_\Sigma^{-1}(L)) = L$. ∎

5. Closure properties

In this section we examine the main closure properties of recognizable graph languages.

Proposition 4. The class $Rec(\Sigma)$ is closed under intersection.

Proof. Given $R_1 \in Rel(Q_1)$, $R_2 \in Rel(Q_2)$, we define

$$R_1 \boxtimes R_2 = \{((q_1, q_1') \cdots (q_m, q_m'), \ (h_1, h_1') \cdots (h_n, h_n')) \mid (q_1 \cdots q_m, h_1 \cdots h_n) \in R_1 \text{ and } (q_1' \cdots q_m', h_1' \cdots h_n') \in R_2\}.$$

It holds that

$$(R_1 \boxtimes R_2) \,\diamond\, (S_1 \boxtimes S_2) = (R_1 \diamond S_1) \boxtimes (R_2 \diamond S_2), \quad \diamond \in \{\circ, \Box\}.$$
We construct the automaton

\[ A_1 \boxtimes A_2 = (\Sigma, Q_1 \times Q_2, \delta_{A_1 \boxtimes A_2}, I_{A_1 \boxtimes A_2}, T_{A_1 \boxtimes A_2}) \]

with

- $\delta_{A_1 \boxtimes A_2}(\sigma) = \delta_{A_1}(\sigma) \boxtimes \delta_{A_2}(\sigma)$, $\sigma \in \Sigma$,
- $I_{A_1 \boxtimes A_2} = pr_{Q_1}^{-1}(I_{A_1}) \cap pr_{Q_2}^{-1}(I_{A_2})$, $T_{A_1 \boxtimes A_2} = pr_{Q_1}^{-1}(T_{A_1}) \cap pr_{Q_2}^{-1}(T_{A_2})$,

where $pr_{Q_1} : (Q_1 \times Q_2)^* \to Q_1^*$ and $pr_{Q_2} : (Q_1 \times Q_2)^* \to Q_2^*$ are the canonical projections. By induction we can show that

\[ \bar\delta_{A_1 \boxtimes A_2}(G) = \bar\delta_{A_1}(G) \boxtimes \bar\delta_{A_2}(G) \]

for all $G \in GR(\Sigma)$. We have

\[ I^{(m)}_{A_1 \boxtimes A_2} = I_{A_1 \boxtimes A_2} \cap (Q_1 \times Q_2)^m = I^{(m)}_{A_1} \times I^{(m)}_{A_2}, \]
\[ T^{(n)}_{A_1 \boxtimes A_2} = T_{A_1 \boxtimes A_2} \cap (Q_1 \times Q_2)^n = T^{(n)}_{A_1} \times T^{(n)}_{A_2}, \]

and so

\[ \bar\delta_{A_1 \boxtimes A_2}(G) \cap (I^{(m)}_{A_1 \boxtimes A_2} \times T^{(n)}_{A_1 \boxtimes A_2}) \neq \emptyset \]

iff

\[ (\bar\delta_{A_1}(G) \boxtimes \bar\delta_{A_2}(G)) \cap ((I^{(m)}_{A_1} \times T^{(n)}_{A_1}) \times (I^{(m)}_{A_2} \times T^{(n)}_{A_2})) \neq \emptyset \]

iff

\[ \bar\delta_{A_1}(G) \cap (I^{(m)}_{A_1} \times T^{(n)}_{A_1}) \neq \emptyset \quad \text{and} \quad \bar\delta_{A_2}(G) \cap (I^{(m)}_{A_2} \times T^{(n)}_{A_2}) \neq \emptyset. \]

In other words $|A_1 \boxtimes A_2| = |A_1| \cap |A_2|$. ∎

Let us set $GR^+(\Sigma) = (GR_{m,n}(\Sigma))_{m,n>0}$.

Proposition 5. If the languages $L_1, L_2 \subseteq GR^+(\Sigma)$ are recognizable, then so is their union $L_1 \cup L_2$.

Proof. Let $A_1 = (\Sigma, Q_1, \delta_{A_1}, I_{A_1}, T_{A_1})$ and $A_2 = (\Sigma, Q_2, \delta_{A_2}, I_{A_2}, T_{A_2})$ be graph automata such that $|A_1| = L_1$, $|A_2| = L_2$ and $Q_1 \cap Q_2 = \emptyset$. Consider the automaton

\[ A = (\Sigma, Q_1 \cup Q_2, \delta_A, I_{A_1} \cup I_{A_2}, T_{A_1} \cup T_{A_2}), \]

where $\delta_A$ is defined by $\delta_A(\sigma) = \delta_{A_1}(\sigma) \cup \delta_{A_2}(\sigma)$ for all $\sigma \in \Sigma$. Using induction on the size of a graph $G$ we show that $(u, v) \in \bar\delta_A(G)$ and $u \in Q_1^*$ (resp. $u \in Q_2^*$) implies $v \in Q_1^*$ (resp. $v \in Q_2^*$). Indeed, this implication holds for all elementary graphs $I_{12}$, $I_{10}$, $I_{21}$, $I_{01}$, $\Pi$, $\sigma \in \Sigma$. Furthermore, any other graph $G$ can be written either as $G = G_1 \circ G_2$ or as $G = G_1 \Box G_2$ with $size(G_i) < size(G)$, $i = 1, 2$. In the first case, if $(u, v) \in \bar\delta_A(G) = \bar\delta_A(G_1) \circ \bar\delta_A(G_2)$ and $u \in Q_1^*$, then there exists a state word $w \in (Q_1 \cup Q_2)^*$ such that $(u, w) \in \bar\delta_A(G_1)$, $(w, v) \in \bar\delta_A(G_2)$. Applying successively the induction hypothesis for the graphs $G_1$ and $G_2$ we get $w \in Q_1^*$ and $v \in Q_1^*$ as well.
In the second case, $(u, v) \in \bar\delta_A(G) = \bar\delta_A(G_1 \Box G_2)$ and $u \in Q_1^*$ imply that there are factorizations $u = u_1 u_2$ with $u_1, u_2 \in Q_1^*$ and $v = v_1 v_2$ with $v_1, v_2 \in (Q_1 \cup Q_2)^*$ such that $(u_1, v_1) \in \bar\delta_A(G_1)$, $(u_2, v_2) \in \bar\delta_A(G_2)$. Again by the induction hypothesis we get $v_1, v_2 \in Q_1^*$ and so $v = v_1 v_2 \in Q_1^*$. Consequently, $G \in |A|$ means that the relation $\bar\delta_A(G)$ contains either a pair $(u, v) \in I_{A_1} \times T_{A_1}$ or a pair $(u, v) \in I_{A_2} \times T_{A_2}$, and so $G \in |A_1|$ or $G \in |A_2|$ as wanted. ∎

Remark. The assumption $L_1, L_2 \subseteq GR^+(\Sigma)$ means that neither of the two languages contains a graph of type $(0, 0)$, $(0, k)$ or $(k', 0)$, with $k, k'$ arbitrary. Otherwise the automaton $A$ of the above proof would also recognize graphs of the form $G_1 \Box G_2$, where $G_1 \in L_1$ and $G_2 \in L_2 \cap GR_{0,0}(\Sigma)$ or vice versa. A similar phenomenon appears when $L_1 \cap GR_{0,k}(\Sigma) \neq \emptyset$ and $L_2 \cap GR_{k',0}(\Sigma) \neq \emptyset$.

Proposition 6. The class $Rec(\Sigma)$ is not closed under complement provided that $\Sigma_{m,n} \neq \emptyset$ for some $m, n > 0$.

Proof. First we observe that the graph automaton $A = (\Sigma, \{0\}, \delta_A, I_A, T_A)$, with $\delta_A = \emptyset$ and $I_A = T_A = 0^*$, recognizes the graph language DISC of all discrete graphs. Hence, DISC is recognizable. Now assume that the behavior of the graph automaton $B = (\Sigma, Q, \delta_B, I_B, T_B)$ is the complement $DISC^c$ of DISC. Let $\sigma \in \Sigma_{1,1}$; then the graph (loop)

\[ F = I_{1,2} \circ (\sigma \Box E) \circ I_{2,1} \in DISC^c. \]

Thus, there exist $q_1 \in I_B$ and $q_2 \in T_B$ such that $(q_1, q_2) \in \bar\delta_B(F)$. It holds:

\[ \delta_B(I_{12}) = \{(q, qq) \mid q \in Q\}, \quad \delta_B(I_{21}) = \{(qq, q) \mid q \in Q\} \quad \text{and} \quad \delta_B(E) = \{(q, q) \mid q \in Q\}. \]

Hence $q_1 = q_2 = q$, with $q \in I_B \cap T_B$, and thus $E \in |B|$, which is a contradiction. It is easy to see that the above argument can be applied for every $\sigma \in \Sigma_{m,n}$, $m, n > 0$. ∎
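The loop argument in the proof of Proposition 6, together with the automaton $A_p$ of Example 2, can be replayed mechanically. The sketch below is illustrative only and rests on stated assumptions: relations are Python sets of pairs of state tuples, the relations assigned to $I_{12}$, $I_{21}$ and $E$ are the ones quoted in the proof, $I_{1,n}$ is unfolded with the identity from Example 2, and $I_{n,1}$ is unfolded by the mirrored identity (the text only says "similarly"). All function names are hypothetical.

```python
def compose(r, s):
    """Relational composition: (u, w) whenever (u, v) in r and (v, w) in s."""
    return {(u, w) for (u, v) in r for (v2, w) in s if v == v2}


def box(r, s):
    """Horizontal product: concatenate begin words and end words."""
    return {(u1 + u2, v1 + v2) for (u1, v1) in r for (u2, v2) in s}


def discrete(states, p, q):
    """Relation {(s^p, s^q)}; used here only for types (1,1), (1,2), (2,1)."""
    return {((s,) * p, (s,) * q) for s in states}


Q = {"q1", "q2"}
E = discrete(Q, 1, 1)
I12 = discrete(Q, 1, 2)
I21 = discrete(Q, 2, 1)


def loop(delta_sigma):
    """Relation of the loop F = I12 o (sigma box E) o I21."""
    return compose(compose(I12, box(delta_sigma, E)), I21)


def fan_out(n):
    """Relation of I_{1,n}, via I_{1,n} = I_{1,2} o (I_{1,n-1} box E)."""
    return E if n == 1 else compose(I12, box(fan_out(n - 1), E))


def fan_in(n):
    """Relation of I_{n,1}, via the mirrored identity (an assumption)."""
    return E if n == 1 else compose(box(fan_in(n - 1), E), I21)


def g_n(delta_sigma, n):
    """Relation of G_n = I_{1,n} o (sigma box ... box sigma) o I_{n,1}."""
    middle = delta_sigma
    for _ in range(n - 1):
        middle = box(middle, delta_sigma)
    return compose(compose(fan_out(n), middle), fan_in(n))


delta_p = {(("q1",), ("q2",))}  # transition of A_p from Example 2
print(g_n(delta_p, 3))          # the single pair (q1, q2): G_3 is accepted
print(loop(delta_p))            # empty: the loop on sigma is rejected by A_p

# Whatever delta(sigma) is, every pair in the loop relation is of the form
# (q, q), so it already belongs to the relation of E.
delta_other = {(("q1",), ("q1",)), (("q1",), ("q2",))}
print(loop(delta_other) <= E)
```

The last line illustrates the step of the proof where $(q_1, q_2) \in \bar\delta_B(F)$ forces $q_1 = q_2$: an automaton accepting the loop must accept the discrete graph $E$ as well.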
Notice that the complement of a recognizable graph language always lies in the larger class of syntactically recognizable graph languages (see Section 6).

Proposition 7. If the languages $L_1, L_2 \subseteq GR(\Sigma)$ are recognizable then so is $L_1 \Box L_2$.

Proof. Let $A_i = (\Sigma, Q_i, \delta_{A_i}, I_{A_i}, T_{A_i})$ be graph automata such that $|A_i| = L_i$ ($i = 1, 2$) and $Q_1 \cap Q_2 = \emptyset$. We construct the graph automaton

\[ A = (\Sigma, Q_1 \cup Q_2, \delta_A, I_{A_1} I_{A_2}, T_{A_1} T_{A_2}) \]

with $\delta_A(\sigma) = \delta_{A_1}(\sigma) \cup \delta_{A_2}(\sigma)$, for all $\sigma \in \Sigma$. We are going to show that for all graphs $G \in GR_{m,n}(\Sigma)$ and words $u_1, v_1 \in Q_1^*$ and $u_2, v_2 \in Q_2^*$, if $(u_1 u_2, v_1 v_2) \in \bar\delta_A(G)$ then $G = G_1 \Box G_2$ and $(u_1, v_1) \in \bar\delta_{A_1}(G_1)$, $(u_2, v_2) \in \bar\delta_{A_2}(G_2)$.

Clearly this implication holds true for any discrete graph $D$. For the general case we need Theorem 2(ii) (see also [6]), i.e., every graph $G$ can be factorized in the form

\[ G = D \circ (E_s \Box BF(\sigma_1) \Box \cdots \Box BF(\sigma_k)), \]

where $D$ is a discrete graph and $BF(\sigma)$ ($\sigma \in \Sigma_{m,n}$) is the following graph of rank $(m + n, 0)$:

\[ BF(\sigma) = (\sigma \Box E_n) \circ \Pi_{\alpha_n} \circ I^{n}_{20}. \qquad \text{(b)} \]

Recall that $I^{n}_{20}$ is the discrete graph introduced at the end of Section 2 and that $\alpha_n \in S_{2n}$ is defined by $\alpha_n(i) = 2i - 1$, $\alpha_n(n + i) = 2i$, for $1 \leq i \leq n$.

Now if $w \in (Q_1 \cup Q_2)^*$ is such that $(u_1 u_2, w) \in \bar\delta_A(D)$ then $(w, v_1 v_2) \in \bar\delta_A(E_s \Box BF(\sigma_1) \Box \cdots \Box BF(\sigma_k))$ and we get that $w = v_1 v_2 w'$ with $w' \in (Q_1 \cup Q_2)^*$ and

\[ (w', \varepsilon) \in \bar\delta_A(BF(\sigma_1) \Box \cdots \Box BF(\sigma_k)). \]

Taking into account Eq. (b) as well as the fact that $\delta_A(\sigma) = \delta_{A_1}(\sigma) \cup \delta_{A_2}(\sigma)$ we obtain that

\[ \bar\delta_A(BF(\sigma_i)) = \{(t, \varepsilon) \mid t \in Q_1^{m_i + n_i} \text{ or } t \in Q_2^{m_i + n_i}\}, \]

where $rank(\sigma_i) = (m_i, n_i)$, ($i = 1, 2$).
Now we recall the following equality from [6] (Lemma 6):

\[ \Pi_{m,n} \circ (A \Box B) = (B \Box A) \circ \Pi_{k,l}, \]

where $\Pi_{m,n}$ is the permutation graph associated with the permutation $\alpha_{m,n}$ which interchanges the last $m$ numbers with the first $n$ ones:

\[ \alpha_{m,n} = \begin{pmatrix} 1 & \cdots & n & n+1 & \cdots & m+n \\ m+1 & \cdots & m+n & 1 & \cdots & m \end{pmatrix} \]

and $A \in GR_{m,k}(\Sigma)$ and $B \in GR_{n,l}(\Sigma)$. We point out that this equation for $l = 0$ takes the form

\[ \Pi_{m,n} \circ (A \Box B) = (B \Box A) \circ \Pi_{k,0} \iff \Pi_{m,n} \circ (A \Box B) = (B \Box A), \]

since $\Pi_{k,0} = E_k$. Applying this equation repeatedly we can write

\[ G = D \circ (E_s \Box BF(\sigma_1) \Box \cdots \Box BF(\sigma_k)) = D \circ \Pi_\gamma \circ [E_{s_1} \Box BF(\sigma_{i_1}) \Box \cdots \Box BF(\sigma_{i_r}) \Box E_{s_2} \Box BF(\sigma_{j_1}) \Box \cdots \Box BF(\sigma_{j_s})], \]

where $\Pi_\gamma$ is a permutation graph and it holds

\[ (u_1 u_2, v_1 w_1 v_2 w_2) \in \bar\delta_A(D \circ \Pi_\gamma) \]

with $w_i \in Q_i^*$ ($i = 1, 2$). Since $u_1, v_1 w_1 \in Q_1^*$ and $u_2, v_2 w_2 \in Q_2^*$ and $D \circ \Pi_\gamma$ is a discrete graph, we get that $D \circ \Pi_\gamma = D_1 \Box D_2$ and

\[ (u_1, v_1 w_1) \in \bar\delta_A(D_1), \qquad (u_2, v_2 w_2) \in \bar\delta_A(D_2) \]

with $D_1, D_2$ discrete graphs. It turns out that

\[ G = (D_1 \Box D_2) \circ [E_{s_1} \Box BF(\sigma_{i_1}) \Box \cdots \Box BF(\sigma_{i_r}) \Box E_{s_2} \Box BF(\sigma_{j_1}) \Box \cdots \Box BF(\sigma_{j_s})] \]
\[ = [D_1 \circ (E_{s_1} \Box BF(\sigma_{i_1}) \Box \cdots \Box BF(\sigma_{i_r}))] \Box [D_2 \circ (E_{s_2} \Box BF(\sigma_{j_1}) \Box \cdots \Box BF(\sigma_{j_s}))] = G_1 \Box G_2 \]

with

\[ (u_1, v_1) \in \bar\delta_A(G_1), \quad (u_2, v_2) \in \bar\delta_A(G_2), \quad u_1, v_1 \in Q_1^*, \quad u_2, v_2 \in Q_2^* \]

or equivalently

\[ (u_1, v_1) \in \bar\delta_{A_1}(G_1), \qquad (u_2, v_2) \in \bar\delta_{A_2}(G_2) \]

as desired.

Now let $G \in |A|$. Then there are words $u \in I_{A_1} I_{A_2}$ and $v \in T_{A_1} T_{A_2}$ so that $(u, v) \in \bar\delta_A(G)$. Hence $u = u_1 u_2$, $v = v_1 v_2$ with $u_1 \in I_{A_1}$, $v_1 \in T_{A_1}$, $u_2 \in I_{A_2}$, $v_2 \in T_{A_2}$. According to our previous argument, $G = G_1 \Box G_2$ and $(u_1, v_1) \in \bar\delta_{A_1}(G_1)$, $(u_2, v_2) \in \bar\delta_{A_2}(G_2)$, and so $G_1 \in |A_1|$, $G_2 \in |A_2|$; thus $|A| \subseteq |A_1| \Box |A_2|$. The opposite inclusion is obvious. ∎

Proposition 8. The class $Rec(\Sigma)$ is not closed under $\circ$-product.

Proof. We recall the language $L_p$ of Example 2 and let $A = (\Sigma, Q, \delta_A, I_A, T_A)$ be a graph automaton computing the language $L_p \circ L_p$.
The graph $\sigma \circ \sigma \in L_p \circ L_p$ and thus there exist $q_1 \in I_A$ and $q_2 \in T_A$ such that $(q_1, q_2) \in \bar\delta_A(\sigma \circ \sigma)$. Now let $H$ be the graph

\[ H = I_{1,2} \circ ((\sigma \circ \sigma) \Box (\sigma \circ \sigma)) \circ I_{2,1}. \]

Then it holds $(q_1, q_2) \in \bar\delta_A(H)$ and thus $H \in |A|$, which is a contradiction since $H \notin L_p \circ L_p$. ∎

Proposition 9. The class $Rec(\Sigma)$ is closed under inverse graph homomorphisms and projections.

Proof. Let $A = (\Sigma, Q, \delta_A, I_A, T_A)$ be a graph automaton with $|A| = L$. Given a graph homomorphism $H : GR(\Gamma) \to GR(\Sigma)$, the graph automaton

\[ H^{-1}(A) = (\Gamma, Q, \delta_{H^{-1}(A)}, I_A, T_A) \]

with $\delta_{H^{-1}(A)}(\gamma) = \bar\delta_A(H(\gamma))$, $\gamma \in \Gamma$, recognizes the graph language $H^{-1}(|A|) = H^{-1}(L)$. Finally assume that $H : GR(\Sigma) \to GR(\Gamma)$ is a projection. Then for the graph automaton

\[ H(A) = (\Gamma, Q, \delta_{H(A)}, I_A, T_A), \quad \text{with} \quad \delta_{H(A)}(\gamma) = \bigcup_{H(\sigma) = \gamma} \delta_A(\sigma), \]

it holds $|H(A)| = H(|A|) = H(L)$. ∎

Example 3. If $L \subseteq DISC$ is recognizable then the set of all graphs $G \in GR(\Sigma)$ whose underlying discrete graph belongs to $L$, that is the set $disc_\Sigma^{-1}(L)$, is recognizable as well.

6. Comparison with syntactic recognizability

In this section we connect automata recognizability with syntactic recognizability as it is presented in [7].

Let $\xi_{m,n}$ be a new symbol with rank $(m, n)$ and denote by $FR^{\alpha,\beta}_{m,n}(\Sigma)$ the subset of $GR_{\alpha,\beta}(\Sigma \cup \{\xi_{m,n}\})$ of graphs with just one occurrence of $\xi_{m,n}$. The elements of $FR^{\alpha,\beta}_{m,n}(\Sigma)$ are called frames with exterior rank $(\alpha, \beta)$ and interior rank $(m, n)$. The set $FR^{\alpha,\beta}_{m,n}(\Sigma)$ acts on $GR_{m,n}(\Sigma)$ via substitution at $\xi_{m,n}$: for $F \in FR^{\alpha,\beta}_{m,n}(\Sigma)$ and $G \in GR_{m,n}(\Sigma)$,

\[ F \cdot G = F[G/\xi_{m,n}]. \]

An equivalence $\sim\, = (\sim_{m,n})$ on the magmoid $M = (M_{m,n})$ is a congruence whenever, for all $m, n > 0$, $f, g \in M_{m,n}$ and all $\omega \in FR^{\alpha,\beta}_{m,n}(M)$,

\[ f \sim_{m,n} g \quad \text{implies} \quad \omega[f] \sim_{\alpha,\beta} \omega[g]. \]

Let $L$ be a subset of the magmoid $M$ and $f \in M_{m,n}$; we set

\[ C_L(f) = \{\omega \mid \omega \in FR^{\alpha,\beta}_{m,n}(M),\ \omega[f] \in L\}. \]

The equivalence $\sim_L$ on $M$ defined by $f \sim_{L,m,n} g$ whenever $C_L(f) = C_L(g)$ is a congruence.
Given a magmoid $M$ and a set $L \subseteq M$, $\sim_L$ is called the syntactic congruence of $L$ and the quotient magmoid $M_L = M/\sim_L$ is the syntactic magmoid of $L$. Since the epimorphic image of a graphoid is again a graphoid, the syntactic magmoid of a graph language $L \subseteq GR(\Sigma)$ is a graphoid. The next universal property characterizes $GR(\Sigma)_L$ up to isomorphism.

Theorem 4 (cf. [7]). Let $L \subseteq GR(\Sigma)$ and $h : GR(\Sigma) \to M$ an epimorphism of graphoids such that $h^{-1}(h(L)) = L$. Then there is an epimorphism of graphoids $\tilde h : M \to GR(\Sigma)_L$, making commutative the next triangle.

[Diagram omitted: commutative triangle.]

In a graph automaton $A = (\Sigma, Q, \delta_A, I_A, T_A)$ the transition relation corresponding to the graph $G \in GR_{m,n}(\Sigma)$ is $\bar\delta_A(G) \subseteq Q^m \times Q^n$. Applying the previous remark to the triangle

[Diagram omitted: commutative triangle.]

we get the following proposition.

Proposition 10. If the graphs $G_1$, $G_2$ define the same transition relation, then they have the same contexts with respect to $|A|$, i.e.,

\[ \bar\delta_A(G_1) = \bar\delta_A(G_2) \quad \text{implies} \quad C_{|A|}(G_1) = C_{|A|}(G_2). \]

Recognizability at the level of magmoids is defined by suitably adapting the corresponding notion for monoids. We say that a congruence $\sim\, = (\sim_{m,n})$ on the magmoid $GR(\Sigma)$ saturates $L \subseteq GR(\Sigma)$ whenever, for all $m, n \geq 0$, the subset $L_{m,n}$ is a union of $\sim_{m,n}$-classes. If for all $m, n \geq 0$ the congruence $\sim_{m,n}$ has finite index (i.e., a finite number of equivalence classes) we say that $\sim$ has locally finite index. A subset $L$ of $GR(\Sigma)$ is called syntactically recognizable if there exist a locally finite magmoid $N = (N_{m,n})$ (i.e., $N_{m,n}$ finite for all $m, n \in \mathbb{N}$) and a morphism $h : GR(\Sigma) \to N$, so that $L = h^{-1}(P)$, for some $P \subseteq N$. The class of all syntactically recognizable subsets of $GR(\Sigma)$ is denoted by $SRec(\Sigma)$.

Theorem 5 (cf. [7]).
Let $L \subseteq GR(\Sigma)$; the following conditions are equivalent:

(i) $L$ is syntactically recognizable,
(ii) $L$ is saturated by a congruence of locally finite index,
(iii) $\sim_L$ has locally finite index,
(iv) $card\{C_L(f) \mid f \in GR(\Sigma)_{m,n}\}$ is finite for all $m, n \in \mathbb{N}$,
(v) the syntactic magmoid $GR(\Sigma)_L$ is locally finite,
(vi) there exist a locally finite graphoid $N$ and a morphism of graphoids $h : GR(\Sigma) \to N$ so that $L = h^{-1}(P)$, for some $P \subseteq N$.

Corollary 1. The class $SRec(\Sigma)$ is closed under Boolean operations and inverse graph homomorphisms.

Proposition 11. The graph language $CON(\Sigma)$ of all connected graphs over $\Sigma$ is not recognizable.

Proof. Suppose that $A = (\Sigma, Q, \delta_A, I_A, T_A)$ is a graph automaton accepting $CON(\Sigma)$. The graph $I_{21} \circ I_{12}$ is connected and thus $I_{21} \circ I_{12} \in |A|$. Hence, there exist states $q_1, q_2, q_3, q_4 \in Q$, with $q_1 q_2 \in I_A$, $q_3 q_4 \in T_A$, such that

\[ (q_1 q_2, q_3 q_4) \in \bar\delta_A(I_{21} \circ I_{12}). \]

Moreover, there exists a state $q_2' \in Q$ such that $(q_1 q_2, q_2') \in \delta_A(I_{21})$ and since $\delta_A(I_{21}) = \{(qq, q) \mid q \in Q\}$, we deduce that $q_1 = q_2 = q_2'$. Using similar arguments for $I_{12}$ it follows that $q_1 = q_2 = q_3 = q_4$. Hence there exists an element $q \in Q$ such that $qq \in I_A$ and $qq \in T_A$. But $(qq, qq) \in \bar\delta_A(E \Box E)$, so $E \Box E \in |A|$, which is a contradiction because the graph $E \Box E$ is not connected. ∎

Remark. An element $a$ of a magmoid $M$ is said to be connected if it cannot be factorized as $a = a_1 \Box a_2$ for $a_1, a_2 \in M - \{e_0\}$. It is easily seen that $val_\Sigma^{-1}(CON(\Sigma))$ is just the set of all connected patterns of $smag(\Sigma \cup D)$, which is shown to be recognizable in [5]. However, according to Proposition 11 and Proposition 3, this language cannot be recognized by a pattern automaton with graph structure.

Proposition 12. The class of recognizable graph languages is properly included in the class of syntactically recognizable graph languages:

\[ Rec(\Sigma) \subsetneq SRec(\Sigma). \]

Proof.
As we have seen, the behavior of any graph automaton $A = (\Sigma, Q, \delta_A, I_A, T_A)$ is given by $|A| = \bar\delta_A^{-1}(\Theta)$, where $\Theta = (\Theta_{m,n})$ is the doubly ranked set defined by

\[ \Theta_{m,n} = (I_A \cap Q^m) \times (T_A \cap Q^n). \]

Thus, the inclusion follows by taking into account that $Rel(Q)$ is a locally finite graphoid. The proper inclusion assertion follows by Proposition 11. ∎

As we have seen in the previous section, $Rec(\Sigma)$ fails to be closed under complement. However,

Proposition 13. The complement of any recognizable graph language is a syntactically recognizable graph language.

Proof. We only have to combine Corollary 1 and Proposition 12. ∎

7. Conclusion and future work

The introduction of graph automata seems to be one step towards the construction of a powerful and robust graph language recognizability theory. In this perspective, it is natural to examine the next subjects:

1. Graph automata and monadic second-order logic.
2. Infinite behavior of finite graph automata.
3. Weighted graph automata and formal series on graphs.
4. Graph transducers.

As another future research direction we point out that the definition of a graph automaton $A$ requires that the set of relations $Rel(Q)$ over the state set $Q$ of $A$ should be a graphoid. Due to the importance of recognizing graph languages by means of automata, the question that arises is whether we can give to $Q$ an extra algebraic structure so that $Rel(Q)$ becomes a graphoid. In this way new recognizability classes of graph languages will appear.

Acknowledgement

We deeply thank the referees for their numerous comments and suggestions.

References

[1] M.A. Arbib, Y. Give’on, Algebra automata I: Parallel programming as a prolegomena to the categorical approach, Inform. and Control 12 (1968) 331–345.
[2] A. Arnold, M. Dauchet, Théorie des magmoides. I, RAIRO Inform. Théor. 12 (3) (1978) 235–257.
[3] A. Arnold, M. Dauchet, Théorie des magmoides. II, RAIRO Inform. Théor. 13 (2) (1979) 135–154.
[4] S.L. Bloom, Z. Ésik, Iteration theories, in: EATCS Monographs on Theoretical Computer Science, Springer-Verlag, 1991.
[5] F. Bossut, M. Dauchet, B. Warin, A Kleene theorem for a class of planar acyclic graphs, Inform. and Comput. 117 (1995) 251–265.
[6] S. Bozapalidis, A. Kalampakas, An axiomatization of graphs, Acta Inform. 41 (2004) 19–61.
[7] S. Bozapalidis, A. Kalampakas, Recognizability of graph and pattern languages, Acta Inform. 42 (2006) 553–581.
[8] V. Claus, Ein Vollständigkeitssatz für programme und schaltkreise, Acta Inform. 1 (1971) 64–78.
[9] H. Ehrig, G. Engels, H.-J. Kreowski, G. Rozenberg (Eds.), Handbook of Graph Grammars and Computing by Graph Transformation, vol. I, World Scientific, Singapore, 1997.
[10] S. Eilenberg, J.B. Wright, Automata in general algebras, Inform. and Control 11 (1967) 452–470.
[11] J. Engelfriet, Context-free graph grammars, in: G. Rozenberg, A. Salomaa (Eds.), Handbook of Formal Languages, vol. III: Beyond Words, Springer, 1997, pp. 125–213 (Chapter 3).
[12] J. Engelfriet, J.J. Vereijken, Context-free graph grammars and concatenation of graphs, Acta Inform. 34 (1997) 773–803.
[13] Z. Ésik, A variety theorem for trees and theories, Publ. Math. 54 (1999) 711–762.
[14] F. Gécseg, M. Steinby, Tree Automata, Akademiai Kiado, Budapest, 1984.
[15] J. Gibbons, An initial-algebra approach to directed acyclic graphs, in: Mathematics of Program Construction (Kloster Irsee, 1995), in: LNCS, vol. 947, Springer, Berlin, 1995, pp. 282–303.
[16] G. Hotz, Eine algebraisierung des syntheseproblems von schaltkreisen, EIK 1 (1965) 185–205, 209–231.
[17] G. Hotz, Eindeutigkeit und mehrdeutigkeit formaler sprachen, EIK 2 (1966) 235–246.
[18] T. Kamimura, G. Slutzki, Parallel and two-way automata on directed ordered acyclic graphs, Inform. and Control 49 (1981) 10–51.
[19] T. Kamimura, G. Slutzki, Transductions of DAGS and trees, Math. Syst. Theory 15 (1982) 225–249.
[20] S. MacLane, Categories for the Working Mathematician, Springer Verlag, 1971.
[21] C.P. Schnorr, Transformational classes of grammars, Inform. and Control 14 (1969) 252–277.