Location via proxy:   [ UP ]  
[Report a bug]   [Manage cookies]                

Expressing topological connectivity of spatial databases

2000

We consider two-dimensional spatial databases defined in terms of polynomial inequalities and focus on the potential of program-ming languages for such databases to express queries related to topo-logical connectivity. It is known that the topological connectivity test is not first-order expressible.

Expressing topological connectivity of spatial databases Floris Geerts and Bart Kuijpers⋆ University of Limburg (LUC) Department WNI B-3590 Diepenbeek, Belgium {floris.geerts, bart.kuijpers}@luc.ac.be Abstract. We consider two-dimensional spatial databases defined in terms of polynomial inequalities and focus on the potential of programming languages for such databases to express queries related to topological connectivity. It is known that the topological connectivity test is not first-order expressible. One approach to obtain a language in which connectivity queries can be expressed would be to extend FO+Poly with a generalized (or Lindström) quantifier expressing that two points belong to the same connected component of a given database. For the expression of topological connectivity, extensions of first-order languages with recursion have been studied (in analogy with the classical relational model). Two such languages are spatial Datalog and FO+Poly+While. Although both languages allow the expression of non-terminating programs, their (proven for FO+Poly+While and conjectured for spatial Datalog) computational completeness makes them interesting objects of study. Previously, spatial Datalog programs have been studied for more restrictive forms of connectivity (e.g., piece-wise linear connectivity) and these programs were proved to correctly test connectivity on restricted classes of spatial databases (e.g., linear databases) only. In this paper, we present a spatial Datalog program that correctly tests topological connectivity of arbitrary compact (i.e., closed and bounded) spatial databases. In particular, it is guaranteed to terminate on this class of databases. This program is based on a first-order description of a known topological property of spatial databases, namely that locally they are conical. We also give a very natural implementation of topological connectivity in FO+Poly+While, that is based on a first-order implementation of the curve selection lemma, and that works correctly on arbitrary spatial databases inputs. Finally, we raise the question whether topological connectivity of arbitrary spatial databases can also be expressed in spatial Datalog. ⋆ Research done while this author was at the Department of Mathematics and Computer Science of the University of Antwerp (UIA) as post-doctoral research fellow of the Fund for Scientific Research of Flanders (FWO-Vlaanderen). 1 Introduction The framework of constraint databases, introduced by Kanellakis, Kuper and Revesz [10] (an overview of the area of constraint databases can be found in [14]), provides a rather general model for spatial databases [16]. In this context, a spatial database, which conceptually can be viewed as an infinite set of points in the real space, is finitely represented as a union of systems of polynomial equations and inequalities (in mathematical terminology, such figures are called semi-algebraic sets [3]). The set of points in the real plane that are situated between two touching circles together with a segment of a parabola, depicted in Figure 1, is an example of such a spatial database and it could be represented by the polynomial constraint formula (x2 + (y − 1)2 ≤ 1 ∧ 25x2 + (5y − 4)2 > 16) ∨ (y 2 − x = 0 ∧ (0 ≤ y ∧ x ≤ 1)). In this paper, we will restrict our attention to two-dimensional spatial databases, a class of figures that supports important spatial database applications such as geographic information systems (GIS). (0,0) Fig. 1. An example of a spatial database. In the past ten years, several languages to query spatial databases have been proposed and studied. A very natural query language, commonly known as FO+Poly, is obtained by extending the relational calculus with polynomial inequalities [16]. The query that returns the topological interior of a database S is expressed by the FO+Poly-formula (∃ε > 0)(∀x′ )(∀y ′ )((x − x′ )2 + (y − y ′ )2 < ε2 → S(x′ , y ′ )), with free variables x and y that represent the co-ordinates of the points in the result of the query. Although variables in such expressions range over the real numbers, FO+Poly queries can still be effectively computed [5, 18]. A combination of results by Benedikt, Dong, Libkin and Wong [2] and results of Grumbach and Su [6] implies that one cannot express in FO+Poly that a database is topologically connected. The topological connectivity test and the computation of connected components of databases are decidable queries [8, 17] and are of great importance in many spatial database applications, however. One approach to obtain a language in which connectivity queries can be expressed would be to extend FO+Poly with a generalized (or Lindström) quantifier expressing that two points belong to the same connected component of a given database. In analogy with the classical graph connectivity query, which cannot be expressed in the standard relational calculus but which can be expressed in languages that typically contain a recursion mechanism (such as Datalog), we study extensions of FO+Poly with recursion for expressing topological connectivity, however. Two such languages are spatial Datalog and FO+Poly+While. Both languages suffer from the well-known defect that their recursion, that involves arithmetic over an unbounded domain (namely polynomial inequalities over the real numbers), is no longer guaranteed to terminate. Therefore, these languages are not closed. FO+Poly+While is known to be a computationally complete language for spatial databases [7], however. Spatial Datalog is believed to be complete too [11, 13]. It is therefore interesting to establish the termination of particular programs in these languages (even be it by ad hoc arguments) as it is interesting to do this for programs in computationally complete generalpurpose programming languages. Spatial Datalog [10, 11, 13] essentially is Datalog augmented with polynomial inequalities in the bodies of rules. Programs written in spatial Datalog are not guaranteed to terminate. It is known that useful restrictions on the databases under consideration or on the syntax of allowed spatial Datalog programs are unlikely to exist [11]. As a consequence, termination of particular spatial recursive queries has to be established by ad-hoc arguments. On the other hand, if a spatial Datalog program terminates, a finite representation of its output can be effectively computed. A first attempt [11] to express the topological connectivity test in this language consisted in computing a relation Path which contains all pairs of points of the spatial database which can be connected by a straight line segment that is completely contained in the database and by then computing the transitive closure of the relation Path and testing whether the result contains all pairs of points of the input database. In fact, this program tests for piece-wise linear connectivity, which is a stronger condition than connectivity. The program, however, cannot be guaranteed to work correctly on non-linear databases [11]: it experiences both problems with termination and with the correctness of testing connectivity (as an illustration: the origin of the database of Figure 1 (a) is a cusp point and cannot be connected to any interior point of the database by means of a finite number of straight line segments). In this paper, we follow a different approach that will lead to a correct implementation of topological connectivity queries in spatial Datalog for compact database inputs. In our approach we make use of the fact that locally around each of its points a spatial database is conical [3]. Our implementation first determines (in FO+Poly) for each point a radius within which the database is conical. Then all pairs of points within that radius are added to the relation Path and, finally we use the recursion of spatial Datalog to compute the transitive closure of the relation Path.1 We raise the question whether topological connectivity of arbitrary (not necessarily compact) spatial databases can be implemented in spatial Datalog. It can be implemented in FO+Poly+While, the extension of FO+Poly with a while-loop. FO+Poly+While is a computationally complete language for spatial databases [7], and therefore the known algorithms to test connectivity (One of the oldest methods uses homotopy groups computed from a CAD [17]. A more recent and more efficient method uses Morse functions [9]) can be implemented in this language. Our implementation is a very natural one, however. It is based on a constructive version of the curve selection lemma of semi-algebraic sets [3, Theorem 2.5.5]. We show that this curve selection can be performed in FO+Poly. Also in this implementation the transitive closure of a relation Path (this time initialized using an iteration) is computed. Once this transitive closure is computed, a number of connectivity queries, such as “Is the spatial database connected?”, “Return the connected component of the point p in the database”, “Are the points p and q in the same connected component of the database?” can be formulated. Grumbach and Su give examples of other interesting queries that can be reduced to connectivity [6]. Both of the spatial Datalog and of the FO+Poly+While implementation we prove they are guaranteed to terminate and to give correct results. This paper is organized as follows. In the next section we define spatial databases and the mentioned query languages and recall the property that spatial databases are locally conical. In Section 3, we will describe our spatial Datalog and FO+Poly+While implementations. In Section 4, we will prove their correctness and termination. 2 Preliminaries In this section, we define spatial databases and three query languages for spatial databases. Let R denote the set of the real numbers, and R2 the real plane. 2.1 Definitions Definition 1. A spatial database is a geometrical figure in R2 that can be defined as a Boolean combination (union, intersection and complement) of sets of the form {(x, y) | p(x, y) > 0}, where p(x, y) is a polynomial with integer coefficients in the real variables x and y. 1 In fact, for our purposes it would suffice to consider the extension of FO+Poly with an operator for transitive closure, rather than the full recursive power of spatial Datalog. Note that p(x, y) = 0 is used to abbreviate ¬(p(x, y) > 0) ∧ ¬(−p(x, y) > 0). In this paper, we will use the relational calculus augmented with polynomial inequalities, FO+Poly for short, as a query language. Definition 2. A formula in FO+Poly is a first-order logic formula built using the logical connectives and quantifiers from two kinds of atomic formula: S(x, y) and p(x1 , . . . , xk ) > 0, where S is a binary relation name representing the spatial database and p(x1 , . . . , xk ) is a polynomial in the variables x1 , . . . , xk with integer coefficients. Variables in such formulas are assumed to range over R. A second query language we will use is FO+Poly+While. Definition 3. A program in FO+Poly+While is a finite sequence of statements and while-loops. Each statement has the form R := {(x1 , . . . , xk ) | ϕ(x1 , . . . , xk )}, where ϕ is an FO+Poly formula that uses the binary relation name S (of the input database) and previously introduced relation names. Each whileloop has the form while ϕ do P od, where P is a program and ϕ an FO+Poly formula that uses the binary relation name S and previously introduced relation names. The semantics of a program applied to a spatial databases is the operational, step by step execution. Over the real numbers it is true that for every computable constraint query, such as connectivity, there is an equivalent FO+Poly+While program. A restricted class of FO+Poly+While programs consists of programs in spatial Datalog. Definition 4. Spatial Datalog is Datalog where, 1. The underlying domain is R; 2. The only EDB predicate is S, which is interpreted as the set of points in the spatial database (or equivalently, as a binary relation); 3. Relations can be infinite; 4. Polynomial inequalities are allowed in rule bodies. We interpret these programs under the the bottom-up semantics. To conclude this section, we remark that a well-known argument can be used to show that FO+Poly can be expressed in (recursion-free) spatial Datalog with stratified negation [1]. In this paper we also admit stratified negation in our Datalog program. 2.2 Spatial databases are locally conical Property 1 ([3], Theorem 9.3.5). For a spatial database A and a point p in the plane there exists a radius εp such that for each 0 < ε < εp holds that B 2 (p, ε)∩A is isotopic to the cone with top p and base S 1 (p, ε) ∩ A.2 2 With B 2 (p, ε) we denote the closed disk with center p and radius ε and with S 1 (p, ε) its bordering circle. A homeomorphism h : R2 → R2 is continuous bijective function We remark that a spatial database is also conical towards infinity. In the next section, we will show that such a radius εp , can be uniformly defined in FO+Poly. The database of Figure 1 is locally around the origin isotopic to the cone that is shown in Figure 2. It is a basic property of semi-algebraic sets that the base S 1 (p, ε) ∩ A is the finite union of points and open arc segments on S 1 (p, ε) [3]. We will refer to the parts of B 2 (p, ε) ∩ A defined by these open intervals and points as the sectors of p in A. In the example of Figure 2, we see that the origin has five sectors: two arc segments of the larger circle, a segment of the parabolic curve and two areas between the two circles. Sectors are curves or fully two-dimensional. (iii) (v) (ii) (iv) (i) (0,0) Fig. 2. The cone of (0, 0) of the spatial database in Figure 1. In the next sections, we will use the following property. It can be proven similarly as was done for closed spatial databases [12]. Property 2. Let A be a spatial database. Then the following holds: 1. Only a finite number of cone types appear in A; 2. A can only have infinitely many points of five cone types (interior points, points on a smooth border of the interior that (don’t) belong to the database, points on a curve, points on a curve of the complement); 3. The number of cone types appearing in A is finite and hence the number of points in A with a cone different from these five is finite. The points with a cone of the five types mentioned in (2) of Property 2 are called regular points of the database. Non-regular points are called singular. We remark that the regularity of a point is expressible in FO+Poly [12]. whose inverse is also continuous. An isotopy of the plane is an orientation-preserving homeomorphism. Two sets are said to be isotopic if there is an isotopy that maps one to the other. 3 Connectivity queries in spatial Datalog In general, a set S of points in the plane is defined to be topologically connected if it cannot be partitioned by two disjoint open subsets. This second-order definition seems to be unsuitable for implementation in spatial Datalog. Fortunately, for spatial databases S, we have the property that S is topologically connected if and only if S is path connected [3, Section 2.4] (i.e., if and only if any pair of points of S can be connected by a semi-algebraic curve that is entirely contained in S). In this section, we will show that for compact spatial databases path connectivity can be implemented in spatial Datalog and that for arbitrary databases it can be implemented in FO+Poly+While. 3.1 A program in spatial Datalog with stratified negation for connectivity of compact spatial databases The spatial Datalog program for testing connectivity that we describe in this section is given in Figure 3. Path(x, y, x′ , y ′ ) ←− ϕcone (S, x, y, x′ , y ′ ) Obstructed (x, y, x′ , y ′ ) ←− ¬S(x̄, ȳ), x̄ = a1 t + b1 , ȳ = a2 t + b2 , 0 ≤ t, t ≤ 1, b1 = x, b2 = y, a1 + b1 = x′ , a2 + b2 = y ′ ′ ′ Path(x, y, x , y ) ←− ¬Obstructed (x, y, x′ , y ′ ) Path(x, y, x′ , y ′ ) ←− Path(x, y, x′′ , y ′′ ), Path(x′′ , y ′′ , x′ , y ′ ) Disconnected ←− S(x, y), S(x′ , y ′ ), ¬Path(x, y, x′ , y ′ ) Connected ←− ¬Disconnected . Fig. 3. A program in spatial Datalog with stratified negation for topological connectivity of compact databases. The first rule is actually an abbreviation of a spatial Datalog program that computes an FO+Poly formula ϕcone (S, x, y, x′ , y ′ ) that adds to the relation Path all pairs of points ((x, y), (x′ , y ′ )) ∈ S×S such that (x′ , y ′ ) is within distance ε(x,y) of (x, y), where ε(x,y) is such that S is conical (in the sense of Property 1) in B 2 ((x, y), ε(x,y) ). We will make the description of ϕcone (S, x, y, x′ , y ′ ) more precise below. Then all pairs of points of the spatial database are added in the relation Path which can be connected by a straight line segment that is completely contained in the database. Next, the transitive closure of the relation Path is computed and in the final two rules of the program of Figure 3 it is tested whether the relation Path contains all pairs of points of the input database. Variations of the last two rules in the program of Figure 3 can then be used to formulate several connectivity queries (e.g., the connectivity test or the computation of the connected component of a given point p in the input database). The description of ϕcone (S, x, y, x′ , y ′ ) in FO+Poly will be clear from the proof of the following theorem. Theorem 1. There exists an FO+Poly formula that returns for a given spatial database A and a given point p a radius εp such that the database A is conical within B 2 (p, εp ) (in the sense of Property 1). Proof. (Sketch) Let A be a spatial database and p be a point. If p is an interior point of A, this is trivial. Assume that p is not an interior point of A. We compute within the disk B 2 (p, 1) the set γA,p in FO+Poly. For each ε ≤ 1, S 1 (p, ε) ∩ A is the disjoint union of a finite number of intervals and points. We then define γA,p ∩ S 1 (p, ε) to consists of these points and the midpoints of these intervals. For the database A of Figure 4 (a), γA,p is shown in (b) of that figure in full lines and γAc ,p is shown in dotted lines. These sets can be defined in FO+Poly using the predicate Betweenp,ε (x′ , y ′ , x1 , y1 , x2 , y2 ). Betweenp,ε (x′ , y ′ , x1 , y1 , x2 , y2 ) expresses for points (x′ , y ′ ), (x1 , y1 ) and (x2 , y2 ) on S 1 (p, ε) that (x′ , y ′ ) is equal to (x1 , y1 ) or (x2 , y2 ) or is located between the clockwise ordered pair of points ((x1 , y1 ), (x2 , y2 )) of S 1 (p, ε) (for a detailed description of the expression of this relation in FO+Poly we refer to [12]). p p (a) (b) Fig. 4. The 1-environment of a database around p (a) and the construction to determine an εp -environment (smaller dashed circle) in which the database is conical (b). In (b), γA,p is given in full lines and γAc ,p in dotted lines. Next, the singular points of γA,p ∪ γAc ,p can be found in FO+Poly. Let d be the minimal distance between p and these singular points. Any radius strictly smaller than d, e.g., εp = d/2, will satisfy the condition of the statement of this Theorem. Then B 2 (p, εp ) ∩ (γA,p ∪ γAc ,p ) consists of a finite number of non-intersecting (simple Jordan) curves starting in points S 1 (p, εp ) ∩ (γA,p ∪ γAc ,p ) and ending in p and that for every ε ≤ εp each have a single intersection point with S 1 (p, ε). It is easy (but tedious) to show that there is an isotopy that brings B 2 (p, εp ) ∩ (γA,p ∪ γAc ,p ) to the cone with top p and base S 1 (p, εp ) ∩ (γA,p ∪ γAc ,p ). This is the isotopy we are looking for. ⊓ ⊔ 3.2 An FO+Poly+While program for connectivity of arbitrary spatial databases For compact spatial databases, all sectors of a boundary point p are all in the same connected component of the database (because a boundary point is always part of the database). Therefore all pairs of points in an εp -environment of p, can be added to the relation Path, even if they are in different sectors of p. For arbitrary databases, the sectors of a point p ∈ ∂S \ S 3 are not necessarily in the same connected component of the database.4 This means that in general only pairs of points can be added to the relation Path if they are in the same sector of a point. We can achieve this by iteratively processing all sectors of the border points and adding only pairs of points that are in the same sector of a border point to the relation Path. For this iterative process we use the language FO+Poly+While. The resulting program is shown in Figure 5. As in the compact case, we first initialize a relation Path and end with computing the transitive closure of this relation. In the initialization part of the program, first all pairs of points which can be connected by a straight line segment lying entirely in the database, are added to the relation Path. Then, a 5-airy relation Current is maintained (actually destroyed) by an iteration that, as we will show, will terminate when the relation Current will have become empty. During each iteration step the relation Path is augmented with, for each border point p of the database, all pairs of points on the midcurve of the sector of p that is being processed during the current iteration. The remainder of this section is devoted to the description of the implementations in FO+Poly of the algorithms INIT, SeRA (sector removal algorithm) and CSA (curve selection algorithm) of Figure 5. The correctness and termination of the resulting program will be proved in the next section. The relation Current will at all times contain tuples (xp , yp , ε, x, y) where (xp , yp ) range over the the border points of the input database A, where ε is a radius and where (x, y) belong to a set containing the part of the ε-environment of (xp , yp ) that still has to be processed. Initially, INIT (S, xp , yp , εp , x, y) sets the relation Current to the set of five-tuples {(xp , yp , εp , x, y) | (xp , yp ) ∈ ∂S, εp = 1, (x, y) ∈ B 2 ((xp , yp ), εp ) ∩ S}. It is clear that INIT can be defined in FO+Poly. Next, for all border points p = (xp , yp ) of the database, both in SeRA and CSA the “first sector” of p in the relation Current(xp , yp , εp , x, y) will be determined. This is implemented as follows. We distinguish between a sector that is a curve and a fully two-dimensional one. We look at the latter case (the former is similar). 3 4 We denote the topological border of S by ∂S. The same is true for the point at infinity, which can be considered as a boundary point of the database that does not belong to the database. To improve readability, we only consider bounded inputs in this section. Path := {(x, y, x′ , y ′ ) | S(x, y) ∧ S(x′ , y ′ ) ∧ (x, y)(x′ , y ′ ) ⊆ S} Current := IN IT (S, xp , yp , ε, x, y) while Current = ∅ do Current := SeRA(Current(xp , yp , ε, u, v), εnew , x, y) Path := CSA(Current(xp , yp , ε, u, v), x, y, x′ , y ′ ) od Y := ∅ while Y = Path do Y := Path; Path := Path ∪ {(x, y, x′ , y ′ ) | (∃x′′ )(∃y ′′ )(Path(x, y, x′′ , y ′′ )∧ Path(x′′ , y ′′ , x′ , y ′ ))} od. Fig. 5. An FO+Poly+While program for topological connectivity of arbitrary databases. The notation (x, y)(x′ , y ′ ) stands for the line segment between the points (x, y) and (x′ , y ′ ). We define an order on the circle S 1 (p, ε) with 0 < ε < εp , by using the relation Betweenp,ε (x′ , y ′ , x1 , y1 , x2 , y2 ), and by taking the point p + (0, ε) as a starting point (see proof of Theorem 1). For each 0 < ε < εp , the intersection of the “first fully two-dimensional sector” with S 1 (p, ε) is defined as the first (using the just defined order) open interval on this circle. This is clearly dependent on the radius ε. For the database of Figure 6 (a) this dependency is illustrated in (b) of that figure. The “first sector” falls apart into four parts (shaded dark), depending on the radius ε. Furthermore, as in Theorem 1, the first midcurve, i.e. the midcurve of the “first sector”, within radius εp is computed in FO+Poly (the thick curve segments in Figure 6 (b)). By our definition of the “first sector”, this first midcurve needs not to be connected. Hence, we obviously do not want to add all pairs (q, q ′ ) of points in this set to the relation Path. We can, however, compute a new (and smaller) εnew such that the curve p of midpoints has no longer singular points within B 2 (p, εnew p ). In Figure 6 (b), the small dashed circle around p has radius εnew . Within the radius εnew the p p midcurve is connected and the point p belongs to its closure. SeRA now updates εp in the relation Current to εnew and removes the first p sector from the relation Current. This means that the set of points (x, y) that are in the relation Current with the point p = (xp , yp ) will initially be B 2 (p, εp ) ∩ A, then B 2 (p, εnew p ) ∩ A minus the first sector of p (after the first iteration), then 2 new′ B (p, εp ) ∩ A minus the first two sectors of p (after the second iteration), etc. CSA will add to the relation Path, all pairs (q, q ′ ) of midpoints at different distances ε, ε′ < εnew from p (ε can be taken 0, if p belongs to the database) of p the sector that has just been removed by SeRA. p p (a) (b) Fig. 6. The εp -environment of the point p in (a) and the “first sector” of p, the midcurve in (b). of the “first sector” and εnew p 4 Correctness and termination of the programs In this section, we prove the correctness and termination of both programs of the previous section. Theorem 2. The spatial Datalog program of the previous section correctly tests connectivity of compact spatial databases and the FO+Poly+While program of the previous section correctly tests connectivity of arbitrary spatial databases. In particular, the spatial Datalog program is guaranteed to terminate on compact input databases and FO+Poly+While program terminates on all input databases. Proof. (Sketch) To prove correctness, we first have to verify that for every input database S our programs are sound (i.e., two points in S are in the same connected component of S if and only if they are in the relation Path). Secondly, we have to determine the termination of our programs (i.e., we have to show that the first while-loop in Figure 5 that initializes the relation Path ends after a finite number of steps and that for both programs the computation of the transitive closure of the relation Path ends). To prove the latter it is sufficient that we show that there exists a bound α(S) such that any two points in the same connected component of S end up in the relation Path after at most α(S) iterations of the computation of the transitive closure. To improve readability, we only consider bounded inputs of the FO+Poly+While program in this proof. Soundness. The if-implication of soundness (cf. supra) is trivial. So, we concentrate on the only-if implication. We use Collins’s Cylindrical Algebraic Decomposition (CAD) [5] to establish the only-if direction. This decomposition returns for a polynomial constraint description of S, a decomposition c(S) of the plane in a finite number of cells. Each cell is either a point, a 1-dimensional curve (without endpoints), or a two-dimensional open region. Moreover, every cell is either part of S or of the complement of S. In order to prove that any two points in the same connected component of S are in the transitive closure of the relation Path, it is sufficient to prove this for 1. any two points of S that are in one cell of c(S), in particular, a. two points of S that are in the same region, b. two points of S that are on the same curve, and 2. any two points of S that are in adjacent cells of c(S), in particular, a. a point that is in a region and a point that is on an adjacent curve, b. a point that is in a region and an adjacent point, c. a point that is on a curve and an adjacent point. 1.a. In this case the two points p and q are part of a region in the interior of S, they can be connected by a semi-algebraic curve γ lying entirely in the interior of S [3]. Since uniformly continuous curves (such as semi-algebraic ones) can be arbitrarily closely approximated by a piece-wise linear curve with the same endpoints [15], p and q can be connected by a piece-wise linear curve lying entirely in the interior of S, we are done. 1.b. The curves in the decomposition are either part of the boundary of S or vertical lines belonging to S. In the latter case, the vertical line itself connects the two points. For the former case, let p and q be points on a curve γ in the cell decomposition. We prove for the case of the FO+Poly+While program that p and q are in the transitive closure of the relation Path. For the spatial Datalog program the proof is analogous. Let γpq be the curve segment of γ between p and q. Since all points r on γpq belong to the border of S, the algorithm SeRA processes the curve γ twice as sectors of r. We cover γpq with disks B 2 (r, εr ), where εr is the radius constructed by SeRA when processing γ as a sector of r. Since γpq is a compact curve this covering has a finite sub-covering of, say, m closed balls. Then, the points p and q are in the relation Path after m iterations in the computation of the transitive closure. 2.a. A point on a vertical border line of a region can be connected by one single horizontal line with a point in the adjacent region. Hence, this case reduces to Case 1.a. If the point is on a non-vertical boundary curve of S, the sector around that point, intersecting the adjacent region contains a midcurve, connecting the point to the interior of the adjacent region (in the case of the spatial Datalog program even more pairs are added). Again this case reduces to Case 1.a. 2.b. In this case there is a midcurve from p in to the two-dimensional sector intersecting the region cell in c(S). We distinguish between two cases. These two cases are depicted in Figure 7. In (a) the midcurve intersects the cell, while in (b) this is not the case. In Case (a), point p is connected by this midcurve to the cell, hence this case reduces to Case 1.a. For Case (b), we let r be a midpoint of a curve computed by SeRA belonging to the connected component of the interior of S that contains q. Hence, after using a similar argument as in Case 1.a, p and q belong to the transitive closure of the relation Path via a curve that passes q q r p r εp p (a) εp (b) Fig. 7. The two cases in 2.b. through r. the vertical line through p adjacent to the region. 2.c. There are various cases to be distinguished here. A vertical curve can be dealt with as before. A non-vertical curve is either a curve belonging to the border of S or a curve belonging to the interior of S. The latter case can be dealt with like in Case 2.b. For the former case, the algorithm SeRA will add p and a point of the border curve to the relation Path. The desired path to the border point can be found as in Case 1.b. Termination. The first while-loop of the program in Figure 5 terminates since every border point of a spatial database has only a finite number of sectors in its cone and furthermore this number is bounded (this follows immediately from Property 2). After a finite number of runs of SeRA, the relation Current will therefore become empty. To prove the termination of the computation of the transitive closure of the relation Path in both programs, we return to Collins’s CAD. From the soundnessproof it is clear that it is sufficient to show that there exists an upper bound α(c) on the number of iterations of the transitive closure to connect two points in a cell c of c(S). We now show that for each region (i.e., two-dimensional cell) c in the CAD, there is a transversal γ(c) in the relation Path from the bottom left corner of c to the upper right corner of c of finite length β(c). Any two points of c can then be connected by at most α(c) = β(c)+2 iterations of the transitive closure of the relation Path (namely by vertically connecting to the transversal and following it). For this we remark that the bottom left corner point of the cell c can be connected by a finite and fixed number of steps (see proof of soundness) with a point p in the interior of c. Similarly, the upper right corner point of c can be connected to some interior point q of c. The points p and q can be connected by a piece-wise linear curve, consisting of β(c) line segments. This gives the desired transversal γ(c). For cells c which are vertical line segments or single points the upper bound is 1. Remark that points on a one-dimensional cell c can also be connected by a finite number α(c) of line segments. This follows from the compactness of the curves (see Case 1.b of the soundness proof). ⊓ ⊔ 5 Discussion It is not clear whether the first while-loop of the FO+Poly+While program of Figure 5, which initializes the Path relation, can be expressed in spatial Datalog with stratified negation. More generally, we can wonder about the following. Question: Can spatial Datalog with stratified negation express all computable spatial database queries? References 1. S. Abiteboul, R. Hull, and V. Vianu. Foundations of Databases. Addison-Wesley, 1995. 2. M. Benedikt, G. Dong, L. Libkin, and L. Wong. Relational expressive power of constraint query languages. Journal of the ACM, 45(1):1–34, 1998. 3. J. Bochnak, M. Coste, and M.-F. Roy. Géométrie Algébrique Réelle. SpringerVerlag, Berlin, 1987 (also Real Algebraic Geometry. Springer-Verlag, Berlin, 1998). 4. B.F. Caviness and J.R. Johnson (eds.) Quantifier Elimination and Cylindrical Algebraic Decomposition Springer-Verlag, Wien New York, 1998. 5. G.E. Collins. Quantifier elimination for real closed fields by cylindrical algebraic decomposition. In H. Brakhage, editor, Automata Theory and Formal Languages, volume 33 of Lecture Notes in Computer Science, pages 134–183, Berlin, 1975. Springer-Verlag. 6. S. Grumbach and J. Su. Finitely representable databases. Journal of Computer and System Sciences, 55(2):273–298, 1997. 7. M. Gyssens, J. Van den Bussche, and D. Van Gucht. Complete geometrical query languages. in Proceedings of the 16th ACM Symposium on Principles of Database Systems, pages 62–67, ACM Press, New York, 1997. 8. J. Heintz, T. Reico, and M.-F. Roy. Algorithms in Real Algebraic Geometry and Applications to Computational Geometry. Discrete and Computational Geometry: Selected Papers from the DIMACS Special Year, Eds. J.E. Goodman, R. Pollack and W. Steiger, AMS and ACM, 6:137–164, 1991. 9. J. Heintz, M.-F. Roy, and P. Solernò. Description of the Connected Components of a Semi-Algebraic Set in Single Exponential Time. Discrete and Computational Geometry, 11: 121–140, 1994. 10. P.C. Kanellakis, G.M. Kuper, and P.Z. Revesz. Constraint query languages. Journal of Computer and System Sciences, 51(1):26–52, 1995 (Originally in Proceedings of the 9th ACM Symposium on Principles of Database Systems, pages 299–313, ACM Press, New York, 1990). 11. B. Kuijpers, J. Paredaens, M. Smits, and J. Van den Bussche. Termination properties of spatial Datalog programs. In D. Pedreschi and C. Zaniolo, editors, Proceedings of ”Logic in Databases”, number 1154 in Lecture Notes in Computer Science, pages 101–116, Berlin, 1996. Springer-Verlag. 12. B. Kuijpers, J. Paredaens, and J. Van den Bussche. Topological elementary equivalence of closed semi-algebraic sets in the real plane. The Journal of Symbolic Logic, to appear, 1999. 13. B. Kuijpers and M. Smits. On expressing topological connectivity in spatial Datalog. In V. Gaede, A. Brodsky, O. Gunter, D. Srivastava, V. Vianu, and M. Wallace, 14. 15. 16. 17. 18. editors, Proceedings of Workshop on Constraint Databases and their Applications, number 1191 in Lecture Notes in Computer Science, pages 116–133, Berlin, 1997. Springer-Verlag. G. Kuper, L. Libkin, and J. Paredaens. Constraint databases. Springer-Verlag, 2000. E.E. Moise. Geometric topology in dimensions 2 and 3, volume 47 of Graduate Texts in Mathematics. Springer, 1977. J. Paredaens, J. Van den Bussche, and D. Van Gucht. Towards a theory of spatial database queries. In Proceedings of the 13th ACM Symposium on Principles of Database Systems, pages 279–288, ACM Press, New York, 1994. J.T. Schwartz and M. Sharir. On the piano movers’ problem II. In J.T. Schwartz, M. Sharir, and J. Hopcroft, editors, Planning, Geometry, and Complexity of Robot Motion, pages 51–96. Ablex Publishing Corporation, Norwood, New Jersey, 1987. A. Tarski. A Decision Method for Elementary Algebra and Geometry. University of California Press, Berkeley, 1951.