License: arXiv.org perpetual non-exclusive license
arXiv:2401.05165v1 [cs.PL] 10 Jan 2024
Helmut Seidl, Julian Erhard, Sarah Tilscher, Michael Schwarz: Technische Universität München, Garching, Germany
Email: {helmut.seidl, julian.erhard, sarah.tilscher, m.schwarz}@tum.de

Non-Numerical Weakly Relational Domains

Helmut Seidl    Julian Erhard    Sarah Tilscher    Michael Schwarz
(January 10, 2024)
Abstract

The weakly relational domain of Octagons offers a decent compromise between precision and efficiency for numerical properties. Here, we are concerned with the construction of non-numerical relational domains. We provide a general construction of weakly relational domains, which we exemplify with an extension of constant propagation by disjunctions. Since for the resulting domain of 2-disjunctive formulas, satisfiability is NP-complete, we provide a general construction for a further, more abstract weakly relational domain where the abstract operations of restriction and least upper bound can be efficiently implemented.

In the second step, we consider a relational domain that tracks conjunctions of inequalities between variables, and between variables and constants for arbitrary partial orders of values. Examples are sub(multi)sets, as well as prefix, substring or scattered substring orderings on strings. When the partial order is a lattice, we provide precise polynomial algorithms for satisfiability, restriction, and the best abstraction of disjunction. Complementary to the constructions for lattices, we find that, in general, satisfiability of conjunctions is NP-complete. We therefore again provide polynomial abstract versions of restriction, conjunction, and join. By using our generic constructions, these domains are extended to weakly relational domains that additionally track disjunctions.

For all our domains, we indicate how abstract transformers for assignments and guards can be constructed.

Keywords:
weakly relational domains, 2-decomposable relational domains, 2-disjunctive constants, directed domains

1 Introduction

Relational analyses have been observed to be indispensable for verifying intricate program properties. In particular, this is the case when for the purpose of verification, ghost variables have been introduced which must be related to program variables. Termination may be verified by introducing a ghost loop counter, which can be proven bounded by a relational domain relating it to the actual bounded iteration variable Albert et al. (2014). The validity of string operations on null-terminated strings as employed, e.g., in the programming language C, may be verified by introducing ghost variables for the length of a buffer as well as for tracking the position of the null byte in the buffer Dor et al. (2001). It also has been observed that monolithic relational domains such as the polyhedra abstract domain Cousot and Halbwachs (1978) scale badly to larger programs. Therefore, weakly relational domains have been proposed which can only express simple relational properties, but have the potential to scale better Miné (2004). Examples of weakly relational numerical properties are the Two Variables Per Inequality domain Simon et al. (2002), or domains given by a finite set of linear templates Sankaranarayanan et al. (2005). The most prominent example of a template numerical domain is the Octagon domain Miné (2001, 2006) which allows tracking upper and lower bounds not only of program variables but also of sums and differences of two program variables. One such octagon abstract relation could, e.g., be given by the conjunction

\[ (-x\leq-5)\wedge(x\leq 10)\wedge(x+y\leq 0)\wedge(x-z\leq 1) \]

Octagons thus can be considered as a mild extension of the non-relational domain of Intervals for program variables, and a variety of efficient algorithms have been provided Bagnara et al. (2008, 2009); Chawdhary et al. (2019); Schwarz and Seidl (2023). Here, we are concerned with constructing non-numerical abstract domains.

For that, we provide a general technique to construct a weakly relational domain from every relational domain. As one instance of the general construction, we consider 2-disjunctive constants as mentioned in Schwarz et al. (2023). This weakly relational domain allows one, e.g., to relate the names of functions with function pointers, as in the formula

\[ x=\textsf{"foo"}\wedge y=\&\textsf{foo}\;\vee\;x=\textsf{"bar"}\wedge y=\&\textsf{bar} \]

Since satisfiability of formulas from that domain turns out to be NP-complete, we provide a further mild abstraction, again for arbitrary relational domains, to provide us with a weakly relational domain where all required operations become tractable.

Another family of relational non-numerical domains has been introduced by Arceri et al. (2022). Based on a partial order of values, conjunctions of ordering constraints $x\sqsubseteq y$ for program variables $x, y$ are considered. They observe that analyses of prefixes or the substring relation could be helpful for programs in programming languages supporting high-level operations on strings. Here, we study this kind of directed domains in greater detail. For conjunctions of inequalities over some partial order $P$, we extend the constraints from Arceri et al. (2022) by allowing both lower and upper bounds from $P$ for variables. For arbitrary partial orders, though, we find that satisfiability then is NP-complete. Partial orders $P$ that are lattices form a notable exception. Instances are subsets of some universe, or multisets. For lattices, we show that satisfiability is decidable in polynomial time. Moreover, we provide polynomial constructions both for restriction and for the optimal join operation. For general partial orders of values, in contrast, we cannot hope for polynomial algorithms. Therefore, we provide a meaningful abstraction so that both abstract restriction and join are again polynomial. This family of relational domains is already weakly relational. Still, our generic constructions can be applied to obtain more expressive weakly relational domains that additionally support disjunctions at limited extra cost.

The paper is organized as follows: Section 2 provides background definitions on relational domains. It formally introduces our notion of weakly relational domains and provides a general construction of weakly relational domains. Section 3 is dedicated to disjunctive constants. When applying the generic construction from the last section to this relational domain, the weakly relational domain of 2-disjunctive constants is obtained. Here, we prove that satisfiability for these formulas still is NP-complete. Therefore, a generic abstraction technique is presented so that, when applied to disjunctive constants, normalization, projection, as well as least upper bounds all turn out to be polynomial time.

Finally, abstract transformers for assignments as well as guards are derived. Section 4 then introduces directed domains which do not track equalities but inequalities over a partial order of values. While the first subsection provides polynomial constructions for the case that the partial order for values is a lattice, the second subsection is concerned with arbitrary partial orders as value domain. Since satisfiability, in general, turns out to be NP-complete, again a polynomial abstraction is provided. In a further subsection, we indicate how the generic constructions from the previous sections provide us with weakly relational domains that additionally support disjunctions of inequalities. We exemplify the resulting domains with conjunctions and disjunctions of inequalities over the integers. In the two final subsections, dedicated abstract transformers are constructed for assignments, and the treatment of guards is discussed. Section 5 summarizes the contributions and sketches further directions of research.

2 Weakly Relational Domains

Let us recall basic definitions for relational domains. We mostly follow the notation used in previous work Schwarz et al. (2023), where the notion of 2-decomposability has been introduced. Let $\mathcal{X}$ be some finite set of variables. A relational domain $\mathcal{R}$ maintains relations between variables in $\mathcal{X}$. We require that a relational domain is a bounded lattice, i.e., has a partial order $\sqsubseteq$, a least element $\bot$, a greatest element $\top$, as well as binary operators for the greatest lower bound (meet) $\sqcap$ and the least upper bound (join) $\sqcup$. We do not demand relational domains to be complete lattices, i.e., to provide for every subset of elements a least upper bound: the polyhedral domain, e.g., is not complete Cousot and Halbwachs (1978). However, we demand that a relational domain supports the following monotonic operations:

\[
\begin{array}{rcl}
\llbracket x\,{:=}\,e\rrbracket^{\sharp} &:& \mathcal{R}\to\mathcal{R} \text{ (assignment of $e$ to $x$)}\\
\cdot|_{Y} &:& \mathcal{R}\to\mathcal{R} \text{ (restriction to $Y\subseteq\mathcal{X}$)}\\
\llbracket ?c\rrbracket^{\sharp} &:& \mathcal{R}\to\mathcal{R} \text{ (guard for condition $c$)}
\end{array}
\]

where $e$ and $c$ are from some expression and condition language, respectively.

The abstract transformers for basic actions of programs are given by these functions. Restricting a relation $r$ to a subset $Y$ of variables amounts to forgetting all information about variables in $\mathcal{X}\setminus Y$. Thus, we require that

\[
\begin{array}{lll}
r|_{\mathcal{X}} &=& r\\
r|_{\emptyset} &=& \left\{\begin{array}{ll}\bot & \text{if}\; r=\bot\\ \top & \text{otherwise}\end{array}\right.\\
r|_{Y_1} &\sqsupseteq& r|_{Y_2} \qquad \text{when}\; Y_1\subseteq Y_2\\
(r|_{Y_1})|_{Y_2} &=& r|_{Y_1\cap Y_2}
\end{array}
\tag{1}
\]

A restriction $\cdot|_{Y}$ to some set $Y$ therefore is an idempotent operation. We remark that from these axioms it follows that $\bot|_{Y}=\bot$ and $\top|_{Y}=\top$ for any $Y\subseteq\mathcal{X}$. Given that there is some relation $r_c\in\mathcal{R}$ describing all states satisfying the condition $c$, the transformation for the guard $?c$ can be described by

\[ \llbracket ?c\rrbracket^{\sharp}\,r = r\sqcap r_c \tag{2} \]

at least if there is a concretization function $\gamma$ such that

\[ \gamma\,(r_1\sqcap r_2)=\gamma\,r_1\cap\gamma\,r_2 \tag{3} \]

i.e., the binary meet operation is precise.
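The requirements (1) and the guard transformer (2) can be sanity-checked in a small concrete model. The following is only a sketch under stated assumptions, not a domain proposed in the paper: relations are taken to be plain sets of states over three variables with values 0 and 1, restriction is projection followed by re-adding arbitrary values for the forgotten variables, and the meet is set intersection (so (3) holds trivially with the identity as concretization). All names (`VARS`, `VALUES`, `restrict`, `guard`, `satisfies_axioms`) are hypothetical.

```python
from itertools import product

VARS = ("x", "y", "z")   # assumed variable set X
VALUES = (0, 1)          # assumed tiny value universe
TOP = frozenset(product(VALUES, repeat=len(VARS)))  # every state
BOT = frozenset()                                    # no state

def restrict(r, ys):
    """r|_Y: keep what r says about the variables in ys, forget the rest."""
    idx = [i for i, v in enumerate(VARS) if v in ys]
    kept = {tuple(s[i] for i in idx) for s in r}
    return frozenset(s for s in TOP if tuple(s[i] for i in idx) in kept)

def guard(r, r_c):
    """Equation (2): in this model the meet is plain intersection."""
    return r & r_c

def subsets(vs):
    """All subsets of the variables vs, as frozensets."""
    out = [frozenset()]
    for v in vs:
        out += [s | {v} for s in out]
    return out

def satisfies_axioms(r):
    """Check the four requirements of (1) for one relation r."""
    ok = restrict(r, frozenset(VARS)) == r
    ok &= restrict(r, frozenset()) == (BOT if r == BOT else TOP)
    for y1 in subsets(VARS):
        for y2 in subsets(VARS):
            if y1 <= y2:
                # fewer remembered variables give a larger relation
                ok &= restrict(r, y1) >= restrict(r, y2)
            ok &= restrict(restrict(r, y1), y2) == restrict(r, y1 & y2)
    return bool(ok)

r = frozenset(s for s in TOP if s[0] == s[1] and s[2] == 0)  # x = y and z = 0
z_is_one = frozenset(s for s in TOP if s[2] == 1)             # states with z = 1
print(satisfies_axioms(r), guard(r, z_is_one) == BOT)
```

In this model the guard `z = 1` applied to a relation that enforces `z = 0` yields the empty relation, which plays the role of $\bot$.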

Example 1

For numerical variables, a variety of such relational domains have been proposed, e.g., (conjunctions of) affine equalities Karr (1976); Müller-Olm and Seidl (2004, 2007) or affine inequalities Cousot and Halbwachs (1978). For affine equalities or inequalities, restriction to a subset $Y$ of variables corresponds to the geometric projection onto the subspace defined by $Y$, combined with arbitrary values for variables $z\not\in Y$. ∎

One way to tackle the high cost of relational domains is to track the relationships not between all variables, but only between subclusters of variables. We call such domains Weakly Relational Domains.

For a subset $Y\subseteq\mathcal{X}$, let $\mathcal{R}^{Y}=\{r|_{Y}\mid r\in\mathcal{R}\}$ be the set of all abstract values from $\mathcal{R}$ that contain information only on the variables in $Y$. For any collection $\mathcal{S}\subseteq 2^{\mathcal{X}}$ of clusters of variables, a relation $r\in\mathcal{R}$ can be approximated by a meet of relations from $\mathcal{R}^{Y}$, $Y\in\mathcal{S}$, since for every $r\in\mathcal{R}$,

\[ r\sqsubseteq\bigsqcap_{Y\in\mathcal{S}} r|_{Y} \tag{4} \]

holds, as $r\sqsubseteq r|_{Y}$ holds for each $Y\in\mathcal{S}$. In fact, the right-hand side of (4) is the best approximation of $r$ by some meet over abstract relations $s_Y$, $Y\in\mathcal{S}$, with $s_Y\in\mathcal{R}^{Y}$, i.e., with $s_Y|_{Y}=s_Y$, since, whenever $r\sqsubseteq\bigsqcap_{Y'\in\mathcal{S}}s_{Y'}$,

\[
\begin{array}{lcl}
r|_{Y} &\sqsubseteq& \left(\bigsqcap_{Y'\in\mathcal{S}} s_{Y'}\right)|_{Y}\\
&\sqsubseteq& s_Y|_{Y} \qquad\qquad\quad \text{(by monotonicity of restriction)}\\
&=& s_Y
\end{array}
\]

holds for all $Y\in\mathcal{S}$.

Schwarz et al. (2023) have introduced 2-decomposable relational domains. These are domains where the full value $r$ can be recovered from the restrictions of $r$ to all clusters $p$ from the set $\mathcal{S}=[\mathcal{X}]_2$ of non-empty clusters of variables of size at most 2. Furthermore, Schwarz et al. (2023) ask for binary least upper bounds to be determined by computing within these clusters only. More precisely, this amounts to requiring the following two properties

\begin{align}
r &= \bigsqcap_{p\in[\mathcal{X}]_2} r|_{p} \tag{5}\\
(r_1\sqcup r_2)|_{p} &= r_1|_{p}\sqcup r_2|_{p} \qquad (p\in[\mathcal{X}]_2) \tag{6}
\end{align}

to hold for all abstract relations $r, r_1, r_2\in\mathcal{R}$. The most prominent example of a 2-decomposable domain is the octagon domain Miné (2001), either over rationals or integers, while affine equalities or affine inequalities are examples of domains that are not 2-decomposable.
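To see what property (5) demands, it can help to compute the meet of all 2-cluster restrictions in a toy model. The sketch below assumes, purely for illustration, that relations are sets of states over three 0/1-valued variables and that the meet is set intersection; the names `clusters2`, `approx2`, `pairwise`, and `parity` are hypothetical. A relation built from constraints on at most two variables is recovered exactly, while a genuinely ternary relation is not.

```python
from itertools import combinations, product

VARS = ("x", "y", "z")   # assumed variable set X
VALUES = (0, 1)          # assumed tiny value universe
TOP = frozenset(product(VALUES, repeat=len(VARS)))  # every state

def restrict(r, ys):
    """r|_Y: keep what r says about the variables in ys, forget the rest."""
    idx = [i for i, v in enumerate(VARS) if v in ys]
    kept = {tuple(s[i] for i in idx) for s in r}
    return frozenset(s for s in TOP if tuple(s[i] for i in idx) in kept)

def clusters2():
    """[X]_2: all non-empty clusters of at most two variables."""
    singles = [frozenset({v}) for v in VARS]
    pairs = [frozenset(p) for p in combinations(VARS, 2)]
    return singles + pairs

def approx2(r):
    """Right-hand side of (5): the meet of r|_p over all p in [X]_2."""
    out = TOP
    for p in clusters2():
        out &= restrict(r, p)
    return out

# x = y and z = 0: expressible with constraints on at most two variables
pairwise = frozenset(s for s in TOP if s[0] == s[1] and s[2] == 0)
# x + y + z even: relates all three variables at once
parity = frozenset(s for s in TOP if sum(s) % 2 == 0)

print(approx2(pairwise) == pairwise)   # (5) holds for this relation
print(approx2(parity) == TOP)          # every 2-cluster restriction forgets the parity
```

For `parity`, inequality (4) is strict: each restriction to two variables already admits all value combinations, so the meet of the restrictions carries no information.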

Any relational domain $\mathcal{R}$, however, which satisfies (6) gives rise to a 2-decomposable domain $\mathcal{R}_2$ of its 2-cluster approximations.

For $r\in\mathcal{R}$, let $\overline{r}=\bigsqcap_{p\in[\mathcal{X}]_2} r|_{p}$ denote the approximation of $r$ by the meet of its restrictions to clusters $p\in[\mathcal{X}]_2$. Let $\mathcal{R}_2$ denote the subset of $\mathcal{R}$ of all abstract relations of the form $\overline{r}$, $r\in\mathcal{R}$, where the ordering is inherited from $\mathcal{R}$. In particular, $\bot$ as well as $\top$ from $\mathcal{R}$ are also in $\mathcal{R}_2$.

Theorem 2.1

Assume that $\mathcal{R}$ is an abstract relational domain which satisfies (6). Then the following holds:

  1. $r=\overline{r}$ for all conjunctions $r=\bigsqcap_{p\in[\mathcal{X}]_2}s_p$ with $s_p\in\mathcal{R}^{p}$, $p\in[\mathcal{X}]_2$, i.e., all such conjunctions are contained in $\mathcal{R}_2$.

  2. For $r_1,r_2\in\mathcal{R}_2$, the abstract relation $r_1\sqcap r_2$, as provided by $\mathcal{R}$, is in $\mathcal{R}_2$.

  3. The binary least upper bound operation in $\mathcal{R}_2$ exists and is given by

    \[ r_1\sqcup_2 r_2=\bigsqcap_{p\in[\mathcal{X}]_2}\left(r_1|_{p}\sqcup r_2|_{p}\right) \]
  4. For $\mathcal{R}_2$, the best approximation $r|_{Y,2}$ to the restriction $r|_{Y}$ of $r\in\mathcal{R}_2$ onto some subset $Y\subseteq\mathcal{X}$ of variables is given by

    \[ r|_{Y,2}=\bigsqcap_{p\in[\mathcal{X}]_2} r|_{p\cap Y} \]
  5. The partial order $\mathcal{R}_2$ with the given binary greatest lower and least upper bounds is a 2-decomposable relational domain.
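Statement (3) of the theorem can be illustrated in a toy model. The sketch below assumes, for illustration only, that relations are sets of states over three 0/1-valued variables, with union as join and intersection as meet; property (6) holds in this model because restricting a union of state sets gives the union of the restrictions. The names `approx2`, `join2`, `r1`, and `r2` are hypothetical. It computes $r_1\sqcup_2 r_2$ cluster by cluster and compares it with $\overline{r_1\sqcup r_2}$.

```python
from itertools import combinations, product

VARS = ("x", "y", "z")   # assumed variable set X
VALUES = (0, 1)          # assumed tiny value universe
TOP = frozenset(product(VALUES, repeat=len(VARS)))  # every state

def restrict(r, ys):
    """r|_Y: keep what r says about the variables in ys, forget the rest."""
    idx = [i for i, v in enumerate(VARS) if v in ys]
    kept = {tuple(s[i] for i in idx) for s in r}
    return frozenset(s for s in TOP if tuple(s[i] for i in idx) in kept)

def clusters2():
    """[X]_2: all non-empty clusters of at most two variables."""
    singles = [frozenset({v}) for v in VARS]
    pairs = [frozenset(p) for p in combinations(VARS, 2)]
    return singles + pairs

def approx2(r):
    """The 2-cluster approximation: meet of r|_p over all p in [X]_2."""
    out = TOP
    for p in clusters2():
        out &= restrict(r, p)
    return out

def join2(r1, r2):
    """Join in R_2 as in statement (3): join per cluster, then meet."""
    out = TOP
    for p in clusters2():
        out &= restrict(r1, p) | restrict(r2, p)
    return out

r1 = frozenset(s for s in TOP if s[0] == s[1] and s[2] == 0)  # x = y and z = 0
r2 = frozenset(s for s in TOP if s[0] != s[1] and s[2] == 1)  # x != y and z = 1

print(approx2(r1) == r1, approx2(r2) == r2)   # both arguments lie in R_2
print(join2(r1, r2) == approx2(r1 | r2))      # cluster-wise join matches the approximation
print((r1 | r2) < join2(r1, r2))              # and may lie strictly above the join in R
```

In this example the plain union of `r1` and `r2` relates all three variables, so it is not itself a meet of 2-cluster relations; the join computed inside the 2-cluster approximations is the full state set.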

Proof

For a proof of statement (1), we first observe that for each $p\in[\mathcal{X}]_2$,

\[ r|_{p}=\left(\bigsqcap_{p'\in[\mathcal{X}]_2}s_{p'}\right)|_{p}\sqsubseteq s_p|_{p}=s_p \]

by monotonicity and idempotence of restriction. Thus,

\[ r\sqsubseteq\overline{r}=\bigsqcap_{p\in[\mathcal{X}]_2} r|_{p}\sqsubseteq\bigsqcap_{p\in[\mathcal{X}]_2}s_p=r \]

where the first inequality follows from Eq. 4. Thus, statement (1) follows.

For a proof of statement (2), consider elements $r,s\in\mathcal{R}_2$. Then

\[ r\sqcap s=\bigsqcap_{p\in[\mathcal{X}]_2} r|_{p}\sqcap\bigsqcap_{p\in[\mathcal{X}]_2} s|_{p}=\bigsqcap_{p\in[\mathcal{X}]_2}\left(r|_{p}\sqcap s|_{p}\right) \]

Now, we claim that for every $p\in[\mathcal{X}]_2$,

\[ r|_{p}\sqcap s|_{p}=\left(r|_{p}\sqcap s|_{p}\right)|_{p} \]

To prove the claim, we argue that

\[
\begin{array}{lcl@{\quad}l}
r|_{p}\sqcap s|_{p} &\sqsubseteq& \left(r|_{p}\sqcap s|_{p}\right)|_{p} & \text{(by monotonicity)}\\
&\sqsubseteq& (r|_{p})|_{p}\sqcap(s|_{p})|_{p} & \text{(by monotonicity)}\\
&=& r|_{p}\sqcap s|_{p} & \text{(by idempotence)}
\end{array}
\]

and the claim follows. So far, we have proven that

$$r \sqcap s = \bigsqcap_{p\in[\mathcal{X}]_2} t_p$$

for some $t_p \in \mathcal{R}^{p}$, $p \in [\mathcal{X}]_2$. Then, statement (2) follows from statement (1).

For a proof of statement (3), we note that any upper bound of $r_1, r_2$ in $\mathcal{R}_2$ is also an upper bound of $r_1 \sqcup r_2$ in $\mathcal{R}$. Therefore, the least upper bound of $r_1, r_2$ in $\mathcal{R}_2$ is given by $\overline{r_1 \sqcup r_2}$. We calculate:

$$\begin{array}{lll@{\quad}l}
\overline{r_1 \sqcup r_2} & = & \bigsqcap_{p\in[\mathcal{X}]_2} (r_1 \sqcup r_2)|_{p} & \text{(by definition)}\\
 & = & \bigsqcap_{p\in[\mathcal{X}]_2} (r_1|_{p} \sqcup r_2|_{p}) & \text{(by (6))}
\end{array}$$

and statement (3) follows.

The best approximation of $r|_{Y}$ in $\mathcal{R}_2$ is given by $\overline{r|_{Y}}$. Thus, we have

$$r|_{Y,2} = \bigsqcap_{p\in[\mathcal{X}]_2} (r|_{Y})|_{p} = \bigsqcap_{p\in[\mathcal{X}]_2} r|_{Y\cap p} = \bigsqcap_{p\in[\mathcal{X}]_2} (r|_{p})|_{Y}$$

i.e., it can be determined by applying the restriction onto variables from $Y$ for each cluster $p \in [\mathcal{X}]_2$ separately. This implies statement (4).

Statement (5) is an immediate consequence of statements (3) and (4). ∎

For instance, the polyhedral domain satisfies (6). Applied to the polyhedral relational domain, the construction from Theorem 2.1 results in the domain of affine inequalities with at most two variables per inequality Simon et al. (2002).

According to Theorem 2.1, every value $r$ from the 2-decomposable relational domain $\mathcal{R}_2$ can be represented as the meet of its restrictions to 2-clusters, i.e., by the collection $\langle r|_{p} \rangle_{p\in[\mathcal{X}]_2}$. We call this representation normal, and an algorithm that computes it normalization. Consider now an arbitrary collection $\langle s_p \rangle_{p\in[\mathcal{X}]_2}$ with $s_p \in \mathcal{R}^{p}$ such that $r = \bigsqcap_{p\in[\mathcal{X}]_2} s_p$. Then $r|_{p} \sqsubseteq s_p$ always holds, while equality need not hold. In the Octagon domain over the rationals or the integers, the normal representation of an octagon value corresponds to its closure as introduced in previous work Miné (2001); Bagnara et al. (2008).
While a cubic-time closure algorithm for rational Octagons was already proposed by Miné (2001), a corresponding algorithm for constraints interpreted over the integers was provided only much more recently Bagnara et al. (2008, 2009).

Subsequently, we introduce non-numerical weakly relational domains and provide polynomial algorithms for these.

3 Disjunctive Constants

Constant propagation relies on a domain that maintains conjunctions of atomic propositions $x = a$ where $x$ is a program variable and $a$ is from a finite set $U$ of possible values. In the following, we consider a (mild) generalization of this domain where also disjunctions of at most two atomic propositions are allowed.

Assume we are given a finite set $U$ representing possible values for variables from $\mathcal{X}$. We consider propositions of the form $(x \in A)$ for $A \subseteq U$ which correspond to the disjunction of atomic propositions $x = a$, $a \in A$. Thus, the proposition $x \in A$ for some $A \subseteq U$ can be understood as an atomic proposition of a multi-valued propositional logic where $A$ serves as the set of logical values of the propositional variable $x$ Beckert et al. (2000). Every monotonic Boolean combination $\Psi$ of propositions $x \in A$ with $x \in \mathcal{X}$, $A \subseteq U$, represents a function $\llbracket\Psi\rrbracket : (\mathcal{X} \to U) \to \mathcal{B}$ defined by

$$\begin{array}{lll}
\llbracket x \in A \rrbracket\;\sigma & = & (\sigma\,x) \in A\\
\llbracket \Psi_1 \vee \Psi_2 \rrbracket\;\sigma & = & \llbracket \Psi_1 \rrbracket\,\sigma \vee \llbracket \Psi_2 \rrbracket\,\sigma\\
\llbracket \Psi_1 \wedge \Psi_2 \rrbracket\;\sigma & = & \llbracket \Psi_1 \rrbracket\,\sigma \wedge \llbracket \Psi_2 \rrbracket\,\sigma
\end{array}$$
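The semantic function can be sketched as a small recursive evaluator. The Python snippet below is only an illustration: the tuple encoding of formulas (`("in", x, A)`, `("or", f, g)`, `("and", f, g)`) and the name `evaluate` are assumptions made for this sketch and do not appear in the original development.

```python
def evaluate(psi, sigma):
    """Evaluate a monotonic formula psi under an assignment sigma (dict X -> U).

    Hypothetical encoding: ("in", x, A), ("or", f, g), ("and", f, g).
    """
    tag = psi[0]
    if tag == "in":
        _, x, values = psi
        return sigma[x] in values
    if tag == "or":
        return evaluate(psi[1], sigma) or evaluate(psi[2], sigma)
    if tag == "and":
        return evaluate(psi[1], sigma) and evaluate(psi[2], sigma)
    raise ValueError(f"unknown connective: {tag!r}")


# (x in {a}) or ((y in {b}) and (x in {b, c}))
psi = ("or",
       ("in", "x", {"a"}),
       ("and", ("in", "y", {"b"}), ("in", "x", {"b", "c"})))

print(evaluate(psi, {"x": "a", "y": "c"}))  # True: the first disjunct holds
print(evaluate(psi, {"x": "c", "y": "c"}))  # False: neither disjunct holds
```

Since only $\vee$ and $\wedge$ occur, enlarging any of the sets $A$ can only turn the result from false to true, which is the monotonicity the text relies on.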

Let $\mathcal{C}[U]$ denote the complete lattice of all equivalence classes of formulas $\Psi$ where the ordering is semantic implication. The least element in this ordering can be represented by the empty disjunction or $\bot$ (false), while the greatest element is equivalent to the empty conjunction or $\top$ (true). Each formula $\Psi$ has an equivalent CNF as well as an equivalent DNF where each clause (conjunction) contains at most one proposition $x \in A$ for every variable $x$. Converting $\Psi$ into DNF allows checking satisfiability and computing the restriction $\Psi|_{Y}$ onto a subset $Y \subseteq \mathcal{X}$ of variables. A formula for $\Psi|_{Y}$ is obtained from a DNF for $\Psi$ where each conjunction contains at most one proposition for each variable by the following steps: First, every conjunction which contains $y \in \emptyset$ for some $y$ is removed. Then, from each remaining conjunction, every proposition $y \in A$ with $y \notin Y$ is removed. It follows that the restriction $\Psi|_{Y}$ is distributive, i.e., commutes with binary least upper bounds.
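The two restriction steps can be sketched as follows. This is a minimal illustration under assumptions of our own: a DNF is a list of conjunctions, each conjunction is a dict mapping a variable to the set $A$ of its single proposition, and the names `restrict_dnf` and `is_satisfiable` are hypothetical.

```python
def restrict_dnf(dnf, keep):
    """Restrict a DNF to the variables in `keep`.

    `dnf` is a list of conjunctions; each conjunction maps a variable x to
    the set A of its only proposition (x in A).
    """
    result = []
    for conj in dnf:
        # Step 1: drop conjunctions with an unsatisfiable proposition (y in {}).
        if any(len(values) == 0 for values in conj.values()):
            continue
        # Step 2: drop all propositions on variables outside `keep`.
        result.append({x: set(vs) for x, vs in conj.items() if x in keep})
    return result


def is_satisfiable(dnf):
    """A DNF is satisfiable iff some conjunction has no empty proposition."""
    return any(all(len(vs) > 0 for vs in conj.values()) for conj in dnf)


# (x in {a} and y in {b})  or  (x in {b} and z in {})
dnf = [{"x": {"a"}, "y": {"b"}}, {"x": {"b"}, "z": set()}]
print(restrict_dnf(dnf, {"x"}))  # [{'x': {'a'}}]
print(is_satisfiable(dnf))       # True
```

In this encoding an empty dict stands for the empty conjunction $\top$ and an empty list for $\bot$. Because the restriction is applied to each conjunction separately, restricting a disjunction gives the same result as the disjunction of the restrictions.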

For an arbitrary $\Psi \in \mathcal{C}[U]$, computing an equivalent DNF is, in general, an exponential-time operation. The same holds if all restrictions $\Psi|_{\{x,y\}}$ are computed via this normal form. Let $\mathcal{C}_2[U]$ denote the 2-decomposable domain obtained from $\mathcal{C}[U]$ according to Theorem 2.1. The lattice $\mathcal{C}_2[U]$ consists of all elements $\Psi$ which can be represented as conjunctions of clauses with at most two propositions $x \in A_x$ per clause. According to Theorem 2.1, the least upper bound operation $\sqcup_2$ for $\mathcal{C}_2[U]$ can be realized by a clusterwise disjunction. In particular, it does not coincide with logical disjunction, but is an over-approximation of it.

Example 2

Let $\Psi_1 \equiv (x \in \{a\})$ and $\Psi_2 \equiv (y \in \{b\} \lor z \in \{c\})$. Then both $\Psi_1$ and $\Psi_2$ are from $\mathcal{C}_2[U]$, but their disjunction is not. In fact, the least upper bound in $\mathcal{C}_2[U]$ for

$$(x \in \{a\}) \lor (y \in \{b\}) \lor (z \in \{c\})$$

is $\top$. ∎
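Example 2 can be checked by brute force for a small value set. The sketch below assumes $U = \{a, b, c\}$ (the example itself does not fix $U$), enumerates all models of the three-way disjunction, and projects them onto every 2-cluster; the helper name `restriction` is hypothetical.

```python
from itertools import combinations, product

U = ["a", "b", "c"]        # assumed value set for this illustration
VARS = ["x", "y", "z"]


def psi(sigma):
    # (x in {a}) or (y in {b}) or (z in {c})
    return sigma["x"] == "a" or sigma["y"] == "b" or sigma["z"] == "c"


assignments = [dict(zip(VARS, vals)) for vals in product(U, repeat=len(VARS))]
models = [m for m in assignments if psi(m)]


def restriction(models, cluster):
    """Project a set of models onto the variables of `cluster`."""
    return {tuple(m[v] for v in cluster) for m in models}


top = set(product(U, repeat=2))
all_top = all(restriction(models, p) == top for p in combinations(VARS, 2))

print(len(models), len(assignments))  # 19 27: the disjunction itself is not top
print(all_top)                        # True: every 2-cluster restriction is top
```

Every restriction to two variables is unconstrained, so the meet of the restrictions is $\top$ although 8 of the 27 assignments falsify the disjunction.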

3.1 Approximating 2-disjunctive Conjunctions

Any CNF $\Psi$ over some set $Y$ of variables of bounded size can, in polynomial time, be transformed into a DNF $\Psi'$. Each DNF over two distinct variables $x, y$ can be brought into the canonical normal form

$$\bigvee_{(a,b)\in L} (x = a) \wedge (y = b) \tag{7}$$

for some $L \subseteq U \times U$. Conjunction and disjunction of two such normal forms then correspond to intersection and union of the respective subsets of $U \times U$.
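A minimal sketch of the canonical form (7), assuming $U = \{a, b, c\}$ and a DNF given as a list of dicts (variable to value set, with a missing variable meaning "unconstrained"); the function name `canonical` is hypothetical.

```python
from itertools import product

U = {"a", "b", "c"}  # assumed value set for this illustration


def canonical(dnf, x, y):
    """Return the set L of (7): all pairs (a, b) allowed for (x, y)."""
    pairs = set()
    for conj in dnf:
        # A variable not mentioned in the conjunction ranges over all of U.
        pairs |= set(product(conj.get(x, U), conj.get(y, U)))
    return pairs


# (x in {a}) or (x in {b} and y in {c})
L1 = canonical([{"x": {"a"}}, {"x": {"b"}, "y": {"c"}}], "x", "y")
# (y in {c})
L2 = canonical([{"y": {"c"}}], "x", "y")

conjunction = L1 & L2   # intersection of the subsets of U x U
disjunction = L1 | L2   # union of the subsets of U x U

print(sorted(conjunction))  # [('a', 'c'), ('b', 'c')]
```

Each normal form has at most $|U|^2$ pairs, so both operations are polynomial for a fixed pair of variables.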

For arbitrary sets $Y$ of variables, though, it is non-trivial even to decide whether a given conjunction is different from $\bot$.

Theorem 3.1

To decide for a formula $\Psi \in \mathcal{C}_2[U]$ whether or not $\Psi$ is satisfiable, i.e., different from $\bot$, is NP-complete.

Proof

Since a satisfying assignment for $\Psi$ can be guessed and then checked in polynomial time, satisfiability of $\Psi$ is in NP. NP-hardness, on the other hand, follows by a reduction from 3-colorability of graphs Beckert et al. (2000). We illustrate the reduction with an example.

Example 3

For $\mathcal{X} = \{x_1, x_2, x_3, x_4\}$, consider the formula $\Psi$

$$\bigwedge_{\{x_i,x_j\}\in E}
\begin{array}[t]{ll}
(x_i \in \{b,c\} \vee x_j \in \{b,c\}) & \wedge\\
(x_i \in \{a,c\} \vee x_j \in \{a,c\}) & \wedge\\
(x_i \in \{a,b\} \vee x_j \in \{a,b\}) &
\end{array}$$

where $E$ is given by

$$E = \left\{\{x_1,x_2\},\{x_1,x_4\},\{x_2,x_3\},\{x_3,x_4\},\{x_1,x_3\}\right\}$$

Then $\Psi$ is satisfiable iff the undirected graph $(\mathcal{X}, E)$ has a 3-coloring. In the given example, the graph

[Figure: undirected graph on the nodes $x_1$, $x_2$, $x_3$, $x_4$]

cannot be colored by three colors. Therefore, $\Psi$ is equivalent to $\bot$. ∎
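The reduction itself is easy to sketch: every edge contributes three 2-clauses, each stating that the two endpoints do not both carry one particular color. The Python code below is an illustration only; the clause encoding, the function names, and the two test graphs (a 4-cycle and the complete graph on four nodes) are our own assumptions, and the satisfiability check is a plain brute-force search.

```python
from itertools import product

COLORS = ("a", "b", "c")


def coloring_formula(edges):
    """2-CNF for an edge set; a clause [(xi, Ai), (xj, Aj)] reads
    (xi in Ai) or (xj in Aj)."""
    clauses = []
    for xi, xj in edges:
        for color in COLORS:
            others = frozenset(COLORS) - {color}
            # The endpoints xi and xj must not both carry `color`.
            clauses.append([(xi, others), (xj, others)])
    return clauses


def satisfiable(variables, clauses):
    """Brute-force check, exponential in the number of variables."""
    for values in product(COLORS, repeat=len(variables)):
        sigma = dict(zip(variables, values))
        if all(any(sigma[x] in A for x, A in clause) for clause in clauses):
            return True
    return False


nodes = ["x1", "x2", "x3", "x4"]
cycle = [("x1", "x2"), ("x2", "x3"), ("x3", "x4"), ("x4", "x1")]
complete = [(u, v) for i, u in enumerate(nodes) for v in nodes[i + 1:]]

print(satisfiable(nodes, coloring_formula(cycle)))     # True: a 4-cycle is 2-colorable
print(satisfiable(nodes, coloring_formula(complete)))  # False: K4 needs four colors
```

The formula has three clauses per edge, so it is built in time linear in the size of the graph.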

Exact normalization (as defined in Section 2) of a relation represented by some 2-CNF thus, in general, may be difficult to compute. Instead of giving dedicated further abstraction techniques, we prefer to provide, for an arbitrary relational domain $\mathcal{R}$, a general construction to approximate the 2-decomposable domain $\mathcal{R}_2$ further by a 2-decomposable domain $\mathcal{R}_2^{\sharp}$. This construction is based on approximate normalization.

Assume that an element in $\mathcal{R}_2$ is given by the meet $\bigsqcap R$ where $R$ is the collection $\langle s_p \rangle_{p\in[\mathcal{X}]_2}$ with $s_p \in \mathcal{R}^{p}$ ($p \in [\mathcal{X}]_2$). According to Theorem 2.1, $(\bigsqcap R)|_{p} \sqsubseteq s_p$ for all $p \in [\mathcal{X}]_2$. As we have seen for 2-disjunctive constants, however, exact normalization of $\bigsqcap R$, i.e., the values $(\bigsqcap R)|_{p}$, may be hard to compute precisely. For an approximate normalization, we introduce a constraint system in unknowns $r_p$, $p \in [\mathcal{X}]_2$, with the constraints

$$\begin{array}{lll@{\;\;}r}
r_{\{x,y\}} & \sqsubseteq & s_{\{x,y\}} & (x,y \in \mathcal{X})\\
r_{\{x,y\}} & \sqsubseteq & (r_{\{x,z\}} \sqcap r_{\{z,y\}})|_{\{x,y\}} & (x,y,z \in \mathcal{X})
\end{array} \tag{8}$$

This constraint system has already been considered for the normalization of 2-projective domains Schwarz and Seidl (2023). As all right-hand sides are monotonic, the constraint system has a greatest solution whenever each $\mathcal{R}^{p}$, $p \in [\mathcal{X}]_2$, is a complete lattice.
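For 2-disjunctive constants over a finite $U$, the greatest solution of (8) can be computed by iterating downwards from the $s_p$. The sketch below makes several assumptions of its own: each $r_{\{x,y\}}$ is stored as a set of value pairs under both orientations $(x,y)$ and $(y,x)$, unary information on $x$ is kept on the diagonal entry $(x,x)$, and the iteration updates entries in place (a chaotic variant that reaches the same greatest solution on a finite lattice). The names `compose` and `approx_normalize` are hypothetical.

```python
from itertools import product


def compose(r_xz, r_zy):
    """Meet of r_xz and r_zy restricted to {x, y}: join on z, project z away."""
    return {(a, b) for (a, c) in r_xz for (c2, b) in r_zy if c == c2}


def approx_normalize(variables, U, s):
    """Greatest solution of (8) by downward iteration.

    `s` maps ordered pairs (x, y) to sets of value pairs; missing entries
    are unconstrained.
    """
    top = set(product(U, repeat=2))
    diag = {(a, a) for a in U}
    r = {}
    for x, y in product(variables, repeat=2):
        start = diag if x == y else top
        backward = {(b, a) for (a, b) in s.get((y, x), top)}
        r[(x, y)] = start & s.get((x, y), top) & backward
    changed = True
    while changed:
        changed = False
        for x, y, z in product(variables, repeat=3):
            new = r[(x, y)] & compose(r[(x, z)], r[(z, y)])
            if new != r[(x, y)]:
                r[(x, y)] = new
                changed = True
    return r


def neq(U):
    return {(a, b) for a in U for b in U if a != b}


# Three pairwise distinct variables over two values: detected as bottom.
tri = approx_normalize(["x", "y", "z"], ["a", "b"],
                       {("x", "y"): neq("ab"), ("y", "z"): neq("ab"),
                        ("x", "z"): neq("ab")})
print(all(len(v) == 0 for v in tri.values()))  # True

# Four pairwise distinct variables over three values: unsatisfiable,
# yet the iteration leaves every constraint unchanged.
nodes = ["x1", "x2", "x3", "x4"]
k4 = approx_normalize(nodes, ["a", "b", "c"],
                      {(u, v): neq("abc")
                       for i, u in enumerate(nodes) for v in nodes[i + 1:]})
print(k4[("x1", "x2")] == neq("abc"))  # True
```

The second run shows why the result is only an approximation: the greatest solution of (8) over-approximates the exact restrictions, which would all be $\bot$ here.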

In case that there is a greatest solution $\langle r_p \rangle_{p\in[\mathcal{X}]_2}$, $(\bigsqcap R)|_{p} \sqsubseteq r_p$ holds for all $p$, since $\langle (\bigsqcap R)|_{p} \rangle_{p\in[\mathcal{X}]_2}$ is also a solution of the system (8). Then we call the collection $\langle r_p \rangle_{p\in[\mathcal{X}]_2}$ the approximate normal form of the collection $R$. Here, we are not only interested in the existence of a greatest solution of (8) but also in whether it can be effectively computed. For that, we consider the sets of values possibly occurring during some fixpoint iteration for a particular collection $R = \langle s_p \rangle_{p\in[\mathcal{X}]_2}$.

Let $I_{\mathcal{R}}[R]^{p}$, $p \in [\mathcal{X}]_2$, be the least collection of sets such that

  • $s_p \in I_{\mathcal{R}}[R]^{p}$;

  • If $r, r' \in I_{\mathcal{R}}[R]^{p}$ then also $r \sqcap r' \in I_{\mathcal{R}}[R]^{p}$;

  • If $r \in I_{\mathcal{R}}[R]^{\{x,z\}}$ and $r' \in I_{\mathcal{R}}[R]^{\{z,y\}}$, then
    $(r \sqcap r')|_{\{x,y\}} \in I_{\mathcal{R}}[R]^{\{x,y\}}$ for all $x,y,z \in \mathcal{X}$.

The sets $I_{\mathcal{R}}[R]^{p}$ collect the potential iterates occurring during greatest fixpoint iteration of (8). By construction, each set $I_{\mathcal{R}}[R]^{p}$ has a greatest element, namely, $s_p$, and is closed under binary $\sqcap$. For the termination of Kleene fixpoint iteration for (8), it suffices for each set $I_{\mathcal{R}}[R]^{p}$ to have a least element; the collection of these least elements then coincides with the greatest solution of (8). This observation is summarized in the following proposition.

Proposition 1

The following two statements are equivalent:

  1. For each $p \in [\mathcal{X}]_2$, $I_{\mathcal{R}}[R]^{p}$ has a least element;

  2. The constraint system (8) has a greatest solution which can be attained by Kleene fixpoint iteration.

Proof

Assume that for each $p \in [\mathcal{X}]_2$, there is a least element $d_p \in I_{\mathcal{R}}[R]^{p}$. We claim that $\underline{R} = \langle d_p \rangle_{p\in[\mathcal{X}]_2}$ is the greatest solution of (8). Since for each $p \in [\mathcal{X}]_2$, $d_p$ is a lower bound to all elements in $I_{\mathcal{R}}[R]^{p}$, all constraints of (8) are satisfied. Therefore, $\underline{R}$ is a solution.
By induction on the definition of the sets $I_{\mathcal{R}}[R]^{p}$, any other solution $R' = \langle r'_p \rangle_{p\in[\mathcal{X}]_2}$ consists of lower bounds of these sets, i.e., $r'_p \sqsubseteq \bigsqcap I_{\mathcal{R}}[R]^{p} = d_p$, which implies our claim. To conclude statement (2), it remains to prove that the greatest solution $\underline{R}$ can be reached by Kleene iteration. For every $p$, $d_p$ is an element of the set $I_{\mathcal{R}}[R]^{p}$ and is therefore obtained after finitely many applications of the inductive rules of the definition. Let $h$ be an upper bound to these numbers for all $d_p$, $p \in [\mathcal{X}]_2$.
Then, Kleene iteration for the constraint system (8) will also reach these values after at most $h$ iterations.

For the reverse direction, assume that Kleene iteration for the greatest solution of (8) terminates after $h$ iterations with a collection $\underline{R}=\langle d_p\rangle_{p\in[\mathcal{X}]_2}$. By induction on the number $j$ of rounds, we find each value $d^{(j)}_p$ attained for $r_p$, $p\in[\mathcal{X}]_2$, after $j$ rounds, is an element of $I_{\mathcal{R}}[R]^p$. Therefore, $d_p=d_p^{(h)}\in I_{\mathcal{R}}[R]^p$ for all $p$. It remains to prove that $d_p$ is also a lower bound of $I_{\mathcal{R}}[R]^p$. To show this, we again proceed by induction, this time on the number $i$ of applications of the inductive rule for the construction of the $I_{\mathcal{R}}[R]^p$, and prove that for all $i$ and any value $d'$ added to some set $I_{\mathcal{R}}[R]^p$ in the $i$th step, it holds that $d_p^{(i)}\sqsubseteq d'$. Therefore, $d_p$ is a lower bound to $I_{\mathcal{R}}[R]^p$ for all $p$, and statement (1) follows. ∎

If all operations on abstract relations $r\in\mathcal{R}^Y$ for clusters $Y$ of size at most 3 take constant time and the heights of all $\mathcal{R}[R]^p$ are bounded by $h$, then the greatest solution of the constraint system (8) can be computed in time polynomial in $h$ and the number of variables.
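The iteration can be sketched as follows. This is a minimal illustration rather than the authors' implementation. It assumes that (8) consists of the constraints $r_p\sqsubseteq s_p$ and $r_{\{x,y\}}\sqsubseteq(r_{\{x,z\}}\sqcap r_{\{z,y\}})|_{\{x,y\}}$, and it represents each abstract relation extensionally as a finite set of rows, where a row is a frozenset of (variable, value) pairs. The names `top`, `meet`, `restrict` and `greatest_solution` are hypothetical.

```python
def top(cluster, universe):
    """All rows over `cluster`: the relation that constrains nothing."""
    rows = {frozenset()}
    for v in cluster:
        rows = {row | {(v, c)} for row in rows for c in universe}
    return rows


def restrict(rel, cluster):
    """Project every row of `rel` onto the variables in `cluster`."""
    return {frozenset((v, c) for v, c in row if v in cluster) for row in rel}


def meet(r1, r2):
    """Natural join: combine rows that agree on all shared variables."""
    out = set()
    for a in r1:
        bound = dict(a)
        for b in r2:
            if all(bound.get(v, c) == c for v, c in b):
                out.add(a | b)
    return out


def greatest_solution(variables, start):
    """Round-robin Kleene iteration, descending from the start values s_p.

    `start` maps every cluster frozenset({x, y}) (x == y allowed) to s_p.
    The constraint shape is an assumption about system (8), see above.
    """
    r = {p: set(rel) for p, rel in start.items()}
    changed = True
    while changed:
        changed = False
        for x in variables:
            for y in variables:
                p = frozenset((x, y))
                for z in variables:
                    via_z = meet(r[frozenset((x, z))], r[frozenset((z, y))])
                    new = r[p] & restrict(via_z, p)
                    if new != r[p]:
                        r[p], changed = new, True
    return r
```

Every update strictly shrinks one component, so on finite relations the loop terminates; the number of updates is bounded by the sum of the heights of the component lattices.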

We call a relational domain 2-nice, if the statements of Proposition 1 are satisfied for each collection $R=\langle s_p\rangle_{p\in[\mathcal{X}]_2}$ with $s_p\in\mathcal{R}^p$.

Let us instantiate this construction to 2-disjunctive constants. First, we note that the relational domain $\mathcal{C}[U]$ is finite and thus, in particular, 2-nice. Let $\Psi=\langle s_p\rangle_{p\in[\mathcal{X}]_2}$ denote a collection with $s_p\in\mathcal{C}[U]^p$ for all $p$. Assume that $\mathcal{X}$ consists of $n$ variables, and let $m$ be the number of constants occurring in any of the $s_p$. According to the normal form (7), the lattice $I_{\mathcal{C}[U]}[\Psi]^p$ has height at most $m$ if $p$ consists of a single variable, and height bounded by $m^2$ if $p$ is a two-element set. Since there are $\frac{1}{2}n(n+1)$ clusters, fixpoint iteration will terminate after $\mathcal{O}(n^2\cdot m^2)$ updates. ∎
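The count behind this bound can be spelled out as follows. The sketch assumes $m\geq 1$ and that every update strictly decreases the value of one cluster, so that the number of updates per cluster is bounded by the height of its lattice; the split into $n$ singleton clusters and $\frac{1}{2}n(n-1)$ two-element clusters is the only ingredient not stated explicitly above.

```latex
\[
\underbrace{n\cdot m}_{\text{singleton clusters}}
\;+\;
\underbrace{\tfrac{1}{2}\,n(n-1)\cdot m^{2}}_{\text{two-element clusters}}
\;\leq\;
\tfrac{1}{2}\,n(n+1)\cdot m^{2}
\;=\;
\mathcal{O}(n^{2}\cdot m^{2})
\]
```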

Due to NP-hardness of satisfiability, we cannot expect the greatest solution of the constraint system for 2-disjunctive constants to always return the exact normal form. For the formula from Example 3, e.g., it returns for each pair $\{x_i,x_j\}\in E$, $i\neq j$,

\[
\begin{array}{l}
\left(x_i=a\wedge x_j\in\{b,c\}\right)\vee\left(x_i=b\wedge x_j\in\{a,c\}\right)\vee\\
\quad\left(x_i=c\wedge x_j\in\{a,b\}\right)
\end{array}
\]

which is different from $\bot$.
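The effect can be reproduced on a small instance. The sketch below assumes that the formula from Example 3 encodes 3-colourability of a graph with edge set $E$, which is what the displayed pairwise formulas suggest; the choice of the complete graph on four nodes and all names are hypothetical. Singleton clusters are omitted, since each of them admits all three constants. The collection of pairwise disequalities satisfies the propagation constraints, although no assignment satisfies all of them at once.

```python
from itertools import product

COLOURS = ("a", "b", "c")
NODES = range(4)  # complete graph on four nodes: not 3-colourable

# r[(i, j)] holds the value pairs allowed for (x_i, x_j): here x_i != x_j.
r = {(i, j): {(u, v) for u, v in product(COLOURS, repeat=2) if u != v}
     for i in NODES for j in NODES if i != j}


def is_stable(r):
    """Check r_{i,j} below (r_{i,k} meet r_{k,j}) restricted to {i,j}
    for all pairwise distinct i, j, k (assumed shape of system (8))."""
    return all(
        any((u, w) in r[(i, k)] and (w, v) in r[(k, j)] for w in COLOURS)
        for (i, j), rel in r.items()
        for k in NODES if k not in (i, j)
        for (u, v) in rel
    )


def is_satisfiable(r):
    """Brute force over all assignments of colours to the four variables."""
    return any(
        all((sigma[i], sigma[j]) in rel for (i, j), rel in r.items())
        for sigma in product(COLOURS, repeat=len(NODES))
    )
```

Here `is_stable(r)` holds while `is_satisfiable(r)` does not, so the greatest solution differs from $\bot$ even though the exact normal form would be $\bot$.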

For a relational domain $\mathcal{R}$, we call a collection $R=\langle r_p\rangle_{p\in[\mathcal{X}]_2}$ with $r_p\in\mathcal{R}^p$ for all $p$, stable if it is a solution of the constraint system (8) with $s_p\equiv r_p$. We remark that stability of $R$ implies that, if $r_p=\bot$ for some $p$, then $r_{p'}=\bot$ for all other $p'\in[\mathcal{X}]_2$ as well. Now we introduce for a relational domain $\mathcal{R}$ the domain $\mathcal{R}_2^\sharp$ of all stable collections. The ordering $\sqsubseteq^\sharp$ on the domain $\mathcal{R}_2^\sharp$ is defined by $R\sqsubseteq^\sharp R'$ if $r_p\sqsubseteq r'_p$ for all $p\in[\mathcal{X}]_2$ when $R=\langle r_p\rangle_{p\in[\mathcal{X}]_2}$ and $R'=\langle r'_p\rangle_{p\in[\mathcal{X}]_2}$. Thus, $(\bigsqcap R)\sqsubseteq(\bigsqcap R')$ whenever $R\sqsubseteq^\sharp R'$.

Abstract join as well as abstract restriction for $\mathcal{R}_2^\sharp$ then is modeled along the definitions of join and restriction for $\mathcal{R}_2$, but refers to the representation as solution to the constraint system (8). For $R=\langle r_p\rangle_{p\in[\mathcal{X}]_2}$, $R'=\langle r'_p\rangle_{p\in[\mathcal{X}]_2}$ in $\mathcal{R}_2^\sharp$, we define the abstract join by

\[
R\sqcup^\sharp R'=\langle r_p\sqcup r'_p\rangle_{p\in[\mathcal{X}]_2}
\]

while for $Y\subseteq\mathcal{X}$, and $R=\langle r_p\rangle_{p\in[\mathcal{X}]_2}$, we define abstract restriction by

\[
\begin{array}{lll}
\left.\langle r_p\rangle_{p\in[\mathcal{X}]_2}\right|^\sharp_Y &=& \langle r_p|_Y\rangle_{p\in[\mathcal{X}]_2}\\
&=& \langle r_p|_{Y\cap p}\rangle_{p\in[\mathcal{X}]_2}
\end{array}
\]

where the latter equality follows since for $r_p\in\mathcal{R}^p$, $r_p|_p=r_p$. We have:
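Both operations act on each cluster independently, which the following sketch makes concrete. It is an illustration under assumptions: relations are represented extensionally as sets of rows (frozensets of (variable, value) pairs) over a finite `universe`, the join of two relations on the same cluster is taken to be set union, and restriction to $Y\cap p$ is modelled by forgetting the variables of $p$ outside $Y$. The names `join` and `restrict` are hypothetical.

```python
def join(R1, R2):
    """Abstract join: componentwise union of the per-cluster relations."""
    return {p: R1[p] | R2[p] for p in R1}


def restrict(R, Y, universe):
    """Abstract restriction: every r_p is replaced by r_p restricted to Y & p.

    Variables of p outside Y are forgotten, i.e. they may take any value
    of `universe` afterwards, so that rows still range over the cluster p.
    """
    out = {}
    for p, rel in R.items():
        rows = {frozenset((v, c) for v, c in row if v in Y) for row in rel}
        for v in p - Y:
            rows = {row | {(v, c)} for row in rows for c in universe}
        out[p] = rows
    return out
```

Neither function inspects more than one cluster at a time, so no further propagation through the constraint system is needed after a join or a restriction of stable collections.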

Proposition 2

Assume that $\mathcal{R}$ is 2-nice and satisfies (6). Then we have:

  1. For each $R,R'\in\mathcal{R}_2^\sharp$, also $R\sqcup^\sharp R'$ is again in $\mathcal{R}_2^\sharp$ and is the least upper bound of $R,R'$. Moreover,

    \[
    (\bigsqcap R)\sqcup(\bigsqcap R')\sqsubseteq\bigsqcap(R\sqcup^\sharp R')
    \]

  2. For each $R\in\mathcal{R}_2^\sharp$ and $Y\subseteq\mathcal{X}$, $R|^\sharp_Y$ is again in $\mathcal{R}_2^\sharp$ where

    \[
    (\bigsqcap R)|_Y\sqsubseteq\bigsqcap(R|^\sharp_Y)
    \]

    holds.

  3. For each $R=\langle r_p\rangle_{p\in[\mathcal{X}]_2}$, $R'=\langle r'_p\rangle_{p\in[\mathcal{X}]_2}$ in $\mathcal{R}_2^\sharp$, the greatest lower bound $R\sqcap^\sharp R'=\langle r''_p\rangle_{p\in[\mathcal{X}]_2}$ is determined as the greatest solution of (8) with start values $s_p=r_p\sqcap r'_p$ ($p\in[\mathcal{X}]_2$).

Proof

For the first statement, let $R=\langle r_p\rangle_{p\in[\mathcal{X}]_2}$ and $R'=\langle r'_p\rangle_{p\in[\mathcal{X}]_2}$. As the ordering on $\mathcal{R}_2^\sharp$ is componentwise, it suffices to prove that $R\sqcup^\sharp R'$ is again in $\mathcal{R}_2^\sharp$, i.e., the collection $r_p\sqcup r'_p$, $p\in[\mathcal{X}]_2$, is a solution of the constraints in (8). For this, we calculate:

\[
\begin{array}{lll}
r_{\{x,y\}}\sqcup r'_{\{x,y\}}\\
&\sqsubseteq&(r_{\{x,z\}}\sqcap r_{\{z,y\}})|_{\{x,y\}}\sqcup(r'_{\{x,z\}}\sqcap r'_{\{z,y\}})|_{\{x,y\}}\\
&\sqsubseteq&((r_{\{x,z\}}\sqcup r'_{\{x,z\}})\sqcap(r_{\{z,y\}}\sqcup r'_{\{z,y\}}))|_{\{x,y\}}
\end{array}
\]

for all variables $x,y,z\in\mathcal{X}$. From that, the statement follows.

To prove the second statement, we must verify that the collection $r_p|_{Y\cap p}$, $p\in[\mathcal{X}]_2$, satisfies all constraints in (8). Indeed, we find by monotonicity,

\[
\begin{array}{lll}
r_{\{x,y\}}|_Y&\sqsubseteq&(r_{\{x,z\}}\sqcap r_{\{z,y\}})|_{\{x,y\}\cap Y}\\
&\sqsubseteq&(r_{\{x,z\}}|_Y\sqcap r_{\{z,y\}}|_Y)|_{\{x,y\}\cap Y}
\end{array}
\]

for all $x,y,z\in\mathcal{X}$, and the claim follows. The final statement then follows from the definition. ∎

Elements of $\mathcal{R}_2^\sharp$ are collections $\langle r_p\rangle_{p\in[\mathcal{X}]_2}$. For every $p\in[\mathcal{X}]_2$, we can consider elements $r_p\in\mathcal{R}^p$ as elements of $\mathcal{R}_2^\sharp$ as well by assuming that $r_p$ represents the stable collection $\langle r_p|_q\rangle_{q\in[\mathcal{X}]_2}$.

According to Proposition 2, both joins and restrictions can be computed componentwise. As a consequence, we find:

Theorem 3.2

For a 2-nice relational domain $\mathcal{R}$ which satisfies (6), the domain $\mathcal{R}_2^\sharp$ is a 2-decomposable relational domain. ∎

Fig. 1 shows the abstract relational domains $\mathcal{R}$, $\mathcal{R}_2$, and $\mathcal{R}_2^\sharp$ together with the mappings between them.

[Figure: diagram of the domains $\mathcal{R}_2$, $\mathcal{R}$, and $\mathcal{R}_2^\sharp$, connected by the mappings $\overline{(.)}$, $\mathrm{id}$, $\langle .|_p\rangle_{p\in[\mathcal{X}]_2}$, and $\bigsqcap\,.$]
Figure 1: The relationship between abstract relational domains.

According to Theorem 3.2, the domain $\mathcal{C}_2^\sharp[U]$ of abstract 2-disjunctive constants is indeed 2-decomposable. The given construction provides us with polynomial algorithms for least upper bound, greatest lower bound, and projection.

3.2 Assignments

Let us return to the relational domain $\mathcal{C}_2[U]$ of 2-disjunctive constants and indicate how abstract transformers for assignments $x\,{:=}\,s$ can be tailored. For 2-disjunctive constants, we only consider right-hand sides $s$ where $s$ is either $?$ (unknown value), or of the form $A|y_1|\ldots|y_k$ where $A$ is a set of constants and $y_1,\ldots,y_k\in\mathcal{X}$ are variables. The concrete semantics of such an assignment is given by

\[
\begin{array}{lll}
\llbracket x\,{:=}\,?\rrbracket\,\Sigma&=&\{\sigma\oplus\{x\mapsto c\}\mid\sigma\in\Sigma,c\in U\}\\
\llbracket x\,{:=}\,A|y_1|\ldots|y_k\rrbracket\,\Sigma&=&\{\sigma\oplus\{x\mapsto a\}\mid\sigma\in\Sigma,a\in A\}\;\cup\\
&&\bigcup_{j=1}^{k}\{\sigma\oplus\{x\mapsto\sigma\,y_j\}\mid\sigma\in\Sigma\}
\end{array}
\]
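The two cases of the concrete semantics can be sketched directly on finite sets of states. This is a minimal illustration under assumptions: a state $\sigma$ is represented as a frozenset of (variable, value) pairs, the universe $U$ is passed as a finite collection, and the function names `assign_unknown` and `assign_choice` are hypothetical.

```python
def update(state, x, value):
    """sigma ⊕ {x ↦ value}: overwrite the binding of x in the state."""
    return frozenset({**dict(state), x: value}.items())


def assign_unknown(states, x, universe):
    """[[x := ?]] Σ: x may take any value of the universe."""
    return {update(s, x, c) for s in states for c in universe}


def assign_choice(states, x, consts, ys):
    """[[x := A | y_1 | ... | y_k]] Σ: x receives a constant from A
    or the current value of one of the variables y_j."""
    out = {update(s, x, a) for s in states for a in consts}
    for y in ys:
        out |= {update(s, x, dict(s)[y]) for s in states}
    return out
```

For a single state with $x=0$ and $y=1$, the assignment `assign_choice(states, "x", {5}, ["y"])` yields the two states in which $x$ is 5 or 1, with $y$ unchanged.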

Generalizing the corresponding abstract semantics for (copy) constant propagation, we define the logic transformer for $\mathcal{C}_2[U]$ by

\[
\begin{array}{lll}
\llbracket x\,{:=}\;?\rrbracket_2\,\Psi&=&\Psi|_{\mathcal{X}\setminus\{x\}}\\
\llbracket x\,{:=}\;A|y_1|\ldots|y_k\rrbracket_2\,\Psi&=&(x\in A)\land\Psi|_{\mathcal{X}\setminus\{x\}}\;\sqcup_2\\
&&\bigsqcup_{2\;j=1}^{\phantom{2\;}k}\;\;\llbracket x\;{:=}\,y_j\rrbracket_2\,\Psi
\end{array}
\]

Proposition 3
  1. The logic transformer $\llbracket x := ?\rrbracket_2$ is precise, i.e.,

    \[\llbracket x := ?\rrbracket\,(\gamma\,\Psi)=\gamma\,(\llbracket x := ?\rrbracket_2\,\Psi) \tag{9}\]

    In particular, it is distributive and commutes with $\bot$.

  2. The logic transformer $\llbracket x := A|y_1|\ldots|y_k\rrbracket_2$ is precise if the logic transformers for $x := y_j$, $j=1,\ldots,k$, are.

Thus, we have reduced the construction of logic transformers for assignments to restriction and the construction of logic transformers for variable-variable assignments $x := y$. For $y\equiv x$, the assignment is the identity, i.e., we set $\llbracket x := x\rrbracket_2\,\Psi=\Psi$. Therefore, assume that $y$ is different from $x$, and assume that $\Psi|_{\mathcal{X}\setminus\{x\}}=\Psi'$. Let $B$ denote the set of constants so that $\Psi'|_{\{y\}}$ equals $y\in B$. Let $\Psi_y$ denote the conjunction of all formulas $\Psi'|_p$ for $p\in[\mathcal{X}]_2$ with $y\in p$. Let $\Psi''=\Psi_y[x/y]$ denote the formula obtained from $\Psi_y$ by renaming each occurrence of the variable $y$ with $x$. Then we define

\[\llbracket x := y\rrbracket_2\,\Psi=\Psi'\wedge\left(\bigvee_{a\in B}x=a\wedge y=a\right)\wedge\Psi''\]

Let $\bar{\Psi}$ denote the formula returned by that transformer for $\Psi$. Intuitively, our definition means that for $x\notin p$, $\bar{\Psi}|_p=\Psi|_p$, i.e., $\Psi|_p$ is preserved, while additionally $\bar{\Psi}|_{\{x\}}=\Psi|_{\{y\}}[x/y]$, $\bar{\Psi}|_{\{x,y\}}=\bigvee_{a\in B}x=a\wedge y=a$, and for $z\notin\{x,y\}$, $\bar{\Psi}|_{\{x,z\}}=\Psi|_{\{y,z\}}[x/y]$.

Proposition 4

The logic transformer $\llbracket x := y\rrbracket_2$ is precise, i.e.,

\[\llbracket x := y\rrbracket\,(\gamma\,\Psi)=\gamma\,(\llbracket x := y\rrbracket_2\,\Psi) \tag{10}\]

holds. ∎

The same construction allows us to construct abstract logic transformers $\llbracket x := s\rrbracket_2^\sharp:\mathcal{C}_2^\sharp[U]\to\mathcal{C}_2^\sharp[U]$, only that the least upper bound operation and projection of $\mathcal{C}_2[U]$ must be replaced by the corresponding operations of $\mathcal{C}_2^\sharp[U]$. The abstract transformer then, however, is only sound and no longer precise: for an abstract relation $R$ whose concretization is empty, the projection operation of $\mathcal{C}_2^\sharp[U]$ may return an abstract relation with a non-empty concretization. Accordingly, Eq. 9 and Eq. 10 may be violated.

3.3 Guards

It remains to provide the semantics of guards. Again, we first consider the domain $\mathcal{C}_2[U]$ of 2-disjunctive formulas (modulo logical equivalence), ordered by implication. We consider positive guards of the form $x\in A$, and conversely, negative guards of the form $x\notin A$. Positive guards can be expressed directly in $\mathcal{C}_2[U]$. Thus we set

\[\llbracket ?(x\in A)\rrbracket\,\Psi=\Psi\wedge(x\in A) \tag{11}\]

Negative guards, on the other hand, cannot be directly expressed in $\mathcal{C}_2[U]$, at least if there are unknown constant values beyond the finite universe $U$. To deal with this, we introduce a dedicated fresh symbol $\bullet\notin U$ with the understanding that $\bullet$ represents any value $a\notin U$. The property $x\notin A$ then can equivalently be represented by

\[x\in(U\cup\{\bullet\})\setminus A\]

allowing us to deal with such co-finite sets of possible values in the same way as we did for finite sets of values alone.
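The encoding of a negative guard as a finite set can be sketched as follows. The sentinel `BULLET` and the function name `negative_guard_values` are hypothetical names chosen for this illustration; the sketch only computes the set $(U\cup\{\bullet\})\setminus A$ and does not model the conjunction with $\Psi$.

```python
# Hypothetical sentinel standing for "any value outside the finite universe U".
BULLET = object()

def negative_guard_values(universe, excluded):
    """Values admitted for x by the guard x not-in A, i.e. (U + {BULLET}) - A."""
    return (set(universe) | {BULLET}) - set(excluded)
```

With $U=\{1,2,3\}$ and $A=\{2\}$, the admitted values are 1, 3 and the sentinel, so the negative guard can then be treated like a positive guard over the extended universe.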

4 Directed Relational Domains

Instead of equalities, let us now consider inequalities between variables and constants, and abandon disjunctions. We will, however, add disjunctions in the end as well. Thus for now, we just consider finite conjunctions of inequalities of the form

\[d\sqsubseteq x,\quad x\sqsubseteq y,\quad\text{or}\quad x\sqsubseteq d\]

for variables $x,y\in\mathcal{X}$ and constant values $d$. As usual, we consider conjunctions only up to semantic equivalence. We call inequalities of the form $d\sqsubseteq x$ lower bound constraints, and $d$ a lower bound for $x$. Upper bound constraints $x\sqsubseteq d$ and upper bounds are defined analogously. Inequalities of the form $x\sqsubseteq y$ are called variable constraints.

Assume we are given a partial order (po), i.e., a set $P$ partially ordered by some relation $\leq$. Examples of partial orders of interest are

Subsets.

The set $2^U$ of all subsets of some finite universe $U$ where the ordering is subset inclusion $\subseteq$;

Integers.

The set $\mathbb{Z}$ of integers equipped with the natural ordering $\leq_{\mathbb{Z}}$;

Multisets.

Multisets, i.e., the set of all mappings $\mu:U\to\mathbb{N}$ from elements in $U$ to their multiplicities ordered by multiset inclusion $\subseteq_{\mathcal{N}}$.

Strings.

The set of all strings $\Sigma^*$ for some finite alphabet $\Sigma$. Several partial orderings are of interest:

  • the prefix ordering $\leq_p$, e.g., $\textsf{ab}\leq_p\textsf{abcd}$;

  • the substring ordering $\leq_s$, e.g., $\textsf{bc}\leq_s\textsf{abcde}$;

  • the scattered substring ordering $\leq_{ss}$, e.g., $\textsf{bd}\leq_{ss}\textsf{abcde}$.
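The three string orderings can be made concrete with a small Python sketch. The function names are chosen for this illustration and are not taken from the paper.

```python
def is_prefix(u, v):
    """u <=_p v : u is a prefix of v."""
    return v.startswith(u)

def is_substring(u, v):
    """u <=_s v : u occurs contiguously in v."""
    return u in v

def is_scattered_substring(u, v):
    """u <=_ss v : the letters of u occur in v in order, not necessarily adjacent."""
    remaining = iter(v)
    # Each membership test consumes `remaining` up to and including the match,
    # so later letters of u must be found strictly further right in v.
    return all(ch in remaining for ch in u)
```

On the examples from the list above, `is_prefix("ab", "abcd")`, `is_substring("bc", "abcde")` and `is_scattered_substring("bd", "abcde")` all hold, whereas `bd` is not a substring of `abcde`.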

Much more expressive constraints on strings have been studied, e.g., in Chen et al. (2018); Day et al. (2023); Abdulla et al. (2019); Ganesh et al. (2011). In particular, for a fragment containing the prefix ordering, decision procedures are known based on (synchronous) multi-tape finite automata Yu et al. (2011). Due to their expressiveness, these techniques come with a considerable computational effort. Instead, we follow Arceri et al. (2022) where basic relational domains are considered for reasoning about variables of string type, sets (of characters), or integers (lengths of strings). Their analyses relate program variables only according to some partial order, and also consider lower bounds. Here, these considerations are complemented by taking upper bounds into account as well and, eventually, by adding disjunctions.

A mapping $\sigma:\mathcal{X}\to P$ is a model of $\Psi$ (relative to $P$), written as $\sigma\models\Psi$, if $\Psi\neq\bot$, and

  • $d\leq\sigma\,x$ (in $P$) for each constraint $d\sqsubseteq x$ in $\Psi$;

  • $\sigma\,x\leq d$ (in $P$) for each constraint $x\sqsubseteq d$ in $\Psi$; and

  • $\sigma\,x\leq\sigma\,y$ (in $P$) for each constraint $x\sqsubseteq y$ in $\Psi$.

Let $\mathcal{D}[P]$ denote all finite conjunctions over $P$ modulo semantic equivalence where the ordering on $\mathcal{D}[P]$ is semantic implication. As before, normal forms of conjunctions will be considered up to reordering of atomic propositions. Thus, syntactic equality of conjunctions here means equality of the respective sets of propositions. Let $\Psi$ denote a finite conjunction where $V\subseteq P$ is the set of values occurring in $\Psi$ as lower or upper bounds. To provide a first normal form for $\Psi$, we proceed in two steps. First, we determine the transitive closure $(\leq\cup\sqsubseteq)^+$ on the set $\mathcal{X}\cup V$ of the constraints provided by $\Psi$. If $(a,b)\in(\leq\cup\sqsubseteq)^+$ for some $a,b\in V$ where $a\leq b$ does not hold in $P$, then $\Psi$ is unsatisfiable and therefore represented by the dedicated element $\Psi'=\bot$. If this is not the case, let $\Psi'$ denote the conjunction of all inequalities $s_1\sqsubseteq s_2$ where $(s_1,s_2)\in(\leq\cup\sqsubseteq)^+$ and either $s_1$ or $s_2$ or both are in $\mathcal{X}$.

In the second step, when $\Psi'\neq\bot$, we remove all redundant constraints. These are constraints of the form

  • $x\sqsubseteq x$ for $x\in\mathcal{X}$, as these constraints hold vacuously;

  • $a\sqsubseteq x$ for $a\in V$ and $x\in\mathcal{X}$ if there is also a constraint $b\sqsubseteq x$ with $a\leq b$, i.e., there is a stricter lower bound;

  • $x\sqsubseteq b$ for $b\in V$ and $x\in\mathcal{X}$ if there is also a constraint $x\sqsubseteq a$ with $a\leq b$, i.e., there is a stricter upper bound.

Additionally, we set $\Psi'$ to $\bot$ whenever for some variable $x$,

  • there is no lower bound in $P$ for the set of upper bounds provided for $x$ by $\Psi$; or

  • there is no upper bound in $P$ for the set of lower bounds provided for $x$ by $\Psi$.

Assume, e.g., that $\Psi$ is given by

\[(\textsf{abc}\sqsubseteq x)\wedge(\textsf{abd}\sqsubseteq x)\]

where we consider the prefix order $\leq_p$ on strings. Since $\textsf{abc},\textsf{abd}$ cannot be prefixes of the same string, this conjunction is considered equivalent to $\bot$.

Let us denote the resulting conjunction $\Psi'$ by $\textsf{nf}_0[\Psi]$ and call it the 0-normal form of $\Psi$. Assuming that comparisons of values as well as checks for common lower or upper bounds are constant-time operations, 0-normal forms can be computed in polynomial time.
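The two steps of the 0-normal form can be sketched in Python for the prefix order on strings. Everything about the representation is an assumption made for illustration: variables are values of a hypothetical type `Var`, constants are plain strings, a conjunction is a set of pairs `(s, t)` read as $s\sqsubseteq t$, and `None` stands for $\bot$. The bound checks are performed on the closure rather than on $\Psi$ itself, and the naive closure computation is polynomial but not tuned for efficiency.

```python
from collections import namedtuple

Var = namedtuple("Var", "name")  # hypothetical representation of variables

def is_prefix(a, b):
    return b.startswith(a)

def chain(values):
    # In the prefix order, a finite set has an upper bound iff it is a chain.
    return all(is_prefix(a, b) or is_prefix(b, a) for a in values for b in values)

def transitive_closure(pairs):
    closure = set(pairs)
    changed = True
    while changed:
        changed = False
        for (a, b) in list(closure):
            for (c, d) in list(closure):
                if b == c and (a, d) not in closure:
                    closure.add((a, d))
                    changed = True
    return closure

def nf0(constraints, leq=is_prefix, has_upper=chain, has_lower=lambda vs: True):
    """Sketch of the 0-normal form; returns None for bottom."""
    consts = {s for pair in constraints for s in pair if not isinstance(s, Var)}
    variables = {s for pair in constraints for s in pair if isinstance(s, Var)}
    # Step 1: close the constraints together with the order on occurring constants.
    order = {(a, b) for a in consts for b in consts if leq(a, b)}
    closure = transitive_closure(set(constraints) | order)
    if any(not leq(a, b) for (a, b) in closure if a in consts and b in consts):
        return None
    # Bottom if the bounds of some variable are incompatible in P.
    for x in variables:
        lowers = {a for (a, t) in closure if t == x and a in consts}
        uppers = {b for (s, b) in closure if s == x and b in consts}
        if not has_upper(lowers) or not has_lower(uppers):
            return None
    # Step 2: keep constraints mentioning a variable, drop the redundant ones.
    kept = {(s, t) for (s, t) in closure
            if (s in variables or t in variables) and s != t}
    def redundant(s, t):
        if s in consts:  # lower bound dominated by a stricter lower bound
            return any(a != s and a in consts and u == t and leq(s, a)
                       for (a, u) in kept)
        if t in consts:  # upper bound dominated by a stricter upper bound
            return any(b != t and b in consts and u == s and leq(b, t)
                       for (u, b) in kept)
        return False
    return {(s, t) for (s, t) in kept if not redundant(s, t)}
```

The default `has_lower` always answers yes because, in the prefix order, the empty string is a lower bound of every set. For another partial order, the three parameters `leq`, `has_upper` and `has_lower` would have to be supplied accordingly.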

4.1 Lattice Domains

An important special case is when $P$ is a lattice, i.e., a po where every two elements $a,b$ both have a least upper bound $a\vee b$ and a greatest lower bound $a\wedge b$.

Example 4

The po $2^U$ ordered by subset inclusion is a complete lattice and thus, in particular, a lattice. The integers $\mathbb{Z}$ with the natural ordering are another example of a lattice, this time without least or greatest element. Yet another example is given by multisets: this lattice has a least, but no greatest element.

The po $\Sigma^*$ of strings ordered by the prefix relation is not a lattice. $\Sigma^*$ provides a least element $\epsilon$, as well as greatest lower bounds, namely, the maximal common prefix, but does not have least upper bounds for all pairs of strings. There is, for example, no upper bound to $\textsf{abc}$ and $\textsf{abd}$ in $\Sigma^*$. ∎
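The asymmetry of the prefix order described in Example 4 can be seen in a short Python sketch. The names `prefix_meet` and `prefix_join` are hypothetical; `prefix_join` returns `None` when no upper bound exists.

```python
def prefix_meet(u, v):
    """Greatest lower bound in the prefix order: the maximal common prefix."""
    n = 0
    while n < min(len(u), len(v)) and u[n] == v[n]:
        n += 1
    return u[:n]

def prefix_join(u, v):
    """Least upper bound in the prefix order, or None if u and v are incomparable."""
    if v.startswith(u):
        return v
    if u.startswith(v):
        return u
    return None
```

The meet is total, while the join is only defined for comparable strings: `abc` and `abd` have the meet `ab` but no join.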

When $P$ is a lattice, we can provide a dedicated normal form which, however, may now use constants from $P$ which did not occur in $\Psi$ before. Assume now that $\Psi'$ is the 0-normal form of $\Psi$. If $P$ has a least element $\bot_P$, we add the vacuous constraint $\bot_P\sqsubseteq x$ to every variable $x$. Likewise, if $P$ has a greatest element $\top_P$, we add the constraint $x\sqsubseteq\top_P$.

If $\Psi'$ is different from $\bot$, we subsequently simplify $\Psi'$ further by replacing for each variable $x\in\mathcal{X}$,

  • the set of upper bound constraints occurring in $\Psi'$, if it is non-empty and consists of $(x\sqsubseteq b_1)\land\ldots\land(x\sqsubseteq b_r)$, with the single constraint $(x\sqsubseteq(\bigwedge_{i=1}^{r}b_i))$;

  • the set of lower bound constraints in $\Psi'$, if it is non-empty and consists of $(a_1\sqsubseteq x)\land\ldots\land(a_r\sqsubseteq x)$, with the single constraint $((\bigvee_{i=1}^{r}a_i)\sqsubseteq x)$.

Let us denote the resulting formula by $\textsf{nf}_1[\Psi]$ and call it the 1-normal form of $\Psi$. The 1-normal form of $\Psi$ can be computed in polynomial time as well, given that comparisons as well as pairwise least upper bounds and greatest lower bounds in $P$ are constant time. We have:

Theorem 4.1

Assume that the po $P$ is a lattice. Then the following holds:

  1. A conjunction $\Psi$ is satisfiable over $P$ iff $\textsf{nf}_1[\Psi]\neq\bot$.

  2. For arbitrary conjunctions $\Psi_1,\Psi_2$ over $P$, $\Psi_1\implies\Psi_2$ iff $\textsf{nf}_1[\Psi_1]=\textsf{nf}_1[\Psi_1\land\Psi_2]$.

Satisfiability as well as implication are decidable in polynomial time. ∎
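For the lattice $2^U$, the collapsing of bounds and the satisfiability test of statement (1) can be sketched in Python. The representation is an assumption made for illustration: variables are values of a hypothetical type `Var`, constants are frozensets, a conjunction is a set of pairs `(s, t)` read as $s\sqsubseteq t$, and `None` stands for $\bot$. The sketch closes the constraints only over the constants occurring in the input, so it illustrates the described steps rather than serving as a verified implementation of the normal form.

```python
from collections import namedtuple
from functools import reduce

Var = namedtuple("Var", "name")  # hypothetical representation of variables

def transitive_closure(pairs):
    closure = set(pairs)
    changed = True
    while changed:
        changed = False
        for (a, b) in list(closure):
            for (c, d) in list(closure):
                if b == c and (a, d) not in closure:
                    closure.add((a, d))
                    changed = True
    return closure

def nf1(constraints, variables, universe):
    """Sketch of the 1-normal form over subsets of `universe`; None is bottom."""
    bottom, top = frozenset(), frozenset(universe)
    # Add the vacuous bounds (bottom below x) and (x below top) for every variable.
    cs = (set(constraints)
          | {(bottom, x) for x in variables}
          | {(x, top) for x in variables})
    consts = {s for pair in cs for s in pair if not isinstance(s, Var)}
    order = {(a, b) for a in consts for b in consts if a <= b}
    closure = transitive_closure(cs | order)
    # Unsatisfiable if the closure relates constants that are not ordered in P.
    if any(not a <= b for (a, b) in closure if a in consts and b in consts):
        return None
    # Keep the variable constraints ...
    result = {(s, t) for (s, t) in closure
              if isinstance(s, Var) and isinstance(t, Var) and s != t}
    # ... and collapse all bounds of x into one join and one meet.
    for x in variables:
        lowers = [a for (a, t) in closure if t == x and a in consts]
        uppers = [b for (s, b) in closure if s == x and b in consts]
        result.add((reduce(frozenset.union, lowers), x))
        result.add((x, reduce(frozenset.intersection, uppers)))
    return result

def satisfiable(constraints, variables, universe):
    return nf1(constraints, variables, universe) is not None
```

Here the join is set union and the meet is set intersection; for another lattice, both operations and the comparison `<=` would have to be replaced.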

Proof

If $\Psi'=\textsf{nf}_1[\Psi]=\bot$, then $\Psi$ cannot be satisfiable since each of the simplification steps preserves the set of satisfying assignments. So, assume that $\Psi'$ is syntactically different from $\bot$. Let $\sigma$ be the variable assignment which maps each variable $x$ to its lower bound $a_x\in P$, if it exists, and otherwise to some fixed element $\underline{a}$ which is less than or equal to every lower bound mentioned in $\Psi'$. Then all single variable constraints are satisfied as well as, by transitivity, all constraints $x\sqsubseteq y$ occurring in $\Psi'$. Therefore, $\sigma\models\Psi$, implying that $\Psi$ is satisfiable. From this, statement (1) follows.

To prove statement (2), consider conjunctions $\Psi'_1,\Psi'_2$ both in 1-normal form. If these syntactically coincide, then obviously also $\Psi'_1\iff\Psi'_2$ holds. For the reverse direction, we prove that if the $\Psi'_i$ are distinct, then they cannot be equivalent. From that, the assertion follows. If one of them equals $\bot$ and the other does not, then by statement (1), they cannot be equivalent. Therefore, assume that both are satisfiable and thus different from $\bot$. We consider all cases in which the $\Psi'_i$ may differ.

Lower bounds.

First, assume that there are constraints $a_i\sqsubseteq x$, $i=1,2$, for some variable $x$ in $\Psi'_i$ where $a_1$ is different from $a_2$. Assume w.l.o.g. that $a_1\not\leq a_2$ holds. Let $L_x$ denote the set consisting of $x$ together with variables $z\in\mathcal{X}$ where $\Psi'_2$ has a constraint $z\sqsubseteq x$. Let $\sigma$ denote some assignment with $\sigma\models\Psi'_2$. Then we construct a variable assignment $\sigma'$ such that $\sigma'\models\Psi'_2$ but $\sigma'\not\models\Psi'_1$ by

\[\sigma'\,z=\begin{cases}\sigma\,z\wedge a_2&\text{if }z\in L_x\\ \sigma\,z&\text{otherwise}\end{cases}\]

Then still $\sigma'\models\Psi'_2$. But since $\sigma'\,x\leq a_2$ and $a_1\not\leq a_2$, it follows that $\sigma'$ does not satisfy $a_1\sqsubseteq x$ and thus it does not model $\Psi'_1$.

If there is a constraint $a_1\sqsubseteq x$ in $\Psi'_1$, but no lower bound constraint for $x$ in $\Psi'_2$, then there is some value $\underline{\bot}\in P$ different from $a_1$ so that $\underline{\bot}\leq a_1\wedge\sigma\,x$ holds. This value allows us to construct an analogous distinguishing assignment $\sigma'$ where we use $\underline{\bot}$ instead of $a_2$.

Upper bounds.

First, assume that there are constraints $x\sqsubseteq b_i$, $i=1,2$, for some variable $x$ in $\Psi'_i$ where $b_1$ is different from $b_2$. W.l.o.g., assume that $b_2\not\leq b_1$. Let $U_x\subseteq\mathcal{X}$ denote the subset consisting of $x$ together with all variables $z$ where $\Psi'_2$ has a constraint $x\sqsubseteq z$. Let $\sigma$ denote some assignment with $\sigma\models\Psi'_2$. Then we construct a variable assignment $\sigma'$ by:

\[\sigma'\,z=\begin{cases}\sigma\,z\vee b_2&\text{if }z\in U_x\\ \sigma\,z&\text{otherwise}\end{cases}\]

Then still $\sigma'\models\Psi'_2$ holds. But since $b_2\leq\sigma'\,x$ and $b_2\not\leq b_1$, $\sigma'$ does not satisfy $x\sqsubseteq b_1$ and thus does not satisfy $\Psi'_1$.

If there is a constraint xb1square-image-of-or-equals𝑥subscript𝑏1x\sqsubseteq b_{1}italic_x ⊑ italic_b start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT in Ψ1subscriptsuperscriptΨ1\Psi^{\prime}_{1}roman_Ψ start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT, but no upper bound constraint for x𝑥xitalic_x in Ψ2subscriptsuperscriptΨ2\Psi^{\prime}_{2}roman_Ψ start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT start_POSTSUBSCRIPT 2 end_POSTSUBSCRIPT, we introduce a value ¯P¯top𝑃\overline{\top}\in Pover¯ start_ARG ⊤ end_ARG ∈ italic_P which is different from b1subscript𝑏1b_{1}italic_b start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT with (b1σx)¯subscript𝑏1𝜎𝑥¯top(b_{1}\vee\sigma\,x)\leq\overline{\top}( italic_b start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT ∨ italic_σ italic_x ) ≤ over¯ start_ARG ⊤ end_ARG, and construct an analogous distinguishing assignment σsuperscript𝜎\sigma^{\prime}italic_σ start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT only that we use ¯¯top\overline{\top}over¯ start_ARG ⊤ end_ARG instead of b2subscript𝑏2b_{2}italic_b start_POSTSUBSCRIPT 2 end_POSTSUBSCRIPT.

Variable Constraints.

Assume that, w.l.o.g., $\Psi'_1$ has a constraint $(x\sqsubseteq y)$ for $x,y\in\mathcal{X}$ which does not occur in $\Psi'_2$, where we assume that for every variable $z\in\mathcal{X}$ both lower and upper bounds are provided by $\Psi'_1$ iff they are provided by $\Psi'_2$ and that, whenever they are provided, they agree. Consider again the set $U_x$ of $x$ together with all variables $z$ with constraints $x\sqsubseteq z$, and the set $L_y$ of $y$ together with all variables $z$ with constraints $z\sqsubseteq y$ occurring in $\Psi'_2$. Since $x\sqsubseteq y$ does not occur in $\Psi'_2$, $U_x\cap L_y=\emptyset$.

Let $\sigma$ denote an assignment with $\sigma\models\Psi'_2$. First assume that $\Psi'_2$ has constraints $x\sqsubseteq b$ and $a\sqsubseteq y$. From $x\sqsubseteq y$ not occurring in $\Psi'_2$, it follows that $b\not\leq a$. Now we construct an assignment $\sigma'$ by:

\[
\sigma'\,z=\begin{cases}b\vee\sigma\,z&\text{if }z\in U_x\cup\{x\}\\ a\wedge\sigma\,z&\text{if }z\in L_y\cup\{y\}\\ \sigma\,z&\text{otherwise}\end{cases}
\]

Then $\sigma'\models\Psi'_2$, while $\sigma'\,x=b$ and $\sigma'\,y=a$. As $b\not\leq a$, $\sigma'$ does not fulfill the constraint $x\sqsubseteq y$ from $\Psi'_1$.
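The construction of the distinguishing assignment can be replayed on a small instance. The following is a minimal sketch over the lattice $(\mathbb{Z},\leq)$, where join is `max` and meet is `min`. The concrete constraints, the variable names `w` and `v`, and the helper names `satisfies` and `distinguish` are hypothetical and chosen only for illustration; the sketch does not compute normal forms.

```python
# Hypothetical constraints standing in for Psi'_2 over (Z, <=).
b, a = 5, 3                          # x <= b and a <= y, with b not <= a
upper = {"x": b}                     # constant upper bounds  z <= d
lower = {"y": a}                     # constant lower bounds  d <= z
order = {("x", "w"), ("v", "y")}     # variable constraints   z1 <= z2


def satisfies(sigma):
    """Check that the assignment sigma fulfills all constraints above."""
    return (all(sigma[z] <= d for z, d in upper.items())
            and all(d <= sigma[z] for z, d in lower.items())
            and all(sigma[z1] <= sigma[z2] for z1, z2 in order))


# U_x: x together with all z such that x <= z is a constraint.
U_x = {"x"} | {z2 for z1, z2 in order if z1 == "x"}
# L_y: y together with all z such that z <= y is a constraint.
L_y = {"y"} | {z1 for z1, z2 in order if z2 == "y"}


def distinguish(sigma):
    """Raise the variables in U_x to at least b, lower those in L_y to at most a."""
    out = {}
    for z, val in sigma.items():
        if z in U_x:
            out[z] = max(b, val)     # join with b
        elif z in L_y:
            out[z] = min(a, val)     # meet with a
        else:
            out[z] = val
    return out


sigma = {"x": 0, "y": 10, "w": 1, "v": 2}
sigma_prime = distinguish(sigma)
```

In this instance `sigma_prime` still satisfies the listed constraints, but it maps `x` to `b` and `y` to `a`, so it violates `x <= y`.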

If no upper bound of $x$ is provided, we choose some value $b$ strictly larger than $\sigma\,x\vee\sigma\,y$, and define a variable assignment $\sigma'$ by $\sigma'\,z=b\vee\sigma\,z$ for $z\in U_x$, and $\sigma'\,z=\sigma\,z$ otherwise. Then $\sigma'\models\Psi'_2$. In order to additionally satisfy $x\sqsubseteq y$, we would have $\sigma'\,x=b\vee\sigma\,x=b\leq\sigma'\,y$, which is impossible.

Likewise, if no lower bound of $y$ is provided, we choose some value $a$ strictly less than $\sigma\,x\wedge\sigma\,y$, and define a variable assignment $\sigma'$ by $\sigma'\,z=a\wedge\sigma\,z$ for $z\in L_y$, and $\sigma'\,z=\sigma\,z$ otherwise. Then $\sigma'\models\Psi'_2$. In order to additionally satisfy $x\sqsubseteq y$, we would have $\sigma'\,x=\sigma\,x\leq\sigma'\,y=a$, which again is impossible.

For lattices, therefore, the construction of normal forms allows deciding satisfiability as well as semantic implication. From our examples, sets, integers, and multisets are lattices. Strings ordered by the prefix relation, on the other hand, no longer form a lattice. This po, however, is bounded-complete. Recall that a po $P$ is bounded-complete if every subset $A\subseteq P$ which has some upper bound also has a least upper bound. When $P$ is bounded-complete, then we at least know that

  • every non-empty subset $B\subseteq P$ has a greatest lower bound; and

  • $P$ has a least element $\bot_P$.
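For the prefix ordering on strings, these properties can be made concrete. The sketch below is only an illustration for finite sets of words; the function names `is_prefix`, `glb` and `lub` are assumptions, not notation from the paper. The greatest lower bound is the longest common prefix, the least element is the empty string, and a least upper bound exists exactly when all words lie on one prefix chain.

```python
from typing import Iterable, Optional


def is_prefix(u: str, v: str) -> bool:
    """u <= v in the prefix ordering."""
    return v.startswith(u)


def glb(words: Iterable[str]) -> str:
    """Greatest lower bound of a non-empty finite set: the longest common prefix."""
    words = list(words)
    if not words:
        raise ValueError("the empty set has no greatest lower bound here")
    prefix = words[0]
    for w in words[1:]:
        n = 0
        while n < min(len(prefix), len(w)) and prefix[n] == w[n]:
            n += 1
        prefix = prefix[:n]
    return prefix


def lub(words: Iterable[str]) -> Optional[str]:
    """Least upper bound of a finite set, or None if there is no upper bound."""
    words = list(words)
    longest = max(words, key=len, default="")  # "" is the least element
    # An upper bound exists iff every word is a prefix of the longest one.
    return longest if all(is_prefix(w, longest) for w in words) else None
```

For instance, `glb(["abc", "abd"])` is `"ab"`, whereas `lub(["abc", "abd"])` is `None`, since no string has both `abc` and `abd` as a prefix.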

Thus, every formula $\Psi$ over a bounded-complete po $P$ which provides some upper bound to every variable $x\in\mathcal{X}$ also can be brought into 1-normal form. Let us call such conjunctions bounded. We obtain:

Proposition 5

Given a po $P$ that is bounded-complete, the following holds:

  1. A bounded conjunction $\Psi$ is satisfiable over $P$ iff $\textsf{nf}_1[\Psi]\neq\bot$.

  2. For arbitrary bounded conjunctions $\Psi_1,\Psi_2$ over $P$, $\Psi_1\implies\Psi_2$ iff $\textsf{nf}_1[\Psi_1]=\textsf{nf}_1[\Psi_1\land\Psi_2]$. ∎

When we drop the extra assumption that conjunctions are bounded, Proposition 5 need no longer hold.

Example 5

For prefixes of strings, consider the conjunction

\[
(\textsf{ab}\sqsubseteq x)\wedge(x\sqsubseteq\textsf{abc})\wedge(\textsf{abd}\sqsubseteq y)\wedge(x\sqsubseteq y)
\]

This formula is semantically equivalent to

\[
(\textsf{ab}\sqsubseteq x)\wedge(x\sqsubseteq\textsf{ab})\wedge(\textsf{abd}\sqsubseteq y)\wedge(x\sqsubseteq y)
\]

although the formulas are syntactically different.

Even without upper bounds, not all implications can be inferred via transitive closure alone. Again for prefixes of strings, consider

\[
(\textsf{abc}\sqsubseteq y_1)\wedge(\textsf{abd}\sqsubseteq y_2)\wedge(x\sqsubseteq y_1)\wedge(x\sqsubseteq y_2)\wedge(\textsf{ab}\sqsubseteq z)
\]

The first four constraints imply that $x\sqsubseteq\textsf{ab}$, which, by the last constraint, implies that $x\sqsubseteq z$ must hold as well. ∎
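The first equivalence of Example 5 can be sanity-checked by enumeration. The sketch below is a bounded check, not a proof: it assumes the alphabet `abcd` and strings of length at most 4, and the names `psi_a`, `psi_b` and `leq` are chosen here for illustration only.

```python
from itertools import product

ALPHABET = "abcd"   # assumption: a small alphabet suffices for the illustration
MAX_LEN = 4         # assumption: only strings up to this length are enumerated


def words(max_len):
    """All strings over ALPHABET of length 0..max_len."""
    for n in range(max_len + 1):
        for letters in product(ALPHABET, repeat=n):
            yield "".join(letters)


def leq(u, v):
    """Prefix ordering: u is a prefix of v."""
    return v.startswith(u)


def psi_a(x, y):
    return leq("ab", x) and leq(x, "abc") and leq("abd", y) and leq(x, y)


def psi_b(x, y):
    return leq("ab", x) and leq(x, "ab") and leq("abd", y) and leq(x, y)


UNIVERSE = list(words(MAX_LEN))
models_a = {(x, y) for x in UNIVERSE for y in UNIVERSE if psi_a(x, y)}
models_b = {(x, y) for x in UNIVERSE for y in UNIVERSE if psi_b(x, y)}
```

Within this bounded universe both formulas have the same models, and every model assigns `ab` to `x`: the candidate `abc` for `x` is ruled out because it is not a prefix of any string starting with `abd`.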

For a conjunction $\Psi$ and a subset $Y\subseteq\mathcal{X}$ of variables, let $\Psi|^{\sharp}_{Y}$ yield $\bot$ if $\Psi$ equals $\bot$, and otherwise, yield the conjunction of all constraints in $\Psi$ that only use variables from $Y$.

For conjunctions $\Psi_1,\Psi_2$ in 1-normal form and different from $\bot$, we define the abstract join $\Psi_1\sqcup^{\sharp}\Psi_2$ as the conjunction of the following constraints:

  • all constraints $x\sqsubseteq y$, $x,y\in\mathcal{X}$, which occur both in $\Psi_1$ and $\Psi_2$;

  • all constraints $(d_1\wedge d_2)\sqsubseteq x$, $d_1,d_2\in P$, $x\in\mathcal{X}$ where $d_i\sqsubseteq x$ occurs in $\Psi_i$;

  • all constraints $x\sqsubseteq(d_1\vee d_2)$, $d_1,d_2\in P$, $x\in\mathcal{X}$ where $x\sqsubseteq d_i$ occurs in $\Psi_i$.

Then we have:

Theorem 4.2

Assume that $P$ is a lattice.

  1. If $\Psi$ is a conjunction in 1-normal form, then for every subset $Y\subseteq\mathcal{X}$, $\Psi|_{Y}$ is given by $\Psi|^{\sharp}_{Y}$ where the latter conjunction is again in 1-normal form.

  2. For $\Psi_1,\Psi_2$ in 1-normal form, $\Psi_1\sqcup^{\sharp}\Psi_2$ is the least upper bound of $\Psi_1,\Psi_2$ in $\mathcal{D}[P]$.

  3. The domain $\mathcal{D}[P]$ is a 2-decomposable relational domain. ∎
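The abstract restriction and the abstract join for lattices can be sketched as follows. This is a minimal illustration under several assumptions: the class `Conj` and the functions `restrict` and `join_lattice` are hypothetical names, $\bot$ is modeled as `None`, each variable carries at most one constant lower and one constant upper bound, and the inputs are assumed to be in 1-normal form already (the sketch does not compute normal forms). The example uses the integer lattice with `min` as meet and `max` as join.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, Dict, Optional, Set, Tuple


@dataclass
class Conj:
    """A conjunction different from bottom."""
    order: Set[Tuple[str, str]] = field(default_factory=set)  # pairs (x, y) for x <= y
    lower: Dict[str, Any] = field(default_factory=dict)       # x -> d for d <= x
    upper: Dict[str, Any] = field(default_factory=dict)       # x -> d for x <= d


def restrict(psi: Optional[Conj], keep: Set[str]) -> Optional[Conj]:
    """Keep only the constraints that mention variables from `keep`."""
    if psi is None:          # bottom stays bottom
        return None
    return Conj(
        order={(x, y) for (x, y) in psi.order if x in keep and y in keep},
        lower={x: d for x, d in psi.lower.items() if x in keep},
        upper={x: d for x, d in psi.upper.items() if x in keep},
    )


def join_lattice(psi1: Conj, psi2: Conj,
                 meet: Callable[[Any, Any], Any],
                 join: Callable[[Any, Any], Any]) -> Conj:
    """Abstract join of two conjunctions that are different from bottom."""
    return Conj(
        # variable constraints occurring in both arguments
        order=psi1.order & psi2.order,
        # lower bounds present on both sides are combined by meet
        lower={x: meet(psi1.lower[x], psi2.lower[x])
               for x in psi1.lower.keys() & psi2.lower.keys()},
        # upper bounds present on both sides are combined by join
        upper={x: join(psi1.upper[x], psi2.upper[x])
               for x in psi1.upper.keys() & psi2.upper.keys()},
    )


psi1 = Conj(order={("x", "y")}, lower={"x": 0, "y": 0}, upper={"x": 5})
psi2 = Conj(order={("x", "y")}, lower={"x": 2, "y": 2}, upper={"x": 9})
joined = join_lattice(psi1, psi2, meet=min, join=max)
```

Here `joined` keeps `x <= y`, weakens the lower bounds of `x` and `y` to `0`, and weakens the upper bound of `x` to `9`. Restricting `joined` to `{"x"}` drops the constraint `x <= y` and the bound on `y`.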

While statement (1) of Theorem 4.2 remains true also for bounded conjunctions over a bounded-complete po, the least upper bound of two bounded conjunctions need no longer be bounded, as the least upper bounds of the respective upper bounds need not exist. For the prefix ordering on $\Sigma^{*}$, e.g., we have

\[
(x\sqsubseteq\textsf{abc})\sqcup(x\sqsubseteq\textsf{abd})=\top
\]

i.e., all information about upper bounds is lost.

4.2 The General Case

For general (even finite) partial orders, the dedicated constructions for lattices cannot be directly applied. Already the problem of determining whether or not a conjunction is satisfiable turns out to be surprisingly difficult. Assume that elements in $P$ can be represented and compared in polynomial time. Then we find:

Theorem 4.3

The problem of determining, for a given partial order $P$ and a conjunction $\Psi$, whether $\Psi$ is satisfiable over $P$, is NP-complete.

Proof

Since a satisfying assignment for a conjunction $\Psi$ can be guessed and verified in polynomial time, it remains to prove the hardness part. For that, consider the problem of 3-colorability of an undirected finite graph $G=(V,E)$. Let $v_1,\ldots,v_n$ be an enumeration of the vertices in $V$. Then, we construct a partial order $P$ consisting of the elements

\[
\{\langle v_i,c\rangle\mid i=1,\ldots,n,\ c=1,2,3\}\;\dot{\cup}\;\{\underline{v}_i\mid i=1,\ldots,n\}\;\dot{\cup}\;\{\overline{v}_i\mid i=1,\ldots,n\}
\]

where the partial ordering $\leq$ of $P$ is the least partial order satisfying

\[
\begin{array}{lll@{\quad}l}
\langle v_i,c\rangle & \leq & \langle v_j,c'\rangle & \text{whenever}\;\{v_i,v_j\}\in E\land i<j\land c\neq c'\\
\langle v_i,c\rangle & \leq & \overline{v}_i & \text{whenever}\;\exists\,j>i.\,\{i,j\}\in E\\
\underline{v}_j & \leq & \langle v_j,c\rangle & \text{whenever}\;\exists\,i<j.\,\{i,j\}\in E
\end{array}
\]

For $P$, we define a conjunction $\Psi$ in the variables $x_i$, $i=1,\ldots,n$, by

\[
\bigwedge_{\{v_i,v_j\}\in E,\,i<j}(x_i\sqsubseteq\overline{v}_i)\wedge(x_i\sqsubseteq x_j)\wedge(\underline{v}_j\sqsubseteq x_j)
\]

Both $P$ and $\Psi$ can be constructed from $G$ in polynomial time. Moreover, it holds that $\sigma\models\Psi$ iff $\sigma\,x_i=\langle v_i,c_i\rangle$ for some coloring $\gamma:V\to\{1,2,3\}$ with $\gamma\,v_i=c_i$. It follows that $\Psi$ is satisfiable iff $G$ has a 3-coloring. In summary, we obtain a polynomial time reduction from the problem of 3-colorability of undirected finite graphs into satisfiability of finite conjunctions over some partial order. This concludes the proof. ∎

For general partial orders $P$, however, we still may rely on the 0-normal form $\textsf{nf}_0$ and otherwise perform the same constructions as we did for lattices with the 1-normal form. Thus, we define an abstract ordering by

\[
\Psi_1\sqsubseteq^{\sharp}\Psi_2\qquad\text{iff}\qquad\textsf{nf}_0[\Psi_1]=\textsf{nf}_0[\Psi_1\wedge\Psi_2]\tag{12}
\]

Let us denote the resulting abstract domain by $\mathcal{D}[P]_0$. We have:

Theorem 4.4

For an arbitrary po $P$, the following holds:

  1. If a conjunction $\Psi$ is satisfiable over $P$ then $\textsf{nf}_0[\Psi]\neq\bot$.

  2. For all conjunctions $\Psi_1,\Psi_2$, $\textsf{nf}_0[\Psi_1]=\textsf{nf}_0[\Psi_1\land\Psi_2]$ implies that $\Psi_1\implies\Psi_2$.

For arbitrary po $P$, we define the abstract projection in the same way as for conjunctions over a lattice $P$, only that we now rely on formulas in 0-normal form. For such a formula $\Psi$, the projection $\Psi|^{\sharp}_{Y}$ onto a subset $Y\subseteq\mathcal{X}$ of variables is again defined by removing all constraints mentioning variables not in $Y$.

It is for the abstract join operation that we must find a more general definition, since least upper bounds or greatest lower bounds of sets of values in $P$ are no longer at hand. Assume that $\Psi_1,\Psi_2$ are in 0-normal form and different from $\bot$. Then, we define the abstract join $\Psi_1\sqcup^{\sharp}\Psi_2$ as the conjunction of the following constraints:

  • all constraints $x\sqsubseteq y$, $x,y\in\mathcal{X}$, which occur both in $\Psi_1$ and $\Psi_2$;

  • all constraints $d_i\sqsubseteq x$, $d_1,d_2\in P$, $x\in\mathcal{X}$ where $d_i\sqsubseteq x$ occurs in $\Psi_i$ for $i=1,2$ and $d_i\leq d_{3-i}$;

  • all constraints $x\sqsubseteq d_i$, $d_1,d_2\in P$, $x\in\mathcal{X}$ where $x\sqsubseteq d_i$ occurs in $\Psi_i$ for $i=1,2$ and $d_{3-i}\leq d_i$.

This definition essentially amounts to keeping those ordering constraints between variables on which $\Psi_1$ and $\Psi_2$ agree, and keeping a lower or upper bound only if it is more liberal than a corresponding bound of the other formula.

Example 6

For the po $\Sigma^{*}$ with the substring ordering, consider the formulas

\[
\begin{array}{lll}
\Psi_1&=&(\textsf{ab}\sqsubseteq x)\wedge(y\sqsubseteq\textsf{ab})\wedge(y\sqsubseteq z)\\
\Psi_2&=&(\textsf{abc}\sqsubseteq x)\wedge(y\sqsubseteq\textsf{abc})
\end{array}
\]

Then, according to our definition,

\[
\Psi_1\sqcup^{\sharp}\Psi_2=(\textsf{ab}\sqsubseteq x)\wedge(y\sqsubseteq\textsf{abc})
\]
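The abstract join for arbitrary partial orders only needs a comparison on $P$. The sketch below replays Example 6 with the substring ordering. The dictionary representation, the key names `order`, `lower` and `upper`, and the function names are assumptions made for illustration; both arguments are taken to be in 0-normal form and different from $\bot$.

```python
from typing import Callable, Dict, Set, Tuple

# A formula is a dict with three sets of pairs:
#   "order": (x, y) for x <= y,  "lower": (d, x) for d <= x,  "upper": (x, d) for x <= d
Formula = Dict[str, Set[Tuple[str, str]]]


def join_general(psi1: Formula, psi2: Formula,
                 leq: Callable[[str, str], bool]) -> Formula:
    """Abstract join that uses only the ordering `leq` on constants."""
    lower = set()
    for d1, x1 in psi1["lower"]:
        for d2, x2 in psi2["lower"]:
            if x1 != x2:
                continue
            if leq(d1, d2):          # d1 is the more liberal lower bound
                lower.add((d1, x1))
            if leq(d2, d1):          # d2 is the more liberal lower bound
                lower.add((d2, x1))
    upper = set()
    for x1, d1 in psi1["upper"]:
        for x2, d2 in psi2["upper"]:
            if x1 != x2:
                continue
            if leq(d2, d1):          # d1 is the more liberal upper bound
                upper.add((x1, d1))
            if leq(d1, d2):          # d2 is the more liberal upper bound
                upper.add((x1, d2))
    return {"order": psi1["order"] & psi2["order"], "lower": lower, "upper": upper}


def substring_leq(u: str, v: str) -> bool:
    """Substring ordering: u occurs as a contiguous substring of v."""
    return u in v


psi1 = {"order": {("y", "z")}, "lower": {("ab", "x")}, "upper": {("y", "ab")}}
psi2 = {"order": set(), "lower": {("abc", "x")}, "upper": {("y", "abc")}}
joined = join_general(psi1, psi2, substring_leq)
```

For these inputs, `joined` contains the lower bound `("ab", "x")` and the upper bound `("y", "abc")` and no variable constraints, matching the result stated in Example 6. If two bounds for the same variable are incomparable, neither is kept.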

With these definitions, the binary operation $\sqcup^{\sharp}$ returns the least upper bound of its arguments w.r.t. the ordering $\sqsubseteq^{\sharp}$. Moreover, $\mathcal{D}[P]_0$ turns into a 2-decomposable relational domain as well.

Theorem 4.5

For every po $P$, $\mathcal{D}[P]_0$ is a 2-decomposable relational domain. ∎

4.3 Directed Domains with Disjunctions

Subsequently, we extend the relational domain D[P]𝐷delimited-[]𝑃{\mathcal{}D}[P]italic_D [ italic_P ] for lattices P𝑃Pitalic_P (resp. D[P]0𝐷subscriptdelimited-[]𝑃0{\mathcal{}D}[P]_{0}italic_D [ italic_P ] start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT for arbitrary po’s) with disjunctions. This extension corresponds to the disjunctive completion of D[P]𝐷delimited-[]𝑃{\mathcal{}D}[P]italic_D [ italic_P ] (resp. D[P]0𝐷subscriptdelimited-[]𝑃0{\mathcal{}D}[P]_{0}italic_D [ italic_P ] start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT) Cousot and Cousot (1992). The elements of the resulting relational domain are disjunctions of normal form conjunctions (1-normal forms if P𝑃Pitalic_P is a lattice, and 0-normal forms in general) where for YX𝑌𝑋Y\subseteq{\mathcal{}X}italic_Y ⊆ italic_X, the restriction Ψ|Yevaluated-atΨ𝑌{\left.\kern-1.2pt\Psi\vphantom{|}\right|_{Y}}roman_Ψ | start_POSTSUBSCRIPT italic_Y end_POSTSUBSCRIPT of the disjunction ΨΨ\Psiroman_Ψ is defined as the disjunction of the restrictions c|Yevaluated-at𝑐𝑌{\left.\kern-1.2ptc\vphantom{|}\right|_{Y}}italic_c | start_POSTSUBSCRIPT italic_Y end_POSTSUBSCRIPT of the normal form conjunctions c𝑐citalic_c contained in ΨΨ\Psiroman_Ψ. By definition, restrictions therefore are distributive. Let D¯[P]¯𝐷delimited-[]𝑃\overline{\mathcal{}D}[P]over¯ start_ARG italic_D end_ARG [ italic_P ] (resp. D¯[P]0¯𝐷subscriptdelimited-[]𝑃0\overline{\mathcal{}D}[P]_{0}over¯ start_ARG italic_D end_ARG [ italic_P ] start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT) denote the resulting relational abstract domains. If P𝑃Pitalic_P is infinite, these relational domains have infinite strictly ascending chains, and therefore must have also strictly descending chains of unbounded length. For the lattice \mathbb{Z}blackboard_Z, e.g., there are even infinite strictly descending chains, e.g.,

\[ (0 \sqsubseteq x),\; (1 \sqsubseteq x),\; (2 \sqsubseteq x),\; \ldots \]

Nonetheless, we have:

Proposition 6
  1. For every po $P$, $\overline{\mathcal{D}}[P]_0$ is 2-nice.

  2. For every lattice $P$, $\overline{\mathcal{D}}[P]$ is 2-nice.

Proof

Let $D$ denote an arbitrary collection $\langle d_p \rangle_{p \in [\mathcal{X}]_2}$ with $d_p \in \overline{\mathcal{D}}[P]_0^p$. Consider an arbitrary formula $d'_p$ from the set $I_{\overline{\mathcal{D}}[P]}[D]^p$. It consists of disjunctions of conjunctions each of which may only mention variables from $p$ or constants occurring in any of the $d_{p'}$, $p' \in [\mathcal{X}]_2$. Since the number of these formulas is finite, statement (1) follows.

The proof of the second statement is analogous, except that the occurring constants may now also be finite meets of constants occurring in upper-bound constraints of the initial collection, or finite joins of constants occurring in lower-bound constraints. Still, the number of possible formulas remains finite. ∎
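The finiteness argument for lattices can be illustrated by a minimal sketch. It assumes the powerset lattice over a small universe, with constants modelled as Python `frozenset`s; the helper name `close_under` and the sample constants are hypothetical and not taken from the construction above. The sketch closes a finite set of constants under meets (for upper bounds) or joins (for lower bounds) and shows that only finitely many new constants arise.

```python
from itertools import combinations

def close_under(op, constants):
    """Close a finite set of constants under the binary operation `op`.

    Hypothetical helper. It terminates here because every result is a
    subset of the union of the inputs, so only finitely many values exist.
    """
    closed = set(constants)
    changed = True
    while changed:
        changed = False
        for a, b in combinations(list(closed), 2):
            c = op(a, b)
            if c not in closed:
                closed.add(c)
                changed = True
    return closed

# Constants occurring in upper-bound constraints (x ⊑ d) of some collection.
upper_constants = {frozenset({1, 2}), frozenset({2, 3}), frozenset({1, 3})}
# Constants occurring in lower-bound constraints (d ⊑ x).
lower_constants = {frozenset({1}), frozenset({2})}

# Meets are intersections, joins are unions in the powerset lattice.
upper_closure = close_under(frozenset.intersection, upper_constants)
lower_closure = close_under(frozenset.union, lower_constants)

print(len(upper_closure), len(lower_closure))
```

For a total order such as $\mathbb{Z}$, meets and joins of constants are again among the given constants, so the closure adds nothing.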

Due to Proposition 6, the construction from Section 3 can be applied, resulting in the 2-decomposable relational domains $\overline{\mathcal{D}}_2^\sharp[P]$ (in case of lattices $P$) and $\overline{\mathcal{D}}_2^\sharp[P]_0$ (for arbitrary po’s).

We exemplify the construction for the lattice $\mathbb{Z}$ of integers, i.e., for $\overline{\mathcal{D}}_2^\sharp[\mathbb{Z}]$. One-variable properties expressible in this lattice are disjunctions of interval constraints such as

\[ (x \sqsubseteq 3) \vee (5 \sqsubseteq x) \wedge (x \sqsubseteq 7) \]

Two-variable properties expressible in this lattice are, e.g.,

\[
\begin{array}{l}
(x \sqsubseteq -1) \wedge (x \sqsubseteq y) \;\;\vee \\
(0 \sqsubseteq x) \wedge (x \sqsubseteq 5) \wedge (2 \sqsubseteq y) \;\;\vee \\
(6 \sqsubseteq x) \wedge (y \sqsubseteq x) \wedge (y \sqsubseteq 19)
\end{array}
\]

Arbitrary elements in $\overline{\mathcal{D}}_2^\sharp[\mathbb{Z}]$ can be understood as representations of conjunctions of such properties.
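To make the reading of these formulas concrete, the following sketch evaluates them on concrete integer values. The encoding is our own assumption: a formula is a list of disjuncts, each disjunct a list of pairs `(a, b)` standing for $a \sqsubseteq b$, where a term is an integer constant or a variable name. The names `value` and `holds` are hypothetical.

```python
def value(term, env):
    """A term is an integer constant or a variable name looked up in env."""
    return term if isinstance(term, int) else env[term]

def holds(formula, env):
    """Check whether a disjunction of conjunctions of inequalities holds in env."""
    return any(
        all(value(a, env) <= value(b, env) for (a, b) in conjunction)
        for conjunction in formula
    )

# (x ⊑ 3) ∨ (5 ⊑ x) ∧ (x ⊑ 7); conjunction binds more tightly than disjunction.
one_var = [[("x", 3)], [(5, "x"), ("x", 7)]]

# The two-variable example with its three disjuncts.
two_var = [
    [("x", -1), ("x", "y")],
    [(0, "x"), ("x", 5), (2, "y")],
    [(6, "x"), ("y", "x"), ("y", 19)],
]

print(holds(one_var, {"x": 4}))          # False: 4 lies in the gap between 3 and 5
print(holds(one_var, {"x": 6}))          # True: second disjunct
print(holds(two_var, {"x": 3, "y": 1}))  # False: second disjunct needs 2 ⊑ y
print(holds(two_var, {"x": 8, "y": 2}))  # True: third disjunct
```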

Assume that we are given a collection $Z = \langle s_p \rangle_{p \in [\mathcal{X}]_2}$ with $s_p \in \overline{\mathcal{D}}[\mathbb{Z}]^p$ which is not yet stable, and that we would like to determine the corresponding stable collection by performing a fixpoint iteration to determine the greatest solution of Eq. 8. During that iteration, we only need to consider upper and lower bounds for each variable $x$ which have already occurred in the formulas $s_p$. Therefore, the length of each intermediate formula is bounded by a polynomial in the input, and each unknown $r_p$ is updated only polynomially often. As a consequence, the operations abstract join, abstract meet, and abstract projection for $\overline{\mathcal{D}}_2^\sharp[\mathbb{Z}]$ are all polynomial. For an arbitrary lattice or po $P$, we may proceed analogously. Efficiency of the fixpoint iteration, though, remains to be checked separately for every $P$.

4.4 Assignments

Let us turn to the construction of abstract transformers for assignments. We only describe these for the relational domains $\mathcal{D}[P]$ and $\mathcal{D}[P]_0$, respectively. We first consider three simple cases: assignments of unknown values, assignments of constants, and copying one variable into the other.

\[
\begin{array}{lll}
\llbracket x := ? \rrbracket^\sharp\, \Psi &=& \Psi|^\sharp_{\mathcal{X} \setminus \{x\}} \\
\llbracket x := d \rrbracket^\sharp\, \Psi &=& \Psi|^\sharp_{\mathcal{X} \setminus \{x\}} \land (d \sqsubseteq x) \land (x \sqsubseteq d) \\
\llbracket x := y \rrbracket^\sharp\, \Psi &=& \Psi|^\sharp_{\mathcal{X} \setminus \{x\}} \land (x \sqsubseteq y) \land (y \sqsubseteq x)
\end{array}
\tag{13}
\]

for $d \in P$ and $x, y \in \mathcal{X}$ with $x \not\equiv y$. Again, we realize the assignment of unknown values by restriction. For assigning constants and variables, we remark that equality can be expressed via a pair of inequalities.
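The three transformers of Eq. 13 can be sketched for a single conjunction over $\mathbb{Z}$ as follows. This is a sketch under assumptions, not the paper's algorithm: a conjunction is encoded as a triple `(lo, hi, le)`, the function `close` is a simple transitive saturation standing in for 1-normalization on the total order $\mathbb{Z}$, and all function names are hypothetical.

```python
import math

def close(conj):
    """Saturate a conjunction over Z by transitivity.

    conj = (lo, hi, le): lo and hi map variables to constant lower and upper
    bounds, le is a set of pairs (u, v) standing for u ⊑ v.
    """
    lo, hi, le = dict(conj[0]), dict(conj[1]), set(conj[2])
    changed = True
    while changed:
        changed = False
        for (u, v) in list(le):
            # u ⊑ v and v ⊑ w entail u ⊑ w
            for (v2, w) in list(le):
                if v2 == v and w != u and (u, w) not in le:
                    le.add((u, w))
                    changed = True
            # every lower bound of u is a lower bound of v (join on Z is max)
            if lo.get(u, -math.inf) > lo.get(v, -math.inf):
                lo[v] = lo[u]
                changed = True
            # every upper bound of v is an upper bound of u (meet on Z is min)
            if hi.get(v, math.inf) < hi.get(u, math.inf):
                hi[u] = hi[v]
                changed = True
    return lo, hi, le

def restrict_away(conj, x):
    """Restriction to all variables but x: saturate, then drop constraints on x."""
    lo, hi, le = close(conj)
    lo.pop(x, None)
    hi.pop(x, None)
    return lo, hi, {(u, v) for (u, v) in le if x not in (u, v)}

def assign_unknown(conj, x):
    """x := ?"""
    return restrict_away(conj, x)

def assign_const(conj, x, d):
    """x := d, adding (d ⊑ x) ∧ (x ⊑ d)."""
    lo, hi, le = restrict_away(conj, x)
    lo[x] = d
    hi[x] = d
    return lo, hi, le

def assign_var(conj, x, y):
    """x := y for distinct x and y, adding (x ⊑ y) ∧ (y ⊑ x)."""
    lo, hi, le = restrict_away(conj, x)
    le |= {(x, y), (y, x)}
    return lo, hi, le

# Ψ = (2 ⊑ x) ∧ (x ⊑ y) ∧ (y ⊑ z)
psi = ({"x": 2}, {}, {("x", "y"), ("y", "z")})
print(assign_unknown(psi, "y"))
print(assign_const(psi, "y", 5))
print(assign_var(psi, "y", "z"))
```

Saturating before dropping constraints matters: after `y := ?`, the consequences $x \sqsubseteq z$ and $2 \sqsubseteq z$ of $\Psi$ are retained even though they were only implicit via $y$.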

Individual partial orders, though, may support further forms of right-hand sides in assignments. Subsequently, we enumerate more general forms of assignments for sets and for the prefix, substring, and scattered substring partial orders on strings.

Sets.

For sets, we consider right-hand sides of the form $y_1 \cap y_2$ or $y_1 \cup y_2$ for $y_1, y_2 \in \mathcal{X}$ with $x \notin \{y_1, y_2\}$. We define

\[
\begin{array}{lll}
\llbracket x := y_1 \cap y_2 \rrbracket^\sharp\, \Psi &=& \Psi|^\sharp_{\mathcal{X} \setminus \{x\}} \land (x \sqsubseteq y_1) \land (x \sqsubseteq y_2) \\
\llbracket x := y_1 \cup y_2 \rrbracket^\sharp\, \Psi &=& \Psi|^\sharp_{\mathcal{X} \setminus \{x\}} \land (y_1 \sqsubseteq x) \land (y_2 \sqsubseteq x)
\end{array}
\]

Thus, after the assignment, we obtain new upper (lower) bounds of $x$ in terms of the variables $y_1$ and $y_2$. An analogous construction can also be applied to multisets. We remark that the constraints given on the right-hand sides do not entail that the equalities $x = y_1 \cap y_2$ and $x = y_1 \cup y_2$, respectively, hold after the assignments.
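The last remark can be checked on a concrete instance. The sketch below assumes finite sets modelled as Python `frozenset`s with $\sqsubseteq$ read as $\subseteq$; the function names and sample values are hypothetical. It confirms that the concrete results satisfy the added constraints, and that other values of $x$ satisfy them as well.

```python
def after_intersection_assign(x, y1, y2):
    """Constraints added by x := y1 ∩ y2: (x ⊑ y1) ∧ (x ⊑ y2)."""
    return x <= y1 and x <= y2

def after_union_assign(x, y1, y2):
    """Constraints added by x := y1 ∪ y2: (y1 ⊑ x) ∧ (y2 ⊑ x)."""
    return y1 <= x and y2 <= x

y1, y2 = frozenset({1, 2, 3}), frozenset({2, 3, 4})

# The concrete results satisfy the abstract constraints on this instance.
assert after_intersection_assign(y1 & y2, y1, y2)
assert after_union_assign(y1 | y2, y1, y2)

# Other values of x satisfy them too, so the equalities are not entailed.
assert after_intersection_assign(frozenset({2}), y1, y2)       # proper subset of y1 ∩ y2
assert after_union_assign(frozenset({1, 2, 3, 4, 5}), y1, y2)  # proper superset of y1 ∪ y2
```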

Prefixes.

In this case, right-hand sides of interest are concatenations of a constant or variable, possibly followed by some further value, i.e., are of the form $s\,?$ for $s$ either in $\Sigma^*$ or in $\mathcal{X} \setminus \{x\}$, with “?” again denoting unknown input. We define

\[
\begin{array}{lll}
\llbracket x := s\,? \rrbracket^\sharp\, \Psi &=& \Psi|^\sharp_{\mathcal{X} \setminus \{x\}} \land (s \sqsubseteq x)
\end{array}
\]

i.e., we only obtain information about lower bounds for $x$ after the assignment but lose all information about upper bounds.
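A small sketch, assuming Python strings as values and hypothetical helper names, illustrates why only the lower bound survives: whatever the unknown suffix is, $s$ remains a prefix of $x$, whereas any candidate constant upper bound is refuted by a sufficiently long suffix.

```python
def is_prefix(s, t):
    """The prefix order on strings: s ⊑ t iff t starts with s."""
    return t.startswith(s)

def assign_prefix_concrete(s, unknown):
    """Concrete effect of x := s ? for one particular unknown input."""
    return s + unknown

s = "ab"
for unknown in ["", "c", "cde"]:
    x = assign_prefix_concrete(s, unknown)
    assert is_prefix(s, x)  # the lower bound (s ⊑ x) holds for every input

# For any candidate constant upper bound d, an unknown suffix longer than d
# produces a value of x that is not a prefix of d.
d = "abcde"
x = assign_prefix_concrete(s, "z" * (len(d) + 1))
assert not is_prefix(x, d)
```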

Substrings.

Again, we consider right-hand sides which are concatenations of constants or variables with further values. These now are of the form $?\,s_1\,? \ldots ?\,s_k\,?$ ($s_i \in \Sigma^* \cup \mathcal{X} \setminus \{x\}$). We define

\[
\begin{array}{lll}
\llbracket x := ?\,s_1\,? \ldots ?\,s_k\,? \rrbracket^\sharp\, \Psi &=&
\begin{array}[t]{@{}l}
\Psi|^\sharp_{\mathcal{X} \setminus \{x\}} \;\land \\
(s_1 \sqsubseteq x) \land \ldots \land (s_k \sqsubseteq x)
\end{array}
\end{array}
\]

For scattered substrings, we proceed similarly. In both cases, no information is obtained for upper bounds to the left-hand side variable $x$ after the assignment.
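The two orders and the concrete effect of such an assignment can be sketched as follows, again assuming Python strings as values. The helper names and sample strings are hypothetical.

```python
def is_substring(s, t):
    """Substring order: s ⊑ t iff s occurs contiguously in t."""
    return s in t

def is_scattered_substring(s, t):
    """Scattered substring order: the letters of s occur in t in the same order."""
    remaining = iter(t)
    # `ch in remaining` consumes the iterator up to and including the match.
    return all(ch in remaining for ch in s)

def assign_concrete(parts, unknowns):
    """Concrete effect of x := ? s_1 ? ... ? s_k ? for given unknown pieces.

    `unknowns` must contain len(parts) + 1 strings, one for each '?'.
    """
    assert len(unknowns) == len(parts) + 1
    out = unknowns[0]
    for s_i, u in zip(parts, unknowns[1:]):
        out += s_i + u
    return out

parts = ["ab", "cd"]
x = assign_concrete(parts, ["..", "--", "!!"])  # "..ab--cd!!"

# Every s_i is a lower bound of x in both orders.
assert all(is_substring(s_i, x) for s_i in parts)
assert all(is_scattered_substring(s_i, x) for s_i in parts)

# The orders differ: "abcd" is a scattered substring of x but not a substring.
assert is_scattered_substring("abcd", x) and not is_substring("abcd", x)
```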

So far, we have assumed that the right-hand side $s$ does not contain the variable $x$ from the left-hand side. In case that $x$ occurs in $s$, we split the assignment into the sequence

\[ \textsf{tmp} := s;\; x := \textsf{tmp}; \]

for some fresh variable tmp, i.e., first store the value of the right-hand side $s$ in tmp whose value only then is assigned to the left-hand side variable $x$.
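As a worked illustration, consider the prefix order and the assignment $x := x\,?$ applied to $\Psi = (a \sqsubseteq x)$ for some constant $a \in \Sigma^*$. The choice of $\Psi$ and $a$ is ours, and we assume that restriction retains transitive consequences such as $a \sqsubseteq \textsf{tmp}$. Since tmp is fresh, $\Psi$ does not mention it, and splitting the assignment yields

\[
\begin{array}{lll}
\llbracket \textsf{tmp} := x\,? \rrbracket^\sharp\, \Psi &=& (a \sqsubseteq x) \land (x \sqsubseteq \textsf{tmp}) \\
\llbracket x := \textsf{tmp} \rrbracket^\sharp\, \big((a \sqsubseteq x) \land (x \sqsubseteq \textsf{tmp})\big) &=& (a \sqsubseteq \textsf{tmp}) \land (x \sqsubseteq \textsf{tmp}) \land (\textsf{tmp} \sqsubseteq x)
\end{array}
\]

so that the lower bound $a \sqsubseteq x$ still follows for the new value of $x$.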

These abstract transformers for the relational domains $\mathcal{D}[P]$ (resp. $\mathcal{D}[P]_0$) are readily lifted to corresponding transformers for the weakly relational domains $\overline{\mathcal{D}}_2^\sharp[P]$ (resp. $\overline{\mathcal{D}}_2^\sharp[P]_0$).

4.5 Guards and Negated Inequalities

Let us now turn to a treatment of guards $?c$ for the directed domain $\overline{\mathcal{D}}_2^\sharp[P]$ where $P$ is a lattice. The case for $\overline{\mathcal{D}}_2^\sharp[P]_0$ (when $P$ is not a lattice) is analogous.

A condition $c$ which consists of an inequality $s_1 \sqsubseteq s_2$ for $s_i$ being variables or constants already represents an abstract relation. Therefore, Eq. 2 can be used to define the abstract effect $\llbracket ?c \rrbracket^\sharp$.

If the condition $c$ is a negated inequality $s_1 \not\sqsubseteq s_2$, this is not immediately possible. Assume that the variables occurring in $c$ all occur in $p \in [\mathcal{X}]_2$. Now consider an arbitrary element $D = \langle d_{p'} \rangle_{p' \in [\mathcal{X}]_2}$. In particular, $d_p \in \overline{\mathcal{D}}[P]^p$, i.e., $d_p = e_1 \vee \ldots \vee e_k$ for conjunctions $e_1, \ldots, e_k$ all using variables from $p$ only. In this case, we define

\[
\llbracket ?c \rrbracket^\sharp\, D \;=\; D \sqcap \bigvee \{ e_j \mid e_j \not\implies (s_1 \sqsubseteq s_2) \}
\]

Thus, the negated inequality $c$ allows us to improve the abstract relation $D$ by possibly removing those conjunctions $e_j$ from $d_p$ which contradict $c$.
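The filtering step can be sketched for the lattice $\mathbb{Z}$. The sketch makes several assumptions of our own: a conjunction is encoded as a triple `(lo, hi, le)`, the function `close` is a transitive saturation standing in for 1-normalization, the entailment check is specific to the total order $\mathbb{Z}$, and only the component $d_p$ is filtered, whereas the definition above additionally takes the meet with $D$. All names are hypothetical.

```python
import math

def close(conj):
    """Saturate (lo, hi, le) over Z: lo/hi are constant bounds, le holds pairs u ⊑ v."""
    lo, hi, le = dict(conj[0]), dict(conj[1]), set(conj[2])
    changed = True
    while changed:
        changed = False
        for (u, v) in list(le):
            for (v2, w) in list(le):
                if v2 == v and w != u and (u, w) not in le:
                    le.add((u, w))
                    changed = True
            if lo.get(u, -math.inf) > lo.get(v, -math.inf):
                lo[v] = lo[u]
                changed = True
            if hi.get(v, math.inf) < hi.get(u, math.inf):
                hi[u] = hi[v]
                changed = True
    return lo, hi, le

def entails(conj, s1, s2):
    """Does conj entail s1 ⊑ s2? Terms are integer constants or variable names."""
    lo, hi, le = close(conj)
    if any(lo.get(v, -math.inf) > hi.get(v, math.inf) for v in set(lo) | set(hi)):
        return True  # an unsatisfiable conjunction entails everything
    largest_s1 = s1 if isinstance(s1, int) else hi.get(s1, math.inf)
    smallest_s2 = s2 if isinstance(s2, int) else lo.get(s2, -math.inf)
    return s1 == s2 or (s1, s2) in le or largest_s1 <= smallest_s2

def guard_not_le(d_p, s1, s2):
    """Effect of the guard ?(s1 ⋢ s2) on d_p: keep the e_j that do not imply s1 ⊑ s2."""
    return [e for e in d_p if not entails(e, s1, s2)]

# The two-variable example over Z, one triple per disjunct.
d_p = [
    ({}, {"x": -1}, {("x", "y")}),        # (x ⊑ -1) ∧ (x ⊑ y)
    ({"x": 0, "y": 2}, {"x": 5}, set()),  # (0 ⊑ x) ∧ (x ⊑ 5) ∧ (2 ⊑ y)
    ({"x": 6}, {"y": 19}, {("y", "x")}),  # (6 ⊑ x) ∧ (y ⊑ x) ∧ (y ⊑ 19)
]

print(guard_not_le(d_p, "x", 5))    # only the third disjunct remains
print(guard_not_le(d_p, "y", "x"))  # the third disjunct is removed
```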

5 Conclusion

We considered a construction of 2-decomposable relational domains from arbitrary relational domains and exemplified this construction by deriving 2-disjunctive constants from the relational domain of disjunctive constants. For 2-disjunctive constants, it turned out that normalization is prohibitively expensive. Therefore, we provided a second general construction of 2-decomposable relational domains, now based on greatest solutions of constraint systems, which – in the case of disjunctive constants – results in a 2-decomposable domain where the operations join, meet, and restriction are polynomial.

In the second part, we then considered directed domains as conjunctions of inequalities over lattices or general partial orders. For lattices, we provided the 1-normal form for a syntactic characterization of semantic equivalence. We showed that the resulting domain is 2-decomposable and provided precise polynomial algorithms for 1-normalization, projection, join, and meet. For arbitrary partial orders, we use a weaker form of normalization for constructing a weaker 2-decomposable relational domain, for which we again provided polynomial algorithms, now for 0-normalization, projection, join, and meet. Only in the very last step did we add disjunctions by applying the general construction of 2-decomposable domains based on approximate normalization from the previous section. Both for 2-disjunctive constants and for directed domains, we indicated how transfer functions for assignments and guards can be constructed.

Our results can be extended in several directions. In the case of constants, one may, e.g., additionally track equalities as well as disequalities between variables; likewise, for directed domains, an extensive study of the impact of negated inequalities could be of interest. Here, we only studied lattice operations and transfer functions. Directed domains, though, may have infinite strictly ascending chains. Therefore, tailored widening and narrowing operators are of interest when these domains are employed for practical static analysis.

Acknowledgements.

This work has been supported by Shota Rustaveli National Science Foundation of Georgia under the project FR-21-7973 and by Deutsche Forschungsgemeinschaft (DFG) – 378803395/2428 ConVeY.

References

  • Abdulla et al. (2019) Abdulla, P.A., Atig, M.F., Diep, B.P., Holík, L., Janku, P.: Chain-free string constraints. In: Chen, Y., Cheng, C., Esparza, J. (eds.) Automated Technology for Verification and Analysis - 17th International Symposium, ATVA 2019, Taipei, Taiwan, October 28-31, 2019, Proceedings, Lecture Notes in Computer Science, vol. 11781, pp. 277–293. Springer (2019). URL https://doi.org/10.1007/978-3-030-31784-3_16
  • Albert et al. (2014) Albert, E., Arenas, P., Genaim, S., Puebla, G., Román-Díez, G.: Conditional termination of loops over heap-allocated data. Sci. Comput. Program. 92, 2–24 (2014). URL https://doi.org/10.1016/j.scico.2013.04.006
  • Arceri et al. (2022) Arceri, V., Olliaro, M., Cortesi, A., Ferrara, P.: Relational string abstract domains. In: Finkbeiner, B., Wies, T. (eds.) Verification, Model Checking, and Abstract Interpretation - 23rd International Conference, VMCAI 2022, Philadelphia, PA, USA, January 16-18, 2022, Proceedings, Lecture Notes in Computer Science, vol. 13182, pp. 20–42. Springer (2022). URL https://doi.org/10.1007/978-3-030-94583-1_2
  • Bagnara et al. (2008) Bagnara, R., Hill, P.M., Zaffanella, E.: An improved tight closure algorithm for integer octagonal constraints. In: Logozzo, F., Peled, D.A., Zuck, L.D. (eds.) Verification, Model Checking, and Abstract Interpretation, pp. 8–21. Springer Berlin Heidelberg, Berlin, Heidelberg (2008)
  • Bagnara et al. (2009) Bagnara, R., Hill, P.M., Zaffanella, E.: Weakly-relational shapes for numeric abstractions: improved algorithms and proofs of correctness. Formal Methods Syst. Des. 35(3), 279–323 (2009). URL https://doi.org/10.1007/s10703-009-0073-1
  • Beckert et al. (2000) Beckert, B., Hähnle, R., Manyà, F.: The 2-sat problem of regular signed CNF formulas. In: 30th IEEE International Symposium on Multiple-Valued Logic, ISMVL 2000, Portland, Oregon, USA, May 23-25, 2000, Proceedings, pp. 331–336. IEEE Computer Society (2000). URL https://doi.org/10.1109/ISMVL.2000.848640
  • Chawdhary et al. (2019) Chawdhary, A., Robbins, E., King, A.: Incrementally closing octagons. Formal Methods Syst. Des. 54(2), 232–277 (2019). URL https://doi.org/10.1007/s10703-017-0314-7
  • Chen et al. (2018) Chen, T., Chen, Y., Hague, M., Lin, A.W., Wu, Z.: What is decidable about string constraints with the replaceall function. Proc. ACM Program. Lang. 2(POPL), 3:1–3:29 (2018). URL https://doi.org/10.1145/3158091
  • Cousot and Cousot (1992) Cousot, P., Cousot, R.: Abstract interpretation frameworks. Journal of logic and computation 2(4), 511–547 (1992)
  • Cousot and Halbwachs (1978) Cousot, P., Halbwachs, N.: Automatic discovery of linear restraints among variables of a program. In: Aho, A.V., Zilles, S.N., Szymanski, T.G. (eds.) Conference Record of the Fifth Annual ACM Symposium on Principles of Programming Languages, Tucson, Arizona, USA, January 1978, pp. 84–96. ACM Press (1978). URL https://doi.org/10.1145/512760.512770
  • Day et al. (2023) Day, J.D., Ganesh, V., Grewal, N., Manea, F.: On the expressive power of string constraints. Proc. ACM Program. Lang. 7(POPL), 278–308 (2023). URL https://doi.org/10.1145/3571203
  • Dor et al. (2001) Dor, N., Rodeh, M., Sagiv, S.: Cleanness checking of string manipulations in C programs via integer analysis. In: Cousot, P. (ed.) Static Analysis, 8th International Symposium, SAS 2001, Paris, France, July 16-18, 2001, Proceedings, pp. 194–212. Springer, LNCS 2126 (2001). URL https://doi.org/10.1007/3-540-47764-0_12
  • Ganesh et al. (2011) Ganesh, V., Minnes, M., Solar-Lezama, A., Rinard, M.: What is decidable about strings? (2011)
  • Karr (1976) Karr, M.: Affine relationships among variables of a program. Acta Informatica 6, 133–151 (1976). URL https://doi.org/10.1007/BF00268497
  • Miné (2001) Miné, A.: The octagon abstract domain. In: WCRE’ 01, p. 310. IEEE Computer Society (2001). DOI 10.1109/WCRE.2001.957836
  • Miné (2004) Miné, A.: Weakly relational numerical abstract domains. (domaines numériques abstraits faiblement relationnels). Ph.D. thesis, École Polytechnique, Palaiseau, France (2004). URL https://tel.archives-ouvertes.fr/tel-00136630
  • Miné (2006) Miné, A.: The octagon abstract domain. Higher Order Symbol. Comput. 19(1), 31–100 (2006). URL https://doi.org/10.1007/s10990-006-8609-1
  • Müller-Olm and Seidl (2004) Müller-Olm, M., Seidl, H.: Precise interprocedural analysis through linear algebra. In: Jones, N.D., Leroy, X. (eds.) Proceedings of the 31st ACM SIGPLAN-SIGACT Symposium on Principles of Programming Languages, POPL 2004, Venice, Italy, January 14-16, 2004, pp. 330–341. ACM (2004). URL https://doi.org/10.1145/964001.964029
  • Müller-Olm and Seidl (2007) Müller-Olm, M., Seidl, H.: Analysis of modular arithmetic. ACM Trans. Program. Lang. Syst. 29(5), 29 (2007). URL https://doi.org/10.1145/1275497.1275504
  • Sankaranarayanan et al. (2005) Sankaranarayanan, S., Sipma, H.B., Manna, Z.: Scalable analysis of linear systems using mathematical programming. In: Cousot, R. (ed.) Verification, Model Checking, and Abstract Interpretation, LNCS, vol. 3385, pp. 25–41. Springer, Berlin, Heidelberg (2005)
  • Schwarz et al. (2023) Schwarz, M., Saan, S., Seidl, H., Erhard, J., Vojdani, V.: Clustered relational thread-modular abstract interpretation with local traces. In: Wies, T. (ed.) Programming Languages and Systems - 32nd European Symposium on Programming, ESOP 2023, Held as Part of the European Joint Conferences on Theory and Practice of Software, ETAPS 2023, Paris, France, April 22-27, 2023, Proceedings, Lecture Notes in Computer Science, vol. 13990, pp. 28–58. Springer (2023). URL https://doi.org/10.1007/978-3-031-30044-8_2
  • Schwarz and Seidl (2023) Schwarz, M., Seidl, H.: Octagons revisited. In: Hermenegildo, M.V., Morales, J.F. (eds.) Static Analysis, pp. 485–507. Springer Nature Switzerland, Cham (2023)
  • Simon et al. (2002) Simon, A., King, A., Howe, J.M.: Two variables per linear inequality as an abstract domain. In: Leuschel, M. (ed.) Logic Based Program Synthesis and Transformation, 12th International Workshop, LOPSTR 2002, Madrid, Spain, September 17-20, 2002, Revised Selected Papers, LNCS, vol. 2664, pp. 71–89. Springer (2002). URL https://doi.org/10.1007/3-540-45013-0_7
  • Yu et al. (2011) Yu, F., Bultan, T., Hardekopf, B.: String abstractions for string verification. In: Groce, A., Musuvathi, M. (eds.) Model Checking Software - 18th International SPIN Workshop, Snowbird, UT, USA, July 14-15, 2011. Proceedings, Lecture Notes in Computer Science, vol. 6823, pp. 20–37. Springer (2011). URL https://doi.org/10.1007/978-3-642-22306-8_3