research-article

Open access

Interactive Abstract Interpretation with Demanded Summarization

Authors:

Benno Stein,

Bor-Yuh Evan Chang,

Manu SridharanAuthors Info & Claims

ACM Transactions on Programming Languages and Systems, Volume 46, Issue 1

Article No.: 4, Pages 1 - 40

https://doi.org/10.1145/3648441

Published: 29 March 2024 Publication History

PDF eReader

Abstract

We consider the problem of making expressive, interactive static analyzers compositional. Such a technique could help bring the power of server-based static analyses to integrated development environments (IDEs), updating their results live as the code is modified. Compositionality is key for this scenario, as it enables reuse of already-computed analysis results for unmodified code. Previous techniques for interactive static analysis either lack compositionality, cannot express arbitrary abstract domains, or are not from-scratch consistent.

We present demanded summarization, the first algorithm for incremental compositional analysis in arbitrary abstract domains that guarantees from-scratch consistency. Our approach analyzes individual procedures using a recent technique for demanded analysis, computing summaries on demand for procedure calls. A dynamically updated summary dependency graph enables precise result invalidation after program edits, and the algorithm is carefully designed to guarantee from-scratch-consistent results after edits, even in the presence of recursion and in arbitrary abstract domains. We formalize our technique and prove soundness, termination, and from-scratch consistency. An experimental evaluation of a prototype implementation on synthetic and real-world program edits provides evidence for the feasibility of this theoretical framework, showing potential for major performance benefits over non-demanded compositional analyses.

1 Introduction

This article addresses the problem of developing incremental and demand-driven static analyzers that simultaneously support real-time user interaction, arbitrarily complex abstract domains, and compositional analysis. Static analysis is being increasingly deployed for bug finding and verification in continuous integration (CI) pipelines and automated code review systems [Distefano et al. 2019; Sadowski et al. 2018]. However, the cost of analyzing large programs means that developers must wait minutes or sometimes hours to receive analysis results after making changes, depressing fix rates and ultimately limiting the effectiveness of these tools.

In this article, we describe an analysis engine that is (1) incremental in that it reuses analysis results unaffected by program edits, (2) demand-driven in that it performs only the analysis work needed to respond to client-issued queries, and (3) compositional in that it produces reusable and composable procedure summaries for interprocedural analysis. Furthermore, our framework makes no demands on analysis implementors beyond those typical of a batch abstract interpretation engine—specifically, it supports arbitrary abstract domains, including infinite-height domains with non-monotonic widening operators.

We formalize incrementality and demand for summary-based abstract interpretation and show that our approach is sound, terminating, and from-scratch consistent, i.e., that analysis results are always at least as precise as reanalyzing the program from scratch. Soundness and termination are familiar metatheoretic properties from the program analysis literature, whereas from-scratch consistency has been called “the fundamental correctness property of incremental computation” in general incremental computation literature [Hammer et al. 2015]. In the context of interactive program analysis, from-scratch consistency guarantees that there is no loss of precision or flakiness introduced by incremental or demand-driven infrastructure.

Recent work offers an approach for interactive analysis, making analyses with arbitrary abstract domains incremental and demand-driven by explicitly reifying abstract interpretation computation into acyclic dependency structures called demanded¹ abstract interpretation graphs (DAIGs) [Stein et al. 2021a]. This combination of incrementality and demand is desirable for scenarios requiring interactive performance: It avoids unnecessary recomputation of unchanged outputs when inputs change and also avoids unnecessary computation of outputs that are never needed by the client [Hammer et al. 2015;, 2014].

Although this technique computes analysis results efficiently and interactively, it is limited to intraprocedural analysis of imperative programs. Stein et al. [2021a] describe an informal extension of their whole-program operational approach to interprocedural analysis, but this extension does not handle recursion at all and is known to have issues in scaling up to large programs. Instead, a standard approach to scale batch abstract interpretation to large programs is to compute procedure summaries and then apply these summaries to analyze the whole program compositionally [Calcagno and Distefano 2011; Blackshear et al. 2018; Distefano et al. 2019; Reps et al. 1995; Schubert et al. 2021].

Ideally, an interactive demanded abstract interpreter could interoperate with a batch compositional analysis seamlessly, updating analysis results locally while drawing on existing procedure summaries computed on CI servers—with the same arbitrarily complex abstract domains.

Traditional compositional analyses based on the well-known tabulation algorithm combine “operational” dataflow analysis (i.e., how to interpret commands intraprocedurally to compute the analysis state at a program point, given the analysis states at predecessor points) with “denotational” procedure summaries mapping code to a functional or relational abstraction of its semantics (i.e., a compilation of code to a transformer on analysis states for interprocedural analysis) [Sharir and Pnueli 1981; Reps et al. 1995], but existing dependency-based approaches to incremental or demand-driven analysis tend to focus on one or the other style of analysis.

In this article, we show how demanded analysis infrastructure can be extended to tabulation-based compositional analysis in the presence of arbitrary abstract domains and recursive procedures, while providing meta-theoretical guarantees including soundness, termination, and from-scratch consistency of analysis results. From-scratch consistency ensures that incremental results agree precisely with an underlying batch analysis and is crucial for reliable deployment in analysis systems that strive for deterministic and reproducible results.

To address this challenge, we introduce a dependency map that is dynamically extended to capture the dependencies that arise from using demanded procedure summaries. These dynamic dependencies also enable our algorithm to detect the self-referential procedure summaries that arise when analyzing recursive procedures and require a further fixed-point iteration.

Our framework improves over the state-of-the-art in multiple ways. Compared to previous work on intraprocedural demanded analysis [Stein et al. 2021a], which relies on the restricted structure of loops in reducible control flow graphs, our approach naturally handles the richly structured dependency graphs that arise from analyzing interprocedural control flow with recursion.

Supporting the analysis of recursive procedures is particularly important in languages with higher-order or dynamic dispatch, as it is well known that reasonable approximation in the underlying call graph construction can lead to recursion in the program abstraction.

Other recent compositional analysis frameworks [Blackshear et al. 2018; Calcagno and Distefano 2011] have claimed some degree of incrementality to reduce analysis times on CI servers after a code change. It is true that compositionality naturally yields some degree of incrementality, but tracking the invalidation of summaries is tricky and can lead to subtle soundness bugs, especially surrounding virtual calls and incremental changes to dependency structures. Such issues have manifested in the Infer static analyzer, for example, due to insufficiently conservative dependency tracking or invalidation of pre-edit summaries using the post-edit dependency structure [Stein 2023]. These works give no formal treatment of incrementality and are typically deployed in a non-incremental configuration. A variant of the technique described in this article has recently been implemented in the Infer static analyzer, enabling deployment of incremental analysis and yielding significant (on the order of 3 $\times$ ) analysis speedups in CI [Stein 2023].

This article aims to formalize and provide a richer understanding of the dependency management problem that underlies sound and precise incremental and demand-driven analysis.

To our best knowledge, this article presents the first algorithm for incremental compositional analysis in arbitrary abstract domains (including infinite-height domains with non-monotonic widening). Further, it supports demand-driven queries for analysis results, and we provide a full meta-theory showing soundness, termination, and from-scratch consistency in the presence of recursive procedures.

In short, we make the following key contributions:

—

We introduce a framework for interactive abstract interpretation with demanded summarization, reifying the dependency structure of a tabulation-based analysis as a dynamically evolving demanded summarization graph (DSG) (Section 4). The result is a demand-driven interprocedural analysis framework that is both incremental and compositional by default.

—

We formalize such an analysis as an on-demand evaluation of DSGs, showing that it is sound, terminating, and from-scratch consistent with the underlying batch analysis (Section 5).

—

We describe a prototype implementation of the framework and evaluate it on both synthetic benchmarks and real bug-fixing edits from open-source Java programs. Our proof-of-concept evaluation finds that our framework enables real-time interprocedural analysis in rich abstract domains (Section 6).

—

We formalize extensions to the framework that enable memory and computation-saving optimizations without compromising its formal guarantees (Section 7). While seemingly simple, we see that maintaining the summary dependency structure in a sound way is surprisingly subtle.

2 Overview

In this section, we illustrate demanded summarization by example. Figure 1 shows a numerical program (adapted from Sagiv et al. [1996]) in an imperative language, with a recursive procedure p. Figure 1 also shows a simple example edit on line 3 of the program, designed to demonstrate some of our technique’s key features.

Fig. 1.

Note that throughout this worked example, we assume a standard semantics for variable scoping and formal/actual parameter binding; our formalism elides these details for the sake of simplicity and clarity, as they are unrelated to the developments in this article.

For illustration, we consider analysis of the original and edited programs using an interval abstract domain to bound variable ranges—a textbook example of an infinite height domain with a non-monotonic widening operator [Cousot and Cousot 1977]. With this domain, the analysis can prove that $x = -1$ at the exit of main originally, and $x = -5$ at exit after the edit. An efficient incremental analysis should re-use many intermediate results from analysis of the original program when re-analyzing the edited version.

Our key goal is to define an analysis framework that simultaneously supports incremental, demand-driven, and compositional analysis in arbitrary abstract domains. Recent work by Stein et al. [2021a] supports incremental and demand-driven analysis in arbitrary domains, but only for intraprocedural analysis; they present an informal extension for interprocedural analysis, but it is not compositional and cannot handle recursion.

We first show the drawbacks of a lack of compositionality for our example, then show how our demanded summarization approach achieves both compositionality and support for recursion. This technique enables the use of Stein et al. [2021a] fine-grained demanded abstract interpreters at scale, so we describe the approach using DAIGs to be concrete in our presentation and support incrementality at the fine granularity of statements. However, the same general approach could be used to derive an analysis with a coarser procedure-level granularity if instantiated with a standard batch intraprocedural abstract interpreter.

2.1 Incrementality and Context Sensitivity

Figure 2 illustrates how an operational² approach to interprocedural analysis has disadvantages for incremental analysis. In the operational call-strings approach [Sharir and Pnueli 1981], a procedure is analyzed separately for each abstract call stack, or call string, in which it is called. Figure 2 shows the analysis result for the original program using call strings of length 1; p is analyzed once for each call site (lines 4 and 10). Using this context-sensitivity policy provides sufficient precision to prove $x = -1$ at exit.

Fig. 2.

The downside of the operational approach comes in doing incremental analysis. The operational approach indexes its results on the calling context under which a procedure is invoked. Hence, if an edit leads to recomputing facts at some call site of a procedure, then the procedure must be re-analyzed for corresponding contexts. In Figure 1, the edit on line 3 causes re-computation of the abstract facts before the call site at line 4. This in turn necessitates invalidation of all analysis facts computed in p for this call (shown with ✗). Similarly, all facts for p from the recursive call must be invalidated. This invalidation is wasteful: p was not modified and it invokes no other functions (besides itself), so results for p should be unaffected by the change.

A denotational $^{2}$ interprocedural analysis computes a transformer for each procedure—typically, some two-state relational abstraction or Hoare triple(s) over the procedure, which can then be applied at callsites when the abstract state is compatible with a summary’s initial state. This is the “functional approach” of Sharir and Pnueli [1981], and such analyses are often referred to as “summary-based,” “compositional,” or “modular.” Note that a denotational interprocedural analysis need not analyze each procedure precisely once but may compute and tabulate multiple partial procedure summaries.

Figure 3 shows the result of our demanded summarization approach, which is a denotational, tabulation-based approach to interprocedural analysis for our example. For the moment, ignore the edit and arrows between the boxes, which are part of explaining our approach further below. What to note now is that copies of p are distinguished by the abstract fact reaching p’s entry, e.g., $\lbrace x\mapsto [0,0];a\mapsto [3,3]\rbrace$ . In this summary-based approach, clearly editing line 3 does not impact the results for p, as the tabulated summaries are independent of specific callers. Hence, only results in main need to be invalidated. Our challenge lies in adapting the denotational approach to build incremental and demand-driven analyses for arbitrary abstract domains with interactive performance and provable correctness and precision guarantees.

Fig. 3.

2.2 Demanded Abstract Interpretation

In the work of Stein et al. [2021a], incremental and demand-driven analysis is achieved via demanded abstract interpretation graphs, or DAIGs. These structures reify analysis computation in a graph that makes dependencies among analysis results and program statements explicit and support two key analysis operations: queries and edits. A query can be raised for the analysis result at some program point p. The query is answered by computing analysis results at all points backward-reachable from p in the DAIG, which captures all dependencies. Intermediate results are cached to speed up computation of future queries. In response to a program edit, analysis results that are forward-reachable from the edit point(s) in the DAIG (those results dependent on edited statements) are dirtied; these dirtied results may be recomputed in response to a future query.

These techniques—and the “demanded” terminology $^{1}$ —draw heavily on the incremental computation literature and can be seen as specializing traditional graph-based incremental computation techniques to the setting of intraprocedural abstract interpretation [Hammer et al. 2014, 2015; Stein et al. 2021a].

We will avoid delving into the details and complexities of intraprocedural analysis in this article for clarity of presentation and to show that demanded summarization has no fundamental dependency on the specific intraprocedural analysis used to produce summaries.

2.3 Demanded Summarization

Our new approach to incremental, demand-driven, compositional interprocedural analysis is based on demanded summarization graphs (DSGs), which embody three key ideas. First, each denotational procedure summary is computed using a DAIG, which is a node in the DSG. Each individual DAIG is guaranteed to be from-scratch consistent [Stein et al. 2021a]. Second, at a procedure call, a new DAIG summary must be computed on demand; an edge in the DSG tracks where the summary is applied. To ensure from-scratch consistency, a procedure summary is only applied at call sites after computation of that summary has converged. Finally, recursion is handled via an additional fixed-point computation within the recursive procedure, designed carefully to maintain from-scratch consistency.

The key difficulty addressed by the DSG semantics is the interleaving of intraprocedural analysis (within DAIGs) and interprocedural analysis (orchestrated among a collection of DAIGs) in a way that is sound, terminating, and from-scratch consistent.

The dependency structures induced by general interprocedural control flow—wherein procedures may have many recursive calls in arbitrary position, possibly nested in loops, and intermingled with calls to other procedures—require fundamentally different handling than cyclic intraprocedural control flow.

Figure 3 shows how the example of Figure 1 is analyzed using a DSG. Each box in the diagram represents a DAIG, while the blue edges represent interprocedural summary dependencies: A DSG essentially consists of a collection of independent procedure summary DAIGs overlaid by an interprocedural dependency graph.

Assume an initial query for the analysis state at the exit of main when $x = 0$ at main’s entry. Our approach first creates a corresponding DAIG (labeled main @ ${\left\lbrace x \mathbin {\mapsto } [0,0] \right\rbrace }$ ) to perform this analysis. Intraprocedural analysis proceeds until it reaches the callp(n) with fact $\lbrace x\mapsto [0,0];n\mapsto [3,3]\rbrace$ . Handling the call and finishing the analysis ofmain requires first computing a summary forp, so we instantiate a new DAIG forp (at top right of Figure 3, after performing formal/actual parameter binding on the abstract state).

Recursion. Analysis within the DAIG for p with initial fact $\lbrace x\mapsto [0,0];a\mapsto [3,3]\rbrace$ reaches a recursive call p(a) . Here, we recognize the call as recursive and apply widening before instantiating another DAIG over p, yielding initial state $\lbrace x\mapsto [0,0];a\mapsto (-\infty ,3]\rbrace$ (at bottom right of Figure 3). Analysis in this new DAIG once again reaches the recursive callsite, but widening of the entry state has converged to $\lbrace x\mapsto [0,0];a\mapsto (-\infty ,3]\rbrace$ .

Now, we must analyze the control-flow path(s) after the recursive call in the final (self-referential) DAIG, as we need a converged summary before we can propagate back to callers. Further, a fixed point may be required to converge on the exit state.

To obtain a converged exit state, we run a carefully designed fixed-point analysis within the final DAIG, using intraprocedural dirtying to iterate. The process is illustrated step-by-step in Figure 4.

Fig. 4.

It starts by injecting $\bot$ as the post-state of the final recursive call (step (a)), since no dataflow has yet reached the exit. Propagating to the exit yields a fact $\lbrace x\mapsto [5,\infty ];a\mapsto (-\infty ,0]\rbrace$ , due to the join with the non-recursive path. To iterate, we dirty the post-state of the recursive call within the DAIG, update the post-state to this new fact, and then re-query the exit state (step (b)). We continue this process one more time (step (c)) and see the exit state converges to $\lbrace x\mapsto [-1,\infty ];a\mapsto (-\infty ,3]\rbrace$ .

Finally, the fully analyzed rightmost DAIG of Figure 4 can be applied as a summary in the DAIG that initially demanded it (at top right of Figure 3), which can then be fully analyzed and applied as a summary in the main DAIG. We add interprocedural dependency edges to the DSG to track each of these summary applications: the blue edges not marked by a + in Figure 3. Finally, we finish analyzing main and return the query result $\lbrace x\mapsto [-1,-1];n\mapsto [3,3]\rbrace$ .

Incremental Updates. Now, when the edit is made to main, its downstream dependencies (the states marked by ✗ in Figure 3) are dirtied in the main DAIG, but no other DAIG is affected, because the main summary has no dependencies. Suppose another query is issued for the exit abstract state of main : We now reach the p(n) callsite with no applicable summary and so instantiate a new DAIG for p with initial state $\lbrace x\mapsto [0,0];a\mapsto [5,5]\rbrace$ , at bottom left of Figure 3. However, the previously computed summary³ with precondition $\lbrace x\mapsto [0,0];a\mapsto [3,3]\rbrace$ can now be used at the recursive callsite p(a), allowing analysis to complete without recomputing a fixed point over the recursive procedure. Once again, summary applications are tracked by dependency edges to enable precise dirtying in response to future edits.

From-scratch consistency. From-scratch consistency requires computing exactly the same result on an edited program as a batch analysis. Hence, our approach strictly orders computation of analysis results, including integration of summaries into callers, enabling from-scratch consistency for arbitrary domains. The DSG algorithm guarantees that the requested summary for p will only be integrated into main once summary computation has completely converged.

In tabulation analyses like the IFDS algorithm [Reps et al. 1995], intermediate summary results may be propagated to callers before all paths through a procedure are completely analyzed. So, for p, a summary result for the non-recursive path could be propagated to callers before the recursive call is fully analyzed. For an IFDS problem, the variance in propagation ordering does not impact the final analysis result, since the join operation is always set union. However, since our framework supports infinite abstract domains with non-monotonic widening operators (like the interval domain), variance in propagation order could cause differences in the final analysis result.

Demanded Summarization and Compositional Analysis. When a new summary is required to analyze a procedure call, demanded summarization instantiates a new DAIG with the requisite initial abstract state and issues a query for the abstract state at its exit. Within an instantiated DAIG, we can reuse results to respond to future queries and efficiently dirty results in response to program edits. However, instantiated DAIGs cost memory, so a balance must be struck between the granularity of cached analysis results and memory usage.

We can tune this balance by condensing DAIGs to two-state summaries, remembering the entry and exit states—essentially a Hoare triple over the procedure—while discarding all intermediate analysis facts that contributed to said triple. This is formalized in Section 7, where we show that from-scratch consistency is preserved under this transformation.

A demanded summarization-based analysis can also be initialized with procedure summaries computed from a batch compositional analysis, provided that dependencies between summaries are preserved to enable dirtying after edits. This addresses the motivating problem of connecting server-based compositional analysis with interactive analysis in the IDE.

3 Preliminaries: Abstract Interpretation

In this section, we fix syntax, semantics, and terminology for programs and abstract interpreters.

These are intentionally standard and define a typical interface for a batch summary-based batch abstract interpretation engine (see, e.g., Padhye and Khedker [2013] for a similar formulation of the approach, which can be understood as an extension of Reps et al. [1995] to abstract interpretation). By design, all constructions are generic in the underlying abstract domain and statement language and can be instantiated to a wide range of analysis problems and imperative programming languages. A procedure $\langle L, E\rangle$ is a reducible control-flow graph over an unspecified statement language $\mathit {Stmt}$ with procedure call edges of the form $\ell \!-\hspace{-6.00006pt} \mbox{-}\hspace{-1.79993pt} [\texttt {call}\;\rho ]\hspace{-5.0pt}\rightarrow \! \ell ^{\prime }$ , where $\ell$ is the location preceding the call and $\ell ^{\prime }$ is the return site:

A program is a labeled collection of such procedures with a distinguished “main” entry-point.

\begin{equation*} \begin{array}{rcl} \text{procedure labels} & \rho \in \mathit {Label}& \ni \rho _\text{main}\\ \text{programs} & P & : \mathit {Label} \hookrightarrow \mathit {Proc} \\ \end{array} \end{equation*}

We denote by $L^\rho _P$ , $E^\rho _P$ , $\textsf {entry}_P(\rho)$ and $\textsf {exit}_P(\rho)$ , respectively, the program-location set, control-flow edge set, entry location, and exit location of $\rho$ . We also write $\rho _P^\ell$ for the unique $\rho$ containing $\ell$ and $E_P^\ast$ for the set of all control-flow graph (CFG) edges in P and elide P subscripts where they are clear from context. This core language considers direct calls without parameters or return values for ease of presentation only and is not a fundamental limitation. In Section 6, we discuss implementation considerations when applying our approach to Java programs. However, note that our formal framework does not directly model mutual recursion or higher-order control flow.

Concrete Semantics. To define a concrete semantics of this language, we assume a denotational semantics for statements:

\begin{equation*} \begin{array}{rcl} \text{concrete states} & \sigma \in \Sigma & \text{(with initial state $\sigma _0$)}\\ \text{concrete semantics} & \mathopen [\![ \cdot \mathclose ]\!] & :~ \mathit {Stmt}\rightarrow \Sigma \rightarrow \Sigma _\bot \\ \text{concrete stacks} & \kappa \in K & ::= \varepsilon ~\vert ~ e :: \kappa \\ \end{array} \end{equation*}

where the function $\mathopen [\![ \cdot \mathclose ]\!]$ gives a denotational interpretation of non-callsite program statements over concrete program states $\sigma$ (with $\bot$ being an invalid state and $\Sigma _\bot = \Sigma \cup \lbrace \bot \rbrace$ ). Concrete call stacks $\kappa$ are used to keep track of return sites in procedure calls. A state-collecting semantics $\mathopen [\![ \cdot \mathclose ]\!] ^\ast _P \mathrel {:} \mathit {Loc}\rightarrow \mathcal {P}(\Sigma \times K)$ for this language with procedure calls can be defined as a fixed-point over the concrete transfer function $\mathopen [\![ \cdot \mathclose ]\!]$ .

Abstract Semantics. An abstract interpreter for procedures of this language is a tuple $\langle \Sigma ^\sharp ,\varphi _0,\mathopen [\![ \cdot \mathclose ]\!] ^\sharp ,\sqsubseteq ,\sqcup ,\nabla \rangle$ consisting of an abstract domain $\Sigma ^\sharp$ equipped with distinguished initial state $\varphi _0$ , partial order $\sqsubseteq$ , join $\sqcup$ , and widening $\nabla$ , as well as an abstract semantics (i.e., transfer function) $\mathopen [\![ \cdot \mathclose ]\!] ^\sharp$ that interprets statements as functions over abstract states, all subject to the usual soundness conditions. That is, we require that (1) $\Sigma ^\sharp$ forms a semi-lattice under $\sqsubseteq$ with join $\sqcup$ and bottom element $\bot$ , (2) $\nabla$ satisfies the ascending chain condition, and (3) $\varphi _0$ and $\mathopen [\![ \cdot \mathclose ]\!] ^\sharp$ are sound with respect to their concrete counterparts $\sigma _0$ and $\mathopen [\![ \cdot \mathclose ]\!]$ .

A solution to an intraprocedural abstract interpretation problem is a pre-fixed-point of the abstract transfer function over the flow graph $\langle L,E\rangle$ , with the join $\sqcup$ applied at locations in L with in-degree $\ge 2$ and the widening $\nabla$ applied infinitely often on cycles. It is well known that such a solution can be computed by worklist iteration and that such a solution is sound if the transfer function is locally sound [Cousot and Cousot 1977].

Just as we defined the concrete state-collecting semantics $\mathopen [\![ \cdot \mathclose ]\!] ^\ast _P$ for this language with procedure calls as a fixed-point over the concrete transfer function, we can define an abstract state-collecting semantics $\mathopen [\![ \cdot \mathclose ]\!] ^{\sharp \ast }_P$ as a fixed-point over abstract semantic functions. However, such a definition does not correspond to a computable analysis, since the space K of concrete stacks is unbounded. A standard solution to this issue is to apply the call-strings approach [Sharir and Pnueli 1981], abstracting concrete stacks by truncating instead of extending them indiscriminately in the equation for procedure entry locations.

3.1 Intraprocedural Demanded Abstract Interpretation

In this section, we define semantic judgments for intraprocedural demand-driven analysis and incremental edits and make some assumptions about their behavior. This treatment is based on the DAIGs of Stein et al. [2021a] but is intentionally abstract in the underlying details of intraprocedural analysis.

This “black-box” description is meant to facilitate a clear and understandable presentation of interprocedural analysis in the sections to follow and to demonstrate that the demanded summarization algorithm does not depend on specific implementation details of DAIGs and generalizes in principle to other producers of procedure summaries.

A DAIG $\mathcal {D}$ reifies an abstract interpretation computation into a structure that supports interactive program analysis, that is, demand-driven queries and incremental edits [Stein et al. 2021a].

Partial analysis results are stored in reference cells that are assigned unique names $n ::= \underline{\ell }~\vert ~n_1 \mathord {\cdot }n_2~\vert ~\dots$ . The location name $\underline{\ell }$ corresponds to the cell that, when demanded, stores the fixed-point invariant at location $\underline{\ell }$ , and the product name $\underline{\ell }\mathord {\cdot }\underline{\ell }^{\prime }$ stores the statement s (or procedure call $\texttt {call}\;\rho$ ) labeling the control-flow edge from $\ell$ to $\ell ^{\prime }$ . A DAIG is then a directed acyclic hypergraph where reference cells are nodes and where edges are labeled by abstract interpreter operations (e.g., $\mathopen [\![ \cdot \mathclose ]\!] ^\sharp$ , $\sqcup$ , $\nabla$ ) that specify the operation to apply when its output cell is queried. Demanded abstract interpretation is captured by two DAIG-rewriting judgments.

\begin{equation*} \begin{array}{rl} \text{demand-driven queries} & \mathcal {D} \vdash n \Rightarrow \varphi ~; \mathcal {D}^{\prime }\\ \text{incremental edits} & \mathcal {D}\vdash n \Leftarrow v_\varepsilon ~; \mathcal {D}^{\prime }\end{array} \end{equation*}

Queries. The judgment form $\mathcal {D} \vdash n \Rightarrow \varphi ~; \mathcal {D}^{\prime }$ is read as “a query for the abstract state named by n in DAIG $\mathcal {D}$ yields result $\varphi$ by demanded abstract interpretation with updated DAIG $\mathcal {D}^{\prime }$ .” A demanded abstract interpretation [Stein et al. 2021a] computes a result $\varphi$ to store in the cell named by n by evaluating backwards-reachable dependencies of cell n in DAIG $\mathcal {D}$ while unrolling fixed-point computations on demand to maintain the DAIG acyclicity invariant and eventually resulting in $\mathcal {D}^{\prime }$ . We assume that this querying judgment is sound, terminating, and from-scratch consistent with respect to the underlying abstract interpretation.

Edits. The judgment form $\mathcal {D}\vdash n \Leftarrow v_\varepsilon ~; \mathcal {D}^{\prime }$ is read as “an edit that writes value v (or the empty symbol $\varepsilon$ ) to the cell named by n in DAIG $\mathcal {D}$ yields updated DAIG $\mathcal {D}^{\prime }$ with cells depending on n marked as ‘dirty.”’ Demanded abstract interpretation supports incremental edits by eagerly “dirtying” those results forward-reachable from the edit to cell n in DAIG $\mathcal {D}$ (e.g., writing a statement s to a control flow edge from $\ell$ to $\ell ^{\prime }$ is given by $\mathcal {D}\vdash \underline{\ell }\mathord {\cdot }\underline{\ell ^{\prime }} \Leftarrow s ~; \mathcal {D}^{\prime }$ ). We assume that this dirtying judgment is conservative in the sense that all potentially affected abstract states are dirtied. Since incremental dirtying is eager and demand-driven query evaluation is lazy, dirtied results are only recomputed if needed for a future query.

3.2 Summarization and Tabulation

Instead of abstracting concrete stacks and propagating abstract data flow in the resulting interprocedural control-flow graph, an alternative approach is to build compositional summaries for each procedure (i.e., the functional approach [Sharir and Pnueli 1981]). Procedure summaries can be built by defining relational, two-state abstract domains (e.g., Jeannet et al. [2010] and Yorsh et al. [2008]) or by tabulating pairs of abstract states [Reps et al. 1995; Sharir and Pnueli 1981; Padhye and Khedker 2013]. In either case, the goal is to compute procedure summaries $\left\lbrace \varphi \right\rbrace \mathbin {\rho }\left\lbrace \varphi ^{\prime }\right\rbrace$ for each procedure $\rho$ that can be applied at call sites $\ell \!-\hspace{-6.00006pt} \mbox{-}\hspace{-1.79993pt} [\texttt {call}\;\rho ]\hspace{-5.0pt}\rightarrow \! \ell ^{\prime }$ .

A tabulation approach works by propagating intraprocedural dataflow through the procedure control-flow graph using the abstract transfer function $\mathopen [\![ \cdot \mathclose ]\!] ^\sharp$ , join $\sqcup$ , and widen $\nabla$ until it encounters a procedure call, at which point either (1) corresponding procedure summaries have already been computed and can simply be applied, or (2) new summaries are required, so the process continues recursively. Convergence is guaranteed through suitable applications of widening⁴ at recursive calls.

This approach operates over a worklist of $\Sigma ^\sharp \times \mathit {Loc}$ pairs (i.e., a procedure-entry abstract state and a program location) and computes for each encountered $(\varphi ,\ell)$ an abstract state $I(\varphi ,\ell)$ , which is best understood as the post-condition of a Hoare triple with pre-condition $\varphi$ over the valid paths to $\ell$ from the entry of the containing procedure. Thus, when $\ell$ is a procedure exit location $\textsf {exit}(\rho)$ , the abstract state $I(\varphi ,\ell)$ is equivalent to a procedure summary $\left\lbrace \varphi \right\rbrace \mathbin {\rho }\left\lbrace I(\varphi ,\ell)\right\rbrace$ , which can be used to interpret calls to $\rho$ in compatible abstract states.

4 Demanded Summarization

In this section, we describe demanded summarization, which generates procedure summaries on demand using a tabulation-style interprocedural abstract interpretation. Demanded summarization bridges the gap between operational-style demanded abstract interpretation and denotational-style procedure summaries.

One can of course construct a single operational-style dependency structure over the interprocedural control-flow graph that results from connecting call sites $\ell \!-\hspace{-6.00006pt} \mbox{-}\hspace{-1.79993pt} [\texttt {call}\;\rho ]\hspace{-5.0pt}\rightarrow \! \ell ^{\prime }$ to procedures $\rho$ (i.e., with $\ell$ transitioning to $\textsf {entry}(\rho)$ and $\textsf {exit}(\rho)$ to $\ell ^{\prime }$ ) with a suitable context abstraction such as k-limited call strings. However, as illustrated in Section 2, this global dependency structure induced by context-sensitive analysis is overly conservative for programs with good procedural abstraction and thus leads to poor incremental reuse.

Instead, we employ distinct intraprocedural dependency structures for each procedure summary, instantiated on demand. In particular, we assume that an initial DAIG $\mathcal {D}^\textsf {init}_{\rho ,\varphi }$ can be constructed for any procedure $\rho$ with initial abstract state $\varphi$ stored in cell $\underline{\textsf {entry}(\rho)}$ and with otherwise uninitialized abstract-state cells; that is, $\mathcal {D}^\textsf {init}_{\rho ,\varphi }$ reifies the computational structure of analyzing procedure $\rho$ , but with no actual analysis work performed yet beyond filling in the abstract state at $\textsf {entry}(\rho)$ .

Although Stein et al. [2021a] give a mechanism to reify intraprocedural analysis computation, procedure call CFG edges must also be translated somehow. To do so, we extend slightly the DAIG syntax and semantics of Section 3.1 as follows: In Figure 5, we show how a CFG edge $\ell \!-\hspace{-6.00006pt} \mbox{-}\hspace{-1.79993pt} [\texttt {call}\;\rho ]\hspace{-5.0pt}\rightarrow \! \ell ^{\prime }$ is encoded by a new kind of DAIG edge with label $\rho$ connecting the abstract state reference cell $\underline{\ell }$ to that of $\underline{\ell ^{\prime }}$ . This label $\rho$ indicates that the function to compute $\underline{\ell ^{\prime }}$ from $\underline{\ell }$ is described by a summary of procedure $\rho$ , which we can view as a (higher-order) reference to another DAIG for $\rho$ . However, since the intraprocedural DAIG semantics of Stein et al. [2021a] have no means to evaluate such an edge, queries can now get “stuck” with a value for $\underline{\ell }$ but no way to compute a value for $\underline{\ell ^{\prime }}$ .

Fig. 5.

Querying a DAIG $\mathcal {D}$ for the value of a cell n thus may result in either an abstract state to store in n (if no interprocedural analysis is required) or getting stuck and demanding a procedure summary.

To capture this second possibility—when intraprocedural analysis is unable to proceed without some procedure summary—we introduce a judgment form

\begin{equation*} \begin{array}{rl} \text{demanding a procedure summary} & \mathcal {D}\mathbin \vdash n\mathbin {\overset{n^{\prime }}{\rightsquigarrow }}(\rho ,\varphi)\mathbin {;}\mathcal {D}^{\prime }\end{array} \end{equation*}

which indicates that a query for the value named by n in $\mathcal {D}$ is stuck, unable to compute (and store at the return site $n^{\prime }$ ) the result of a call to $\rho$ with abstract state $\varphi$ and where $\mathcal {D}^{\prime }$ reflects any intraprocedural analysis evaluation in $\mathcal {D}$ before it got stuck on the $\rho$ -labeled edge.

Example 4.1.

Figure 5 gives a visual example of this judgment $\mathcal {D}\mathbin \vdash n\mathbin {\overset{\underline{\ell \rlap{{}^{\prime }}}}{\rightsquigarrow }}(\rho ,\varphi)\mathbin {;}\mathcal {D}^{\prime }$ . The middle box, labeled by $\mathcal {D}$ , shows a fragment of an initial DAIG containing a call to some procedure $\rho$ , upon which some analysis state named n transitively depends. A query for the value of n triggers analysis computation (and caching of results) up to the cell $\underline{\ell }$ preceding the call site, at which it gets “stuck,” needing a summary of $\rho$ in initial state $\varphi$ to compute an invariant at $\ell ^{\prime }$ and continue. The partially analyzed state is reflected in DAIG $\mathcal {D}^{\prime }$ , shown in the right box.

In the rest of this section, we introduce demanded summarization graphs (DSGs), which enable these partial computations stuck on demanding a callee summary to get “unstuck.” Evaluation of DSGs interleaves intraprocedural DAIG evaluation with procedure summary synthesis. Procedure summaries are synthesized by instantiating and/or evaluating DAIGs for procedures on demand.

A DSG $\mathcal {G}=\langle \mathcal {D}^\ast ,\Delta \rangle$ consists of a DAIG for each partially or fully computed summary of a procedure $\rho$ with initial abstract state $\varphi$ , which we denote by $\mathcal {D}^\ast (\rho ,\varphi)$ . Thus, there may be multiple instantiated DAIGs for a procedure $\rho$ , each with a different initial abstract state. Crucially for incremental analysis (cf. Section 4.2), it also maintains a demanded summarization dependency map $\Delta$ that records interprocedural analysis dependencies from applying procedure summaries.

A demanded summarization dependency $\delta$ has the form $(\rho^{\prime}, \varphi^{\prime}){\stackrel{n}\hookleftarrow}({\rho},\varphi)$ , which indicates that the return-site abstract state named by n in $\mathcal {D}^\ast (\rho ^{\prime },\varphi ^{\prime })$ depends upon the summary DAIG $\mathcal {D}^\ast (\rho ,\varphi)$ . Note that this is a very fine-grained dependency: Not only do we record that procedure $\rho ^{\prime }$ depends on procedure $\rho$ , but also the relevant precondition states and the specific point n in the analysis of $\rho ^{\prime }$ , which depends on a summary of $\rho$ .

We denote by $\mathcal {D}^{\mathcal {G}}_{\rho ,\varphi }$ the DAIG $\mathcal {D}$ mapped to by $(\rho ,\varphi)$ in DSG $\mathcal {G}$ . We use $\mathcal {G}[\mathcal {D}/(\rho ,\varphi)]$ as shorthand for $\langle \mathcal {D}^\ast [\mathcal {D}/ (\rho ,\varphi)],\Delta \rangle$ , that is, $\mathcal {G}$ with the DAIG summarizing $\rho$ in initial state $\varphi$ updated to $\mathcal {D}$ . Similarly, we denote by $\Delta _{\rho ,\varphi }$ the set of dependencies in $\Delta$ on the procedure- $\rho$ summary with initial state $\varphi$ and use $\mathcal {G}[\delta ]$ as shorthand for $\langle \mathcal {D}^\ast , \Delta ;\delta \rangle$ , that is, $\mathcal {G}$ with an added summary-dependency edge $\delta$ .

In Section 4.1, we define demand-driven query evaluation using the judgment form

\begin{equation*} \begin{array}{rl} \text{demanding an abstract state in a DSG} & \mathcal {G}\mathbin \vdash _{\varphi } \ell \Downarrow \varphi ^{\prime } \mathbin {;} \mathcal {G}^{\prime } \end{array} \end{equation*}

that says, “Given a DSG $\mathcal {G}$ , a query for the abstract state at $\ell$ under pre-condition $\varphi$ for $\ell$ ’s enclosing procedure returns result $\varphi ^{\prime }$ and updated DSG $\mathcal {G}^{\prime }$ .”

Example 4.2 (DSG Queries).

In the example of Figure 3, analyzing the program after the edit corresponds to a derivation of the query evaluation judgment $\mathcal {G}\mathbin \vdash _{\varphi } \ell \Downarrow \varphi ^{\prime } \mathbin {;} \mathcal {G}^{\prime }$ , where:

—

$\mathcal {G}$ is the DSG representing the analysis state immediately after the program edit is processed. $\mathcal {G}$ contains the two previously computed summaries of p on the right side of the figure, and also the partially computed summary of main after dirtying (see Section 3.1) from the edit.

—

$\varphi$ is the initial abstract state, $\varphi ^{\prime }$ the final abstract state, and $\ell$ the exit location for main.

—

$\mathcal {G}^{\prime }$ is the fully evaluated DSG after the query is complete. $\mathcal {G}^{\prime }$ contains both previously cached summaries of p, a now-fully computed summary of main, and the freshly computed summary of p (shown beneath the main summary).

Figure 6 shows a $\mathcal {G}\mathbin \vdash \left\lbrace \varphi \right\rbrace \mathbin {\rho }\left\lbrace \varphi ^{\prime }\right\rbrace \mathbin ;\mathcal {G}^{\prime }$ judgment form that makes explicit our interpretation of DAIG analysis results as composable procedure summaries. In particular, the Summarize rule shows how a composable procedure summary $\left\lbrace \varphi \right\rbrace \mathbin {\rho }\left\lbrace \varphi ^{\prime }\right\rbrace$ (i.e., a Hoare triple) for the procedure $\rho$ is implied by a DSG query result computed at the exit of the corresponding DAIG (i.e., $\mathcal {G}\mathbin \vdash _{\varphi } \textsf {exit}(\rho) \Downarrow \varphi ^{\prime } \mathbin {;} \mathcal {G}^{\prime }$ ). Intuitively, this rule captures reifying the denotational procedure summary through the operational fixed-point computation therein, applying local transfer functions and/or summary transformers.

Fig. 6.

4.1 Demand-driven Query Evaluation in DSGs

Abstract interpretation in DSGs is demand-driven by default: Analysis results are computed only as needed to answer queries. A query requests the abstract state at a specific program location $\ell$ under some procedure-entry precondition $\varphi$ and may be issued directly by a developer through their IDE, programmatically by a client analysis, or internally in service of another query.

4.1.1 Queries.

Queries in DSGs are governed by the $\mathcal {G}\mathbin \vdash _{\varphi } \ell \Downarrow \varphi ^{\prime } \mathbin {;} \mathcal {G}^{\prime }$ judgment form, introduced in the previous section and defined inductively by the Q-* (for “query”) inference rules of Figure 7.

Fig. 7.

We explain the inference rules in turn. The Q-Instantiate rule is used to instantiate a DAIG for a procedure on demand. More specifically, if the current DSG does not have a DAIG for the procedure $\rho ^\ell$ containing $\ell$ with pre-condition $\varphi$ (i.e., $(\rho ^\ell ,\varphi)\not\in \textsf {dom}(\mathcal {D}^\ast)$ ), then Q-Instantiate instantiates a new DAIG $\smash{\mathcal {D}^\textsf {init}_{\rho ^\ell ,\varphi }}$ for $\rho ^\ell$ , with entry abstract state $\varphi$ , and then re-issues the query for location $\ell$ in this extended DSG.

The Q-Delegate rule applies when the relevant DAIG is already available in the DSG and can handle the query on its own, i.e., without getting stuck due to dependence on another procedure summary. In this case, an intra-procedural DAIG query (discussed previously in Section 3.1) suffices to compute the result. Note that if the queried abstract state had previously been computed, then no analysis computation is performed and the analysis state is unchanged.

Finally, the Q-Apply-Summary rule handles the case when a procedure summary must be applied to handle a query. This rule does the heavy lifting of composing analysis results by applying summaries and tracking interprocedural analysis dependencies. Its first premise states that the query-relevant DAIG $\mathcal {D}^{\mathcal {G}}_{\rho ^\ell ,\varphi }$ is unable to produce a result by intraprocedural analysis, as the result transitively depends upon a call to $\rho$ in abstract state $\varphi ^{\prime }$ at the call site. The second premise uses an auxiliary summary query judgment form.

\begin{equation*} \text{computing and applying summaries} \;\; \smash{\mathcal {G}\mathbin \vdash (\rho ,\varphi)\mathbin {\overset{n}{⇜ }}(\rho ^{\prime },\varphi ^{\prime }) \mathbin {;} \mathcal {G}^{\prime }} \; \end{equation*}

It says, “In DSG $\mathcal {G}$ and in procedure $\rho$ with pre-condition $\varphi$ , a query at the return site n is resolved with a procedure summary over $\rho ^{\prime }$ with pre-condition $\varphi ^{\prime }$ , yielding updated $\mathcal {G}^{\prime }$ .” We discuss the rules for handling such summary queries in detail in Section 4.1.2.

Example 4.3 (Summary Queries).

In Figure 3, each interprocedural dependency edge corresponds to a derivation of a summary query judgment $\mathcal {G}\mathbin \vdash (\rho ,\varphi)\mathbin {\overset{n}{⇜ }}(\rho ^{\prime },\varphi ^{\prime }) \mathbin {;} \mathcal {G}^{\prime }$ . These summary queries arise as subderivations of a larger analysis query derivation, where

—

$\mathcal {G}$ is the interprocedural analysis state in which intraprocedural analysis got stuck requiring a procedure summary to continue;

—

$\rho ^{\prime }$ is the callee function and $\varphi ^{\prime }$ the entry abstract state of the summary at the edge’s source;

—

n is the name of the return-site abstract state of the callsite at the edge’s destination;

—

$\rho$ is the caller function and $\varphi$ the entry abstract state of the summary containing that callsite; and

—

$\mathcal {G}^{\prime }$ is the interprocedural analysis state extending $\mathcal {G}$ with any newly computed summaries and the return-site abstract state computed and stored at n.

The summary query judgment in Q-ApplySummary ’s premises demands a summary of the invoked procedure $\rho$ with pre-condition $\varphi ^{\prime }$ , applying it to resolve the post-state n of a call in the summary DAIG indexed by $(\rho ^\ell , \varphi)$ . Once a compatible summary is applied for the call to $\rho$ in the updated DSG $\mathcal {G}^{\prime }$ , the query for location $\ell$ is reissued, just like with Q-Instantiate .

Note the connection between stuck intraprocedural analysis (denoted with $\mathbin {\overset{n}{\rightsquigarrow }}$ ) and interprocedural summary queries (denoted with $\mathbin {\overset{n}{⇜ }}$ ) in Q-Apply-Summary ’s premises—whenever analysis is unable to proceed without some procedure summary, this rule computes or fetches the requisite summary, applies it, and then analysis proceeds through the third premise.

4.1.2 Summary Queries.

The complexity of interprocedural analysis thus lies in computing summaries with the judgment form $\mathcal {G}\mathbin \vdash (\rho ,\varphi)\mathbin {\overset{n}{⇜ }}(\rho ^{\prime },\varphi ^{\prime }) \mathbin {;} \mathcal {G}^{\prime }$ defined inductively with the SQ-* (for “summary query”) rules of Figure 7.

Let us first consider the simpler case of a non-recursive call to a different procedure $\rho ^{\prime }$ from the current caller $\rho$ with an abstract state $\varphi ^{\prime }$ at the call site, handled by the SQ-Other-Proc rule. In the second premise, we compute a Hoare-style summary $\mathcal {G}\mathbin \vdash \left\lbrace \varphi ^{\prime }\right\rbrace \mathbin {\rho ^{\prime }}\left\lbrace \varphi _\textsf {post}\right\rbrace \mathbin ;\mathcal {G}^{\prime }$ of the callee; since these judgment forms are mutually inductively defined, deriving this second premise can involve arbitrary additional analysis computations.

Then, we write $\varphi _\textsf {post}$ to the return site n in the third premise and add the fine-grained demanded summarization dependency $({\rho },{\varphi }){\stackrel{n}\hookleftarrow}({\rho ^{\prime }},{\varphi ^{\prime }})$ to $\mathcal {G}^{\prime \prime }$ , expressing that the value at the return site n now depends on the $\left\lbrace \varphi ^{\prime }\right\rbrace \mathbin {\rho ^{\prime }}\left\lbrace \varphi _\textsf {post}\right\rbrace$ summary. Critically, if the applied summary is ever invalidated by incremental edits, then the return site n must also be invalidated.

The next two rules SQ-Other-Pre and SQ-Self implement an incremental and demand-driven tabulation with (directly⁵) recursive procedures. Note that both rules derive judgments in which a query in $(\rho , \varphi)$ depends on some summary of the same procedure $\rho$ .

In the case where the callsite state $\varphi ^{\prime }$ is not included in the pre-condition of this procedure DAIG $\varphi$ (i.e., $\varphi ^{\prime } \not\sqsubseteq \varphi$ ) in SQ-Other-Pre, we demand another summary of $\rho$ with pre-condition $\varphi \nabla \varphi ^{\prime }$ , which yields post-condition $\varphi _\textsf {post}$ , allowing analysis to proceed in the same manner as in the non-recursive case with SQ-Other-Proc . This summary is guaranteed to be compatible (i.e., $\varphi ^{\prime } \sqsubseteq \varphi \nabla \varphi ^{\prime }$ ), but we need to apply widening in the abstract pre-condition to ensure that this iterative demanding of new summaries for procedure $\rho$ converges. Intuitively, applying SQ-Other-Pre corresponds to a demanded unrolling of recursive calls a finite (but a priori unbounded) number of times determined by the widening operator $\nabla$ .

In the other case (i.e., the SQ-Self rule), the summary of procedure $\rho$ that we need is the same one that we are currently demanding—a self-referential summary . That is, the recursive-call–site state $\varphi ^{\prime }$ is contained in the pre-condition $\varphi$ of this summary that is currently being demanded (i.e., $\varphi ^{\prime } \sqsubseteq \varphi$ ). When a procedure summary depends on itself, some special care must be taken to compute a fixed point along the control-flow cycle(s) formed between the procedure exit and recursive return site(s), as illustrated in Figure 4. This fixed-point computation is implemented by a judgment form for

\begin{equation*} \text{demanding a self-referential summary} \;\; \mathcal {G}\vdash \textsf {fix}(\rho ,\varphi);\mathcal {G}^{\prime }\; \end{equation*}

which says, “In DSG $\mathcal {G}$ , the fixed point of a self-referential summary of procedure $\rho$ with pre-condition $\varphi$ yields an updated DSG $\mathcal {G}^{\prime }$ .”

Note that the summarization, abstract state query, summary resolution, and self-referential summary fixed-point iteration judgments are mutually recursively defined, so any number of summaries can be produced, cached, and/or memo-matched upon in the process.

4.1.3 Fixed Points.

The SQ-Self rule initializes this fixed-point iteration, which is then implemented by the F-* (for “fixed point”) rules. Since analysis has yet to reach the procedure exit, we initialize the call’s return state at n with $\bot$ in the second premise and then demand a fixed-point for this self-referential summary $(\rho , \varphi)$ in the third. Note that this self-dependency is made explicit with the self-referential demanded summarization dependency $({\rho },{\varphi }){\stackrel{n}\hookleftarrow}({\rho },{\varphi})$ extending $\mathcal {G}$ .

The fixed-point computation of a self-referential procedure summary can now be described using the machinery defined to this point. At each step, we demand a summary for procedure $\rho$ with pre-condition $\varphi$ , that is, $\mathcal {G}\mathbin \vdash \left\lbrace \varphi \right\rbrace \mathbin {\rho }\left\lbrace \varphi ^{\prime }\right\rbrace \mathbin ;\mathcal {G}^{\prime }$ . The shorthand $\textsf {R}_{\rho ,\varphi }(\mathcal {G}^{\prime })$ then yields the set of recursive return sites whose abstract state is not over-approximated by the procedure-exit state $\varphi ^{\prime }$ in the procedure DAIG indexed by $(\rho ,\varphi)$ . When this set is empty ( F-Converge ), the fixed-point iteration has converged. Otherwise, some return sites $n_i$ have abstract state $\varphi _i$ not over-approximated by the procedure-exit state $\varphi ^{\prime }$ , so F-Step applies, widening $\varphi ^{\prime }$ onto each return-site state before taking another step in the fixed-point iteration. Importantly, the widened state $\varphi _i \nabla \varphi ^{\prime }$ is written to cell n using the dirtying-on-an-edit judgment, meaning forward-reachable nodes from n in the procedure DAIG get dirtied.

Example 4.4 (Fixed-points).

As a concrete example, such a fixed-point computation is instantiated in Figure 4, showing the steps taken to produce a procedure summary of p using Summarize .

In the initial (left-most) state, Q-Apply-Summary is needed, because intraprocedural analysis is stuck at a callsite. Then, each transition in the figure applies the DSG evaluation semantics as follows:

(a)

To satisfy Q-Apply-Summary ’s second antecedent—i.e., to apply a procedure summary—we must derive one of the SQ-* rules; in this case, SQ-Self applies, since the callee and caller are the same and the callsite abstract state is smaller than the caller initial state. SQ-Self writes $\bot$ to the return site, then continues analysis (via the summarization judgment in the first antecedent of both F-* rules) to the procedure exit to derive a fixed-point judgment.

(b)

Since the analysis has not yet converged (i.e., $R_{\rho ,\varphi }(\mathcal {G}^{\prime })$ is non-empty, since the $\bot$ abstract state at the return site is smaller than the exit state), F-Step applies. The exit state is widened into the return site, and another fixed-point judgment must be derived to satisfy the final antecedent so analysis runs to the procedure exit as before.

(c)

Since analysis still has not converged, F-Step applies again, analogously to step (b). At this point, the return site abstract state is larger than the exit state, so F-Converge applies and the derivation is complete.

Note that the “steps” here are not sequential composition in the style of small-step operational semantics but rather nested derivations in the style of big-step operational semantics; each step is a sub-derivation of the prior using the semantics of Figure 7.

This example demonstrates the interplay between the three judgment forms governing analysis queries in DSGs: To summarize a procedure, we query for the procedure-exit abstract state; to get to the procedure exit, we must derive procedure summaries; when those summaries are self-referential, we must derive a fixed-point.

Note that these interleavings can be far more complex in general: Some (possibly recursive) callee procedure summaries may be computed as further subderivations of each judgment form.

4.2 Incremental Edits in DSGs

When the program under analysis is edited, a DSG must discard those analysis results that are potentially affected while retaining as many cached results as possible for future incremental reuse. This operation is given by an operational semantics over analysis states $\mathcal {G}$ , just as with demand-driven query evaluation. The judgment form $\mathcal {G} \mathbin \vdash _{\rho } n\Leftarrow s \mathbin ; \mathcal {G}^{\prime }$ of Figure 8 defines the impact of an edit that modifies the statement named by n in procedure $\rho$ , discarding facts from DSG $\mathcal {G}$ to yield $\mathcal {G}^{\prime }$ . Note that although this judgment as written only applies to CFG-structure-preserving statement modifications, the extension to statement insertions and deletions is straightforward: We insert or remove the DAIG region corresponding to the edit, then dirty $\mathcal {G}$ from its exit (merging the entry and exit location of deleted regions).

Fig. 8.

It is also important to notice that no special handling of edits involving procedure calls is needed—the transitive dirtying operation is based on the summary dependencies of the pre-edit program, as recorded in $\Delta$ , and any relevant changes to the program’s call structure will be reflected in summary dependencies the next time the procedure is analyzed. This is not just an optimization; it is crucial for correct and from-scratch consistent demanded analysis.

The top-level program edit rule E-Delegate is required, since there may be multiple extant summary DAIGs for the edited procedure $\rho$ , each with its own entry abstract state. E-Delegate simply delegates the dirtying of each such DAIG to the D-* rules in Figure 8, using judgment form $\mathcal {G} \mathbin \vdash _{\rho ,\varphi }n\Leftarrow s_\varepsilon \mathbin {;}\mathcal {G}^{\prime }$ to dirty a specific DAIG.

Rule D-DemandedSummaries is the inductive case for dirtying demanded summarization dependencies: When a fact named by $n^{\prime }$ in the $\rho ^{\prime },\varphi ^{\prime }$ DAIG depends upon the $\rho ,\varphi$ DAIG we are currently dirtying, we drop the dependency edge, dirty from $n^{\prime }$ in the $\rho ^{\prime },\varphi ^{\prime }$ DAIG, and continue recursively. Once all demanded summarization dependencies have been processed in that manner, the D-DAIG base case can be applied, dirtying any affected analysis results in the relevant DAIG (and their dependency edges $\Delta ^{\prime }$ ).

Formalizing incremental and demand-driven interprocedural analysis in this manner exposes questions of what is minimal dirtying and maximal incremental reuse. The analysis semantics we have described and implemented in this section dirty all (transitive) callers of any edited procedure, as those analysis results depend on a summary that has been dirtied. This is optimal in the sense that it dirties exactly those results potentially invalidated by an edit, and our query semantics are optimal in the sense that they lazily compute only what is needed to resolve a query.

However, those callers depend on the summary triple only, not the underlying procedure implementation. As such, it is possible that by re-demanding the exit location of the callee, a DSG may discover that a particular edit does not change the summary post-condition and thus dirtying need not propagate to callers. However, this eager re-demanding may not have been needed to respond to any future query.

5 From-scratch Consistency

We now define and give proof sketches for several critical meta-theoretic properties of demanded summarization graphs, namely, termination of queries and edits, from-scratch consistency with the underlying batch abstract interpreter, and soundness with respect to the concrete semantics. Proofs are elided here but can be found in Appendix A.

Definition 5.1 (DSG Semantic Consistency).

We say that a DSG $\mathcal {G}=\langle \mathcal {D}^\ast ,\Delta \rangle$ is semantically consistent when it is syntactically well formed and consistent with the program structure and underlying abstract interpretation semantics.

—

Each constituent DAIG $\mathcal {D}^\ast (\rho ,\varphi)$ is well formed and consistent with the corresponding procedure CFG $P(\rho)$ and intraprocedural abstract semantics $\langle \Sigma ^\sharp ,\varphi ,\mathopen [\![ \cdot \mathclose ]\!] ^\sharp ,\sqsubseteq ,\sqcup ,\nabla \rangle$ .

—

Each return-site abstract state has a corresponding dependency edge from its callee in $\Delta$ . That is, if n names a non-empty ref cell in some $\mathcal {D}=\mathcal {D}^\ast (\rho ,\varphi)$ with a $\rho ^{\prime }$ -labeled edge to n from $n^{\prime }$ , then either

–

$(\rho, \varphi){\stackrel{n}\hookleftarrow}({\rho ^{\prime }},{\mathcal {D}(n^{\prime }))}\in \Delta$ (when $\rho \ne \rho ^{\prime }$ ) or

–

$(\rho, \varphi){\stackrel{n}\hookleftarrow}({\rho ^{\prime }},\varphi\nabla{\mathcal {D}(n^{\prime }))}\in \Delta$ (when $\rho =\rho ^{\prime }$ ).

—

Dependency edges $(\rho^{\prime}, \varphi^{\prime}){\stackrel{n}\hookleftarrow}({\rho},\varphi)$ in $\Delta$ are consistent with the relevant DAIGs, in that

–

$\mathcal {D}^\ast (\rho ^{\prime },\varphi ^{\prime })(n) = \mathcal {D}^\ast (\rho ,\varphi)(\textsf {exit}(\rho))$ (when $\rho \ne \rho ^{\prime }$ ) or

–

$\mathcal {D}^\ast (\rho ^{\prime },\varphi ^{\prime })(n) \sqsupseteq \mathcal {D}^\ast (\rho ,\varphi)(\textsf {exit}(\rho))$ (when $\rho =\rho ^{\prime }$ ).

—

Analysis results in DAIGs are equal to the corresponding invariants produced by batch tabulation, where the domains coincide: for all $\mathcal {D}^\ast (\rho ,\varphi)(\underline{\ell })$ where $(\varphi ,\ell)\in \textsf {dom}(I)$ , $\smash{\mathcal {D}^\mathcal {G}_{\rho ,\varphi }(\underline{\ell })} = I(\varphi ,\ell)$ .

Lemma 5.1 (Consistency Preservation).

If $\mathcal {G}$ is semantically consistent (with respect to a program P), then:

if $\mathcal {G}\mathbin \vdash _{\varphi } \ell \Downarrow \varphi ^{\prime } \mathbin {;} \mathcal {G}^{\prime }$ , then $\mathcal {G}^{\prime }$ is semantically consistent, and

if $\mathcal {G} \mathbin \vdash _{\rho } n\Leftarrow s \mathbin ; \mathcal {G}^{\prime }$ , then $\mathcal {G}^{\prime }$ is semantically consistent (with respect to the edited version of P).

Preservation of each of the four conditions of semantic consistency can be shown by induction on the derivations of the query and edit judgments and ensures that the theorems to follow are applicable to the sequences of queries and edits made during an interactive analysis session.

Theorem 5.2 (Termination).

Queries and edits terminate:

—

For all $\varphi$ , $\ell \in L$ , and consistent $\mathcal {G}$ , there exist $\varphi ^{\prime }$ and $\mathcal {G}^{\prime }$ such that $\mathcal {G}\mathbin \vdash _{\varphi } \ell \Downarrow \varphi ^{\prime } \mathbin {;} \mathcal {G}^{\prime }$ .

—

For all s, n, $\rho$ , and consistent $\mathcal {G}$ where n names a CFG edge in $E^\rho$ , there exists a $\mathcal {G}^{\prime }$ such that $\mathcal {G} \mathbin \vdash _{\rho } n\Leftarrow s \mathbin ; \mathcal {G}^{\prime }$ .

In other words, any query or edit against a consistent DSG has a finite corresponding big-step evaluation. The edit-termination condition (the second bullet point) is straightforward: $\Delta$ and k are both finite in E-Delegate because $\mathcal {G}$ is consistent, and each of its premises terminates because the only recursive dirtying rule D-DemandedSummaries decreases in $\Delta$ .

The query-termination condition is more complicated but can be shown by induction on the number of transitive callees of the procedure $\rho ^\ell$ in which the query is issued, which decreases at all non-recursive calls. At recursive calls, termination relies on the convergence of the underlying abstract interpreter’s widening operator.

Theorem 5.3 (From-Scratch Consistency).

Query results are equal to the corresponding invariant computed by tabulation: If $\mathcal {G}$ is semantically consistent, $\mathcal {G}\mathbin \vdash _{\varphi } \ell \Downarrow \varphi ^{\prime } \mathbin {;} \mathcal {G}^{\prime }$ , and $(\varphi ,\ell)\in {\rm \textsf {dom}}(I),$ then $\varphi ^{\prime }=I(\varphi ,\ell)$ .

Intuitively, query results are from-scratch consistent due to (1) the from-scratch consistency of intraprocedural analysis in DAIGs, and (2) the fact that summary triples derived by $\mathcal {G}\mathbin \vdash \left\lbrace \varphi \right\rbrace \mathbin {\rho }\left\lbrace \varphi ^{\prime }\right\rbrace \mathbin ;\mathcal {G}^{\prime }$ are identical to the summary edges produced by a batch tabulation and are applied at call sites only when their pre-condition matches the caller abstract state.

Corollary 5.4 (Soundness).

Query results are sound with respect to the concrete semantics:

If $\mathcal {G}\mathbin \vdash _{\!} \ell \Downarrow \varphi \mathbin {;} \mathcal {G}^{\prime }$ and $\mathcal {G}$ is semantically consistent, then $\sigma \models \varphi$ for all $\sigma \in \mathopen [\![ \ell \mathclose ]\!] ^\ast$ .

Soundness is a strictly weaker condition than from-scratch consistency: Since demanded query results are the same as would be computed from-scratch by the batch analysis that is sound, so, too, are demanded query results.

6 Implementation & Evaluation

In this section, we describe a prototype analysis framework based on DSGs and evaluate it on synthetically generated benchmarks as well as a corpus of bug-fix program edits drawn from open-source Java applications.

A core challenge in empirically evaluating an incremental or demand-driven analysis is the dearth of publicly available data on real-world program edits and analysis queries.

Although some previous research on incremental and demand-driven program analysis has demonstrated considerable performance benefits, it has typically been applied to carefully restricted programming languages or synthetically generated program edits [Stein et al. 2021a; Szabó et al. 2021].⁶ Our aim is to show that demanded summarization can provide comparable performance benefits while also supporting interprocedural analysis, procedure summarization, recursion, and the associated complexities.

However, some existing work [Arzt and Bodden 2014; Erhard et al. 2022] has been evaluated on open-source commit histories, providing stronger evidence of applicability to real edits but making it significantly more difficult to meaningfully evaluate interactive performance, since there are fewer data points and the edits are of much coarser granularity.

Our Approach. We aim to cover both bases using two distinct sets of benchmarks: A synthetic corpus (Section 6.2) is designed to evaluate the scalability and efficacy of the demanded summarization algorithm, while an open-source corpus (Section 6.3) demonstrates the generalizability of these results to more realistic programs and edits and the practicability of incremental analysis infrastructure. Taken together, these experiments demonstrate the promise of demanded summarization as a framework for reliable interactive static analysis tools.

Since this article’s contribution is a domain-generic framework, we choose standard abstract domains throughout this section; our aim is to evaluate the scalability and generalizability of the technique rather than the particulars of any given abstract domain. Nonetheless, we instantiate the framework with several different domains—including some with infinite height and non-monotonic widening operators—throughout the evaluation to demonstrate its genericity.

6.1 Implementation

We implemented a prototype analysis framework in approximately 7,000 lines of OCaml code, split roughly evenly between frontend infrastructure and analysis logic. The framework’s frontend, intermediate representations, and core analysis engine are all designed to support incremental edits and demand queries. The implementation and experimental setup are publicly available on Github [Stein et al. 2021b].

Frontend. We use the tree-sitter incremental parsing library to interpret source-level changes at the granularity of concrete syntax tree nodes [Brunsfeld 2021]. In practice, we consider program edits that add or delete procedures or modify their headers; add, modify, or delete statements; and modify the headers of loops and conditionals. These concrete syntax tree edit scripts can then be interpreted in-place on our control-flow graph IR and analysis data structures.

To resolve virtual calls, we rely on an upfront application-only callgraph computed using the WALA [2021] analysis library. A virtual call with multiple potential targets is interpreted as non-deterministic choice over direct calls to each target.

WALA does not provide incremental callgraphs, but techniques (e.g., Schubert et al. [2021]) exist for incremental callgraph construction, which we could adopt in the future. Our use of an upfront callgraph is meant simply to exclude virtual call resolution issues from these experiments, since it is largely orthogonal to this article’s core contributions on abstract interpretation and dataflow analysis.

Analysis. The core analysis engine of our framework is a fairly direct implementation of the demanded summarization graph $\mathcal {G}$ : We keep a map from procedure identifiers and procedure-entry abstract states to DAIGs, which we issue queries against, instantiate, and produce summaries for as needed to respond to extrinsically provided queries.

Our intermediate representation of programs is similar to the control-flow graph language P of Section 3, the main difference being support of formal/actual parameter binding and variable scoping. These details are elided from the formalism for the sake of simplicity and clarity but can in practice be handled using standard techniques without any major change to the underlying system as formalized in this article.

The analysis engine is domain-agnostic: It is parameterized over an abstract domain module that provides standard abstract domain operations (essentially $\langle \Sigma ^\sharp ,\varphi _0,\mathopen [\![ \cdot \mathclose ]\!] ^\sharp ,\sqsubseteq ,\sqcup ,\nabla \rangle$ ) but can be implemented in a general-purpose language and needs not be aware of incrementality or demand.

We have instantiated the implementation with interval and octagon domains based on the APRON numerical domain library [Jeannet and Miné 2009] and a simple list-segment shape analysis domain, all of which are of infinite height and have non-monotonic widenings. To better exercise our analysis on object-oriented programs in Section 6.3, we also implemented a finite domain tracking nullability of variables and heap addresses using allocation sites to abstract memory locations.

Throughout this section, we consider four variants of the demanded summarization algorithm to re-analyze each program after an edit:

—

Batch analysis, wherein the program is analyzed exhaustively from scratch when the program changes;

—

Incremental analysis, which analyzes the program exhaustively and eagerly when the program changes but reuses results from the previous version where possible;

—

Demand-driven analysis, which computes only those summaries that are required to respond to a client-issued query but discards all analysis state when the program changes; and

—

Demanded (i.e., incremental and demand-driven) analysis, which both reuses previous-version results and also computes only those summaries that are required to respond to queries.

The four variants amount to configuration options in our analysis framework: Essentially, the incremental analyses apply the edit semantics of Figure 8, while the demand-driven analyses apply the query semantics of Figure 7.

6.2 Scalability: Synthetic Benchmarks

To evaluate the scalability and responsiveness of the demanded summarization approach, we analyze a synthetic workload of program edits and queries in each of the four configurations described above.

We instantiate the analysis framework for these experiments with an octagon domain backed by the APRON library [Jeannet and Miné 2009]; octagons are a popular relational numerical abstract domain, and APRON is a widely used implementation and thus has performance characteristics representative of many real-world domains. Furthermore, the infinite height lattice and non-monotonic widening operator of the octagon domain render it out of the reach of most existing incremental or demand-driven analysis techniques.

Programs are generated in an imperative language with arithmetic, Booleans, conditional branching, while-loops, and recursive (but not mutually recursive) procedure calls. Each edit selects a random program location at which to add a loop, procedure call, conditional, or statement with probability 5%, 10%, 10%, and 75%, respectively; statements and expressions are generated probabilistically from their respective grammars, and procedure call edits introduce a new procedure as their target 5% of the time.

These hyperparameters are chosen to simulate the structure of typical programs and edits thereof, allowing us to evaluate the performance characteristics of each analysis configuration with much more data and at a larger scale than is possible with real code and human subjects. Despite our efforts to produce realistically structured synthetic programs, possible disparity with real-world control-flow structure is an inevitable threat to validity in these experiments. However, we argue that the statement-language abstract semantics and domains are less critical, acting as a constant factor throughout but not affecting the bottom-line comparisons between analysis configurations.

We analyze eight separate trials of 2,500 edits in each configuration, beginning with a program consisting only of a no-op main procedure and seeding the random number generators such that the same edits are analyzed by each analysis variant. Given the probabilistic grammar described above, this produces programs that are roughly 3,000 significant lines of code on average at the end of each trial. In the demand-driven configurations, we issue three random queries between each edit.

The results are shown in Figure 9 in the form of a continuous distribution plot and a table of summary statistics. Both incremental analysis and demand-driven analysis offer a significant improvement over the batch analysis baseline, both in terms of their average analysis cost and worst-case runtime at the 95th and 99th percentiles. However, demanded summarization, by combining incremental and demand-driven techniques, yields analysis results in 2.7 seconds on average and significantly mitigates the worst-case analysis costs at the tail of the distribution.

Fig. 9.

Of course, this performance improvement comes at the cost of an increased memory footprint. Profiling the memory usage of each configuration across the experiments, we find sub-linear increases for batch and demand-driven analysis, up to 450 and 347 MB, respectively after 2,500 edit/analysis iterations. Since both configurations discard all analysis results every iteration, this total memory footprint is roughly what is needed to analyze the full program (batch) and the transitive dependencies of a random query (demand-driven).

However, both the incremental and demanded configurations consume memory approximately linear in the size of the edit history, growing to 11.1 and 24.1 GB, respectively, after 2,500 edit/analysis iterations. Though this is too large for most local development environments, it is feasible in a CI environment and could be tuned to hardware constraints using the memory-saving optimizations described in Section 7.2 without sacrificing the formal guarantees of demanded summarization.

As a whole, these experiments demonstrate that the combination of incremental and demand-driven techniques leveraged by demanded summarization is well suited to interactive use, even in this setting of tabulation-based interprocedural analysis over recursive procedures.

6.3 Generalizability: Open-source Corpus

We run our prototype analysis framework on a subset of the BugSwarm dataset, which consists of program pairs drawn from open-source applications and their continuous integration histories [Tomassi et al. 2019]. Each program pair consists of a “fail” version in which a CI failure was observed and a subsequent “pass” version in which the failure is corrected.

We restrict our attention to pairs where the edit is in application code (rather than configuration or tests), affects between 1 and 200 lines of code, and the failure is not a compilation failure. On average, each program is 45 kLOC and consists of around 5,000 distinct procedures. They make extensive use of Java language features including exceptions and a wide variety of control-flow mechanisms, but we do not consider programs that make use of lambda expressions or method references.

Table 1 shows the result of analyzing 10 Java program pairs from the BugSwarm dataset using the interval analysis and nullability analysis described in Section 6.1. The interval abstract domain is widely used in practice for numerical analysis problems but is difficult to handle incrementally due to its infinite height and non-monotonic widening operator, while the nullability abstract domain is finite but more well suited to the analysis of the object oriented programs found in the dataset. For the demand-driven configurations, we select query locations that correspond to the failures observed in the initial program version or as near as possible in our internal representation. These demand queries aim to emulate the natural use-case of querying at the alarm location after making an edit to address the alarm, for example, to enable interactive and responsive IDE features built on top of an abstract interpreter.

Table 1.

artifact	kLOC	eLOC	\|CG\|	Domain	# $\varphi$	(s)	% $\varphi$	(s)	% $\varphi$	(s)	% $\varphi$	(s)
					Batch		Incremental		Dem.-driven		Demanded
2021a	32.3	4	1,233	Itv.	7,029	0.20	12.5	0.02	28.0	0.10	0.2	<0.01
				Oct.	6,441	0.31	4.7	0.02	24.4	0.11	7.3	0.11
				Null	13,972	0.20	7.6	0.01	16.6	0.05	1.0	<0.01
2021b	147.6	56	2,920	Itv.	12,199	0.25	1.1	0.01	0.3	<0.01	0.0	<0.01
				Oct.	12,495	0.40	1.0	0.01	0.3	<0.01	0.0	<0.01
				Null	19,558	0.24	0.7	<0.01	0.2	<0.01	0.0	<0.01
2021c	15.8	4	1,599	Itv.	17,084	0.40	0.2	<0.01	48.7	0.27	<0.1	<0.01
				Oct.	17,165	0.74	0.3	0.03	48.7	0.54	0.2	0.01
				Null	88,160	2.61	<0.1	0.01	31.2	0.84	<0.1	<0.01
2021d	45.7	134	3,866	Itv.	35,395	1.35	<0.1	0.02	0.0	<0.01	0.0	<0.01
				Oct.	37,257	3.20	<0.1	0.03	0.0	<0.01	0.0	<0.01
				Null	348,760	16.59	<0.1	<0.01	0.0	<0.01	0.0	<0.01
2021e	36.2	8	7,379	Itv.	73,322	6.88	<0.1	0.03	3.4	0.09	<0.1	<0.01
				Oct.	74,026	29.86	0.1	0.12	3.2	0.17	<0.1	<0.01
				Null	542,853	29.01	<0.1	0.04	0.6	0.08	<0.1	<0.01
2021f	39.4	4	8,212	Itv.	77,954	6.94	0.1	0.04	5.6	0.23	0.0	<0.01
				Oct.	79,909	40.63	0.1	0.06	3.2	0.18	0.0	0.01
				Null	622,602	33.71	1.5	5.91	0.8	0.12	<0.1	0.02
2021g	39.6	8	8,260	Itv.	78,906	6.57	<0.1	0.03	2.4	0.07	<0.1	<0.01
				Oct.	80,877	28.62	<0.1	0.04	2.4	0.12	<0.1	<0.01
				Null	627,081	33.69	<0.1	0.01	0.5	0.07	<0.1	<0.01
2021h	63.8	4	10,048	Itv.	100,112	6.70	17.9	1.12	63.9	5.25	0.6	0.18
				Oct.	97,850	24.87	19.5	4.29	62.7	18.98	18.9	4.80
				Null	1,533,103	88.17	16.1	23.69	87.2	73.13	14.7	23.11
2021i	23.5	4	2,559	Itv.	23,377	0.73	0.4	0.01	2.3	0.02	0.3	<0.01
				Oct.	22,982	2.90	0.4	0.02	2.4	0.03	0.3	<0.01
				Null	85,782	1.79	0.2	<0.01	0.7	0.01	<0.1	<0.01
2021j	15.2	20	3,558	Itv.	44,604	3.70	0.1	0.02	0.0	<0.01	0.0	<0.01
				Oct.	39,856	8.0	0.1	0.03	0.0	<0.01	0.0	<0.01
				Null	96,125	2.49	<0.1	<0.01	0.0	<0.01	0.0	<0.01
					Batch		Incremental		Dem.-driven		Demanded
average	45.9	25	4,963	Itv.	46,998	3.37	3.2	0.13	15.5	0.60	0.1	0.02
				Oct.	46,886	13.95	2.6	0.47	14.7	2.02	2.7	0.50
				Null	397,800	20.85	2.6	2.97	13.8	7.43	1.6	2.31
average peak memory usage (MB)					667.6		693.1		233.1		252.4

Table 1. Statistics about programs, edits, and analysis thereof in the open-source corpus, showing the relative degrees of reuse that are achieved by each analysis configuration

Artifacts refer to BugSwarm program pairs, for which we report final program size (in kLOC), program edit size (eLOC), and application-only callgraph size (|CG|). We analyze each program pair in three domains: the interval (Itv.), octagon (Oct.), and nullability domains (N) described in Section 6.1. Then, we report for each analysis configuration the number of abstract states computed during reanalysis after applying the edit, in raw terms ( $\#\varphi$ ) for batch analysis and as a percentage ( $\%\varphi$ ) of the batch analysis baseline for each other configuration, as well as the amount of time required. Last, we report the average peak memory usage of the analysis for each analysis configuration.

Note that we do not purport to statically identify each of the failures and prove their fixes correct; the bugs are varied and complex, and the design of domains to reason precisely about their semantics is outside the scope of this article. We use the BugSwarm corpus as a source of real-world bug-fixing edits, investigating the potential analysis cost savings that can be realized by demanded summarization in realistically structured programs under actual patterns of queries and edits.

The analysis results are shown in Table 1, which presents symbolic metrics of analysis cost in terms of abstract states computed during reanalysis, in addition to wall-clock time and average peak memory usage.

We see that batch analysis requires computing tens to hundreds of thousands of abstract states for each program edit. Both incremental analysis and demand-driven analysis represent significant improvements over the batch approach but exhibit high variability: Depending on the location and relative proximity of queries, edits, and their dependencies, both approaches can incur significant analysis cost, averaging around 3% and 15%, respectively, of that required for batch analysis.

Demanded summarization, which combines aspects of incremental and demand-driven analysis, consistently requires only <1% as much analysis work as the batch baseline, although there are some outliers where the relative locations of the program edits and demand queries require performing a larger fraction of the analysis work. Notably, the edit and fix locations in one artifact [2021h] cause the demanded configuration to perform comparably to the non-demand-driven incremental configuration, since the bug fix location depends on a large fraction of the program under analysis.

The peak memory usage of both non-demand-driven configurations is significantly higher than both demand-driven configurations, since they require analyzing a smaller fraction of the program, while adding incrementality increases peak memory usage by 4%–8%. This degree of overhead is approximately as expected: Incrementality allows some summaries to be persisted that would otherwise be dropped, but the overhead is small when looking at a single edit. Note also that an optimized batch analysis framework would use less memory than our batch configuration; nonetheless, the memory footprint of all of these configurations is sufficiently small to fit on modern consumer hardware.

The consistent trends in symbolic metrics across these three different domains is an indication that the rough proportion of reuse and analysis cost savings can generalize when the DSG framework is instantiated with different abstract domains.

7 Summary Tabulation & Weakening

The previous sections describe demanded summarization : a framework for incremental and demand-driven abstract interpretation of recursive interprocedural programs in arbitrary abstract domains.

However, the algorithm and formalism described to this point present some opportunities for optimization. This section addresses some practical concerns that will arise in the implementation of a future analysis framework based on the theory of demanded summarization. Note that the techniques formalized in the remainder of this section are not applied in the implementation of our experimental evaluation (Section 6).

First, since our approach makes extensive use of caching and memoization, it must inevitably confront memory limitations. However, there is no means to do so in the operational semantics of Section 4. We develop and formalize in Section 7.1 some techniques to gracefully handle memory pressure in DSGs by discarding intermediate analysis state and keeping input/output procedure summaries, and we show that this extension preserves the metatheoretic results of Section 5.

Then, though we fix a policy of maximizing precision in the previous sections to ensure from-scratch consistency, it may be desirable in practice to weaken or merge summaries during analysis to maximize incremental reuse. We formalize this notion in Section 7.2 and show that while it clearly violates from-scratch consistency, it does preserve soundness and termination.

These extensions to the semantics of Section 4 correspond to well-known optimizations that are more or less standard practice in batch summary-based abstract interpreters (e.g., Padhye and Khedker [2013]; Calcagno and Distefano [2011]; Fähndrich and Logozzo [2010]). The purpose of this section is to relate them to our formalism and show how they preserve or weaken the metatheoretic results of Section 5, thus providing tunable parameters in the design space of practical interactive analysis tools based on demanded summarization.

We also discuss some subtleties of interprocedural dependency structures highlighted by these changes.

7.1 Summary Tabulation for Memory Pressure

Given infinite memory, an incremental analysis could simply store all previously computed results forever, keeping them on hand in case they are ever needed. This is essentially what is formalized in Section 4. DAIGs are instantiated and added to $\mathcal {D}^\ast$ by Q-Instantiate and, though their contents are dirtied in response to edits by the semantics of Figure 8, they are never disposed of themselves. This design keeps the presentation simple, but it is infeasible in practice. This section extends the DSG formalism to explicitly account for memory-conserving operations, showing that soundness, termination, and from-scratch consistency are preserved under the extension.

We introduce a judgment form $\mathcal {G}\dashrightarrow \mathcal {G}^{\prime }$ to describe transformations of analysis state that can be applied at will by an analyzer. It is possible of course to encode these operations as extensions to the query evaluation ( $\mathcal {G}\mathbin \vdash _{\varphi } \ell \Downarrow \varphi ^{\prime } \mathbin {;} \mathcal {G}^{\prime }$ ) or edit ( $\mathcal {G} \mathbin \vdash _{\rho } n\Leftarrow s \mathbin ; \mathcal {G}^{\prime }$ ) semantics, but such an approach ties the memory-freeing operations to analysis client interactions and complicates proofs of termination (by admitting infinite derivations that alternate between computing and discarding analysis facts).

The most direct such transformation is simply to discard procedure summary DAIGs that are not depended upon by others; this is, however, more heavy-handed than desired in most cases. In particular, note that DAIGs contain cached intermediate results at the granularity of individual semantic functions, but summary application in DSGs operates at the granularity of procedures.

Thus, instead of discarding an entire DAIG, an analyzer may exploit the fact that only its entry and exit abstract states are needed to summarize calls, by caching those two states and discarding any intermediate results.

We can extend the syntax and semantics of DSGs to express and reason about this direct summarization. Formally, we first add a summary table $\mathcal {T}$ to demanded summarization graphs, leaving the other components unchanged.

\begin{equation*} \begin{array}{rcl} \text{summary tables} & \mathcal {T}& ::= \varepsilon \mathrel \vert \mathcal {T}\mathbin ;\left\lbrace \varphi \right\rbrace \mathbin {\rho }\left\lbrace \varphi ^{\prime }\right\rbrace \\ \text{summary table-equipped DSGs} & \mathcal {G}& \triangleq \left\langle \mathcal {D}^\ast , \Delta ,\mathcal {T}\right\rangle \end{array} \end{equation*}

A summary table $\mathcal {T}$ is a collection of summary triples $\left\lbrace \varphi \right\rbrace \mathbin {\rho }\left\lbrace \varphi ^{\prime }\right\rbrace$ not backed by any DAIG in $\mathcal {D}^\ast$ . For the most part, the operational semantics of Figures 7 and 8 just work with DSGs replaced by summary table-equipped analogues and tables $\mathcal {T}$ threaded through the rules accordingly.

However, some additional rules are needed to tabulate and apply summaries. As such, we reproduce the modified rules and provide the new rules in in Figure 10, highlighting the modifications relative to Section 4 in green.

Fig. 10.

First, we add a rule Tabulate to the $\mathcal {G}\dashrightarrow \mathcal {G}^{\prime }$ judgment form, allowing the analyzer to drop a fully solved DAIG $\mathcal {D}$ and replace it by an equivalent Hoare-style summary triple at any point.

Note that the dependency map $\Delta$ is unchanged when we drop a DAIG $\mathcal {D}$ and tabulate the corresponding triple $\left\lbrace \varphi \right\rbrace \mathbin {\rho }\left\lbrace \varphi ^{\prime }\right\rbrace$ . Thus, any analysis fact that depended on $\mathcal {D}$ now depends on the summary triple, and any DAIG or summary triple that contributed to $\mathcal {D}$ also contributed to the summary triple.

Then, we add an additional inference rule S-Apply, which allows the tabulated Hoare triples in $\mathcal {T}$ to resolve summary queries and be applied at procedure call sites.

The procedure summarization judgment $\mathcal {G}\mathbin \vdash \left\lbrace \varphi \right\rbrace \mathbin {\rho }\left\lbrace \varphi ^{\prime }\right\rbrace \mathbin ;\mathcal {G}^{\prime }$ effectively abstracts away the two different types of cached results in our summary table-equipped DSGs: Summarize (Section 4) interprets fully solved DAIGs as summaries, while S-Apply (Figure 10) interprets rows of the summary table $\mathcal {T}$ as summaries.

Next, we add a premise $\left\lbrace \varphi \right\rbrace \mathbin {\rho ^\ell }\left\lbrace \_\right\rbrace \not\in \mathcal {T}$ to the Q-Instantiate rule, preventing the instantiation of a DAIG that shadows a summary already in $\mathcal {T}$ . Similarly, we tweak the definition of $\lbrace \varphi _1,\dots ,\varphi _k\rbrace$ in the E-Delegate rule of the edit judgment, ensuring that when a procedure is edited, not only DAIGs but also summary triples over that procedure are dirtied. Both modifications consist of checking whether a summary exists in $\mathcal {T}$ in addition to the original check whether it exists in $\mathcal {D}^\ast$ .

Finally, we add a rule D-Summary to the dirtying judgment, which is analogous to the D-DAIG rule but discards a summary triple instead of dirtying a DAIG. Note, though, that instead of dropping only those interprocedural dependencies $\Delta ^{\prime }$ for now-dirtied intermediate results in $\mathcal {D}$ (as in D-DAIG ), we drop all backward dependencies of the discarded triple.

Applying the Tabulate transformation allows a DSG-based analyzer to selectively coarsen its memoized results, achieving a significantly smaller memory footprint at the cost of fine-grained incremental reuse in cases where there is an edit inside a method whose DAIG has been dropped.

Metatheory. Moreover, extending demanded summarization graphs $\mathcal {G}$ with summary tables $\mathcal {T}$ as described here does not affect analysis results, since summaries in $\mathcal {T}$ are created by Tabulate, applied by S-Apply, and dirtied by D-Intraproc-Summary under exactly the conditions that apply to corresponding DAIGs in $\mathcal {D}^\ast$ .

We do not reproduce the theorems and proofs in full here, as they are largely unchanged from Section 4, but they are available in Appendix B. The key modification required is an additional condition for semantic consistency (Definition 5.1), stating that whenever a summary triple is in $\mathcal {T}$ , all of the dependency edges that would be required to compute that summary from scratch are in $\Delta$ . By ensuring that summary triples in $\mathcal {T}$ are accompanied by the proper dependencies, we can guarantee that they are invalidated as needed in response to semantically relevant edits in other procedures.

Condensing DAIGs down to input/output summaries with Tabulate provides a semantic foundation that can be used to implement memory conservation mechanisms without jeopardizing hard-won guarantees of soundness, termination, and from-scratch consistency. Similar approaches to reducing memory usage by intelligently discarding some analysis results have been shown to significantly improve performance in IFDS-based analysis frameworks [Arzt 2021].

Classic cache replacement strategies allow an interactive analysis engine to intelligently discard or condense DAIGs at set intervals or whenever memory usage crosses some threshold. For example, the least-recently or least-frequently used DAIGs can be reduced to two-state summaries, treating the DSG as an LRU/LFU cache of intermediate analysis results and reducing the overall memory footprint of the DSG without incurring any runtime cost unless/until the summary is affected by a program edit.

Beyond the practical applicability of these operations, the relative simplicity and straightforwardness of the extension indicates the generality and foundational nature of the DSG approach. For example, we believe that the DSG approach could be adapted to operate over custom relational domains rather than state domains, essentially replacing the DAIG intraprocedural analysis black-box with a different mechanism that produces such summaries, while tracking interprocedural dependencies in much the same way as we describe here.

7.2 Summary Weakening for Reuse

To apply a summary at a procedure call, the core DSG operational semantics laid out in Section 4 require that its precondition exactly matches the callsite abstract state.⁷ However, an analysis implementation may wish to maximize the reuse of previously computed summaries by applying a compatible one with a weaker-than-needed precondition instead. This is analogous to the use of monotonicity constraints in Datalog-based incremental analyses to reuse facts when there is an $\sqsubseteq$ -increasing change to some input [Szabó et al. 2018].

This optimization is enabled by adding the following inference rule to the summarization judgment $\mathcal {G}\mathbin \vdash \left\lbrace \varphi \right\rbrace \mathbin {\rho }\left\lbrace \varphi ^{\prime }\right\rbrace \mathbin ;\mathcal {G}^{\prime }$ .

Ignoring the demanded summarization graphs (i.e., everything before a turnstile or after a semicolon), this rule is precisely the familiar Hoare logic rule of consequence for preconditions [Hoare 1969]. Although it would be sound to do so, we do not include the dual postcondition rule (nor combine the two into one rule), as there is no benefit to applying it in our framework—it would simply produce less-precise analysis results.

The primed extensions of $\Delta$ and $\mathcal {T}$ in the conclusion of S-Consequence serve to avoid a subtle issue: Without them, when this rule is used to weaken a summary, the Q-Apply-Summary rule adds a dependency edge corresponding to the weakened (conclusion) summary rather than the stronger (premise) summary from which it was derived. As a result, when the premise summary is dirtied the analysis would unsoundly fail to propagate that change to the callsite where its weakened form was applied. This shortcoming is fully addressed, though, by adding the derived triple to $\mathcal {T}$ , along with a dependency edge in $\Delta$ on the (weaker) premise triple, since dirtying will transitively propagate the change through the derived triple $\left\lbrace \varphi \right\rbrace \mathbin {\rho }\left\lbrace \varphi ^{\prime \prime }\right\rbrace$ in $\mathcal {T}^{\prime }$ .⁸ This does require that $\varphi$ and $\varphi$ be strictly ordered, as $\Delta ^{\prime }$ would contain spurious self-loops if they were equal. This is not a significant restriction, since “weakening” a summary to itself has no real benefit.

Weakening summaries by applying S-Consequence preserves the soundness of analysis results by the same reasoning that it is sound to apply the consequence rule in a standard Hoare logic. That is, the weaker triple $\left\lbrace \varphi ^{\prime }\right\rbrace \mathbin {\rho }\left\lbrace \varphi ^{\prime \prime }\right\rbrace$ in its premise implies the stronger triple inferred as its conclusion: Since $\varphi \sqsubset \varphi ^{\prime }$ , any concrete state modelling $\varphi$ also models $\varphi ^{\prime }$ and is thus guaranteed to be mapped by the semantics of $\rho$ to a state modelling $\varphi ^{\prime \prime }$ .

However, introducing the S-Consequence rule comes at the cost of from-scratch consistency for demanded analysis. To see why, consider an edit immediately before a procedure call that has been summarized with triple $\lbrace \varphi _\text{pre}\rbrace \mathbin {\rho }\lbrace \varphi _\text{post}\rbrace$ , and suppose that the edit results in a stronger precondition $\varphi _\text{pre}^{\prime }$ for the call. It is sound to reuse the previously computed summary via S-Consequence, deriving the same postcondition $\varphi _\text{post}$ for the call as before. However, it is possible that a from-scratch recomputation with the newly strengthened precondition may have produced a stronger summary $\lbrace \vphantom{\varphi _\text{pre}}\varphi _\text{pre}^{\prime }\vphantom{\varphi _\text{pre}}\rbrace \mathop \rho \lbrace \vphantom{\varphi _\text{post}}\varphi _\text{post}^{\prime }\vphantom{\varphi _\text{post}}\rbrace$ , so from-scratch consistency has been violated.

Thus, weakening of summaries represents a tradeoff and a tunable parameter for the design of practical analysis tools based on demanded summarization. In some circumstances, the benefit (i.e., computational savings due to increased summary reuse) may be well worth the cost of weaker precision guarantees; in others, the predictable behavior and maximal precision of a from-scratch consistent demanded analysis are more important.

8 Related Work

Compositional Analysis. Sharir and Pnueli [1981] defined the functional, compositional approach to interprocedural analysis, and much work has followed in its footsteps. Previous approaches have achieved compositionality by tabulating function summaries [Reps et al. 1995; Sagiv et al. 1996; Naeem et al. 2010; Padhye and Khedker 2013] or by deriving two-state relational summaries [Jeannet et al. 2010; Calcagno et al. 2011; Cousot and Cousot 2002; Yorsh et al. 2008; Salcianu 2006; Montagu and Jensen 2020; Dillig et al. 2011; Chatterjee et al. 1999; Madhavan et al. 2015]. Of these, the batch analysis formulation underlying our technique is most similar to that of Padhye and Khedker [2013]. Their work tabulates function summaries keyed on a “value context,” consisting of a procedure name and entrypoint abstract state, similar to our formulation. Further, the worklist algorithm described in that paper also tracks dependencies between such contexts, similar to the $\Delta$ component of DSGs (Section 4), to propagate interprocedural data flow from procedure exits to caller return sites. To perform demanded analysis and ensure from-scratch consistency, our approach rigorously defines invalidation and recomputation of such dependencies and also exerts more fine-grained control over iteration ordering during analysis.

Monadically parameterized semantics can also be used to define reusable abstract interpreter metatheory, offering compositionality of soundness proofs and generality over a family of different interprocedural analyses [Sergey et al. 2013; Keidel and Erdweg 2019; Darais et al. 2017]. Darais et al. [2017] cache analysis results during batch evaluation of a monadic interpreter, but rely on a finite abstract domain to ensure termination of the fixed-point computation and do not address incremental or demand-driven evalutation of their abstract interpreter.

Deductive verification systems [Leino 2010; Jacobs et al. 2011; Filliâtre and Marché 2007; Barnett et al. 2011; Müller et al. 2016] have long used compositionality to provide a real-time verification experience [Leino and Wüstholz 2015], generally with user-supplied pre- and post-conditions. While we have focused on tabulation in this article to directly support common one-state abstractions, the two-state relational approach is conceptually the same to support for demanded analysis, except with DAIG reference cells storing two-state relations instead of single-state abstractions.

Our use of “operational” and “denotational” rather than the traditional “context-sensitive” and “functional” for Sharir and Pnueli [1981]’s two approaches to interprocedural analysis is borrowed from Jeannet et al. [2010], whose lucid treatment of the topic makes a convincing case for these more generalized terms.

Recent static analyses have achieved large scalability gains on CI servers through compositionality [Calcagno and Distefano 2011; Distefano et al. 2019; Fähndrich and Logozzo 2010; Blackshear et al. 2018]. This compositionality suggests an incremental deployment model in which procedures are reanalyzed only when they are affected by an edit [Calcagno et al. 2011].

However, naive approaches to such incrementality have yielded unreliable and/or unsound results, so existing industrial deployments of these compositional analyses have largely eschewed incremental reuse of summaries in practice. This work has directly enabled the development of rigorous incremental infrastructure for such analyses, though: A variant of demanded summarization maps is now implemented in the Infer static analyzer [Calcagno and Distefano 2011]. These developments have produced significant analysis speedups in continuous integration (on the order of 3 $\times$ across all analyses and up to 10 $\times$ at the 95th percentile for certain workloads) as well as a near-total elimination of the unreliability and flakiness that characterized earlier iterations of incremental analysis in Infer [Stein 2023].

Incremental Analysis. Incremental variants of many standard compiler analyses have been studied and developed to support responsive continuous compilation and structured editors/integrated development environments. These include dataflow analyses [Ryder 1983; Zadeck 1984; Carroll and Ryder 1988; Pollock and Soffa 1989], pointer analyses [Lu et al. 2013; Liu and Huang 2022; Gupta et al. 1993], and attribute grammars [Demers et al. 1981; Reps 1982; Reps et al. 1983; Söderberg and Hedin 2012], which combine parse trees with semantic information and can encode many analyses including dataflow.

Recent work has also contributed incremental versions of several broader classes of program analysis, including IFDS/IDE dataflow analyses [Arzt and Bodden 2014; Do et al. 2017] and analysis DSLs based on extensions to Datalog [Szabó et al. 2016, 2018, 2021] These specialized approaches offer very effective solutions for these particular classes of program analysis but place restrictions on abstract domains that rule out arbitrary abstract interpretations in infinite-height domains, for example, by requiring domains to fall in the IFDS/IDE subset or by requiring that all domain operations are monotonic.

Some recent work by Van der Plas et al. [2020, 2023] and Garcia-Contreras et al. [2021] has explored similar approaches to incrementalization of compositional dataflow analyses, tracking and reifying inter-procedural dependencies and supporting infinite-height abstract domains. Notably, Van der Plas et al. [2023] interleave invalidation with (eager) recomputation, potentially saving a great deal of recomputation over our approach when an edit does not affect a procedure’s semantics, as alluded to in Section 4.2, and Garcia-Contreras et al. [2021] also offer a demand-driven interface, analyzing only the dependencies of a given “goal” in constrained Horn clause programs. The main benefit of our approach over these closely related works is its from-scratch consistency, offering analysis designers/implementers strong precision guarantees and simplifying debugging and deployment as a result.

These approaches are automatic (in that they require no user-provided specifications or loop invariants) and sound, but make no guarantee of from-scratch consistency and in fact violate it in some cases to maximize incremental reuse. However, Leino and Wüstholz [2015] propose a fine-grained incremental verification technique for the Boogie language, which verifies user-provided specifications of imperative procedures. Since these specifications include loop invariants, their algorithm can avoid altogether the issues and complexities introduced by cyclic dependencies.

Demand-driven Analysis. Demand-driven techniques for classical dataflow analysis are similarly well studied. The intra-procedural problem was studied by Babich and Jazayeri [1978]. Several extensions to inter-procedural analysis have been presented, for example, by Reps [1994], Duesterwald et al. [1995], Horwitz et al. [1995], and Sagiv et al. [1996], and applied to problems such as array bounds check elimination, parallel communication optimization, and integration testing [Bodík et al. 2000; Yuan et al. 1997; Duesterwald et al. 1996].

As with the incremental variants discussed in the previous section, these works are focused on finite domains (or, in the case of Sagiv et al. [1996], infinite domains of finite height).

Other types of static analysis (i.e., neither dataflow analysis nor abstract interpretation) can also be performed on-demand. Any analysis expressible as a context-free language reachability (CFL-reachability) problem can be computed in a demand-driven fashion as a “single-source” problem [Reps 1998]. As such, several papers have presented demand-driven algorithms for flow-insensitive pointer analysis [Heintze and Tardieu 2001; Sridharan et al. 2005; Späth et al. 2016]. Reference attribute grammars (RAGs) are declarative specifications of properties over ASTs (including potentially cyclic flow analyses), which can be evaluated incrementally and on-demand [Magnusson and Hedin 2007; Söderberg and Hedin 2012]. Termination of RAG evaluation requires that all cyclic computations converge to a fixed-point in finitely many iterations [Magnusson and Hedin 2007; Farrow 1986]; this convergence property holds for finite domains with monotone operators but may also be achieved through other means (e.g., widening). Unlike RAG-based approaches to dataflow analysis, our approach comes with proofs of termination and from-scratch consistency and specifies the exact conditions required to ensure termination in infinite-height domains with non-monotone widening operators.

Incremental computation. Our technique is heavily influenced by dependency graph-based techniques for general incremental and self-adjusting computation [Acar et al. 2002, 2008; Demers et al. 1981; Reps 1982; Hammer et al. 2009]. Some recent work in this area has also explored dynamic dependency tracking similar to our own, but applied to build systems rather than static analysis [Konat et al. 2018].

In particular, we draw on insights from the demanded computation graphs of Adapton [Hammer et al. 2015, 2014], which extends traditional graph-based incremental computation techniques to support interactive demand-driven computations and provide robust formal guarantees.

By reifying the partial order of computation dependencies in a demanded computation graph (DCG), this line of work provides a powerful and general approach to the design and implementation of efficient interactive systems. However, its low-level primitives make it difficult to express the complex fixed-point computations over cyclic control-flow graphs and recursive call structures that are found in arbitrary abstract interpretations [Stein et al. 2021a]. This work takes a great deal of inspiration from demanded computation graphs but specializes the language of demanded computations to abstract interpretation, both with syntactic structures and with query and edit semantics that dynamically modify the dependency graph to model program analysis computation.

9 Conclusion

We have described a novel framework for interactive abstract interpretation that simultaneously supports demand-driven queries, incremental handling of program edits, and compositional application of procedure summaries—all in the context of supporting arbitrary complex abstract domains with non-monotonic widening operators and recursive procedures.

The key innovation in our framework is demanded summarization, which instantiates demanded abstract interpretation graphs (DAIGs) on demand to synthesize summaries as needed for a compositional interprocedural analysis, where a significant technical challenge is soundly handling self-referential summaries that naturally arise from demanded summarization of recursive procedures. Our evaluation provides evidence of demanded summarization’s scalability and responsiveness using synthetic benchmarks, and of its generalizability using real-world edits drawn from open-source Java programs. Together, these experiments provide evidence of the feasibility of this approach as an interactive interface to compositional summary-based analyses with arbitrarily complex abstract domains.

Acknowledgments

We thank David Flores for his valuable contributions to the preparation of the experimental evaluation in this article. We also thank Matthew Hammer, members of the CUPLV lab, and anonymous reviewers from TOPLAS and other venues whose comments helped us improve this article.

Footnotes

We use “demanded” to describe a computation that is both incremental and demand-driven, following Hammer et al. [2015];, 2014] and Stein et al. [2021a], all of which reify dependency graphs to enable real-time interactivity.

We use “operational” and “denotational” (following Jeannet et al. [2010]), which are analogous to but more general than the commonly used “call-strings” and “functional” terminology of Sharir and Pnueli [1981]. In particular, (1) call-strings are just one instance of a broader class of context-sensitive interprocedural analyses that operationally propagate predicates along a control-flow graph until a fixed-point is reached, and (2) summaries need not be functions but can be any abstraction of a procedure’s denotational semantics.

Note that this summary is actually more precise than the one that would have been computed by from-scratch batch analysis of the updated program. A DSG may apply a compatible summary whose precondition is stronger than the result of widening at recursive callsites without loss of precision, though its result may actually be more precise than from-scratch batch analysis if it does so. Although this optimization strictly improves analysis precision, it technically violates from-scratch consistency and as such is not applied in our formal system.

⁴

Throughout this article, we fix a strategy of widening at every recursive call for simplicity and clarity of presentation. Our general approach is compatible with other widening strategies, too, but we choose not to introduce that degree of freedom into the formalism and instead present the simple and conservative widen-everywhere strategy. It is also not clear that arbitrary widening strategies give rise to from-scratch consistent implementations if the choice of whether or not to widen depends on some information not preserved in the incremental setting.

⁵

Note that we assume that there is no mutual recursion (e.g., each strongly connected component in the call graph is compiled to a directly recursive procedure).

⁶

Notably, every program in the corpus of Section 6.3 includes some recursive procedure(s) and is thus out of reach of Stein et al. [2021a]’s operational DAIG approach, and both Stein et al. [2021a] and Szabó et al. [2021] are evaluated on synthetic sequences of edits.

⁷

Modulo widening, in the case of recursive calls

⁸

The issue can also be addressed without summary tables $\mathcal {T}$ by folding S-Consequence directly into Q-Apply-Summary and adjusting its dependency map accordingly. We present it this way for the sake of clarity, keeping some separation of concerns between judgment forms.

Appendices

A Proofs for Section 5 (From-scratch Consistency)

Definition A.1 (DSG Semantic Consistency).

—

–

$(\rho, \varphi){\stackrel{n}\hookleftarrow}({\rho ^{\prime }},{\mathcal {D}(n^{\prime }))}\in \Delta$ (when $\rho \ne \rho ^{\prime }$ ) or

–

$(\rho, \varphi){\stackrel{n}\hookleftarrow}({\rho ^{\prime }},\varphi\nabla{\mathcal {D}(n^{\prime }))}\in \Delta$ (when $\rho =\rho ^{\prime }$ ).

—

Dependency edges $(\rho^{\prime}, \varphi^{\prime}){\stackrel{n}\hookleftarrow}({\rho},\varphi)$ in $\Delta$ are consistent with the relevant DAIGs, in that

–

$\mathcal {D}^\ast (\rho ^{\prime },\varphi ^{\prime })(n) = \mathcal {D}^\ast (\rho ,\varphi)(\textsf {exit}(\rho))$ (when $\rho \ne \rho ^{\prime }$ ) or

–

$\mathcal {D}^\ast (\rho ^{\prime },\varphi ^{\prime })(n) \sqsupseteq \mathcal {D}^\ast (\rho ,\varphi)(\textsf {exit}(\rho))$ (when $\rho =\rho ^{\prime }$ ).

—

Lemma A.1 (Initial DSG Consistency).

The DSG $\langle \varepsilon ,\varepsilon \rangle$ with no cached results is semantically consistent.

Proof.

All four conditions of Definition 5.1 are universally quantified over $\mathcal {D}$ or $\Delta$ , which are empty here and thus vacuously satisfied. □

Lemma A.2 (Consistency Preservation).

If $\mathcal {G}$ is semantically consistent (with respect to a program P), then:

if $\mathcal {G}\mathbin \vdash _{\varphi } \ell \Downarrow \varphi ^{\prime } \mathbin {;} \mathcal {G}^{\prime }$ , then $\mathcal {G}^{\prime }$ is semantically consistent, and

if $\mathcal {G} \mathbin \vdash _{\rho } n\Leftarrow s \mathbin ; \mathcal {G}^{\prime }$ , then $\mathcal {G}^{\prime }$ is semantically consistent (with respect to the edited version of P).

Proof.

We will show that each of the the four conditions of Definition 5.1 is preserved in turn.

—

Each $\mathcal {D}\in \mathcal {D}^\ast$ is well formed and consistent. This follows directly from Stein et al. [2021a], since the only way we instantiate and modify DAIGs is through the inference rules defined there (modulo $\rho$ -labeled edges, over which DAIG well-formedness makes no claims, but we will handle in the third bullet point).

—

Each return-site abstract state has in $\Delta$ a corresponding dependency edge from its callee. The only query rules that modify return site abstract states either add the requisite edge to $\Delta$ (SQ-Other-Proc, SQ-Other-Pre, SQ-Self) or have a premise that guarantees the requisite edge is in $\Delta$ (F-Step, via $\textsf {R}_{\rho ,\varphi }$ ).

The dirtying rule D-DemandedSummaries empties a return-site $n^{\prime }$ and drops the corresponding edge from $\Delta$ , while D-DAIG throws away any dangling dependency edges $\Delta ^{\prime }$ after dirtying intraprocedurally.

—

Each edge in $\Delta$ is consistent with the values on either side. First, note that by the same argument as the previous case, dependency edges are always removed when the return site they point to is dirtied. Then, we need only consider the SQ-* rules that add dependency edges to $\Delta$ : SQ-Other-Proc, SQ-Other-Pre, and SQ-Self. The dependency edges added in SQ-Other-* both satisfy the condition, since the value $\varphi _\textsf {post}$ at the callee exit is written directly to the return site n.

The edge added in SQ-Self temporarily violates the condition, but the conclusion of its final premise guarantees that the resulting $\mathcal {G}^{\prime }$ satisfies it (by the $\textsf {R}_{\rho ,\varphi }(\mathcal {G}^{\prime })$ premise of F-Converge).

—

Cached results agree with batch tabulation.

To show that this property is preserved under edits, first note that E-Delegate (the only edit rule) applies the edit by dirtying each constituent DAIG over the edited procedure. So, we will proceed by structural induction on the derivation of the i-indexed dirtying premise of that rule, showing that all possibly affected analysis results are dirtied:

–

Case D-DAIG: Since there are no dependency edges on this procedure in $\Delta$ , the only cached analysis results affected by the edit are in the DAIG $\mathcal {D}^\ast (\rho ,\varphi)$ . By Stein et al. [2021a], the local dirtying in that DAIG is sound.

–

Case D-DemandedSummaries: Because the dependency edge $(\rho^{\prime}, \varphi^{\prime}){\stackrel{n}\hookleftarrow}({\rho},\varphi)$ is in $\Delta$ , the analysis result at $n^{\prime }$ in $\mathcal {D}^\ast (\rho ^{\prime },\varphi ^{\prime })$ relied on this summary. By the inductive hypothesis and preservation of the second condition of semantic consistency, the first premise dirties all results that depended transitively on that result, producing a $\mathcal {G}$ whose dependency map reflects all other uses of this summary. By the inductive hypothesis, the second premise also preserves cache agreement with batch tabulation, so the final resulting $\mathcal {G}^{\prime }$ contains only those results that did not depend on the edit.

To show that it is preserved under queries, we proceed by structural induction on the derivation of $\mathcal {G}\mathbin \vdash _{\varphi } \ell \Downarrow \varphi ^{\prime } \mathbin {;} \mathcal {G}^{\prime }$ :

–

Case Q-Instantiate: The initial DSG of the premise is consistent, because the initial DAIG has only one non-empty cell (its entry) where $\mathcal {D}^\textsf {init}_{\rho ^\ell ,\varphi }(\textsf {entry}(\rho ^\ell)) = \varphi = I(\varphi ,\textsf {entry}(\rho ^\ell))$ .

–

Cast Q-Delegate: By from-scratch consistency of DAIGs.

–

Case Q-Apply-Summary: First, since the only inference rule of the $\mathcal {G}\mathbin \vdash \left\lbrace \varphi \right\rbrace \mathbin {\rho }\left\lbrace \varphi ^{\prime }\right\rbrace \mathbin ;\mathcal {G}^{\prime }$ judgment is Summarize, we apply the inductive hypothesis at its premise to get that $\mathcal {G}^{\prime }$ is consistent and thus $\varphi _\textsf {post}=I(\varphi _\textsf {pre},\textsf {exit}(\rho))$ . Then, writing $\varphi _\textsf {post}$ to the return site n is exactly summary application in tabulation, so the premise of the inductive premise is consistent and therefore so is its conclusion by the inductive hypothesis.

□

Theorem A.2 (Termination).

Queries and edits terminate:

—

We show each bullet point separately. First, for queries:

Proof.

Let $tc(\rho)$ denote the number of non- $\rho$ transitive callees of $\rho$ in P and note that $tc(\rho ^{\prime }) \lt tc(\rho)$ whenever $\rho$ calls $\rho ^{\prime }$ and $\rho \ne \rho ^{\prime }$ (this does not hold if we do not exclude $\rho$ itself, since, e.g., a callgraph where $\rho$ calls $\rho ^{\prime }$ and $\rho ^{\prime }$ calls itself would violate it).

We will proceed by induction on $tc(\rho ^\ell)$ , where $\rho ^\ell$ is the procedure containing the query location $\ell$ .

Note that Q-Instantiate can apply at most once for each $\rho ^\ell ,\varphi$ pair and so we will ignore it throughout this proof, assuming DAIGs are materialized whenever needed.

We will split each case into two subcases based on whether or not $\rho ^\ell$ is recursive (i.e., contains a call to itself).

—

Base case ( $tc(\rho ^\ell)=0$ , non-recursive $\rho ^\ell$ ): There are no calls in $\rho ^\ell$ , so Q-Delegate applies, its premise guaranteed by intraprocedural DAIG query termination [Stein et al. 2021a].

—

Base case ( $tc(\rho ^\ell)=0$ , recursive $\rho ^\ell$ ): A query against the relevant sub-DAIG either returns a result (i.e., $\mathcal {D}^\mathcal {G}_{\rho ^\ell ,\varphi } \vdash \underline{\ell } \Rightarrow \varphi ^{\prime } ~; \mathcal {D}^{\prime }$ ) or is blocked at a recursive callsite (i.e., $\mathcal {D}^\mathcal {G}_{\rho ^\ell ,\varphi }\mathbin \vdash \underline{\ell }\mathbin {\overset{n}{\rightsquigarrow }}(\rho ,\varphi)\mathbin {;}\mathcal {D}^{\prime }$ ).

(1)

$\mathcal {D}^\mathcal {G}_{\rho ^\ell ,\varphi } \vdash \underline{\ell } \Rightarrow \varphi ^{\prime } ~; \mathcal {D}^{\prime }$

The sub-DAIG query completes on its own, so Q-Delegate applies and the query terminates with its result.

(2)

$\mathcal {D}^\mathcal {G}_{\rho ^\ell ,\varphi }\mathbin \vdash \underline{\ell }\mathbin {\overset{n}{\rightsquigarrow }}(\rho ^\ell ,\varphi ^{\prime })\mathbin {;}\mathcal {D}^{\prime }$ :

Only Q-Apply-Summary applies, so we must derive a summary query.

Since $\rho ^\ell =\rho ^\ell$ , we consider only SQ-Other-Pre and SQ-Self, depending on whether or not $\varphi ^{\prime }\sqsubseteq \varphi$ .

–

SQ-Self: We proceed through the F-* rules for fixed-point computation to derive the final premise. Due to the convergence condition of $\nabla$ , F-Step can only apply finitely many times before F-Converge applies. The $\mathcal {G}\mathbin \vdash \left\lbrace \varphi \right\rbrace \mathbin {\rho }\left\lbrace \varphi ^{\prime }\right\rbrace \mathbin ;\mathcal {G}^{\prime }$ premises of the F-* rules similarly converge either via Q-Delegate or finitely many applications of Q-Apply-Summary (since there are finitely many syntactic callsites in a given procedure).

–

SQ-Other-Pre: By the convergence condition of $\nabla$ , only finitely many new DAIGs over $\rho ^\ell$ may be instantiated via the $\mathcal {G}\mathbin \vdash \left\lbrace \varphi \nabla \varphi ^{\prime }\right\rbrace \mathbin {\rho }\left\lbrace \varphi _\textsf {post}\right\rbrace \mathbin ;\mathcal {G}^{\prime }$ premise, eventually yielding one where SQ-Self applies instead of SQ-Other-Pre and terminates by the argument of the previous case, allowing any intermediate $\rho$ DAIGs also to terminate via intraprocedural analysis with Q-Delegate after their demanded summaries return.

—

Inductive case ( $tc(\rho ^\ell)=n$ , non-recursive $\rho ^\ell$ ): A query for $\underline{\ell }$ under precondition $\varphi$ against the relevant sub-DAIG $\mathcal {D}^\mathcal {G}_{\rho ^\ell ,\varphi }$ either returns a result (in which case Q-Delegate applies and the query terminates) or is blocked at some callsite (i.e., $\mathcal {D}^\mathcal {G}_{\rho ^\ell ,\varphi }\mathbin \vdash \underline{\ell }\mathbin {\overset{n}{\rightsquigarrow }}(\rho ,\varphi ^{\prime })\mathbin {;}\mathcal {D}^{\prime }$ ), in which case, the only applicable rule is Q-Apply-Summary.

We can derive the summary premise $\mathcal {G}\mathbin \vdash \lbrace \varphi _\textsf {pre}\rbrace \mathbin {\rho }\lbrace \varphi _\textsf {post}\rbrace \mathbin ;\mathcal {G}^{\prime }$ of Q-Apply-Summary via Summarize, using the inductive hypothesis to derive its premise because $tc(\rho)\lt n$ . After writing the resulting $\varphi _\textsf {post}$ to n in $\mathcal {D}^{\prime }$ and updating $\Delta$ accordingly, we reissue the query for $\ell$ under $\varphi$ (i.e., the recursive final premise of Q-Apply-Summary).

Note that the inductive hypothesis does not apply here, since the reissued query is in the same procedure $\rho ^\ell$ . However, since $\rho ^\ell$ is non-recursive, either Q-Delegate or Q-Apply-Summary must apply again.

If it is Q-Delegate, then the query terminates with its result. However, if it is Q-Apply-Summary, then that reissued query may itself reissue a query for which Q-Apply-Summary applies, and so on. Note, though, that each time the query is reissued it is against a DSG with one fewer empty return site abstract state reference cell in its ( $\rho ^\ell$ , $\varphi$ ) DAIG. Thus, since there are only finitely many return sites in $\rho ^\ell$ , eventually Q-Delegate will apply and the query will terminate.

—

Inductive case ( $tc(\rho ^\ell)=n$ , recursive $\rho ^\ell$ ): This case is shown by combining the arguments of the previous two cases: The inductive hypothesis ensures termination of any sub-queries at non-recursive calls (as in the non-recursive inductive case), while widening ensures termination of any sub-queries at recursive calls and convergence of the in-place fixed-point computation performed by F-Step and F-Converge (as in the recursive base case).

□

And next, for edits:

Proof.

Because $\mathcal {D}^\ast$ and therefore k are finite, this amounts to showing that each of the k-indexed dirtying premises terminates, i.e., that for all $\rho ,\varphi ,n,s$ , and semantically consistent $\mathcal {G}$ , there exists $\mathcal {G}^{\prime }$ such that $\mathcal {G} \mathbin \vdash _{\rho ,\varphi }n\Leftarrow s\mathbin {;}\mathcal {G}^{\prime }$ . We proceed by induction on the size of the dependency map $\Delta$ . In the base case when $\Delta _{\rho ,\varphi }$ is empty (as is necessarily the case when $|\Delta |=0$ ), D-DAIG applies—we dirty the $\mathcal {D}^\ast (\rho ,\varphi)$ from n to produce $\mathcal {D}^{\prime }$ , discard any dependencies $\Delta ^{\prime }$ of dirtied local results, construct $\mathcal {G}^{\prime }=\left\langle \mathcal {D}^\ast [\mathcal {D}^{\prime }/(\rho ,\varphi)],\Delta /\Delta ^{\prime } \right\rangle$ , and terminate.

In the inductive case with $|\Delta |=n$ and $|\Delta _{\rho ,\varphi }|\gt 0$ , D-DemandedSummaries applies. Its first inductive premise is with $|\Delta |=n-1$ , so by the inductive hypothesis it terminates with some $\mathcal {G}$ . That $\mathcal {G}$ also has $|\Delta \le n-1$ , since the dirtying rules only remove and never add dependency edges in $\Delta$ , so it also terminates by the inductive hypothesis, and we terminate with its result $\mathcal {G}^{\prime }$ . □

Theorem A.3 (From-scratch Consistency).

Query results are equal to the corresponding invariant computed by tabulation: if $\mathcal {G}$ is semantically consistent, $\mathcal {G}\mathbin \vdash _{\varphi } \ell \Downarrow \varphi ^{\prime } \mathbin {;} \mathcal {G}^{\prime }$ , and $(\varphi ,\ell)\in {\rm \textsf {dom}}(I),$ then $\varphi ^{\prime }=I(\varphi ,\ell)$ .

Proof.

Having shown semantic consistency preservation and initial-DSG semantic consistency, the proof of this property is straightforward. Since

(1)

the fourth property of semantic consistency ensures that DAIG results are consistent with tabulation results,

(2)

Q-Delegate (which reads its result directly from the relevant procedure DAIG) is the only $\mathcal {G}\mathbin \vdash _{\varphi } \ell \Downarrow \varphi ^{\prime } \mathbin {;} \mathcal {G}^{\prime }$ rule with no inductive premise, and

(3)

the other two rules $\mathcal {G}\mathbin \vdash _{\varphi } \ell \Downarrow \varphi ^{\prime } \mathbin {;} \mathcal {G}^{\prime }$ return the result of their inductive premise.

All query results must correspond to DAIG cell values, which are themselves consistent with batch tabulation results where the domains coincide. □

Corollary A.4 (Soundness).

Query results are sound with respect to the concrete semantics:

Proof.

Soundness follows directly from from-scratch consistency: Results are equal to those of the underlying batch analysis, which is itself sound. □

B Proofs for Section 7 (Summary Tabulation & Weakening)

First, some modifications are required in the definition of DSG Semantic Consistency (Definition 5.1) to accommodate the summary triples in $\mathcal {T}$ .

Definition B.1 (DSG Semantic Consistency with Summary Tables).

Conditions (1) and (2) of Definition 5.1 are unmodified and therefore elided here. Conditions (3) and (4) are modified and a condition (5) added as follows:

(3)

Dependency edges $(\rho^{\prime}, \varphi^{\prime}){\stackrel{n}\hookleftarrow}({\rho},\varphi)$ in $\Delta$ are consistent with the relevant DAIGs and summary triples, such that whenever $(\rho ^{\prime },\varphi)\in \textsf {dom}(\mathcal {D}^\ast),$ we have either

–

$\mathcal {D}^\ast (\rho ^{\prime },\varphi ^{\prime })(n) = \varphi _\textsf {callee-exit}$ (when $\rho \ne \rho ^{\prime }$ ) or

–

$\mathcal {D}^\ast (\rho ^{\prime },\varphi ^{\prime })(n) \sqsupseteq \varphi _\textsf {callee-exit}$ (when $\rho =\rho ^{\prime }$ ).

where $\varphi _\textsf {callee-exit} = \left\lbrace \!\!\begin{array}{lr} \varphi ^{\prime } &\text{ if }\exists ~\left\lbrace \varphi \right\rbrace \mathbin {\rho }\left\lbrace \varphi ^{\prime }\right\rbrace \in \mathcal {T}\\ \mathcal {D}^\ast (\rho ,\varphi)(\textsf {exit}(\rho))&\text{ otherwise.} \end{array}\right.$

(4)

Analysis results in DAIGs and summary triples are equal to the corresponding invariants produced by batch tabulation, where the domains coincide:

–

$\smash{\mathcal {D}^\mathcal {G}_{\rho ,\varphi }(\underline{\ell })} = I(\varphi ,\ell)$ for all $\mathcal {D}^\ast (\rho ,\varphi)(\underline{\ell })$ where $(\varphi ,\ell)\in \textsf {dom}(I)$ , and

–

$\varphi ^{\prime }=I(\varphi ,\textsf {exit}(\rho))$ for all $\left\lbrace \varphi \right\rbrace \mathbin {\rho }\left\lbrace \varphi ^{\prime }\right\rbrace \in \mathcal {T}$ where $(\varphi ,\textsf {exit}(\rho))\in \textsf {dom}(I)$ .

(5)

For all $\left\lbrace \varphi \right\rbrace \mathbin {\rho }\left\lbrace \varphi ^{\prime }\right\rbrace \in \mathcal {T}$ , we have that $\Delta \supseteq \Delta ^{\prime }$ where $\Delta ^{\prime }$ is defined by $\langle \varepsilon ,\varepsilon ,\varepsilon \rangle \mathbin \vdash _{\varphi } \textsf {exit}(\rho) \Downarrow \varphi ^{\prime } \mathbin {;} \langle \_,\Delta ^{\prime },\_\rangle$ .

The tweaks to (3) and (4) are straightforward, simply allowing for the fact that analysis results can now be stored either in $\mathcal {D}^\ast$ or $\mathcal {T}$ , rather than just in $\mathcal {D}^\ast$ . In the first bullet point, we have inserted $\varphi _\textsf {callee-exit}$ into the two equations where we previously just had $\mathcal {D}^\ast (\rho ,\varphi)(\textsf {exit}(\rho))$ . In the second bullet point, we have added the additional condition that summary triples agree with batch tabulation, where previously it just referred to DAIG analysis results.

The newly added condition (5) states that, whenever we have a summary triple in $\mathcal {T}$ , we also have all of the dependency edges in $\Delta$ that would be required to compute that summary from scratch. By ensuring that summary triples in $\mathcal {T}$ are accompanied by the proper dependencies, we can guarantee that they are invalidated as needed in response to semantically relevant edits to other procedures.

Theorem B.2 (Consistency Preservation under.

Tabulate) If $\mathcal {G}\dashrightarrow \mathcal {G}^{\prime }$ and $\mathcal {G}$ is semantically consistent with respect to P, then $\mathcal {G}^{\prime }$ is also semantically consistent with respect to P.

Proof Sketch.

Conditions (1) and (2) are vacuously preserved. Both modified conditions (3) and (4) are ensured by Tabulate and Definition 5.1 of $\mathcal {G}$ , since the postcondition of a tabulated triple is exactly the exit abstract state of the discarded DAIG. Condition (5) is similarly guaranteed by the definition of Tabulate, where $\Delta$ contains all requisite dependencies for $\mathcal {D}$ (by semantic consistency of $\mathcal {G}$ ) and is unchanged in $\mathcal {G}^{\prime }$ . □

Theorem B.3 (Consistency Preservation).

Semantic consistency of summary table-equipped DSGs is preserved under queries and edits as in Lemma 5.1.

Proof Sketch.

The proof is largely unchanged for the original 4 conditions of semantic consistency. Condition (5) is ensured, because summary triples in $\mathcal {T}$ are never introduced by queries or edits (only by the Tabulate operation) and are discarded by the D-Summary rule whenever any transitively reachable dependency in $\Delta$ is dirtied by an edit. □

Theorem B.4 (Termination).

Queries and edits terminate in semantically consistent summary table-equipped demanded summarization graphs.

Proof Sketch.

The added rules S-Apply and D-Summary clearly terminate—S-Apply trivially, D-Summary by the same argument given for D-DAIG in Section 5. The modified rules Q-Instantiate and E-Delegate also terminate, both by the same arguments given originally in Section 5. □

Theorem B.5 (From-scratch Consistency and Soundness).

Demanded analysis in summary table-equipped demanded summarization graphs produces from-scratch consistent (and therefore sound) results.

Proof Sketch.

The Tabulate operation only produces summaries that are identical to their DAIG analogues, which we know to be from-scratch consistent. Furthermore, E-Delegate and D-Summary ensure that any summary triple potentially affected by an edit is discarded. Therefore, summary triples are only applied under the same conditions as their DAIG analogues would have been, so the abstract semantics are identical to those of Section 4. □

References for the Appendix

Benno Stein, Bor-Yuh Evan Chang, and Manu Sridharan. 2021. Demanded abstract interpretation. In Conference on Programming Language Design and Implementation (PLDI’21), Stephen N. Freund and Eran Yahav (Eds.). ACM, 282–295.

References

[1]

Umut A. Acar, Amal Ahmed, and Matthias Blume. 2008. Imperative self-adjusting computation. In Symposium on Principles of Programming Languages (POPL’08). ACM, 309–322. DOI:

Abstract

1 Introduction

2 Overview

2.1 Incrementality and Context Sensitivity

2.2 Demanded Abstract Interpretation

2.3 Demanded Summarization

3 Preliminaries: Abstract Interpretation

3.1 Intraprocedural Demanded Abstract Interpretation

3.2 Summarization and Tabulation

4 Demanded Summarization

4.1 Demand-driven Query Evaluation in DSGs

4.1.1 Queries.

4.1.2 Summary Queries.

4.1.3 Fixed Points.

4.2 Incremental Edits in DSGs

5 From-scratch Consistency

6 Implementation & Evaluation

6.1 Implementation

6.2 Scalability: Synthetic Benchmarks

6.3 Generalizability: Open-source Corpus

7 Summary Tabulation & Weakening

7.1 Summary Tabulation for Memory Pressure

7.2 Summary Weakening for Reuse

8 Related Work

9 Conclusion

Acknowledgments

Footnotes

Appendices

A Proofs for Section 5 (From-scratch Consistency)

B Proofs for Section 7 (Summary Tabulation & Weakening)

References for the Appendix

References

Index Terms

Recommendations

Demanded abstract interpretation

Deriving escape analysis by abstract interpretation

Control-flow analysis of function calls and returns by abstract interpretation

Comments

Information

Published In

Publisher

Publication History

Permissions

Check for updates

Author Tags

Qualifiers

Funding Sources

Contributors

Other Metrics

Bibliometrics

Article Metrics

Other Metrics

Citations

View options

PDF

eReader

Get Access

Login options

Full Access

Figures

Other

Share

Share this Publication link

Share on social media

Affiliations