
Focusing on Refinement Typing

Published: 20 December 2023

Abstract

We present a logically principled foundation for systematizing, in a way that works with any computational effect and evaluation order, SMT constraint generation seen in refinement type systems for functional programming languages. By carefully combining a focalized variant of call-by-push-value, bidirectional typing, and our novel technique of value-determined indexes, our system generates solvable SMT constraints without existential (unification) variables. We design a polarized subtyping relation allowing us to prove our logically focused typing algorithm is sound, complete, and decidable. We prove type soundness of our declarative system with respect to an elementary domain-theoretic denotational semantics. Type soundness implies, relatively simply, the total correctness and logical consistency of our system. The relative ease with which we obtain both algorithmic and semantic results ultimately stems from the proof-theoretic technique of focalization.

1 Introduction

True, “well-typed programs cannot ‘go wrong’ ” [Milner 1978], but only relative to a given semantics (if the type system is proven sound with respect to it). Unfortunately, well-typed programs go wrong, in many ways that matter, but about which a conventional type system cannot speak: divisions by zero, out-of-bounds array accesses, information leaks. To prove a type system rules out (at compile time) more runtime errors, its semantics must be refined. However, there is often not enough type structure with which to express such semantics statically. So, we must refine our types with more information that tends to be related to programs. Great care is needed, though, because incorporating too much information (such as nonterminating programs themselves, as may happen in a dependent type system, where program terms may appear in types) can spoil good properties of the type system, such as type soundness or the decidability of type checking and inference.
Consider the inductive type \(\mathsf {List} \; A\) of lists with entries of type A. Such a list is either nil ( \([]\) ) or a term x of type A together with a tail list \(\mathit {xs}\) (that is, \(x :: \mathit {xs}\) ). In a typed functional language such as Haskell or OCaml, the programmer can define such a type by specifying its constructors:
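For example, in Haskell (a sketch; we write Nil and Cons for the constructors that the text writes as \([]\) and \(::\)):

    data List a = Nil | Cons a (List a)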
Suppose we define, by pattern matching, the function get, which takes a list \(\mathit {xs}\) and a natural number y, and returns the yth element of \(\mathit {xs}\) (where the first element is numbered zero):
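A Haskell rendering of get might look as follows (a sketch; Nat is a unary natural number type, and the three clauses are referred to below in order):

    data Nat = Zero | Succ Nat

    get :: List a -> Nat -> a
    get Nil         _        = error "index out of bounds"  -- first clause
    get (Cons x _)  Zero     = x                            -- second clause
    get (Cons _ xs) (Succ y) = get xs y                     -- third clause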
A conventional type system has no issue checking get against, say, the type \(\textsf {List} \; A \rightarrow \textsf {Nat} \rightarrow A\) (for any type A), but get is unsafe, because it throws an out-of-bounds error when the input number is greater than or equal to the length of the input list. If it should be impossible for get to throw such an error, then get must have a type where the input number is restricted to natural numbers strictly less than the length of the input list. Ideally, the programmer would simply refine the type of get, while leaving the program alone (except, perhaps, for omitting the first clause).
This, in contrast to dependent types [Martin-Löf 1984], is the chief aim of refinement types [Freeman and Pfenning 1991]: to increase the expressive power of a pre-existing (unrefined) type system, while keeping the latter’s good properties, such as type soundness and, especially, decidability of typing, so programmers are not too burdened with refactoring their code or manually providing tedious proofs. In other words, the point of refinement types is to increase the expressive power of a given type system while maintaining high automation (of typing for normal programs), whereas the point of dependent types is to be maximally expressive (even with the ambitious aim of expressing all mathematics), at the cost of automation (which dependent type system designers may try to increase after the fact of high expressivity).
To refine get’s type to rule out, statically, runtime out-of-bounds errors, we need to compare numbers against list lengths. Thus, we refine the type of lists by their length: \({\left\lbrace {\nu : \textsf {List} \; A} \;\middle |\; {\textsf {len} \; \nu = n}\right\rbrace }\) , the type of lists \(\nu\) of length n. This type looks a bit worrying, though, because the measurement, \(\textsf {len} \; \nu = n\) , seems to use a recursive program, len.
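The structurally recursive definition of len might be written as follows (a Haskell sketch, reusing the List and Nat declarations above):

    len :: List a -> Nat
    len Nil         = Zero
    len (Cons _ xs) = Succ (len xs)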
This definition happens to terminate when applied to lists, but there is no general algorithm for deciding whether an arbitrary computation terminates [Turing 1936]. As such, we would prefer not to use ordinary recursive programs at all in our type refinements. Indeed, doing so would seem to violate a phase distinction [Moggi 1989a; Harper et al. 1990] between static (compile time) specification and dynamic (runtime) program, which is almost indispensable for decidable typing.
The refinement type system Dependent ML (DML) [Xi 1998] provides a phase distinction in refining ML by an index domain that has no runtime content. Type checking and inference in DML are only decidable when the system generates constraints whose satisfiability is decidable. In practice, DML did generate decidable constraints, but that was not guaranteed by its design. DML’s distinction between indexes and programs allows it to support refinement types in the presence of computational effects (such as nontermination, exceptions, and mutable references) in a relatively straightforward manner. Further, the index-program distinction clarifies how to give a denotational semantics: A refinement type denotes a subset of what the type’s erasure (of indexes) denotes, and a program denotes precisely what its erasure denotes [Melliès and Zeilberger 2015]. Dependent type systems, by contrast, do not have such an erasure semantics.
Liquid type systems [Rondon et al. 2008; Kawaguchi et al. 2009; Vazou et al. 2013, 2014] seem to achieve highly expressive, yet sound and decidable recursive refinements [Kawaguchi et al. 2009] of inductive types by a kind of phase distinction: namely, by restricting the recursive predicates of specifications to terminating measures (like len) that soundly characterize, in a theory decidable by off-the-shelf tools like SMT solvers, the static structure of inductive types. Unlike DML, liquid typing can, for example, use the measure of whether a list of natural numbers is in increasing order, while remaining decidable. However, liquid typing’s lack of an index-program distinction makes it unclear how to give it a denotational semantics and has also led to subtleties involving the interaction between effects and evaluation strategy (we elaborate later in this section and in Section 2). Vazou et al. [2014] appear to provide a denotational semantics in their Section 3.3, but this is not really a denotational semantics in the sense we intend, because it is defined in terms of an operational semantics and not a separate and well-established mathematical model (such as domain theory).
Let us return to the \(\mathsf {get}\) example. Following the tradition of index refinement [Xi 1998], we maintain a phase distinction by syntactically distinguishing index terms, which can safely appear in types, from program terms, which cannot. In this approach, we want to check get against a more informative type
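\begin{align*} \forall {l:\mathbb {N}}.\: {\left\lbrace {\nu : \textsf {List} \; A} \;\middle |\; {\textsf {len} \; \nu = l}\right\rbrace } \rightarrow {\left\lbrace {\nu : \textsf {Nat}} \;\middle |\; {\nu \lt l}\right\rbrace } \rightarrow A \end{align*}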
quantifying over indexes l of sort \(\mathbb {N}\) (natural numbers) and requiring the accessing number to be less than l. However, this type is not quite right, because Nat is a type and \(\mathbb {N}\) is a sort, so writing “ \(\nu \lt l\) ” confounds our phase distinction between programs and indexes. Instead, the type should look more like
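\begin{align*} \forall {l:\mathbb {N}}.\: {\left\lbrace {\nu : \textsf {List} \; A} \;\middle |\; {\textsf {len} \; \nu = l}\right\rbrace } \rightarrow {\left\lbrace {\nu : \textsf {Nat}} \;\middle |\; {\textsf {index} \; \nu \lt l}\right\rbrace } \rightarrow A \end{align*}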
where
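\begin{align*} \textsf {index} \, (\textsf {zero}) &= 0 \\ \textsf {index} \, (\textsf {succ} \, x) &= 1 + \textsf {index} \, x \end{align*}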
computes the index term of sort \(\mathbb {N}\) that corresponds to a program term of type Nat, by a structural recursion homologous to that of len. The third clause of get has a nonempty list as input, so its index (length) must be \(1 + m\) for some m; the type checker assumes \(\textsf {index} \, (\textsf {succ} \, y) \lt 1 + m\) ; by the aforementioned homology, these constraints are again satisfied at the recursive call ( \(\mathsf {index} \, y \lt m\) ), until the second clause returns ( \(0 \lt 1 + m^{\prime }\) ). The first clause of get is impossible, because no natural number is less than zero. We can therefore safely remove this clause, or (equivalently) replace error with unreachable, which checks against any type under a logically inconsistent context, such as \(l:\mathbb {N}, n:\mathbb {N}, l = 0, n \lt l\) in this case.
Applying get to a list and a number determines the indexes l and n. We say that l and n are value-determined (here, by applying the function to values). If (perhaps in a recursive call) get is called with an empty list \([]\) and a natural number, then l is determined to be 0, and, since no index that is both negative and a natural number exists, no out-of-bounds error can arise by calling get. (Further, because \(l : \mathbb {N}\) strictly decreases at recursive calls, calling \(\mathsf {get}\) terminates.)
While this kind of reasoning about get’s type refinement may seem straightforward, how do we generalize it to recursion over any algebraic datatype (ADT)? What are its logical and semantic ingredients? How do we use these ingredients to concoct a type system with decidable typing, good (localized) error messages, and so on, while also keeping its metatheory relatively stable or maintainable under various extensions or different evaluation strategies?
Type systems that can do this kind of reasoning automatically, especially in a way that can handle any evaluation strategy, are hard to design correctly. Indeed, the techniques used in the original (call-by-value) liquid type system(s) [Rondon et al. 2008; Kawaguchi et al. 2009] had to be modified for Haskell, essentially because of Haskell’s call-by-name evaluation order [Vazou et al. 2014]. The basic issue was that binders can bind in (static) refinement predicates, which is fine when binders only bind values (as in call-by-value), but not when they bind computations that may not terminate (as in call-by-name). Liquid Haskell regained (operational) type soundness by introducing ad hoc restrictions that involve approximating whether binders terminate and using the refinement logic to verify termination.
We design a foundation on which to build liquid typing features that allows us to establish clear (semantic) soundness results, as well as the completeness of a decidable bidirectional typing algorithm. The main technique facilitating this is focusing, which we combine with bidirectional typing and value-determined indexes (the latter being a key ingredient to make measures work). In other words, this article is a first step toward reconciling DML and Liquid Haskell, using the proof-theoretic technique of focusing.
Andreoli [1992] introduced focusing to reduce the search space for proofs (programs) of logical formulas (types) by exploiting the property that some inference rules are invertible (the rule’s conclusion implies its premises). In relation to functional programming, focusing has been used, for example, to explain features such as pattern matching [Krishnaswami 2009], to explain the interaction between evaluation order and effects [Zeilberger 2009], and to reason about contextual program equivalence [Rioux and Zdancewic 2020]. Focusing has been applied to design a union and intersection refinement typing algorithm [Zeilberger 2009]. As far as we know, until now focusing has not been used to design an index refinement typing algorithm. By focusing on the typing of function argument lists and results, our focused system guarantees that value-determined existential indexes (unification variables) are solved before passing constraints to an SMT solver. For example, when our system infers a type for \(\mathsf {get} ([3,1,2], 2)\) , we first use the top-level annotation of \(\mathsf {get}\) to synthesize a type of roughly the following form
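\begin{align*} {\downarrow \!{(\forall {l:\mathbb {N}}.\: \forall {n:\mathbb {N}}.\: \textsf {List}(\textsf {Nat})(l) \rightarrow (\textsf {Nat}(n) \wedge n \lt l) \rightarrow {\uparrow \!{\textsf {Nat}}})}} \end{align*}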
(in which we have added polarity shifts \({\downarrow \!{-}}\) and \({\uparrow \!{-}}\) arising from focusing). The downshift \({\downarrow \!{-}}\) takes a negative type to a positive type of suspended computations. Second, we check the argument list ( \([3,1,2], 2\) ) against the negative (universally quantified) type. The upshift \({\uparrow \!{-}}\) takes a positive type A to negative type \({\uparrow \!{A}}\) (computations returning a value of type A). In typechecking the argument list, the upshift signifies the end of a (left) focusing stage, at which point the first argument value \([3,1,2]\) will have determined l to be 3 and the second argument value 2 will have determined n to be 2, outputting an SMT constraint without existentials: \(2 \lt 3\) .
Levy [2004] introduced the paradigm and calculus call-by-push-value (CBPV), which puts both call-by-name and call-by-value on equal footing in the storm of computational effects (such as nontermination). CBPV subsumes both call-by-name (CBN) and call-by-value (CBV) functional languages, because it allows us to encode both via type discipline. In particular, CBPV polarizes types into (positive) value types P and (negative) computation types N and provides polarity shifts \({\uparrow \!{P}}\) (negative) and \({\downarrow \!{N}}\) (positive); the monads functional programmers use to manage effects arise as the composite \({\downarrow \!{{\uparrow \!{-}}}}\) . These polarity shifts are the same as those arising from focusing. CBPV can be derived logically by way of focalization [Espírito Santo 2017], as we do for our system. Focalized CBPV is a good foundation for a refinement typing algorithm: Designing refinement typing algorithms is challenging and sensitive to effects and evaluation strategy, so it helps to refine a language that makes evaluation order explicit. We leverage focusing and our technique of value-determined indexes (a novelty in the DML tradition) to guarantee (like Liquid Haskell) the generation of SMT-solvable constraints.
Bidirectional typing [Pierce and Turner 2000] systematizes the difference between input (for example, type checking) and output (for example, type inference) and seems to fit nicely with focused systems [Dunfield and Krishnaswami 2021]. Bidirectional typing has its own practical virtues: It is easy to implement (if inputs and outputs fit together properly, that is, if the system is well-moded); it scales well (to refinement types, higher-rank polymorphism [Dunfield and Krishnaswami 2019], subtyping, effects—and so does CBPV); it leads to localized error messages; and it clarifies where type annotations are needed, typically in reasonable positions (such as at the top level) that are helpful as machine-checked documentation. In our system, annotations are needed only for recursive functions (to express termination refinements) and top-level definitions.
A focused and bidirectional approach therefore appears suitable, both theoretically and practically, for systematically designing and implementing an expressive language of type refinement that can handle any evaluation strategy and effect. We show that bidirectional typing and logical focusing work very well together at managing the complex flow of information pertaining to indexes of recursive data. In particular, value-determined existential indexes of input types are solved within focusing stages, ultimately leading to the output of constraints (and types) in the quantifier-free fragment solvable by SMT solvers.
Contributions. Our two key contributions are both a declarative/logical/semantic and an algorithmic account of recursive, index-based refinement of algebraic data types. For the logical account, we design a declarative type system in a bidirectional and focused style, resulting in a system with clear denotational semantics and soundness proofs and which is convenient for type theorists of programming languages. The declarative system conjures index solutions to existentials. For the algorithmic account, we design a type system similar to the declarative one but solving all existentials and prove it is decidable as well as sound and complete. We contribute:
A polarized declarative type system, including (polarized) subtyping, universal types, existential types, and index refinements with ordinary SMT constraints, as well as (nullary) recursive predicates on inductive data (which, in ongoing work, we are extending to multi-argument measures, which can express, for example, lists of increasing integer elements).
A proof that declarative typing is stable under substitution, which requires proving, among other things, that subtyping is transitive and typing subsumption is admissible.
A clear denotational semantics of the declarative system, based on elementary domain theory.
A type soundness proof with respect to our denotational semantics, which implies, relatively easily, both the refinement system’s logical consistency and total correctness—even if the programs are non-structurally recursive. To prove type soundness, we prove that value-determined indexes are sound: that is, semantic values uniquely determine value-determined indexes, semantically speaking (in particular, see Lemma 5.4).
A polarized subtyping algorithm, together with proofs that it is sound, complete, and decidable.
A polarized typing algorithm, together with proofs that it is sound, complete, and decidable. Completeness relies on the combination of our novel technique of value-determined indexes, focusing, and bidirectional typing. In particular, Lemma 7.7 implies that all existential variables are solved by the algorithm.
We relatively easily obtain both semantic and algorithmic results for a realistic language essentially by applying just one technique (based on fundamental logical principles): focusing.
All proofs are in the appendix.

2 Overview

This article is a first step toward reconciling Dependent ML and Liquid Haskell. The main thing we get from DML is the index-program distinction. Liquid Haskell provides or inspires three things. First, the observation of difficulties with effects and evaluation order inspired our use of CBPV. Second, we study (nullary) measures (supporting multi-argument measures is ongoing work). Third, our technique of value-determined indexes was inspired by the observation that variables appearing in liquid refinements correspond to inputs or outputs of functions.
Before diving into the details of our type system, we give an overview of the central logical, semantic, and algorithmic issues informing its design. The main technique we use to easily support both semantic and algorithmic results is focalization.
Refinement typing, evaluation strategy, and computational effects. The interactions between refinement typing (and other fancy typing), evaluation strategy, and computational effects are a source of peril. The combination of parametric polymorphism with effects is often unsound [Harper and Lillibridge 1991]; the value restriction in Standard ML recovers soundness in the presence of mutable references by restricting polymorphic generalization to syntactic values [Wright 1995]. The issue was also not peculiar to polymorphism: Davies and Pfenning [2000] discovered that a similar value restriction recovers type soundness for intersection refinement types and effects in call-by-value languages. For union types, Dunfield and Pfenning [2003] obtained soundness by an evaluation context restriction on union elimination.
For similar reasons, Liquid Haskell was also found unsound in practice and had to be patched; we adapt an example [Vazou et al. 2014] demonstrating the discovered unsoundness:
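In Liquid Haskell syntax, the adapted example looks roughly as follows (our rendering; the identifiers match the discussion below):

    {-@ diverge :: Int -> {v:Int | false} @-}
    diverge :: Int -> Int
    diverge n = diverge n              -- never returns, so any refinement holds vacuously

    {-@ safediv :: Int -> {d:Int | d /= 0} -> Int @-}
    safediv :: Int -> Int -> Int
    safediv x d = x `div` d

    unsafe :: Int -> Int
    unsafe x = let notused = diverge 1 -- notused : {v:Int | false}
                   y       = 0
               in safediv x y          -- checked under the inconsistent hypothesis false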
In typechecking unsafe, we need to check that the type of y (a singleton type of one value: 0) is a subtype of safediv’s second argument type (under the context of the let-binding). Due to the refinement of the let-bound notused, this subtyping generates a constraint or verification condition of the form “if false is true, then...” This constraint holds vacuously, implying that unsafe is safe. But unsafe really is unsafe, because Haskell evaluates lazily: Since notused is not used, diverge is never called, and hence safediv divides by zero (and crashes if uncaught). Vazou et al. [2014] recover type soundness and decidable typing by restricting let-binding and subtyping, using an operational semantics to approximate whether or not expressions diverge and whether or not they terminate to a finite value.
The value and evaluation context restrictions seem like ad hoc ways to cope with the failure of simple typing rules to deal with the interactions between effects and evaluation strategy. However, Zeilberger [2009] explains the value and evaluation context restrictions in terms of a logical view of refinement typing. Not only does this perspective explain these restrictions, it provides theoretical tools for designing type systems for functional languages with effects. At the heart of Zeilberger’s approach is the proof-theoretic technique of focusing, which we discuss near the end of this overview. An important question we address is whether polarization and focusing can also help us understand Liquid Haskell’s restrictions on let-binding and subtyping: Basically, our let-binding rule requires the bound computation (negative type) to terminate to a value (positive type). In other words, focalized systems satisfy any necessary value (and covalue) restrictions by default. We discuss this further in Section 8.
Focalization can also yield systems with good semantic properties under computational effects, in particular, variants of call-by-push-value.
Refining call-by-push-value. Call-by-push-value [Levy 2004] subsumes both call-by-value and call-by-name by polarizing the usual terms and types of the \(\lambda\) -calculus into a finer structure that can be used to encode both evaluation strategies in a way that can accommodate computational effects: value (or positive) types (classifying terms that “are,” that is, values v), computation (or negative) types (classifying terms that “do,” that is, expressions e), and polarity shifts \({\uparrow \!{-}}\) and \({\downarrow \!{-}}\) between them. An upshift \({\uparrow \!{P}}\) lifts a (positive) value type P up to a (negative) computation type of expressions that compute values (of type P). A downshift \({\downarrow \!{N}}\) pushes a (negative) computation type N down into a (positive) value type of thunked or suspended computations (of type N). We can embed the usual \(\lambda\) -calculus function type \(A \rightarrow _\lambda B\) (written with a subscript to distinguish it from the CBPV function type), for example, into CBPV (whose function types have the form \(P \rightarrow N\) for positive P and negative N) so it behaves like CBV, via the translation \(\iota _{\text{CBV}}\) with \(\iota _{\text{CBV}}(A \rightarrow _\lambda B) = {\downarrow \!{(\iota _{\text{CBV}}(A) \rightarrow {\uparrow \!{\iota _{\text{CBV}}(B)}})}}\) ; or so it behaves like CBN, via the translation \(\iota _{\text{CBN}}\) with \(\iota _{\text{CBN}}(A \rightarrow _\lambda B) = {\downarrow \!{\iota _{\text{CBN}}(A)}} \rightarrow \iota _{\text{CBN}}(B)\) .
Evaluation order is made explicit by CBPV type discipline. Therefore, adding a refinement layer on top of CBPV requires directly and systematically dealing with the interaction between type refinement and evaluation order. If we add this layer to CBPV correctly from the very beginning, then we can be confident that our type refinement system will be semantically well-behaved when extended with other computational effects. The semantics of CBPV are well-studied and this helps us establish semantic metatheory. In later parts of this overview, we show the practical effects of refining our focalized variant of CBPV, especially when it comes to algorithmic matters.
Type soundness, totality, and logical consistency. The unrefined system underlying our system has the computational effect of nontermination and hence is not total. To model nontermination, we give the unrefined system an elementary domain-theoretic denotational semantics. Semantic type soundness says that a syntactic typing derivation can be faithfully interpreted as a semantic typing derivation, that is, a morphism in a mathematical category, in this case a logical refinement of domains. Semantic type soundness basically corresponds to syntactic type soundness with respect to a big-step operational semantics. While we do not provide an operational semantics in this article, we do prove a syntactic substitution lemma that would be a key ingredient to prove that an operational semantics preserves typing (beta reduction performs syntactic substitution). The substitution lemma is also helpful to programmers, because it means they can safely perform program transformations and preserve typing. Because the unrefined system is (a focalized variant of) CBPV, proving type soundness is relatively straightforward.
In contrast to dependent types, the denotational semantics of our refined system is defined in terms of that of its erasure (of indexes), that is, its underlying, unrefined system. A refined type denotes a logical subset of what its erasure denotes. An unrefined return type \({\uparrow \!{P}}\) denotes either what P denotes or divergence/nontermination. A refined return type \({\uparrow \!{P}}\) denotes only what P denotes. Therefore, our refined type soundness result implies that our refined system (without a partial upshift type) enforces termination. The refined system can be extended (by a partial upshift type) to permit divergence while keeping type soundness (which implies partial correctness for partial upshifts). Type soundness also implies logical consistency, because a logically inconsistent refinement type denotes the empty set. We also prove that syntactic substitution is semantically sound, which would be a main lemma in proving that an operational semantics is equivalent to our denotational semantics.
In Section 5, we discuss these semantic issues in more detail.
Algebraic data types and measures. A novelty of liquid typing is the use of measures: functions, defined on algebraic data types, which may be structurally recursive, but are guaranteed to terminate and can therefore safely be used to refine the inductive types over which they are defined. (In this article, we only consider nullary measures.)
For example, consider the type \(\mathsf {BinTree} \; A\) of binary trees with terms of type A at leaves:
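For instance, in Haskell (a sketch; the constructor names are ours):

    data BinTree a = Leaf a | Node (BinTree a) (BinTree a)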
Suppose we want to refine \(\mathsf {BinTree} \; A\) by the height of trees. Perhaps the most direct way to specify this is to measure it using a function hgt defined by structural recursion:
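One plausible definition, as a Haskell sketch (we assume a leaf has height zero; the paper does not fix this convention):

    hgt :: BinTree a -> Nat
    hgt (Leaf _)   = Zero
    hgt (Node l r) = Succ (maxNat (hgt l) (hgt r))
      where
        maxNat Zero     n        = n
        maxNat (Succ m) Zero     = Succ m
        maxNat (Succ m) (Succ n) = Succ (maxNat m n)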
As another example, consider an inductive type \(\mathsf {Expr}\) of expressions in a CBV lambda calculus:
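For example, in Haskell (a sketch; the constructor names are ours):

    type Name = String
    data Expr = Var Name | Lam Name Expr | App Expr Expr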
Measures need not involve recursion. For example, if we want to refine the type \(\mathsf {Expr}\) to expressions that are values (in the sense of CBV, not CBPV), then we can use \(\mathsf {isval}\) :
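A Haskell sketch, returning a Boolean that stands for the indexes \(\mathsf {tt}\) and \(\mathsf {ff}\) (in CBV, variables and \(\lambda\) -abstractions are values, while applications are not):

    isval :: Expr -> Bool
    isval (Var _)   = True   -- index tt
    isval (Lam _ _) = True   -- index tt
    isval (App _ _) = False  -- index ff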
Because \(\mathsf {isval}\) is not recursive and returns indexes, it is safe to use it to refine \(\mathsf {Expr}\) to expressions that are CBV values: \({\left\lbrace {\nu : \mathsf {Expr}} \;\middle |\; {\mathsf {isval} \; \nu = \mathsf {tt}}\right\rbrace }\) . But, as with len (Section 1), we may again be worried about using the seemingly dynamic, recursively defined \(\mathsf {hgt}\) in a static type refinement. Again, we need not worry, because hgt, like len, is a terminating function into a decidable logic [Barrett et al. 2009]. We can use it to specify that, say, a height function defined by pattern matching on trees of type \({\left\lbrace {\nu : \mathsf {BinTree} \; A} \;\middle |\; {\mathsf {hgt} \; \nu = n}\right\rbrace }\) actually returns (the program value representing) n for any tree of height n. Given the phase distinction between indexes (like n) and programs, how do we specify such a function type? We use refinement type unrolling and singletons.
Unrolling and singletons. Let us consider a slightly simpler function, length, that takes a list and returns its length:
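In Haskell (a sketch, reusing the List and Nat declarations from Section 1 and hiding the Prelude’s length):

    import Prelude hiding (length)

    length :: List a -> Nat
    length Nil         = Zero
    length (Cons _ xs) = Succ (length xs)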
What should be the type specifying that length actually returns a list’s length? The proposal \(\forall {n:\mathbb {N}}.\: \textsf {List}(A)(n) \rightarrow {\uparrow \!{\textsf {Nat}}}\) does not work, because Nat has no information about the length n. Something like \(\forall {n:\mathbb {N}}.\: \textsf {List}(A)(n) \rightarrow {\uparrow \!{(n : \textsf {Nat})}}\) , read literally as returning the index n, would violate our phase distinction between programs and indexes. Instead, we use a singleton type in the sense of Xi [1998]: A singleton type contains just those program terms (of the type’s erasure) that correspond to exactly one semantic index. For example, given \(n:\mathbb {N}\) , we define the singleton type \(\textsf {Nat}(n)\) (which may also be written \(\textsf {Nat} \; n\) ) by \({\left\lbrace {\nu : \mathsf {Nat}} \;\middle |\; {\mathsf {index}\,{\nu } = n}\right\rbrace }\) where
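\begin{align*} \mathsf {index} \, (\mathsf {zero}) &= 0 \\ \mathsf {index} \, (\mathsf {succ} \, x) &= 1 + \mathsf {index} \, x \end{align*}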
specifies the indexes (of sort \(\mathbb {N}\) ) corresponding to program values of type \(\mathsf {Nat}\) .
How do we check length against \(\forall {n:\mathbb {N}}.\: \textsf {List}(A)(n) \rightarrow {\uparrow \!{(\mathsf {Nat}({n}))}}\) ? In the first clause, the input [] has type \(\textsf {List}(A)(n)\) for some n, but we need to know \(n=0\) (and that the index of zero is 0). Similarly, we need to know \(x :: \mathit {xs}\) has length \(n = 1 + n^{\prime }\) where \(n^{\prime } : \mathbb {N}\) is the length of \(\mathit {xs}\) . To generate these constraints, we use an unrolling judgment (Section 4.6) that unrolls a refined inductive type. Unrolling puts the type’s refinement constraints, expressed by asserting and existential types, in the structurally appropriate positions. An asserting type is written \(P \wedge \phi\) (read “P with \(\phi\) ”), where P is a (positive) type and \(\phi\) is an index proposition. If a term has type \(P \wedge \phi\) , then the term has type P and also \(\phi\) holds. (Dual to asserting types, we have the guarded type \(\phi \mathrel {\supset }N\) , which is equivalent to N if \(\phi\) holds, but is otherwise useless.) We use asserting types to express that index equalities like \(n=0\) hold for terms of inductive type. We use existentials to quantify over indexes that characterize the refinements of recursive subparts of inductive types, like \(n^{\prime }\) . For example, modulo a small difference (see Section 4.6), \(\textsf {List}(A)(n)\) unrolls to
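\begin{align*} (1 \wedge n = 0) \; + \; (A \times (\exists {n^{\prime }:\mathbb {N}}.\: \textsf {List}(A)(n^{\prime }) \wedge n = 1 + n^{\prime })) \end{align*}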
That is, to construct an A-list of length n, the programmer (or library designer) can either left-inject a unit value, provided the constraint \(n=0\) holds, or right-inject a pair of one A value and a tail list, provided that \(n^{\prime }\) , the length of the tail list, is \(n - 1\) (the equations \(n = 1 + n^{\prime }\) and \(n - 1 = n^{\prime }\) are equivalent). These index constraints are not a syntactic part of the list itself. That is, a term of the above refined type is also a term of the type’s erasure (of indexes):
\begin{align*} 1 + (| A | \times (\mathsf {List} \; | A |)), \end{align*}
where \(| - |\) erases indexes. Dual to verifying refined inductive values, pattern matching on refined inductive values, such as in the definition of length, allows us to use the index refinements locally in the bodies of the various clauses for different patterns. Liquid Haskell similarly extracts refinements of data constructors for use in pattern matching.
The shape of the refinement types generated by our unrolling judgment (such as the one above) is a judgmental and refined-ADT version of the fact that generalized ADTs (GADTs) can be desugared into types with equalities and existentials that express constraints of the return types for constructors [Cheney and Hinze 2003; Xi et al. 2003]. It would be tedious and inefficient for the programmer to work directly with terms of types produced by our unrolling judgment, but we can implement (in our system) singleton refinements of base types and common functions on them, such as addition, subtraction, multiplication, division, and the modulo operation on integers, and build these into the surface language used by the programmer, similarly to the implementation of Dependent ML [Xi 1998].
Inference and subtyping. For a typed functional language to be practical, it must support some degree of inference, especially for function application (to eliminate universal types) and constructing values (to introduce existential types). For example, to pass a value to a function, its type must be compatible with the function’s argument type, but it would be burdensome to make programmers always have to prove this compatibility. In our setting, for example, if \(x : \textsf {Nat}(3)\) and \(f : {\downarrow \!{(\forall {a:\mathbb {N}}.\: \textsf {Nat}(a) \rightarrow {\uparrow \!{P}})}}\) , then we would prefer to write \(f\;x\) rather than \(f\;[3]\;x\) , which would quickly make our programs incredibly—and unnecessarily—verbose.
Omitting index and type annotations, however, has significant implications. In particular, we need a mechanism to instantiate indexes somewhere in our typing rules: For example, if \(g : {\downarrow \!{({\downarrow \!{(\textsf {Nat}(4+b) \rightarrow {\uparrow \!{P}})}} \rightarrow N)}}\) and \(h : {\downarrow \!{((\exists {a:\mathbb {N}}.\: \textsf {Nat}(a)) \rightarrow {\uparrow \!{P}})}}\) , then to apply g to h, we need to know \(\textsf {Nat}(4+b)\) is compatible with \(\exists {a:\mathbb {N}}.\: \textsf {Nat}(a)\) , which requires instantiating the bound a to a term logically equal to \(4+b\) . Our system does this kind of instantiation via subtyping, which refinement types naturally give rise to: A type refinement is essentially a subtype of its erasure. Index instantiations are propagated locally across adjacent nodes in the syntax tree, similarly to Pierce and Turner [2000]. (Liquid typing allows for more inference, including inference of refinements based on templates, which provides additional convenience for programmers, but we do not consider this kind of inference in this article.)
We polarize subtyping into two, mutually recursive, positive and negative relations \({\Theta }\vdash{P}\leq^{+}{Q}\) and \({\Theta }\vdash{N}\leq^{-}{M}\) (where \(\Theta\) is a logical context including index propositions). The algorithmic versions of these only introduce existential variables in positive supertypes and negative subtypes, guaranteeing they can always be solved by indexes without any existential variables. We delay checking constraints until the end of certain, logically designed stages (the focusing ones, as we will see), when all of their existential variables are guaranteed to have been solved.
Value-determined indexes and type well-formedness. Like DML, we have an index-program distinction, but unlike DML and like Liquid Haskell, we want to guarantee SMT-solvable constraints. We accomplish this with our technique of value-determined indexes. To guarantee that our algorithm can always instantiate quantifiers, we restrict quantification to indexes appearing in certain positions within types: namely, those that are uniquely determined (semantically speaking) by values of the type, both according to a measure and before crossing a polarity shift (which in this case marks the end of a focusing stage). For example, in \({\left\lbrace {\nu : \textsf {List} \; A} \;\middle |\; {\textsf {len}\,{\nu } = b}\right\rbrace }\) , the index b is uniquely determined by values of that type: The list \([x, y]\) uniquely determines b to be 2 (by the length measure). This value-determinedness restriction on quantification has served to explain why a similar restriction in the typing algorithm of Flux (Liquid Rust) seemed to work well in practice [Lehmann et al. 2023].
We make this restriction in the type well-formedness judgment, which outputs a context \(\Xi\) tracking value-determined indexes; well-formed types can only quantify over indexes in \(\Xi\) . For example, \(\exists {b:\mathbb {N}}.\: {\left\lbrace {\nu : \textsf {List} \; A} \;\middle |\; {\textsf {len}\,{\nu } = b}\right\rbrace }\) is well-formed. The variables in \(\Xi\) also pay attention to type structure: For example, a value of a product type is a pair of values, where the first value determines all \(\Xi _1\) (for the first type component) and the second value determines all \(\Xi _2\) (second type component), so the \(\Xi\) of the product type is their union \(\Xi _1 \cup \Xi _2\) . We also take the union for function types \(P \rightarrow N\) , because index information flows through argument types toward the return type, marked by a polarity shift.
By emptying \(\Xi\) at shift types, we prevent lone existential variables from being introduced at a distance, across polarity shifts. In practice, this restriction on quantification is not onerous, because most functional types that programmers use are, in essence, of the form
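\begin{align*} {\downarrow \!{(\forall {\vec{a}:\vec{\tau }}.\: P_1 \rightarrow \cdots \rightarrow P_k \rightarrow {\uparrow \!{(\exists {\vec{b}:\vec{\tau }^{\prime }}.\: Q)}})}} \end{align*}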
where the “ \(\forall\) ” quantifies over indexes of argument types \(P_k\) that are uniquely determined by argument values, and the “ \(\exists\) ” quantifies over indexes of the return type that are determined by (or at least depend on) fully applying the function and thereby constructing a value to return. The idea of this restriction was inspired by liquid types, because they implicitly follow it: Variables appearing in liquid type refinements must ultimately come from arguments x to dependent functions \(x\!:\!A \rightarrow B\) and their return values (however, these are not explicitly index variables).
Types that quantify only across polarity shifts tend to be empty, useless, or redundant. The ill-formed type \(\forall {n:\mathbb {N}}.\: 1 \rightarrow {\uparrow \!{\mathsf {Nat}({n})}}\) is empty because no function returns all naturals when applied to unit. A term of ill-formed type \(\exists {m:\mathbb {N}}.\: {\downarrow \!{(\mathsf {Nat}({m}) \rightarrow {\uparrow \!{\mathsf {Bool}}}}})\) can only be applied to an unknown number, which is useless because the number is unknown. The ill-formed type \(\exists {n:\mathbb {N}}.\: {\uparrow \!{{\downarrow \!{\mathsf {Nat}({n})}}}}\) is redundant because it is semantically equivalent to \({\downarrow \!{{\uparrow \!{\exists {n:\mathbb {N}}.\: \mathsf {Nat}({n})}}}}\) (which does not quantify across a polarity shift), and similarly \(\forall {n:\mathbb {N}}.\: {\uparrow \!{{\downarrow \!{(\mathsf {Nat}({n}) \rightarrow {\uparrow \!{\mathsf {Nat}({n})}})}}}}\) is semantically equivalent to \({\uparrow \!{{\downarrow \!{(\forall {n:\mathbb {N}}.\: \mathsf {Nat}({n}) \rightarrow {\uparrow \!{\mathsf {Nat}({n})}})}}}}\) . Some refinements are not value-determined but useful nonetheless, such as dimension types [Kennedy 1994; Dunfield 2007b] that statically check that dimensions are used consistently (minutes can be added to minutes, but not to kilograms) but do not store the dimensions at runtime. In this article, we do not consider these non-value-determined refinements, and Liquid Haskell does not support them either.
Our value-determinedness restriction on type well-formedness, together with focusing, is very helpful metatheoretically, because it means that our typing algorithm only introduces—and is guaranteed to solve—existential variables for indexes within certain logical stages. For example, consider checking a list against \(\exists {b:\mathbb {N}}.\: {\left\lbrace {\nu : \textsf {List} \; A} \;\middle |\; {\textsf {len}\,{\nu } = b}\right\rbrace }\) . An existential variable \(\hat{b}\) for b is generated, and we check the unrolled list against the unrolling of \({\lbrace {\nu : \textsf {List} \; A} \; |\; {\textsf {len}\,{\nu } = \hat{b}}\rbrace }\) . A solution to \(\hat{b}\) will be found within value typechecking (the right-focusing stage), using the invariant that no measure (such as \(\textsf {len}\) ) contains any existential variables. Similarly, applying a function with universal quantifiers will solve all existential variables arising from these quantifiers by the end of a left-focusing stage, which typechecks an argument list.
Focusing, CBPV, and bidirectional typing. In proof theory, the technique of focusing [Andreoli 1992] exploits invertibility properties of logical formulas (types), as they appear in inference rules (typing rules), to rule out many redundant proofs. Having fewer possible proofs makes proof search more tractable. Brock-Nannestad et al. [2015] and Espírito Santo [2017] study the relation between CBPV and focusing: Each work provides a focused calculus that is essentially the same as CBPV, “up to the question of \(\eta\) -expansion” [Brock-Nannestad et al. 2015]. Our system is also a focused variant of CBPV; in fact, it arises from a certain focalization (and bidirectionalization) of ordinary intuitionistic logic.
An inference rule is invertible if its conclusion implies its premises. For example, in intuitionistic logic, the right rule for implication is invertible because its premise \(\Gamma , A {\vdash} B\) can be derived from its conclusion \(\Gamma {\vdash} A \rightarrow B\) :
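\begin{align*} \dfrac{\dfrac{\Gamma {\vdash } A \rightarrow B}{\Gamma , A {\vdash } A \rightarrow B}\;\text{(weakening)} \qquad \dfrac{}{\Gamma , A {\vdash } A}\;\text{(hypothesis)}}{\Gamma , A {\vdash } B}\;\text{(implication elimination)} \end{align*}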
However, both right rules for disjunction, for example, are not invertible, which we can prove with a counterexample: \(A_1 + A_2 {\vdash} A_1 + A_2\) but \(A_1 + A_2 \nvdash A_1\) and \(A_1 + A_2 \nvdash A_2\) . In a sequent calculus, positive formulas have invertible left rules and negative formulas have invertible right rules. A weakly focused sequent calculus eagerly applies non-invertible rules as far as possible (in either left- or right-focusing stages); a strongly focused sequent calculus does, too, but also eagerly applies invertible rules as far as possible (in either left- or right-inversion stages). There are also stable stages (or moments) in which a decision has to be made between focusing on the left or on the right [Espírito Santo 2017]. The decision can be made explicitly via proof terms (specifically, cuts): In our system, a principal task of let-binding, a kind of cut, is to focus on the left (to process the list of arguments in a bound function application); and a principal task of pattern matching, another kind of cut, is to focus on the right (to decompose the value being matched against a pattern).
From a Curry–Howard view, let-binding and pattern matching are different kinds of cuts. The cut formula A—basically, the type being matched or let-bound—must be synthesized (inferred) as an output (judgmentally, \(\cdots {\color{red}{\Rightarrow}} A\) ) from heads h (variables and annotated values) or bound expressions g (function application and annotated returner expressions); and ultimately, the outcomes of these cuts in our system are synthesized. But all other program terms are checked against input types A: judgmentally, \(\cdots {\color{blue}{\Leftarrow}} A \cdots\) or \(\cdots [A] {\vdash} \cdots\) . In this sense, both our declarative and algorithmic type systems are bidirectional [Dunfield and Krishnaswami 2021].
In inversion stages, that is, expression typechecking (where a negative type is on the right of a turnstile) and pattern matching (where a positive type is on the left of a turnstile), refinements often need to be extracted from types to be used. For example, suppose we want to check the expression \({\lambda} {x}. {\tt return}\,{x}\) against the type \((1 \wedge \mathsf {ff}) \rightarrow {\uparrow \!{(1 \wedge \mathsf {ff})}}\) , which is semantically equivalent to \(\mathsf {ff}\mathrel {\supset }1 \rightarrow {\uparrow \!{(1 \wedge \mathsf {ff})}}\) . We need to extract \(\mathsf {ff}\) (of \((1 \wedge \mathsf {ff}) \rightarrow \cdots\) ) and put it in the logical context so we can later use it (in typechecking the value x) to verify \(\mathsf {ff}\) (of \(\cdots {\uparrow \!{(1 \wedge \mathsf {ff})}}\) ). If we instead put \(x : 1 \wedge \mathsf {ff}\) in the program context, then \(\mathsf {ff}\) would not be usable (unless we can extract it from the program context, but it is simpler to extract from types as soon as possible rather than to extend the extraction judgment or to add another one) and typechecking would fail (it should succeed). Similarly, since subtyping may be viewed as implication, index information from positive subtypes or negative supertypes needs to be extracted for use. Declaratively, it is okay not to extract eagerly at polarity shifts in subtyping (the subtyping rules that extract are invertible), but it seems necessary in the algorithmic system.
Our declarative system includes two focusing stages, one (value typechecking) for positive types on the right of the turnstile ( \({\vdash}\) ) and the other (spine typing) for negative types on the left. Our algorithmic system closely mirrors the declarative one, but does not conjure index instantiations or witnesses (like t in DeclSpine \(\forall\) below), and instead systematically introduces and solves existential variables (like solving the existential variable \(\hat{a}\) as t in AlgSpine \(\forall\) below), which we keep in algorithmic contexts \(\Delta\) .
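Schematically, the two rules contrast as follows (our reconstruction; the algorithmic judgment forms are introduced below):

\begin{align*} \dfrac{\Theta {\vdash } t : \tau \qquad {\Theta }; {\Gamma }; [{[t/a]N}] {\vdash } {s} \gg {{\uparrow \!{P}}}}{{\Theta }; {\Gamma }; [{\forall {a:\tau }.\: N}] {\vdash } {s} \gg {{\uparrow \!{P}}}}\;\textsf {DeclSpine}\forall \qquad \dfrac{{\Theta ; \Delta , \hat{a}:\tau }; {\Gamma }; [{[\hat{a}/a]N}] {\vdash } {s} \gg {{\uparrow \!{P}}} \mathrel {/}{\chi } \dashv {\Delta ^{\prime }}}{{\Theta ; \Delta }; {\Gamma }; [{\forall {a:\tau }.\: N}] {\vdash } {s} \gg {{\uparrow \!{P}}} \mathrel {/}{\chi } \dashv {\Delta ^{\prime }}}\;\textsf {AlgSpine}\forall \end{align*}

In AlgSpine \(\forall\) , the output context \(\Delta ^{\prime }\) records the solution of \(\hat{a}\) (say, \(\hat{a} = t\) ) found while processing the arguments s.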
For example, applying a function of type \(\forall {b:\mathbb {N}}.\: \mathsf {List}{{(\mathsf {Nat})}}{{(b)}} \rightarrow \cdots\) to the list \([4,1,2]\) should solve b to an index semantically equal to 3; the declarative system guesses an index term (like \(3+0\) ), but the algorithmic system mechanically solves for it.
Our algorithmic right-focusing judgment has the form \({\Theta ; \Delta }; {\Gamma } {\vdash} {v} {\color{blue}{\Leftarrow}} {P} \mathrel {/}{\chi } \dashv {\Delta ^{\prime }}\) , where \(\chi\) is an output list of typing constraints and \(\Delta ^{\prime }\) is an output context that includes index solutions to existentials. Similarly, the left-focusing stage is \({\Theta ; \Delta }; {\Gamma }; [{N}] {\vdash} {s} \gg {{\uparrow \!{P}}} \mathrel {/}{\chi } \dashv {\Delta ^{\prime }}\) ; it focuses on decomposing N (the type of the function being applied), introducing its existential variables for the arguments in the list s (sometimes called a spine [Cervesato and Pfenning 2003]) and outputting its guards to verify at the end of decomposition (an upshift). These existential variables flow to the right-focusing stage (value typechecking) and are solved there, possibly via subtyping. Constraints \(\chi\) are only checked right at the end of focusing stages, when all their existential variables are solved. For example, consider our rule for (fully) applying a function h to a list of arguments s:
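Schematically (our reconstruction; the precise rule appears later in the article):

\begin{align*} \dfrac{{\Theta }; {\Gamma } {\vdash } {h} {\color{red}{\Rightarrow}} {{\downarrow \!{N}}} \qquad {\Theta ; \widehat{\Xi _N}}; {\Gamma }; [{N}] {\vdash } {s} \gg {{\uparrow \!{P}}} \mathrel {/}{\chi } \dashv {\Delta } \qquad {\Theta }; {\Gamma } \lhd {[\Delta ]\chi }}{{\Theta }; {\Gamma } {\vdash } {h(s)} {\color{red}{\Rightarrow}} {[\Delta ]{\uparrow \!{P}}}} \end{align*}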
After synthesizing a thunk type \({\downarrow \!{N}}\) for the function h we are applying, we process the entire list of arguments s, until N’s return type \({\uparrow \!{P}}\) . All (existential) value-determined indexes \(\Xi _N\) of N are guaranteed to be solved by the time an upshift is reached, and these solutions are eagerly applied to constraints \(\chi\) , so \(\chi\) does not have existential variables and is hence SMT-solvable ( \({\Theta }; {\Gamma } \lhd {\chi }\) ). The polarization of CBPV helps guarantee all solutions have no existential variables. Focusing stages introduce existential variables to input types, which may appear initially as a positive supertype in the subtyping premise for typechecking (value) variables. These existential variables are solved using the positive subtype, which never has existential variables. Dually, negative subtypes may have existential variables, but negative supertypes never do.
Our system requires intermediate computations like \(h(s)\) to be explicitly named and sequenced by let-binding (a kind of A-normal [Flanagan et al. 1993] or let-normal form). Combined with focusing, this allows us to use (within the value typechecking stage) subtyping only in the typing rule for (value) variables. This makes our subsumption rule syntax-directed, simplifying and increasing the efficiency of our algorithm. We nonetheless prove a general subsumption lemma, which is needed to prove that substitution respects typing, a key syntactic or operational correctness property.
Focusing also gives us pattern matching for free [Krishnaswami 2009]: From a Curry–Howard view, the left-inversion stage is pattern matching. The (algorithmic) left-inversion stage in our system is written \({\Theta }; {\Gamma }; [{P}] \rhd {\lbrace {{r}_{i}} \mathbin {\Rightarrow }{{e}_{i}}\rbrace _{{i} \in {I}}} {\color{blue}{\Leftarrow}} {N}\) : It decomposes the positive P (representing the pattern being matched) to the left of the turnstile (written \(\rhd\) to distinguish the algorithmic judgment from the corresponding declarative judgment, which instead uses \({\vdash}\) ). Our system is “half” strongly focused: We eagerly apply right-invertible but not left-invertible rules. This makes pattern matching in our system resemble the original presentation of pattern matching in CBPV. From a Curry–Howard view, increasing the strength of focusing would permit nested patterns.
A pattern type can have index equality constraints, such as for refined ADT constructors (for example, that the length of an empty list is zero) as output by unrolling. By using these equality constraints, we get a coverage-checking algorithm. For example, consider checking get (introduced in Section 1) against a type of roughly the following form
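\begin{align*} \forall {l:\mathbb {N}}.\: \forall {k:\mathbb {N}}.\: \textsf {List}(A)(l) \rightarrow (\textsf {Nat}(k) \wedge k \lt l) \rightarrow {\uparrow \!{A}} \end{align*}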
At the clause
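\begin{align*} [] \mathbin {\Rightarrow } \texttt {unreachable} \end{align*}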
we extract a logically inconsistent context ( \(l:\mathbb {N}, k:\mathbb {N}, l = 0, k \lt l\) ), which entails that unreachable checks against any type. Proof-theoretically, this use of equality resembles the elimination rule for Girard–Schroeder-Heister equality [Girard 1992; Schroeder-Heister 1994].
Bidirectional typing controls the flow of type information. Focusing in our system directs the flow of index information. The management of the flow of type refinement information, via the stratification of both focusing and bidirectionality, makes our algorithmic metatheory highly manageable.

3 Example: Verifying Mergesort

We show how to verify a non-structurally recursive mergesort function in our system: namely, that it terminates and returns a list with the same length as the input list (to verify it returns an ordered list, we need to extend our system with multi-argument measures, which is outside the scope of this article). We only consider sorting lists of natural numbers \(\mathsf {Nat}\) , defined as \(\exists {n : \mathbb {N}}.\: \mathsf {Nat}({n})\) . For clarity, and continuity with Sections 1 and 2, we sometimes use syntactic sugar such as clausal pattern-matching, combining let-binding with pattern-matching on the let-bound variable, using “if-then-else” rather than pattern-matching on the Boolean type, and combining two or more pattern-matching expressions into one with a nested pattern such as \(x :: y :: \mathit {xs}\) .
Given type A and \(n : \mathbb {N}\) , we define \(\mathsf {List}{{(A)}}{{(n)}}\) by \({\left\lbrace {\nu : \textsf {List} \; A} \;\middle |\; {\textsf {len} \; \nu = n}\right\rbrace }\) . Modulo a small difference (see Section 4.6), our unrolling judgment unrolls \(\textsf {List}(A)(n)\) to
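\begin{align*} (1 \wedge n = 0) \; + \; (A \times (\exists {n^{\prime }:\mathbb {N}}.\: \textsf {List}(A)(n^{\prime }) \wedge n = 1 + n^{\prime })) \end{align*}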
which is a refinement of \(1 + (| A | \times \mathsf {List} \; | A |)\) . This is an unrolling of the inductive type, not the inductive type itself, so we must roll values of it into the inductive type. We use syntactic sugar: namely, \([]\) stands for \({\tt into}({{\tt inj}_{1}\, \texttt{ ()}})\) and \(x :: \mathit {xs}\) stands for \({\tt into}({{\tt inj}_{2}\, \langle {x}, {\mathit {xs}}\rangle })\) .
Just as we need a natural number type associating natural number program values with natural number indexes, we need a Boolean type of values corresponding to Boolean indexes. To this end, define the measure
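\begin{align*} \mathsf {ixbool} \; {\tt true} &= \mathsf {tt} \\ \mathsf {ixbool} \; {\tt false} &= \mathsf {ff} \end{align*}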
Given \(b:\mathbb {B}\) , the singleton type of Boolean b is \(\mathsf {Bool}(b) = {\left\lbrace {\nu : \mathsf {Bool}} \;\middle |\; {\mathsf {ixbool} \; \nu = b}\right\rbrace }\) . Our unrolling judgment (Section 4.6) outputs the following type, a refinement of the Boolean type encoded as \(1 + 1\) :
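\begin{align*} (1 \wedge b = \mathsf {tt}) + (1 \wedge b = \mathsf {ff}) \end{align*}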
We encode \({\tt true}\) as \({\tt into}({{\tt inj}_{1}\,{\texttt{ ()}}})\) , which has type \(\mathsf {Bool}(\mathsf {tt})\) , and \({\tt false}\) as \({\tt into}({{\tt inj}_{2}\,{\texttt{ ()}}})\) , which has type \(\mathsf {Bool}(\mathsf {ff})\) . The Boolean type \(\mathsf {Bool}\) is defined as \(\exists {b:\mathbb {B}}.\: \mathsf {Bool}(b)\) .
Assume we have the following:
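For instance, signatures along these lines (a reconstruction; div and its nonzero-divisor guard are described below, while the comparison leq and its exact type are our assumption, used by merge):

\begin{align*} \mathsf {div} &: {\downarrow \!{(\forall {m:\mathbb {N}}.\: \forall {n:\mathbb {N}}.\: \mathsf {Nat}(m) \rightarrow \mathsf {Nat}(n) \rightarrow (n \ne 0) \mathrel {\supset } {\uparrow \!{\mathsf {Nat}(m \div n)}})}} \\ \mathsf {leq} &: {\downarrow \!{(\forall {m:\mathbb {N}}.\: \forall {n:\mathbb {N}}.\: \mathsf {Nat}(m) \rightarrow \mathsf {Nat}(n) \rightarrow {\uparrow \!{\mathsf {Bool}(m \le n)}})}} \end{align*}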
The SMT solver Z3 [de Moura and Bjørner 2008], for example, supports integer division (and modulo and remainder operators); internally, these operations are translated to multiplication. Here, we are considering natural number indexes, but we can add the constraint \(n \ge 0\) (for naturals n) when translating them to integers in an SMT solver such as Z3. Allowing integer division in general is not SMT decidable, but for this example, n is always instantiated to a constant (2), which is decidable. Note that Z3 supports division by zero, but our \(\mathsf {div}\) has a guard requiring the divisor to be nonzero ( \(n \ne 0\) ), so we need not consider this. Division on naturals takes the floor of what would otherwise be a rational number (for example, \(3 \div 2 = 1\) ).
First, we define the function \(\mathsf {merge}\) for merging two lists while preserving order. It takes two lists as inputs, but also a natural number representing the sum of their lengths. Since at least one list decreases in length at recursive calls, so does the sum of their lengths, implying the function terminates when applied. However, in the refined system presented in (the figures of) this article, to keep things simple, we provide only one rule for recursive expressions, whose termination metric is \(\lt\) on natural numbers. Because the system as presented lacks a rule that supports a termination metric on sums \(n_1 + n_2\) of natural numbers, we need to reify the sum \(n = n_1 + n_2\) of the length indexes \(n_1\) and \(n_2\) into a ghost parameter n of (asserting) singleton type. However, we emphasize the ghost parameter in this example is not a fundamental limitation of our system, because our system can be extended to include other termination metrics such as \(\lt\) on the sum of natural numbers (we discuss this further in Section 4.7).
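In Haskell-like sugar, merge looks roughly as follows (a sketch on built-in lists; the paper’s version is written in the let-normal CBPV calculus, and the intended refinement type appears in the comment):

    -- merge : ∀n,n1,n2:ℕ. (Nat(n) ∧ n = n1 + n2) → List(Nat)(n1) → List(Nat)(n2)
    --           → ↑ List(Nat)(n1 + n2)
    -- The ghost parameter n strictly decreases at each recursive call.
    merge :: Int -> [Int] -> [Int] -> [Int]
    merge _ [] ys = ys
    merge _ xs [] = xs
    merge n (x : xs) (y : ys)
      | x <= y    = x : merge (n - 1) xs (y : ys)
      | otherwise = y : merge (n - 1) (x : xs) ys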
In a well-typed let-binding \({\tt let}\;{x}\,\texttt {=}\,{g}{\tt ;}\; e\) the bound expression g is a value-returning computation (that is, has upshift type), and e is a computation that binds to x the value (of positive type) resulting from computing g. (Liquid Haskell, lacking CBPV’s type distinction between computations and values, instead approximates whether binders terminate to a value.) Since x has positive type, we can match it against patterns (see, for example, the final clause of \(\mathsf {split}\) , discussed next).
We now define the function \(\mathsf {split}\) that takes a list and splits it into two lists. It is a standard “every-other” implementation, and we have to be a bit careful about the type refinement so as not to be “off by one” in the lengths of the resulting lists.
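A sketch in the same style (the intended refinement gives the first result the ceiling and the second the floor of half the input length):

    -- split : ∀n:ℕ. List(Nat)(n) → ↑ (List(Nat)(⌈n/2⌉) × List(Nat)(⌊n/2⌋))
    split :: [Int] -> ([Int], [Int])
    split []           = ([], [])
    split [x]          = ([x], [])
    split (x : y : zs) = let (xs, ys) = split zs
                         in (x : xs, y : ys)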
We are ready to implement \(\mathsf {mergesort}\) , but, since we use a ghost parameter, we need to define an auxiliary function \(\mathsf {auxmergesort}\) additionally taking the length of the list being ordered. We introduce syntactic sugar for a let-binding followed by pattern-matching on its result.
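A sketch (assuming the merge and split sketches above; the let with a tuple pattern plays the role of the let-then-match sugar, and the ghost length n strictly decreases at the recursive calls, since both halves of a list of length at least 2 are strictly shorter):

    -- auxmergesort : ∀n:ℕ. Nat(n) → List(Nat)(n) → ↑ List(Nat)(n)
    auxmergesort :: Int -> [Int] -> [Int]
    auxmergesort _ []  = []
    auxmergesort _ [x] = [x]
    auxmergesort n zs  =
      let (xs, ys) = split zs
          n1 = n - n `div` 2   -- length of xs, the ceiling of n/2
          n2 = n `div` 2       -- length of ys, the floor of n/2
      in merge n (auxmergesort n1 xs) (auxmergesort n2 ys)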
Finally, we define a \(\mathsf {mergesort}\) that is verified to terminate and to return a list of the same length as the input list.
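A sketch (on built-in lists, assuming the auxmergesort sketch above; here length is the Prelude function):

    -- mergesort : ∀n:ℕ. List(Nat)(n) → ↑ List(Nat)(n)
    mergesort :: [Int] -> [Int]
    mergesort xs = auxmergesort (length xs) xs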
In the system of this article, we cannot verify that \(\mathsf {mergesort}\) returns a sorted list. This is because our system lacks multi-argument measures that can specify relations between indexes of different parts of a data type. (To handle this, we are extending our system with multi-argument measures, which is nontrivial and requires a significant degree of additional machinery outside the scope of this article.) But this example is interesting nonetheless, because \(\mathsf {auxmergesort}\) is not structurally recursive: Its recursive calls are on lists obtained by splitting the input list roughly in half, not on the structure of the list ( \(- :: -\) ). Further, it showcases the main features of our foundational system, the declarative specification of which we turn to next, in Section 4.

4 Declarative System

We present our core declarative calculus and type system.
First, we discuss the syntax of program terms, types, index terms, sorts, functors, algebras, and contexts. Then, in Section 4.1, we discuss the index sorting judgment \(\Theta {\vdash} t : \tau\) and the propositional validity judgment \({\Theta } {\vdash} {\phi } \;\mathsf {true}\) , index-level substitution, and basic logical properties required of the index domain. In Section 4.2, we discuss the well-formedness of (logical and program) contexts ( \({\Theta }\; \mathsf {ctx}\) and \({\Theta } {\vdash} {\Gamma }\; \mathsf {ctx}\) ), types ( \({\Theta } {\vdash} {A}\; \mathsf {type} {[{\Xi }}]\) ), functors ( \({\Theta } {\vdash} {\mathcal {F}}\; \mathsf {functor} {[{\Xi }]}\) ), and algebras ( \({\Xi }; {\Theta } {\vdash} {\alpha } : {F}({\tau }) \Rightarrow {\tau }\) ). In Section 4.3, we discuss judgmental equivalence. In Section 4.4, we discuss extraction ( \({\Theta }\vdash{A}\rightsquigarrow{A^{\prime }}[{\Theta ^{\prime }}]\) ). In Section 4.5, we discuss subtyping ( \({\Theta }\vdash {A}\leq{B}\) ). In Section 4.6, we discuss the unrolling judgment for refined inductive types. In Section 4.7, we discuss the typing system. In Section 4.8, we extend substitution to that of program values for program variables and prove a substitution lemma stating that typing is stable under substitution (a key operational correctness result).
Fig. 1.
Fig. 1. Declarative judgments and their presuppositions.
In Figure 1, we summarize the key judgments constituting the declarative system. In the figure, “pre.” abbreviates “presupposes,” which basically lists the judgments (rightmost column) we tend to leave implicit in rules defining the given judgment (leftmost column). Presuppositions also indicate the (input or output) moding of judgments. For example, on the one hand, \({\Theta }; {\Gamma } {\vdash} {v} {\color{blue}{\Leftarrow}} {P}\) presupposes \({\Theta } {\vdash} {\Gamma }\; \mathsf {ctx}\) and \({\Theta } {\vdash} {P}\; \mathsf {type} {[{\Xi _P}}]\) for some \(\Xi _P\) , where the former presupposes \({\Theta }\; \mathsf {ctx}\) , and \(\Theta\) , \(\Gamma\) , and P are input-moded; on the other hand, \({\Theta }; {\Gamma } {\vdash} {h} {\color{red}{\Rightarrow}} {P}\) does not presuppose \({\Theta } {\vdash} {P}\; \mathsf {type} {[{\Xi _P}}]\) for some \(\Xi _P\) , but rather, we must prove that the output-moded P is well-formed (which is straightforward). Groups of mutually defined judgments are separated by blank lines.
Program terms. Program terms are defined in Figure 2. We polarize terms into two main syntactic categories: expressions (which have negative type) and values (which have positive type). Program terms are further distinguished according to whether their principal types are synthesized (heads and bound expressions) or checked (spines and patterns).
Fig. 2.
Fig. 2. Program terms.
Expressions e consist of functions \(\lambda x.\,e\) , recursive expressions \({\tt rec}\;{x:N}.\;{e}\) , let-bindings \({\tt let}\;{x}\,\texttt {=}\,{g}{\tt ;}\;{e}\) , match expressions \({\tt match}\;{h}\;{\lbrace {{r}_{i}} \mathbin {\Rightarrow }{{e}_{i}}\rbrace _{{i} \in {I}}}\) , value returners (or producers) \({\tt return}\,{v}\) , and an unreachable expression unreachable (such as an impossible match). Bound expressions g, which can be let-bound, consist of expressions annotated with a returner type \(\texttt {(}{e} : {{\uparrow \!{P}}}\texttt {)}\) and applications \(h(s)\) of a head h to a spine s. Heads h, which can be applied to a spine or pattern-matched, consist of variables x and positive-type-annotated values \(\texttt {(}{v} : {P}\texttt {)}\) . Spines s are lists of values; we often omit the empty spine \(\cdot\) , writing (for example) \(v_1, v_2\) instead of \(v_1, v_2, \cdot\) . In match expressions, heads are matched against patterns r.
Values consist of variables x, the unit value \(\left\langle \right\rangle\) , pairs \(\langle {v_1}, {v_2}\rangle\) , injections into sum type \({\tt inj}_{k}\,{v}\) where k is 1 or 2, rollings into inductive type \({\tt into}({v})\) , and thunks (suspended computations) \(\left\lbrace {e}\right\rbrace\) .
Types. Types are defined in Figure 3. Types are polarized into positive (value) types P and negative (computation) types N. We write A, B, and C for types of either polarity.
Fig. 3.
Fig. 3. Types.
Positive types consist of the unit type 1, products \(P_1 \times P_2\) , the void type 0, sums \(P_1 + P_2\) , downshifts (of negative types; thunk types) \({\downarrow \!{N}}\) , asserting types \(P \wedge \phi\) (read “P with \(\phi\) ”), index-level existential quantifications \(\exists {a:\tau }.\: P\) , and refined inductive types \({\left\lbrace {\nu : \mu F} \;\middle |\; {(\mathsf {fold}_{F}\;{\alpha })\,{\nu } =_\tau t}\right\rbrace }\) . We read \({\left\lbrace {\nu : \mu F} \;\middle |\; {(\mathsf {fold}_{F}\;{\alpha })\,{\nu } =_\tau t}\right\rbrace }\) as the type having values \(\nu\) of inductive type \(\mu F\) (with signature F) such that the (index-level) measurement \((\mathsf {fold}_{F}\;{\alpha })\,{\nu } =_\tau t\) holds; in Sections 4.0.1 and 5, we explain the metavariables F, \(\alpha\) , \(\tau\) , and t, as well as what these and the syntactic parts \(\mu\) and \(\mathsf {fold}\) denote. Briefly, \(\mu\) roughly denotes “least fixed point of” and a \(\mathsf {fold}\) over F with \(\alpha\) (having carrier sort \(\tau\) ) indicates a measure on the inductive type \(\mu F\) into \(\tau\) .
Negative types consist of function types \(P \rightarrow N\) , upshifts (of positive types; lift or returning types) \({\uparrow \!{P}}\) (dual to \({\downarrow \!{N}}\) ), propositionally guarded types \(\phi \mathrel {\supset }N\) (read “ \(\phi\) implies N”; dual to \(P \wedge \phi\) ), and index-level universal quantifications \(\forall {a:\tau }.\: N\) (dual to \(\exists {a:\tau }.\: P\) ).
In \(P \wedge \phi\) and \(\phi \mathrel {\supset }N\) , the index proposition \(\phi\) has no runtime content. Neither does the a in \(\exists {a:\tau }.\: P\) and \(\forall {a:\tau }.\: N\) , nor the recursive refinement predicate \((\mathsf {fold}_{F}\;{\alpha })\,{\nu } =_\tau t\) in \({\left\lbrace {\nu : \mu F} \;\middle |\; {(\mathsf {fold}_{F}\;{\alpha })\,{\nu } =_\tau t}\right\rbrace }\) .
Index language: sorts, terms, and propositions. Our type system is parametric in the index domain, provided the latter has certain (basic) properties. For our system to be decidable, the index domain must be decidable. It is instructive to work with a specific index domain: Figure 4 defines a quantifier-free logic of linear equality, inequality, and arithmetic, which is decidable [Barrett et al. 2009].
Fig. 4.
Fig. 4. Index domain.
Sorts \(\tau\) consist of Booleans \(\mathbb {B}\) , natural numbers \(\mathbb {N}\) , integers \(\mathbb {Z}\) , and products \(\tau _1 \times \tau _2\) . Index terms t consist of variables a, numeric constants n, addition \(t_1 + t_2\) , subtraction \(t_1 - t_2\) , pairs \((t_1, t_2)\) , projections \(\pi _1\,{t}\) and \(\pi _2\,{t}\) , and propositions \(\phi\) . Propositions \(\phi\) (the logic of the index domain) are built over index terms and consist of equality \(t_1 = t_2\) , inequality \(t_1 \le t_2\) , conjunction \(\phi _1 \wedge \phi _2\) , disjunction \(\phi _1 \vee \phi _2\) , negation \(\lnot \phi\) , trivial truth \(\mathsf {tt}\) , and trivial falsity \(\mathsf {ff}\) .
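For illustration, the example index domain admits a direct representation as a small syntax tree; the following Haskell rendering is merely one possible encoding (all constructor names here are illustrative, not fixed by the system):

data Sort = SBool | SNat | SInt | SProd Sort Sort   -- 𝔹, ℕ, ℤ, τ1 × τ2
  deriving (Eq, Show)

data Index                                          -- index terms t
  = IVar String | INum Integer
  | IAdd Index Index | ISub Index Index
  | IPair Index Index | IProj1 Index | IProj2 Index
  | IProp Prop                                      -- propositions are index terms
  deriving (Eq, Show)

data Prop                                           -- propositions φ
  = PEq Index Index | PLeq Index Index
  | PAnd Prop Prop | POr Prop Prop | PNot Prop
  | PTt | PFf
  deriving (Eq, Show)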

4.0.1 Inductive Types, Functors, and Algebras.

We encode algebraic data types (and measures on them) using their standard semantics. In the introduction (Section 1), to refine the type of A-lists by their length, we defined a recursive function len over the inductive structure of lists. Semantically, we characterize this structural recursion by algebraic folds over polynomial endofunctors; we design our system in line with this semantics. While this presentation may appear overly abstract for the user, a surface language could let users write measures in the same (or similar) syntax as programs, annotating them as measures in the style of Liquid Haskell.
We express inductive type structure without reference to constructor names by syntactic functors resembling the polynomial functors. For example (modulo the difference for simplifying unrolling), we can specify the signature of the inductive type of lists of terms of type A syntactically as a functor \(\underline{1} \oplus (\underline{A} \otimes \mathrm{Id})\) , where \(\underline{C}\) denotes the constant (set) functor (sending any set to the set denoted by type C), \(\mathrm{Id}\) denotes the identity functor (sending any set to itself), the denotation of \(F_1 \otimes F_2\) sends a set X to the product \(([\![ {F_1}]\!] _{} X) \times ([\![ {F_2}]\!] _{} X)\) , and the denotation of \(F_1 \oplus F_2\) sends a set X to the disjoint union \(([\![ {F_1}]\!] _{}X) \uplus ([\![ {F_2}]\!] _{}X)\) . The idea is that each component of the sum functor \(\oplus\) represents a data constructor, so (for example) \(\underline{1}\) represents the nullary constructor \([]\) , and \(\underline{A}\) represents the head element of a cons cell that is attached (via \(\otimes\) ) to the recursive tail list represented by \(\mathrm{Id}\) .
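This semantics suggests the standard Haskell encoding of polynomial functors and their least fixed points; the following sketch (with illustrative names) renders the list signature functor and the generic fold used below:

data ListF a x = NilF | ConsF a x     -- corresponds to 1 ⊕ (A ⊗ Id)

instance Functor (ListF a) where
  fmap _ NilF        = NilF
  fmap g (ConsF a x) = ConsF a (g x)

newtype Mu f = In (f (Mu f))          -- μF, the least fixed point

-- The recursion principle: fold a μF value with an algebra alg : f b -> b.
fold :: Functor f => (f b -> b) -> Mu f -> b
fold alg (In t) = alg (fmap (fold alg) t)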
A functor F (Figure 5) is a sum ( \(\oplus\) ) of products ( \(\hat{P}\) ), which multiply ( \(\otimes\) ) base functors (B) consisting of identity functors that represent recursive positions ( \(\mathrm{Id}\) ) and constant functors ( \(\underline{P}\) ) at positive type P. The rightmost factor in a product \(\hat{P}\) is the (product) unit functor I. By convention, \(\otimes\) has higher precedence than \(\oplus\) . For convenience in specifying functor well-formedness (appendix Figure 7) and denotation (appendix Figure 37), \(\mathcal {F}\) is a functor F or a base functor B.
Fig. 5.
Fig. 5. Functors.
A direct grammar F for sums of products (of constant and identity functors) consists of \(F ::= \hat{P} \mid F \oplus F\) and \(\hat{P} ::= B \mid B \otimes \hat{P}\) and \(B ::= \underline{P} \mid \mathrm{Id}\) . The grammar \(F ::= F \oplus F \mid F \otimes F \mid B\) is semantically equivalent to sums of products, but syntactically inconvenient, because it allows writing products of sums. We do not use either of these grammars, but rather Figure 5, because it simplifies inductive type unrolling (Section 4.6). (In any case, a surface language where data types have named constructors would have to be elaborated to use one of these grammars.) These grammars have naturally isomorphic interpretations. For example, in our functor grammar (Figure 5), we instead write \(\textsf {NatF} = I \oplus (\mathrm{Id}\otimes I)\) (note I is semantically equivalent to \(\underline{1}\) ): Notice that for any set X, we have \([\![ {I \oplus (\mathrm{Id}\otimes I)}]\!] _{} X = 1 \uplus (X \times 1) \cong 1 \uplus X = [\![ {\underline{1} \oplus \mathrm{Id}}]\!] _{} X\) .
As we will discuss in Section 5, every polynomial endofunctor F has a fixed point \(\mu F\) satisfying a recursion principle for defining measures (on \(\mu F\) ) by folds with algebras. We define algebras in Figure 6. An algebra \(\alpha\) is a list of clauses \({p} \mathbin {\Rightarrow }{t}\) that pattern match on algebraic structure (p, q, and o are patterns) and bind variables in index bodies t. Sum algebra patterns p consist of \({\tt inj}_{1}\,{p}\) and \({\tt inj}_{2}\,{p}\) (which match on sum functors \(\oplus\) ). Product algebra patterns q consist of tuples \((o, q)\) (which match on \(\otimes\) ) ending in the unit pattern \(\texttt{ ()}\) (which match on I). Base algebra patterns o consist of wildcard patterns \(\top\) (which match on constant functors \(\underline{P}\) ), variable patterns a (which match on the identity functor \(\mathrm{Id}\) ), and pack patterns \({{\tt pack}}{\left(a, o\right)}\) (which match on existential constant functors \(\underline{\exists {a:\tau }.\: P}\) , where a is also bound in the bodies t of algebra clauses).
Fig. 6.
Fig. 6. Algebras.
For example, given a type P, consider the functor \(I \oplus (\underline{P} \otimes \mathrm{Id}\otimes I)\) . To specify the function \(\textsf {length} : \textsf {List}\;P \rightarrow \mathbb {N}\) computing the length of a list of values of type P, we write the algebra \({\tt inj}_{1}\,\texttt{()} \mathbin {\Rightarrow } 0 \parallel {\tt inj}_{2}\,(\top , (a, \texttt{()})) \mathbin {\Rightarrow } 1+a\) with which to fold \(\textsf {List}\;P\) .
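In the Haskell encoding sketched above, this algebra and the resulting measure can be rendered as follows (again, the names are illustrative):

-- The inj₁ clause sends the nil case to 0; the inj₂ clause sends a cons cell
-- with recursive result a to 1 + a, ignoring (⊤) the head element.
lenAlg :: ListF p Integer -> Integer
lenAlg NilF        = 0
lenAlg (ConsF _ a) = 1 + a

length' :: Mu (ListF p) -> Integer
length' = fold lenAlg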
With the pack algebra pattern, we can use indexes of an inductive type in our measures. For example, given \(a:\mathbb {N}\) , and defining the singleton type \(\textsf {Nat}(a)\) as \({\left\lbrace {\nu :\mu \textsf {NatF}} \;\middle |\; {(\mathsf {fold}_{\textsf {NatF}}\;{\textsf {ixnat}})\,{\nu } = a}\right\rbrace }\) where \(\textsf {NatF} = I \oplus \mathrm{Id}\otimes I\) and \(\textsf {ixnat} = {\tt inj}_{1}\,\texttt{()} \mathbin {\Rightarrow } 0 \parallel {\tt inj}_{2}\,(a, \texttt{()}) \mathbin {\Rightarrow } 1+a\) , consider lists of natural numbers, specified by \(I \oplus \underline{\exists {b:\mathbb {N}}.\: \textsf {Nat}(b)} \otimes \mathrm{Id}\otimes I\) . Folding such a list with the algebra \({\tt inj}_{1}\,\texttt{()} \mathbin {\Rightarrow } 0 \parallel {\tt inj}_{2}\,({{\tt pack}}{\left(b, \top \right)}, a, \texttt{()}) \mathbin {\Rightarrow } a + b\) sums all the numbers in the list. (For clarity, we updated the definitions in Section 2 to agree with our grammars as presented in Figures 5 and 6.)
For measures relating indexes in structurally distinct positions within an inductive type, in ongoing work, we are extending our system with multi-argument measures by way of higher-order sorts \(\tau _1 \Rightarrow \tau _2\) . Doing so would allow us to refine, for example, integer lists to lists of integers in increasing order, because we could then compare the indexed elements of a list.
Contexts. A logical context \(\Theta ::= \cdot \mid \Theta , a : \tau \mid \Theta , \phi\) is an ordered list of index propositions \(\phi\) and (index) variable sortings \(a:\tau\) (which may be used in subsequent propositions). A program variable context (or program context) \(\Gamma ::= \cdot \mid \Gamma , x : P\) is a set of (program) variable typings \(x:P\) . A value-determined context \(\Xi ::= \cdot \mid \Xi , a : \tau\) is a set of index sortings \(a:\tau\) . In any kind of context, a variable can be declared at most once.

4.1 Index Sorting and Propositional Validity

We have a standard index sorting judgment \({\Theta } {\vdash} {t} : {\tau }\) (appendix Figure 4) checking that, under context \(\Theta\) , index term t has sort \(\tau\) . For example, \({a:\mathbb {N}} {\vdash} {\lnot (a \le a+1)} : {\mathbb {B}}\) . This judgment does not depend on propositions in \(\Theta\) , which only matter when checking propositional validity (only done in subtyping and program typing). The operation \(\overline{\Theta }\) merely removes all propositions \(\phi\) from \(\Theta\) .
Well-sorted index terms \(\Theta {\vdash} t : \tau\) denote functions \([\![ {t}]\!] _{} : [\![ {\overline{\Theta }}]\!] _{} \rightarrow [\![ {\tau }]\!] _{}\) . For each \(\Theta\) , define \([\![ {\Theta }]\!] _{}\) as the set of index-level semantic substitutions (defined in this paragraph) \({\left\lbrace {\delta } \;\middle |\; {{\vdash} \delta : \Theta }\right\rbrace }\) . For example, \([\![ {4+a}]\!] _{3/a} = 7\) and \([\![ {b=1+0}]\!] _{1/b} = {\lbrace \bullet \rbrace }\) (that is, true) and \([\![ {a=1}]\!] _{2/a} = \emptyset\) (that is, false). An index-level semantic substitution \({\vdash} \delta : \Theta\) assigns exactly one semantic index value d to each index variable in \(\mathit {dom}(\Theta)\) such that every proposition \(\phi\) in \(\Theta\) is true (written \({\lbrace \bullet \rbrace }\) ; false is \(\emptyset\) ):
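\[
\frac{}{{\vdash} \cdot : \cdot}
\qquad
\frac{{\vdash} \delta : \Theta \qquad d \in [\![ {\tau} ]\!]_{}}{{\vdash} (\delta , d/a) : (\Theta , a:\tau)}
\qquad
\frac{{\vdash} \delta : \Theta \qquad [\![ {\phi} ]\!]_{\delta} = {\lbrace \bullet \rbrace}}{{\vdash} \delta : (\Theta , \phi)}
\]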
A propositional validity or truth judgment \({\Theta } {\vdash} {\phi } \;\mathsf {true}\) , which is a semantic entailment relation, holds if \(\phi\) is valid under \(\Theta\) , that is, if \(\phi\) is true under every interpretation of variables in \(\Theta\) such that all propositions in \(\Theta\) are true. We say t and \(t^{\prime }\) are logically equal under \(\Theta\) if \({\Theta } {\vdash} {t = t^{\prime }} \;\mathsf {true}\) .
An index-level syntactic substitution \(\sigma\) is a list of index terms to be substituted for index variables: \(\sigma ::= \cdot \mid \sigma , t/a\) . The metaoperation \([\sigma ]\mathcal {O}\) , where \(\mathcal {O}\) is an index term, program term, or type, is sequential substitution: \([\cdot ]\mathcal {O} = \mathcal {O}\) and \([\sigma , t/a]\mathcal {O} = [\sigma ]([t/a]\mathcal {O})\) , where \([t/a]\mathcal {O}\) is standard capture-avoiding (by \(\alpha\) -renaming) substitution. Syntactic substitutions (index-level) are typed (“sorted”) in a standard way. Because syntactic substitutions substitute terms that may have free variables, their judgment form includes a context to the left of the turnstile, in contrast to semantic substitution:
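\[
\frac{}{\Theta _0 {\vdash} \cdot : \cdot}
\qquad
\frac{\Theta _0 {\vdash} \sigma : \Theta \qquad {\Theta _0} {\vdash} {[\sigma ]t} : {\tau}}{\Theta _0 {\vdash} (\sigma , t/a) : (\Theta , a:\tau)}
\qquad
\frac{\Theta _0 {\vdash} \sigma : \Theta \qquad {\Theta _0} {\vdash} {[\sigma ]\phi} \;\mathsf {true}}{\Theta _0 {\vdash} \sigma : (\Theta , \phi)}
\]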
Because our substitution operation \([\sigma ]-\) applies sequentially, we type (“sort”) the application of the rest of the substitution to the head being substituted. For example, the rule concluding \(\Theta _0 {\vdash} (\sigma , t/a) : (\Theta , a:\tau)\) checks that the application \([\sigma ]t\) of \(\sigma\) to t has sort \(\tau\) .
The decidability of our system depends on the decidability of propositional validity. Our example index domain is decidable [Barrett et al. 2009]. Our system is parametric in the index domain, provided the latter has certain properties. In particular, propositional validity must satisfy the following basic properties required of a logical theory ( \({\Theta }\; \mathsf {ctx}\) is logical context well-formedness):
Weaken: If \({\Theta _1, \Theta , \Theta _2}\; \mathsf {ctx}\) and \({\Theta _1, \Theta _2} {\vdash} {\phi } \;\mathsf {true}\) , then \({\Theta _1, \Theta , \Theta _2} {\vdash} {\phi } \;\mathsf {true}\) .
Permute: If \({\Theta , \Theta _1}\; \mathsf {ctx}\) and \({\Theta , \Theta _2}\; \mathsf {ctx}\) and \({\Theta , \Theta _1, \Theta _2, \Theta ^{\prime }} {\vdash} {\phi } \;\mathsf {true}\) , then \({\Theta , \Theta _2, \Theta _1, \Theta ^{\prime }} {\vdash} {\phi } \;\mathsf {true}\) .
Substitution: If \({\Theta } {\vdash} {\phi } \;\mathsf {true}\) and \(\Theta _0 {\vdash} \sigma : \Theta\) , then \({\Theta _0} {\vdash} {[\sigma ]\phi } \;\mathsf {true}\) .
Equivalence: The relation \({\Theta } {\vdash} {t_1 = t_2} \;\mathsf {true}\) is an equivalence relation.
Assumption: If \({\Theta _1, \phi , \Theta _2}\; \mathsf {ctx}\) , then \({\Theta _1, \phi , \Theta _2} {\vdash} {\phi } \;\mathsf {true}\) .
Consequence: If \({\Theta _1} {\vdash} {\psi } \;\mathsf {true}\) and \({\Theta _1, \psi , \Theta _2} {\vdash} {\phi } \;\mathsf {true}\) , then \({\Theta _1, \Theta _2} {\vdash} {\phi } \;\mathsf {true}\) .
Consistency: It is not the case that \({\cdot } {\vdash} {\mathsf {ff}} \;\mathsf {true}\) .
We also assume that \({\Theta } {\vdash} {t} : {\tau }\) is decidable and satisfies weakening and substitution. Our example index domain satisfies all these properties.

4.2 Well-formedness

Context well-formedness \({\Theta }\; \mathsf {ctx}\) and \({\Theta } {\vdash} {\Gamma }\; \mathsf {ctx}\) (appendix Figure 8) is straightforward. For both logical and program context well-formedness, there can be at most one of each variable. Index terms in well-formed logical contexts must have Boolean sort:
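\[
\frac{}{{\cdot}\;\mathsf {ctx}}
\qquad
\frac{{\Theta}\;\mathsf {ctx} \qquad a \notin \mathit {dom}(\Theta)}{{\Theta , a:\tau}\;\mathsf {ctx}}
\qquad
\frac{{\Theta}\;\mathsf {ctx} \qquad {\Theta} {\vdash} {\phi} : {\mathbb {B}}}{{\Theta , \phi}\;\mathsf {ctx}}
\]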
In well-formed program variable contexts \({\Theta } {\vdash} {\Gamma }\; \mathsf {ctx}\) , the types (of program variables) must be well-formed under \(\Theta\) ; further, we must not be able to extract index information from these types (in the sense of Section 4.4). For example, \(x : 1 \wedge \mathsf {ff}\) is an ill-formed program context because \(\mathsf {ff}\) can be extracted, but \(x : {\downarrow \!{{\uparrow \!{1 \wedge \mathsf {ff}}}}}\) is well-formed because nothing under a shift type can be extracted.
Type well-formedness \({\Theta } {\vdash} {A}\; \mathsf {type} {[{\Xi }]}\) (read “under \(\Theta\) , type A is well-formed with value-determined indexes \(\Xi\) ”) has \(\Xi\) in output mode, which tracks index variables appearing in the type A that are uniquely determined by values of refined inductive types in A, particularly by their folds. (See Lemma 5.4 in Section 5.) Consider the following type well-formedness rule:
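\[
\frac{{\Theta } {\vdash} {F}\; \mathsf {functor} {[{\Xi }]} \qquad {\cdot }; {\Theta } {\vdash} {\alpha } : {F}({\tau }) \Rightarrow {\tau } \qquad (b:\tau) \in \Theta}{{\Theta } {\vdash} {{\left\lbrace {\nu : \mu F} \;\middle |\; {(\mathsf {fold}_{F}\;{\alpha })\,{\nu } =_\tau b}\right\rbrace }}\; \mathsf {type} {[{\Xi \cup b:\tau }]}}\;{\text{DeclTp}}\mu{\text{Var}}
\]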
The index b is uniquely determined by a value of the conclusion type, so we add it to \(\Xi\) . For example, the value \(\mathsf {one} = {\tt into}({{\tt inj}_{2}\,{\langle {{\tt into}({{\tt inj}_{1}\,{\left\langle \right\rangle }})}, {\left\langle \right\rangle }\rangle }})\) determines the variable b appearing in the value’s type \(\textsf {Nat}(b) = {\left\lbrace {\nu :\mu \textsf {NatF}} \;\middle |\; {(\mathsf {fold}_{\textsf {NatF}}\;{\textsf {ixnat}})\,{\nu } = b}\right\rbrace }\) to be one. (We have a similar rule where the b-position metavariable is not an index variable, adding nothing to \(\Xi\) .) We use set union ( \(\Xi \cup b:\tau\) ), as b may already be value-determined in F (that is, \((b:\tau)\) may be in \(\Xi\) ). The algebra well-formedness premise \({\cdot }; {\Theta } {\vdash} {\alpha } : {F}({\tau }) \Rightarrow {\tau }\) requires the algebra \(\alpha\) to be closed (that is, the first context is empty, \(\cdot\) ). This premise ensures that existential variables never appear in algebras, which is desirable, because folds with algebras solve existential variables when typechecking a value (see Section 6).
Because ultimately \(\Xi\) tracks only measure-determined indexes, \({\text{DeclTp}}\mu{\text{Var}}\) is the only rule that adds to \(\Xi\) . The index propositions of asserting and guarded types do not track anything beyond what is tracked by the types to which they are connected.
We restrict quantification to value-determined index variables to guarantee we can always solve them algorithmically. For example, in checking \(\mathsf {one}\) against the type \(\exists {a:\mathbb {N}}.\: \mathsf {Nat}(a)\) , we solve a to an index semantically equal to \(1 \in \mathbb {N}\) . If \({\Theta , a:\tau } {\vdash} {P}\; \mathsf {type} {[{\Xi }}]\) , then \(\exists {a:\tau }.\: P\) is well-formed if and only if \((a:\tau) \in \Xi\) , and similarly for universal quantification (which we will restrict to the argument types of function types; argument types are positive):
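\[
\frac{{\Theta , a:\tau } {\vdash} {P}\; \mathsf {type} {[{\Xi , a:\tau }]}}{{\Theta } {\vdash} {\exists {a:\tau }.\: P}\; \mathsf {type} {[{\Xi }]}}
\qquad
\frac{{\Theta , a:\tau } {\vdash} {N}\; \mathsf {type} {[{\Xi , a:\tau }]}}{{\Theta } {\vdash} {\forall {a:\tau }.\: N}\; \mathsf {type} {[{\Xi }]}}
\]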
We read commas in value-determined contexts such as \(\Xi _1, \Xi _2\) as set union \(\Xi _1 \cup \Xi _2\) together with the fact that \({dom}(\Xi _1) \cap {dom}(\Xi _2) = \emptyset\) , so these rules can be read top-down as removing a.
A value of product type is a pair of values, so we take the union of what each component value determines:
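\[
\frac{{\Theta } {\vdash} {P_1}\; \mathsf {type} {[{\Xi _1}]} \qquad {\Theta } {\vdash} {P_2}\; \mathsf {type} {[{\Xi _2}]}}{{\Theta } {\vdash} {P_1 \times P_2}\; \mathsf {type} {[{\Xi _1 \cup \Xi _2}]}}
\]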
We also take the union for function types \(P \rightarrow N\) , because to use a function, due to focusing, we must provide values for all its arguments. The \(\Xi\) of \(\mathsf {Nat}(a) \rightarrow {\uparrow \!{\mathsf {Nat}(a)}}\) is \(a:\mathbb {N}\) , so \(\forall {a:\mathbb {N}}.\: \mathsf {Nat}(a) \rightarrow {\uparrow \!{\mathsf {Nat}(a)}}\) is well-formed. In applying a head of this type to a value, we must instantiate a to an index semantically equal to what that value determines; for example, if the value is \(\mathsf {one}\) , then a gets instantiated to an index semantically equal to \(1 \in \mathbb {N}\) .
However, a value of sum type is either a left- or right-injected value, but we do not know which, so we take the intersection of what each injection determines:
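\[
\frac{{\Theta } {\vdash} {P_1}\; \mathsf {type} {[{\Xi _1}]} \qquad {\Theta } {\vdash} {P_2}\; \mathsf {type} {[{\Xi _2}]}}{{\Theta } {\vdash} {P_1 + P_2}\; \mathsf {type} {[{\Xi _1 \cap \Xi _2}]}}
\]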
The unit type 1 and void (empty) type 0 both have empty \(\Xi\) . We also empty out value-determined indexes at shifts, preventing certain quantifications over shifts. For example, \(\forall {a:\mathbb {N}}.\: {\uparrow \!{\textsf {Nat}(a)}}\) (which is void anyway) is not well-formed. Crucially, we will see that this restriction, together with focusing, guarantees indexes will be algorithmically solved by the end of certain stages.
We define functor and algebra well-formedness in Figure 7 of the appendix.
Fig. 7.
Fig. 7. Declarative extraction.
Functor well-formedness \({\Theta } {\vdash} {\mathcal {F}}\; \mathsf {functor} {[{\Xi }]}\) is similar to type well-formedness: Constant functors output the \(\Xi\) of the underlying positive type, the identity and unit functors \(\mathrm{Id}\) and I have empty \(\Xi\) , the product functor \(B \otimes \hat{P}\) takes the union of the component \(\Xi\) s, and the sum functor \(F_1 \oplus F_2\) takes the intersection. The latter two reflect the fact that unrolling inductive types (Section 4.6) generates \(+\) types from \(\oplus\) functors and \(\times\) types from \(\otimes\) functors. That I has empty \(\Xi\) reflects that 1 (unrolled from I) does, too, together with the fact that asserting and guarded types do not affect \(\Xi\) .
Algebra well-formedness \({\Xi }; {\Theta } {\vdash} {\alpha } : {F}({\tau }) \Rightarrow {\tau }\) (read “under \(\Xi\) and \(\Theta\) , algebra \(\alpha\) is well-formed and has type \(F(\tau) \Rightarrow \tau\) ”) has two contexts: \(\Xi\) is for \(\alpha\) (in particular, the index bodies of its clauses) and \(\Theta\) is for F (in particular, the positive types of constant functors); we maintain the invariant that \(\Xi \subseteq \Theta\) . We have these separate contexts to prevent existential variables from appearing in \(\alpha\) (as explained with respect to \({\text{DeclTp}}\mu{\text{Var}}\) ) while still allowing them to appear in F. For example, consider \(\exists {b:\mathbb {N}}.\: {\left\lbrace {\nu : \mu F(b)} \;\middle |\; {(\mathsf {fold}_{F(b)}\;{\alpha })\,{\nu } = n}\right\rbrace }\) , where \(F(b) = (\underline{\textsf {Nat}(b)} \otimes I) \oplus (\underline{\textsf {Nat}(b)} \otimes \mathrm{Id}\otimes I)\) and \(\alpha = {\tt inj}_{1}\,(\top , \texttt{()}) \mathbin {\Rightarrow } 0 \parallel {\tt inj}_{2}\,(\top , a, \texttt{()}) \mathbin {\Rightarrow } 1+a\) .
Refined inductive type well-formedness initializes the input \(\Xi\) to \(\cdot\) , but index variables can be bound in the body of an algebra:
where the right rule simultaneously binds a in both t and Q, and the left rule only binds a in t (but we add a to both contexts to maintain the invariant \(\Xi \subseteq \Theta\) for inputs \(\Xi\) and \(\Theta\) ). We sort algebra bodies only when a product ends at a unit (possible by design of the functor grammar), and merely under \(\Xi\) ; constant functors depend on \(\Theta\) :
For algebras \(\alpha\) of “type” \((F_1 \oplus F_2) \; \tau \Rightarrow \tau\) , we use a straightforward judgment \({\alpha } \circ \mathsf {inj}_{k} \circeq {\alpha _k}\) (appendix Figure 5) that outputs the kth clause \(\alpha _k\) of input algebra \(\alpha\) :
By restricting the bodies of algebras to index terms t and the carriers of our F-algebras to index sorts \(\tau\) , we uphold the phase distinction: We can therefore safely refine inductive types by folding them with algebras and also manage decidable typing.

4.3 Equivalence

We have equivalence judgments for propositions \({\Theta }\vdash{\phi }\equiv{\psi }\) (appendix Figure 11), logical contexts \({\Theta }\vdash{\Theta_1}\equiv{\Theta _2}\) (appendix Figure 12), functors \({\Theta }\vdash{\mathcal {F}}\equiv{\mathcal {G}}\) (appendix Figure 13), and types \({\Theta }\vdash{A}\equiv^{\pm}{B}\) (appendix Figure 14), which use \({\Theta } {\vdash} {\phi } \;\mathsf {true}\) to verify logical equality of index terms. Basically, two entities are equivalent if their respective, structural subparts are equivalent (under the logical context). Type/functor equivalence is used in sum and refined ADT subtyping (type equivalence implies mutual subtyping), as well as to prove algorithmic completeness (appendix Lemma B.108), but context equivalence is only used to prove algorithmic completeness (appendix Lemma B.95). However, it should be possible to remove equivalence from the system itself by using “subfunctoring” and covariant sum subtyping. For space reasons, we do not show all their rules here (see appendix), only the ones we think are most likely to surprise.
Refined inductive types are equivalent only if they use syntactically the same algebra (but the algebra must be well-formed at both functors F and G; this holds by inversion on the conclusion’s presupposed type well-formedness judgments):
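\[
\frac{{\Theta }\vdash{F}\equiv{G} \qquad {\Theta } {\vdash} {t = t^{\prime }} \;\mathsf {true}}{{\Theta }\vdash{{\left\lbrace {\nu : \mu F} \;\middle |\; {(\mathsf {fold}_{F}\;{\alpha })\,{\nu } =_\tau t}\right\rbrace }}\equiv^{+}{{\left\lbrace {\nu : \mu G} \;\middle |\; {(\mathsf {fold}_{G}\;{\alpha })\,{\nu } =_\tau t^{\prime }}\right\rbrace }}
\]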
Two index equality propositions (respectively, two index inequalities) are equivalent if their respective sides are logically equal:
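\[
\frac{{\Theta } {\vdash} {t_1 = s_1} \;\mathsf {true} \qquad {\Theta } {\vdash} {t_2 = s_2} \;\mathsf {true}}{{\Theta }\vdash{(t_1 = t_2)}\equiv{(s_1 = s_2)}}
\]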
We use logical context equivalence in proving subsumption admissibility (see Section 4.8) and the completeness of algorithmic typing (see Section 7.3). Two logical contexts are judgmentally equivalent under \(\Theta\) if they have exactly the same variable sortings (in the same list positions) and logically equivalent (under \(\Theta\) ) propositions, in the same order. The most interesting rule is the one for propositions, where, in the second premise, we filter out propositions from \(\Theta _1\) because we want each respective proposition to be logically equivalent under the propositions (and indexes) of \(\Theta\) , but variables in \(\Theta _1\) (or \(\Theta _2\) ) may appear in \(\phi _1\) (or \(\phi _2\) ):
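\[
\frac{{\Theta }\vdash{\Theta _1}\equiv{\Theta _2} \qquad {\Theta , \overline{\Theta _1}}\vdash{\phi _1}\equiv{\phi _2}}{{\Theta }\vdash{\Theta _1, \phi _1}\equiv{\Theta _2, \phi _2}}
\]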
(Note that it is equivalent to use \(\overline{\Theta _2}\) rather than \(\overline{\Theta _1}\) in the second premise above.)
All equivalence judgments satisfy reflexivity (appendix Lemmas B.75 and B.76), symmetry (appendix Lemmas B.91 and B.92), and transitivity (appendix Lemma B.92).

4.4 Extraction

The judgment \({\Theta }\vdash{A}\rightsquigarrow^{\pm}{A^{\prime }}[{\Theta _A}]\) (Figure 7) extracts quantified variables and \(\wedge\) and \(\mathrel {\supset }\) propositions from the type A, outputting the type \(A^{\prime }\) without these, and the context \(\Theta _A\) with them. We call \(A^{\prime }\) and \(\Theta _A\) the type and context extracted from A. For negative A, everything is extracted up to an upshift. For positive A, everything is extracted up to any connective that is not \(\exists\) , \(\wedge\) , or \(\times\) . For convenience in program typing (Section 4.7), \({\Theta }\vdash{A}\rightsquigarrow\!\!\!\!\!\!/\) abbreviates \({\Theta }\vdash{A}\rightsquigarrow{A}[{\cdot}]\) (we sometimes omit the polarity label from extraction judgments). If \({\Theta}\vdash{A}\rightsquigarrow\!\!\!\!\!\!/\) , then we say A is simple.
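For instance, \({a:\mathbb {N}}\vdash{1 \times (1 \wedge a = 3)}\rightsquigarrow^{+}{1 \times 1}[{a = 3}]\) , extracting the proposition from under the product, and \({\cdot }\vdash{\forall {a:\mathbb {N}}.\: (a \ge 1) \mathrel {\supset } {\uparrow \!{\mathsf {Nat}(a)}}}\rightsquigarrow^{-}{{\uparrow \!{\mathsf {Nat}(a)}}}[{a:\mathbb {N}, a \ge 1}]\) , extracting the quantifier and guard down to the upshift.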

4.5 Subtyping

Declarative subtyping \({\Theta }\vdash{A}\leq^{\pm}{B}\) is defined in Figure 8.
Fig. 8.
Fig. 8. Declarative subtyping.
Subtyping is polarized into mutually recursive positive \({\Theta }\vdash{P}\leq^{+}{Q}\) and negative \({\Theta }\vdash{N}\leq^{-}{M}\) relations. The design of inference rules for subtyping is guided by sequent calculi, perhaps most clearly seen in the left and right rules pertaining to quantifiers ( \(\exists\) , \(\forall\) ), asserting types ( \(\wedge\) ), and guarded types ( \(\mathrel {\supset }\) ). This is helpful to establish key properties such as reflexivity and transitivity (viewing subtyping as a sequent system, we might instead say that the structural identity and cut rules, respectively, are admissible). We interpret types as sets with some additional structure (Section 5), but considering only the sets, we prove that a subtype denotes a subset of the set denoted by any of its supertypes. That is, membership of a (semantic) value in the subtype implies its membership in any supertype of the subtype. We may also view subtyping as implication.
Instead of \(\leq^{\color{#a600a6}{+}}\rightsquigarrow{\text{L}}\) , one might reasonably expect these two rules (the brackets around the rule names indicate that these rules are not in our system):
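\[
\frac{{\Theta , \phi }\vdash{P}\leq^{+}{Q}}{{\Theta }\vdash{P \wedge \phi }\leq^{+}{Q}}\;[{\leq^{+}\wedge\mathrm{L}}]
\qquad
\frac{{\Theta , a:\tau }\vdash{P}\leq^{+}{Q}}{{\Theta }\vdash{\exists {a:\tau }.\: P}\leq^{+}{Q}}\;[{\leq^{+}\exists\mathrm{L}}]
\]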
Similarly, one might expect to have [ \({\leq^{\color{\blue}{-}}\supset{\text{R}} }\) ] and [ \({\leq^{\color{\blue}{-}}\forall{\text{R}} }\) ], dual to the above rules, instead of the dual rule \(\leq^{\color{\blue}{-}}\rightsquigarrow{\text{R}}\) . Reading, for example, the above rule [ \({\leq^{\color{#a600a6}{+}}\wedge\; {\text{L}} }\) ] logically and top-down, if \(\Theta\) and \(\phi\) implies that P implies Q, then we can infer that \(\Theta\) implies that P and \(\phi\) implies Q. We can also read rules as a bottom-up decision procedure: Given \(P \wedge \phi\) , we know \(\phi\) , so we can assume it; given \(\exists {a:\tau }.\: P\) , we know there exists an index of sort \(\tau\) such that P, but we do not have a specific index term. However, these rules are not powerful enough to derive reasonable judgments such as \({a:\mathbb {N}}\vdash{1 \times (1 \wedge a=3)}\leq^{+}{(1 \wedge a \ge 3) \times 1}\) : Subtyping for the first component requires verifying \(a \ge 3\) , which is impossible under no logical assumptions. But from a logical perspective, \(1 \times (1 \wedge a=3)\) implies \(a \ge 3\) . Reading \(\leq^{\color{#a600a6}{+}}\rightsquigarrow{\text{L}}\) bottom-up, in this case, we extract \(a=3\) from the subtype, which we later use to verify that \(a \ge 3\) . The idea is that, for a type in an assumptive position, it does not matter which product component (products are viewed conjunctively) or function argument (in our system, functions must be fully applied to values) to which index data is attached. Moreover, as we will explain at the end of Section 6, the weaker rules by themselves are incompatible with algorithmic completeness. We emphasize that we do not include [ \({\leq^{\color{blue}{+}}\wedge\; {\text{L}} }\) ], [ \(\leq^{\color{#a600a6}{+}}\exists{\text{L}}\) ], [ \({{\leq^{\color{blue}{-}}\supset{\text{R}} } }\) ], or [ \({\leq^{\color{blue}{-}}\forall{\text{R}} }\) ] in the system.
For the unit type and the void type, rules \(\leq^{\color{#a600a6}{+}}{\text{1}}\) and void \(\leq^{\color{#a600a6}{+}}{\text{0}}\) are simply reflexivity. Product subtyping \(\leq^{\color{#a600a6}{+}}\times\) is covariant subtyping of component types: A product type is a subtype of another if each component of the former is a subtype of the respective component of the latter. We have covariant shift rules \(\leq^{\color{#a600a6}{+}}\downarrow\) and \(\leq^{\color{blue}{-}}\uparrow\) . Function subtyping \(\leq^{\color{blue}{-}}\rightarrow\) is standard: contravariant (from conclusion to premise, the subtyping direction flips) in the function type’s domain and covariant in the function type’s codomain.
Rule \(\leq^{\color{#a600a6}{+}}\wedge{\text{R}}\) and its dual rule \({\leq^{\color{blue}{-}}\supset{\text{L}} }\) verify the validity of the attached proposition. In rule \(\leq^{\color{#a600a6}{+}}\exists{\text{R}}\) and its dual rule \({\leq^{\color{blue}{-}}\forall{\text{L}} }\) , we assume that we can conjure a suitable index term t; in practice (that is, algorithmically), we must introduce an existential variable \(\hat{a}\) and then solve it.
Rule \(\leq^{\color{#a600a6}{+}}{\text{+}}\) says a sum is a subtype of another sum if their respective subparts are (judgmentally) equivalent. Judgmental equivalence does not use judgmental extraction. The logical reading of subtyping begins to clarify why we do not extract anything under a sum connective: \((1 \wedge \mathsf {ff}) + 1\) does not imply \(\mathsf {ff}\) . However, using equivalence here is a conservative restriction: For example, \((1 \wedge \mathsf {ff}) + (1 \wedge \mathsf {ff})\) does imply \(\mathsf {ff}\) . Regardless, we do not expect this to be very restrictive in practice, because programmers tend not to work with sum types themselves, but rather algebraic inductive types (like \(\mu F\) ) and do not need to directly compare, via subtyping, (the unrolling of) different such types (such as the type of lists and the type of natural numbers).
In rule \(\leq^{\color{#a600a6}{+}}{\mu}\) , just as in the refined inductive type equivalence rule (Section 4.3), a refined inductive type is a subtype of another type if they have judgmentally equivalent functors, they use syntactically the same algebra (that agrees with both subtype and supertype functors), and the index terms on the right-hand side of their measurements are equal under the logical context. As we discuss in Section 9, adding polymorphism to the language (future work) might necessitate replacing type and functor equivalence in subtyping with subtyping and “subfunctoring.”
In the appendix, we prove that subtyping is reflexive (Lemma B.77) and transitive (Lemma B.83).
Subtyping and type equivalence. We prove that type equivalence implies subtyping (appendix Lemma B.96). To prove that, we use the fact that if \(\Theta _1\) is logically equivalent to \(\Theta _2\) under their prefix context \(\Theta\) (judgment \({\Theta }\vdash{\Theta_1}\equiv{\Theta _2}\) ), then we can replace \(\Theta _1\) with \(\Theta _2\) (and vice versa) in derivations (appendix Lemma B.95). We use appendix Lemma B.96 to prove subsumption admissibility (Section 4.8) and a subtyping constraint verification transport lemma (mentioned in Section 7.2). Conversely, mutual subtyping does not imply type equivalence: \({\vdash} 1 \wedge \mathsf {tt}\le 1\) and \({\vdash} 1 \le 1 \wedge \mathsf {tt}\) but \({\vdash} 1 \not\equiv 1 \wedge \mathsf {tt}\) because the unit type is structurally distinct from an asserting type.

4.6 Unrolling

Given \(a:\mathbb {N}\) , in our system, the type \(\textsf {List} \; P \; a\) of a-length lists of elements of type P is defined as \({\left\lbrace {\nu : \mu \textsf {ListF}_P} \;\middle |\; {(\mathsf {fold}_{\textsf {ListF}_P}\;{\textsf {lenalg}})\,{\nu } = a}\right\rbrace },\) where \(\textsf {ListF}_P = I \oplus (\underline{P} \otimes \mathrm{Id}\otimes I)\) and \(\textsf {lenalg} = {\tt inj}_{1}\,\texttt{()} \mathbin {\Rightarrow } 0 \parallel {\tt inj}_{2}\,(\top , (b, \texttt{()})) \mathbin {\Rightarrow } 1 + b\) . Assuming we have \(\textsf {succ} : \forall {a:\mathbb {N}}.\: \textsf {Nat}(a) \rightarrow {\uparrow \!{\textsf {Nat}(1 + a)}}\) for incrementing a (program-level) natural number by one, we can define length in our system roughly as follows (compressing successive matches on pair patterns for readability):
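\[
\begin{array}{l}
{\tt rec}\;\mathit{len} : \big(\forall {a:\mathbb {N}}.\: \textsf {List}\;P\;a \rightarrow {\uparrow \!{\textsf {Nat}(a)}}\big).\; \lambda x.\\
\quad {\tt match}\;x\;\big\{\; {\tt into}(x^{\prime }) \Rightarrow {\tt match}\;x^{\prime }\;\{\\
\qquad {\tt inj}_1\,\left\langle \right\rangle \Rightarrow {\tt return}\,{\tt into}({\tt inj}_1\,\left\langle \right\rangle)\\
\qquad {\tt inj}_2\,\langle \mathit{hd}, \langle \mathit{tl}, \left\langle \right\rangle \rangle \rangle \Rightarrow {\tt let}\;n\,\texttt {=}\,\mathit{len}(\mathit{tl}){\tt ;}\;{\tt let}\;m\,\texttt {=}\,\textsf {succ}(n){\tt ;}\;{\tt return}\,m\;\}\;\big\}
\end{array}
\]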
Checking length against its type annotation, the lambda rule assumes \(x : \textsf {List}(P)(a)\) for an arbitrary \(a:\mathbb {N}\) . Upon matching x against the pattern \({\tt into}({x^{\prime }})\) , we know \(x^{\prime }\) should have the unrolled type of \(\textsf {List}(P)(a)\) . Ignoring refinements, we know that the erasure of this unrolling should be a sum type where the left component represents the empty list and the right component represents a head element together with a tail list. However, to verify the refinement that length does what we intend, we need to know more about the length index associated with x—that is, a—in the case where x is nil and in the case where x is a cons cell. Namely, the unrolling of \(\textsf {List}(P)(a)\) should know that \(a = 0\) when x is the empty list, and that \(a = 1 + a^{\prime }\) where \(a^{\prime }\) is the length of the tail of x when x is a nonempty list. This is the role of the unrolling judgment, to output just what we need here:
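\[
\big(1 \wedge (a = 0)\big) \;+\; \Big(P \times \big(\exists {a^{\prime }:\mathbb {N}}.\: (\textsf {List}\;P\;a^{\prime }) \times (1 \wedge (a = 1 + a^{\prime }))\big)\Big)
\]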
That is, the type of P-lists of length a unrolls to either the unit type 1 (representing the empty list) together with the fact that a is 0, or the product of P (the type of the head element) and P-lists (representing the tail) of length \(a^{\prime }\) such that \(a^{\prime }\) is a minus one.
Refined inductive type unrolling \({\Xi };{\Theta }\vdash \{{\nu : G[\mu F]} \;|\; {\beta }\,({G\;(\mathsf {fold}_{F}\;{\alpha })\;\nu }) =_{\tau } t\} \doteq P\) , inspired by work in fibrational dependent type theory [Atkey et al. 2012], is defined in Figure 9. There are two contexts: \(\Xi\) is for \(\beta\) and \(\Theta\) is for G, F, and t. Similarly to algebra well-formedness, we maintain the invariant in unrolling that \(\Xi \subseteq \Theta\) . The (non-contextual) input metavariables are G, F, \(\beta\) , \(\alpha\) , \(\tau\) , and t. The type P, called the unrolled type, is an output. As in the list example above, inductive type unrolling is always initiated with \(\Xi = \cdot\) and \(G = F\) and \(\beta = \alpha\) .
Fig. 9.
Fig. 9. Unrolling.
\(\text{Unroll}\oplus\) unrolls each branch and then sums the resulting types. \(\text{UnrollId}\) outputs the product of the original inductive type but with a measurement given by the recursive result of the fold (over which we existentially quantify), together with the rest of the unrolling. \(\text{Unroll}\exists\) pushes the packed index variable a onto the context and continues unrolling, existentially quantifying over the result; in the conclusion, a is simultaneously bound in Q and \(t^{\prime }\) . \(\text{UnrollConst}\) outputs a product of the type of the constant functor and the rest of the unrolling. \(\text{Unroll}I\) simply outputs the unit type together with the index term equality given by the (unrolled) measurement.
If our functor and algebra grammars were instead more direct, like those implicitly used in the introduction (Section 1) and overview (Section 2), and explicitly discussed in Section 4.0.1, then we would have to modify the unrolling judgment, and it would need two more rules. We expect everything would still work, but we prefer having to consider fewer rules when proving metatheory.
Unrolling, equivalence, and subtyping. Substituting judgmentally equivalent types, functors and indexes for the inputs of unrolling generates an output type that is both a subtype and supertype of the original unrolling output:
Lemma 4.1 (Unroll to Mutual Subtypes).
(Lemma B.97 in appendix)
If \({\Xi };{\Theta }\vdash \{{\nu : G[\mu F]} \;|\; {\beta }\,({G\;(\mathsf {fold}_{F}\;{\alpha })\;\nu }) =_{\tau } t\} \doteq P\)
and \({\Theta }\vdash{G}\equiv{G^{\prime }}\) and \({\Theta }\vdash{F}\equiv{F^{\prime }}\) and \({\Theta } {\vdash} {t=t^{\prime }} \;\mathsf {true}\) ,
then there exists Q such that \({\Xi };{\Theta }\vdash \{{\nu : G^{\prime }[\mu F^{\prime }]} \;|\; {\beta }\,({G^{\prime }\;(\mathsf {fold}_{F^{\prime }}\;{\alpha })\;\nu }) =_{\tau } t^{\prime }\} \doteq Q\)
and \({\Theta }\vdash{P}\leq^{+}{Q}\) and \({\Theta }\vdash{Q}\leq^{+}{P}\) .
We use this to prove subsumption admissibility (see Section 4.8) for the cases that involve constructing and pattern matching inductive values.

4.7 Typing

Declarative bidirectional typing rules are given in Figures 10–12. By careful design, guided by logical principles, all typing rules are syntax-directed. That is, when deriving a conclusion, at most one rule is compatible with the syntax of the input program term and the principal input type.
To manage the interaction between subtyping and program typing, types in a well-formed (under \(\Theta\) ) program context \(\Gamma\) must be invariant under extraction: For all \((x:P) \in \Gamma\) , we have \({\Theta }\vdash{P}\rightsquigarrow^{+}{P}[{\cdot }]\) (that is, \({\Theta }\vdash{P}\rightsquigarrow\!\!\!\!\!\!/\) ). We maintain this invariant in program typing by extracting before adding any variable typings to the context.
Fig. 10.
Fig. 10. Declarative head and bound expression type synthesis.
Fig. 11.
Fig. 11. Declarative value and expression type checking.
Fig. 12.
Fig. 12. Declarative pattern matching and spine typing.
The judgment \({\Theta }; {\Gamma } {\vdash} {h} {\color{red}{\Rightarrow}} {P}\) (Figure 10) synthesizes the type P from the head h. This judgment is synthesizing, because it is used in what are, from a Curry–Howard perspective, kinds of cut rules: \({\text{Decl}}{\color{red}{\Rightarrow}}{\text{App}}\) and \({\text{Decl}}{\color{blue}{\Leftarrow}}{\texttt{match}}\) , discussed later. The synthesized type is the cut type, which does not appear in the conclusion of \({\text{Decl}}{\color{red}{\Rightarrow}}{\text{App}}\) or \({\text{Decl}}{\color{blue}{\Leftarrow}}{\texttt{match}}\) . For head variables, we look up the variable’s type in the context \(\Gamma\) ( \({\text{Decl}}{\color{red}{\Rightarrow}}{\text{Var}}\) ). For annotated values, we synthesize the annotation ( \({\text{Decl}}{\color{red}{\Rightarrow}}{\text{ValAnnot}}\) ).
The judgment \({\Theta }; {\Gamma } {\vdash} {g} {\color{red}{\Rightarrow}} {{\uparrow \!{P}}}\) (Figure 10) synthesizes the type \({\uparrow \!{P}}\) from the bound expression g. Similarly to the synthesizing judgment for heads, this judgment is synthesizing because it is used in a cut rule \({\text{Decl}}{\color{blue}{\leftarrow}}{\texttt{let}}\) (the synthesized type is again the cut type). Bound expressions only synthesize an upshift because of their (lone) role in rule \({\text{Decl}}{\color{blue}{\leftarrow}}{\texttt{let}}\) , discussed later. For an application of a head to a spine ( \({\text{Decl}}{\color{red}{\Rightarrow}}{\text{App}}\) , an auxiliary cut rule), we first synthesize the head’s type (which must be a downshift) and then check the spine against the thunked computation type, synthesizing the latter’s return type. (Function applications must always be fully applied, but we can simulate partial application via \(\eta\) -expansion. For example, given \(x:P_1\) and \(h {\color{red}{\Rightarrow}} {\downarrow \!{(P_1 \rightarrow P_2 \rightarrow {\uparrow \!{Q}})}}\) , to partially apply h to x, we can write \(\lambda y.\,{\tt let}\;{z}\,\texttt {=}\,{h(x,y)}{\tt ;}\;{\cdots }\) .) For annotated expressions, we synthesize the annotation, which must be an upshift. If the programmer wants, say, to verify guard constraints in N of an expression e of type N whenever it is run, then they must annotate it: \(\texttt {(}{{\tt return}\,{\left\lbrace {e}\right\rbrace }} : {{\uparrow \!{{\downarrow \!{N}}}}}\texttt {)}\) . If an e of type N is intended to be a function to be applied (as a head to a spine; \({\text{Decl}}{\color{red}{\Rightarrow}}{\text{App}}\) ) only if the guards of N can be verified and the universally quantified indexes of N can be instantiated, then the programmer must thunk and annotate it: \(\texttt {(}{\left\lbrace {e}\right\rbrace } : {{\downarrow \!{N}}}\texttt {)}\) . The two annotation rules have explicit type well-formedness premises to emphasize that type annotations are provided by the programmer.
The judgment \({\Theta }; {\Gamma } {\vdash} {v} {\color{blue}{\Leftarrow}} {P}\) (Figure 11) checks the value v against the type P. From a Curry–Howard perspective, this judgment corresponds to a right-focusing stage. According to rule \({\text{Decl}}{\color{blue}{\Leftarrow}}{\exists}\) , a value checks against an existential type if there is an index instantiation it checks against (declaratively, an index is conjured, but algorithmically we will have to solve for one). For example, as discussed in Section 4.2, checking the program value \(\mathsf {one}\) representing 1 against type \(\exists {a:\mathbb {N}}.\: \mathsf {Nat}(a)\) solves a to an index semantically equal to 1. According to rule \({\text{Decl}}{\color{blue}{\Leftarrow}}{\wedge}\) , a value checks against an asserting type if the asserted proposition \(\phi\) holds (and the value checks against the type to which \(\phi\) is connected). Instead of a general value type subsumption rule like
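\[
\frac{{\Theta }; {\Gamma } {\vdash} {v} {\color{blue}{\Leftarrow}} {P} \qquad {\Theta }\vdash{P}\leq^{+}{Q}}{{\Theta }; {\Gamma } {\vdash} {v} {\color{blue}{\Leftarrow}} {Q}}
\]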
we restrict subsumption to (value) variables and prove that subsumption is admissible (see Section 4.8). This is easier to implement efficiently, because the type checker would otherwise have to guess Q (and possibly need to backtrack), whereas \({\text{Decl}}{\color{blue}{\Leftarrow}}{\text{Var}}\) need only look up the variable. Further, the \(P \ne \exists , \wedge\) constraint on \({\text{Decl}}{\color{blue}{\Leftarrow}}{\text{Var}}\) means that any top-level \(\exists\) or \(\wedge\) constraints must be verified before subtyping, eliminating nondeterminism of verifying these in subtyping or typing. Rule \({\text{Decl}}{\color{blue}{\Leftarrow}}{{\mu}}\) checks the unrolled value against the unrolled inductive type. Rule \({\text{Decl}}{\color{blue}{\Leftarrow}}{\text{1}}\) says \(\left\langle \right\rangle\) checks against 1. Rule \({\text{Decl}}{\color{blue}{\Leftarrow}}{{\times}}\) says a pair checks against a product if each pair component checks against its corresponding factor. Rule \({\text{Decl}}{\color{blue}{\Leftarrow}}{{+}}_{k}\) says a value injected into the kth position checks against a sum if it can be checked against the kth summand. Rule \({\text{Decl}}{\color{blue}{\Leftarrow}}{{\downarrow}}\) checks the thunked expression against the computation type N under the given thunk type \(\downarrow {N}\) .
The judgment \({\Theta }; {\Gamma } {\vdash} {e} {\color{blue}{\Leftarrow}} {N}\) (Figure 11) checks the expression e against the type N. From a Curry–Howard perspective, this judgment is a right-inversion stage with stable moments ( \({\text{Decl}}{\color{blue}{\leftarrow}}{\texttt{let}}\) and \({\text{Decl}}{\color{blue}{\Leftarrow}}{\texttt{match}}\) , which enter left- or right-focusing stages, respectively). Instead of \({\text{Decl}}{\color{blue}{\Leftarrow}}{{\rightsquigarrow}}\) , one might expect two rules (one for \(\forall\) and one for \(\mathrel {\supset }\) ) that simply put the universal index variable or proposition into logical context, but these alone are less compatible with subsumption admissibility (see Section 4.8) due to the use of extraction in subtyping rule \(\leq^{\color{\blue}{-}}\rightsquigarrow{\text{R}}\) . However, the idea is still the same: Here, we are using indexes, not verifying them as in the dual left-focusing stage. To reduce \({\text{Decl}}{\color{blue}{\Leftarrow}}{{\rightsquigarrow}}\) nondeterminism, and to enable a formal correspondence between our system and (a variant of) CBPV (which has a general \(\downarrow\) elimination rule), the other (expression) rules must check against a simple type. In practice, we eagerly apply (if possible) \({\text{Decl}}{\color{blue}{\Leftarrow}}{{\rightsquigarrow}}\) immediately when type checking an expression; extracted types are invariant under extraction.
All applications \(h(s)\) must be named and sequenced via \({\text{Decl}}{\color{blue}{\leftarrow}}{\texttt{let}}\) , which we may think of as monadic binding, and is a key cut rule. Other computations—annotated returner expressions \(\texttt {(}{e} : {{\uparrow \!{P}}}\texttt {)}\) —must also be named and sequenced via \({\text{Decl}}{\color{blue}{\leftarrow}}{\texttt{let}}\) . It would not make sense to allow arbitrary negative annotations, because that would require verifying constraints and instantiating indexes that should only be done when the annotated expression is applied, which does not occur in \({\text{Decl}}{\color{blue}{\leftarrow}}{\texttt{let}}\) itself.
Heads, that is, head variables and annotated values, can be pattern matched via \({\text{Decl}}{\color{blue}{\Leftarrow}}{\texttt{match}}\) . From a Curry–Howard perspective, the rule \({\text{Decl}}{\color{blue}{\Leftarrow}}{\texttt{match}}\) is a cut rule dual to the cut rule \({\text{Decl}}{\color{blue}{\leftarrow}}{\texttt{let}}\) : The latter binds the result of a computation to a (sequenced) computation, whereas the former binds the deconstruction of a value to, and directs control flow of, a computation. Rule \({\text{Decl}}{\color{blue}{\Leftarrow}}{\lambda}\) is standard (besides the check that \(P \rightarrow N\) is simple). Rule \({\text{Decl}}{\color{blue}{\Leftarrow}}{\texttt{rec}}\) requires an annotation that universally quantifies over the argument a that must be smaller at each recursive call, as dictated by its annotation in the last premise: \(x : {\downarrow \!{\big (\forall {a^{\prime }:\mathbb {N}}.\: (a^{\prime } \lt a) \mathrel {\supset }[a^{\prime }/a]M\big)}}\) only allows x to be used for \(a^{\prime } \lt a\) , ensuring that refined recursive functions are well-founded (according to \(\lt\) on naturals). Rule \({\text{Decl}}{\color{blue}{\Leftarrow}}{\uparrow}\) checks that the value being returned has the positive type under the given returner type ( \(\uparrow\) ); this may be thought of as a monadic return operation. Rule \({\text{Decl}}{\color{blue}{\Leftarrow}}{\text{Unreachable}}\) says that unreachable checks against any type, provided the logical context is inconsistent; for example, an impossible pattern in pattern matching extracts to an inconsistent context.
Rule \({\text{Decl}}{\color{blue}{\Leftarrow}}{\texttt{rec}}\) only handles one termination metric, namely, \(\lt\) on natural numbers. This is only to simplify our presentation and is not a fundamental limitation of the system. We can, for example, add a rule that encodes a termination metric \(\lt\) on the sum of two natural numbers, roughly of the following form:
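\[
\frac{\begin{array}{c}
N = \forall {a_1:\mathbb {N}}.\: \forall {a_2:\mathbb {N}}.\: M \\
{\Theta , a_1:\mathbb {N}, a_2:\mathbb {N}}; {\Gamma , x : {\downarrow \!{\big (\forall {a_1^{\prime }:\mathbb {N}}.\: \forall {a_2^{\prime }:\mathbb {N}}.\: (a_1^{\prime } + a_2^{\prime } \lt a_1 + a_2) \mathrel {\supset }[a_1^{\prime }/a_1][a_2^{\prime }/a_2]M\big)}}} {\vdash} {e} {\color{blue}{\Leftarrow}} {M}
\end{array}}{{\Theta }; {\Gamma } {\vdash} {{\tt rec}\;{x:N}.\;{e}} {\color{blue}{\Leftarrow}} {N}}
\]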
It is somewhat straightforward to update the metatheory for the system with this rule added. This rule obviates, for example, the ghost parameter used in the mergesort example of Section 3. Similarly, one could add rules for other termination metrics, such as lexicographic induction.
The judgment \({\Theta }; {\Gamma }; [{P}] {\vdash} {\lbrace {{r}_{i}} \mathbin {\Rightarrow }{{e}_{i}}\rbrace _{{i} \in {I}}} {\color{blue}{\Leftarrow}} {N}\) (Figure 12) decomposes P, according to patterns \(r_i\) (if \(P \ne \wedge\) or \(\exists\) , which have no computational content; if \(P = \wedge\) or \(\exists\) , then the index is put in logical context for use) and checks that each branch \(e_i\) has type N. The rules are straightforward. Indexes from matching on existential and asserting types are used, not verified (as in value typechecking); we deconstruct heads, and to synthesize a type for a head, its indexes must hold, so within the pattern-matching stage itself, we may assume and use them. From a Curry–Howard perspective, this judgment corresponds to a left-inversion stage. However, it is not strongly focused, that is, it does not decompose P eagerly and as far as possible; therefore, “stage” might be slightly misleading. If our system were more strongly focused, then we would have nested patterns, at least for all positive types except inductive types; it is unclear how strong focusing on inductive types would work.
The judgment \({\Theta }; {\Gamma }; [{N}] {\vdash} {s} \gg {{\uparrow \!{P}}}\) (Figure 12) checks the spine s against N, synthesizing the return type \({\uparrow \!{P}}\) . From a Curry–Howard perspective, this judgment corresponds to a left-focusing stage. The rules are straightforward: Decompose the given N, checking index constraints ( \(\text{DeclSpine}\forall\) and \(\text{DeclSpine}\supset\) ) and values ( \(\text{DeclSpineApp}\) ) until an upshift, the return type, is synthesized ( \(\text{DeclSpineNil}\) ). Similarly to dual rule \({\text{Decl}}{\color{blue}{\Leftarrow}}{\exists}\) , the declarative rule \(\text{DeclSpine} \forall\) conjures an index measuring a value, but in this case an argument value in a spine. For example, in applying a head of type \(\forall {a:\mathbb {N}}.\: \mathsf {Nat}(a) \rightarrow {\uparrow \!{\mathsf {Nat}(a)}}\) to the spine with program value \(\mathsf {one}\) representing 1, we must instantiate a to an index semantically equal to 1; we show how this works algorithmically in Section 6.4. All universal quantifiers (in the input type of a spine judgment) are solvable algorithmically, because in a well-formed return type, the set of value-determined indexes \(\Xi\) is empty.

4.8 Substitution

A key correctness result that we prove is a substitution lemma: Substitution (of index terms for index variables and program values for program variables) preserves typing. We now extend the index-level syntactic substitutions (and the sequential substitution operation) introduced in Section 4.1. A syntactic substitution \(\sigma ::= \cdot \mid \sigma , t/a \mid \sigma , {v}:{P}/{x}\) is essentially a list of terms to be substituted for variables. Substitution application \([\sigma ]-\) is a sequential substitution metaoperation on types and terms. On program terms, it is a kind of hereditary substitution [Watkins et al. 2004; Pfenning 2008] in the sense that, at head variables (note the \(\mathsf {h}\) superscript in the Figure 13 definition; we elide \(\mathsf {h}\) when clear from context), an annotation is produced if the value and the head variable being replaced by it are not equal—thereby modifying the syntax tree of the substitutee. Otherwise, substitution is standard (homomorphic application) and does not use the value’s associated type given in \(\sigma\) : see Figure 13.
Fig. 13.
Fig. 13. Definition of syntactic substitution on program terms.
In the definition given in Figure 13, an annotation is not produced if \(v = x\), so \({x}:{P}/{x}\) is always an identity substitution: that is, \([{x}:{P}/{x}]^{\mathsf {h}}x = x\) . As usual, we assume variables are \(\alpha\) -renamed to avoid capture by substitution.
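For illustration, the interesting (head-variable) case of this operation can be sketched in Haskell over a deliberately minimal stand-in AST (not our actual definitions):

data Ty   = TyUnit                      deriving (Eq, Show)  -- stand-in for positive types
data Val  = VVar String | VUnit         deriving (Eq, Show)  -- stand-in for values
data Head = HVar String | AnnVal Val Ty deriving (Eq, Show)

-- Substitute value v of type p for head variable x in a head.
substHead :: Val -> Ty -> String -> Head -> Head
substHead v p x (HVar y)
  | y /= x      = HVar y        -- other head variables are untouched
  | v == VVar x = HVar x        -- [x:P/x] is the identity substitution
  | otherwise   = AnnVal v p    -- modify the tree: annotate the substituted value
substHead _ _ _ h = h           -- annotated values: substitute homomorphically (elided)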
The judgment \(\Theta _0; \Gamma _0 {\vdash} \sigma : \Theta ; \Gamma\) (appendix Figure 9) means that, under \(\Theta _0\) and \(\Gamma _0\) , we know \(\sigma\) is a substitution of index terms and program values for variables in \(\Theta\) and \(\Gamma\) , respectively. The key rule of this judgment is for program value entries (the three elided rules are similar to the three rules for syntactic substitution typing at the index level, shown in Section 4.1, but add program contexts \(\Gamma\) where appropriate):
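\[
\frac{\Theta _0; \Gamma _0 {\vdash} \sigma : \Theta ; \Gamma \qquad {\Theta _0}; {\Gamma _0} {\vdash} {[\sigma ]v} {\color{blue}{\Leftarrow}} {[\lfloor {\sigma }\rfloor ]P}}{\Theta _0; \Gamma _0 {\vdash} (\sigma , {v}:{P}/{x}) : (\Theta ; \Gamma , x:P)}
\]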
We apply the rest of the syntactic substitution—that is, the \(\sigma\) in the rule—to v and P because the substitution operation is sequential; v may mention variables in \(\Gamma\) and \(\Theta\) , and P may mention variables in \(\Theta\) . The metaoperation \(\lfloor {-}\rfloor\) filters out program variable entries (program variables cannot appear in types, functors, algebras, or indexes).
That substitution respects typing is an important correctness property of the type system. We state only two parts here, but those of the remaining program typing judgments are similar; all six parts are mutually recursive.
Lemma 4.2 (Syntactic Substitution).
(Lemma B.107 in appendix)
Assume \(\Theta _0; \Gamma _0 {\vdash} \sigma : \Theta ; \Gamma\) .
(1)
If \({\Theta }; {\Gamma } {\vdash} {h} {\color{red}{\Rightarrow}} {P}\) , then there exists Q such that \({\Theta_0}\vdash{Q}\leq^{+}{[\lfloor {\sigma }\rfloor ]P}\) and \({\Theta _0}; {\Gamma _0} {\vdash} {[\sigma ]h} {\color{red}{\Rightarrow}} {Q}\) .
(2)
If \({\Theta }; {\Gamma } {\vdash} {e} {\color{blue}{\Leftarrow}} {N}\) , then \({\Theta _0}; {\Gamma _0} {\vdash} {[\sigma ]e} {\color{blue}{\Leftarrow}} {[\lfloor {\sigma }\rfloor ]N}\) .
In part (1), if substitution replaces a head variable with a value of stronger type, then the stronger type Q is synthesized. The proof relies on other structural properties such as weakening. It also relies on subsumption admissibility, which captures what we mean by “stronger type.” We show only one part; the mutually recursive parts for the other five program typing judgments are similar.
Lemma 4.3 (Subsumption Admissibility).
(Lemma B.106 in appendix)
Assume \({\Theta }\vdash{\Gamma ^{\prime }}\leq{\Gamma }\) (pointwise subtyping).
(1)
If \({\Theta }; {\Gamma } {\vdash} {v} {\color{blue}{\Leftarrow}} {P}\) and \({\Theta }\vdash{P}\leq^{+}{Q}\) , then \({\Theta }; {\Gamma ^{\prime }} {\vdash} {v} {\color{blue}{\Leftarrow}} {Q}\) .
Subtypes are stronger than supertypes. That is, if we can check a value against a type, then we know that it also checks against any of the type’s supertypes; similarly for expressions. Pattern matching is similar, but it also says we can match on a stronger type. A head or bound expression can synthesize a stronger type under a stronger context. Similarly, with a stronger input type, a spine can synthesize a stronger return type.

5 Type Soundness

We prove type (and substitution) soundness of the declarative system with respect to an elementary domain-theoretic denotational semantics. Refined type soundness implies the refined system’s totality and logical consistency.
Refinement type systems refine already-given type systems, and the soundness of the former depends on that of the latter [Melliès and Zeilberger 2015]. Thus, the semantics of our refined system is defined in terms of that of its underlying, unrefined system, which we discuss in Section 5.1.
Notation: We define the disjoint union \(X \uplus Y\) of sets X and Y by \(X \uplus Y = (\lbrace 1\rbrace \times X) \cup (\lbrace 2\rbrace \times Y)\) and define \({\mathit {inj}}_{k} : X_k \rightarrow X_1 \uplus X_2\) by \({\mathit {inj}}_{k}(d) = (k, d)\) . Semantic values are usually named d, f, g, or V.

5.1 Unrefined System

For space reasons, we do not fully present the unrefined system and its semantics here (see appendix Section A.4). The unrefined system is basically just the refined system with everything pertaining to indexes erased. The program terms of the unrefined system have almost the same syntax as those of the refined system, but an unrefined, recursive expression has no type annotation, and we replace the expression unreachable by diverge, which stands for an inexhaustive pattern-matching error. The unrefined system satisfies a substitution lemma (appendix Lemma C.1) similar to that of the refined system, but its proof is simpler and does not rely on subsumption admissibility, because the unrefined system has no subtyping.
In CBPV, nontermination is regarded as an effect, so value and computation types denote different kinds of mathematical things: predomains and domains, respectively [Levy 2004], which are both sets with some structure. Because we have recursive expressions, we must model nontermination, an effect. We use elementary domain theory. For our (unrefined) system, we interpret (unrefined) positive types as predomains and (unrefined) negative types as domains. The only effect we consider in this article is nontermination (though we simulate inexhaustive pattern-matching errors with it); we take (chain-)complete partial orders (cpo) as predomains and pointed (chain-)complete partial orders (cppo) as domains.
Positive types and functors. The grammar for unrefined positive types is similar to that for refined positive types, but lacks asserting and existential types, and unrefined inductive types \(\mu F\) are not refined by predicates. Unrefined inductive types use the unrefined functor grammar, which is the same as the refined functor grammar but uses unrefined types in constant functors.
The denotations of unrefined positive types are standard. We briefly describe their partial orders, then describe the meaning of functors, and, last, return to the meaning of inductive types (which involve functors).
We give (the denotation of) 1 (denoting the distinguished terminal object \(\lbrace \bullet \rbrace\) ) the discrete order \(\lbrace (\bullet , \bullet)\rbrace\) . For \(P \times Q\) (denoting product), we use component-wise order ( \({(d_1, d_2)}\sqsubseteq _{D_1 \times D_2}{(d_1^{\prime }, d_2^{\prime })}\) if \({d_1}\sqsubseteq _{D_1}{d_1^{\prime }}\) and \({d_2}\sqsubseteq _{D_2}{d_2^{\prime }}\) ), for 0 (denoting the initial object), we use the empty order, and for \(P + Q\) (denoting coproduct, that is, disjoint union \(\uplus\) ), we use injection-wise order ( \({{\mathit {inj}}_{j} d}\sqsubseteq _{D_1 \uplus D_2}{{\mathit {inj}}_{k} d^{\prime }}\) if \(j = k\) and \({d}\sqsubseteq _{D_j}{d^{\prime }}\) ). We give \({\downarrow \!{N}}\) the order of N, that is, \(\downarrow\) denotes the forgetful functor from the category \(\mathbf {Cppo}\) of cppos and continuous functions to the category \(\mathbf {Cpo}\) of cpos and continuous functions. Finally, \({V_1}\sqsubseteq _{[\![ {\mu F}]\!] _{}}{V_2}\) if \({V_1}\sqsubseteq _{[\![ {F}]\!] _{}^{k+1} \emptyset }{V_2}\) for some \(k \in \mathbb {N}\) , inheriting the type denotation orders as the functor is applied.
The denotations of unrefined functors are standard \(\mathbf {Cpo}\) endofunctors. We briefly describe them here, but full definitions are in appendix Section A.4. The sum functor \(\oplus\) denotes a functor that sends a cpo to the disjoint union \(\uplus\) of its component applications (with usual injection-wise order), and its functorial action is injection-wise. The product functor \(\otimes\) denotes a functor that sends a cpo to the product \(\times\) of its component applications (with usual component-wise order), and its functorial action is component-wise. The unit functor I denotes a functor sending any cpo to \(1 = {\lbrace \bullet \rbrace }\) (discrete order), and its functorial action sends all morphisms to \({\mathit {id}}_{\lbrace \bullet \rbrace }\) . The constant (type) functor \(\underline{P}\) denotes a functor sending any cpo to the cpo \([\![ {P}]\!] _{}\) , and its functorial action sends all morphisms to the identity \({\mathit {id}}_{[\![ {P}]\!] _{}}\) on \([\![ {P}]\!] _{}\) . The identity functor \(\mathrm{Id}\) denotes the identity endofunctor on \(\mathbf {Cpo}\) . (Forgetting the order structure, functors also denote endofunctors on the category \(\mathbf {Set}\) of sets and functions.)
We now explain the denotational semantics of our inductive types. Semantically, we build an inductive type (such as \([\![ {\mathsf {List} \; A}]\!] _{}\) ) by repeatedly applying (the denotation of) its functor specification (such as \([\![ {\textsf {ListF}_A}]\!] _{}\) ) to the initial object \([\![ {0}]\!] _{} = \emptyset\) . For example,
where \(1 = {\lbrace \bullet \rbrace }\) (using the relatively direct functors with more complicated unrolling, discussed in Section 4.0.1). We denote the nil list \([]\) by \({\mathit {inj}}_{1} \bullet\) , a list \(x :: []\) with one term x by \({\mathit {inj}}_{2} ([\![ {x}]\!] _{}, {\mathit {inj}}_{1} \bullet)\) , and so on. In general, given a (polynomial) \(\mathbf {Set}\) (category of sets and functions) endofunctor F (which, for this article, will always be the denotation of a well-formed (syntactic) functor, refined or otherwise), we define \(\mu F = \cup _{k\in \mathbb {N}} F^k \emptyset\) . We then define \([\![ {\mu F}]\!] _{} = \mu [\![ {F}]\!] _{}\) . In our system, for every well-formed (unrefined) functor F, the set \(\mu [\![ {F}]\!] _{}\) is a fixed point of \([\![ {F}]\!] _{}\) (appendix Lemma C.7): that is, \([\![ {F}]\!] _{} (\mu [\![ {F}]\!] _{}) = \mu [\![ {F}]\!] _{}\) (and similarly for refined functors: appendix Lemma D.11).
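As a concrete illustration of the construction \(\mu F = \cup _{k\in \mathbb {N}} F^k \emptyset\) , the following Haskell sketch (an assumed encoding, not the paper’s semantics) enumerates the finite approximations for the list functor \(1 \oplus (\underline{A} \otimes \mathrm{Id})\) :

```haskell
-- Elements of mu(ListF a), built as rolled-up sums of products.
newtype MuList a = In (Either () (a, MuList a))

-- approx as k enumerates F^k(empty) over element set 'as':
-- approx as 0 is empty, and approx as (k+1) applies the functor once more.
approx :: [a] -> Int -> [MuList a]
approx _  0 = []
approx as k =
  map In (Left () : [ Right (a, v) | a <- as, v <- approx as (k - 1) ])

-- The union over all k is the denotation of List A; for instance,
-- length (approx [True, False] 3) == 7 (one nil, plus 2 * 3 longer lists).
```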
Negative types. The grammar for unrefined negative types has unrefined function types \(P \rightarrow N\) and unrefined upshifts \({\uparrow \!{P}}\) , with no guarded or universal types. Unrefined negative types denote cppos.
Function types \(P \rightarrow N\) denote continuous functions from \([\![ {P}]\!] _{}\) to \([\![ {N}]\!] _{}\) (which we sometimes write as \([\![ {P}]\!] _{} \Rightarrow [\![ {N}]\!] _{}\) ), whose order is defined pointwise, together with the bottom element (the “point” of “pointed cpo”) \(\bot _{[\![ {P \rightarrow N}]\!] _{}}\) that maps every \(V \in [\![ {P}]\!] _{}\) to the bottom element \(\bot _{[\![ {N}]\!] _{}}\) of \([\![ {N}]\!] _{}\) . For our purposes, this is equivalent to lifting \([\![ {P}]\!] _{} \in \mathbf {Cpo}\) to \(\mathbf {Cppo}\) and denoting arrow types by strict ( \(\bot\) goes to \(\bot\) ) continuous functions, so function types denote \(\mathbf {Cppo}\) exponentials.
Upshifts \({\uparrow \!{P}}\) denote \([\![ {P}]\!] _{} \uplus \lbrace \bot _{\uparrow }\rbrace\) (that is, \(\uparrow\) denotes the lift functor from \(\mathbf {Cpo}\) to \(\mathbf {Cppo}\) ) with the lift order
and bottom element \(\bot _{[\![ {{\uparrow \!{P}}}]\!] _{}} = {\mathit {inj}}_{2} \bot _{\uparrow }\) . We could write, say, \(\bullet\) rather than \(\bot _{\uparrow }\) , but we think the latter is clearer in associating it with the bottom element of upshifts; we could also write \(\bot\) rather than \(\bot _{\uparrow }\) , but we often elide the “ \([\![ {A}]\!] _{}\) ” subscript in \(\bot _{[\![ {A}]\!] _{}}\) when it is clear from context.
Appendix Figure 27 has the full definition of (unrefined) type and functor denotations.
Well-typed program terms. We write \(\Gamma {\vdash} \mathcal {O} \cdots A\) and \(\Gamma ; [B] {\vdash} \mathcal {O} \cdots A\) to stand for all six unrefined program typing judgments: \({\Gamma } {\vdash} {h} {\color{red}{\Rightarrow}} {P}\) and \({\Gamma } {\vdash} {g} {\color{red}{\Rightarrow}} {{\uparrow \!{P}}}\) and \({\Gamma } {\vdash} {v} {\color{blue}{\Leftarrow}} {P}\) and \({\Gamma } {\vdash} {e} {\color{blue}{\Leftarrow}} {N}\) and \({\Gamma }; [{P}] {\vdash} {\lbrace {{r}_{i}} \mathbin {\Rightarrow }{{e}_{i}}\rbrace _{{i} \in {I}}} {\color{blue}{\Leftarrow}} {N}\) and \({\Gamma }; [{N}] {\vdash} {s} \gg {{\uparrow \!{P}}}\) .
The denotational semantics of well-typed, unrefined program terms of judgmental form \(\Gamma {\vdash} \mathcal {O} \cdots A\) or \(\Gamma ; [B] {\vdash} \mathcal {O} \cdots A\) are continuous functions \([\![ {\Gamma }]\!] _{} \rightarrow [\![ {A}]\!] _{}\) and \([\![ {\Gamma }]\!] _{} \rightarrow [\![ {B}]\!] _{} \rightarrow [\![ {A}]\!] _{}\) , respectively, where \([\![ {\Gamma }]\!] _{}\) is the set of all semantic substitutions \({\vdash} \delta : \Gamma\) together with component-wise order. Similarly to function type denotations, the bottom element of a \([\![ {\Gamma }]\!] _{} \rightarrow [\![ {N}]\!] _{}\) sends every \(\delta \in [\![ {\Gamma }]\!] _{}\) to \(\bot _{[\![ {N}]\!] _{}}\) (equivalently for our purposes, we can lift source predomains and consider strict continuous functions). We only interpret typing derivations, but we often only mention the program term in semantic brackets \([\![ {-}]\!] _{}\) . For example, if \(\Gamma {\vdash} x {\color{red}{\Rightarrow}} P\) , then \([\![ {x}]\!] _{} = (\delta \in [\![ {\Gamma }]\!] _{}) \mapsto \delta (x)\) . We write the application of the denotation \([\![ {E}]\!] _{}\) of a program term E (typed under \(\Gamma\) ) to a semantic substitution \(\delta \in [\![ {\Gamma }]\!] _{}\) as \([\![ {E}]\!] _{\delta }\) . We only mention a few of the more interesting cases of the definition of typing denotations; for the full definition, see appendix Figures 28, 29, and 30. If \({\Gamma }; [{N}] {\vdash} {v, s} \gg {M}\) , then
Returner expressions denote monadic returns:
Let-binding denotes monadic binding:
A recursive expression denotes a fixed point obtained by taking the least upper bound ( \(\sqcup\) ) of all its successive approximations:
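A minimal Haskell sketch of this least-fixed-point construction (illustration only; the formal semantics is the domain-theoretic one above):

```haskell
-- Kleene's construction: the least fixed point of a continuous f is the
-- lub of the chain bottom, f bottom, f (f bottom), ...; in Haskell,
-- ordinary general recursion computes exactly this least fixed point.
fix :: (a -> a) -> a
fix f = f (fix f)

-- The successive approximations of the chain, with 'undefined' as bottom.
approximations :: (a -> a) -> [a]
approximations f = iterate f undefined

-- Example: the factorial functional; fix fac 5 == 120.
fac :: (Integer -> Integer) -> (Integer -> Integer)
fac rec n = if n == 0 then 1 else n * rec (n - 1)
```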
In the unrefined system, we include diverge, to which unreachable erases (that is, \(| {\tt unreachable} | = {\tt diverge}\) ). We intend diverge to stand for an undefined body of a pattern-matching clause, but we interpret this error as divergence to simplify the semantics:
The point is that the refined system prevents the error.
We will say more about the semantics of folds in Section 5.2, but note that the action of rolling and unrolling syntactic values is essentially denoted by \(d \mapsto d\) :
This works due to the fact that unrolling is sound (roughly, the denotations of each side of “ \(\circeq\) ” in the unrolling judgment are equal) and the fact that \([\![ {F}]\!] _{} (\mu [\![ {F}]\!] _{}) = \mu [\![ {F}]\!] _{}\) (and similarly for the refined system).
Unrefined soundness. Our proofs of (appendix) Lemma C.28 (Unrefined Type Soundness) and (appendix) Lemma C.30 (Unrefined Substitution Soundness) use standard techniques in domain theory [Gunter 1993].
Unrefined type soundness says that a term typed A under \(\Gamma\) denotes a continuous function \([\![ {\Gamma }]\!] _{} \rightarrow [\![ {A}]\!] _{}\) . We state 3 of the 6 parts here, as two mutually recursive lemmas:
Lemma 5.1 (Continuous Maps).
(Lemma C.27 in appendix)
Suppose \({\vdash} \delta _1 : \Gamma _1\) and \({\vdash} \delta _2 : \Gamma _2\) and \({} {\vdash} {\Gamma _1, y:Q, \Gamma _2}\; \mathsf {ctx}\) .
(1)
If \({\Gamma _1, y:Q, \Gamma _2} {\vdash} {h} {\color{red}{\Rightarrow}} {P}\) , then the function \([\![ {Q}]\!] _{} \rightarrow [\![ {P}]\!] _{}\) defined by \(d \mapsto [\![ {h}]\!] _{\delta _1, d/y, \delta _2}\) is continuous.
(2)
If \({\Gamma _1, y:Q, \Gamma _2} {\vdash} {e} {\color{blue}{\Leftarrow}} {N}\) , then the function \([\![ {Q}]\!] _{} \rightarrow [\![ {N}]\!] _{}\) defined by \(d \mapsto [\![ {e}]\!] _{\delta _1, d/y, \delta _2}\) is continuous.
(3)
If \({\Gamma _1, y:Q, \Gamma _2}; [{N}] {\vdash} {s} \gg {{\uparrow \!{P}}}\) , then the function \([\![ {Q}]\!] _{} \rightarrow [\![ {N}]\!] _{} \rightarrow [\![ {{\uparrow \!{P}}}]\!] _{}\) defined by \(d \mapsto [\![ {s}]\!] _{\delta _1, d/y, \delta _2}\) is continuous.
Lemma 5.2 (Unrefined Type Soundness).
(Lemma C.28 in appendix)
Assume \({\vdash} \delta : \Gamma\) .
(1)
If \({\Gamma } {\vdash} {h} {\color{red}{\Rightarrow}} {P}\) , then \([\![ {{\Gamma } {\vdash} {h} {\color{red}{\Rightarrow}} {P}}]\!] _{\delta } \in [\![ {P}]\!] _{}\) .
(2)
If \({\Gamma } {\vdash} {e} {\color{blue}{\Leftarrow}} {N}\) , then \([\![ {{\Gamma } {\vdash} {e} {\color{blue}{\Leftarrow}} {N}}]\!] _{\delta } \in [\![ {N}]\!] _{}\) .
(3)
If \({\Gamma }; [{N}] {\vdash} {s} \gg {{\uparrow \!{P}}}\) , then \([\![ {{\Gamma }; [{N}] {\vdash} {s} \gg {{\uparrow \!{P}}}}]\!] _{\delta } \in [\![ {N}]\!] _{} \Rightarrow [\![ {{\uparrow \!{P}}}]\!] _{}\) .
The proof of unrefined type soundness is standard and uses the well-known fact that a continuous function in \(\mathbf {Cppo}\) has a least fixed point. Among other things, we also use the fact that \(\mu [\![ {F}]\!] _{}\) is a fixed point of \([\![ {F}]\!] _{}\) (appendix Lemma C.7). We also use the soundness of unrefined unrolling, which we did not mention here because it is similar to refined unrolling and its soundness, discussed in the next section.
We interpret an unrefined syntactic substitution (typing derivation) \(\Gamma _0 {\vdash} \sigma : \Gamma\) as a continuous function \([\![ {\Gamma _0}]\!] _{} \rightarrow [\![ {\Gamma }]\!] _{}\) that takes a \(\delta \in [\![ {\Gamma _0}]\!] _{}\) and uses \(\delta\) to interpret each of the entries in \(\sigma\) (remembering to apply the rest of the syntactic substitution, because substitution is defined sequentially):
Similarly to typing derivations, we only consider denotations of typing derivations \(\Gamma _0 {\vdash} \sigma : \Gamma\) of substitutions, but often simply write \([\![ {\sigma }]\!] _{}\) .
Unrefined substitution soundness says that semantic and syntactic substitution commute: If E is a program term typed under \(\Gamma\) and \(\Gamma _0 {\vdash} \sigma : \Gamma\) is a substitution, then \([\![ {[\sigma ]E}]\!] _{} = [\![ {E}]\!] _{} \circ [\![ {\sigma }]\!] _{}\) . We show 1 of the 6 parts here, as stated in the appendix:
Lemma 5.3 (Unrefined Substitution Soundness).
(Lemma C.30 in appendix)
Assume \(\Gamma _0 {\vdash} \sigma : \Gamma\) and \({\vdash} \delta : \Gamma _0\) .
(1)
If \({\Gamma } {\vdash} {e} {\color{blue}{\Leftarrow}} {N}\) , then \([\![ {{\Gamma _0} {\vdash} {[\sigma ]e} {\color{blue}{\Leftarrow}} {N}}]\!] _{\delta } = [\![ {{\Gamma } {\vdash} {e} {\color{blue}{\Leftarrow}} {N}}]\!] _{[\![ {\sigma }]\!] _{\delta }}\) .
We use unrefined type/substitution soundness to prove refined type/substitution soundness, discussed next.

5.2 Refined System

Indexes. For any sort \(\tau\) , we give its denotation \([\![ {\tau }]\!] _{}\) the discrete order \(\sqsubseteq _{[\![ {\tau }]\!] _{}} \; = {\left\lbrace {(d,d)} \;\middle |\; {d\in [\![ {\tau }]\!] _{}}\right\rbrace }\) , making it a cpo.
Semantic Substitution. We introduced semantic substitutions \(\delta\) (at the index level) when discussing propositional validity (Section 4.1). Here, they are extended to semantic program values:
where \(\lfloor {{-}}\rfloor\) filters out program entries. Notation: We define \([\![ {\Theta ; \Gamma }]\!] _{} = {\left\lbrace {\delta } \;\middle |\; {{\vdash} \delta : \Theta ; \Gamma }\right\rbrace }\) .
Erasure. The erasure metaoperation \(| {-} |\) (appendix Section A.5) erases all indexes from (refined) types, program terms (which can have type annotations, but those do not affect program meaning), and syntactic and semantic substitutions. For example, \(| {\left\lbrace {\nu : \mu F} \;\middle |\; {(\mathsf {fold}_{F}\;{\alpha })\,{\nu } =_\tau t}\right\rbrace } | = \mu | F |\) and \(| \forall {a:\tau }.\: N | = | N |\) and \(| P \times Q | = | P | \times | Q |\) and so on.
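The following Haskell sketch (hypothetical constructors mirroring a fragment of the grammar) illustrates how erasure discards index information:

```haskell
data Prop = PTrue                         -- propositions (stub)
data Sort = SNat                          -- sorts (stub)
data RFunctor = RId | RConst RType        -- refined functors (fragment)
data UFunctor = UId | UConst UType        -- unrefined functors (fragment)

-- Refined positive types (fragment) and their unrefined erasures.
data RType = RUnit | RProd RType RType | RSum RType RType
           | RAsserting RType Prop        -- P and phi
           | RExists String Sort RType    -- exists a:tau. P
           | RMu RFunctor                 -- inductive type (fold predicate elided here)
data UType = UUnit | UProd UType UType | USum UType UType | UMu UFunctor

erase :: RType -> UType
erase RUnit            = UUnit
erase (RProd p q)      = UProd (erase p) (erase q)
erase (RSum p q)       = USum (erase p) (erase q)
erase (RAsserting p _) = erase p          -- asserted proposition dropped
erase (RExists _ _ p)  = erase p          -- quantifier dropped
erase (RMu f)          = UMu (eraseF f)   -- refinement predicate dropped

eraseF :: RFunctor -> UFunctor
eraseF RId        = UId
eraseF (RConst p) = UConst (erase p)
```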
We use many facts about erasure to prove refined type/substitution soundness (appendix lemmas):
Refined types denote subsets of what their erasures denote: Lemma C.31 (Type Subset of Erasure). Similarly for refined functors and refined inductive types: Lemmas C.32 (Functor Application Subset of Erasure) and C.33 (Mu Subset of Erasure).
The erasure of both types appearing in extraction, equivalence, and subtyping judgments results in equal (unrefined) types: Lemma C.36 (Extraction Erases to Equality), Lemma C.37 (Equivalence Erases to Equality), and Lemma C.38 (Subtyping Erases to Equality).
Refined unrolling and typing are sound with respect to their erasure: Lemma C.39 (Erasure Respects Unrolling), Lemma C.40 (Erasure Respects Typing), and Lemma C.42 (Erasure Respects Substitution Typing).
Erasure commutes with syntactic and semantic substitution: Lemma C.41 (Erasure Respects Substitution) and Lemma C.43 (Erasure Respects Semantic Substitution).
Types, functors, algebras, and folds. The denotations of refined types and functors are defined as logical subsets of the denotations of their erasures (together with their erasure denotations themselves). They are defined mutually with the denotations of well-formed algebras.
In appendix Figure 36, we inductively define the denotations of well-formed types \({\Theta } {\vdash} {A}\; \mathsf {type} {[{\_}}]\) . We briefly discuss a few of the cases. The meaning of an asserting type is the set of refined values such that the asserted index proposition holds (read \({\lbrace \bullet \rbrace }\) as true and \(\emptyset\) as false):
Existential and universal types denote elements of their erasure such that the relevant index quantification holds:
Guarded types denote elements of their erasure such that they are also in the refined type being guarded if the guard holds ( \({\lbrace \bullet \rbrace }\) means true):
The denotation of refined function types \([\![ {P \rightarrow N}]\!] _{\delta }\) is not the set \([\![ {P}]\!] _{\delta } \Rightarrow [\![ {N}]\!] _{\delta }\) of (continuous) functions from refined P-values to refined N-values; if it were, then type soundness would break:
which is not in \((\emptyset \Rightarrow {\lbrace \bullet \rbrace }\uplus \lbrace \bot _{\uparrow }\rbrace)\) . Instead, the meaning of a refined function type is a set (resembling a unary logical relation)
of unrefined (continuous) functions that take refined values to refined values. The meaning of refined upshifts enforces termination (if refined type soundness holds, and we will see it does):
Note that divergence \({\mathit {inj}}_{2} \bot _{\uparrow }\) is not in the set \([\![ {{\uparrow \!{P}}}]\!] _{\delta }\) .
In appendix Figure 37, we inductively define the denotations of well-formed refined functors F and algebras \(\alpha\) . The main difference between refined and unrefined functors is that in refined functors, constant functors produce subsets of their erasure. All functors, refined or otherwise, also (forgetting the partial order structure) denote endofunctors on the category of sets and functions. As with our unrefined functors, our refined functors denote functors with a fixed point (appendix Lemma D.11): \([\![ {F}]\!] _{\delta } (\mu [\![ {F}]\!] _{\delta }) = \mu [\![ {F}]\!] _{\delta }\) . Moreover, \(\mu [\![ {F}]\!] _{\delta }\) satisfies a recursion principle such that we can inductively define measures on \(\mu [\![ {F}]\!] _{\delta }\) via \([\![ {F}]\!] _{\delta }\) -algebras (discussed next).
Categorically, given an endofunctor F, an F-algebra is an evaluator map \(\alpha : F(\tau) \rightarrow \tau\) for some carrier set \(\tau\) . We may think of this in terms of elementary algebra: We form algebraic expressions with F and evaluate them with \(\alpha\) . A morphism f from algebra \(\alpha : F(\tau) \rightarrow \tau\) to algebra \(\beta : F(\tau ^{\prime }) \rightarrow \tau ^{\prime }\) is a morphism \(f : \tau \rightarrow \tau ^{\prime }\) such that \(f \circ \alpha = \beta \circ (F(f))\) . If an endofunctor F has an initial algebra \({\mathit {into}}: F (\mu F) \rightarrow \mu F\) , then it has a recursion principle. By the recursion principle for \(\mu F\) , we can define a recursive function from \(\mu F\) to \(\tau\) by folding \(\mu F\) with an F-algebra \(\alpha : F(\tau) \rightarrow \tau\) like so:
where \(\mathit {out} : \mu F \rightarrow F(\mu F)\) , which by Lambek’s lemma exists and is inverse to \(\mathit {into}\) , embeds (semantic) inductive values into the unrolling of the (semantic) inductive type (we usually elide \(\mathit {fmap}\) ). Conveniently, in our system and semantics, \(\mathit {out}\) is always \(d \mapsto d\) , and we almost never explicitly mention it. Syntactic values v in our system must be rolled into inductive types— \({\tt into}({v})\) —and this is also how (syntactic) inductive values are pattern-matched (“applying \(\mathit {out}\) ” to \({\tt into}({v})\) ), but \({\tt into}({-})\) conveniently denotes \(d \mapsto d\) .
We specify inductive types abstractly as sums of products so they denote polynomial endofunctors more directly. Polynomial endofunctors always have a “least” (initial) fixed point, and hence specify inductive types, which have a recursion principle. For example, we specify (modulo the unrolling simplification) \(\mathsf {len} : \mathsf {ListF}_A (\mathbb {N}) \Rightarrow \mathbb {N}\) (Section 1) by the (syntactic) algebra
which denotes the (semantic) algebra
defined by \([\![ {\alpha }]\!] _{} = [\bullet \mapsto 0, (a, n) \mapsto 1+n]\) . By initiality (the recursion principle), there is a unique function
such that \(\mathit {fold}_{[\![ {\textsf {ListF}_A}]\!] _{}}\;{[\![ {\alpha }]\!] _{}} = [\![ {\alpha }]\!] _{} \circ ([\![ {F}]\!] _{} (\mathit {fold}_{[\![ {\textsf {ListF}_A}]\!] _{}}\;{[\![ {\alpha }]\!] _{}}))\) , which semantically captures \(\mathsf {len}\) (Section 1).
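The following Haskell sketch (our names, for illustration) renders \(\textsf {ListF}_A\) , the fold of an algebra over its least fixed point, and the \(\mathsf {len}\) algebra \([\bullet \mapsto 0, (a, n) \mapsto 1+n]\) :

```haskell
-- The list base functor 1 + (A x X) and its least fixed point.
data ListF a x = NilF | ConsF a x
newtype Mu f   = Into (f (Mu f))

instance Functor (ListF a) where
  fmap _ NilF        = NilF
  fmap g (ConsF a x) = ConsF a (g x)

-- fold alg = alg . fmap (fold alg) . out, where out (Into d) = d.
fold :: Functor f => (f t -> t) -> Mu f -> t
fold alg (Into d) = alg (fmap (fold alg) d)

-- The len algebra [nil |-> 0, (a, n) |-> 1 + n] and the induced measure.
lenAlg :: ListF a Integer -> Integer
lenAlg NilF        = 0
lenAlg (ConsF _ n) = 1 + n

len :: Mu (ListF a) -> Integer
len = fold lenAlg   -- e.g. len (Into (ConsF 'x' (Into NilF))) == 1
```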
In our system, a refined inductive type is written \({\left\lbrace {\nu : \mu F} \;\middle |\; {(\mathsf {fold}_{F}\;{\alpha })\,{\nu } =_\tau t}\right\rbrace }\) , which looks quite similar to its own semantics:
The type \(\textsf {List}(A)(n)\) of A-lists having length \(n : \mathbb {N}\) , for example, is defined in our system as:
Syntactic types, functors, and algebras in our system look very similar to their own semantics.
A well-typed algebra \({\Xi }; {\Theta } {\vdash} {\alpha } : {F}({\tau }) \Rightarrow {\tau }\) denotes a dependent function \(\prod _{\delta \in [\![ {\Theta }]\!] _{}} [\![ {F}]\!] _{\delta }[\![ {\tau }]\!] _{} \rightarrow [\![ {\tau }]\!] _{}\) . The definition (appendix Figure 37) is mostly standard, but the unit and pack cases could use some explanation. Because \(\Theta\) is for F and \(\Xi \ (\subseteq \Theta)\) is for \(\alpha\) , we restrict \(\delta\) to \(\Xi\) at algebra bodies:
The most interesting part of the definition concerns index packing:
The pack clause lets us bind the witness d of \(\tau ^{\prime }\) in the existential type \(\exists {a:\tau ^{\prime }}.\:{Q}\) to a in the body t of the algebra. We know d exists, since \(V_1 \in [\![ {\exists {a:\tau ^{\prime }}.\:{Q}}]\!] _{\delta }\) , but it is not immediate that d is unique. However, we prove d is uniquely determined by \(V_1\) ; we call this property the soundness of value-determined indexes (all parts are mutually recursive):
Lemma 5.4 (Soundness of Value-Determined Indexes).
(Lemma D.14 in appendix)
Assume \({\vdash} \delta _1 : \Theta\) and \({\vdash} \delta _2 : \Theta\) .
(1)
If \({\Theta } {\vdash} {P}\; \mathsf {type} {[{\Xi }}]\) and \(V \in [\![ {P}]\!] _{\delta _1}\) and \(V \in [\![ {P}]\!] _{\delta _2}\) , then \(\delta _1\mathord {\upharpoonright }_\Xi = \delta _2\mathord {\upharpoonright }_\Xi\) .
(2)
If \({\Theta } {\vdash} {\mathcal {F}}\; \mathsf {functor} {[{\Xi }]}\) and \(X_1, X_2 \in \mathbf {Set}\) and \(V \in [\![ {\mathcal {F}}]\!] _{\delta _1} X_1\) and \(V \in [\![ {\mathcal {F}}]\!] _{\delta _2} X_2\) , then \(\delta _1\mathord {\upharpoonright }_\Xi = \delta _2\mathord {\upharpoonright }_\Xi\) .
(3)
If \({\Xi }; {\Theta } {\vdash} {\alpha } : {F}({\tau }) \Rightarrow {\tau }\) and \(\Xi \subseteq \Theta\) and \(\delta _1\mathord {\upharpoonright }_\Xi = \delta _2\mathord {\upharpoonright }_\Xi\) , then \([\![ {\alpha }]\!] _{\delta _1} = [\![ {\alpha }]\!] _{\delta _2}\) on \([\![ {F}]\!] _{\delta _1} [\![ {\tau }]\!] _{} \cap [\![ {F}]\!] _{\delta _2} [\![ {\tau }]\!] _{}\) .
Therefore, the \(\Xi\) in type and functor well-formedness really does track index variables that are uniquely determined by values, semantically speaking.
Well-typed program terms. Appendix Figure 38 specifies the denotations of well-typed refined program terms in terms of the denotations of their erasure. The denotation of a refined program term E typed under \((\Theta ; \Gamma)\) , at refined semantic substitution \(\delta \in [\![ {\Theta ; \Gamma }]\!] _{}\) , is the denotation \([\![ { | E | }]\!] _{ | \delta | }\) of the (derivation of the) term’s erasure \(| E |\) at the erased substitution \(| \delta |\) . For example,
Unrolling. We prove (appendix Lemma D.15) that unrolling is sound:
Lemma 5.5 (Unrolling Soundness).
(Lemma D.15 in appendix)
Assume \({\vdash} \delta : \Theta\) and \(\Xi \subseteq \Theta\) . If \({\Xi };{\Theta }\vdash\{{\nu :G[\mu F]}\;|\;{\beta }({G\;(\mathsf {fold}_{F}\;{\alpha })\;\nu )}=_{\tau }t\}\doteq P\) ,
then \({\left\lbrace {V \in [\![ {G}]\!] _{\delta } (\mu [\![ {F}]\!] _{\delta })} \;\middle |\; {[\![ {\beta }]\!] _{\delta } ([\![ {G}]\!] _{\delta }(\mathit {fold}_{[\![ {F}]\!] _{\delta }}\;{[\![ {\alpha }]\!] _{\delta }})\;V) = [\![ {t}]\!] _{\delta }}\right\rbrace } = [\![ {P}]\!] _{\delta }\) .
Due to our definition of algebra denotations (specifically, for the pack pattern), we use the soundness of value-determined indexes in the pack case of the proof.
Subtyping. We prove (appendix Lemma D.19) that subtyping is sound:
Lemma 5.6 (Soundness of Subtyping).
(Lemma D.19 in appendix)
Assume \({\vdash} \delta : \Theta\) . If \({\Theta }\vdash{A}\leq^{\pm}{B}\) , then \([\![ {A}]\!] _{\delta } \subseteq [\![ {B}]\!] _{\delta }\) .
Type soundness. Denotational-semantic type soundness says that if a program term has type A under \(\Theta\) and \(\Gamma\) , then the mathematical meaning of that program term at any interpretation of (that is, semantic environment for) \(\Theta\) and \(\Gamma\) is an element of the mathematical meaning of A at that interpretation, that is, the program term denotes a dependent function \(\prod _{\delta \in [\![ {\Theta ; \Gamma }]\!] _{}} [\![ {A}]\!] _{\lfloor {\delta }\rfloor }\) . This more or less corresponds to proving (operational) type soundness with respect to a big-step operational semantics. Refined types pick out subsets of values of unrefined types. Therefore, by type soundness, if a program has a refined type, then we have learned something more about that program than the unrefined system can verify for us.
Theorem 5.7 (Type Soundness).
(Theorem D.25 in appendix)
Assume \({\vdash} \delta : \Theta ; \Gamma\) . Then:
(1)
If \({\Theta }; {\Gamma } {\vdash} {h} {\color{red}{\Rightarrow}} {P}\) , then \([\![ {h}]\!] _{\delta } \in [\![ {P}]\!] _{\lfloor {\delta }\rfloor }\) .
(2)
If \({\Theta }; {\Gamma } {\vdash} {g} {\color{red}{\Rightarrow}} {N}\) , then \([\![ {g}]\!] _{\delta } \in [\![ {N}]\!] _{\lfloor {\delta }\rfloor }\) .
(3)
If \({\Theta }; {\Gamma } {\vdash} {v} {\color{blue}{\Leftarrow}} {P}\) , then \([\![ {v}]\!] _{\delta } \in [\![ {P}]\!] _{\lfloor {\delta }\rfloor }\) .
(4)
If \({\Theta }; {\Gamma } {\vdash} {e} {\color{blue}{\Leftarrow}} {N}\) , then \([\![ {e}]\!] _{\delta } \in [\![ {N}]\!] _{\lfloor {\delta }\rfloor }\) .
(5)
If \({\Theta }; {\Gamma }; [{P}] {\vdash} {\lbrace {{r}_{i}} \mathbin {\Rightarrow }{{e}_{i}}\rbrace _{{i} \in {I}}} {\color{blue}{\Leftarrow}} {N}\) , then \([\![ {\lbrace {{r}_{i}} \mathbin {\Rightarrow }{{e}_{i}}\rbrace _{{i} \in {I}}}]\!] _{\delta } \in [\![ {P}]\!] _{\lfloor {\delta }\rfloor } \Rightarrow [\![ {N}]\!] _{\lfloor {\delta }\rfloor }\) .
(6)
If \({\Theta }; {\Gamma }; [{N}] {\vdash} {s} \gg {{\uparrow \!{P}}}\) , then \([\![ {s}]\!] _{\delta } \in [\![ {N}]\!] _{\lfloor {\delta }\rfloor } \Rightarrow [\![ {{\uparrow \!{P}}}]\!] _{\lfloor {\delta }\rfloor }\) .
(All parts are mutually recursive.) The proof (appendix Theorem D.25) uses the soundness of unrolling and subtyping, and is mostly straightforward. The hardest case is the one for recursive expressions in part (4), where we use an upward closure lemma (in particular, part (3) below) to show that the fixed point is in the appropriately refined set:
Lemma 5.8 (Upward Closure).
(Lemma D.22 in appendix)
Assume \({\vdash} \delta : \Theta\) .
(1)
If \({\Xi }; {\Theta } {\vdash} {\alpha } : {F}({\tau }) \Rightarrow {\tau }\) and \(\Xi \subseteq \Theta\) then \([\![ {\alpha }]\!] _{\delta }\) is monotone.
(2)
If \({\Theta } {\vdash} {\mathcal {G}}\; \mathsf {functor} {[{\_}]}\) and \({\Theta } {\vdash} {F}\; \mathsf {functor} {[{\_}]}\) and \(k \in \mathbb {N}\)
and \(V \in [\![ {\mathcal {G}}]\!] _{\delta } ([\![ {F}]\!] _{\delta }^k \emptyset)\) and \({V}\sqsubseteq _{[\![ { | \mathcal {G} | }]\!] _{} ([\![ { | F | }]\!] _{}^k \emptyset)}{V^{\prime }}\) ,
then \(V^{\prime } \in [\![ {\mathcal {G}}]\!] _{\delta } ([\![ {F}]\!] _{\delta }^k \emptyset)\) .
(3)
If \({\Theta } {\vdash} {A}\; \mathsf {type} {[{\_}}]\) and \(V \in [\![ {A}]\!] _{\delta }\) and \({V}\sqsubseteq _{[\![ { | A | }]\!] _{}}{V^{\prime }}\) , then \(V^{\prime } \in [\![ {A}]\!] _{\delta }\) .
Out of all proofs in this article, the proof of upward closure (appendix Lemma D.22) is a top contender for the most interesting induction metric:
Proof.
By lexicographic induction on, first, \({\mathit {sz}}{(A)}\) / \({\mathit {sz}}{(F)}\) (parts (1), (2), and (3), mutually), and, second, \(\langle k, \mathcal {G} \text{ structure} \rangle\) (part (2)), where \(\langle \dots \rangle\) denotes lexicographic order. □
We define the simple size function \({\mathit {sz}}{(-)}\) , which is basically a standard structural notion of size, in appendix Figure 57. This is also the only place, other than unrolling soundness, where we use the soundness of value-determined indexes (again for a pack case, in part (1)).
Substitution soundness. We interpret a syntactic substitution (typing derivation) \(\Theta _0; \Gamma _0 {\vdash} \sigma : \Theta ; \Gamma\) as a function \([\![ {\sigma }]\!] _{} : [\![ {\Theta _0; \Gamma _0}]\!] _{} \rightarrow [\![ {\Theta ; \Gamma }]\!] _{}\) on semantic substitutions (appendix Def. B.1). Similarly to the interpretation of unrefined substitution typing derivations, the interpretation of the head term being substituted (its typing/sorting subderivation) pre-applies the rest of the substitution:
Substitution soundness holds (appendix Theorem D.28): If E is a program term typed under \(\Theta\) and \(\Gamma\) , and \(\Theta _0; \Gamma _0 {\vdash} \sigma : \Theta ; \Gamma\) , then \([\![ {[\sigma ]E}]\!] _{} = [\![ {E}]\!] _{} \circ [\![ {\sigma }]\!] _{}\) . (Recall we prove a syntactic substitution lemma: Lemma 4.2.) That is, substitution and denotation commute, or (in other words) syntactic substitution and semantic substitution are compatible.
Logical consistency, total correctness, and partial correctness. Our semantic type soundness result implies that our system is logically consistent and totally correct.
A logically inconsistent type (for example, 0 or \({\uparrow \!{0}}\) or \({\uparrow \!{(1 \wedge \mathsf {ff})}}\) ) denotes the empty set, which is uninhabited.
Corollary 5.9 (Logical Consistency).
If \({\cdot }; {\cdot } {\vdash} {e} {\color{blue}{\Leftarrow}} {N}\) , then N is logically consistent, that is, \([\![ {N}]\!] _{\cdot } \ne \emptyset\) . Similarly, if \({\cdot }; {\cdot } {\vdash} {v} {\color{blue}{\Leftarrow}} {P}\) , then P is logically consistent, and so on for the other typing judgments.
Proving logical consistency syntactically, say, via progress and preservation lemmas, would require also proving that every reduction sequence eventually terminates (that is, strong normalization), which might need a relatively complicated proof using logical relations [Tait 1967].
Total correctness means that every closed computation (that is specified as total) returns a value of the specified type:
Corollary 5.10 (Total Correctness).
If \({\cdot }; {\cdot } {\vdash} {e} {\color{blue}{\Leftarrow}} {{\uparrow \!{P}}}\) , then \([\![ {e}]\!] _{\cdot } \ne \bot _{[\![ {{\uparrow \!{P}}}]\!] _{\cdot }}\) , that is, e does not diverge.
Proof.
By Theorem 5.7 (Type Soundness), \([\![ {e}]\!] _{\cdot } \in [\![ {{\uparrow \!{P}}}]\!] _{\cdot }\) , and divergence \({\mathit {inj}}_{2} \bot _{\uparrow }\) is not in the set \([\![ {{\uparrow \!{P}}}]\!] _{\cdot }\) . Therefore, \([\![ {e}]\!] _{\cdot } \ne {\mathit {inj}}_{2} \bot _{\uparrow } = \bot _{[\![ {{\uparrow \!{P}}}]\!] _{\cdot }}\) , that is, e terminates (and returns a value). □
Our system can be extended to include partiality, simply by adding a partial upshift type connective \({\upharpoonleft \!{P}}\) (“partial upshift of P”), with type well-formedness, subtyping, and type equivalence rules similar to those of \({\uparrow \!{P}}\) , and the following two expression typechecking rules: The first rule introduces the new connective \({\upharpoonleft \!{P}}\) ; the second rule lacks a termination refinement such as that in \({\text{Decl}}{\color{blue}{\Leftarrow}}{\texttt{rec}}\) , so it may yield divergence.
The meaning of the partial upshift is defined as follows:
It is straightforward to update the metatheory to prove partial correctness: If a closed computation (that is specified as partial) terminates, then it returns a value of the specified type. Partial correctness is a corollary of the updated type soundness result: If \({\cdot }; {\cdot } {\vdash} {e} {\color{blue}{\Leftarrow}} {{\upharpoonleft \!{P}}}\) and \([\![ {e}]\!] _{\cdot } \ne \bot _{[\![ {{\upharpoonleft \!{P}}}]\!] _{\cdot }}\) , then \([\![ {e}]\!] _{\cdot } = {\mathit {inj}}_{1} V\) and \(V \in [\![ {P}]\!] _{\cdot }\) .
Adding partiality introduces logical inconsistency, so we must restate logical consistency for expression typing: If \({\cdot }; {\cdot } {\vdash} {e} {\color{blue}{\Leftarrow}} {{\uparrow \!{P}}}\) , then \({\uparrow \!{P}}\) is logically consistent.

6 Algorithmic System

We design our algorithmic system in the spirit of those of Dunfield and Krishnaswami [2013; 2019], but, unlike ours, those systems do not delay constraint verification until all existentials are solved. The algorithmic rules closely mirror the declarative rules, except for a few key differences:
Whenever a declarative rule conjures an index term, the corresponding algorithmic rule adds, to a separate (input) algorithmic context \(\Delta\) , an existential variable (written with a hat: \(\hat{a}\) ) to be solved.
As the typing algorithm proceeds, we add index term solutions of the existential variables to the output algorithmic context, increasing knowledge (see Section 6.2). We eagerly apply index solutions to input types and output constraints and pop them off the output context when out of scope.
Whenever a declarative rule checks propositional validity or equivalence ( \({\Theta } {\vdash} {\phi } \;\mathsf {true}\) or \({\Theta }\vdash{\phi }\equiv{\psi }\) ), the algorithm delays checking the constraint until all existential variables in the propositions are solved (at the end of a focusing stage). Similarly, subtyping, type equivalence, and expression typechecking constraints are delayed until all existential variables are solved. When an entity has no existential variables, we say that it is ground.
In subtyping, we eagerly extract from assumptive positions immediately under polarity shifts.
Syntactically, objects in the algorithmic system are not much different from corresponding objects of the declarative system. We extend the grammar for index terms with a production of existential variables, which we write as an index variable with a hat \(\hat{a}\) , \(\hat{b}\) , or \(\hat{c}\) :
We use this (algorithmic) index grammar everywhere in the algorithmic system, using the same declarative metavariables. However, we write algorithmic logical contexts with a hat: \(\hat{\Theta }\) . Algorithmic logical contexts \(\hat{\Theta }\) only appear in output mode and are like (input) logical contexts \(\Theta\) , but propositions listed in them may have existential variables (their index variable sortings \(a:\tau\) are universal).
Constraints are added to the algorithmic system. Figure 14 gives grammars for subtyping and typing constraints. In contrast to DML, the grammar does not include existential constraints.
Fig. 14. Typing and subtyping constraints.
Checking constraints boils down to checking propositional validity, \({\Theta } {\vdash} {\phi } \;\mathsf {true}\) , which is analogous to checking verification conditions in the tradition of imperative program verification initiated by Floyd [1967] and Hoare [1969] (where programs annotated with Floyd–Hoare assertions are analyzed, generating verification conditions whose validity implies program correctness). These propositional validity constraints are the constraints that can be automatically verified by a theorem prover such as an SMT solver. The (algorithmic) W constraint verification judgment is written \({\Theta } \models {W}\) and means that W algorithmically holds under \(\Theta\) . Notice that the only context in the judgment is \(\Theta\) , which has no existential variables; this reflects the fact that we delay verifying W until W has no existential variables (in which case, we say W is ground). Similarly, \({\Theta }; {\Gamma } \lhd {\chi }\) is the (algorithmic) \(\chi\) verification judgment, meaning all of the constraints in \(\chi\) algorithmically hold under \(\Theta\) and \(\Gamma\) , and here \(\chi\) is also ground (by focusing).
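To make the hand-off to a solver concrete, the following Haskell sketch (our own helper, not part of the paper’s formalism) compiles a ground validity goal into an SMT-LIB query; validity of \(\phi\) is unsatisfiability of its negation:

```haskell
-- Validity of phi under the sorted variables of Theta, as an SMT-LIB
-- script: declare the variables, assert the negation, check for unsat.
smtValidityQuery :: [(String, String)] -> String -> String
smtValidityQuery vars phi = unlines $
     [ "(declare-const " ++ a ++ " " ++ srt ++ ")" | (a, srt) <- vars ]
  ++ [ "(assert (not " ++ phi ++ "))"
     , "(check-sat)" ]                    -- "unsat" means Theta |- phi true

-- Example: a < 5 implies a < 10 over the naturals (encoded in Int).
example :: String
example = smtValidityQuery [("a", "Int")]
            "(=> (and (>= a 0) (< a 5)) (< a 10))"
```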

6.1 Contexts and Substitution

Algorithmic contexts \(\Delta\) are lists of solved or unsolved existential variables; a context is complete, written \(\Omega\) , if all of its existential variables are solved:
We require solutions t of existential variables \(\hat{a}\) to be well-sorted under (input) logical contexts \(\Theta\) , which have no existential variables. To maintain this invariant that every solution in \(\Delta\) is ground, that is, has no existential variables, we exploit type polarity in algorithmic subtyping and prevent existential variables from ever appearing in refinement algebras.
We will often treat algorithmic contexts \(\Delta\) as substitutions of ground index terms for existential variables \(\hat{a}\) in index terms t (including propositions \(\phi\) ), types A, functors \(\mathcal {F}\) , constraints W and \(\chi\) , and output logical contexts \(\hat{\Theta }\) (whose propositions may have existential variables). The definition is straightforward: apply the context homomorphically to the object \(\mathcal {O}\) , with \([\Delta ]\mathcal {O}\) defined by induction on \(\Delta\) .
The order of substitutions in the definition of context application above does not matter, because solutions are ground (we may view \([\Delta ]\mathcal {O}\) as simultaneous substitution). If \(\mathcal {O}\) only has existential variables from \(\mathit {dom}(\Omega)\) , then \([\Omega ]\mathcal {O}\) is ground.
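A Haskell sketch of context application (hypothetical index grammar): because every solution is ground, applying the entries left to right coincides with simultaneous substitution.

```haskell
data Index = IVar String              -- universal variable a
           | EVar String              -- existential variable (written with a hat)
           | Lit Integer
           | Add Index Index
type Delta = [(String, Maybe Index)]  -- existential, with optional ground solution

-- [Delta]t: replace each solved existential by its (ground) solution.
applyCtx :: Delta -> Index -> Index
applyCtx delta t = foldl step t delta
  where step u (a, Just s)  = substE a s u
        step u (_, Nothing) = u

substE :: String -> Index -> Index -> Index
substE a s (EVar b) | a == b = s
substE _ _ (EVar b)          = EVar b
substE _ _ (IVar x)          = IVar x
substE _ _ (Lit n)           = Lit n
substE a s (Add t u)         = Add (substE a s t) (substE a s u)
```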

6.2 Context Extension

The algorithmic context extension judgment \({\Theta } {\vdash} {\Delta } \longrightarrow {\Delta ^{\prime }}\) says that \(\mathit {dom}(\Delta) = \mathit {dom}(\Delta ^{\prime })\) and \(\Delta ^{\prime }\) has the same solutions as \(\Delta\) , but possibly solves more (that are unsolved in \(\Delta\) ). All typing and subtyping judgments (under \(\Theta\) ) that have input and output algorithmic contexts \(\Delta\) and \(\Delta ^{\prime }\) (respectively) enjoy the property that they increase index information, that is, \({\Theta } {\vdash} {\Delta } \longrightarrow {\Delta ^{\prime }}\) . If \({\Theta } {\vdash} {\Delta } \longrightarrow {\Omega }\) , then \(\Omega\) completes \(\Delta\) : It has \(\Delta\) ’s solutions, but also solutions to all of \(\Delta\) ’s unsolved variables.
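A Haskell sketch of the extension check (an assumed encoding, with ground solutions rendered as strings): same variables in the same order, existing solutions preserved, and unsolved entries possibly gaining a solution.

```haskell
type Delta = [(String, Maybe String)]   -- existential with optional ground solution

-- Theta |- Delta --> Delta': same domain; every solution in Delta
-- persists in Delta'; unsolved entries of Delta may become solved.
extends :: Delta -> Delta -> Bool
extends d d' = length d == length d' && and (zipWith entry d d')
  where entry (a, Just s)  (a', s') = a == a' && s' == Just s
        entry (a, Nothing) (a', _)  = a == a'
```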

6.3 Subtyping

Algorithmic subtyping \({\Theta ; \Delta }\vdash{A}<:^{\pm}{B}/{W}\dashv{\Delta ^{\prime }}\) says that, under logical context \(\Theta\) and algorithmic context \(\Delta\) , the type A is algorithmically a subtype of B if and only if output constraint W holds algorithmically (under suitable solutions including those of \(\Delta ^{\prime }\) ), outputting index solutions \(\Delta ^{\prime }\) . In subtyping and type equivalence, the delayed output constraints W must remember their logical context via \(\mathrel {\supset }\) and \(\forall\) . For example, in checking that \(\exists {a:\mathbb {N}}.\: \textsf {Nat}(a) \wedge (a \lt 5)\) is a subtype of \(\exists {a:\mathbb {N}}.\: \textsf {Nat}(a) \wedge (a \lt 10)\) , the output constraint W is \(\forall {a:\mathbb {N}}.\: (a \lt 5) \mathrel {\supset }(a \lt 10)\) .
For space reasons, we do not present all algorithmic subtyping rules here (see appendix Figure 48), but only enough rules to discuss the key design issues. Further, we do not present algorithmic equivalence here (see appendix Figures 44 and 46), which is similar to and simpler than algorithmic subtyping.
In algorithmic subtyping, we maintain the invariant that positive subtypes and negative supertypes are ground. The rules
are the only subtyping rules that add existential variables (on the side not necessarily ground) to be solved (whereas the declarative system conjures a solution). We pop off the solution, as we have the invariant that solutions in output contexts are eagerly applied to output constraints and input types.
The rule
runs the functor equivalence algorithm (which outputs a constraint W and solutions \(\Delta _1^{\prime }, \hat{a}:\tau , \Delta _2^{\prime }\) ) and checks that \(\hat{a}\) does not get solved there. After checking that t (a subterm of a positive subtype) is ground, the rule solves \(\hat{a}\) to t (yielding \(\Delta ^{\prime }\) ). It outputs \(\Delta ^{\prime }\) and the constraint generated by functor equivalence together with the equation \(t = t\) (the declarative system can conjure a different but logically equal term for the right-hand side of this equation). Alternatively, there is a rule for when \(\hat{a}\) gets solved by functor equivalence, and a rule where a term that is not an existential variable is in place of \(\hat{a}\) .
The rule
extracts \(M^{\prime }\) and \(\hat{\Theta }\) from M and delays the resulting negative subtyping constraint \(\underline{{N}<:^{-}{M^{\prime }}}\) , to be verified under its logical setting \(\hat{\Theta }\) (whose propositions, which were extracted from the side not necessarily ground, may have existential variables only solved in value typechecking). The metaoperation \(\mathrel {\supset }^\ast\) traverses \(\hat{\Theta }\) , creating universal quantifiers from universal variables and implications from propositions:
The dual shift rule is similar. In the declarative system, \(\leq^{\color{#a600a6}{+}}\rightsquigarrow{\text{L}}\) and \(\leq^{\color{blue}{-}}\rightsquigarrow{\text{R}}\) are invertible, which means that they can be eagerly applied without getting stuck; algorithmically, we apply them immediately at polarity shifts, so the above rule corresponds to an algorithmic combination of the declarative rules \(\leq^{\color{blue}{-}}\rightsquigarrow{\text{R}}\) and \({\leq^{\color{#a600a6}{+}}\downarrow}\) (and similarly for its dual rule for \(\uparrow\) ).
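The \(\mathrel {\supset }^\ast\) metaoperation above can be sketched as follows (Haskell, with hypothetical constraint constructors): sortings in \(\hat{\Theta }\) become universal quantifiers and propositions become implications, outermost first.

```haskell
data W     = WAll String String W   -- forall a:tau. W (sort rendered as text)
           | WImpl String W         -- phi implies W (proposition rendered as text)
           | WLeaf String           -- a delayed subtyping constraint
data Entry = Sorting String String  -- a : tau (universal)
           | Assume String          -- extracted proposition phi

-- ThetaHat ⊃* W: wrap W in quantifiers/implications following ThetaHat.
wrapCtx :: [Entry] -> W -> W
wrapCtx []                  w = w
wrapCtx (Sorting a tau : r) w = WAll a tau (wrapCtx r w)
wrapCtx (Assume phi : r)    w = WImpl phi (wrapCtx r w)
```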
For rules with multiple nontrivial premises, such as product subtyping
we thread solutions through inputs, applying them to the non-ground side ( \([\Delta ]-\) treats \(\Delta\) as a substitution of index solutions for existential variables), ultimately outputting both delayed constraints. We maintain the invariant that existential variables in output constraints are eagerly solved, which is why, for example, \(\Delta ^{\prime }\) is applied to \(W_1\) in the conclusion of the above rule, but not to \(W_2\) (that would be redundant).

6.4 Typing

We now discuss issues specific to algorithmic program typing.
Exploiting polarity, we can restrict the flow of index information to the right- and left-focusing stages: in particular, \({\Theta ; \Delta }; {\Gamma } {\vdash} {v} {\color{blue}{\Leftarrow}} {P} \mathrel {/}{\chi } \dashv {\Delta ^{\prime }}\) and \({\Theta ; \Delta }; {\Gamma }; [{N}] {\vdash} {s} \gg {{\uparrow \!{P}}} \mathrel {/}{\chi } \dashv {\Delta ^{\prime }}\) , the algorithmic value and spine typechecking judgments. The input types of these judgments can have existential variables, and these judgments synthesize constraints and index solutions, but the algorithmic versions of the other judgments do not; we judgmentally distinguish the latter by replacing the “ \({\vdash}\) ” in the declarative judgments with “ \(\rhd\) ” (for example, \({\Theta }; {\Gamma } \rhd {g} {\color{red}{\Rightarrow}} {{\uparrow \!{P}}}\) ). Delayed constraints are verified only and immediately after completing a focusing stage, when all their existential variables are necessarily solved.
Consequently, the algorithmic typing judgments for heads, bound expressions, pattern matching, and expressions are essentially the same as their declarative versions, but with a key difference. Namely, in \({\text{Alg}}{\color{red}{\Rightarrow}}{\text{ValAnnot}}\) , \({\text{Alg}}{\color{red}{\Rightarrow}}{\text{App}}\) , and \({\text{Alg}}{\color{blue}{\Leftarrow}}{{\uparrow}}\) (below, respectively), focusing stages start with an empty algorithmic context, outputting ground constraints (and an empty output context because solutions are eagerly applied), and a premise is added to verify these constraints:
Algorithmic typechecking for recursive expressions uses algorithmic subtyping, which outputs a ground constraint W. Because this W is ground, we can verify it ( \({\Theta } \models {W}\) ) immediately:
For the full definition of algorithmic typing, see appendix Figures 51, 52, and 53.
Besides the instantiation rules (such as \(<:^{{\color{#a600a6}{+}}}/{\dashv}\mu {\text{INST}}\) ) for inductive types in algorithmic subtyping and type equivalence, there are exactly two judgments ( \({\Theta ; \Delta } {\vdash} {\phi }\;\mathsf {inst} \dashv {\Delta ^{\prime }}\) and \({\Theta ; \Delta } {\vdash} {\phi } \equiv {\psi } \;\mathsf {inst} \dashv {\Delta ^{\prime }}\) ) responsible for inferring index solutions, both dealing with the output of algorithmic unrolling. Algorithmic unrolling can output propositions of the form \(\hat{a}= t\) with t ground, and these equations are solved in either value typechecking, subtyping, or type equivalence. In the former two cases, we can solve \(\hat{a}\) as the algebra body t, which is necessarily ground (as discussed in Section 4.2). The judgment \({\Theta ; \Delta } {\vdash} {\phi }\;\mathsf {inst} \dashv {\Delta ^{\prime }}\) (appendix Figure 45), used in \({\text{Alg}}{\color{blue}{\Leftarrow}}{{\wedge}}\) , checks whether \(\phi\) has form \(\hat{a}= t\) where t is ground. If so, then it solves \(\hat{a}\) to t in \(\Delta\) ; otherwise, it does not touch \(\Delta\) .
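A Haskell sketch of this instantiation judgment (hypothetical encoding): rule \(\text{Inst}\) fires only when the proposition is an equation \(\hat{a}= t\) with t ground; otherwise rule \(\text{NoInst}\) leaves \(\Delta\) unchanged.

```haskell
data Ix    = IxVar String | IxEVar String | IxLit Integer | IxAdd Ix Ix
data Prop  = PEq Ix Ix | POther
type Delta = [(String, Maybe Ix)]

-- Ground means no existential variables (universal variables are fine).
ground :: Ix -> Bool
ground (IxEVar _)  = False
ground (IxVar _)   = True
ground (IxLit _)   = True
ground (IxAdd s t) = ground s && ground t

-- Theta; Delta |- phi inst -| Delta'
inst :: Delta -> Prop -> Delta
inst delta (PEq (IxEVar a) t) | ground t = map solve delta   -- rule Inst
  where solve (b, Nothing) | b == a = (b, Just t)
        solve entry                 = entry
inst delta _ = delta                                         -- rule NoInst
```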
For example, suppose a head synthesizes \(\forall {a:\mathbb {N}}.\: \mathsf {Nat}(a) \rightarrow {\uparrow \!{\mathsf {Nat}(a)}}\) and we wish to apply this head to the spine (containing exactly one argument value) \({\tt into}({{\tt inj}_{2}\,{\langle {{\tt into}({{\tt inj}_{1}\,{\left\langle \right\rangle }})}, {\left\langle \right\rangle }\rangle }})\) . We generate a fresh existential variable \(\hat{a}\) for a (rule \(\text{AlgSpine}\forall\) ) and then check the value against \(\textsf {Nat}(\hat{a})\) (rule \(\text{AlgSpineApp}\) ). (Checking the same value against type \(\exists {a:\mathbb {N}}.\: \mathsf {Nat}(a)\) yields the same problem, by dual rule \({\text{Alg}}{\color{blue}{\Leftarrow}}{{\exists}}\) , and the following solution also works in this case.) The type \(\textsf {Nat}(\hat{a})\) has value-determined index \(\hat{a}\) (its \(\Xi\) is \(\hat{a}:\mathbb {N}\) ), so it is solvable. We unroll (rule \({\text{Alg}}{\color{blue}{\Leftarrow}}{{\mu}}\) ) \(\textsf {Nat}(\hat{a})\) to \(\big (1 \wedge (\hat{a}=0) \big) + \big (\exists {a^{\prime }:\mathbb {N}}.\: \textsf {Nat}(a^{\prime }) \times \big (1 \wedge (\hat{a}= 1 + a^{\prime }) \big) \big)\) and check \({\tt inj}_{2}\,{\langle {{\tt into}({{\tt inj}_{1}\,{\left\langle \right\rangle }})}, {\left\langle \right\rangle }\rangle }\) against that (0 and \(1+a^{\prime }\) are the bodies of the two branches of Nat’s algebra). In this unrolled type, \(\hat{a}\) is no longer tracked by its \(\Xi\) , but we can still solve it.
The value now in question is a right injection, so we must check \(\langle {{\tt into}({{\tt inj}_{1}\,{\left\langle \right\rangle }})}, {\left\langle \right\rangle }\rangle\) against \(\exists {a^{\prime }:\mathbb {N}}.\: \textsf {Nat}(a^{\prime }) \times \big (1 \wedge (\hat{a}=1+a^{\prime })\big)\) (rule \({\text{Alg}}{\color{blue}{\Leftarrow}}{{+}}_{2}\) ). We generate another fresh existential variable \(\hat{a}^{\prime }\) in place of \(a^{\prime }\) . We now check the pair using rule \({\text{Alg}}{\color{blue}{\Leftarrow}}{{\times}}\) . For the first component, we check \({\tt inj}_{1}\,{\left\langle \right\rangle }\) against the unrolled \(\textsf {Nat}(\hat{a}^{\prime })\) , which is \(1 \wedge (\hat{a}^{\prime } = 0) + \cdots\) . Now, we solve \(\hat{a}^{\prime } = 0\) (rules \({\text{Alg}}{\color{blue}{\Leftarrow}}{{+}}_{1}\) , \({\text{Alg}}{\color{blue}{\Leftarrow}}{{\wedge}}\) , \(\text{Inst}\) , and \({\text{Alg}}{\color{blue}{\Leftarrow}}{\text{1}}\) ). This information flows to the type \(1 \wedge (\hat{a}=1+\hat{a}^{\prime })\) against which we need to check the second value component ( \(\left\langle \right\rangle\) ). By “this information flows,” we mean that we apply the context output by type checking the first component, namely, \(\hat{a}:\mathbb {N}, {\hat{a}^{\prime }} : {\mathbb {N}}{=}{0}\) (notice \(\hat{a}\) is not yet solved), as a substitution to obtain \(1 \wedge (\hat{a}=1+0)\) for the second premise of \({\text{Alg}}{\color{blue}{\Leftarrow}}{{\times}}\) . The right-hand side of the equation now has no existential variables, and we solve \(\hat{a}= 1+0 = 1\) (again using \({\text{Alg}}{\color{blue}{\Leftarrow}}{{\wedge}}\) ), as expected. It is worth noting that this solving happens entirely within focusing stages.
Values of inductive type may involve program variables, so existential variables may be solved not by \({\text{Alg}}{\color{blue}{\Leftarrow}}{{\wedge}}\) (and \(\text{Inst}\) ), but rather in algorithmic subtyping, using the same instantiation judgment:
Finally, if an equation of the form \(\hat{a}= t\) makes its way into type equivalence (by checking a variable value against a sum type), then \(\hat{a}\) gets solved, not as t, but rather as the index in the same structural position (including logical structure) of the necessarily ground positive type on the left of the equivalence (see judgment \({\Theta ; \Delta } {\vdash} {\phi } \equiv {\psi } \;\mathsf {inst} \dashv {\Delta ^{\prime }}\) in appendix Figure 45, used in appendix Figure 46). For example, \({b:\mathbb {N}; \hat{a}:\mathbb {N}}; {x:(1\wedge (b=0+0))+\exists {b^{\prime }:\mathbb {N}}.\:\mathsf {Nat}(b^{\prime })\wedge (b=b^{\prime }+0+1)} {\vdash} {x} {\color{blue}{\Leftarrow}} {P} \mathrel {/}{\_} \dashv {{\hat{a}} : {\mathbb {N}}{=}{b}}\) where \(P = (1\wedge (\hat{a}=0)) + \exists {a^{\prime }:\mathbb {N}}.\:\mathsf {Nat}(a^{\prime })\wedge (\hat{a}=a^{\prime }+1)\) .
Fig. 15. Algorithmic value and spine typing.
Next, we cover the algorithmic value and spine typing rules (Figure 15) in detail.
Typechecking values. Because there is no stand-alone algorithmic version of \(\leq^{\color{#a600a6}{+}}\rightsquigarrow{\text{L}}\) (recall that, in algorithmic subtyping, we eagerly extract immediately under polarity shifts), the rule \({\text{Alg}}{\color{blue}{\Leftarrow}}{\text{Var}}\) clarifies why we require types in contexts to have already been subjected to extraction. With eager extraction in subtyping under polarity shifts, but without eager type extraction for program variables, we would not be able to extract any assumptions from Q in the (algorithmic) subtyping premise.
Rule \({\text{Alg}}{\color{blue}{\Leftarrow}}{{\exists}}\) generates a fresh existential variable that ultimately gets solved within the same stage. Its solution is eagerly applied to the input type and output constraints, so we pop it off of the output context (as it is no longer needed).
Rule \({\text{Alg}}{\color{blue}{\Leftarrow}}{{\mu}}\) unrolls the inductive type, checks the inductive type’s injected value against the unrolled type, and passes along the constraints and solutions.
Rule \({\text{Alg}}{\color{blue}{\Leftarrow}}{{\wedge}}\) delays verifying the validity of the conjoined proposition \(\phi\) until it is grounded. As explained in the example above, existential variables can be solved via propositions generated by algorithmic type unrolling. This is the role of the propositional instantiation judgment used in the second premise: It simply checks whether the proposition is of the form \(\hat{a}= t\) where t is ground, in which case it solves \(\hat{a}\) as t (rule \(\text{Inst}\) ), and otherwise it does nothing (rule \(\text{NoInst}\) ). If the proposition does solve an existential variable, then the \([\Delta ^{\prime }]\phi\) part of the constraint is a trivial equation, but \(\phi\) could be a non-ground proposition unrelated to unrolling, in which case \(\Delta ^{\prime } = \Delta ^{\prime \prime }\) , whose solutions have not yet been applied to the input \(\phi\) .
Rule \({\text{Alg}}{\color{blue}{\Leftarrow}}{{\downarrow}}\) does not have a premise for typechecking the thunked expression (unlike \({\text{Decl}}{\color{blue}{\Leftarrow}}{{\downarrow}}\) ). Instead, the rule delays this typechecking constraint until its existential variables are solved. For example, in
the output constraint \(\chi\) has \([0/\hat{a}]({\tt return}\,{\left\langle \right\rangle } {\color{blue}{\Leftarrow}} {\uparrow \!{\big (1 \wedge (\hat{a}=0)\big)}})\) , where the index solution 0 to the \(\hat{a}\) introduced by \({\text{Alg}}{\color{blue}{\Leftarrow}}{{\exists}}\) is found only in typechecking the second component of the pair.
Rule \({\text{Alg}}{\color{blue}{\Leftarrow}}{\text{1}}\) says that \(\left\langle \right\rangle\) checks against 1, which solves nothing, and there are no further constraints to check.
In rule \({\text{Alg}}{\color{blue}{\Leftarrow}}{{\times}}\) , we check the first component \(v_1\) , threading through solutions found there in checking the second component \(v_2\) . Checking the second component can solve further existential variables in the first component’s constraints \(\chi _1\) , but solutions are eagerly applied, so, in the conclusion, we apply all the solutions only to \(\chi _1\) .
Rule \({\text{Alg}}{\color{blue}{\Leftarrow}}{{+}}_{k}\) checks the k-injected value \(v_k\) against the sum’s kth component type and passes along the constraints and solutions.
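The threading of solutions through the value rules can be sketched in Haskell as follows; the representation is ours and heavily simplified (Index and Prop are as in the earlier sketches, \(\Delta\) maps existential variables to optional solutions, and \(\chi\) is a list of delayed constraints rather than an eagerly substituted one):

data PTy   = PUnit | PProd PTy PTy | PExists String PTy | PDown NTy
data NTy   = NUp PTy | NForall String NTy | NArrow PTy NTy | NGuard Prop NTy
data Value = VUnit | VPair Value Value | VThunk Expr
data Expr  = Return Value

data Constraint = Check Expr NTy | Verify Prop
type Chi   = [Constraint]
type Delta = [(String, Maybe Index)]          -- Nothing = not yet solved

checkV :: Delta -> Value -> PTy -> (Chi, Delta)
checkV delta v (PExists a p) =                -- Alg⇐∃: push a fresh â, ...
  let (chi, delta') = checkV ((a, Nothing) : delta) v p
  in  (chi, filter ((/= a) . fst) delta')     -- ... then pop its entry
checkV delta (VThunk e) (PDown n) =           -- Alg⇐↓: delay the thunk's
  ([Check e n], delta)                        -- typechecking constraint
checkV delta (VPair v1 v2) (PProd p1 p2) =    -- Alg⇐×: check v1, then thread
  let (chi1, delta1) = checkV delta  v1 p1    -- its solutions into checking
      (chi2, delta2) = checkV delta1 v2 p2    -- v2 (which may solve variables
  in  (chi1 ++ chi2, delta2)                  -- mentioned by chi1)
checkV delta VUnit PUnit = ([], delta)        -- Alg⇐1
checkV _ _ _ = error "remaining rules omitted from this sketch"

In the real rules, the solutions in the output context are also substituted eagerly into the accumulated constraints; this sketch leaves \(\chi\) unsubstituted to stay short.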
Typechecking spines. Rule \(\text{AlgSpine} \forall\) , similarly to \({\text{Alg}}{\color{blue}{\Leftarrow}}{{\exists}}\) , generates a fresh existential variable that ultimately gets solved in typechecking a value (in this case, the spine’s corresponding value). As usual, we pop off the solution because solutions are eagerly applied.
In rule \(\text{AlgSpineApp}\) , we typecheck the value v, outputting constraints \(\chi\) and solutions \(\Delta ^{\prime \prime }\) . We thread these solutions through when checking s, the rest of the spine, ultimately outputting constraints \(\chi ^{\prime }\) and solutions \(\Delta ^{\prime }\) . The context \(\Delta ^{\prime }\) may have more solutions than \(\Delta ^{\prime \prime }\) , and we eagerly apply solutions, so we need only apply \(\Delta ^{\prime }\) to the first value’s constraints \(\chi\) .
In \(\text{AlgSpine} \supset\) , we check the spine s but add the guarding proposition \(\phi\) to the list of constraints to verify later (applying the solutions \(\Delta ^{\prime }\) found when checking the spine).
In \(\text{AlgSpineNil}\) , nothing gets solved, so we output the input algorithmic context \(\Delta\) . Nothing needs to be verified, so we output the trivial constraint \(\mathsf {tt}\) .
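Continuing the sketch, spine typing threads solutions in the same left-to-right fashion:

data Spine = SNil | SApp Value Spine

checkSpine :: Delta -> Spine -> NTy -> (Chi, Delta)
checkSpine delta s (NForall a n) =            -- AlgSpine∀: fresh â, ...
  let (chi, delta') = checkSpine ((a, Nothing) : delta) s n
  in  (chi, filter ((/= a) . fst) delta')     -- ... popped once solved
checkSpine delta (SApp v s) (NArrow p n) =    -- AlgSpineApp: check the value,
  let (chi,  d2) = checkV delta v p           -- then thread its solutions
      (chi', d1) = checkSpine d2 s n          -- through the rest of the spine
  in  (chi ++ chi', d1)
checkSpine delta s (NGuard phi n) =           -- AlgSpine⊃: defer the guard φ
  let (chi, d1) = checkSpine delta s n
  in  (Verify phi : chi, d1)
checkSpine delta SNil (NUp _) = ([], delta)   -- AlgSpineNil: nothing solved;
                                              -- the empty list plays tt here
checkSpine _ _ _ = error "ill-formed spine in this sketch"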

7 Algorithmic Metatheory

We prove that the algorithmic system (Section 6) is decidable, as well as sound and complete with respect to the declarative system (Section 4).

7.1 Decidability

We have proved that all algorithmic judgments are decidable (appendix Section G). Algorithmic constraint verification \({\Theta } \models {W}\) and \({\Theta }; {\Gamma } \lhd {\chi }\) boils down to verifying propositional validity \({\Theta } {\vdash} {\phi } \;\mathsf {true}\) , which is known to be decidable [Barrett et al. 2009]. Besides that, our decidability proofs rely on fairly simple metrics for the various algorithmic judgments, which involve a simple size function (appendix Figures 57 and 58) and counting the number of subtyping constraints W in typing constraint lists \(\chi\) . We show that, for each algorithmic rule, every premise is smaller than the conclusion, according to the metrics. The most interesting lemmas we use state that the constraints output by algorithmic equivalence, subtyping, and program typing decrease in size (appendix Lemmas G.9, G.10, and G.12). For example:
Lemma 7.1 (Program Typing Shrinks Constraints).
(Lemma G.12 in appendix)
(1)
If \({\Theta ; \Delta }; {\Gamma } {\vdash} {v} {\color{blue}{\Leftarrow}} {P} \mathrel {/}{\chi } \dashv {\Delta ^{\prime }}\) , then \({\mathit {sz}}{(\chi)} \le {\mathit {sz}}{(v)}\) .
(2)
If \({\Theta ; \Delta }; {\Gamma }; [{N}] {\vdash} {s} \gg {{\uparrow \!{P}}} \mathrel {/}{\chi } \dashv {\Delta ^{\prime }}\) , then \({\mathit {sz}}{(\chi)} \le {\mathit {sz}}{(s)}\) .
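As a taste of these metrics, a size function over our earlier sketch's values and constraints (a simplification of the appendix's sz, which also accounts for types and subtyping constraints) makes Lemma 7.1(1) visible by inspection: every delayed constraint is built from a subterm of the value being checked.

szV :: Value -> Int
szV VUnit         = 1
szV (VPair v1 v2) = 1 + szV v1 + szV v2
szV (VThunk e)    = 1 + szE e

szE :: Expr -> Int
szE (Return v) = 1 + szV v

szChi :: Chi -> Int
szChi = sum . map szC
  where szC (Check e _) = szE e   -- type components are not counted here;
        szC (Verify _)  = 1       -- the appendix metric is finer-grained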

7.2 Algorithmic Soundness

We show that the algorithmic system is sound with respect to the declarative system. Since the algorithmic system is designed to mimic the judgmental structure of the declarative system, soundness (and completeness) of the algorithmic system is relatively straightforward to prove (the real difficulty lies in designing the overall system to make this the case).
Soundness of algorithmic subtyping says that, if the subtyping algorithm solves indexes under which its verification conditions hold, then subtyping holds declaratively under the same solutions:
Theorem 7.2 (Soundness of Algorithmic Subtyping).
(Theorem I.4 in appendix)
(1)
If \({\Theta ; \Delta }\vdash{P} <:^{+} {Q}/{W}\dashv{\Delta ^{\prime }}\) and \({\Theta ; \Delta } {\vdash} {Q}\; \mathsf {type} {[{\Xi }}]\) and \({P}\,{\mathsf {ground}}\) and \({\Theta } {\vdash} {\Delta ^{\prime }} \longrightarrow {\Omega }\) and \({\Theta } \models {[\Omega ]W}\) , then \({\Theta }\vdash{P}\leq^{+}{[\Omega ]Q}\) .
(2)
If \({\Theta ; \Delta }\vdash{N}<:^{-}{M}/{W}\dashv{\Delta ^{\prime }}\) and \({\Theta ; \Delta } {\vdash} {N}\; \mathsf {type} {[{\Xi }}]\) and \({M}\,{\mathsf {ground}}\) and \({\Theta } {\vdash} {\Delta ^{\prime }} \longrightarrow {\Omega }\) and \({\Theta } \models {[\Omega ]W}\) , then \({\Theta }\vdash{[\Omega ]N}\leq^{-}{M}\) .
We prove (appendix Theorem I.4) the soundness of algorithmic subtyping by way of two intermediate (sound) subtyping systems: a declarative system that eagerly extracts under shifts, and a semideclarative system that also eagerly extracts under shifts but outputs constraints W in the same way as algorithmic subtyping, to be checked by the semideclarative judgment \({\Theta } \rhd {W}\) (which we prove sound with respect to the algorithmic \({\Theta } \models {W}\) ). We define a straightforward subtyping constraint equivalence judgment \({\Theta } \rhd {W} \leftrightarrow {W^{\prime }}\) , which uses the proposition and type equivalence mentioned in Section 4.3, to transport semideclarative subtyping constraint verification to its algorithmic counterpart (and the other way around for algorithmic subtyping completeness): if \({\Theta } \rhd {W}\) and \({\Theta } \rhd {W} \leftrightarrow {W^{\prime }}\) , then \({\Theta } \rhd {W^{\prime }}\) (appendix Lemma H.24).
As a consequence of polarization, the soundness of head, bound expression, expression, and match typing can be stated relatively simply. The typing soundness of values and spines says that if \(\Omega\) completes the algorithm’s solutions such that the algorithm’s constraints hold, then the typing of the value or spine holds declaratively with \(\Omega\) ’s solutions.
Theorem 7.3 (Alg. Typing Sound).
(Theorem I.5 in appendix)
(1)
If \({\Theta }; {\Gamma } \rhd {h} {\color{red}{\Rightarrow}} {P}\) , then \({\Theta }; {\Gamma } {\vdash} {h} {\color{red}{\Rightarrow}} {P}\) .
(2)
If \({\Theta }; {\Gamma } \rhd {g} {\color{red}{\Rightarrow}} {{\uparrow \!{P}}}\) , then \({\Theta }; {\Gamma } {\vdash} {g} {\color{red}{\Rightarrow}} {{\uparrow \!{P}}}\) .
(3)
If \({\Theta ; \Delta }; {\Gamma } {\vdash} {v} {\color{blue}{\Leftarrow}} {P} \mathrel {/}{\chi } \dashv {\Delta ^{\prime }}\) and \({\Theta ; \Delta } {\vdash} {P}\; \mathsf {type} {[{\Xi }}]\) and \({\Theta } {\vdash} {\Delta ^{\prime }} \longrightarrow {\Omega }\) and \({\Theta }; {\Gamma } \lhd {[\Omega ]\chi }\) ,
then \({\Theta }; {\Gamma } {\vdash} {[\Omega ]v} {\color{blue}{\Leftarrow}} {[\Omega ]P}\) .
(4)
If \({\Theta }; {\Gamma } \rhd {e} {\color{blue}{\Leftarrow}} {N}\) , then \({\Theta }; {\Gamma } {\vdash} {e} {\color{blue}{\Leftarrow}} {N}\) .
(5)
If \({\Theta }; {\Gamma }; [{P}] \rhd {\lbrace {{r}_{i}} \mathbin {\Rightarrow }{{e}_{i}}\rbrace _{{i} \in {I}}} {\color{blue}{\Leftarrow}} {N}\) , then \({\Theta }; {\Gamma }; [{P}] {\vdash} {\lbrace {{r}_{i}} \mathbin {\Rightarrow }{{e}_{i}}\rbrace _{{i} \in {I}}} {\color{blue}{\Leftarrow}} {N}\) .
(6)
If \({\Theta ; \Delta }; {\Gamma }; [{N}] {\vdash} {s} \gg {{\uparrow \!{P}}} \mathrel {/}{\chi } \dashv {\Delta ^{\prime }}\) and \({\Theta ; \Delta } {\vdash} {N}\; \mathsf {type} {[{\Xi }}]\) and \({\Theta } {\vdash} {\Delta ^{\prime }} \longrightarrow {\Omega }\) and \({\Theta }; {\Gamma } \lhd {[\Omega ]\chi }\) ,
then \({\Theta }; {\Gamma }; [{[\Omega ]N}] {\vdash} {[\Omega ]s} \gg {[\Omega ]{\uparrow \!{P}}}\) .
We prove (appendix Theorem I.5) algorithmic typing is sound by a straightforward lexicographic induction on, first, program term structure, and, second, input type size. We happen not to use an intermediate system for this proof, but an intermediate system is very helpful if not indispensable for proving algorithmic typing completeness, discussed next.

7.3 Algorithmic Completeness

We show that the algorithmic system is complete with respect to the declarative system. The declarative system can conjure index solutions that are different from the algorithm’s solutions, but they must be equal according to the logical context. We capture this with relaxed context extension \({\Theta } {\vdash} {\Delta }\;{\tilde{\longrightarrow }}\;{\Delta ^{\prime }}\) , similar to (non-relaxed) context extension ( \({\Theta } {\vdash} {\Delta } \longrightarrow {\Delta ^{\prime }}\) ) but solutions in \(\Delta\) may change to equal (under \(\Theta\) ) solutions in \(\Delta ^{\prime }\) .
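Schematically, the characteristic rule lets a solution t be replaced by a solution \(t^{\prime }\) that \(\Theta\) proves equal (we show only this rule; the appendix gives the full definition): \[\frac{{\Theta } {\vdash} {\Delta }\;{\tilde{\longrightarrow }}\;{\Delta ^{\prime }} \qquad {\Theta } {\vdash} {t = t^{\prime }} \;\mathsf {true}}{{\Theta } {\vdash} {\Delta , {\hat{a}} : {\tau }{=}{t}}\;{\tilde{\longrightarrow }}\;{\Delta ^{\prime }, {\hat{a}} : {\tau }{=}{t^{\prime }}}}\]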
Algorithmic completeness says our subtyping algorithm verifies any declarative subtyping. Since declarative subtyping does not eagerly extract from types inside shifts in assumptive positions, but algorithmic subtyping does, the conclusion involves extraction from the given ground type. For example, the equivalence of the algorithmic solutions \(\Delta ^{\prime }\) to the indexes in \(\Omega\) for which subtyping declaratively holds may depend on extracted assumptions like \(\Theta _P\) in part (1) just below.
Theorem 7.4 (Completeness of Algorithmic Subtyping).
(Theorem J.14 in appendix)
(1)
If \({\Theta }\vdash{P}\leq^{+}{[\Omega ]Q}\) and \({\Theta ; \Delta } {\vdash} {Q}\; \mathsf {type} {[{\Xi }}]\) and \({P}\,{\mathsf {ground}}\) and \([\Delta ]Q = Q\) and \({\Theta } {\vdash} {\Delta }\;{\tilde{\longrightarrow }}\;{\Omega }\) ,
then there exist \(P^{\prime }\) , \(\Theta _P\) , W, and \(\Delta ^{\prime }\) such that \({\Theta , \Theta _P; \Delta }\vdash{P^{\prime }}<:^{+}{Q}/{W}\dashv{\Delta ^{\prime }}\)
and \({\Theta , \Theta _P} {\vdash} {\Delta ^{\prime }}\;{\tilde{\longrightarrow }}\;{\Omega }\) and \({\Theta , \Theta _P} \models {[\Omega ]W}\) and \({\Theta }\vdash{P}\rightsquigarrow^{+}{P^{\prime }}{\Theta _P}\) .
(2)
If \({\Theta }\vdash{[\Omega ]N}\leq^{-}{M}\) and \({\Theta ; \Delta } {\vdash} {N}\; \mathsf {type} {[{\Xi }}]\) and \({M}\,{\mathsf {ground}}\) and \([\Delta ]N = N\) and \({\Theta } {\vdash} {\Delta }\;{\tilde{\longrightarrow }}\;{\Omega }\) ,
then there exist \(M^{\prime }\) , \(\Theta _M\) , W, and \(\Delta ^{\prime }\) such that \({\Theta , \Theta _M; \Delta }\vdash{N}<:^{-}{M^{\prime }}/{W}\dashv{\Delta ^{\prime }}\)
and \({\Theta , \Theta _M} {\vdash} {\Delta ^{\prime }}\;{\tilde{\longrightarrow }}\;{\Omega }\) and \({\Theta , \Theta _M} \models {[\Omega ]W}\) and \({\Theta }\vdash{M}\rightsquigarrow^{-}{M^{\prime }}{\Theta _M}\) .
We prove (appendix Theorem J.14) completeness of algorithmic subtyping by using, in a similar way, the same intermediate systems used to prove soundness. However, it is more complicated. Indexes in semideclarative and algorithmic constraints may be syntactically different but logically and semantically equal. More crucially, to prove the completeness of algorithmic typing with respect to semideclarative typing, we need to prove that algorithmic subtyping solves all value-determined indexes of input types that are not necessarily ground:
Lemma 7.5 (Sub. Solves Val-det).
(Lemma J.5 in appendix)
(1)
If \({\Theta ; \Delta }\vdash{P}<:^{+}{Q}/{W}\dashv{\Delta ^{\prime }}\) and \({P}\,{\mathsf {ground}}\) and \({\Theta ; \Delta } {\vdash} {Q}\; \mathsf {type} {[{\Xi }}]\) ,
then for all \((\hat{a}: \tau) \in \Xi\) , there exists t such that \({\Theta } {\vdash} {t} : {\tau }\) and \(({\hat{a}} : {\tau }{=}{t}) \in \Delta ^{\prime }\) .
(2)
If \({\Theta ; \Delta }\vdash{M}<:^{-}{N}/{W}\dashv{\Delta ^{\prime }}\) and \({N}\,{\mathsf {ground}}\) and \({\Theta ; \Delta } {\vdash} {M}\; \mathsf {type} {[{\Xi }}]\) ,
then for all \((\hat{a}: \tau) \in \Xi\) , there exists t such that \({\Theta } {\vdash} {t} : {\tau }\) and \(({\hat{a}} : {\tau }{=}{t}) \in \Delta ^{\prime }\) .
We prove (appendix Lemma J.5) this by straightforward induction on the given subtyping derivation, using a similar lemma for type equivalence (appendix Lemma J.4).
We use extraction to achieve a complete subtyping algorithm. For example, certain subtypings hold declaratively without extraction by instead using \({\leq^{\color{blue}{+}}\wedge\; {\text{L}} }\) (this rule is not in our system; see Section 4.5). In one such instance, checking function argument subtyping first, the non-extractive algorithm solves \(\hat{c}\) to a (not b) and outputs a verification condition needing \(a=b\) to hold under no logical assumptions, which is invalid. Our system instead extracts \(a=b\) from the supertype; the algorithmic solution a for \(\hat{c}\) and the declarative choice b for \(\hat{c}\) are equal under this assumption ( \(a=b\) ).
Finally, we prove the completeness of algorithmic typing. Like algorithmic typing soundness, again due to focusing, the head, bound expression, expression, and pattern matching parts are straightforward to state. But, because algorithmic function application may instantiate indexes that are different from, but logically equal to, those conjured (semi)declaratively, a bound expression may algorithmically synthesize a type that is (judgmentally) equivalent to the type it synthesizes declaratively.
We introduced logical context equivalence in Section 4.3. Other than in proving that type equivalence implies subtyping, logical context equivalence is used in proving the completeness of algorithmic typing (in particular, we effectively use appendix Lemma B.95 to swap logically equivalent logical contexts in (semi)declarative typing derivations). The type \(P^{\prime }\) in the output of the algorithm in part (6) below can have different index solutions (output \(\Delta ^{\prime }\) ) that are logically equal (under \(\Theta\) ) to the solutions in \(\Omega\) that appear in the declaratively synthesized P. However, P and \(P^{\prime }\) necessarily have the same structure, so \({\Theta }\vdash{P}\equiv^{+}{[\Omega ]P^{\prime }}\) . Therefore, a bound expression may (semi)declaratively synthesize a type that is judgmentally equivalent to the type synthesized algorithmically. We then extract different but logically equivalent logical contexts from the (equivalent) types synthesized by a bound expression.
As such, algorithmic typing completeness is stated as follows:
Theorem 7.6 (Alg. Typing Complete).
(Theorem J.21 in appendix)
(1)
If \({\Theta }; {\Gamma } {\vdash} {h} {\color{red}{\Rightarrow}} {P}\) , then \({\Theta }; {\Gamma } \rhd {h} {\color{red}{\Rightarrow}} {P}\) .
(2)
If \({\Theta }; {\Gamma } {\vdash} {g} {\color{red}{\Rightarrow}} {{\uparrow \!{P}}}\) , then there exists \(P^{\prime }\) such that \({\Theta }; {\Gamma } \rhd {g} {\color{red}{\Rightarrow}} {{\uparrow \!{P^{\prime }}}}\) and \({\Theta }\vdash{P}\equiv^{+}{P^{\prime }}\) .
(3)
If \({\Theta }; {\Gamma } {\vdash} {v} {\color{blue}{\Leftarrow}} {[\Omega ]P}\) and \({\Theta ; \Delta } {\vdash} {P}\; \mathsf {type} {[{\Xi }}]\) and \([\Delta ]P = P\) and \({\Theta } {\vdash} {\Delta }\;{\tilde{\longrightarrow }}\;{\Omega }\) ,
then there exist \(\chi\) and \(\Delta ^{\prime }\) such that \({\Theta ; \Delta }; {\Gamma } {\vdash} {v} {\color{blue}{\Leftarrow}} {P} \mathrel {/}{\chi } \dashv {\Delta ^{\prime }}\)
and \({\Theta } {\vdash} {\Delta ^{\prime }}\;{\tilde{\longrightarrow }}\;{\Omega }\) and \({\Theta }; {\Gamma } \lhd {[\Omega ]\chi }\) .
(4)
If \({\Theta }; {\Gamma } {\vdash} {e} {\color{blue}{\Leftarrow}} {N}\) , then \({\Theta }; {\Gamma } \rhd {e} {\color{blue}{\Leftarrow}} {N}\) .
(5)
If \({\Theta }; {\Gamma }; [{P}] {\vdash} {\lbrace {{r}_{i}} \mathbin {\Rightarrow }{{e}_{i}}\rbrace _{{i} \in {I}}} {\color{blue}{\Leftarrow}} {N}\) , then \({\Theta }; {\Gamma }; [{P}] \rhd {\lbrace {{r}_{i}} \mathbin {\Rightarrow }{{e}_{i}}\rbrace _{{i} \in {I}}} {\color{blue}{\Leftarrow}} {N}\) .
(6)
If \({\Theta }; {\Gamma }; [{[\Omega ]N}] {\vdash} {s} \gg {{\uparrow \!{P}}}\) and \({\Theta ; \Delta } {\vdash} {N}\; \mathsf {type} {[{\Xi }}]\) and \([\Delta ]N = N\) and \({\Theta } {\vdash} {\Delta }\;{\tilde{\longrightarrow }}\;{\Omega }\) ,
then there exist \(P^{\prime }\) , \(\chi\) , and \(\Delta ^{\prime }\) such that \({\Theta ; \Delta }; {\Gamma }; [{N}] {\vdash} {s} \gg {{\uparrow \!{P^{\prime }}}} \mathrel {/}{\chi } \dashv {\Delta ^{\prime }}\)
and \({\Theta } {\vdash} {\Delta ^{\prime }}\;{\tilde{\longrightarrow }}\;{\Omega }\) and \({\Theta }; {\Gamma } \lhd {[\Omega ]\chi }\) and \({\Theta }\vdash{P}\equiv^{+}{[\Omega ]P^{\prime }}\) .
We prove (appendix Theorem J.21) algorithmic typing completeness by way of an intermediate, semideclarative typing system that is essentially the same as declarative typing in that it conjures indexes, but differs in a way similar to algorithmic typing: It outputs constraints \(\chi\) and only verifies them (via semideclarative \({\Theta }; {\Gamma } \;{\widetilde{\lhd }}\; {\chi }\) as opposed to algorithmic \({\Theta }; {\Gamma } \lhd {\chi }\) ) immediately upon completion of focusing stages. Similarly to the proof of algorithmic subtyping completeness, we transport (appendix Lemma J.18) the semideclarative verification of typing constraints over a straightforward typing constraint equivalence judgment \({\Theta }; {\Gamma } \lhd {{\chi } \leftrightarrow {\chi ^{\prime }}}\) that uses the subtyping constraint equivalence ( \({\Theta } \rhd {W} \leftrightarrow {W^{\prime }}\) ) and type equivalence judgments.
To prove that algorithmic typing is complete with respect to semideclarative typing, we use the fact that typing solves all value-determined indexes in input types of focusing stages. This fact is similar to the fact that subtyping solves the value-determined indexes of non-ground types (used in the algorithmic subtyping completeness proof), but the interaction between value-determined indexes and unrolling introduces some complexity: Unrolling a refined inductive type does not preserve the type’s \(\Xi\) . Therefore, we had to split the value typechecking part into the mutually recursive parts (1) and (2); part (3) depends on parts (1) and (2) but not vice versa.
Lemma 7.7 (Typing Solves Val-det).
(Lemma J.19 in appendix)
(1)
Suppose \(\Delta = \Delta _1, \hat{a}:\tau , \Delta _2\) . If \({\Xi };{\Theta };{\Delta }\vdash {\left\lbrace {\nu :G[\mu F]} \;\middle |\; {\beta \,(G\;(\mathsf {fold}_{F}\;{\alpha })\;\nu)=_{\tau }\hat{a}}\right\rbrace }\doteq Q\)
and \({\Theta ; \Delta } {\vdash} {G}\; \mathsf {functor} {[{\Xi _G}]}\) and \((\hat{a}:\tau) \notin \Xi _G\)
and \({\Theta ; \Delta }; {\Gamma } {\vdash} {v} {\color{blue}{\Leftarrow}} {Q} \mathrel {/}{\chi } \dashv {\Delta ^{\prime }}\)
and \([\Delta ]Q = Q\) and \({\Theta } {\vdash} {\Delta ^{\prime }}\;{\tilde{\longrightarrow }}\;{\Omega }\) and \({\Theta }; {\Gamma } \;{\widetilde{\lhd }}\; {[\Omega ]\chi }\) ,
then there exists t such that \({\Theta } {\vdash} {t} : {\tau }\) and \(({\hat{a}} : {\tau }{=}{t}) \in \Delta ^{\prime }\) .
(2)
If \({\Theta ; \Delta }; {\Gamma } {\vdash} {v} {\color{blue}{\Leftarrow}} {P} \mathrel {/}{\chi } \dashv {\Delta ^{\prime }}\)
and \({\Theta ; \Delta } {\vdash} {P}\; \mathsf {type} {[{\Xi _P}}]\) and \([\Delta ]P = P\) and \({\Theta } {\vdash} {\Delta ^{\prime }}\;{\tilde{\longrightarrow }}\;{\Omega }\) and \({\Theta }; {\Gamma } \;{\widetilde{\lhd }}\; {[\Omega ]\chi }\) ,
then for all \((\hat{a}: \tau) \in \Xi _P\) , there exists t such that \({\Theta } {\vdash} {t} : {\tau }\) and \(({\hat{a}} : {\tau }{=}{t}) \in \Delta ^{\prime }\) .
(3)
If \({\Theta ; \Delta }; {\Gamma }; [{N}] {\vdash} {s} \gg {{\uparrow \!{P}}} \mathrel {/}{\chi } \dashv {\Delta ^{\prime }}\)
and \({\Theta ; \Delta } {\vdash} {N}\; \mathsf {type} {[{\Xi _N}}]\) and \([\Delta ]N = N\) and \({\Theta } {\vdash} {\Delta ^{\prime }}\;{\tilde{\longrightarrow }}\;{\Omega }\) and \({\Theta }; {\Gamma } \;{\widetilde{\lhd }}\; {[\Omega ]\chi }\) ,
then for all \((\hat{a}: \tau) \in \Xi _N\) , there exists t such that \({\Theta } {\vdash} {t} : {\tau }\) and \(({\hat{a}} : {\tau }{=}{t}) \in \Delta ^{\prime }\) .
The proof (appendix Lemma J.19) of part (1) boils down to inversion on the propositional instantiation judgment \({\Theta ; \Delta } {\vdash} {\phi }\;\mathsf {inst} \dashv {\Delta ^{\prime }}\) in the unit case of unrolling where \(\phi\) necessarily has the form \(\hat{a}= t\) with \({t}\,{\mathsf {ground}}\) , due to the invariant that algebras are ground.

8 Related Work

Typing refinement. As far as we know, Constable [1983] was first to introduce the concept of refinement types (though not by that name) as logical subsets of types, writing {x:A|B} for the subset type classifying terms x of type A that satisfy proposition B. Freeman and Pfenning [1991] introduced type refinement to the programming language ML via datasort refinements—inclusion hierarchies of ML-style (algebraic, inductive) datatypes—and intersection types for Standard ML: They showed that full type inference is decidable under a refinement restriction and provided an algorithm based on abstract interpretation. The dangerous interaction of datasort refinements, intersection types, side effects, and call-by-value evaluation was first dealt with by Davies and Pfenning [2000] by way of a value restriction on intersection introduction; they also presented a bidirectional typing algorithm.
Dependent ML (DML) [Xi 1998; Xi and Pfenning 1999] extended the ML type discipline parametrically by index domains; typing in DML is decidable only modulo the decidability of index constraint satisfiability. DML uses a bidirectional type system with index refinements for a variant of ML, capable of checking properties ranging from in-bounds array access [Xi and Pfenning 1998] to program termination [Xi 2002]. DML, similarly to our system, collects constraints from the program and passes them to a constraint solver, but does not guarantee that they are SMT solvable (unlike our system). This is also the approach of systems like Stardust [Dunfield 2007a] (which combines datasort and index refinement, and supports index refinements that are not value-determined, that is, invaluable refinements, which we do not consider) and those with liquid types [Rondon et al. 2008]. The latter are based on a Hindley–Milner approach; typically, Hindley–Damas–Milner inference algorithms [Hindley 1969; Milner 1978; Damas and Milner 1982] generate typing constraints to be verified [Heeren et al. 2002].
Due to issues with existential index instantiation, the approach of Xi [1998] (incompletely) translated programs into a let-normal form [Sabry and Felleisen 1993] before typing them, and Dunfield [2007b] provided a complete let-normal translation for similar issues. The system in this paper is already let-normal.
Liquid types. Rondon et al. [2008] introduced logically qualified data types, that is, liquid types, in a system that extends Hindley–Milner to infer (by abstract interpretation) refinements based on built-in or programmer-provided refinement templates or qualifiers. Kawaguchi et al. [2009] introduced recursive refinement via sound and terminating measures on algebraic data types; they also introduced polymorphic refinement. Vazou et al. [2013] generalize recursive and polymorphic refinement into a single, more expressive mechanism: abstract refinement. Our system lacks polymorphism, which we plan to study in future work. Abstract refinements go together with multi-argument measures, because abstract refinements may be thought of as predicates of higher-order sort, and we can encode multi-argument measures using higher-order sorts. Extending our system with higher-order sorts and abstract refinements is ongoing work. In future work, it would be interesting to study other features of liquid typing in our setting. Extending our system with liquid inference of refinements, for example, would require adding Hindley–Milner type inference that creates templates, as well as mechanisms to solve these templates, possibly in an initial phase using abstract interpretation.
Unlike DML (and our system), liquid type systems do not distinguish index terms from programs. While this provides simplicity and convenience to the user (from their perspective, there is just one language), it makes it relatively difficult to give liquid type systems a denotational semantics and to prove soundness results denotationally (rather than operationally), in contrast to our system (we should be able to recover some of this convenience by, for example, requiring users to write measure annotations, as in Liquid Haskell). It also creates other subtleties, such as the fact that annotations for termination metrics in Liquid Haskell must be internally translated to ghost parameters similar to the one used in Section 3. By contrast, if we extend our system with additional termination metrics, then, because these metrics are at the index level, we can obviate such ghost parameters. Liquid types’ lack of an index distinction also makes it trickier to support computational effects and evaluation orders.
Initial work on liquid types [Rondon et al. 2008; Kawaguchi et al. 2009] used call-by-value languages, but Haskell uses lazy evaluation, so Liquid Haskell was discovered to be unsound [Vazou et al. 2014]. Vazou et al. [2014] regained type soundness by imposing operational-semantic restrictions on subtyping and let-binding. In their algorithmic subtyping, there is exactly one rule, \(\preceq\) -Base-D, which pertains to refinements of base types (integers, Booleans, and so on) and inductive data types; however, these types have a well-formedness restriction, namely, that the refinement predicates have the type of Boolean expressions that reduce to finite values. But this restriction alone does not suffice for soundness under laziness and divergence. As such, their algorithmic typing rule T-Case-D, which combines let-binding and pattern matching, uses an operational semantics to approximate whether or not the bound expression terminates. If the bound expression might diverge, then so might the entire case expression; otherwise, it checks each branch in a context that assumes the expression reduces to a (potentially infinite Haskell) value.
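To illustrate the problem, consider a schematic example in the style of Vazou et al. [2014] (not their exact program; safeDiv and the annotations below are ours):

{-@ diverge :: Int -> {v:Int | false} @-}
diverge :: Int -> Int
diverge n = diverge n        -- never returns a value, so under call-by-value
                             -- reasoning the refinement 'false' is sound

{-@ safeDiv :: Int -> {d:Int | d /= 0} -> Int @-}
safeDiv :: Int -> Int -> Int
safeDiv n d = n `div` d

explode :: Int
explode = let x = diverge 1  -- CBV: execution stops here, so the division
          in safeDiv 2016 0  -- below is dead code; lazily, x is bound but
                             -- never forced, yet x : {v:Int | false} enters
                             -- the context, vacuously "verifying" 0 /= 0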
We also have a type well-formedness restriction, but it is purely syntactic, and only on index quantification, requiring them to be associated with a fold that is necessarily decidable by virtue of a systematic phase distinction between the index level and the program level. Further, via type polarization, our let-binding rule requires the bound expression to return a value, we only allow value types in our program contexts, and we cannot extract index information across polarity shifts (such as in a suspended computation). Therefore, in our system, there is no need to stratify our types according to an approximate criterion; rather, we exploit the systemic distinction between positive (value) types and negative (computation) types, which Levy [2004] designed or discovered to be semantically well-behaved. We suspect that liquid types’ divergence-based stratification is indirectly grappling with logical polarity. Because divergence-based stratification is peculiar to the specific effect of nontermination, it is unclear how their approach may extend to other effects. By way of a standard embedding of CBN or CBV into (our focalized variant of) CBPV, we can obtain CBN or CBV subtyping and typing relations automatically respecting any necessary value or covalue restrictions [Zeilberger 2009]. Further, our system is already in a good position to handle the addition of effects other than nontermination.
Contract calculi. Software contracts express program properties in the same language as the programs themselves; Findler and Felleisen [2002] introduced contracts for runtime verification of higher-order functional programs. These latent contracts are not types, but manifest contracts are [Greenberg et al. 2010]. Manifest contracts are akin to refinement types. Indeed, Vazou et al. [2013] sketch a proof of type soundness for a liquid type system by translation from liquid types to the manifest contract calculus FH of Belo et al. [2011]. However, there is no explicit translation back, from FH to liquid typing. They mention that the translated terms in FH do not have upcasts because the latter in FH are logically related to identity functions if they correspond to static subtyping (as they do in the liquid type system): an upcast lemma. Presumably, this facilitates a translation from FH back to liquid types. However, there are technical problems in FH that break type soundness and the logical consistency of the FH contract system; Sekiyama et al. [2017] fix these issues, resulting in the system \(\text{F}^{\hspace{1.0pt}\sigma }_{\text{H}}\) , but do not consider subtyping and subsumption and do not prove an upcast lemma.
Bidirectional typing. Bidirectional typing [Pierce and Turner 2000] is a popular way to implement a wide variety of systems, including dependent types [Coquand 1996; Abel et al. 2008], contextual types [Pientka 2008], and object-oriented languages [Odersky et al. 2001]. The bidirectional system of Peyton Jones et al. [2007] supports higher-rank polymorphism. Dunfield and Krishnaswami [2013] also present a bidirectional type system for higher-rank polymorphism, but framed more proof theoretically; Dunfield and Krishnaswami [2019] extend it to a richer language with existentials, indexed types, sums, products, equations over type variables, pattern matching, polarized subtyping, and principality tracking. The bidirectional system of this paper uses logical techniques similar to the systems of Dunfield and Krishnaswami, but it does not consider polymorphism. A survey paper [Dunfield and Krishnaswami 2021] includes some discussion of bidirectional typing’s connections to proof theory. Basically, good bidirectional systems tend to distinguish checking and synthesizing terms or proofs according to their form, such as normal or neutral.
Proof theory, polarization, focusing, and analyticity. The concept of polarity most prominent in this article dates back to Andreoli’s work on focusing for tractable proof search [Andreoli 1992] and Girard’s work on unifying classical, intuitionistic, and linear logic [Girard 1993]. Logical polarity and focusing have been used to explain many common phenomena in programming languages. We mentioned in the overview that Zeilberger [2009] explains the value and evaluation context restrictions in terms of focusing; and Krishnaswami [2009] explains pattern matching as (proof terms of) the left-inversion stage of focused systems (also, that system is bidirectional). More broadly, Downen [2017] discusses many logical dualities common in programming languages.
Brock-Nannestad et al. [2015] study the relation between polarized intuitionistic logic and CBPV. They obtain a bidirectionally typed system of natural deduction related to a variant of the focused sequent calculus LJF [Liang and Miller 2009] by \(\eta\) -expansion (for inversion stages). Espírito Santo [2017] does a similar study, but starts with a focused sequent calculus for intuitionistic logic much like the system of Simmons [2014] (but without positive products), proves it equivalent to a natural deduction system (we think the lack of positive products helps establish this equivalence), and defines, also via \(\eta\) -expansion, a variant of CBPV in terms of the natural deduction system. Our system is not in the style of natural deduction, but rather sequent calculus. We think our system relates to CBPV in a similar way—via \(\eta\) -expansion—but we do not prove it in this article, because we focus on proving type soundness and algorithmic decidability, soundness, and completeness.
Barendregt et al. [1983] discovered that a program that typechecks (in a system with intersection types) using subtyping can also be checked without using subtyping, if the program is sufficiently \(\eta\) -expanded. An analogous phenomenon involving identity coercion was studied by Zeilberger [2009] in a focused setting. Similarly, our ability to place subtyping solely in (value) variable typechecking is achievable due to the focusing (and let-normality) of our system.
Interpreting Kant, Martin-Löf [1994] considers an analytic judgment to be one that is derivable using information found only in its inputs (in the sense of the bidirectional modes, input and output). A synthetic judgment, in contrast, requires us to look beyond the inputs of the judgment to find a derivation. The metatheoretic results for our algorithmic system demonstrate that our judgments are analytic, except the judgment \({\Theta } {\vdash} {\phi } \;\mathsf {true}\) , which is verified by an external SMT solver. As such, our system may be said to be analytic modulo an external SMT solver. Focusing, in proper combination with bidirectional typing (which clarifies where to put type annotations), let-normality, and value-determinedness, guarantees that all information needed to generate verification conditions suitable for an SMT solver may be found in the inputs to judgments. In our system, cut formulas can always be inferred from a type annotation or by looking up a variable in the program context, making all our cuts analytic in the sense of Smullyan [1969].
Dependent types. Dependent types, introduced by Martin-Löf [1971, 1975], are a key conceptual and historical precursor to index refinement types. Dependent types may depend on arbitrary program terms, not only terms restricted to indexes. This is highly expressive, but undecidable in languages with divergence. The main difference between refinement and dependent typing is that refinement typing attempts to increase the expressivity of a highly automatic type system, whereas dependent typing attempts to increase the automation of a highly expressive type system. Semantically, refinement type systems differ from dependent type systems in that they refine a pre-existing type system, so erasure of refinements always preserves typing.
Many dependent type systems impose their own restrictions for the sake of decidability. In Cayenne [Augustsson 1998], typechecking may proceed for only a given number of steps. All well-typed programs in Epigram [McBride and McKinna 2004] are required to terminate, so its type equivalence is decidable. Epigram and other systems [Chen and Xi 2005; Licata and Harper 2005] allow programmers to write explicit proofs of type equivalence.
Systems such as ATS [Xi 2004] and \(\text{F}^\star\) [Swamy et al. 2016] can be thought of as combining refinement and dependent types. These systems aim to bring the best of both refinement and dependent types, but ATS is more geared to practical, effectful functional programming (hence, refinement types), while \(\text{F}^\star\) is more geared to formal verification and dependent types. Unlike our system, they allow the programmer to provide proofs. The overall design of ATS is closer to our system than that of \(\text{F}^\star\) , due to its phase distinction between statics and dynamics; but it allows the programmer to write (in the language itself) proofs to simplify or eliminate constraints for the (external) constraint solver: Xi calls this internalized constraint solving. It should be possible to internalize constraint solving to some extent in our system. Liquid Haskell has a similar mechanism called refinement reflection [Vazou et al. 2017] in which programmers can write manual proofs (in Haskell) in cases where automatic Proof by Logical Evaluation and SMT solving fail.
Both ATS and \(\text{F}^\star\) have a CBV semantics, which is inherently monadic [Moggi 1989b]. Our system is a variant of CBPV, which subsumes both CBV and CBN. These systems consider effects other than divergence, like exceptions, mutable state, and input/output, which we hope to add to our system in future work; this should go relatively smoothly, because CBPV is inherently monadic. The system \(\text{F}^\star\) allows for termination metrics other than strong induction on naturals, such as lexicographic induction, but we think it would be straightforward to add such metrics to our system in the way discussed in Section 4.7.
Data abstraction and category theory. Categorically, inductive types are initial algebras of endofunctors. We only consider certain polynomial endofunctors, which specify tree-shaped or algebraic data structures. Objects (in the sense of object-oriented programming) or coinductive types are dual to inductive types in that, categorically, they are final coalgebras of endofunctors [Cook 2009]. A consideration of categorical duality leads us to a natural (perhaps naïve) question: If we can build a well-behaved system that refines algebraic data types by algebras, could it mean anything to refine objects by coalgebras? We would expect the most direct model of coinductive types would be via negative types, but working out the details is potential future work.
Our rolled refinement types refine type constructors \(\mu F\) . Sekiyama et al. [2015], again in work on manifest contracts, compare this to refining (types of) data constructors and provide a translation from type constructor to data constructor refinements. According to Sekiyama et al. [2015], type constructor refinements (such as our \({\left\lbrace {\nu : \mu F} \;\middle |\; {(\mathsf {fold}_{F}\;{\alpha }) \; \nu =_\tau t}\right\rbrace }\) ) are easier for the programmer to specify, but data constructor refinements (such as the output types of our unrolling judgment) are easier to verify automatically. Sekiyama et al. [2015] say that their translation from type to data constructor refinements is closely related to the work of Atkey et al. [2012] on refining inductive data in (a fibrational interpretation of) dependently typed languages. Atkey et al. [2012] provide “explicit formulas” computing inductive characterizations of type constructor refinements. These semantic formulas resemble our syntactic unrolling judgment, which may be viewed as a translation from type refinements to data constructor refinements.
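For instance, writing \(\mathit {len}\) for a length measure on lists with entries of type A (so that \(\mu F\) below is the list type), the type constructor refinement on the left unrolls, roughly, to the data constructor refinement on the right: \[{\left\lbrace {\nu : \mu F} \;\middle |\; {(\mathsf {fold}_{F}\;{\mathit {len}}) \; \nu =_{\mathbb {N}} n}\right\rbrace } \doteq (1 \wedge (n=0)) + \exists {n^{\prime }:\mathbb {N}}.\:\big (A \times {\left\lbrace {\nu : \mu F} \;\middle |\; {(\mathsf {fold}_{F}\;{\mathit {len}}) \; \nu =_{\mathbb {N}} n^{\prime }}\right\rbrace }\big) \wedge (n = n^{\prime }+1),\] which states the length constraint once per data constructor.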
Ornaments [McBride 2011] describe how inductive types with different logical or ornamental properties can be systematically related using their algebraic and structural commonalities. Practical work in ornaments seems mostly geared toward code reuse [Dagand and McBride 2012], code refactoring [Williams and Rémy 2017], and such. In contrast, this paper focuses on incorporating similar ideas in a foundational index refinement typing algorithm.
Melliès and Zeilberger [2015] provide a categorical theory of type refinement in general, where functors are considered to be type refinement systems. This framework is based on Reynolds’s distinction between intrinsic (or Church) and extrinsic (or Curry) views of typing [Reynolds 1998]. We think that our system fits into this framework, but have not confirmed it formally. This is most readily seen in the fact that the semantics of our refined system is simply the semantics (intrinsic to unrefined typing derivations) of its erasure of indexes, which express extrinsic properties of (erased) programs.

9 Conclusion and Future Work

We have presented a declarative system for index-based recursive refinement typing (with nullary measures) that is logically designed, semantically sound, and theoretically implementable. We have proved that our declarative system is sound with respect to an elementary domain-theoretic denotational semantics, which implies that our system is logically consistent and totally correct. We have also presented an algorithmic system and proved it is decidable, as well as sound and complete with respect to the declarative system. Focusing yields CBPV, which already has a clear denotational semantics, and refining it by an index domain (and by measures in it) facilitates a semantics in line with the perspective of Melliès and Zeilberger [2015]. But focusing (in combination with value-determinedness) also allows for an easy proof of the completeness of a decidable typing algorithm. The relative ease with which we demonstrate (denotational-semantic) soundness and (completeness of) decidable typing for a realistic language follows from a single, proof-theoretic technique: focusing.
Researchers of liquid typing have laid out an impressive and extensive research program providing many useful features that would be very interesting to study in our setting. We plan to add parametric polymorphism in future work, which goes along with refinement abstraction [Vazou et al. 2013]. Refinement abstraction may be thought of as predicates of higher-order sort, which can also accommodate multi-argument measures (such as whether a list of naturals is in increasing order). We are adding multi-argument measures and refinement abstraction in ongoing work. It is also of great interest to study other features of liquid typing, such as refinement inference with templates and refinement reflection, though arguably the latter is more closely related to dependent typing.
We also plan to allow the use of multiple measures on inductive types (so we can specify, for example, the type of length-n lists of naturals in increasing order). It would also be interesting to experiment with our value-determinedness technique. By allowing quantification over indexes in propositions whose only variable dependencies are value-determined, we think we can simulate termination metrics by other ones (such as \(\lt\) on sums of natural numbers by \(\lt\) on natural numbers as such).
In future work, we hope to apply our type refinement system (or future extensions of it) to various domains, from static time complexity analysis [Wang et al. 2017] to resource analysis [Handley et al. 2019]. Eventually, we hope to be able to express, for example, that a program terminates within a worst-case amount of time. Our system is parametric in the index domain, provided it satisfies some basic properties. Different index domains may be suitable for different applications. We also hope to add more effects, such as input/output and mutable reference cells. CBPV is built for effects, but our refinement layer may result in interesting interactions between effects and indexes.
Our system may at first seem complicated, but its metatheoretic proofs are largely straightforward, if lengthy (at least as presented). A primary source of this complexity is the proliferation of judgments. However, having various judgments helps us organize different forms of knowledge [Martin-Löf 1996] or (from a Curry–Howard perspective) stages or parts of an implementation (such as pattern-matching, processing an argument list, and so on).
Our system focuses on the feature of nullary measures of algebraic data types and does not include key typing features expected of a realistic programming language (such as additional effects and polymorphism, which we hope to add in future work). Adding such expressive features tends to significantly affect the metatheory and the techniques used to prove it. We hope to reflect on the development of our proofs (including those for systems with polymorphism [Dunfield and Krishnaswami 2013]) in search of abstractions that may help designers of practical, general-purpose functional languages to establish crucial metatheoretic properties.

Acknowledgments

We thank the anonymous reviewers for their thorough reading and recommendations, which helped to improve our article. We also thank Ondrej Baranovič [2023] for implementing the system presented in this article.

Footnotes

1
A language has a phase distinction if it can distinguish aspects that are relevant at runtime from those that are relevant only at compile time.
2
In today’s context, “Refinement ML” might seem a more appropriate name than Dependent ML. But when DML was invented, “refinement types” referred to datasort refinement systems; the abstract of Xi [1998] describes DML as “another attempt towards refining...type systems ..., following the step of refinement types” (Freeman and Pfenning 1991).
3
We give the algorithmic judgment to note existential variables \(\Delta\) do not flow through it or any of the non-focusing stages.
4
Semantically speaking.
5
A proposed inference rule is admissible with respect to a system if, whenever the premises of the proposed rule are derivable, we can derive the proposed rule’s conclusion using the system’s inference rules.
6
Typically, hereditary substitution reduces terms after substitution, modifying the syntax tree.
7
An object X in a category \(\mathbf {C}\) is initial if for every object Y in \(\mathbf {C}\) there exists a unique morphism \(X \rightarrow Y\) in \(\mathbf {C}\) .
8
This is not the case for all endofunctors. Therefore, not all endofunctors can be said to specify an inductive type. For example, consider the powerset functor.

References

[1]
Andreas Abel, Thierry Coquand, and Peter Dybjer. 2008. Verifying a semantic \(\beta \eta\) -conversion test for Martin-Löf type theory. In Conference on Mathematics of Program Construction (MPC’08)(LNCS, Vol. 5133). Springer, 29–56.
[2]
Jean-Marc Andreoli. 1992. Logic programming with focusing proofs in linear logic. J. Logic Comput. 2, 3 (1992), 297–347.
[3]
Robert Atkey, Patricia Johann, and Neil Ghani. 2012. Refining inductive types. Logic. Meth. Comput. Sci. 8, 2 (June 2012). DOI:
[4]
Lennart Augustsson. 1998. Cayenne—A language with dependent types. In ACM SIGPLAN International Conference on Functional Programming (ICFP’98). 239–250.
[5]
Ondrej Baranovič. 2023. LTR. Retrieved from https://github.com/nulano/LTR
[6]
Henk Barendregt, Mario Coppo, and Mariangiola Dezani-Ciancaglini. 1983. A filter lambda model and the completeness of type assignment. J. Symbol. Logic 48, 4 (1983), 931–940.
[7]
Clark Barrett, Roberto Sebastiani, Sanjit A. Seshia, and Cesare Tinelli. 2009. Satisfiability Modulo Theories (1st ed.). Number 1 in Frontiers in Artificial Intelligence and Applications. IOS Press, 825–885. DOI:
[8]
João Filipe Belo, Michael Greenberg, Atsushi Igarashi, and Benjamin C. Pierce. 2011. Polymorphic contracts. In 20th European Symposium on Programming(Lecture Notes in Computer Science, Vol. 6602), Gilles Barthe (Ed.). Springer International Publishing, 18–37. DOI:
[9]
Taus Brock-Nannestad, Nicolas Guenot, and Daniel Gustafsson. 2015. Computation in focused intuitionistic logic. In 17th International Symposium on Principles and Practice of Declarative Programming (PPDP’15). ACM Press, New York, NY, 43–54. DOI:
[10]
Iliano Cervesato and Frank Pfenning. 2003. A linear spine calculus. J. Logic Comput. 13, 5 (2003), 639–688.
[11]
Chiyan Chen and Hongwei Xi. 2005. Combining programming with theorem proving. In ACM SIGPLAN International Conference on Functional Programming (ICFP’05). 66–77.
[12]
James Cheney and Ralf Hinze. 2003. First-class Phantom Types. Technical Report CUCIS TR2003-1901. Cornell University. Retrieved from https://hdl.handle.net/1813/5614
[13]
Robert L. Constable. 1983. Mathematics as programming. In Logics of Programs, Workshop(Lecture Notes in Computer Science, Vol. 164), Edmund M. Clarke and Dexter Kozen (Eds.). Springer, 116–128. DOI:
[14]
William R. Cook. 2009. On understanding data abstraction, revisited. In 24th ACM SIGPLAN Conference on Object Oriented Programming Systems Languages and Applications (OOPSLA’09). ACM Press, New York, NY, 557–572. DOI:
[15]
Thierry Coquand. 1996. An algorithm for type-checking dependent types. Sci. Comput. Program. 26, 1–3 (1996), 167–177.
[16]
Pierre-Evariste Dagand and Conor McBride. 2012. Transporting functions across ornaments. In 17th ACM SIGPLAN International Conference on Functional Programming (ICFP’12). ACM Press, New York, NY, 103–114. DOI:
[17]
Luis Damas and Robin Milner. 1982. Principal type-schemes for functional programs. In SIGPLAN-SIGACT Symposium on Principles of Programming Languages (POPL’82). ACM Press, 207–212.
[18]
Rowan Davies and Frank Pfenning. 2000. Intersection types and computational effects. In ACM SIGPLAN International Conference on Functional Programming (ICFP’00). ACM Press, 198–208.
[19]
Leonardo de Moura and Nikolaj Bjørner. 2008. Z3: An efficient SMT solver. In International Conference on Tools and Algorithms for the Construction and Analysis of Systems (TACAS’08)(Lecture Notes in Computer Science, Vol. 4963). Springer, 337–340.
[20]
Paul Downen. 2017. Sequent Calculus: A Logic and a Language for Computation and Duality. Ph.D. Dissertation. University of Oregon. Retrieved from https://www.cs.uoregon.edu/Reports/PHD-201706-Downen.pdf
[21]
Jana Dunfield. 2007a. Refined typechecking with Stardust. In Programming Languages Meets Programming Verification Conference (PLPV’07). ACM Press, 21–32.
[22]
Jana Dunfield. 2007b. A Unified System of Type Refinements. Ph.D. Dissertation. Carnegie Mellon University. CMU-CS-07-129.
[23]
Jana Dunfield and Neelakantan R. Krishnaswami. 2013. Complete and easy bidirectional typechecking for higher-rank polymorphism. In ACM SIGPLAN International Conference on Functional Programming (ICFP’13). ACM Press, 429–442. arXiv:1306.6032 [cs.PL].
[24]
Jana Dunfield and Neelakantan R. Krishnaswami. 2019. Sound and complete bidirectional typechecking for higher-rank polymorphism with existentials and indexed types. Proc. ACM Program. Lang. 3, POPL (2019), 9:1–9:28.
[25]
Jana Dunfield and Neelakantan R. Krishnaswami. 2021. Bidirectional typing. ACM Comput. Surv. 54, 5 (2021).
[26]
Jana Dunfield and Frank Pfenning. 2003. Type assignment for intersections and unions in call-by-value languages. In International Conference on Foundations of Software Science and Computation Structures (FoSSaCS’03). Springer, 250–266.
[27]
José Espírito Santo. 2017. The polarized \(\lambda\) -calculus. Electron. Notes Theoret. Comput. Sci. 332 (2017), 149–168. DOI:
[28]
Robert Bruce Findler and Matthias Felleisen. 2002. Contracts for higher-order functions. In ACM SIGPLAN International Conference on Functional Programming (ICFP’02). 48–59.
[29]
Cormac Flanagan, Amr Sabry, Bruce F. Duba, and Matthias Felleisen. 1993. The essence of compiling with continuations. In ACM SIGPLAN Conference on Programming Language Design and Implementation (PLDI’93). 237–247.
[30]
Robert W. Floyd. 1967. Assigning meanings to programs. Proc. Symp. Appl. Math. 19 (1967), 19–32.
[31]
Tim Freeman and Frank Pfenning. 1991. Refinement types for ML. In ACM SIGPLAN Conference on Programming Language Design and Implementation (PLDI’91). ACM Press, 268–277.
[32]
Jean-Yves Girard. 1993. On the unity of logic. Ann. Pure Appl. Logic 59, 3 (1993), 201–217. DOI:
[33]
Jean-Yves Girard. 1992. A Fixpoint Theorem in Linear Logic. Retrieved from http://www.seas.upenn.edu/sweirich/types/archive/1992/msg00030.html
[34]
Michael Greenberg, Benjamin C. Pierce, and Stephanie Weirich. 2010. Contracts made manifest. In 37th Annual ACM SIGPLAN-SIGACT Symposium on Principles of Programming Languages (POPL’10). ACM Press, New York, NY, 353–364. DOI:
[35]
Carl A. Gunter. 1993. Semantics of Programming Languages—Structures and Techniques. MIT Press.
[36]
Martin A. T. Handley, Niki Vazou, and Graham Hutton. 2019. Liquidate your assets: Reasoning about resource usage in Liquid Haskell. Proc. ACM Program. Lang. 4, POPL, Article 24 (Dec. 2019), 27 pages. DOI:
[37]
Bob Harper and Mark Lillibridge. 1991. ML with Callcc is Unsound. Retrieved from https://www.seas.upenn.edu/sweirich/types/archive/1991/msg00034.html
[38]
Robert Harper, John C. Mitchell, and Eugenio Moggi. 1990. Higher-order modules and the phase distinction. In 17th ACM SIGPLAN-SIGACT Symposium on Principles of Programming Languages. ACM Press, San Francisco, CA, 341–354. DOI:
[39]
Bastiaan Heeren, Jurriaan Hage, and S. Doaitse Swierstra. 2002. Generalizing Hindley-Milner Type Inference Algorithms. Technical Report UU-CS-2002-031. Department of Information and Computing Sciences, Utrecht University. Retrieved from http://www.cs.uu.nl/research/techreps/repo/CS-2002/2002-031.pdf
[40]
R. Hindley. 1969. The principal type-scheme of an object in combinatory logic. Trans. Amer. Math. Soc. 146 (1969), 29–60.
[41]
C. A. R. Hoare. 1969. An axiomatic basis for computer programming. Commun. ACM 12, 10 (Oct. 1969), 576–580. DOI:
[42]
Ming Kawaguchi, Patrick Rondon, and Ranjit Jhala. 2009. Type-based data structure verification. In 30th ACM SIGPLAN Conference on Programming Language Design and Implementation (PLDI’09). ACM Press, New York, NY, 304–315. DOI:
[43]
Andrew Kennedy. 1994. Dimension types. In European Symposium on Programming (ESOP’94), Vol. 788. Springer, 348–362.
[44]
Neelakantan R. Krishnaswami. 2009. Focusing on pattern matching. In SIGPLAN-SIGACT Symposium on Principles of Programming Languages (POPL’09). ACM Press, 366–378.
[45]
Nico Lehmann, Adam T. Geller, Niki Vazou, and Ranjit Jhala. 2023. Flux: Liquid types for rust. Proc. ACM Program. Lang. 7, PLDI, Article 169 (June 2023), 25 pages. DOI:
[46]
Paul Blain Levy. 2004. Call-by-push-value: A Functional/Imperative Synthesis (Semantics Structures in Computation, V. 2). Kluwer Academic Publishers, Norwell, MA.
[47]
Chuck Liang and Dale Miller. 2009. Focusing and polarization in linear, intuitionistic, and classical logics. Theoret. Comput. Sci. 410, 46 (2009), 4747–4768. DOI:
[48]
Daniel R. Licata and Robert Harper. 2005. A Formulation of Dependent ML with Explicit Equality Proofs. Technical Report CMU-CS-05-178. Carnegie Mellon University. DOI:
[49]
Per Martin-Löf. 1971. A Theory of Types. Manuscript, Stockholm University. Retrieved from https://raw.githubusercontent.com/michaelt/martin-lof/master/pdfs/martin-loef1971%20-%20A%20Theory%20of%20Types.pdf
[50]
Per Martin-Löf. 1975. An intuitionistic theory of types: Predicative part. In Logic Colloquium’73, H. E. Rose and J. C. Shepherdson (Eds.) (Studies in Logic and the Foundations of Mathematics, Vol. 80). Elsevier, 73–118. DOI:
[51]
Per Martin-Löf. 1984. Intuitionistic Type Theory(Studies in Proof Theory, Vol. 1). Bibliopolis. iv+91 pages.
[52]
Per Martin-Löf. 1994. Analytic and synthetic judgements in type theory. In Kant and Contemporary Epistemology, Paolo Parrini (Ed.). Springer Netherlands, 87–99.
[53]
Per Martin-Löf. 1996. On the meanings of the logical constants and the justifications of the logical laws. Nordic J. Philos. Logic 1, 1 (1996), 11–60.
[54]
Conor McBride. 2011. Ornamental algebras, algebraic ornaments. Retrieved from https://personal.cis.strath.ac.uk/conor.mcbride/pub/OAAO/LitOrn.pdf
[55]
Conor McBride and James McKinna. 2004. The view from the left. J. Function. Program. 14, 1 (2004), 69–111.
[56]
Paul-André Melliès and Noam Zeilberger. 2015. Functors are type refinement systems. In 42nd Annual ACM SIGPLAN-SIGACT Symposium on Principles of Programming Languages (POPL’15). ACM Press, New York, NY, USA, 3–16. DOI:
[57]
Robin Milner. 1978. A theory of type polymorphism in programming. J. Comput. Syst. Sci. 17, 3 (1978), 348–375.
[58]
Eugenio Moggi. 1989a. A category-theoretic account of program modules. In Category Theory and Computer Science. Springer-Verlag, Berlin, 101–117.
[59]
Eugenio Moggi. 1989b. Computational lambda-calculus and monads. In Logic in Computer Science Conference (LICS’89). 14–23.
[60]
Martin Odersky, Matthias Zenger, and Christoph Zenger. 2001. Colored local type inference. In SIGPLAN-SIGACT Symposium on Principles of Programming Languages (POPL’01). ACM Press, 41–53.
[61]
Simon Peyton Jones, Dimitrios Vytiniotis, Stephanie Weirich, and Mark Shields. 2007. Practical type inference for arbitrary-rank types. J. Function. Program. 17, 1 (2007), 1–82.
[62]
Frank Pfenning. 2008. Church and Curry: Combining intrinsic and extrinsic typing. In Reasoning in Simple Type Theory: Festschrift in Honor of Peter B. Andrews on His 70th Birthday. College Publications. Retrieved from http://www.cs.cmu.edu/fp/papers/andrews08.pdf
[63]
Brigitte Pientka. 2008. A type-theoretic foundation for programming with higher-order abstract syntax and first-class substitutions. In SIGPLAN-SIGACT Symposium on Principles of Programming Languages (POPL’08). ACM Press, 371–382.
[64]
Benjamin C. Pierce and David N. Turner. 2000. Local type inference. ACM Trans. Prog. Lang. Sys. 22 (2000), 1–44.
[65]
John C. Reynolds. 1998. Theories of Programming Languages. Cambridge University Press.
[66]
Nick Rioux and Steve Zdancewic. 2020. Computation focusing. Proc. ACM Program. Lang. 4, ICFP, Article 95 (Aug. 2020), 27 pages. DOI:
[67]
Patrick Rondon, Ming Kawaguchi, and Ranjit Jhala. 2008. Liquid types. In ACM SIGPLAN Conference on Programming Language Design and Implementation (PLDI’08). ACM Press, 159–169.
[68]
Amr Sabry and Matthias Felleisen. 1993. Reasoning about programs in continuation-passing style. Lisp Symb. Comput. 6, 3–4 (Nov. 1993), 289–360. DOI:
[69]
Peter Schroeder-Heister. 1994. Definitional reflection and the completion. In Extensions of Logic Programming (LNCS). Springer, 333–347.
[70]
Taro Sekiyama, Atsushi Igarashi, and Michael Greenberg. 2017. Polymorphic manifest contracts, revised and resolved. ACM Trans. Program. Lang. Syst. 39, 1, Article 3 (Feb.2017), 36 pages. DOI:
[71]
Taro Sekiyama, Yuki Nishida, and Atsushi Igarashi. 2015. Manifest contracts for datatypes. In 42nd Symposium on Principles of Programming Languages. ACM, 195–207. DOI:
[72]
Robert J. Simmons. 2014. Structural focalization. ACM Trans. Comput. Logic 15, 3, Article 21 (Sept.2014). DOI:
[73]
Raymond M. Smullyan. 1969. Analytic cut. J. Symbol. Logic 33 (1969), 560–564.
[74]
Nikhil Swamy, Cătălin Hriţcu, Chantal Keller, Aseem Rastogi, Antoine Delignat-Lavaud, Simon Forest, Karthikeyan Bhargavan, Cédric Fournet, Pierre-Yves Strub, Markulf Kohlweiss, Jean-Karim Zinzindohoue, and Santiago Zanella-Béguelin. 2016. Dependent types and multi-monadic effects in F*. In 43rd Annual ACM SIGPLAN-SIGACT Symposium on Principles of Programming Languages (POPL’16). ACM Press, New York, NY, 256–270. DOI:
[75]
W. W. Tait. 1967. Intensional interpretations of functionals of finite type I. J. Symbol. Logic 32, 2 (1967), 198–212. Retrieved from http://www.jstor.org/stable/2271658
[76] Alan M. Turing. 1936. On computable numbers, with an application to the Entscheidungsproblem. Proc. London Math. Soc. 2, 42 (1936), 230–265.
[77] Niki Vazou, Patrick Maxim Rondon, and Ranjit Jhala. 2013. Abstract refinement types. In 22nd European Symposium on Programming: Programming Languages and Systems, Held as Part of the European Joint Conferences on Theory and Practice of Software (Lecture Notes in Computer Science, Vol. 7792), Matthias Felleisen and Philippa Gardner (Eds.). Springer, 209–228.
[78] Niki Vazou, Eric L. Seidel, Ranjit Jhala, Dimitrios Vytiniotis, and Simon Peyton-Jones. 2014. Refinement types for Haskell. In 19th ACM SIGPLAN International Conference on Functional Programming (ICFP'14). ACM Press, New York, NY, 269–282.
[79] Niki Vazou, Anish Tondwalkar, Vikraman Choudhury, Ryan G. Scott, Ryan R. Newton, Philip Wadler, and Ranjit Jhala. 2017. Refinement reflection: Complete verification with SMT. Proc. ACM Program. Lang. 2, POPL, Article 53 (Dec. 2017), 31 pages.
[80] Peng Wang, Di Wang, and Adam Chlipala. 2017. TiML: A functional language for practical complexity analysis with invariants. Proc. ACM Program. Lang. 1, OOPSLA, Article 79 (Oct. 2017), 26 pages.
[81] Kevin Watkins, Iliano Cervesato, Frank Pfenning, and David Walker. 2004. A concurrent logical framework: The propositional fragment. In Types for Proofs and Programs Conference (TYPES'03) (LNCS, Vol. 3085). Springer, 355–377.
[82] Thomas Williams and Didier Rémy. 2017. A principled approach to ornamentation in ML. Proc. ACM Program. Lang. 2, POPL, Article 21 (Dec. 2017).
[83] Andrew K. Wright. 1995. Simple imperative polymorphism. Lisp Symb. Comput. 8, 4 (1995), 343–355.
[84] Hongwei Xi. 1998. Dependent Types in Practical Programming. Ph.D. Dissertation. Carnegie Mellon University. Retrieved from https://www.cs.cmu.edu/~rwh/students/xi.pdf
[85] Hongwei Xi. 2002. Dependent types for program termination verification. Higher-Order Symbol. Comput. 15 (Oct. 2002), 91–131.
[86] Hongwei Xi. 2004. Applied type system (extended abstract). In Types for Proofs and Programs Conference (TYPES'03) (LNCS). Springer, 394–408.
[87] Hongwei Xi, Chiyan Chen, and Gang Chen. 2003. Guarded recursive datatype constructors. In SIGPLAN-SIGACT Symposium on Principles of Programming Languages (POPL'03). ACM Press, 224–235.
[88] Hongwei Xi and Frank Pfenning. 1998. Eliminating array bound checking through dependent types. In ACM SIGPLAN Conference on Programming Language Design and Implementation (PLDI'98). 249–257.
[89] Hongwei Xi and Frank Pfenning. 1999. Dependent types in practical programming. In SIGPLAN-SIGACT Symposium on Principles of Programming Languages (POPL'99). ACM Press, 214–227.
[90] Noam Zeilberger. 2009. Refinement types and computational duality. In Programming Languages meets Programming Verification Conference (PLPV'09). ACM Press, 15–26.

Cited By

  • Flux: Liquid Types for Rust. Proc. ACM Program. Lang. 7, PLDI (2023), 1533–1557. DOI: 10.1145/3591283. Online publication date: 6-Jun-2023.


Published In

ACM Transactions on Programming Languages and Systems, Volume 45, Issue 4 (December 2023), 178 pages.
ISSN: 0164-0925; EISSN: 1558-4593; DOI: 10.1145/3633281
Editor: Jan Vitek

Publisher

Association for Computing Machinery, New York, NY, United States

Publication History

Published: 20 December 2023
Online AM: 17 October 2023
Accepted: 22 June 2023
Revised: 22 April 2023
Received: 02 September 2022
Published in TOPLAS Volume 45, Issue 4


Author Tags

  1. refinement types
  2. bidirectional typechecking
  3. polarity
  4. call-by-push-value


Funding Sources

  • Natural Sciences and Engineering Research Council of Canada through Discovery
  • European Research Council (ERC) Consolidator
  • European Union’s Horizon 2020 Framework Programme
