Enforcing Availability in Failure-Aware Communicating Systems

López, Hugo A.; Nielson, Flemming; Nielson, Hanne Riis

doi:10.1007/978-3-319-39570-8_13

Part of the book series: Lecture Notes in Computer Science (LNPSE, volume 9688)

Included in the following conference series:

International Conference on Formal Techniques for Distributed Objects, Components, and Systems

Abstract

Choreographic programming is a programming-language design approach that drives error-safe protocol development in distributed systems. Motivated by challenging scenarios in Cyber-Physical Systems (CPS), we study how choreographic programming can cater for dynamic infrastructures where the availability of components may change at runtime. We introduce the Global Quality Calculus ($GC_q$), a process calculus featuring novel operators for multiparty, partial and collective communications; we provide a type discipline that controls how partial communications refer only to available components; and we show that well-typed choreographies enjoy progress.

1 Introduction

Choreographies are a well-established formalism in concurrent programming, with the purpose of providing a correct-by-construction framework for distributed systems [9, 12]. Using protocol narrations in the Alice-and-Bob style, they provide the structure of interactions among components in a distributed system. Combined with a behavioral type system, choreographies are capable of deriving distributed (endpoint) implementations. Endpoints generated from a choreography exhibit all and only the behaviors defined by it. Additionally, interactions among endpoints exhibit correctness properties, such as liveness and deadlock-freedom. In practice, choreographies guide the implementation of a system, either by automating the generation of correct deadlock-free code for each component involved, or by monitoring that the execution of a distributed system behaves according to a protocol [3, 9, 32].

In this paper we study the role of availability when building communication protocols. In short, availability describes the ability of a component to engage in a communication. So far, the study of communications using choreographies has assumed that components are always available. We challenge this assumption in light of new scenarios. The case of Cyber-Physical Systems (CPS) is one of them. In CPS, components become unavailable due to faults or because of changes in the environment. Even simple choreographies may fail when including availability considerations. Thus, a rigorous analysis of availability conditions in communication protocols becomes necessary before studying more advanced properties, such as deadlock-freedom or protocol fidelity.

Practitioners in CPS take availability into consideration, programming applications in a failure-aware fashion. First, application-based QoS policies replace old node-based ones. Second, one-to-many and many-to-one communication patterns replace peer-to-peer communications. Still, programming a CPS from a component viewpoint such that it respects an application-based QoS is difficult, because there is no centralized way to ensure its enforcement.

This paper advocates a choreography-based approach for the development of failure-aware communication protocols, as exemplified by CPS. On the one hand, interactions described in choreographies take a global viewpoint, in the same way application-based QoS describe availability policies in a node-conscious fashion. On the other hand, complex communication including one-to-many and many-to-one communications can be explicitly defined in the model, which is a clear advantage over component-based development currently used in CPS. Finally, choreographies give a formal foundation to practical development of distributed systems, with Chor [11], ParTypes [27] and Scribble [36].

Contributions. First, we present the Global Quality Calculus ($GC_q$), a process calculus aimed at capturing the most important aspects of CPS, such as variable availability conditions and multicast communications. It is a generalization of the Global Calculus [9, 12], a basic model for choreographies and the formal foundation of the Chor programming language [11]. With respect to the Global Calculus, $GC_q$ introduces two novel aspects: First, it extends the communication model to include collective communication primitives (broadcast and reduce). Second, it includes explicit availability considerations. Central to the calculus is the inclusion of quality predicates [33] and optional datatypes, whose role is to allow for communications where only a subset of the original participants is available.

Our second contribution relates to the verification of failure-aware protocols. We focus on progress. As an application-based QoS, a progress property requires that at least a minimum set of components is available before firing a communication action. Changing availability conditions may leave collective communications without enough required components, forbidding the completion of a protocol. We introduce a type system, orthogonal to session types, that ensures that well-typed protocols with variable availability conditions do not get stuck, preserving progress.

Document Structure. In Sect. 2 we introduce the design considerations for a calculus with variable availability conditions and we present a minimal working example to illustrate the calculus in action. Section 3 introduces syntax and semantics of $GC_q$. The progress-enforcing type system is presented in Sect. 4. Section 5 discusses related work. Finally, Sect. 6 concludes. The Appendix includes additional definitions.

2 Towards a Language for CPS Communications

The design of a language for CPS requires a technology-driven approach, that answers to requirements regarding the nature of communications and devices involved in CPS. Similar approaches have been successfully used for Web-Services [10, 31, 36], and Multicore Programming [14, 27]. The considerations on CPS used in this work come from well-established sources [2, 35]. We will proceed by describing their main differences with respect to traditional networks.

2.1 Unique Features in CPS Communications

Before defining a language for communication protocols in CPS, it is important to understand the taxonomy of networks where they operate. CPS are composed of sensor networks (SN) that perceive important measures of a system, and actuator networks that change it. Some of the most important characteristics in these networks include asynchronous operation, sensor mobility, energy-awareness, application-based protocol fidelity, data-centric protocol development, and multicast communication patterns. We will discuss each of them.

Asynchrony. Depending on the application, deployed sensors in a network have less accessible mobile access points, for instance, sensors deployed in harsh environmental conditions, such as arctic or marine networks. The environment may also affect the lifespan of a sensor, or increase its probability of failure. To maximize the lifespan of some sensors, one might expect an asynchronous operation, letting sensors remain in a standby state, collecting data periodically.

Sensor Mobility. The implementation of sensors in autonomic devices brings about important considerations on mobility. A sensor can move away from the base station, making its interactions energy-intensive. Instead, it might save energy to start a new session with a different base station closer to the new location.

Energy-Awareness. Limited by finite energetic resources, SN must optimize their energy consumption, both from node and application perspectives. From a node-specific perspective, a node in a sensor network can optimize its life by turning parts of the node off, such as the RF receiver. From an application-specific perspective, a protocol can optimize its energy usage by reducing its traffic. SN cover areas with dense node deployment, thus it is unnecessary that all nodes are operational to guarantee coverage. Additionally, SN must provide self-configuration capabilities, adapting their behavior to changing availability conditions. Finally, it is expected that some of the nodes deployed become permanently unavailable, as energetic resources run out. It might be more expensive to recharge the nodes than to deploy new ones. The SN must be ready to cope with a decrease in the number of available nodes.

Data-Centric Protocols. One of the most striking differences to traditional networks is the collaborative behavior expected in SN. Nodes aim at accomplishing a similar, universal goal, typically related to maintaining an application-level quality of service (QoS). Protocols are thus data-centric rather than node-centric. Moreover, decisions in SN are made from the aggregate data from sensing nodes, rather than the specific data of any of them [34]. Collective decision-making based on aggregates is common in SN, for instance, in protocol suites such as SPIN [20] and Directed Diffusion [24]. Shifting from node-level to application-level QoS implies that node fairness is considerably less important than in traditional networks. In consequence, the analysis of protocol fidelity [22] requires a shift from node-based guarantees towards application-based ones.

Multicast Communication. Rather than peer-to-peer message passing, one-to-many and many-to-one communications are better solutions for energy-efficient SN, as reported in [15, 19]. However, as the number of sensor nodes in a SN scales to large numbers, communications between a base and sensing nodes can become a limiting factor. Many-to-one traffic patterns can be combined with data aggregation services (e.g.: TAG [29] or TinyDB [30]), minimizing the amount and the size of messages between nodes.

2.2 Model Preview

We will illustrate how the requirements for CPS communications have been assembled in our calculus through a minimal example in Sensor Networks (SN). The syntax of our language is inspired by the Global Calculus [9, 12] extended with collective communication operations [27].

Example 1

Figure 1 portrays a simple SN choreography for temperature measurement. Line 1 models a session establishment phase between sensors $\mathfrak {t}_1, \mathfrak {t}_2, \mathfrak {t}_3$ (each of them implementing role S) and a monitor $\mathfrak {t}_m$ with role M. In Line 2, $\mathfrak {t}_m$ invokes the execution of method measure at each of the sensors. In Line 3, an asynchronous many-to-one communication (e.g. reduce) of values of the same base type ($\textsf {\small int}$ in this case) is performed between sensors and the monitor. Quality predicates model application-based QoS, established in terms of availability requirements for each of the nodes. For instance, one quality predicate only allows communications with all sensors in place, while another tolerates the absence of one of the sensors in data harvesting. Once nodes satisfy the application's QoS requirements, an aggregation operation will be applied to the messages received, in this case computing the average value.
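The many-to-one step of Example 1 can be pictured with a small Python sketch. It is an illustration only, not the calculus itself: the names `forall`, `at_least` and `reduce_average` are hypothetical, an unavailable sensor is modelled by a `None` reading, and the predicate that tolerates one absent sensor is assumed to mean "at least two of the three sensors".

```python
from typing import Callable, Dict, Optional

# A quality predicate decides, from the available senders and the
# expected senders, whether the communication may fire.
QualityPredicate = Callable[[set, set], bool]

def forall(available: set, expected: set) -> bool:
    """All expected participants must be available."""
    return available == expected

def at_least(n: int) -> QualityPredicate:
    """At least n expected participants must be available (assumed reading)."""
    return lambda available, expected: len(available & expected) >= n

def reduce_average(readings: Dict[str, Optional[int]],
                   q: QualityPredicate) -> Optional[float]:
    """Average the readings that are present, if q accepts the available senders."""
    expected = set(readings)
    available = {s for s, v in readings.items() if v is not None}
    if not available or not q(available, expected):
        return None  # not enough sensors: the reduce does not fire
    values = [readings[s] for s in available]
    return sum(values) / len(values)

readings = {"t1": 20, "t2": 22, "t3": None}   # t3 is unavailable
print(reduce_average(readings, forall))       # None
print(reduce_average(readings, at_least(2)))  # 21.0
```

With all three sensors required, the missing reading of `t3` blocks the aggregation; with the weaker predicate the monitor still obtains an average from the two sensors that answered.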

One important characteristic of fault-tolerant systems, of which CPS are part, is known as graceful degradation. Graceful degradation allows a system to maintain functionality when portions of a system break down, for instance, when some of the nodes are unavailable for a communication. The use of different quality predicates allows us to describe choreographies that gracefully degrade, since the system preserves functionality even when one of the nodes is unavailable.

Considerations regarding the impact of available components in a communication must be tracked explicitly. Annotations (in blue font in Fig. 1) define capabilities, that is, control points achieved in the system. Each thread $\mathfrak {t}$ carries a pair of annotations: the first denotes the required capability for $\mathfrak {t}$ to act, and the second describes the capability offered after $\mathfrak {t}$ has engaged in an interaction. No preconditions are necessary for establishing a new session, so no required capabilities are necessary in Line 1. After a session has been established, the capabilities offered in Line 1 are available in the system. Lines 2 and 3 modify which capabilities are present depending on the number of available threads. For example, a run of the choreography in Fig. 1 under a quality predicate that tolerates absent sensors will update the capabilities to any one of several sets, depending on which sensors took part. The interplay between capabilities and quality predicates may lead to choreographies that cannot progress. For example, the choreography above can get stuck when a later line requires all sensors, since three of the possible evolutions fail to provide the required capabilities. We will defer the discussion about the interplay of capabilities and quality predicates to Sect. 4.

3 The Global Quality Calculus ($GC_q$)

In the following, $ C$ denotes a choreography; p denotes an annotated thread $\mathfrak {t}[A]\{X; Y\}$, where $\mathfrak {t}$ is a thread, X, Y are atomic formulae and A is a role annotation. We will use $\widetilde{\mathfrak {t}}$ to denote $\{\mathfrak {t}_1, \ldots , \mathfrak {t}_j\}$ for a finite j. Variable a ranges over service channels, intuitively denoting the public identifier of a service, and $k \in {\mathbf N}$ ranges over a finite, countable set of session names, created at runtime. Variable x ranges over variables local to a thread. We use terms t to denote data and expressions e to denote optional data, much like the use of option data types in programming languages like Standard ML [18]. Expressions include arithmetic and other first-order expressions excluding service and session channels. In particular, the expression $\mathsf {some}(t)$ signals the presence of some data t and $\mathsf {none}$ the absence of data. In our model, terms denote closed values v. Names m, n range over threads and session channels. For simplicity of presentation, all models in the paper are finite.

Definition 1

( $GC_q$ syntax).

A novelty in this variant of the Global Calculus is the addition of quality predicates $q$, binding thread vectors in a multiparty communication. Essentially, $q$ determines when sufficient inputs/outputs are available. As an example, $q$ can be $\exists $, meaning that one sender/receiver is required in the interaction, or it can be $\forall $, meaning that all of them are needed. The syntax of $q$ and other examples are summarised in Fig. 2. We require $q$ to be monotonic (in the sense that $q(\widetilde{\mathfrak {t}_s})$ implies $q(\widetilde{\mathfrak {t}_r})$ for all $\widetilde{\mathfrak {t}_s} \subseteq \widetilde{\mathfrak {t}_r}$) and satisfiable.
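The two side conditions on quality predicates can be checked by brute force on a finite set of threads. The following Python sketch is a hypothetical illustration: predicates are modelled as functions from a set of available threads to a boolean, and the helper names `subsets`, `is_satisfiable` and `is_monotonic` are not part of the calculus.

```python
from itertools import combinations

def subsets(threads):
    """All subsets of a finite collection of threads, as frozensets."""
    items = list(threads)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]

def is_satisfiable(q, threads):
    """q holds for at least one subset of the threads."""
    return any(q(s) for s in subsets(threads))

def is_monotonic(q, threads):
    """Whenever q holds for s, it also holds for every superset r of s."""
    all_subsets = subsets(threads)
    return all(q(r) for s in all_subsets if q(s)
               for r in all_subsets if s <= r)

threads = {"t1", "t2", "t3"}
exists = lambda s: len(s) >= 1              # one participant suffices
forall = lambda s: s == frozenset(threads)  # every participant is needed
exactly_one = lambda s: len(s) == 1         # not monotonic, hence not admissible

print(is_monotonic(exists, threads), is_monotonic(forall, threads))  # True True
print(is_monotonic(exactly_one, threads))                            # False
```

A predicate such as `exactly_one` is rejected because adding a second available thread would turn a satisfied requirement into an unsatisfied one.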

We will focus our discussion on the novel interactions. First, $\mathbf{{start}}$ defines a (multiparty) session initiation between active annotated threads $\widetilde{p_r}$ and annotated service threads $\widetilde{p_{s}}$. Each active thread (resp. service thread) implements the behaviour of one of the roles in $\widetilde{A_r}$ (resp. $\widetilde{A_s}$), sharing a new session name k. We assume that a session is established with at least two participating processes, therefore $2 \le |\widetilde{p_r}|+|\widetilde{p_s}|$, and that threads in $\widetilde{p_r}\cup \widetilde{p_s}$ are pairwise different.

The language features broadcast, reduce and selection as collective interactions. A broadcast describes one-to-many communication patterns, where a session channel k is used to transfer the evaluation of expression e (located at $p_r$) to threads in $\widetilde{p_s}$, with the resulting binding of variable $x_i$ at $p_i$, for each $p_i \in \widetilde{p_s}$. At this level of abstraction, we do not differentiate between ways to implement one-to-many communications (so both broadcast and multicast implementations are allowed). A reduce combines many-to-one communications and aggregation [29]. In a reduce, each annotated thread $p_i$ in $\widetilde{p_r}$ evaluates an expression $e_i$, and the aggregate of all receptions is evaluated using $\mathsf {op}$ (an operator defined on multisets such as $\mathsf {max}, \mathsf {min}$, etc.). A selection describes a collective label selection: $p_r$ communicates the selection of label l to peers in $\widetilde{p_s}$ through session k.

Central to our language are progress capabilities. Pairs of atomic formulae at each annotated thread state the necessary preconditions for a thread to engage (the first formula), and the capabilities provided after its interaction (the second formula). As we will see in the semantics, there are no associated preconditions for session initiation (i.e. threads are created at runtime), so we normally omit them. Explicit x@p/e@p indicate the variable/boolean expression x/e is located at p. We often omit $\varvec{0}$, empty vectors, roles, and atomic formulae from annotated threads when unnecessary.

The free term variables $\mathsf {fv}( C)$ are defined as usual. An interaction $\mathcal \eta $ in $\mathcal \eta \varvec{;}\ C$ can bind session channels, choreographies and variables. In $\mathbf{{start}}$, variables $\{\widetilde{p_r}, a\}$ are free while variables $\{\widetilde{p_s}, k\}$ are bound (since they are freshly created). In broadcast, variables $\widetilde{x_s}$ are bound. A reduce binds $\{x\}$. Finally, we assume that all bound variables in an expression have been renamed apart from each other, and apart from any other free variables in the expression.

Expressivity. Roles are crucial only in a $\mathbf{{start}}$ interaction. Technically, one can infer the role of a given thread $\mathfrak {t}$ used in an interaction $\mathcal \eta $ by looking at the $\mathbf{{start}}$ interactions preceding it in the abstract syntax tree. $GC_q$ can still represent unicast message-passing patterns as in [9]. Unicast communication $p_1.e\, \hbox {-}{>} p_2:x: k $ can be encoded in multiple ways using broadcast/reduce operators. For instance, a broadcast with a single receiver and a reduce with a single sender are just a couple of possible implementations. The implementation of unicast label selection $p \,\hbox {-}{>} r:k[l]$ can be expressed analogously.

3.1 Semantics

Choreographies are considered modulo standard structural and swapping congruence relations (resp. $\equiv $, $\simeq _C$). Relation $\equiv $ is defined as the least congruence relation on $ C$ supporting $\alpha $-renaming, such that $( C, \varvec{0}, + )$ is an abelian monoid. The swap congruence [12] provides a way to reorder non-conflicting interactions, allowing for a restricted form of asynchronous behavior. Non-conflicting interactions are those involving sender-receiver actions that do not form a control-flow dependency. Formally, let $\mathbf {T}( C) $ be the set of threads in $ C$, defined inductively as $\mathbf {T}(\mathcal \eta \varvec{;}\ C) \mathop {=}\limits ^{\texttt {def} }\mathbf {T}(\mathcal \eta ) \cup \mathbf {T}(C) $, and $\mathbf {T}(\mathcal \eta ) \mathop {=}\limits ^{\texttt {def} }\bigcup _{i \in \{1..j\}} \{\mathfrak {t}_i\}$ if $\mathcal \eta $ is a broadcast involving threads $\mathfrak {t}_1, \ldots , \mathfrak {t}_j$ (similarly for init, reduce and selection, and standardly for the other process constructs in $ C$). The swapping congruence rules are presented in Fig. 3.

A state $\sigma $ keeps track of the capabilities achieved by a thread in a session, and it is formally defined as a set of maps $(\mathfrak {t}, k) \mapsto X$. The rules in Fig. 4 define state manipulation operations, including update ($\sigma [\sigma ' ]$), and lookup ($\sigma (\mathfrak {t}, k)$).
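One way to picture such a state is as a finite map from (thread, session) pairs to sets of capabilities. The Python sketch below is an assumption-laden illustration: the precise rules live in Fig. 4, and here update is assumed to let the entries of $\sigma '$ override those of $\sigma $, while lookup of an unknown pair is assumed to return the empty set. The function names `lookup` and `update` are hypothetical.

```python
def lookup(sigma, thread, session):
    """sigma(t, k): capabilities held by a thread in a session.

    Returning the empty set for unknown pairs is an assumption.
    """
    return sigma.get((thread, session), frozenset())

def update(sigma, sigma_prime):
    """sigma[sigma']: entries of sigma' override those of sigma (assumed reading)."""
    merged = dict(sigma)          # leave the original state untouched
    merged.update(sigma_prime)
    return merged

sigma = {("t1", "k"): frozenset({"X"}), ("tm", "k"): frozenset({"X"})}
sigma2 = update(sigma, {("t1", "k"): frozenset({"Y"})})

print(lookup(sigma2, "t1", "k"))  # frozenset({'Y'})
print(lookup(sigma2, "tm", "k"))  # frozenset({'X'})
```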

Because of the introduction of quality predicates, a move from $\mathcal \eta \varvec{;}\ C$ into $ C$ might leave some variables in $\mathcal \eta $ without proper values, as the participants involved might not have been available. We draw inspiration from [33], introducing effect rules describing how the evaluation of an expression in a reduce operation affects interactions. The relation $\twoheadrightarrow $ (Fig. 5) describes how evaluations are partially applied without affecting waiting threads. Label $\xi $ records the substitutions of atomic formulae in each thread.

Finally, given ${\phi }\in \{ \mathtt {tt}, \mathtt {ff}\} $, the relation $ \beta \,{:}{:}_{{\phi }} \; {\theta }$ tracks whether all required binders in $\beta $ have been performed, as well as the substitutions $\theta $ used. Binder $\beta $ is defined in terms of partially evaluated outputs c:

$$ \begin{aligned} sc \,{:}{:}=&\ p.e \ \mid \ p.\mathsf {some}(v)&\qquad c \,{:}{:}=&\ \& _{q}( sc_1, \ldots , sc_n) \end{aligned}$$

The rules specifying $\beta \,{:}{:}_{\phi } \, {\theta }$ appear in Fig. 6. A substitution $\theta = [(p_1,\mathsf {some}(v_1)), \ldots , (p_n,\mathsf {some}(v_n)) /x_1@p_1, \ldots , x_n@p_n ]$ maps each variable $x_i$ at $p_i$ to optional data $\mathsf {some}(v_i)$ for $1 \le i \le n$. A composition $\theta _1 \circ \theta _2$ is defined as $(\theta _1 \circ \theta _2)(x) \,{:}{:}= \theta _1(\theta _2(x))$, and $q(t_1, \ldots , t_n) = \bigwedge _{1 \le i \le n} t_i$ if $q = \forall $, $q(t_1, \ldots , t_n) = \bigvee _{1 \le i \le n} t_i$ if $q = \exists $, and possible combinations therein. As for process terms, $\theta ( C)$ denotes the application of substitution $\theta $ to a term $ C$ (and similarly for $\mathcal \eta $).
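As a small worked instance of the two base cases, take $n = 3$ with the third participant unavailable (the truth values are chosen only for illustration):

$$ \begin{aligned} \forall (\mathtt {tt}, \mathtt {tt}, \mathtt {ff}) &= \mathtt {tt} \wedge \mathtt {tt} \wedge \mathtt {ff} = \mathtt {ff} \\ \exists (\mathtt {tt}, \mathtt {tt}, \mathtt {ff}) &= \mathtt {tt} \vee \mathtt {tt} \vee \mathtt {ff} = \mathtt {tt} \end{aligned}$$

So the same availability pattern blocks an interaction guarded by $\forall $ and enables one guarded by $\exists $.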

We now have all the ingredients to understand the semantics of $GC_q$. The set of transition rules in $\xrightarrow {\lambda }$ is defined as the least relation on names, states, and choreographies satisfying the rules in Fig. 7. The operational semantics is given in terms of labelled transition rules. Intuitively, a transition $(\mathbf \nu \widetilde{m})\left\langle \sigma , C \right\rangle \xrightarrow {\,\lambda \,} \, (\mathbf \nu \widetilde{n})\left\langle \sigma ', C' \right\rangle $ expresses that a configuration $\langle \sigma , C\rangle $ with used names $\widetilde{m}$ fires an action $\lambda $ and evolves into $\langle \sigma ', C' \rangle $ with names $\widetilde{n}$. We use the shorthand notation $A ~\#~ B$ to denote set disjointness, $A \cap B = \emptyset $. The exchange function on a set Z replaces the capabilities X with new ones if $ X \subseteq Z$, and returns Z otherwise. Actions are defined as $\lambda \,{:}{:}=\{\tau , \mathcal \eta \}$, where $\mathcal \eta $ denotes interactions, and $\tau $ represents an internal computation. Relation $e@p \downarrow v$ describes the evaluation of an expression e (in p) to a value v.

We now give intuitions on the most representative operational rules. Rule $\lfloor $ Init $\rceil $ models initial interactions: state $\sigma $ is updated to account for the new threads in the session, updating the set of used names in the reductum. Rule $\lfloor $ Bcast $\rceil $ models broadcast: given an expression evaluated at the sender, one needs to check that there are enough receivers ready to get a message. Such a check is performed by evaluating q(J). In case of a positive evaluation, the execution of the rule will: (1) update the current state with the new states of each participant engaged in the broadcast, and (2) apply the partial substitution $\theta $ to the continuation $ C$. The behaviour of a reduce operation is described using rules $\lfloor $ RedD $\rceil $ and $\lfloor $ RedE $\rceil $: the evaluation of expressions of each of the available senders generates an application of the effect rule in Fig. 5. If all required substitutions have been performed, one can proceed by evaluating the operator to the set of received values, binding variable x to its results, otherwise the choreography will wait until further inputs are received (i.e.: the continuation is delayed).
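The intuition behind the broadcast rule can be sketched in Python. This is a loose reading under stated assumptions, not the rule of Fig. 7: threads are encoded as hypothetical `(thread, required, offered)` triples, a thread counts as ready when its required capabilities are contained in its current state, and firing replaces the required capabilities by the offered ones. The function `broadcast_step` and this encoding are illustrative only.

```python
def broadcast_step(sigma, session, sender, value, receivers, q):
    """Try to fire one broadcast; return (new_sigma, theta) or None if blocked.

    sender and each receiver are (thread, required, offered) triples whose
    last two components are sets of capabilities (hypothetical encoding).
    """
    def ready(p):
        thread, required, _ = p
        return required <= sigma.get((thread, session), frozenset())

    if not ready(sender):
        return None
    J = [p for p in receivers if ready(p)]           # available receivers
    if not q({t for t, _, _ in J}, {t for t, _, _ in receivers}):
        return None                                   # q(J) is false: blocked
    new_sigma = dict(sigma)
    for thread, required, offered in [sender] + J:    # (1) update the state
        old = new_sigma.get((thread, session), frozenset())
        new_sigma[(thread, session)] = (old - required) | offered
    # (2) partial substitution: only available receivers obtain some(value)
    theta = {t: ("some", value) for t, _, _ in J}
    return new_sigma, theta

at_least_two = lambda available, expected: len(available) >= 2
sigma = {("tm", "k"): frozenset({"X"}), ("t1", "k"): frozenset({"X"}),
         ("t2", "k"): frozenset({"X"}), ("t3", "k"): frozenset()}
sender = ("tm", {"X"}, {"Y"})
receivers = [(t, {"X"}, {"Y"}) for t in ("t1", "t2", "t3")]
result = broadcast_step(sigma, "k", sender, 42, receivers, at_least_two)
```

In this run `t3` lacks the required capability, so only `t1` and `t2` receive the value and move to capability `Y`; requiring all receivers instead would leave the broadcast blocked.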

Remark 1

(Broadcast vs. Selection). The inclusion of separate language constructs for communication and selection originates in early work on structured communications [22]. Analogous to method invocation in object-oriented programming, selections play an important role in making choreographies projectable to distributed implementations. We illustrate their role with an example. Assume a session key k shared among threads p, r, s, and an evaluation of e@p of boolean type. The choreography $p.e\, \hbox {-}{>} r:x: k \varvec{;}\ \textsf {\small if}\, (x@r) \,\textsf {\small then}\, \left( r.d\, \hbox {-}{>} s:y: k \right) \,\textsf {\small else}\, \left( s.f\, \hbox {-}{>} r:z: k \right) $ branches into two different communication flows: one from r to s if the evaluation of x@r is true, and one from s to r otherwise. Although the evaluation of the guard in the $\textsf {\small if}$ refers only to r, the projection of such a choreography to a distributed system requires s to behave differently based on the decisions made by r. The use of a selection operator permits s to be notified by r about which behavior to implement.

Remark 2

(Broadcast vs. Reduce). We opted in favor of an application-based QoS instead of a classical node-based QoS, as described in Sect. 2. This consideration motivates the asymmetry of broadcast and reduce commands: both operations are blocked unless enough receivers are available; however, we give precedence to senders over receivers. In a broadcast, only one sender needs to be available, and provided availability constraints for receivers are satisfied, its evolution will be immediate. In a reduce, we will allow a delay of the transition, capturing in this way the fact that senders can become active at different instants.

The reader familiar with the Global Calculus may have noticed the absence of a general asynchronous behaviour in our setting. In particular, rule:

$$\begin{aligned} {\displaystyle \frac{ \begin{array}{c} (\mathbf \nu \widetilde{m}) \left\langle \sigma , C \right\rangle \xrightarrow {\lambda } (\mathbf \nu \widetilde{n})\left\langle \sigma ', C' \right\rangle \quad \mathcal \eta \ne \mathbf{{start}}\quad snd(\mathcal \eta ) \subseteq \textsf {fn}(\lambda ) \\ rcv(\mathcal \eta ) ~\#~ \textsf {fn}(\lambda ) \quad \widetilde{n} = \widetilde{m}, \widetilde{r} \quad \forall _{r \in \widetilde{r}}\ ( r \in \textsf {bn}(\lambda ) \quad r \notin \textsf {fn}(\mathcal \eta ) ) \end{array} }{ (\mathbf \nu \widetilde{m})\left\langle \sigma ,\mathcal \eta \varvec{;}\ C \right\rangle \xrightarrow {\lambda } (\mathbf \nu \widetilde{n})\left\langle \sigma ',\mathcal \eta \varvec{;}\ C' \right\rangle }} {\scriptstyle \lfloor \mathsf {Asynch}\rceil } \end{aligned}$$

corresponding to the extension of rule $\mathrm { \lfloor ^C|_{ASYNCH}\rceil }$ in [12] with collective communications, is absent in our semantics. The reason behind it lies in the energy considerations of our application: consecutive communications may have different energetic costs, affecting the availability of sender nodes. Consider for example a configuration in which thread $\mathfrak {t}_A$ performs two consecutive broadcasts over session k, the first to receivers $\widetilde{\mathfrak {t}_r}$ and the second to receivers $\widetilde{\mathfrak {t}_s}$, with $\widetilde{\mathfrak {t}_r} \# \widetilde{\mathfrak {t}_s}$ and $X \subseteq \sigma (\mathfrak {t}_A,k)$. If the order of the broadcasts is shuffled, the second broadcast may consume all energy resources for $\mathfrak {t}_A$, making it unavailable later. Formally, the execution of a broadcast updates the capabilities offered in $\sigma $ for $\mathfrak {t}_A, k$ to Y, preventing two communication actions with the same capabilities from being reordered. We therefore refrain from using Rule $\lfloor $ Asynch $\rceil $ in our semantics.

Definition 2

(Progress). $ C$ progresses if there exists $ C', \sigma ', \widetilde{n}, \lambda $ such that $(\mathbf \nu \widetilde{m})\left\langle \sigma , C \right\rangle \xrightarrow {\lambda } (\mathbf \nu \widetilde{n})\left\langle \sigma ', C' \right\rangle $, for all $\sigma , \widetilde{m}$.

4 Type-Checking Progress

One of the challenges regarding the use of partial collective operations concerns the possibility of getting into runs with locking states. Consider a variant of Example 1 in which the quality predicate of Line 2 is satisfied by a strict subset of the receivers, while the quality predicate of Line 3 requires all senders. This choice leads to a blocked configuration. The system blocks since the collective selection in Line (2) continues after a subset of the receivers in $\mathfrak {t}_1,\mathfrak {t}_2,\mathfrak {t}_3$ have executed the command. Line (3) requires all senders to be ready, which will not be the case in general. The system will additionally block if participant dependencies among communications are not preserved. The choreography in Fig. 8 illustrates this. It blocks when the selection operator in Line 2 proceeds by updating only the capability associated with $\mathfrak {t}_2$, leaving the capabilities for $\mathfrak {t}_1,\mathfrak {t}_3$ unchanged. With such a state, Line 3 cannot proceed.
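The blocking scenario can be explored by enumeration. The Python sketch below assumes a concrete pair of predicates that the text does not fix: Line 2 is taken to accept any two of the three sensors, and Line 3 to require all of them. The helpers `admissible_subsets` and `blocked_runs` are hypothetical, and the encoding (only the sensors that took part in Line 2 hold the capability Line 3 needs) is a simplification.

```python
from itertools import combinations

sensors = ("t1", "t2", "t3")

def admissible_subsets(q):
    """Subsets J of the sensors that a quality predicate q accepts."""
    return [set(c) for r in range(len(sensors) + 1)
            for c in combinations(sensors, r) if q(set(c))]

def blocked_runs(q_line2, q_line3):
    """Evolutions of Line 2 after which Line 3 cannot fire.

    Only the sensors in J took part in Line 2, so only they hold the
    capability that Line 3 requires (assumed encoding of the example).
    """
    return [J for J in admissible_subsets(q_line2) if not q_line3(J)]

at_least_two = lambda J: len(J) >= 2
forall = lambda J: J == set(sensors)

print(len(admissible_subsets(at_least_two)))     # 4
print(len(blocked_runs(at_least_two, forall)))   # 3
print(blocked_runs(forall, forall))              # []
```

Under these assumptions, four evolutions of Line 2 are possible and three of them leave Line 3 stuck, whereas requiring all sensors in both lines rules out every blocked run.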

We introduce a type system to ensure progress under variable availability conditions. A judgment is written as ${\varPsi }\vdash { C} $, where $\varPsi $ is a list of formulae in Intuitionistic Linear Logic (ILL) [17]. Intuitively, ${\varPsi }\vdash { C} $ is read as: the formulae in $\varPsi $ describe the program point immediately before $ C$. Formulae $\psi \in \varPsi $ take the form of the constant $\mathtt {tt}$, ownership types, and the linear logic versions of conjunction, disjunction and implication $(\otimes , \oplus , \multimap )$. An ownership type asserts that p behaves as the role A in session k with atomic formula X. Moreover, we require $\varPsi $ to contain formulae free of linear implications in ${\varPsi }\vdash { C} $.

Figure 9 presents selected rules for the type system for $GC_q$. The full definition is included in Appendix A.1. Since the rules for inaction, conditionals and non-determinism are standard, we focus our explanation on the typing rules for communications. Rule $\lfloor $ TInit $\rceil $ types new sessions: $\varPsi $ is extended with function $ \mathbf {init}( \widetilde{\mathfrak {t}_p[A]\{X\} }, k)$, which returns the corresponding list of ownership types. The condition $\{\widetilde{\mathfrak {t}_s},k\} ~\#~ (\mathbf {T}(\varPsi ) \cup \mathbf {K}(\varPsi ) )$ ensures that new names occur neither among the threads nor among the used keys in $\varPsi $.

The typing rules for broadcast, reduce and selection are analogous, so we focus our explanation on $\lfloor $ TBcast $\rceil $. Here we abuse notation, writing ${\varPsi }\vdash { C} $ to denote type checking, and $\varPsi \vdash \psi $ to denote formula entailment. The semantics of $\forall ^{\ge 1}J$ s.t. $\mathbf {C}:D$ is given by $\forall J$ s.t. $\mathbf {C}:D \wedge \exists J$ s.t. $\mathbf {C}$. The judgment for a broadcast succeeds if environment $\varPsi $ can provide capabilities for sender $\mathfrak {t}_A [A]$ and for a valid subset J of the receivers in $\widetilde{\mathfrak {t}_r[B_r]}$. J is a valid subset if it contains enough threads to render the quality predicate true (q(J)), and the corresponding entailment is provable. This proof succeeds if $ \psi _A $ and $ (\psi _j ) _{j \in J}$ contain ownership types for the sender and available receivers with corresponding capabilities. Finally, the type of the continuation $ C$ will consume the resources used in the sender and all involved receivers, updating them with new capabilities for the threads engaged.
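The quantifier $\forall ^{\ge 1}$ can be read operationally: at least one candidate must satisfy the side condition, and every candidate that does must also satisfy the property. The Python sketch below illustrates this reading on a toy instance. The function `forall_at_least_one`, the dictionary `env` and the stand-in `continuation_ok` are hypothetical, and the real rule checks ILL entailments rather than set membership.

```python
from itertools import combinations

def forall_at_least_one(candidates, condition, prop):
    """(for all J s.t. condition(J): prop(J)) and (exists J s.t. condition(J))."""
    valid = [J for J in candidates if condition(J)]
    return bool(valid) and all(prop(J) for J in valid)

receivers = ("t1", "t2", "t3")
candidates = [frozenset(c) for r in range(len(receivers) + 1)
              for c in combinations(receivers, r)]
# Toy environment: capabilities currently provided for each receiver.
env = {"t1": {"X"}, "t2": {"X"}, "t3": {"X"}}

def valid_subset(q):
    # J is valid if q(J) holds and the environment provides X for each j in J.
    return lambda J: q(J) and all("X" in env[j] for j in J)

def continuation_ok(required):
    # Toy stand-in for typing the continuation: it needs every thread in
    # `required` to have taken part in this broadcast.
    return lambda J: required <= J

forall = lambda J: J == frozenset(receivers)
at_least_two = lambda J: len(J) >= 2
everyone = frozenset(receivers)

print(forall_at_least_one(candidates, valid_subset(forall), continuation_ok(everyone)))        # True
print(forall_at_least_one(candidates, valid_subset(at_least_two), continuation_ok(everyone)))  # False
```

The second check fails because some valid two-element subsets leave the continuation without the participants it needs, which is exactly the kind of run the type system is meant to exclude.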

Example 2

In Example 1, ${\mathtt {tt}}\vdash { C} $ if . In the case , the same typing fails. Similarly, ${\mathtt {tt}}\not \vdash { C} $ if , for the variant of Example 1 in Fig. 8.

A type preservation theorem must consider the interplay between the state and formulae in $\varPsi $. We write ${\sigma }\models _{ }{\varPsi } $ to say that the tuples in $\sigma $ entail the formulae in $\varPsi $. For instance, $\sigma $ entails an ownership type for thread $\mathfrak {t}$ in session k with atomic formula X iff $(\mathfrak {t}, k, X) \in \sigma $. Its formal definition is included in Appendix A.1.

Theorem 1

(Type Preservation). If $(\mathbf \nu \widetilde{m})\left\langle \sigma , C \right\rangle \xrightarrow {\lambda } (\mathbf \nu \widetilde{n}) \left\langle \sigma ', C' \right\rangle $, ${\sigma }\models _{ }{\varPsi } $, and ${\varPsi }\vdash { C} $, then $\exists \varPsi '.~ {\varPsi '}\vdash { C'} $ and ${\sigma '}\models _{ }{\varPsi '} $.

Theorem 2

(Progress). If ${\varPsi }\vdash { C} $, ${\sigma }\models _{ }{\varPsi } $ and $ C\not \equiv \varvec{0}$, then $ C$ progresses.

The decidability of type checking depends on the provability of formulae in our ILL fragment. Notice that the formulae used in type checking correspond to the Multiplicative-Additive fragment of ILL, whose provability is decidable [26]. For typing collective operations, the number of checks grows with the number of participants involved. Decidability exploits the fact that for each interaction the number of participants is bounded.
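As a rough illustration of this growth, assume a naive checker that examines every subset $J$ of the $n$ receivers of a single collective interaction (the checker is an assumption made for this illustration, not part of the formal development). The number of candidate subsets is then

$$\sum _{k=0}^{n} \binom{n}{k} = 2^{n},$$

so $n = 10$ receivers already yield $2^{10} = 1024$ candidates. The count is finite for every fixed $n$, which is all that decidability needs, but it is exponential in the number of participants.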

Theorem 3

(Decidability of Typing). ${\varPsi }\vdash { C} $ is decidable.

5 Related Work

Availability considerations in distributed systems have recently spawned novel research strands in regular languages [1], continuous systems [2], and endpoint languages [33]. To the best of our knowledge, this is the first work considering availability from a choreographic perspective.

A closely related work is the Design-By-Contract approach for multiparty interactions [4]. In both works, communication actions are enriched with pre- and postconditions, similar to works in sequential programming [21]. The work in [4] enriches global types with assertions, which are then projected to a session $\pi $-calculus. Assertions may yield ill-formed specifications, so a consistency check is necessary. Our capability-based type system guarantees temporal-satisfiability as in [4], but does not require history-sensitivity because the preconditions used in our framework are simple. The most obvious difference with [4] is the underlying semantics used for communication, which allows progress even when some participants are unavailable.

Other works have explored the behavior of communicating systems with collective/broadcast primitives. In [23], the expressivity of a calculus with bounded broadcast and collection is studied. In [27], the authors present a type theory to check whether models for multicore programming behave according to a protocol and do not deadlock. Our work differs from these approaches in that our model explicitly accounts for the availability of the systems under consideration. Also for multicore programming, the work in [14] presents a calculus with fork/join communication primitives, with a flexible phaser mechanism that allows some threads to advance prior to synchronization. The type system provides a node-centric progress guarantee, ideal for multicore computing, but inadequate for CPS. Finally, the work in [25] presents endpoint (session) types for the verification of communications using broadcast in the $\varPsi $-calculus. We do not observe similar considerations regarding availability of components in that work.

The work in [13] presents multiparty global types with join and fork operators, capturing in this way some notions of broadcast and reduce communications, in a manner similar to our capability type system. The difference with our approach is described in Sect. 3. In the same vein, [16] introduces multiparty global types with recursion, fork, join and merge operations. That work does not provide a natural way of encoding broadcast communication, but one could expect to be able to encode it by composing fork and merge primitives.

6 Conclusions and Future Work

We have presented a process calculus aimed at studying protocols with variable availability conditions, as well as a type system to ensure their progress. It constitutes the first step towards a methodology for the safe development of communication protocols in CPS. The analysis presented is orthogonal to existing type systems for choreographies (cf. session types [12]). Our next efforts include the modification of the type theory to cater for recursive behavior, the generation of distributed implementations (e.g. EndPoint Projection [9]), and considerations of compensating [7, 8, 28] and timed behavior [5, 6]. Type checking is computationally expensive, because for each collective interaction one must perform the analysis on each subset of participants involved. The situation will be critical once recursion is considered. We believe that the efficiency of type checking can be improved by modifying the theory so that it generates one formula for all subsets.

Traditional design mechanisms (including UML sequence charts and choreographies) usually focus on the desired behavior of systems. In order to deal with the security and safety challenges in CPS, it becomes paramount to cater for failures and how to recover from them. This was the motivation behind the development of the Quality Calculus, which not only extended a $\pi $-calculus with quality predicates and optional data types, but also added mechanisms for programming the continuation such that both desired and undesired behavior is adequately handled. In this work we have incorporated the quality predicates into choreographies, and thereby facilitated dealing with systems in a failure-aware fashion. However, it remains a challenge to incorporate the consideration of both desired and undesired behavior in a way that is less programming oriented (or EndPoint Projection oriented) than the solution presented by the Quality Calculus. This may require further extensions of the calculus with fault-tolerance considerations.

References

1. Abdulla, P.A., Atig, M.F., Meyer, R., Salehi, M.S.: What’s decidable about availability languages? In: Harsha, P., Ramalingam, G. (eds.) FSTTCS. LIPIcs, vol. 45, pp. 192–205. Schloss Dagstuhl - Leibniz-Zentrum fuer Informatik (2015)
2. Alur, R.: Principles of Cyber-Physical Systems. MIT Press, Cambridge (2015)
3. Bocchi, L., Chen, T.-C., Demangeon, R., Honda, K., Yoshida, N.: Monitoring networks through multiparty session types. In: Beyer, D., Boreale, M. (eds.) FORTE 2013 and FMOODS 2013. LNCS, vol. 7892, pp. 50–65. Springer, Heidelberg (2013)
4. Bocchi, L., Honda, K., Tuosto, E., Yoshida, N.: A theory of design-by-contract for distributed multiparty interactions. In: Gastin, P., Laroussinie, F. (eds.) CONCUR 2010. LNCS, vol. 6269, pp. 162–176. Springer, Heidelberg (2010)
5. Bocchi, L., Lange, J., Yoshida, N.: Meeting deadlines together. In: Aceto, L., de Frutos-Escrig, D. (eds.) CONCUR, LIPIcs, vol. 42, pp. 283–296. Schloss Dagstuhl - Leibniz-Zentrum fuer Informatik (2015)
6. Bocchi, L., Yang, W., Yoshida, N.: Timed multiparty session types. In: Baldan, P., Gorla, D. (eds.) CONCUR 2014. LNCS, vol. 8704, pp. 419–434. Springer, Heidelberg (2014)
7. Carbone, M.: Session-based choreography with exceptions. Electron. Notes Theor. Comput. Sci. 241, 35–55 (2009)
8. Carbone, M., Honda, K., Yoshida, N.: Structured interactional exceptions in session types. In: van Breugel, F., Chechik, M. (eds.) CONCUR 2008. LNCS, vol. 5201, pp. 402–417. Springer, Heidelberg (2008)
9. Carbone, M., Honda, K., Yoshida, N.: Structured communication-centered programming for web services. ACM Trans. Program. Lang. Syst. 34(2), 8 (2012)
10. Carbone, M., Honda, K., Yoshida, N., Milner, R., Brown, G., Ross-Talbot, S.: A theoretical basis of communication-centred concurrent programming. In: Web Services Choreography Working Group mailing list, WS-CDL working report (2006, to appear)
11. Carbone, M., Montesi, F.: Chor: a choreography programming language for concurrent systems. http://sourceforge.net/projects/chor/
12. Carbone, M., Montesi, F.: Deadlock-freedom-by-design: multiparty asynchronous global programming. In: Giacobazzi, R., Cousot, R. (eds.) POPL, pp. 263–274. ACM (2013)
13. Castagna, G., Dezani-Ciancaglini, M., Padovani, L.: On global types and multi-party session. Logical Methods Comput. Sci. 8(1), 1–45 (2012)
14. Cogumbreiro, T., Martins, F., Thudichum Vasconcelos, V.: Coordinating phased activities while maintaining progress. In: De Nicola, R., Julien, C. (eds.) COORDINATION 2013. LNCS, vol. 7890, pp. 31–44. Springer, Heidelberg (2013)
15. Deng, J., Han, Y.S., Heinzelman, W.B., Varshney, P.K.: Balanced-energy sleep scheduling scheme for high-density cluster-based sensor networks. Comput. Commun. 28(14), 1631–1642 (2005)
16. Deniélou, P.-M., Yoshida, N.: Multiparty session types meet communicating automata. In: Seidl, H. (ed.) ESOP 2012. LNCS, vol. 7211, pp. 194–213. Springer, Heidelberg (2012)
17. Girard, J.-Y.: Linear logic. Theor. Comput. Sci. 50, 1–102 (1987)
18. Harper, R.: Programming in Standard ML. Working Draft (2013)
19. Heinzelman, W.B., Chandrakasan, A.P., Balakrishnan, H.: An application-specific protocol architecture for wireless microsensor networks. IEEE Trans. Wireless Commun. 1(4), 660–670 (2002)
20. Heinzelman, W.R., Kulik, J., Balakrishnan, H.: Adaptive protocols for information dissemination in wireless sensor networks. In: MOBICOM, pp. 174–185. ACM (1999)
21. Hoare, C.A.R.: An axiomatic basis for computer programming (reprint). Commun. ACM 26(1), 53–56 (1983)
22. Honda, K., Vasconcelos, V.T., Kubo, M.: Language primitives and type discipline for structured communication-based programming. In: Hankin, C. (ed.) ESOP 1998. LNCS, vol. 1381, pp. 122–138. Springer, Heidelberg (1998)
23. Hüttel, H., Pratas, N.: Broadcast and aggregation in BBC. In: Gay, S., Alglave, J. (eds.) PLACES, EPTCS, pp. 51–62 (2015)
24. Intanagonwiwat, C., Govindan, R., Estrin, D.: Directed diffusion: a scalable and robust communication paradigm for sensor networks. In: Pickholtz, R.L., Das, S.K., Cáceres, R., Garcia-Luna-Aceves, J.J. (eds.) MOBICOM, pp. 56–67. ACM (2000)
25. Kouzapas, D., Gutkovas, R., Gay, S.J.: Session types for broadcasting. In: Donaldson, A.F., Vasconcelos, V.T. (eds.) PLACES, EPTCS, vol. 155, pp. 25–31 (2014)
26. Lincoln, P.: Deciding provability of linear logic formulas. In: Advances in Linear Logic, pp. 109–122. Cambridge University Press (1994)
27. López, H.A., Marques, E.R.B., Martins, F., Ng, N., Santos, C., Vasconcelos, V.T., Yoshida, N.: Protocol-based verification of message-passing parallel programs. In: Aldrich, J., Eugster, P. (eds.) OOPSLA, pp. 280–298. ACM (2015)
28. López, H.A., Pérez, J.A.: Time and exceptional behavior in multiparty structured interactions. In: Carbone, M., Petit, J.-M. (eds.) WS-FM 2011. LNCS, vol. 7176, pp. 48–63. Springer, Heidelberg (2012)
29. Madden, S., Franklin, M.J., Hellerstein, J.M., Hong, W.: TAG: A tiny aggregation service for ad-hoc sensor networks. In: Culler, D.E., Druschel, P. (eds.) OSDI. USENIX Association (2002)
30. Madden, S., Franklin, M.J., Hellerstein, J.M., Hong, W.: The design of an acquisitional query processor for sensor networks. In: Halevy, A.Y., Ives, Z.G., Doan, A. (eds.) SIGMOD Conference, pp. 491–502. ACM (2003)
31. Montesi, F., Guidi, C., Zavattaro, G.: Service-oriented programming with Jolie. In: Bouguettaya, A., Sheng, Q.Z., Daniel, F. (eds.) Web Services Foundations, pp. 81–107. Springer, New York (2014)
32. Neykova, R., Bocchi, L., Yoshida, N.: Timed runtime monitoring for multiparty conversations. In: Carbone, M. (ed.) BEAT, EPTCS, vol. 162, pp. 19–26 (2014)
33. Nielson, H.R., Nielson, F., Vigo, R.: A calculus for quality. In: Păsăreanu, C.S., Salaün, G. (eds.) FACS 2012. LNCS, vol. 7684, pp. 188–204. Springer, Heidelberg (2013)
34. Pattem, S., Krishnamachari, B., Govindan, R.: The impact of spatial correlation on routing with compression in wireless sensor networks. TOSN 4(4), 1–23 (2008)
35. Perillo, M.A., Heinzelman, W.B.: Wireless sensor network protocols. In: Boukerche, A. (ed.) Handbook of Algorithms for Wireless Networking and Mobile Computing, pp. 1–35. Chapman and Hall/CRC, London (2005)
36. Yoshida, N., Hu, R., Neykova, R., Ng, N.: The Scribble Protocol Language. In: Abadi, M., Lluch Lafuente, A. (eds.) TGC 2013. LNCS, vol. 8358, pp. 22–41. Springer, Heidelberg (2014)

Acknowledgments

We would like to thank Marco Carbone and Jorge A. Pérez for their insightful discussions, and all anonymous reviewers for their helpful comments, which improved the paper. This research was funded by the Danish Foundation for Basic Research, project IDEA4CPS (DNRF86-10). López has benefitted from travel support by the EU COST Action IC1201: Behavioural Types for Reliable Large-Scale Software Systems (BETTY).

Author information

Authors and Affiliations

Technical University of Denmark, Kongens Lyngby, Denmark
Hugo A. López, Flemming Nielson & Hanne Riis Nielson

Corresponding authors

Correspondence to Hugo A. López, Flemming Nielson or Hanne Riis Nielson.

Editor information

Editors and Affiliations

Complutense University of Madrid, Madrid, Spain
Elvira Albert
University of Bologna/Inria, Bologna, Italy
Ivan Lanese

A Additional Definitions

A.1 Type System

Figure 10 presents the complete type system for $GC_q$.

Definition 3

(State Satisfaction). The entailment relation between a state $\sigma $ and a formula $\varPsi $, and between $\sigma $ and a formula $\psi $ are written ${\sigma }\models _{ }{\varPsi } $ and ${\sigma }\models _{ }{\psi } $, respectively. They are defined as follows:

Cite this paper

López, H.A., Nielson, F., Nielson, H.R. (2016). Enforcing Availability in Failure-Aware Communicating Systems. In: Albert, E., Lanese, I. (eds) Formal Techniques for Distributed Objects, Components, and Systems. FORTE 2016. Lecture Notes in Computer Science(), vol 9688. Springer, Cham. https://doi.org/10.1007/978-3-319-39570-8_13

DOI: https://doi.org/10.1007/978-3-319-39570-8_13
Published: 24 May 2016
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-39569-2
Online ISBN: 978-3-319-39570-8
eBook Packages: Computer Science, Computer Science (R0)
