Location via proxy:   [ UP ]  
[Report a bug]   [Manage cookies]                

Eptcs

2010

The induction of a signaling pathway is characterized by transient complex formation and mutual posttranslational modification of proteins. To faithfully capture this combinatorial process in a math- ematical model is an important challenge in systems biology. Exploiting the limited context on which most binding and modification events are conditioned, attempts have been made to reduce the com- binatorial complexity by quotienting the reachable set of molecular species, into species aggregates while preserving the deterministic semantics of the thermodynamic limit. Recently we proposed a quotienting that also preserves the stochastic semantics and that is complete in the sense that the semantics of individual species can be recovered from the aggregate semantics. In this paper we prove that this quotienting yields a sufficient condition for weak lumpability and that it gives rise to a backward Markov bisimulation between the original and aggregated transition system. We illustrate t...

View metadata, citation and similar papers at core.ac.uk brought to you by CORE provided by IST Austria: PubRep (Institute of Science and Technology) Lumpability Abstractions of Rule-based Systems∗ Jerome Feret Thomas Henzinger LIENS (INRIA/ÉNS/CNRS) Paris, France feret@ens.fr Institute and Science of Technology Vienna, Austria thenzinger@ist.ac.at Heinz Koeppl Tatjana Petrov School of Computer and Communication Sciences EPFL Lausanne, Switzerland heinz.koeppl@epfl.ch School of Computer and Communication Sciences EPFL Lausanne, Switzerland tatjana.petrov@epfl.ch The induction of a signaling pathway is characterized by transient complex formation and mutual posttranslational modification of proteins. To faithfully capture this combinatorial process in a mathematical model is an important challenge in systems biology. Exploiting the limited context on which most binding and modification events are conditioned, attempts have been made to reduce the combinatorial complexity by quotienting the reachable set of molecular species, into species aggregates while preserving the deterministic semantics of the thermodynamic limit. Recently we proposed a quotienting that also preserves the stochastic semantics and that is complete in the sense that the semantics of individual species can be recovered from the aggregate semantics. In this paper we prove that this quotienting yields a sufficient condition for weak lumpability and that it gives rise to a backward Markov bisimulation between the original and aggregated transition system. We illustrate the framework on a case study of the EGF/insulin receptor crosstalk. 1 Introduction Often a few elementary events of binding and covalent modification [31] in a biomolecular reaction system give rise to a combinatorial number of non-isomorphic reachable species or complexes [17, 18]. Instances of such systems are signaling pathway, polymerizations involved in cytoskeleton maintainance, the formation of transcription factor complexes in gene-regulation. For such biomolecular systems, traditional chemical kinetics face fundamental limitations, that are related to the question of how biomolecular events are represented and translated into a mathematical model [24]. More specifically, chemical reactions can only operate on a collection of fully specified molecular species and each such species gives rise to one differential equation, describing the rate of change of that species’ concentration. Many combinatorial systems do not permit the enumeration of all molecular species and thus render their traditional differential description prohibitive. However, even if one could enumerate them, it remains questionable whether chemical reactions are the appropriate way to represent and to reason about such systems. As the dynamics of a biomolecular reaction mixture comes about through the repeated execution of a few elementary events one may wonder about the effective degrees of freedom of the reaction mixture’s dynamics. If the velocity of all events – or their probabilities to occur per time-unit per instance – ∗ Jérôme Feret’s contribution was partially supported by the A BSTRACT C ELL ANR-Chair of Excellence. Heinz Koeppl acknowledges the support from the Swiss National Science Foundation, grant no. 200020-117975/1. Tatjana Petrov acknowledges the support from SystemsX.ch, the Swiss Initiative in Systems Biology. G.Ciobanu, M.Koutny (Eds.): Membrane Computing and Biologically Inspired Process Calculi 2010 (MeCBIC 2010) EPTCS 40, 2010, pp. 142–161, doi:10.4204/EPTCS.40.10 J. Feret, T. Henzinger, H. Koeppl, T. Petrov 143 are different for all complexes (w.r.t. modification) and pairs of complexes (w.r.t. binding) to which the events can apply to, then the degrees of freedom would equal to the number of molecular species. However, due to the local nature of physical forces underlying molecular dynamics, the kinetics of most events appear to be ignorant with respect to the global configuration of the complexes they are operating on. More provocatively, one may say that even if there would be variations of kinetics of an event from one context to another, experimental biology does not – and most likely never will – have the means to discern between all different contexts. For instance, fluorescence resonance energy transfer (FRET), may report on a specific protein-binding event and even its velocity, however we have no means to determine whether the binding partners are already part of a protein complex – not to speak of the composition and modification state of these complexes. To this end, molecular species remain elusive and appear to be inappropriate entities of descriptions. To align with the mentioned experimental insufficiencies and with the underlying biophysical locality, rule-based or agent-based descriptions were introduced as a framework to encode such reaction mixtures succinctly and to enable their mathematical analysis [8, 3]. Rules exploit the limited context on which most elementary events are conditioned. They just enumerate that part of a molecular species that is relevant for a rule to be applicable. Thus, in contrast to chemical reactions, rules can operate on a collection of partially specified molecular species. Recently, attempts have been made to identify the set of those partially specified species that allow for a self-consistent description of the rule-set’s dynamics [11, 6]. Naturally, as partially specified species – or fragments – in general encompass many fully specified species, the cardinality of that set is less than of the set of molecular species. These approaches aim to obtain a self-consistent fragment dynamics based on ordinary differential equations. It represents the dynamics in the thermodynamic limit of stochastic kinetics when scaling species multiplicities to infinity while maintaining a constant concentration (multiplicity per unit volume) [22]. In many applications in cellular biology this limiting dynamics is an inappropriate model due to the low multiplicities of some molecular species – think of transcription factor - DNA binding events. We recently showed that the obtained differential fragments cannot be used to describe the finite volume case of stochastic kinetics [12]. Exploiting statistical independence of events we instead derived stochastic fragments that represent the effective degrees of freedom in the stochastic case. Conceptually, the procedure replaces the Cartesian product by a Cartesian sum for statistically independent states. In contrast to the differential case, stochastic fragments have the important property that the sample paths of molecular species can be recovered from that of partially specified species. We believe that interdisciplinary fields, such as systems biology, can move forward quickly enough only if the well-established knowledge in each of the disciplines involved is exploited to its maximum, i.e. if the standardized, well-established theories are recognized and re-used. For that reason, in this paper we translate our abstraction method ([12]) into the language of well-established contexts of abstraction for probabilistic systems – lumpability and bisimulation. Lumpability is mostly considered from a theoretical point of view in the theory of stochastic processes [19, 13, 28, 26, 27, 4]. A Markov chain is lumpable with respect to a given aggregation (quotienting) of its states, if the lumped chain preserves the Markov property [21]. A sound aggregation for any initial probability distribution is referred to as strong lumpability, while otherwise it is termed weak lumpability [4, 29]. Approximate aggregation techniques for Markov chains of biochemical networks are discussed in [14]. Probabilistic bisimulation was introduced as an extension to classic bisimulation in [23]. It is extended to continuous-state and continuous-time in [9] and, for the discrete-state case, to weak bisimulation [2]. For instance, in [9] the authors use bisimulation of labelled Markov processes, the state space of which is not necessarily discrete, and they provide a logical characterization of probabilistic bisimulation. Another notion of weak bisimulation was recently introduced in [10]. Therein two labeled Markov chains are defined to 144 Lumpability Abstractions of Rule-based Systems be equivalent if every finite sequence of observations has the same probability of occurring in the two chains. Herein we recognize the sound aggregations of [12] as a form of backward Markov bisimulations on weighted labeled transition systems (WLTS), and we show it to be equivalent to the notion of weak lumpability on Markov chains. The remaining part of the paper is organized as follows. In the Section 2, we introduce weighted labeled transition systems (WLTS) and we assign it the trace density semantics of a continuous-time Markov chain (CTMC). Moreover, we define the Kappa language, and we assign a WLTS to a Kappa specification. Based on the notion of the annotated contact map we briefly summarize in Section 3 the general procedure to compute stochastic fragments, as it is offered in [12]. In Section 4, we introduce the characterizations of sound and complete abstractions on WLTS as a backward Markov bisimulation. Moreover, we define it being equivalent to the weak lumpability on Markov chains. Finally, we provide in Section 5 results for the achieved dimensionality reduction for a rule-based model of the crosstalk between the EGF/insulin signaling pathway [6]. This mechanistic model comprises 76 rules giving rise to 42956 reactions and 2768 molecular species. 2 Preliminaries The stochastic semantics of a biochemical network is modelled by a continuous-time Markov chain (CTMC). The main object that we will use in the analysis is the weighted labelled transition system (WLTS) on a countable state space. We will assign a WLTS to a given Kappa specification, and we manipulate that object when reasoning about abstractions. 2.1 CTMC We will observe the CTMC that is generated by the weighted labelled transition system (WLTS) on a countable state space. We define the CTMC of a WLTS, by defining the Borel σ -algebra containing all cylinder sets of traces [20] that can occur in the system, and the corresponding probability distribution among them. We also introduce the standard notation of a rate matrix, which we will use when analysing the lumpability and bisimulation properties in Sec. 4. Definition 1. (WLTS) A weighted-labelled transition system W is a tuple (X , L , w, π0 ), where • X is a countable state space; • L is a set of labels; • w : X × L × X → R+ 0 is the weighting function that maps two states and a label to a real value; • π0 : X → [0, 1] is an initial probability distribution. We assume that the label fully identifies the transition, i.e. for any x ∈ X and l ∈ L , there is at most one x′ ∈ X , such that w(x, l, x′ ) > 0. Moreover, we assume that the system is finitely branching, in the sense that (i) the set {x ∈ X | π0 (x) > 0} is finite, and (ii) for arbitrary x̂ ∈ X , the set {(l, x′ ) ∈ L × X | w(x̂, l, x′ ) > 0} is finite. The activity of the state xi , denoted a : X → R+ 0 is the sum of all weights originating at xi , i.e. a(xi ) := ∑{w(xi , l, x j ) | x j ∈ X , l ∈ L }. The definition of a WLTS implicitely defines a transition relation →⊆ X ×X , such that (xi , x j ) ∈→, if and only if there exists a non-zero transition from state xi to state x j , i.e. the total weight over all labels is strictly bigger then zero, written ∑{w(xi , l, x j ) | l ∈ L } > 0. Moreover, we can differentiate the initial set of states I ⊆ X , such that their initial probabilities are positive, i.e. I = {x ∈ X | π0 (x) > 0}. 145 J. Feret, T. Henzinger, H. Koeppl, T. Petrov Definition 2. (Rate matrix of a WLTS) Given a WLTS W = (X , L , w, π0 ), we assign it the CTMC rate matrix R : X × X → R, given by R(xi , x j ) = ∑{w(xi , l, x j ) | l ∈ L }. The consequence is that we do not enforce R(xi , xi ) = − ∑{R(xi , x j ) | i 6= j}, as it is usual for the generator matrix of CTMCs. This however does not affect the transient, not the steady-state behavior of the CTMC [1]. We do so for the following reason. When abstracting the WLTS by partitioning the state space, we get another WLTS. If the two states x and x′ which have a transition between each other were aggregated in the same partition class x̃, it will result as a prolongation of the residence time in the abstract state x̃, i.e. we will have a self-loop in the abstract WLTS. If we refer to the generated stochastic Markov process, written as a continuous-time random variable {Xt }t∈R+ , over the countable state space X . We write Pr(Xt = xi ), the probability that the process takes 0 the value xi at time point t. It thus holds that Pr(X0 = xi ) = π0 (xi ), and Pr(Xt+dt = x j | Xt = xi ) = R(xi , x j )dt when i 6= j, whereas Pr(Xt+dt = xi | Xt = xi ) = R(xi , xi )dt + (1 − ∑{R(xi , x j′ )dt | x j′ ∈ X }), which gives after simplification: Pr(Xt+dt = xi | Xt = xi ) = 1 − ∑{R(xi , x j′ )dt | j′ 6= i}. We define the cylinder sets of traces that can be observed in the system W . By observing the trace at a certain time point, we mean observing the sequence of visited states, labels that were assigned to the executed transitions, and time points of when the transition happened. Definition 3. (A trace of a WLTS) Let us observe the WLTS W = (X , L , w, π0 ) and its CTMC. Given a k number k in N, we define a trace of length k as τ ∈ (X × L × R+ 0 ) × X , written l1 ,t1 τ = x0 → x1 . . . xk−1 lk ,t1 +...+tk → xk . If the trace τ is such that (i) π0 (x0 ) > 0, and (ii) for all i, 0 ≤ i ≤ k, we have that w(xi , li , xi+1 ) > 0, then we say that τ belongs to the set of traces of W , and we write it τ ∈ T (W ). The ‘time stamps’ on each of the transitions denote intuitively the absolute time of the transition, from the moment when the system was started (t = 0). We cannot assign the probability distribution to the traces in T (W ), since the probability of any such trace is zero. We thus introduce the cylinder set of traces over intervals of times. Definition 4. (Cylinder set of traces) If IR is the set of all nonempty intervals in R+ 0 , then we define the cylinder set of traces τIR ∈ (X × L × IR)k × X , such that: l1 ,I1 l ,I k k τIR = x0 → x1 . . . xk−1 → xk l1 ,t1 (1) l ,t1 +...+t denotes the set of all traces τ = x0 → x1 . . . xk−1 k → k xk , such that ti ∈ Ii , 1 ≤ i ≤ k. If the cylinder of traces τIR is such that π0 (x0 ) > 0, and for all i = 0, ..., k − 1, we have that w(xi , li , xi+1 ) > 0, then we say that τIR belongs to the cylinder set of traces of W , and we write τIR ∈ TIR (W ). Let Ω(TIR (W )) be the smallest Borel σ -algebra that contains all the cylinder sets of traces in TIR (W ). We define a probability measure over Ω(TIR (W )) in the following way. Definition 5. (Trace density semantics on a WLTS) Given a WLTS (X , L , w, π0 ), and a number k in N, the probability of the cylinder set of traces τIR ∈ TIR (W ), specified as in expression (1), is given by: l1 ,I1 l ,I k  w(xi−1 , li , xi )  −a(xi−1 )·inf(Ii ) e − e−a(xi−1 )·sup(Ii ) . a(xi−1 ) i=1 k k π (τIR ) = π (x0 → x1 . . . xk−1 → xk ) = π0 (x0 ) ∏ Note that Ii a(xi−1 )e−a(xi−1 )·t dt = e−a(xi−1 )·inf(Ii ) − e−a(xi−1 )·sup(Ii ) is the probability of exiting the state xi−1 in a time interval Ii−1 , since the probability density function of the residence time of xi−1 is equal to a(xi−1 )e−a(xi−1 ) . R 146 2.2 Lumpability Abstractions of Rule-based Systems Kappa We present Kappa in a process-like notation. We start with an operational semantics, then define the stochastic semantics of a Kappa model. We assume a finite set of agent names A , representing different kinds of proteins; a finite set of sites S , corresponding to protein domains; a finite set of internal states I, and Σι ,Σλ two signature maps from A to ℘(S ), listing the domains of a protein which can bears respectively an internal state and a binding state. We denote by Σ the signature map that associates to each agent name A ∈ A the combined interface Σι (A) ∪ Σλ (A). Definition 6. (Kappa agent) A Kappa agent A(σ ) is defined by its type A ∈ A and its interface σ . In A(σ ), the interface σ is a sequence of sites s in Σ(A), with internal states (as subscript) and binding states (as superscript). The internal state of the site s may be written as sε , which means that either it does not have internal states (when s ∈ Σ(A) \ Σι (A)), or it is not specified. A site that bears an internal state m ∈ I is written sm (in such a case s ∈ Σι (A)). The binding states of a site s can be specified as sε , if it is free, otherwise it is bound (which is possible only when s ∈ Σλ (A)). There are several levels of information about the binding partner: we use a binding label i ∈ N when we know the binding partner, or a wildcard bond − when we only know that the site is bound. The detailed description of the syntax of a Kappa agent is given by the following grammar: a ::= N(σ ) N ::= A ∈ A σ ::= ε | s,σ (agent) (agent name) (interface) s n ι λ ::= ::= ::= ::= nιλ x∈S ε | m∈I ε | − | i∈N (site) (site name) (internal state) (binding state) We generally omit the symbol ε . Definition 7. (Kappa expression) Kappa expression E is a set of agents A(σ ) and fictitious agents 0. / Thus the syntax of a Kappa expression is defined as follows: E ::= ε | a , E | 0/ , E. The structural equivalence ≡, defined as the smallest binary equivalence relation between expressions that satisfies the rules given as follows: E , A(σ ,s,s′ ,σ ′ ) , E ′ ≡ E , A(σ ,s′ ,s,σ ′ ) , E ′ E , a , a′ , E ′ ≡ E , a′ , a , E ′ E ≡ E , 0/ i, j ∈ N and i does not occur in E E[i/ j] ≡ E i ∈ N and i occurs only once in E E[ε /i] ≡ E stipulates that neither the order of sites in interfaces nor the order of agents in expressions matters, that a fictitious agent might as well not be there, that binding labels can be injectively renamed and that dangling bonds can be removed. Definition 8. (Kappa pattern, Kappa mixture) A Kappa pattern is a Kappa expression which satisfies the following five conditions: (i) no site name occurs more than once in a given interface; (ii) each site name s in the interface of the agent A occurs in Σ(A); (iii) each site s which occurs in the interface of the agent A with a non empty internal state occurs in Σι (A); (iv) each site s which occurs in the interface of the agent A with a non empty binding state occurs in Σλ (A); and (v) each binding label i ∈ N occurs exactly twice if it does at all –there are no dangling bonds. A mixture is a pattern that is fully specified, i.e. each agent A documents its full interface Σ(A), a site can only be free or tagged with a binding label i ∈ N, a site in Σι (A) bears an internal state in I, and no fictitious agent occurs. 147 J. Feret, T. Henzinger, H. Koeppl, T. Petrov Definition 9. (Kappa rule) A Kappa rule r is defined by two Kappa patterns Eℓ and Er , and a rate k ∈ R+ 0 , and is written: r = Eℓ → Er @k. A rule r is well-defined, if the expression Er is obtained from Eℓ by finite application of the following operations: (i) creation (some fictitious agents 0/ are replaced with some fully defined agents of the form A(σ ), moreover σ documents all the sites occurring in Σ(A) and all site in Σι (A) bears an internal state in I), (ii) unbinding (some occurrences of the wild card and binding labels are removed), (iii) deletion (some agents with only free sites are replaced with fictitious agent 0), / (iv) modification (some non-empty internal states are replaced with some non-empty internal states), (v) binding (some free sites are bound pair-wise by using binding labels in N). From now on, we assume all rules to be well-defined. We sometimes omit the rate of a rule. Moreover, we denote by Eℓ ↔ Er @k1 , k2 the two rules Eℓ ↔ Er → @k1 and Er → Eℓ @k2 . Definition 10. (Kappa system) A Kappa system R = (π0R , {r1 , . . . , rn }) is given by finite distribution over initial mixtures π0R : {M01 , . . . , M0k } → [0, 1], and a finite set of rules {r1 , . . . , rn }. In order to apply a rule r := Eℓ → Er @k to a mixture M, we use the structural equivalence ≡ to bring the participating agents to the front of E (with their sites in the same order as in Eℓ ), rename binding labels if necessary and introduce a fictitious agent for each agent that is created by r. This yields an equivalent expression E ′ that matches the left and side (lhs) Eℓ , which is written E |= Eℓ as defined as follows: E |= ε σ |= σℓ =⇒ N(σ ) |= N(σℓ ) ι |= ιℓ ∧ λ |= λℓ =⇒ nιλ |= nλιℓℓ a |= aℓ ∧ E |= Eℓ =⇒ a , E |= aℓ , Eℓ σ |= ε ιℓ ∈ {ε , ι } =⇒ ι |= ιℓ 0/ |= 0/ s |= sℓ ∧ σ |= σℓ =⇒ s, σ |= sℓ , σℓ λℓ ∈ {−, λ } =⇒ λ |= λℓ Note that in order to find a matching, we only uses structural equivalence on E, not Eℓ . We then replace E ′ by E ′ [Er ] which is defined as follows: λ [λ ] N(σ )[N(σr )] = N(σ [σr ]) nιλ [nλιrr ] = nι [ιr ]r E[ε ] = E σ [ε ] = σ ιr ∈ I =⇒ ι [ιr ] = ιr (a , E)[ar , Er ] = a[ar ] , E[Er ] 0[a / r ] = ar (s, σ )[sr , σr ] = s[sr ], σ [σr ] λr ∈ N ∪ {ε } =⇒ λ [λr ] = λr ar [0] / = 0/ λ [−] = λ This may produce dangling bonds (if r unbinds a wildcard bond or destroys an agent on one side of a bond) or fictitious agents (if r destroys agents), so we use ≡ resolve them. 2.2.1 Population-based stochastic semantics In addition to the rate constants k, careful counting of the number of times each rule can be applied to a mixture is required to define the system’s quantitative semantics correctly [7, 25]. Thus we define the notions of embedding between a mixture and an expression. Let Z = a1 , . . . , am and Zℓ = c1 , . . . , cn be two patterns with no occurrence of the fictitious agent and such that there exists a pattern Z ′ = b1 , . . . , bm that satisfies both Z ≡ Z ′ and Z ′ |= Zℓ (and so, in particular, n ≤ m). The agent permutations used in the proof that Z ≡ Z ′ allow us to derive a permutation p such that ap(i) ≡ bi . The restriction φ of p to the integers between 1 and n is called an embedding between Zℓ and Z. This is written Zℓ ✁φ Z. There may be several embeddings between Zℓ and Z for the same Z ′ ; if so, this influences the relative weight of the reaction in the stochastic semantics. We denote by [Z, Z ′ ] the set 148 Lumpability Abstractions of Rule-based Systems of embeddings between Z and Z ′ . This notion of embedding is extended to patterns (including fictitious agents) by defining Zℓ ✁φ Z if, and only if, (↓0/ Zℓ ) ✁φ (↓0/ Z), where ↓0/ removes all occurrences of the fictitious agent in patterns. We assume that Eℓ is the lhs of a rule r := Eℓ → Er @k and Z is a mixture such that Eℓ ✁φ Z. Let Z = a1 , . . . , am and ↓0/ Eℓ = c1 , . . . cn . Given Z ′ ≡ Z (we write ↓0/ Z ′ = b1 , . . . , bm ) and a bijection p such that we have Z ′ |= Eℓ , bi ≡ ap(i) for 1 ≤ i ≤ m and φ ( j) = p( j) for 1 ≤ j ≤ n. The result of applying r along φ to the mixture Z is defined (modulo ≡) as any pattern that is ≡-equivalent to Z ′ [Er ]. In other words the embedding φ between Eℓ and Z fully defines the action of r on Z up to structural equivalence. We are now ready to define the stochastic semantics by the mean of a WLTS. In this semantics, the state is a soup of agents, that is to say that we do not care about the order of agents in mixture. So the states of the system are the class of ≡-mixture. Defining species as connected mixture, the state of the system can be seen as a multi-set of species. The formal definition of a Kappa species is as follows: Definition 11. (Kappa species) A pattern E is reducible whenever E ≡ E ′ , E ′′ for some non-empty patterns E ′ , E ′′ ; A Kappa species is the ≡-equivalence class of a irreducible Kappa mixture. As explained earlier, the action of a rule r on a mixture E is fully defined (up to ≡) by an embedding φ between the lhs Eℓ of the rule r and the mixture. So as to consider computation steps over ≡-equivalent of mixtures, we introduce an equivalence relation ≡L over triples (r, E, φ ) where φ is an embedding of the lhs Eℓ of r into E. We say that (r1 , E1 , φ1 ) ≡L (r2 , E2 , φ2 ) if, and only if, (i) r1 = r2 and (ii) there exists an embedding ψ ∈ [E1 , E2 ] such that φ2 = ψ ◦ φ1 . Definition 12. (WLTS of a Kappa system) Let R = (π0R , {r1 , . . . , rn }) be a Kappa system. We define the WLTS WR = (X , L , w, π0 ) where: (i) X is the set of all ≡-equivalent classes of mixtures; (ii) L is the set of all ≡L -equivalence classes of triples (r, E, φ ) such that φ is an embedding between the lhs Eℓ of r and E; (iii) w(x, l, x′ ) = k |[Eℓ ,Eℓ ]| whenever there exist a rule r = Eℓ → Er @k, two mixtures E and E ′ , and an embedding φ ∈ [Eℓ , E], such that x = [E]≡ , l = [r, E, φ ]≡L , and E ′ is the result (up to ≡) of the application of r along φ to the mixture E; otherwise w(x, l, x′ ) = 0; (iv) π0 (x) = ∑{π0R (E ′ ) | E ′ ∈ Dom(π0R ) ∩ x}. The stochastic semantics of a Kappa system R is then defined as the trace distribution semantics of the WLTS WR . 3 Reduction procedure In this section, we assume, without any loss of generality that Σι and Σλ are disjoint sets. This can always be achieved by taking two disjoint copies Sι and Sλ of S and using site names in Sι to bear internal states, and site names in Sλ to bear binding states. Informally, a contact map represents a summary of the agents and their potential bindings. Definition 13. (Contact map) Given a Kappa system R, a contact map (CM) is a graph object (N , E ), where the set of nodes N are agent types equipped with the corresponding interface, and the edges are specified between the sites of the nodes. Formally, we have that N = {(A, Σ(A)) | A ∈ A } and E ⊆ {((A, s), (A′ , s′ )) | A, A′ ∈ A and s ∈ Σ(A), s′ ∈ Σ(A′ )}. If the site s of an agent of type A and the site s′ of an agent of type A′ bear the same binding state in the rhs Er of a rule, then there exists an edge e = ((A, s), (A′ , s′ )) ∈ E between s ∈ Σ(A) and s′ ∈ Σ(A′ ). We say that a site s of the agent a is tested by the rule r, if it is contained in the lhs Eℓ of the rule r. J. Feret, T. Henzinger, H. Koeppl, T. Petrov 149 Definition 14. (Annotated Contact map) Given a Kappa system R, and its CM (N , E ), a valid annotated contact map (ACM) (N , E ) is a contact map where all agents are annotated with respect to the rule set R. The annotation on the agent of type A ∈ A with respect to the rule r is given by the equivalence relation on its set of sites ≈A ⊆ Σ(A) × Σ(A) such that: • If a rule r tests the sites s1 and site s2 of agents a1 ,a2 (it is possible that a1 = a2 ) of type A, then s1 ≈A s2 ; • If a rule r creates an agent a of type A, then all the sites of Σ(A) are in the same equivalence class, i.e. ≈A = Σ(A) × Σ(A); Note that there can be several annotations of the agent type A ∈ A which satisfy the conditions. More precisely, if the equivalence relation ≈A meets the condition, then so does any of its refinement. This allows to define the smallest such equivalence relation ≈A which we call the minimal annotation of agent A. An ACM is minimal whenever each agent type is annotated by its minimal annotation. Let r be a rule and an ACM which is valid with respect to the singleton {r}. Then for any agent type A ∈ A , either A does not occur in the lhs of r, or A occurs but all occurrences of A have an empty interface, or A occurs, tests some sites which all belongs to a same equivalence class C in ≈A . In the latter case, we define testACM (A) = C, otherwise, we define testACM (A) = 0. / r r The meaning of the ACM is to summarize the dependences between sites that can occur during the simulation of a Kappa system. If the two sites s and s′ in the Σ(A) are correlated by the relation ≈A , i.e. s ≈A s′ , it suggests that they are dependent in the following way. We must not aggregate in the same equivalence class any two states x and x′ , such that they contain the agent A in a different evaluation of the sites s and s′ . On the other hand, if the two sites s and s′ are not correlated by ≈A , then we may aggregate the states by the ’marginal’ criteria, i.e. the condition which involves only one of the sites. Therefore, the less states are related by (≈A )A∈A , the better the reduction will be. To numerically justify this, we can imagine having an agent type A whose interface has n different sites s1 , ..., sn , and each of them has two possible internal state modifications. Let us observe the two limiting relations ≈A , i.e. ≈A = {(si , s j ) | 1 ≤ i ≤ n, 1 ≤ j ≤ n}, and ≈′A = {(si , si ) | 1 ≤ i ≤ n}. The annotation ≈A enforces at least to 2n states to describe all modifications of the agent A, whereas the annotation ≈′A suggests that it is enough to use only 2 · n of them. The ACM can be used to identify parts of Kappa species that we call fragments. Definition 15. (Kappa fragments) A fragment is the ≡-equivalent class of a non empty irreducible pattern E such that: (i) the set of sites in the interface σ of an agent A(σ ) in E is an equivalence class of ≈A , (ii) sites can only be free or tagged with a binding label i ∈ N and sites in Σι are tagged with an internal state in I, (iii) there is no occurrence of fictitious agent 0. / We can use fragments to abstract the WLTS WR , by identifying the mixtures which have the same (multi-)set of fragments. To reach that goal, we first overload the definition of ≡ in order to identify mixtures having the same fragments. We introduce the binary relation ≡♯ as the smallest equivalence relation over patterns which is compatible with ≡ and such that: A(σ ), A(σ ′ ), E ≡♯ A(↑C σ ′ , ↑Σ(A)\C σ ), A(↑C σ , ↑Σ(A)\C σ ′ ), E for any agent type A ∈ A , σ ,σ ′ interfaces, E pattern, and C an ≈A -equivalence class of sites. For any set of sites X ⊆ S , the projection function ↑X over interfaces keeps only the sites in X, formally ↑X is defined by ↑X ε = ε , ↑X (sιλ ,σ ′ ) = sιλ , ↑X σ ′ whenever s ∈ X, and ↑X (sιλ ,σ ′ ) =↑X σ ′ otherwise. Now we define the relation ∼L ♯ which stipulates that the rule r1 applies on E1 along φ1 the same way as the rule r2 on E2 along φ2 . More formally, we write (r1 , E1 , φ1 ) ∼L ♯ (r2 , E2 , φ2 ) whenever the following properties are all satisfied: 150 Lumpability Abstractions of Rule-based Systems 1. r1 = r2 ; 2. E1 ≡♯ E2 ; 3. φ2 = ψ ◦ φ1 , where ψ is the permutation which tracks how the sub-interface ↑testACM (Ai ) (Ai (σi )) is r ♯ moved in the proof that E1 ≡ E2 , for any agent Ai (σi ) occurring in E1 . More precisely, the transposition [ii+1 ] is associated to an agent permutation of the agents at position (A) i and i + 1; the transposition [12 ] is associated to a step which permutes the sub-interface testACM r of two agents of type A, for any agent type A ∈ A ; any other step is associated with the identity function (over N). The function ψ is defined as the composition of all the permutations (in the reverse order) which are associated to the elementary steps in the proof that E1 ≡♯ E2 . 4. the result of the application of r1 to E1 along φ1 is ≡-equivalent to the result of the application of r2 to E2 along φ2 . Definition 16. (Abstract WLTS of a Kappa system) Let R = (π0R , {r1 , . . . , rn }) be a Kappa system. We define the WLTS W˜R = (X˜ , L˜ , w̃, π˜0 ) where: • X˜ is the set of all ≡♯ -equivalent class of mixture; • L˜ is the set of all ≡♯L -equivalent class of triples (r, E, φ ) such that φ is an embedding between the lhs Eℓ of r and E; • w̃(x̃, λ̃ , x̃′ ) is equal to k |[Eℓ ,Eℓ ]| whenever there exist a mixture E, a rule r, and an embeddings φ such that x̃ = [E]≡♯ and λ̃ = [r, E, φ ]≡♯ ; otherwise, w̃(x̃, λ̃ , x̃′ ) is equal to 0; L • for any x̃ ∈ X˜ , π˜0 (x̃) = ∑E ′ ∈Dom(π R )∩x̃ π0R (E ′ ). 0 Let us define the relation ∼ over X by [E1 ]≡ ∼ [E2 ]≡ if, and only if, E1 ≡♯ E2 and the relation ∼L over L by [λ1 ]≡L ∼L [λ2 ]≡L if, and only if, λ1 ≡♯L λ2 . 4 Abstraction We introduce abstractions on WLTS by aggregating the states and labels into partition classes. We obtain a new WLTS defined over the aggregated states and labels. Each non-trivial abstraction is a loss of information. However some of them are such that it is possible to do the stochastic analysis on the aggregates rather than on concrete states. We address the problem of characterizing when this is possible, and if so, how the weights in the abstracted system are computed. We also discuss the reverse process - given the abstracted system, and a particular probability distributions over the aggregates, whether we can make conclusions about the traces in the concrete system. We do the general theoretical analysis of the abstractions on WLTS, and afterwards we show the relation with the reduction of Kappa systems, that is presented in Sec. 3. Definition 17. (Abstraction) Consider a WLTS W = (X , L , w, π0 ), and a pair of equivalence relations (∼, ∼L ) ∈ X 2 × L 2 , such that each ∼-equivalence class and each ∼L -equivalence class is finite. We denote the equivalence classes by x̃, l,˜ and we write x ∈ x̃, to indicate that x belongs to the equivalence class x̃, and l ∈ l˜ to indicate that the label l belongs to the equivalence class l.˜ A WLTS of the form W˜ = (X/∼ , L/∼L , w̃, π˜0 ), where π̃0 (x̃) = ∑{π0 (x) | x ∈ x̃} is called an abstraction of W , induced by the pair of equivalence relations (∼, ∼L ). Note that more abstractions can be induced by W , depending on how w̃ is defined. Moreover, for any two cylinder sets of traces τ̃IR ∈ TIR (W˜ ) and τIR ∈ TIR (W ), we say that τ̃IR = l˜1 ,I1 l˜ ,I l1 ,I1 l ,I k k k k x̃k is an abstraction of τIR = x0 → x1 . . . xk−1 → x̃0 → x̃1 . . . x̃k−1 → xk , and we write it τIR ∈ τ̃IR . 151 J. Feret, T. Henzinger, H. Koeppl, T. Petrov Definition 18. (Sound abstraction: Aggregation) We say that the abstraction W˜ is a sound abstraction of W , if the probability of any cylinder set of traces τ̃IR ∈ TIR (W˜ ) is equal to the sum of the probabilities of all the cylinder sets of traces τIR ∈ TIR (W ), whose abstraction is τ̃IR : π (τ̃IR ) = ∑{π (τIR ) | τIR ∈ τ̃IR }. We introduce a function γ : X/∼ → (X → [0, 1]) which assigns to each of the partition class x̃ ∈ X/∼ a probability distribution over the states that belong to this partition class. The set of all such vectors γ we denote by ΓX ,∼ is defined as: {γ | γ : X/∼ → (X → [0, 1]) ∧ ∀x̃ ∈ X˜ , ∑ γ (x̃, x) = 1}. x∈x̃ We can think of the value γ (x̃, x) as the conditional probability of being in the state x, knowing that we are in state x̃, i.e. Pr(Xt = x | Xt ∈ x̃) = γ (x̃, x). We note that, when thinking of γ as the conditional probability, it should be a time-dependent value. However, we refer to γ as to a single, constant distribution. This will be justified in Lem. 1. Definition 19. (Complete abstraction: Deaggregation) We say that the abstraction W˜ is a complete abstraction of W for γ ∈ ΓX ,∼ , if the following holds. Given the probability of an arbitrary abstract cylinder set of traces of length k ≥ 1, that ends in the abstract state x̃k (written τ̃IR → x̃k ), we can recompute the probability of ending the trace in the concrete state xk ∈ x̃k in the following way: π (τ̃IR → xk ) = γ (x̃k , xk ) · π (τ̃IR → x̃k ). Sound and complete abstractions W˜ cannot be induced by any pair of relations (∼, ∼L ), because there might not exist a weighting function w̃ : X/∼ × L/∼L × X/∼ → R, such that the conditions from Dfn. 18 and Dfn. 19 are met. Moreover, even if such w̃ exists, the remaining question is whether the information on the abstract system is enough to compute them. We now restate the main Theorem from [12], that the abstractions for Kappa systems, that we resumed in Sec. 3, are sound and complete. Theorem 1. (The abstraction induced by the ACM is sound and complete) Given a Kappa system R = (π0R , {r1 , ..., rn }), and an ACM (N , E ), an abstraction W˜R = (X/∼ , L/∼ , w̃, π˜0 ) induced by the partition classes (∼, ∼L ) ⊆ X 2 × L 2 , as proposed in the Def. 16 is a sound and complete abstraction of the WR = (X , L , w, π0 ), provided that for any two mixtures M and M ′ such that M ≡♯ M ′ , we have: π0 ([M]≡ ) · |[M ′ , M ′ ]| = π0 ([M ′ ]≡ ) · |[M, M]|. We consider a mixture M. We denote by x ∈ X the equivalence class [M]≡ , and by x̃ ∈ X˜ the equivalence class [M]≡♯ = [x]∼ . The conditional probability γ (x̃, x) is computed as the ratio of the number of automorphisms of x (embedding between x and x) and the sum of the number of automorphisms of any ∼-equivalent state. Thus we have: γ (x̃, x) = |[x, x]| . | x ∼ x′ } ∑{|[x′ , x′ ]| The reader can find the detailed proof in [12]. 152 4.1 Lumpability Abstractions of Rule-based Systems Lumpability Now we define different versions of lumpability and investigate the relationship with sound and complete abstractions. Definition 20. (Lumped process) Given a WLTS W = (X , L , w, π0 ), where X = {x1 , x2 , ...}, and a partition ∼⊆ X × X on its state space, we observe the continuous-time stochastic process {Xt }t∈R+ , 0 that is generated by W (Dfn. 2). We define the lumped process {Yt } on the state space X/∼ = {x̃1 , x̃2 , ...} (denoted by capital indices, i.e. x̃I , x̃J ) and with initial distribution π˜0 , so that Pr(Yt = x̃J | Y0 = x̃0 ) = Pr(Xt ∈ x̃J | X0 ∈ x̃0 ). The lumped process is not necessarily a Markov process. Definition 21. (Lumpability) Given a WLTS W = (X , L , w, π0 ) that generates the process {Xt }, we say that it is lumpable with respect to the equivalence relation ∼⊆ X × X if and only if its lumped process {Yt } has the Markov property. The evolution of a process depends on the initial distribution, and so does the lumpability property. We thus define the set of initial distributions, for which the lumpability holds. We denote the set of all probability distributions over X as PX : PX = {π | π : X → [′, ∞] and ∑ π (§i ) = ∞}. §i ∈X Moreover, we denote the set of initial distributions that produce a chain lumpable with respect to the I : given equivalence relation ∼ by PX ,∼ I PX ,∼ := {π | the lumped process initialized with π is lumpable with respect to ∼}. Whenever a distribution π ∈ PX is positive on the equivalence class x̃, i.e. ∑{π (x) | x ∈ x̃} > 0, we denote by π |x̃ (x), the conditional distribution over the states of x̃: π |x̃ (x) = π (x)/π (x̃), when x ∈ x̃, and π |x̃ (x) = 0, otherwise. Definition 22. (Strong and weak lumpability) Given a WLTS W = (X , L , w, π0 ) that generates the process {Xt }, and an equivalence relation ∼⊆ X × X , we say that {Xt } is: • strongly lumpable with respect to ∼, if the lumped process {Yt } is Markov with respect to any I initial distribution, i.e. PX ,∼ = PX ; • weakly lumpable with respect to ∼, if there exists an initial distribution that makes the lumped I process {Yt } Markov, i.e. PX / ,∼ 6= 0. Note that the definitions of strong and weak lumpability involve the quantifiers ”for all” and ”exists” over the probability distributions over a set of states. Thus, checking for either of them involves in general an infinite number of checks. People have given sufficient conditions of strong and weak lumpability on discrete-time Markov chains (DTMC’s) [21, 26]. The results had been extended to the continuous-time case [4, 27]. We rephrase the sufficient conditions stated therein. In order to understand the sense of the weak lumpability characterization, we discuss the meaning l1 ,I1 of γ . We recall the semantics of a WLTS W by observing the cylinder sets of traces, i.e. τIR = x0 → lk ,Ik x1 . . . xk−1 → xk ∈ TIR (W ). The abstraction W˜ of W , induced by (∼, ∼L ) generates an abstract cylinder l1 ,I1 l ,I k k x̃k ∈ TIR (W˜ ). set of traces, denoted τ̃IR = x̃0 → x̃1 . . . x̃k−1 → 153 J. Feret, T. Henzinger, H. Koeppl, T. Petrov For any cylinder set of traces τ̃IR ∈ TIR (W˜ ), we denote by γτ̃IR the distribution of the conditional probabilities over the lumped state x̃k , knowing that the abstract cylinder of traces τ̃IR , which ends in the abstract state x̃k , was observed, i.e. π (τ̃IR → xk ) γτ̃IR (xk ) = . π (τ̃IR ) The definition of the complete abstraction suggests that, if γτ̃IR was independent of the traces on which it is conditioned, i.e. τ̃IR , then the completeness would hold. Theorem 2. (Lumpability on CTMCs) Let us observe a WLTS W = (X , L , w, π0 ) that generates the process {Xt }, and an equivalence relation ∼⊆ X × X . We consider the rate matrix R : X × X → R. If the lumped process is Markov, then we denote its rate matrix by R̃ : X/∼ × X/∼ → R. Then we have the following characterizations about the lumped process {X̃t }: • If for all xi1 , xi2 ∈ X such that xi1 ∼ xi2 , and for all x̃J ∈ X/∼ , we have that ∑ R(xi1 , x j ) = x j ∈x̃J ∑ R(xi2 , x j ), (2) x j ∈x̃J then {Xt } is strongly lumpable with respect to ∼; We have: R̃(x̃I , x̃J ) = ∑{R(xi1 , x j ) | x j ∈ x̃J }; • If there exists a family of probability distributions over the lumped states, γ ∈ ΓX ,∼ , such that for all x j1 , x j2 ∈ X such that x j1 ∼ x j2 and for all x̃I ∈ X/∼ , we have that a(x j1 ) = a(x j2 ) and ∑xi ∈x̃i R(xi , x j1 ) ∑xi ∈x̃I R(xi , x j2 ) = , γ (x̃J , x j1 ) γ (x̃J , x j2 ) (3) then 1. If the distribution γ is in accordance with π0 , i.e. π0 |X/∼ = γ , then for any finite sequence of states (x0 , . . . , xk ) ∈ X k+1 and any sequence of time intervals (I1 , . . . , Ik ) ∈ IRk , we consider l1 ,t1 l ,t1 +...+t k ′ the set τ̃IR of the traces of the form x0′ → x1′ . . . xk−1 → k xk′ . For all i, 0 ≤ i ≤ k and xi ∼ xi′ , and for all i, 1 ≤ i ≤ k, ti ∈ Ii and li ∈ L , we have that: if π (τ̃IR ) > 0 then γτ̃IR = γ . In other words, knowing that we are in state x̃I , the conditional probability of being in state x ∈ x̃I is invariant of time. 2. The process {Xt } is weakly lumpable with respect to ∼. Moreover, we have: R̃(x̃I , x̃J ) = ∑{R(xi , x j2 ) | xi ∈ x̃I } ; γ (x̃J , x j2 ) One shall notice that Thm. 2 gives a weaker condition than the completeness of WLTS abstraction (eg see Dfn. 19). The main reason is that we do not ’track’ transition labels, in the sense that we observe the abstraction on the cylinder sets of traces induced only by ∼, and not also by ∼L . Yet, in the particular case when states fully define the transition labels (ie, if w(x1 , l1 , x1′ ) > 0, w(x2 , l2 , x2′ ) > 0, x1 ∼ x2 , and x1′ ∼ x2′ , then l1 ∼L l2 ), the given condition for weak lumpability coincides with the definition of the complete abstraction of WLTS. The characterization of weak (resp. strong) lumpability given in Thm. 2 is sufficient, but not a necessary condition: there exist systems which are strongly or weakly lumpable, but do not satisfy the 154 Lumpability Abstractions of Rule-based Systems conditions given in the theorem. Interestingly, there are systems, such that the characterization from Thm. 2 would detect as strong, but not weakly lumpable, which is counter-intuitive with the terminology. One shall also notice that the conditions of Thm. 2 imply that: in order to aggregate two states in the CTMC, they must not have different waiting times until the next transition (e.g. they should have the same activity). It is stated explicitly in the characterization of weak lumpability and it can be obtained by summation over the outgoing class in the characterization of strong lumpability. We consider a WLTS W = (X , L , w, π0 ), and the set of all equivalence relations ∼ on X , denoted PTX . We introduce the subsets of PTX , denoted PS, PW , CS, CW in the following meaning: (i) PS -the set of all equivalence relations such that {Xt } is strongly lumpable with respect to ∼; (ii) PW - the set of all equivalence relations such that {Xt } is weakly lumpable with respect to ∼; (iii) CS - the set of all equivalence relations such that {Xt } satisfies the condition for strong lumpability given in the Thm. 2; (iv) CW - the set of all equivalence relations such that {Xt } satisfies the condition for weak lumpability given in the Thm. 2. Lemma 1. (Relations on lumpability properties and conditions) Consider an arbitrary WLTS W = (X , L , w, π0 ) and the equivalence relation ∼⊆ X × X . We have the following relations: (1a) if ∼∈ PS then ∼∈ PW , if ∼∈ CS then ∼∈ PS; and if ∼∈ CW then ∼∈ PW ; The converse implication does not hold for any of the statements; (2a) If ∼∈ CW , that does not imply ∼∈ CS; (2b) If ∼∈ CS, that does not imply ∼∈ CW . Proof. The statement in the part (1) trivially follows from the Dfn. 21 and Thm. 2. To show (2a) and (2b), we consider the WLTS W specified in the Fig. 1(a), with the state space X = {x, y1 , y2 , y3 , z1 , z2 , z3 }. Let ∼1 be an equivalence relation on X , such that y1 ∼1 y2 and z1 ∼1 z2 . By lumping the states by ∼1 , we get the system W˜1 , as shown in Fig. 1(b). It is easy to check that ∼1 ∈ CS. Moreover, we have that ∼1 ∈ CW , since for   x y12 y3 z12 z3 γ= 1 (0.5, 0.5) 1 (0.5, 0.5) 1 the weak lumpability condition is satisfied, so ∼1 ∈ CS ∩CW . We further lump the states y12 and y3 , by taking the transitive closure of the relation ∼1 union (y1 , y3 ), denoted ∼2 = tc(∼1 ∪(y1 , y3 )) (Fig. 1(c)). This lumping is such that ∼2 ∈ / CS because we have y1 ∼ y3 , but w(y1 , l, z12 ) > 0, and w(y3 , l, z12 ) = 0. On the other hand, for γ=  x y123 z12 z3 1 (1/3, 1/3, 1/3) (0.5, 0.5) 1  we argue that ∼2 ∈ CW . Therefore, if the initial distribution is in accordance with γ , the abstraction W˜2 is sound and complete. If we rather lump z1 and z2 , by ∼3 = tc(∼1 ∪(z1 , z3 )), we get the system W˜3 (Fig. 1(d)). This system is such that ∼3 ∈ CS \ CW . More precisely, we cannot find a γ which would witness ∼3 ∈ CW : if such a γ existed, we would have γ ({x})(x) = 1, and consequently γ (y12 ) = (0.5, 0.5), and γ (y3 ) = 1. This implies that the conditional distribution γ (z123 ) cannot be invariant of time - it will alternate between the distributions (0, 0, 1) and (0.5, 0.5, 0), depending on the choice made in x. Note that, since ∼3 ∈ CS, it follows that ∼3 ∈ PS, and this implies ∼3 ∈ PW . This discussion indicates that if we decide to check for weak lumpability instead of for strong by using the characterization from Thm. 2, it might happen that we eliminate the aggregations that are strongly lumpable. In the case of reductions of Kappa systems, we will use the weak lumpability characterization. 155 J. Feret, T. Henzinger, H. Koeppl, T. Petrov 1/2 1/2 1/3 x 1/3 y1 y2 1 z1 1/2 1/2 1 z2 y1 1/3 1/3 x 1/2 1 y2 1 y3 1 y3 1 z3 1/2 z2 1/2 z3 1/2 x̃ 1/2 y1 1 y2 1 y3 1 1 ỹ12 z̃12 1/2 1/3 1 ỹ3 z̃3 1/2 1/2 (b) W˜1 ; ∼1 ∈ PS ∩ PW ∩CS ∩CW (a) W : the concrete system x 2/3 1/2 1/2 1/3 1/2 1/3 1/3 1/3 1/2 z1 1/2 1/2 z1 1/2 1/2 z̃12 2/3 1/2 z2 1/2 z3 1/2 1/3 x̃ 1 1/3 1/2 x ỹ 1 y2 1 y3 1 z1 1/2 z2 1/2 z̃3 1/2 1/2 z3 1/2 ỹ12 1/2 1/2 1/3 1/3 1/2 1/3 y1 1 1/2 x̃ 1/2 ỹ3 z̃ 1/2 1 1/2 (c) W˜2 ; ∼2 ∈ CW \CS (d) W˜3 ; ∼3 ∈ CS \CW Figure 1: Different abstractions of system W 4.2 Bisimulations Aiming to define the algorithm that is abstracting the WLTS of a Kappa system, we start by redefining the lumpability properties in the bisimulation notions. Bisimulation is typically defined by logically characterizing the distinguishing property of the states that may be aggregated. We define three kinds of bisimulation relations on the WLTS, which are based on the lumpability characterizations given in Thm. 2. We adopt the terminology of [5]. The forward bisimulations arise from the characterization for strong lumpability: the bisimilar states have the same forward behaviour in the sense that they are each targeting any other lumped state with the same total affinity (total outgoing rate). This concept is well established for dependability or performance analysis [16, 15]. What we use in the abstractions of Kappa systems is backward bisimulation. The bisimilar states have the same backward behaviour in the sense that they are reached by the predecessors from one lumped state with the same probabilistic quantity, which becomes the rate in the abstract system. It is however less established and only applied in very few approaches for stochastic modelling [30]. The backward uniform bisimulation is an instance of a backward bisimulation with an additional constraint that only the equally-probable states may be aggregated. Definition 23. Given a WLTS W = (X , L , w, π0 ), and (∼, ∼L ) a pair of equivalence relations respectively over X and L , we define a function δF : X × L/∼L × X/∼ → R+ 0: ˜ x̃ j ) = ∑{|w(xi , l, x j )| | l ∈ l˜ and x j ∈ x̃ j }. δF (xi , l, Furthermore, given a family of probability distributions over the partitions γ ∈ ΓX ,∼ , we define the quantity δB : X/∼ × L/∼L × X → R+ 0: ˜ xi ∈ x̃i } {γ (x̃i , xi ) · |w(xi , l, x j )| | l ∈ l, ˜ x j) = ∑ δB (x̃i , l, . γ (x̃ j , x j ) Specifically, if we have that γ is a uniform distribution over the equivalence classes, we can express the latter expression in terms of cardinalities of the equivalence classes: ˜ xi ∈ x̃i }. ˜ x j ) = |x̃ j | ∑{|w(xi , l, x j )| | l ∈ l, δBU (x̃i , l, |x̃i | 156 Lumpability Abstractions of Rule-based Systems Definition 24. (Forward and backward Markov bisimulation) Given a WLTS W = (X , L , w, π0 ), and (∼, ∼L ) a pair of equivalence relations respectively over X and L , we say that (∼, ∼L ) is a 1. Forward Markov Bisimulation, if for all xi and x j , the following is satisfied: xi ∼ x j , iff for all ˜ x̃) = δF (x j , l, ˜ x̃). equivalence classes x̃ ∈ X/∼ ,l˜ ∈ L/∼L , we have that a(xi ) = a(x j ) and δF (xi , l, Remark. Note that this involves the bisimulation in the classical sense: if xi has a successor in some class, x j has it as well, and they are related by appropriate labels (and probabilities in this case). 2. Backward Markov bisimulation, if for all xi and x j , there exists an γ ∈ ΓX ,∼ , such that the following is satisfied: xi ∼ x j , iff for all equivalence classes x̃ ∈ X/∼ , l˜ ∈ L/∼L , we have that a(xi ) = a(x j ) ˜ xi ) = δB (x̃, l, ˜ x j ). and δB (x̃, l, Theorem 3. (Forward Markov bisimulation implies sound abstraction) Let W = (X , L , w, π0 ) be a WLTS. If (∼, ∼L ) induces a forward Markov bisimulation, then for any aggregates x̃i , l,˜ and x̃ j , we can define ˜ x̃ j ) = δF (xi , l, ˜ x̃ j ). w̃(x̃i , l, The so defined abstraction W˜ = (X/∼ , L/∼L , w̃, π˜0 ) is sound. We then say that W refines W˜ by a forward Markov bisimulation (∼, ∼L ), written W F,(∼,∼ ) W˜ . L Theorem 4. (Backward Markov bisimulation implies sound and complete abstraction) Given a WLTS W = (X , L , w, π0 ), if (∼, ∼L ) induces a backward Markov bisimulation with conditional probabilities over the aggregates γ ∈ ΓX ,∼ , then for any aggregates x̃i ,l,˜ and x̃ j , we can define ˜ x j ). ˜ x̃ j ) = δB (x̃i , l, w̃(x̃i , l, (4) If γ (x̃) = π0 |x̃ , then the so defined abstraction W˜ = (X/∼ , L/∼L , w̃, π˜0 ) is sound and complete. We then say that W refines W˜ by a backward Markov bisimulation (∼, ∼L ) with conditional distributions γ , written W B,(∼,∼L ),γ W˜ . ˜ x̃ j ) = δBU (xi , l, ˜ x̃ j ), In particular, if we know that γ is uniform, the equation (4) becomes w̃(x̃i , l, ˜ written also W BU,(∼,∼L ) W . 4.3 Proving bisimulations The forward bisimulation relation for abstracting the transition systems with CTMC semantics has been established and used in applications (eg, [16, 15]). Moreover, computing the backward uniform bisimulation when γ is uniform is defined [5, 30]. It is based on an alternative characterization of the backward uniform Markov bisimulation, which eases the analysis. Lemma 2. (Proving backward uniform Markov bisimulation) Let W = (X , L , w, π0 ) be a WLTS and (∼, ∼L ) be a pair of equivalence relations respectively over X and L . For any state x′ ∈ X , and any ˜ x′ ) of transitions from a state in x̃ to the pair of classes x̃, l˜ ∈ X/∼ × L/∼L , let us define the set Pred(x̃, l, state x′ and with a label in l˜ as follows: ˜ x′ ) = {(x, l) ∈ x̃ × l˜ | w(x, l, x′ ) > 0}. Pred(x̃, l, Assume that: (1) π0 |X/∼ = π˜0 , and (2) for any xi′ , x′j ∈ X such that xi′ ∼ x′j and any x̃ ∈ X/∼ , l˜ ∈ ˜ x′ ) and Pred(x̃, l, ˜ x′ ), such that for any (xi , li ) ∈ L/∼L , there exists a bijective map φ between Pred(x̃, l, i j ˜ x′ ), if φ (xi , li ) = (x j , l j ), then we have that w(xi , li , x′ ) = w(x j , l j , x′ ). Pred(x̃, l, i i j Then we have that W is the backward uniform bisimulation of the abstraction W˜ = (X/∼ , L/∼ , w̃, π˜0 ), i.e. W BU,(∼,∼L ) W˜ . 157 J. Feret, T. Henzinger, H. Koeppl, T. Petrov Wi !BU,(∼1 ,∼L1 ) W W̃ ⇒ W !B,(∼,∼L ),γ W̃ !BU,(∼2 ,∼L2 ) Figure 2: Proving backward refinement On the other hand, as soon as γ over the aggregates is not uniform, we cannot observe the bijection between predecessors over the states. Proving that the given abstraction is a backward bisimulation cannot be established unless we have a right ’guess’ of the distributions γ . Lem. 3 states how to avoid proving backward bisimulation by instead proving two uniform backward bisimulations. More precisely, if we want to prove the backward refinement between the systems W and W˜ , it is enough to observe the system W i , which is a backward uniform refinement of both of the systems W and W˜ (Fig. 4.3). Lemma 3. (Proving backward Markov bisimulation) Consider a WLTS W = (X , L , w, π0 ), and any aggregation relation (∼, ∼L ), such that W˜ = (X/∼ , L/∼L , w̃, π̃0 ). We assume that there exist a system W i = (X i , L i , wi , π0i ), and the two pairs of equivalence relations (∼1 , ∼L1 ), (∼2 , ∼L2 ), such that W i BU,(∼1 ,∼L1 ) W , W i BU,(∼2 ,∼L2 ) W˜ , ∼1 ∼2 (in the sense that, for any x1i , x2i ∈ X i , x1i ∼1 x2i ⇒ x1i ∼2 x2i ), ∼L1 ∼L2 , and, for any x1i , x2i such that x1i ∼2 x2i , the number of states which are ∼L1 - equivalent to x1i is equal to the number of states which are ∼L1 - equivalent to x2i . Under this assumption, we have that W B,(∼,∼L ),γ W˜ , where γ is defined as γ (x̃, x) = |{xi ∈ X i | xi ∼1 x0i }| , for any [x0i ]∼1 = x. |{xi ∈ X i | xi ∼2 x0i }| This Lemma contains the key observation for the abstraction of Kappa systems, and for proving Thm.1. It thus completes the intention of the theoretical analysis in this paper. More precisely, we observe the WLTS W R of a given Kappa system R, as defined in Dfn. 12 and its abstraction generated as proposed in the reduction procedure (Sec. 3, Dfn. 16). The main observation is that the system W is already an abstraction. More concretely, the states of W are multisets of species, and as such, they abstract the individual species. For example, a state that contains two agents of type A(su ) abstracts away the potential individual behavior of these two agents, for example A1 (su ) and A2 (su ). To show that the abstraction is sound and complete, we observe the system W i , which is the individual-based semantics of a Kappa system, where each individual agent is uniquely identified. The backward uniform refinement is established between W i and W by the modeling assumptions. We are left to prove the backward uniform refinement between W i and W˜ . This is done by inspections on the ACM’s (Dfn. 14). 4.4 Example We consider the following Kappa system. We have the agent types A = {A, B}, the site names {s,t}, the signatures Σι (A) = Σι (B) = {s} and Σλ (A) = Σλ (B) = {t}, the alphabet of internal states I = {u, p}. The contact map is defined by (N , E ), such that N = {(A, s), (A,t), (B, s), (B,t)} and E = {((A,t), (B,t))} and the following rules:  r1 : A(su ) ↔ A s p @k1 , k1−  r2 : B(su ) ↔ B s p @k2 , k2−   r3 : A(t) , B(t) ↔ A t1 , B t1 @k3 , k3− 158 Lumpability Abstractions of Rule-based Systems Moreover, using the minimal ACM for annotating the agents, as written in Dfn. 14, we get that ≈A has two equivalence classes {s} and {t}; and that ≈B has two equivalence classes {s} and {t} as well. The fragments derived from an ACM (Dfn. 15) are the following: F1 = A(su ), F2 = A s p , F3 = A(t),    F4 = A t1 , B t1 , F5 = B(su ), F6 = B s p , F7 = B(t). Let us pick a (finite) initial distribution π0 . Now we observe the WLTS W = (X , L , w, π0 ) assigned to the Kappa system RAB (introduced in Dfn. 12), and the state y which is the ≡-equivalence class of the mixture Ey defined as follows:       A s p ,t1 ,B s p ,t1 ,A su ,t2 ,B su ,t2 ,A su ,t3 ,B su ,t3 The unique (up to ≡) non ≡♯ -equivalent mixtures is Ey′ which is defined as follows:       A s p ,t1 ,B su ,t1 ,A su ,t2 ,B s p ,t2 ,A su ,t3 ,B su ,t3 We denote y′ = [Ey′ ]≡ , ỹ′ = [Ey′ ]≡♯ . We compute however that the distribution among state ỹ = [Ey ]≡♯ . We have: γ (ỹ, y) = 1/3 and γ (ỹ, y′ ) = 2/3. Roughly speaking this comes from the fact that if we annotate fragments of type A and B in ỹ with the identifiers 1, 2, 3 (there are 36 possible annotation), and if we assume that agents with the same identifiers are bound together. Then there are 12 annotations so that the phosphorilated A and B are bound together, and 24 where this is not the case. A more detailed analysis of this model is given in [12]. 5 Case study In this section, we apply our framework to a case study. We have indeed refactored in Kappa the model of a crosstalk between the EGF rececptor and the Insulin receptor pathways which is described in [6]. In this model, two kinds of receptors, EGF receptor (EGFR) and insulin receptor (IR) can recruit a protein called Sos, which can be phosphorylated, or not. Each kind of receptor has its own pathway, and these two pathways shared some common proteins. The contact map is given in Fig. 3. One can notice that some sites can be bound to several other sites, which denotes a competition (concurrency). Moreover, the site d of a EGF receptor (EGFR) can be bound to the site d of another EGFR (since there is a loop in the CM). Moreover rules are given in Table 1. We do not give the rate for rules, but we assume no hypotheses on the rates (rates are parameters, some of them might be equal, or not). Moreover, each rule is reversible. We roughly explain how each pathway works, by focusing on the forward direction of rules. First, we describe how EGFR can recruit a transport molecule (Grb). EGFR can recruit a ligand (EGF) on site a (r01,r02), and two EGFRs can form a dimer (r03,r04,r05). We have used two rules to encode EGF/EGFR binding, in order to model the fact that the rate of association may depend on the fact whether EGFR is in a dimer, or not. The same way, the rate of dimer formation/dissociation may depend on the number of ligands that are bound to receptors. The site b of EGFR can be phosphorylated (r06,r07) at a rate which depends whether the receptor is in a dimer, or not. Then, EGFR can recruit an adapter molecule called Shc (r08). Then, EGFR can phosphorylize Shc (r09,r10) (the rate depend on the fact whether the receptor is still in a dimer, or not). Shc can then recruit a transport modecule (Grb) (r11). Yet, each receptor has a shorter way to recruit a transport molecule. The site c of EGFR can be phosphorylated (r12,r13), and then recruit Grb directly (r14,r15). Then, we describe how an insulin receptor (IR) can recruit Grb. IR can recruit insulin molecules (Ins) on two sites a (r16,r17) and b (r18,r19) (the rate may depend on the fact whether an insulin molecule 159 J. Feret, T. Henzinger, H. Koeppl, T. Petrov Sos d Ins EGF b Grb a a a sitea a EGFR b IRS b a c b Shc d d IR a c b Figure 3: Contact map for the EGFR/Insulin crosstalk. r01: r02: r03: r04: r05: r06: r07: r08: r09: r10: r11: r12: r13: r14: r15: r16: r17: r18: r19:   EGF(a) , EGFR(a,d) ←→ EGF a1 , EGFR a1,d  − 1 1 − EGF(a) , EGFR(a,d ) ←→ EGF a , EGFR  a ,d  EGFR(a,d) , EGFR(a−,d) ←→ EGFR a,d1 , EGFR a−,d1 1 1 EGFR(a,d) , EGFR(a,d) ←→ EGFR a,d , EGFR a,d  − 1 − 1 EGFR(a−,d) , EGFR(a−,d) ←→  EGFR a ,d , EGFR a ,d EGFR(bu ,d) ←→ EGFR b p ,d  EGFR(bu ,d− ) ←→ EGFR b p ,d−    EGFR b p , Shc(a) ←→ EGFR b1p , Shc a1     EGFR b1p ,d , Shc a1,bu ←→ EGFR b1p ,d , Shc a1,b p     − 1 − 1 1 1 EGFR b p ,d , Shc a ,bu ←→ EGFR b p ,d , Shc a ,b p    1 1 Grb(a) , Shc b p ←→ Grb a , Shc b p  EGFR(cu ,d) ←→ EGFR c p ,d  − EGFR(cu ,d ) ←→ EGFR c p ,d−    EGFR c p ,d , Grb(a) ←→ EGFR c1p ,d , Grb a1    EGFR c p ,d− , Grb(a) ←→ EGFR c1p ,d− , Grb a1   1 1 IR(a,b) , Ins(a) ←→ IR a ,b , Ins a  IR(a,b− ) , Ins(a) ←→ IR a1,b− , Ins a1 1 1 IR(a,b) , Ins(a) ←→ IR a,b ,Ins a  IR(a−,b) , Ins(a) ←→ IR a−,b1 , Ins a1 r20: r21: r22: r23: r24: r25: r26: r27: r28: r29: r30: r31: r32: r33: r34: r35: r36: r37: r38:  IR(a,b,cu ) ←→ IR a,b,c p  − − IR(a ,b,cu ) ←→ IR a ,b,c p  − − IR(a,b ,cu ) ←→ IR a,b ,c p  IR(a−,b−,cu ) ←→ IR a−,b−,c p    1 IR c p , Shc(a) ←→ IR c p , Shc a1     − − 1 1 − IR a ,b ,c , Shc a ,bu ←→ IR a ,b−,c1 , Shc a1,b p  IR(a,b,du ) ←→ IR a,b,d p  IR(a−,b,du ) ←→ IR a−,b,d p  IR(a,b−,du ) ←→ IR a,b−,d p  IR(a−,b−,du ) ←→ IR a−,b−,d p    1 IR d p , IRS(a) ←→ IR d p , IRS a1     IR a−,b−,d1 , IRS a1,bu ←→ IR a−,b−,d1 , IRS a1,b p    Grb(a) , IRS b p ←→ Grb a1 , IRS b1p   Grb(b) , Sos(du ) ←→ Grb b1  , Sos d1u  Grb(b) , Sos d p ←→ Grb b1 , Sos d1p     Grb b1 , Sos d1u ←→ Grb b1 , Sos d1p  Sos(du ) ←→ Sos d p  Shc(bu ) ←→ Shc b p  IRS(bu ) ←→ IRS b p Table 1: Rule set for the EGFR/Insulin crosstalk. has already been recruited). The site c of the IR can be phosphorylated (r20,r21,r22,r23) at a rate which depends on the number of recruited insulin molecules (in practice the rates of rules r21 and r22 are the same). Then, IR can recruit an adapter Shc (r24). Whenever IR is also bound to two insulin molecules, Shc can be phosphorylized (r25). Shc can then recruit Grb (r11). Yet, IR has an other way of recruiting a Grb. The site d of IR (r26,r27,r28,r29), and then recruit another adapter called IRS (r30) which can be activated when the insulin receptor is bound to two insulin molecules (r31). Then, IRS can recruit Grb (r32). Independently, Grb can recruit a protein Sos (r33,r34). And Sos can be activated (r35,r36) at the rate which may depend on the fact whether it is bound to a Grb, or not. Other rules describe the recuitment of Sos by Grb. And spontaneous (de)phosphorylation of Shc (r37) and IRS (r38). In this model, 2, 768 different complexes may occur. This number is mainly due to the fact that each dimer made of two proteins EGFR has 4 sites (the sites b and c for each EGFR) to recruit a Grb, which induces a small combinatorial blow up. Scanning the set of rules, one can notice that no rule tests both the site a and b of some proteins Grb. Thus, the partition {{a}, {b}} can be used safely for the sites of Grb, in the annotated contact map. As a consequence, the number of fragments is only 609. Unfortunatly, 160 Lumpability Abstractions of Rule-based Systems this is the only reduction that we can do (ie, the partition for the sites of any other kind of proteins, is the coarsest one). Last, one can notice that, given some additional hypotheses on the rate of some rule, that the sites a and b of IR have a symetric role in the system. We could consider this symetry toreduce the setof consid  1 ered fragments, by identifying two symetric fragments, such as IR a ,b , Ins a1 and IR a,b1 , Ins a1 . 6 Conclusions Reducing the complexity of combinatorial reaction mixtures is an important milestone towards simulation and analysis of large-scale realistic models of cellular signal transduction. In this paper we study a scalable reduction method, that is applicable to any rule-based specification. The reduction is sound and moreover complete, i.e. the sample traces of individual molecular species can be reconstructed from the traces of aggregated species in the reduced model. We put this method into the general context of abstractions of probabilistic transition system and show that it yields a sufficient condition for weak lumpability and that it is equivalent to backward Markov bisimulation. The reduction factor depends on the number of independent molecular events and is strictly smaller than that of the less-demanding reduction based on the differential semantics. A compelling problem for future work is thus to analyze differential fragments in the context of stochastic semantics and to obtain error bounds for this reduction as a function of the kinetic parameters of the system. References [1] Christel Baier, Boudewijn Haverkort, Holger Hermanns & Joost-Pieter Katoen (2003): Model-checking algorithms for continuous-time Markov chains. IEEE Transactions on Software Engineering 29(7), p. 2003. [2] Christel Baier & Holger Hermanns (1999). Weak Bisimulation for Fully Probabilistic Processes. [3] Michael L. Blinov, James R. Faeder & William S. Hlavacek (2004): BioNetGen: software for rule-based modeling of signal transduction based on the interactions of molecular domains. Bioinformatics 20, pp. 3289–3292. [4] Peter Buchholz (1994): Exact and Ordinary Lumpability in Finite Markov Chains. Journal of Applied Probability 31, no1, pp. 59–75. [5] Peter Buchholz (2008): Bisimulation relations for weighted automata. Theoretical Computer Science Volume 393, Issue 1-3, pp. 109–123. [6] Holger Conzelmann, Dirk Fey & Ernst D. Gilles (2008): Exact model reduction of combinatorial reaction. BMC Syst Biol 2(78), pp. 342–351. [7] Vincent Danos, Jérôme Feret, Walter Fontana, Russel Harmer & Jean Krivine (2008): Rule-based modelling, symmetries, refinements. In: FMSB’08, LNBI. [8] Vincent Danos & Cosimo Laneve (2003): Core Formal Molecular Biology. Theoretical Computer Science 325, pp. 69–110. [9] Josée Desharnais, Abbas Edalat & Prakash Panangaden (2002): Bisimulation for labelled Markov processes. Inf. Comput. 179(2), pp. 163–193. [10] Laurent Doyen, Thomas A. Henzinger & Jean-Francois Raskin (2008): Equivalence of Labeled Markov Chains. Int. J. Found. Comput. Sci. 19(3), pp. 549–563. Available at http://dx.doi.org/10.1142/ S0129054108005814. J. Feret, T. Henzinger, H. Koeppl, T. Petrov 161 [11] Jérôme Feret, Vincent Danos, Jean Krivine, Russ Harmer & Walter Fontana (2009): Internal coarse-graining of molecular systems. Proc. Natl. Acad. Sci. USA 106(16), pp. 6453–6458. Available at http://dx.doi. org/10.1073/pnas.0809908106. [12] Jerome Feret, Heinz Koeppl & Tatjana Petrov: Stochastic fragments: A framework for the exact reduction of the stochastic semantics of rule-based models. International Journal of Software and Informatics To appear. [13] Donald Gross & Douglas R. Miller (1984): The Randomization Technique as a Modeling Tool and Solution Procedure for Transient Markov Processes. Operations Research 32, no2, pp. 343–361. [14] Thomas A. Henzinger, Maria Mateescu & Verena Wolf (2009): Sliding Window Abstraction for Infinite Markov Chains. In: CAV, pp. 337–352. [15] Holger Hermanns (2002): Interactive Markov Chains And the Quest for Quantified Quality. Ph.D. thesis, University of Marburg. [16] Jane Hillston (1996): A compositional approach to performance modelling. Cambridge University Press, New York, NY, USA. [17] William S. Hlavacek, James R. Faeder, Michael L. Blinov, Alan S. Perelson & Byron Goldstein (2005): The complexity of complexes in signal transduction. Biotechnol. Bio-eng. 84, pp. 783–794. [18] William S. Hlavacek, James R. Faeder, Michael L. Blinov, Richard G. Posner, Michael Hucka & Walter Fontana (2006): Rules for Modeling Signal-Transduction Systems. Science’s STKE 2006(344). [19] D. Kannan Jianjun Paul Tian (2006): Lumpability and Commutativity of Markov Processes. Stochastic analysis and Applications 24, no3, pp. 685–702. [20] J. G. Kemeny, J. L. Snell & A. W. Knapp (1976): Denumerable Markov Chains. Springer-Verlag, New York, NY, USA. [21] John Kemeny & James L. Snell (1960): Finite Markov Chains. Van Nostrand. [22] Thomas G. Kurtz (1971): Limit Theorems for Sequences of Jump Markov Processes Approximating Ordinary Differential Processes. Journal of Applied Probability 8(2), pp. 344–356. [23] Kim G. Larsen & Arne Skou (1989): Bisimulation through probabilistic testing (preliminary report). In: POPL ’89, ACM, New York, NY, USA, pp. 344–352. [24] Donald A. McQuarrie (1967): Stochastic approach to chemical kinetics. Journal of Applied Probability 4(3), pp. 413–478. [25] Elaine Murphy, Vincent Danos, Jerome Feret, Russell Harmer & Jean Krivine (2009): Rule Based Modelling and Model Refinement. In: Elements of Computational Systems Biology, Wiley. [26] Gerardo Rubino & Bruno Sericola (1991): A Finite Characterization of Weak Lumpable Markov Processes. Part I: The Discrete Time Case. Stochastic processes and their applications 38, pp. 195–204. [27] Gerardo Rubino & Bruno Sericola (1993): A Finite Characterization of Weak Lumpable Markov Processes. Part II: The Continuous Time Case. Stochastic processes and their applications 45, pp. 115–125. [28] Roger B. Sidje, Kevin Burrage & Shev MacNamara (2007): Inexact Uniformization Method for Computing Transient Distributions of Markov Chains. SIAM Journal on Scientific Computing 29, issue 6, pp. 2562– 2580. [29] Ana Sokolova & Erik P. de Vink (2003): On relational properties of lumpability. In: Proceedings of the 4th PROGRESS. [30] Jeremy Sproston & Susanna Donatelli (2004): Backward Stochastic Bisimulation in CSL Model Checking. In: QEST ’04, Washington, DC, USA, pp. 220–229. [31] Christopher T. Walsh (2006): Posttranslation Modification of Proteins: Expanding Nature’s Inventory. Roberts and Co. Publisher.