The Pervasive Workflow: A Decentralized Workflow System Supporting Long-Running Transactions

Silvan T. Golega

IEEE TRANSACTIONS ON SYSTEMS, MAN, AND CYBERNETICS—PART C: APPLICATIONS AND REVIEWS, VOL. 38, NO. 3, MAY 2008

Frederic Montagut, Student Member, IEEE, Refik Molva, Member, IEEE, and Silvan Tecumseh Golega

Abstract—Workflow technologies are becoming pervasive in that they enable the execution of business processes in distributed and ubiquitous computing environments. Because workflows are long-running transactions, their execution in environments without dedicated infrastructure raises transactional requirements, owing to the dynamic nature of the resources available to run a workflow instance and to the integration of relaxed atomicity constraints at both design and instantiation time. In this paper, we propose an adaptive transactional protocol for the pervasive workflow model developed in a previous work to support the execution of business processes in the pervasive setting. The execution of this protocol takes place in two phases. First, candidate business partners are assigned to tasks using an algorithm wherein the selection process is based on both functional and transactional requirements. The workflow execution then proceeds through a hierarchical coordination protocol managed by the workflow initiator and controlled by a decision table computed as an outcome of the business partner assignment procedure. The resulting workflow execution is compliant with the defined consistency requirements, and the coordination decisions depend on the transactional characteristics offered by the partners assigned to each task. An implementation of our theoretical results relying on Web Ontology Language for Services (OWL-S) and Business Process Execution Language (BPEL) technologies is further detailed as a proof of concept.

Index Terms—Decentralized workflows, transaction-aware composition, transactional consistency.

Manuscript received November 27, 2006; revised February 25, 2007, June 8, 2007, and December 18, 2007. This work was supported in part by European Union (EU) Information on Science and Technology (IST) Directorate General as a part of FP6 IST projects MOSQUITO, in part by the R4eGov, and in part by the Systems, Applications and Products (SAP) Research Laboratories, France S.A.S. This paper was recommended by Guest Editor H. Patrick. F. Montagut is with the Systems, Applications and Products (SAP) Research Laboratories, France, 06250 Mougins, France (e-mail: frederic.montagut@sap.com). R. Molva is with the Institut Eurecom, 06904 Sophia-Antipolis, France (e-mail: refik.molva@eurecom.fr). S. T. Golega is with the Hasso-Plattner-Institut, 900460-14440 Potsdam, Germany (e-mail: silvangolega@gmail.com). Digital Object Identifier 10.1109/TSMCC.2008.919184

I. INTRODUCTION

Workflow technologies are becoming pervasive in that they enable the execution of long-running business processes and transactions in distributed and ubiquitous environments [1]–[3]. Adequate execution support for pervasive workflows has to cope with the lack of dedicated infrastructure for management and control tasks in order to provide business users with means to leverage the resources available in their surrounding environment. To that effect, a first step has been achieved by the design of a fully decentralized workflow architecture based on the service-oriented computing (SOC) paradigm [4]. Featuring a dynamic assignment of tasks to workflow partners, this architecture allows users to initiate workflows in any environment where surrounding users' resources can be advertised by various means, including a service discovery mechanism. Yet, this architecture does not provide any guarantee on the consistency of the outcome reached by the process execution. Considering the lack of reliability inherent in distributed environments, data and transaction consistency is a main issue.
Transactional requirements raised by the execution of processes on top of the pervasive workflow infrastructure are twofold: on the one hand, the workflow execution is dynamic in that the workflow partners offering different characteristics can be assigned to tasks depending on the resources available at run-time, and on the other hand, atomicity of the workflow execution can be relaxed as intermediate results produced by the workflow may be kept despite the failure of one partner. Existing transactional protocols [5], [6] are not adapted to solve this paradigm as they do not offer enough flexibility to cope, for instance, with the run-time assignment of computational tasks. In this paper, we propose an adaptive transactional protocol for the pervasive workflow management system developed in [4]. The execution of this protocol takes place in two phases. First, business partners are assigned to tasks using an algorithm wherein the selection process is based on functional and transactional requirements. These transactional requirements are defined at the workflow design stage using the acceptable termination states (ATS) model. The workflow execution further proceeds through a hierarchical coordination protocol managed by the workflow initiator and controlled using a decision table computed as an outcome of the business partner assignment procedure. The resulting workflow execution is compliant with the defined consistency requirements and the coordination decisions depend on the characteristics of the partners assigned to each task. Besides, it should be noted that the practical solutions that are presented in this paper do not only answer specific requirements introduced by the pervasive workflow model but are sufficiently generic to be applied to other workflow architectures supporting long-running transactions. The remainder of the paper is organized as follows. Section II introduces preliminary definitions and the methodology underpinning our approach. 
We present an example of pervasive workflow execution in Section III for the purpose of illustrating our results throughout the paper. Section IV introduces a detailed description of the transactional model used to represent the characteristics offered by business partners. Section V describes how transactional requirements expressed by means of the ATS model are derived from the inherent properties of termination states. Sections VI and VII present the transaction-aware partner assignment procedure and the associated coordination protocol, respectively. An implementation of our theoretical results based on Web services technologies, including Web Ontology Language for Services (OWL-S) [7] and Business Process Execution Language (BPEL) [8], is presented in Section VIII. Section IX discusses related work, while Section X presents concluding remarks.

1094-6977/$25.00 © 2008 IEEE

Fig. 1. Pervasive workflow run-time specification.

II. DEFINITIONS AND GOALS STATEMENT

Defining a transactional protocol for pervasive workflows raises challenges that are mainly due to the flexibility of their execution and to the lack of a dedicated infrastructure in charge of management and control tasks. After a short overview of the features offered by the pervasive workflow architecture, we specify the set of requirements in terms of transactional consistency that must be met by the execution of pervasive workflows.

A. Pervasive Workflows

In this section, we present the pervasive workflow model that was designed in [4]. The pervasive workflow concept introduces a workflow management system supporting the execution of business processes in environments whereby computational resources offered by each business partner can potentially be used by any party within the surroundings of that business partner.
This workflow management system features a distributed architecture characterized by two objectives.
1) Fully decentralized architecture: The management of the workflow execution is distributed among the partners taking part in a workflow instance in order to cope with the lack of dedicated infrastructure in the pervasive setting.
2) Dynamic assignment of business partners to workflow tasks: The peers in charge of executing the workflow can be discovered at run-time based on available resources.

The run-time specification of the pervasive workflow architecture is depicted in Fig. 1. Having designed an abstract representation of the workflow whereby business partners are not yet assigned to functional tasks, the workflow initiator $d_1$ launches the execution. The initiator $d_1$ executes a first set of tasks $t_1$ (step 1) before discovering in its surrounding environment a partner able to perform the next tasks $t_2$ (step 2). Once the discovery phase is complete, workflow data are transferred from the peer that performed the discovery to the discovered one (step 3), and the workflow execution proceeds with the processing of the next set of tasks $t_3$ (step 4). The sequence composed of the discovery request, the transfer of workflow data, and the execution of a set of tasks is iterated until the final vertex. In order to relax the availability constraints of pervasive environments, the execution is stateless: after the completion of a set of tasks, each business partner sends all workflow data to the next partner involved in the workflow execution and thus does not have to remain online until the end of a workflow instance. Along with workflow application data, the flow of data among business partners includes the abstract representation of the workflow, which consists of the execution plan and the functional requirements associated with each workflow step.
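The iteration of execution, discovery, and data transfer described above can be sketched as follows. This is only a minimal illustration under stated assumptions, not the architecture of [4]: the function name `run_workflow`, the `discover` callback, and the dictionary layout of the workflow data are hypothetical, and discovery and transfer are replaced by in-process calls.

```python
def run_workflow(vertices, discover, initiator):
    """Iterate execute -> discover -> transfer until the final vertex.

    vertices  : ordered vertex names (hypothetical linear execution plan)
    discover  : callable returning a partner for a vertex (assumed interface)
    initiator : the partner that executes the first vertex
    """
    # All workflow data, including the abstract plan, travels with the hand-off,
    # so no partner keeps state once it has passed the data on.
    workflow_data = {"plan": list(vertices), "trace": []}
    partner = initiator
    for index, vertex in enumerate(vertices):
        # The current partner executes its set of tasks.
        workflow_data["trace"].append((partner, vertex))
        if index + 1 < len(vertices):
            # Discover a partner for the next vertex; the "transfer" is simply
            # the same workflow_data object being used in the next iteration.
            partner = discover(vertices[index + 1])
    return workflow_data


result = run_workflow(["t1", "t2", "t3"], lambda v: "partner_for_" + v, "d1")
print(result["trace"])
```

Because the whole `workflow_data` object is passed along, a partner in this sketch can go offline as soon as its hand-off is done, which mirrors the stateless execution described in the text.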
We denote by $W$ the abstract representation of a pervasive workflow, with $W = (t_a)_{a \in [1,q]}$, where $t_a$ denotes a vertex, that is, the set of workflow tasks performed by a business partner from the receipt of workflow data until the transfer of these data to the next partner. The instance of $W$ wherein $q$ business partners $(d_a)_{a \in [1,q]}$ have been assigned to the sets $(t_a)_{a \in [1,q]}$ is denoted by $W_d = (d_a)_{a \in [1,q]}$.

B. Assuring Consistency of Pervasive Workflows

As a first step towards assuring workflow consistency, one has to be able to express transactional requirements as part of the workflow model. We therefore want to offer the possibility to coordinate some tasks of a pervasive workflow instance in order to assure the consistency of termination states. Our approach consists in partitioning the specification of a pervasive workflow into subsets, or zones, and identifying some zones, called critical zones, wherein transactional requirements defined by designers have to be fulfilled.

Definition 2.1. We define a critical zone $C$ of a workflow $W$ as a subset of $W$ composed of contiguous vertices that must meet some transactional requirements. We distinguish within $C$:
1) $(m_k)_{k \in [1,i]}$, the $i$ vertices whose tasks only modify mobile or volatile data;
2) $(v_k)_{k \in [1,j]}$, the $j$ vertices whose tasks modify data other than mobile ones, $v_1$ being the first vertex of $C$.
The business partner assigned to the vertex $v_k$ (respectively, $m_k$) is denoted by $d_k^v$ (respectively, $d_k^m$), and the instance of $C$ is denoted by $C_d$.

We adopt a simple transactional protocol in which the coordination is managed in a centralized manner by $d_1^v$, the partner assigned to $v_1$. The role of the coordinator consists in making decisions based on the transactional requirements defined for the critical zone, given the overall state of the workflow execution, so that the critical zone execution can reach a consistent state of termination.
The coordination is assured in a hierarchical way: the business partners $(d_k^v)_{k \in [1,j]}$, which are subcoordinators, report directly to $d_1^v$, whereas the partners $d_k^m$ report to the most recently executed business partner $d_x^v$ (see footnote 1). For the sake of simplicity, we consider that the set of business partners $\{d_l^m, d_{l+1}^m, \ldots, d_p^m\}$ reporting to the business partner $d_x^v$ forms an abstract partner named $d_{l,p}^m$ that is assigned to the abstract vertex $m_{l,p}$. $C$ therefore denotes a set of $n$ vertices (abstract or not), $C = (c_a)_{a \in [1,n]}$. This reporting strategy based on the type of business partners is depicted in Fig. 2.

Footnote 1: Business partner of type $d_k^v$ that is located on the same branch of the workflow as these $d_k^m$ business partners and that has most recently completed its execution.

Fig. 2. Protocol actors.

Within the pervasive workflow model, the workflow execution is performed by business partners that are assigned to vertices at run-time. Considering the diversity of business partners encountered in the pervasive setting, we assume that these partners might offer various transactional properties in addition to different functional capabilities. For instance, one business partner may be able to compensate the effects of a given operation or to re-execute the operation after a failure, whereas another business partner has neither of these capabilities. It thus becomes necessary to select the business partners executing a critical zone of a pervasive workflow not only based on functional requirements but also according to transactional ones. The business partner assignment procedure, through which business partners are assigned to vertices using a matchmaking procedure based on functional requirements, has to be augmented to integrate transactional ones. The purpose of the business partner assignment procedure is to build an instance of $C$ consistent with the transactional requirements imposed by designers. It is thus required to discover all the business partners that will be involved in the execution of a given critical zone prior to the execution, in order to verify the existence of a set of business partners that can be assigned to $C$. Once the instance of $C$ has been created, the execution supported by the coordination protocol can start. The execution of the coordination protocol therefore consists of two phases: the first phase includes the discovery and assignment of business partners to vertices, and the second one is the actual execution.

C. Methodology

As described in Section II-B, a coordination protocol designed to support the execution of pervasive workflows has to meet two basic requirements. First, business partners have to be assigned based on a transaction-aware process. Second, a run-time mechanism should process and assure the coordination of the execution in the face of failure scenarios. In order to achieve the first, we capitalize on the work presented in [9], whose results are recalled later in this paper. In our approach, the partners that are part of a critical zone instance $C_d$ are selected according to their transactional properties by means of a matchmaking procedure. We therefore need to first specify the semantics associated with the transactional properties offered by business partners, since the matchmaking procedure is based on these semantics. These semantics are also used to define a tool allowing workflow designers to specify their transactional requirements for a given critical zone. Based on these transactional requirements, business partners can be assigned to workflow vertices. Finally, once $C_d$ is formed, we can proceed toward the second goal by expressing the coordination rules inherent to $C_d$ and designing the actual coordination protocol in charge of processing those rules. This methodology basically follows the steps of the transactional pervasive workflow lifecycle from instantiation to execution, as depicted in Fig. 3.

III. MOTIVATING EXAMPLE

In this section, we describe a motivating example that will be used throughout the paper to illustrate the design methodology. We consider a workflow executed during a computer fair where clients, retailers, and hardware providers can electronically exchange orders and invoices. The workflow used in this example is depicted in Fig. 4. Alice would like to buy a new computer and issues a call for offers to three available retailers. After having received some offers, she decides to go for the cheapest one and therefore contacts the corresponding retailer, Bob. Bob initiates the critical zone $C_1$ by sending an invoice to Alice and contacting his hardware provider Jack (vertex $v_1$). Alice pays using Bob's trusted payment platform (vertex $v_2$). In the meantime, Jack receives the order from Bob and sends him an invoice (vertex $m_1$), which Bob pays (vertex $v_3$) using Jack's trusted payment platform. Afterwards, Bob starts to build the computer and ships it to Alice (vertex $v_4$). In this example, we need to define transactional requirements since, for instance, Bob would like to have the opportunity to cancel his payment to Jack if Alice's payment is not made. Likewise, Alice would like to be refunded if Bob does not manage to assemble and ship the computer. These different scenarios refer to characteristics offered by the business partners or services assigned to the workflow tasks. For example, the payment platform should be able to compensate Alice's payment, and Jack's payment platform should offer the possibility to cancel an order. Yet, it is no longer necessary for Jack to provide the cancellation option if the payment platform claims that it is reliable and not prone to transaction errors.
In this example, we do not focus on the trust relationships between the different entities and therefore assume that each of them is trustworthy; we are rather interested in the transactional characteristics offered by each participant.

Fig. 3. Methodology.

Fig. 4. Workflow example: deal at a fair.

Fig. 5. Termination states of $C_1$.

Fig. 6. State model.

IV. TRANSACTIONAL MODEL

In this section, we present and extend the semantics specifying the transactional properties offered by business partners described in [9] before specifying the consistency evaluation tool associated with these semantics. The semantic model is based on the "transactional Web service description" defined in [10].

A. Transactional Properties of Business Partners

A model specifying semantically the transactional properties of Web services is presented in [10]. This model is based on the classification of computational tasks made in [11] and [12], which considers three different types of transactional properties. A task and, by extension, a business partner executing this task can be defined as follows.
1) Compensatable: The data modified by the task can be rolled back.
2) Retriable: The task is sure to complete successfully after a finite number of tries.
3) Pivot: The task is neither compensatable nor retriable.

In the definition of a critical zone, we distinguish two sets of business partners: $(d_k^m)_{k \in [1,i]}$, which only modify mobile or volatile data, and $(d_k^v)_{k \in [1,j]}$, which only modify data other than mobile ones, e.g., remote database, production of an item, etc. Based on this distinction, the aforementioned transactional model has to be extended.
This model describes the modification of permanent data and is thus only relevant to database systems, whereas the pervasive setting introduces, in addition, transactional properties representing business partners' hardware characteristics such as battery level, reliability, connectivity, etc. A new transactional property representing the reliability of a business partner is therefore introduced: a business partner is reliable (respectively, unreliable) if it is highly unlikely (respectively, likely) that the business partner will fail due to hardware failures (battery level, communication medium access, etc.).

To properly detail this model, we can map the transactional properties to the state of the data modified by the business partners during the execution of computational tasks. This mapping is depicted in Fig. 6. Basically, data can be in three different states: initial (0), unknown (x), and completed (1). State (0) means that the vertex execution has not yet started (initial), that the execution has been aborted before starting, or that the modified data have been compensated after completion. State (1) means that the vertex has been properly completed. State (x) means that the execution is active, that the execution has been stopped (canceled) before completion, that the execution has failed, or that a hardware failure (Hfailed) has happened.

Fig. 7. State diagrams of business partners $d_k^v$ and $d_k^m$.

These transactional properties make it possible to define eight types of business partners: (reliable, retriable) $(r_l, r_t)$; (reliable, compensatable) $(r_l, c)$; (reliable, retriable, and compensatable) $(r_l, r_t c)$; (reliable, pivot) $(r_l, p)$; and the four corresponding unreliable types $(ur_l)$.
We must distinguish within this model:
a) the inherent termination states failed, completed, and Hfailed, which result from the normal course of the task execution;
b) the forced termination states compensated, aborted, and canceled, which result from a coordination message received during a coordination protocol instance and forcing a task execution to either stop or roll back.
In the state diagrams of Figs. 6 and 7, plain and dashed lines represent the inherent transitions leading to inherent states and the forced transitions leading to forced states, respectively. The transactional properties of the business partners are only differentiated by the states failed, compensated, and Hfailed, which specify the retriability, compensatability, and reliability aspects, respectively.

Definition 4.1. We have for a given partner $d$:
1) failed is not a termination state of $d$ $\Leftrightarrow$ $d$ is retriable;
2) compensated is a termination state of $d$ $\Leftrightarrow$ $d$ is compensatable;
3) Hfailed is not a termination state of $d$ $\Leftrightarrow$ $d$ is reliable.

From the state transition diagram, we can also derive some simple rules. The states failed, completed, Hfailed, and canceled can only be reached if the business partner is in the state active. The state compensated can only be reached if the partner is in the state completed. The state aborted can only be reached if the partner is in the state initial.

Regarding the distinction made on the nature of vertices within a critical zone, we specify some requirements for the business partners selected for a critical zone execution. On the one hand, as the partners $(d_k^v)_{k \in [1,j]}$ modify sensitive and permanent data, we consider that they are required to be reliable. There are therefore four types of $d_k^v$ partners: $(r_l, r_t)$, $(r_l, c)$, $(r_l, r_t c)$, and $(r_l, p)$. On the other hand, as the business partners of type $d_k^m$ only modify mobile and volatile data, we first consider that they are retriable; moreover, compensatability is not required for volatile data.
Second, we assume that these tasks can be executed by unreliable partners; as a result, there are only two types of $d_k^m$ partners: $(r_l, r_t)$ and $(ur_l, r_t)$. If one of the $d_k^m$ partners that are part of the abstraction $d_{l,p}^m$ is unreliable, then $d_{l,p}^m$ is unreliable; otherwise, $d_{l,p}^m$ is reliable. Fig. 7 depicts the transition diagrams for the six types of transactional partners that can be encountered.

B. Termination States

The crucial point of the transactional model specifying the transactional properties of business partners is the analysis of their possible termination states. The ultimate goal is to be able to define consistent termination states for a critical zone, i.e., to determine, for each partner executing a critical zone vertex, which termination states it is allowed to reach.

Definition 4.2. We define the termination state operator $ts(x)$, which specifies the possible termination states of the element $x$. This element $x$ can be one of the following.
1) A partner $d$, with $ts(d) \in \{\text{aborted}, \text{canceled}, \text{failed}, \text{Hfailed}, \text{completed}, \text{compensated}\}$.
2) A vertex $c$, with $ts(c) \in \{\text{aborted}, \text{canceled}, \text{failed}, \text{Hfailed}, \text{completed}, \text{compensated}\}$.
3) A critical zone composed of $n$ vertices $C = (c_a)_{a \in [1,n]}$, with $ts(C) = (ts(c_1), ts(c_2), \ldots, ts(c_n))$.
4) An instance $C_d$ of $C$ composed of $n$ partners $C_d = (d_a)_{a \in [1,n]}$, with $ts(C_d) = (ts(d_1), ts(d_2), \ldots, ts(d_n))$.
The operator $TS(x)$ represents the finite set of all possible termination states of the element $x$, $TS(x) = (ts_k(x))_{k \in [1,j]}$. In particular, we have $TS(C_d) \subseteq TS(C)$, since the set $TS(C_d)$ represents the actual termination states that can be reached by $C_d$ according to the transactional properties of the partners assigned to $C$. We also define, for $x$ a critical zone or a critical zone instance and $a \in [1,n]$:
1) $ts(x, c_a)$: the value of $ts(c_a)$ in $ts(x)$;
2) $tscomp(x)$: the termination state of $x$ such that $\forall a \in [1,n]$, $ts(x, c_a) = \text{completed}$.
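Definitions 4.1 and 4.2 can be illustrated with a short Python sketch. It is only an illustration under stated assumptions: the function names and the encoding of a partner as a (reliable, retriable, compensatable) triple are hypothetical, and the Cartesian product used for an instance is an over-approximation of $TS(C_d)$, since the actual set also depends on the workflow structure and on the transition rules of the state diagrams.

```python
from itertools import product

# The six termination states listed in Definition 4.2.
ALL_STATES = {"aborted", "canceled", "failed", "Hfailed", "completed", "compensated"}


def partner_termination_states(reliable, retriable, compensatable):
    """Termination states a partner may reach, following Definition 4.1."""
    states = set(ALL_STATES)
    if retriable:
        states.discard("failed")       # failed is excluded iff the partner is retriable
    if not compensatable:
        states.discard("compensated")  # compensated is allowed iff compensatable
    if reliable:
        states.discard("Hfailed")      # Hfailed is excluded iff the partner is reliable
    return states


def instance_upper_bound(partners):
    """Over-approximation of TS(C_d): every combination of per-partner states.

    partners is a list of (reliable, retriable, compensatable) triples
    (a hypothetical encoding chosen for this sketch).
    """
    per_partner = [partner_termination_states(*p) for p in partners]
    return set(product(*per_partner))


# A reliable pivot partner: neither retriable nor compensatable.
print(sorted(partner_termination_states(True, False, False)))
```

In this encoding, a pivot partner is simply one for which both `retriable` and `compensatable` are false.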
For the remainder of the paper, $C = (c_a)_{a \in [1,n]}$ denotes a critical zone of $n$ vertices and $C_d = (d_a)_{a \in [1,n]}$ an instance of $C$.

C. Transactional Consistency Tool

We use the ATS model [13] as the consistency evaluation tool for the critical zone. ATS defines the termination states that a critical zone is allowed to reach so that its execution is deemed consistent.

Fig. 8. $ATS(C_1)$ and available business partners.

Definition 4.3. An $ATS(C)$ is a subset of $TS(C)$ whose elements are considered consistent by workflow designers for a specific execution of $C$. A consistent termination state of $C$ is called an acceptable termination state $ats_k(C)$; thus, $ATS(C) = (ats_k(C))_{k \in [1,i]}$. A set $ATS(C)$ specifies the transactional requirements defined by designers for a specific execution of $C$.

$ATS(C)$ and $TS(C)$ can be represented by a table that defines, for each termination state, the tuple of termination states reached by each vertex, as depicted in Figs. 5 and 8. Depending on the application, different ATS tables can, of course, be specified by designers for the same critical zone $C$; for the sake of readability, we do not introduce in this paper an index [as in $ATS_i(C)$] in the notation $ATS(C)$. As mentioned in the definition, the specification of $ATS(C)$ is done at the workflow design phase. $ATS(C)$ is mainly used as a decision table for a coordination protocol so that $C_d$ can reach an acceptable termination state knowing the termination state of at least one vertex. The coordination decision made for a given state of the critical zone execution, i.e., the termination state that has to be reached, has to be unique; this is the main characteristic of a coordination protocol.
To meet this requirement, $ATS(C)$, which is used as input for the coordination decision-making process, has to satisfy some properties that are specified in the next section.

V. FORMING ATS(C)

In this section, the definitions and theorems introduced and proved in [9] are recalled and adapted to specify $ATS(C)$ in the scope of the pervasive workflow model. The approach is based on the fact that $ATS(C) \subseteq TS(C)$; thus, $ATS(C)$ inherits the characteristics of $TS(C)$. For the sake of clarity in what follows, we assume that only one business partner can fail at a time during a pervasive workflow instance. Our approach can be extended easily to concurrent failure scenarios, as discussed later in Section VI-D.

As explained earlier, the unicity of the coordination decision during the execution of a coordination protocol is a major requirement. We thus try here to identify the elements of $TS(C)$ that correspond to different coordination decisions given the same state of a workflow execution. There are two situations whereby a coordination protocol has different possible coordination decisions given the state of a workflow vertex. Let $a, b \in [1,n]$ and assume that the vertex $c_b$ has failed.
1) The vertex $c_a$ is in the state completed, and either it remains in this state or it is compensated.
2) The vertex $c_a$ is in the state active, and either it is canceled or the coordinator lets it reach the state completed.
From these two statements, we define the notions of incompatibility from a coordination perspective and of flexibility.

Definition 5.1. Two termination states $ts_k(C)$ and $ts_l(C)$ of $C$ are said to be incompatible from a coordination perspective iff $\exists\, a, b \in [1,n]$ such that $ts_k(C, c_a) = \text{completed}$, $ts_k(C, c_b) = ts_l(C, c_b) \in \{\text{failed}, \text{Hfailed}\}$, and $ts_l(C, c_a) = \text{compensated}$. Otherwise, $ts_l(C)$ and $ts_k(C)$ are said to be compatible from a coordination perspective.
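The incompatibility test of Definition 5.1 can be sketched in Python as follows. This is a minimal illustration with assumptions that are not part of the original text: a termination state of $C$ is encoded as a tuple of state names indexed from 0, the function name `incompatible` is hypothetical, and the condition is checked in both directions because the definition speaks of an unordered pair of termination states.

```python
FAILURES = {"failed", "Hfailed"}


def incompatible(ts_k, ts_l):
    """Return True if two termination states of C are incompatible (Definition 5.1).

    ts_k and ts_l are equal-length tuples of state names, one entry per vertex
    (hypothetical encoding for this sketch).
    """
    def one_way(x, y):
        n = len(x)
        # Look for a vertex a that is completed in x but compensated in y,
        # while some vertex b fails in the same way in both x and y.
        return any(
            x[a] == "completed"
            and y[a] == "compensated"
            and x[b] == y[b]
            and x[b] in FAILURES
            for a in range(n)
            for b in range(n)
        )

    # Symmetric check: an assumption of this sketch.
    return one_way(ts_k, ts_l) or one_way(ts_l, ts_k)


# Same failure of the second vertex, but two different recovery choices
# for the first vertex: the two states cannot belong to one strategy.
print(incompatible(("completed", "failed"), ("compensated", "failed")))
```

Two states that differ only in whether an active vertex was canceled are not flagged by this test; that case is covered by the flexibility notion rather than by incompatibility.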
The value in $\{\text{compensated}, \text{completed}\}$ reached by a vertex $c_a$ in a termination state $ts_k(C)$ whereby $ts_k(C, c_b) \in \{\text{failed}, \text{Hfailed}\}$ is called the recovery strategy of $c_a$ against $c_b$ in $ts_k(C)$.

Definition 5.2. A vertex $c_a$ is flexible against another vertex $c_b$ iff $\exists\, k \in [1,j]$ such that $ts_k(C, c_b) \in \{\text{failed}, \text{Hfailed}\}$ and $ts_k(C, c_a) = \text{canceled}$. Such a termination state is said to be flexible to $c_a$ against $c_b$. The set of termination states of $C$ flexible to $c_a$ against $c_b$ is denoted by $FTS(c_a, c_b)$.

From these definitions, we now study the termination states of $C$ according to the compatibility and flexibility criteria in order to identify the termination states that follow a common coordination strategy.

Definition 5.3. A termination state $ts_k(C)$ of $C$ is called a generator of a vertex $c_a$ iff $ts_k(C, c_a) \in \{\text{failed}, \text{Hfailed}\}$ and $\forall b \in [1,n]$ such that $c_b$ is executed before or in parallel with $c_a$, $ts_k(C, c_b) \in \{\text{completed}, \text{compensated}\}$. The set of termination states of $C$ compatible with $ts_k(C)$, generator of $c_a$, is denoted by $CTS(ts_k(C), c_a)$. The set $CTS(ts_k(C), c_a)$ specifies all the termination states of $C$ that follow the same recovery strategy as $ts_k(C)$ against $c_a$.

Definition 5.4. Let $ts_k(C) \in TS(C)$ be a generator of $c_a$. Coordinating an instance $C_d$ of $C$ in case of the failure of $c_a$ consists in choosing the recovery strategy of each vertex of $C$ against $c_a$ and the $z_a < n$ vertices $(v_{a_i})_{i \in [1,z_a]}$ flexible to $c_a$ whose execution is not canceled when $c_a$ fails. As unreliable business partners modify only volatile data, we consider that cancellation is always performed if a task execution is still active when a failure occurs. The set $(v_{a_i})_{i \in [1,z_a]}$ is thus only composed of vertices of type $v_k$. We call coordination strategy of $C_d$ against $c_a$ the set
$CS(C_d, ts_k(C), (v_{a_i})_{i \in [1,z_a]}, c_a) = CTS(ts_k(C), c_a) - \bigcup_{i=1}^{z_a} FTS(v_{a_i}, c_a)$.
If the partner $d_a$ assigned to $c_a$ is retriable, then $CS(C_d, ts_k(C), (v_{a_i})_{i \in [1,z_a]}, c_a) = \emptyset$.
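The set difference that defines a coordination strategy can be sketched as follows. This is a minimal Python illustration under stated assumptions: termination states are tuples of state names indexed from 0, the set $CTS(ts_k(C), c_a)$ is assumed to have been computed beforehand and is passed in as `cts`, and the names `fts` and `coordination_strategy` are hypothetical.

```python
FAILURES = {"failed", "Hfailed"}


def fts(all_states, a, b):
    """FTS(c_a, c_b): termination states where c_b fails and c_a is canceled."""
    return {ts for ts in all_states if ts[b] in FAILURES and ts[a] == "canceled"}


def coordination_strategy(cts, all_states, kept_running, a):
    """CTS minus the flexible states of every vertex kept running when c_a fails.

    cts          : set of termination states compatible with the chosen generator
                   of c_a (assumed to be precomputed by the caller)
    all_states   : TS(C) as a set of tuples
    kept_running : indices of the flexible vertices that are NOT canceled
    a            : index of the failing vertex c_a
    """
    excluded = set()
    for v in kept_running:
        excluded |= fts(all_states, v, a)
    return set(cts) - excluded


# Three vertices; the last one (index 2) fails.
ts_cancel = ("completed", "canceled", "failed")
ts_finish = ("completed", "completed", "failed")
TS_C = {ts_cancel, ts_finish, ("completed", "completed", "completed")}
CTS = {ts_cancel, ts_finish}

# If vertex 1 is allowed to finish, the states in which it is canceled are removed.
print(coordination_strategy(CTS, TS_C, kept_running=[1], a=2))
```

With an empty `kept_running` list, every flexible vertex is canceled on failure and the strategy is the whole of `cts`.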
$C_d$ is said to be coordinated according to $CS(C_d, ts_k(C), (v_{a_i})_{i \in [1,z_a]}, c_a)$ if, in case of the failure of $c_a$, $C_d$ reaches a termination state in $CS(C_d, ts_k(C), (v_{a_i})_{i \in [1,z_a]}, c_a)$. Of course, this assumes that the transactional properties of $C_d$ are sufficient to reach $ts_k(C)$.

Given a vertex $c_a$, the idea is to classify the elements of $TS(C)$ using the sets of termination states compatible with the generators of $c_a$. Using this approach, we can identify the different recovery strategies and the coordination strategies associated with the failure of $c_a$ as we decide which vertices can be canceled. Defining $ATS(C)$ therefore amounts to deciding at design time which termination states of $C$ are consistent. $ATS(C)$ is given as input to a coordination protocol in order to provide it with a set of rules that leads to a unique coordination decision in any case. According to the definitions and properties introduced earlier, we can now state explicitly some rules on $ATS(C)$ so that the unicity requirement of coordination decisions is respected.

Definition 5.5. Let $ts_k(C) \in TS(C)$ such that $ts_k(C, c_a) \in \{\text{failed}, \text{Hfailed}\}$ and $ts_k(C) \in ATS(C)$. $ATS(C)$ is valid iff $\exists!\, l \in [1,j]$ such that $ts_l(C)$ is a generator of $c_a$ compatible with $ts_k(C)$ and $CTS(ts_l(C), c_a) - \bigcup_{i=1}^{z_a} FTS(v_{a_i}, c_a) \subset ATS(C)$ for a set of vertices $(v_{a_i})_{i \in [1,z_a]}$ flexible to $c_a$.

A valid $ATS(C)$ therefore contains, for every $ts_k(C)$ in which a vertex fails, a unique coordination strategy associated with this failure, and the termination states contained in this coordination strategy are compatible with $ts_k(C)$. In Fig. 8, an example of a possible ATS is presented for the critical zone $C_1$. It simply consists in selecting the termination states of the table $TS(C_1)$ that we consider consistent while respecting the validity rule for the resulting $ATS(C_1)$.
For example, here the payment of Alice has to be compensated if Bob fails to deliver the computer, as specified in $ats_2 = ts_4$.

VI. ASSIGNING BUSINESS PARTNERS USING ATS

In this section, we specify the main steps of the partner assignment procedure whose underpinning theorems are proved in [9]. The transaction-aware business partner assignment procedure aims at assigning $n$ business partners to the $n$ vertices $c_a$ in order to create an instance of $C$ acceptable with respect to a valid $ATS(C)$. We first define a validity criterion for the instance $C_d$ of $C$ with respect to $ATS(C)$; the business partner assignment algorithm is then detailed. Finally, we specify the coordination strategy associated with the instance created from our assignment scheme.

A. Acceptability of $C_d$ With Respect To $ATS(C)$

Definition 6.1. $C_d$ is an acceptable instance of $C$ with respect to $ATS(C)$ iff $TS(C_d) \subseteq ATS(C)$.

Now we express the condition $TS(C_d) \subseteq ATS(C)$ in terms of coordination strategies. The termination state generator of $c_a$ present in $ATS(C)$ is denoted as $ts_{k_a}(C)$. The set of vertices whose execution is not canceled when $c_a$ fails is denoted $(v_{a_i})_{i \in [1, z_a]}$. We get Theorem 6.2 [9].

Theorem 6.2. $TS(C_d) \subseteq ATS(C)$ iff $\forall\, a \in [1, n]$, $CS(C_d, ts_{k_a}(C), (v_{a_i})_{i \in [1, z_a]}, c_a) \subset ATS(C)$.

It should be noted that if $failed$ (respectively, $Hfailed$) $\notin ATS(C, c_a)$, where $ATS(C, c_a)$ represents the acceptable termination states of the vertex $c_a$ in $ATS(C)$, then $CS(C_d, ts_{k_a}(C), (v_{a_i})_{i \in [1, z_a]}, c_a) = \emptyset$.

B. Transaction-Aware Assignment Procedure

The business partner assignment algorithm uses $ATS(C)$ as a set of requirements during the partner assignment procedure, and thus identifies those partners whose transactional properties match the transactional requirements associated with vertices defined in $ATS(C)$. The assignment procedure is an iterative process: partners are assigned to vertices sequentially.
At each step $i$, the assignment procedure therefore generates a partial instance of $C$ denoted $C_d^i$. $TS(C_d^i)$ refers to the termination states of $C$ that can be reached based on the transactional properties of the $i$ partners that are already assigned. Intuitively, the acceptable termination states refer to the degree of flexibility offered when choosing the partners with respect to the different coordination strategies complying with $ATS(C)$. This degree of flexibility is influenced by two parameters.
1) The list of acceptable termination states for each workflow vertex. This list can be determined based on $ATS(C)$. Using this list, the requirements on the transactional properties of a candidate partner can be derived, since this partner can only reach the states defined in $ATS(C)$ for the considered vertex.
2) The assignment process is iterative, and therefore, as new partners are assigned to vertices, both $TS(C_d^i)$ and the transactional properties required for the assignment of further partners are updated. For instance, we are sure to no longer reach the termination states $CTS(ts_k(C), c_a)$ allowing the failure of the vertex $c_a$ in $ATS(C)$ when we assign a retriable and reliable partner to $c_a$. In this specific case, we no longer care about the states reached by other vertices in $CTS(ts_k(C), c_a)$, and therefore, no transactional requirements are introduced for the vertices to which business partners have not already been assigned.
We therefore need to first define the transactional requirements for the assignment of a partner after $i$ steps in the assignment procedure.

1) Extraction of Transactional Requirements: From the two parameters above, we define for a vertex $c_a$:
1) $ATS(C, c_a)$: set of acceptable termination states of $c_a$ that is derived from $ATS(C)$;
2) $DIS(c_a, C_d^i)$: set of transactional requirements that the partner assigned to $c_a$ must meet based on previous assignments. This set is determined based on the following reasoning.
a) (DIS1): the partner must be compensatable iff $compensated \in DIS(c_a, C_d^i)$.
b) (DIS2): the partner must be retriable iff $failed \notin DIS(c_a, C_d^i)$.
c) (DIS3): the partner must be reliable iff $Hfailed \notin DIS(c_a, C_d^i)$.
Using these two sets, we are able to compute $MinTP(d_a, c_a, C_d^i) = ATS(C, c_a) \cap DIS(c_a, C_d^i)$, which defines the minimal transactional properties a partner $d_a$ has to comply with in order to be assigned to the vertex $c_a$ at the $i + 1$ assignment step. We simply check the retriability, reliability, and compensatability properties for the set $MinTP(d_a, c_a, C_d^i)$.
1) $failed \notin MinTP(d_a, c_a, C_d^i) \Leftrightarrow d_a$ has to verify the retriability property.
2) $Hfailed \notin MinTP(d_a, c_a, C_d^i) \Leftrightarrow d_a$ has to verify the reliability property.
3) $compensated \in MinTP(d_a, c_a, C_d^i) \Leftrightarrow d_a$ has to verify the compensatability property.
The set $ATS(C, c_a)$ is easily derived from $ATS(C)$. We now need to compute $DIS(c_a, C_d^i)$. We assume that we are at the $i + 1$ step of an assignment procedure, i.e., the current partial instance of $C$ is $C_d^i$. Computing $DIS(c_a, C_d^i)$ means determining if (DIS1), (DIS2), and (DIS3) are true. From these three statements, we can derive the following four properties.
1) (DIS1) implies that state $compensated$ can definitely be reached by $c_a$.
2) (DIS2) implies that $c_a$ cannot fail.
3) (DIS2) implies that $c_a$ cannot be canceled.
4) (DIS3) implies that $c_a$ cannot Hfail.
The third property is derived from the fact that, if a vertex cannot be canceled when the failure of a vertex has occurred, then it has to finish its execution and reach at least the state $completed$. In this case, if a business partner cannot be canceled, then it cannot fail, which is the third property. To verify whether properties 1) to 4) are true, we present the following theorems that are an extension of results proved in [9].

Theorem 6.3. Let $a \in [1, n]$.
The state $compensated$ can definitely be reached by $c_a$ iff $\exists\, b \in [1, n] - \{a\}$ verifying (6.3b): $d_b$ not retriable (respectively, not reliable) is assigned to $c_b$ and $\exists\, ts_k(C) \in ATS(C)$ generator of $c_b$ such that $ts_k(C, c_a) = compensated$.

Theorem 6.4. Let $a \in [1, n]$. $c_a$ cannot fail (respectively, Hfail) iff $\exists\, b \in [1, n] - \{a\}$ verifying (6.4b): ($d_b$ not compensatable is assigned to $c_b$ and $\exists\, ts_k(C) \in ATS(C)$ generator of $c_a$ such that $ts_k(C, c_b) = compensated$) or ($c_b$ is flexible to $c_a$ and $d_b$ not retriable is assigned to $c_b$ and $\forall\, ts_k(C) \in ATS(C)$ such that $ts_k(C, c_a) = failed$ (respectively, $ts_k(C, c_a) = Hfailed$), $ts_k(C, c_b) \neq canceled$).

Theorem 6.5. Let $a, b \in [1, n]$ such that $c_a$ is flexible to $c_b$. $c_a$ is not canceled when $c_b$ fails (respectively, Hfails) iff (6.5b): $d_b$ not retriable (respectively, not reliable) is assigned to $c_b$ and $\forall\, ts_k(C) \in ATS(C)$ such that $ts_k(C, c_b) = failed$ (respectively, $ts_k(C, c_b) = Hfailed$), $ts_k(C, c_a) \neq canceled$.

In order to compute $DIS(c_a, C_d^i)$, we have to compare $c_a$ with each of the $i$ vertices $c_b \in C - \{c_a\}$ to which a business partner $d_b$ has already been assigned. Two cases have to be considered: either we assign a business partner to a vertex $v_k$ or to an abstract vertex $m_{l,p}$. This is an iterative procedure. In the first case, at the initialization phase, since no vertex has yet been compared to $c_a = v_k$, $d_a$ can be of type $(r_l, p)$: $DIS(c_a, C_d^i) = \{failed\}$.
1) If $c_b$ verifies (6.3b) $\Rightarrow compensated \in DIS(c_a, C_d^i)$.
2) If $c_b$ verifies (6.4b) $\Rightarrow failed \notin DIS(c_a, C_d^i)$.
3) If $c_b$ is flexible to $c_a$ and verifies (6.5b) $\Rightarrow failed \notin DIS(c_a, C_d^i)$.
In this case, the verification stops if $failed \notin DIS(c_a, C_d^i)$ and $compensated \in DIS(c_a, C_d^i)$. For the vertices of type $v_k$, we indeed only need to check the retriability and compensatability properties.
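The requirement extraction for a vertex of type vk can be sketched in Python as follows. This is a minimal illustration under stated assumptions: the outcomes of conditions (6.3b), (6.4b), and (6.5b) are supplied as precomputed booleans per already-assigned vertex (the hypothetical keys `c63`, `c64`, `c65`), MinTP is taken as the intersection of ATS(C, ca) and DIS, and a partner is read as allowed to fail only while `failed` remains in DIS, which matches the initialization DIS = {failed} for a partner of type (rl, p).

```python
def update_dis_vk(checks):
    """Build DIS for a vertex v_k from per-vertex check results.

    `checks` holds one dict per already-assigned vertex c_b, with boolean
    keys 'c63', 'c64', 'c65' standing for conditions (6.3b), (6.4b), (6.5b).
    """
    dis = {"failed"}  # initialization: the partner may be of type (rl, p)
    for chk in checks:
        if chk.get("c63"):
            dis.add("compensated")   # compensated must be reachable
        if chk.get("c64") or chk.get("c65"):
            dis.discard("failed")    # the vertex is no longer allowed to fail
        if "failed" not in dis and "compensated" in dis:
            break                    # strongest requirement reached, stop early
    return dis


def min_tp(ats_ca, dis):
    """MinTP = ATS(C, ca) intersected with DIS (assumed operator)."""
    return set(ats_ca) & set(dis)


def required_properties(mintp):
    """Translate MinTP into the properties a candidate partner must offer."""
    required = set()
    if "failed" not in mintp:
        required.add("retriable")
    if "compensated" in mintp:
        required.add("compensatable")
    return required


# One earlier assignment satisfies (6.3b): compensation becomes mandatory.
dis = update_dis_vk([{"c63": True}])
needs = required_properties(min_tp({"completed", "compensated", "failed"}, dis))
```

With these inputs `needs` contains only the compensatability property, since `failed` is still an acceptable outcome for the vertex.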
In the second case, we have the following at the initialization phase: since no vertex has yet been compared to $c_a = m_{l,p}$, $d_a$ can be of type $(ur_l)$: $DIS(c_a, C_d^i) = \{Hfailed\}$.
1) If $c_b$ verifies (6.4b) $\Rightarrow Hfailed \notin DIS(c_a, C_d^i)$.
In that case, the verification stops if $Hfailed \notin DIS(c_a, C_d^i)$. For the vertices of type $m_{l,p}$, we only need to check the reliability property. Finally, when $MinTP(d_a, c_a, C_d^i)$ is computed, we are able to select the appropriate business partner to be assigned to a given vertex according to transactional requirements.

2) Business Partner Assignment Process: Business partners are assigned to each vertex based on an iterative process. Depending on the transactional requirements and the transactional properties of the business partners available for each vertex, different scenarios can occur.
a) Business partners of type $(r_l, r_tc)$ are available in the case of a vertex $v_k$, or business partners of type $(r_l)$ are available in the case of a vertex $m_{l,p}$ [i.e., all the business partners of the abstraction are of type $(r_l)$]. It is not necessary to compute any transactional requirements as such partners match all transactional requirements.
b) A single partner is available for the considered vertex. We need to compute the transactional requirements associated with the vertex, and either the transactional properties offered by this partner are sufficient or there is no solution.
c) Business partners of type $(r_l, r_t)$ and $(r_l, c)$ but none of type $(r_l, r_tc)$ are available for a vertex $v_k$. We need to compute the transactional requirements associated with the vertex, and we have three cases. First, $(r_l, r_tc)$ is required and therefore there is no solution. Second, $(r_l, r_t)$ [respectively, $(r_l, c)$] is required and we assign a business partner of type $(r_l, r_t)$ [respectively, $(r_l, c)$] to the vertex. Third, there is no requirement.
The assignment procedure is performed by the coordinator $c_1$.
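The three scenarios can be pictured with a short Python sketch for a vertex of type vk. It is an illustration only: the `Partner` class, the `assign` function, and the partner names are hypothetical, all candidates are assumed reliable, and the requirements are passed as a callable so that scenario a) never has to compute them.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Sequence, Set


@dataclass(frozen=True)
class Partner:
    name: str
    retriable: bool = False
    compensatable: bool = False

    def offered(self) -> Set[str]:
        """Transactional properties this partner offers."""
        props = set()
        if self.retriable:
            props.add("retriable")
        if self.compensatable:
            props.add("compensatable")
        return props


def assign(candidates: Sequence[Partner],
           requirements: Callable[[], Set[str]]) -> Optional[Partner]:
    # Scenario a): a retriable and compensatable partner matches every
    # requirement, so the requirements are never computed.
    for partner in candidates:
        if partner.retriable and partner.compensatable:
            return partner
    # Scenarios b) and c): compute the requirements, then look for a match.
    required = requirements()
    for partner in candidates:
        if required <= partner.offered():
            return partner
    return None  # no solution for this vertex


full = Partner("d_full", retriable=True, compensatable=True)
only_rt = Partner("d_rt", retriable=True)
only_c = Partner("d_c", compensatable=True)

chosen = assign([only_rt, only_c], lambda: {"compensatable"})
```

Here `chosen` is the compensatable partner; asking for both properties from the same two candidates would return `None`, which corresponds to the "no solution" outcome of scenario c).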
Business partners have to be assigned to all vertices prior to the beginning of the critical zone execution. The first vertex is de facto assigned to the critical zone initiator. The idea is then to first assign business partners to the vertices verifying a) and b) since there is no flexibility in the choice of the business partner. Vertices verifying c) are finally analyzed. Based on the transactional requirements raised by the remaining vertices, we first assign partners to vertices with nonempty transactional requirements. We then handle the assignment for vertices with empty transactional requirements. Note that the transactional requirements of all the vertices to which partners are not yet assigned are also affected (updated) as a result of the current partner assignment. If no vertex has transactional requirements, then we assign the partners of type $(r_l, r_t)$ to assure the completion of the remaining vertices' execution.

C. Actual Termination States of $C_d$

Once all the business partners have been assigned to vertices, we can coordinate their execution so that they respect the defined transactional requirements. In order to do so, we need to know the actual termination states, a subset of $ATS(C)$, that can be reached by the defined instance of $C$. Having computed $TS(C_d)$, we can deduce the coordination rules associated with the execution of $C_d$. This subset is determined using the following theorem that is proved in [9].

Theorem 6.6. Let $C_d$ be an acceptable instance of $C$ with respect to $ATS(C)$. We denote by $(c_{a_i})_{i \in [1, n_r]}$ the set of vertices to which neither a retriable nor a reliable business partner has been assigned. $ts_{k_{a_i}}(C)$ is the generator of $c_{a_i}$ present in $ATS(C)$ and $(v_{a_{i_j}})_{j \in [1, z_{a_i}]}$ denotes the set of vertices that are not canceled when $c_{a_i}$ fails. Then
$$TS(C_d) = \{ts_{comp}(C_d)\} \cup \bigcup_{i=1}^{n_r} \Big( CTS(ts_{k_{a_i}}(C), c_{a_i}) - \bigcup_{j=1}^{z_{a_i}} FTS(v_{a_{i_j}}, c_{a_i}) \Big).$$
$TS(C_d)$ is indeed derived from $ATS(C)$ that contains for all vertices at most a single coordination strategy as specified in Definition 5.5. As a result, whenever the failure of a vertex $c_a$ is detected, a transactional protocol in charge of coordinating an instance $C_d$ resulting from our approach reacts as follows. The coordination strategy $CS(C_d, ts_k(C), (v_{a_i})_{i \in [1, z_a]}, c_a)$ corresponding to $c_a$ is identified and a unique termination state belonging to $CS(C_d, ts_k(C), (v_{a_i})_{i \in [1, z_a]}, c_a)$ can be reached given the current state of the critical zone execution.

D. Discussion and Performance Evaluation

In order to handle the scenarios wherein more than one business partner can fail at a time, one would need to extend the definition of the termination state generator to take into account the failure of a set of partners as follows.

Definition 6.7. A termination state $ts_k(C)$ of $C$ is called generator of a set of vertices $(c_{a_i})_{i \in [1, p]}$ iff $\forall\, i \in [1, p]$, $ts_k(C, c_{a_i}) \in \{failed, Hfailed\}$ and $\forall\, b \in [1, n]$ such that $c_b$ is executed before one of the vertices $(c_{a_i})_{i \in [1, p]}$, or in parallel with all the vertices $(c_{a_i})_{i \in [1, p]}$, $ts_k(C, c_b) \in \{completed, compensated\}$.

The compatibility definition would also be defined for termination states in which exactly the same set of partners fail at the same time so that coordination strategies are defined for possible concurrent failures. In this case, two termination states such that the set of failures in the first is a subset of the set of failures in the second are not incompatible. The composition algorithm and the coordination protocol are not affected by this configuration. The operations that are relevant from the complexity point of view are twofold: the definition of transactional requirements by means of the acceptable termination states model and the execution of the transaction-aware business partner assignment procedure.
One can argue that building an ATS table specifying the transactional requirements of a business process $W$ consists of computing the whole $TS(W)$ table, yet this is not the case. Building an $ATS(C)$ set in fact only requires designers to identify the vertices of $C$ that they allow to fail as part of the process execution and to select the termination state generator associated with each of those vertices that meets their requirements in terms of failure atomicity. Once this phase is complete, designers only need to select the vertices whose execution can be canceled when the former vertices fail and complete the associated coordination strategy. The second aspect concerns the complexity of the transaction-aware assignment procedure that we presented in Section VI.

Theorem 6.8. Let $C = (c_a)_{a \in [1, n]}$ be a critical zone. The complexity of the transaction-aware assignment procedure is $O(n^3)$.

Proof: We can show that the number of operations necessary to compute the step $i$ of the assignment procedure for a vertex $c_a$ is bounded by $4 \times n \times i$. Computing the step $i$ indeed consists of verifying Theorems 6.3–6.5 and determining $ATS(C, c_a)$. On one hand, performing the operations part of Theorems 6.3 (one comparison), 6.4 (two comparisons), and 6.5 (one comparison) requires at most four comparisons. On the other hand, building $ATS(C, c_a)$ requires at most $n$ operations (there are at most $n$ generators in an $ATS(C)$ set). Therefore, the number of operations that need to be performed in order to compute the $n$ steps of the assignment procedure for a critical zone composed of $n$ tasks is bounded by $4 \times n \times \sum_{j=1}^{n} j = 2n^2(n + 1)$, which grows as $n^3$ as $n \to \infty$.

E. Example

We consider the critical zone $C_1$ of Fig. 4. Designers have defined $ATS(C_1)$ of Fig. 8 as the transactional requirements. The set of available business partners for each vertex of $C_1$ is specified in Fig. 8.
The goal is to assign business partners to vertices so that the instance of $C_1$ is valid with respect to $ATS(C_1)$, and we apply the presented assignment procedure. The critical zone initiator assigned to $v_1$ uses a business partner of type $(r_l, r_t)$ matching the transactional requirements. We now start to assign the business partners of type $(r_l, r_tc)$ and $(r_l)$ for which it is not necessary to compute any transactional requirements. d52, which is the only available business partner of type $(r_l, r_tc)$, is therefore assigned to $v_4$. We then try to assign business partners to tasks for which there is no choice, and we verify whether d31 can be assigned to $m_1$. We compute $MinTP(d_a, m_1, C_{1d}^2) = ATS(C_1, m_1) \cap DIS(m_1, C_{1d}^2)$. $ATS(C_1, m_1) = \{completed, Hfailed\}$ and $DIS(m_1, C_{1d}^2) = \{Hfailed\}$ as d52 and d11 are the only business partners already assigned and Theorems 6.3–6.5 are not verified. Thus, $MinTP(d_a, m_1, C_{1d}^2) = \{Hfailed\}$ and d31 can be assigned to $m_1$ as it matches the transactional requirements. We get for $v_3$ $MinTP(d_a, v_3, C_{1d}^3) = \{failed\}$. The business partner d41, which is of type $(r_l, c)$, verifies the transactional requirements and is assigned to $v_3$. Now, we compute the transactional requirements of $v_2$ and we get $MinTP(d_a, v_2, C_{1d}^4) = \{failed, compensated\}$ as Theorem 6.3 is verified with the business partner d31. The partner d21 can thus be assigned to $v_2$ as it matches the transactional requirements of the task. Using the created instance of $C_1$ we get the set $TS(C_{1d})$ of Fig. 9.

Fig. 9. $TS(C_{1d})$.
Fig. 10. Notification messages.

VII. COORDINATION PROTOCOL SPECIFICATION

Having introduced the method through which an instance of $C$ is obtained by assigning partners to workflow vertices according to the transactional requirements of $C$, we turn to the actual coordination of partners during the execution of the
critical zone. The protocol that is in charge of the coordination is specified in terms of the different actors, notification messages, and coordination cases. We finally motivate the chosen solution by comparing it with existing coordination protocols.

A. Protocol Actors

As mentioned in Section II-B and Fig. 2, we distinguish three main entities within the coordination protocol execution.
1) Business partner $d^v_1 = c_1$: This business partner is the critical zone initiator and is in charge of performing the business partner assignment procedure and coordinating the execution of $C$. The coordination decisions are made using the table $TS(C_d)$ specifying the subset of $ATS(C)$ that $C_d$ is actually able to reach.
2) Business partners $d^v_k$: These business partners modify sensitive data and play the role of subcoordinators. They report their state of execution and the state of execution of the business partners $d^m_k$ to $d^v_1$.
3) Business partners $d^m_k$: These partners modify volatile data and report to the partner $d^v_x$ most recently executed.
Actors exchange messages for the purpose of decision making and forwarding, as listed in Fig. 10. These messages are mostly derived from the state diagram of the transactional model and the respective role of the partners in the protocol. The flow of notification messages within the protocol execution and the mechanisms involved in the processing of these notification messages are stated in the next section.

B. Coordination Scenarios

In this section, we detail the different phases and coordination scenarios that can be encountered during the execution of the protocol. First, we explain how partners are registered with the coordination protocol during the partner assignment phase. Then, we analyze the message flow between the different actors of the protocol in three different scenarios: normal course of execution, failure of a partner $d^v_k$, and failure of a partner $d^m_k$.

Fig. 11. Business partner registration.
1) Business Partner Registration: The first phase of the coordination protocol consists of the discovery and registration of the business partners that will be involved in the critical zone execution. The discovery process through which business partners that can be assigned to critical zone vertices are identified is performed by the business partner $c_1 = d^v_1$. The transactional requirements extraction procedure, specified in Section VI-B, provides the coordinator with a list of suitable business partners that match the computed transactional requirements. It is then necessary to contact the business partners of this list in order to receive from one of them the commitment to execute the requested vertex. Based on the registration handshake depicted in Fig. 11, the coordinator $d^v_1$ contacts a business partner asking it whether it agrees to commit to execute the operation $a$ of the workflow whose identifier is $WI_d$. Once the newly assigned business partner's coordinator is known, $d^v_1$ sends the information. In the case of business partners $d^v_k$, this information is known from the beginning since $d^v_1$ is their coordinator whereas, for the business partners $d^m_k$, the information is known when $d^v_x$, the business partner $d^v_k$ executed most recently, has been assigned to a vertex.

2) Normal Course of Execution: Once all involved business partners are known, the critical zone execution can start supported by the coordination protocol. Business partners are sequentially activated based on the workflow specification. A sample normal execution of $C$ is depicted in Fig. 12. The $Activate(W, k, WI_d, D)$ message is a workflow message defined in [4]; it contains in particular the workflow specification $W$, the requested vertex $k$ to be executed, the workflow data $D$ modified during the execution, and the workflow identifier $WI_d$. Within the critical zone execution, local acknowledgments $Ack(WI_d)$ are used.
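The message payloads named in the text can be pictured as small records. The Python sketch below is only an illustration: the field types, the `handle_activate` helper, and the idea of returning the acknowledgment and the completion report together are assumptions, and the actual message formats are those of [4] and Fig. 10.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, Tuple


@dataclass(frozen=True)
class Activate:
    workflow: str           # W: workflow specification (placeholder type)
    vertex: int             # k: vertex requested for execution
    workflow_id: str        # WI_d: workflow identifier
    data: Dict[str, Any]    # D: workflow data modified during execution


@dataclass(frozen=True)
class Ack:
    workflow_id: str        # WI_d


@dataclass(frozen=True)
class Completed:
    vertex: int             # k
    workflow_id: str        # WI_d
    data: Dict[str, Any]    # D: copy of the data after execution


def handle_activate(
    msg: Activate,
    run_vertex: Callable[[int, Dict[str, Any]], Dict[str, Any]],
) -> Tuple[Ack, Completed]:
    """Acknowledge an activation, run the vertex, and report completion."""
    ack = Ack(msg.workflow_id)
    new_data = run_vertex(msg.vertex, dict(msg.data))  # work on a copy of D
    return ack, Completed(msg.vertex, msg.workflow_id, new_data)


ack, done = handle_activate(
    Activate("W", 2, "wf-1", {"x": 1}),
    lambda k, d: {**d, "done": k},
)
```

Carrying the modified data in the completion record mirrors the role of D as the state handed from one partner to the next.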
Each business partner $d^m_k$ reports its status to the business partner $d^v_x$ most recently executed, and once its execution is complete, it can leave the critical zone execution. The $Completed(k, WI_d, D)$ message sent by a business partner $d^m_k$ includes a backup copy of the volatile data modified by the business partner that can be reused later on for the recovery procedure in case of failure of a business partner $d^m_k$ (Section VII-B4). Once in the state $completed$, business partners of type $d^m_k$ can leave the coordination as they will not be asked to compensate their execution. Depending on the transactional requirements defined for $C$, business partners $d^v_k$ may leave the critical zone before the end of the critical zone execution. A business partner $d^v_k$ is indeed able to leave the coordination if it reaches the state $completed$ regardless of possible failures later in the critical zone execution. The condition allowing a business partner $d^v_k$ to leave the coordination is therefore stated as follows.

Theorem 7.1. A partner $d^v_k$ assigned to a vertex $c_l$ can leave the execution of a critical zone $C$ iff the partner $d^v_k$ is in the state $completed$ and $\forall\, i \in [1, n]$ such that a business partner $d_i$ not retriable (respectively, not reliable) is assigned to the vertex $c_i$, $d_i$ is in the state $initial$ and $ts_k(C, c_l) = completed$ where $ts_k(C)$ is the termination state generator of $c_i$ in $TS(C_d)$.

Fig. 12. Normal execution.

3) Failure of a Business Partner $d^v_k$: This scenario is possible only with business partners of type $(r_l, p)$. We can encounter two situations: either the failure is total and the business partner is not able to communicate any longer, or the business partner is still alive and can forward a failure message to $d^v_1$. Fig. 13 depicts the two cases whereby the total failure is detected using a simple timeout in Ping/Alive message exchanges. Once the failure has been detected, the coordinator forwards the coordination decision to all involved business partners. It should be noted here that business partners of type $(r_l, r_t)$ can also reach the state $failed$, but the retriability property implies that they have at their disposal recovery solutions ensuring that the contact is never lost permanently. Thus, the failure of business partners of type $(r_l, r_t)$ is transparent to the rest of the coordination and does not have to be handled.

Fig. 13. Failure of a business partner $d^v_k$.

4) Failure of a Business Partner $d^m_k$: The failure of a business partner $d^m_k$ is detected by its subcoordinator with a timeout. As specified in the transactional model, we indeed consider that business partners of type $d^m_k$ can fail only because of hardware problems, and failure of such partners therefore implies a loss of contact with their coordinator. The failure of a partner $d^m_k$ is reported by its subcoordinator to the partner $d^v_1$. The failure detection and forwarding of the $Hfailed$ message are depicted in Fig. 14.

Fig. 14. Failure of a business partner $d^m_k$.

C. Coordination Decisions and Recovery

Having detailed various coordination scenarios that can occur during the execution of a critical zone, we analyze the possible recovery strategies, in particular the replacement of failed partners $d^v_k$ and $d^m_k$ and how coordination decisions are made upon detection of a failure.

1) Replacement of Failed Partners $d^v_k$: During the course of the execution, new partners can be discovered and assigned to vertices in order to replace failed ones. In fact, two situations can happen: either the failure of a partner occurs while executing its assigned vertex or the coordinator loses contact (timeout detection) prior to the activation of the partner.
The first situation is specified in the previous section, and no backup solution is possible as the data modified by the failed business partner are in an unknown state. In the second situation, it is on the contrary possible to assign a new business partner matching the transactional requirements to the vertex whose execution has not yet started. Once the loss of contact with a business partner $d^v_k$ is detected, no coordination decision is yet sent to business partners and the execution continues. If no business partner is found to be assigned to the vertex when its execution should be activated, the protocol coordinator considers the business partner it has lost contact with as failed.

2) Replacement of Failed Business Partners $d^m_k$: In case of failure of a business partner $d^m_k$, be it before or after its activation, a recovery procedure can be executed prior to informing the coordinator of the hardware failure. It is indeed possible to assign to the vertex a new business partner so that the execution can go on. This is possible as, on the one hand, the partners $d^m_k$ only modify volatile data, and on the other hand, we have a backup copy of the data modified by the partners that are part of the abstract vertex $d^m_{l,p}$. Once the failure is detected, the subcoordinator of the failed partner tries to assign a new partner to the failed vertex. In this case, only volatile data are being modified, transactional requirements are not a concern, and the assignment procedure can be repeated until a business partner manages to execute the requested vertex.

Fig. 15. Architecture.

3) Reaching Consistent Termination States: Once all possible recovery mechanisms have been attempted, a coordination decision is made by the coordinator $d^v_1$. The table $TS(C_d)$ is the input to the coordination decisions that are made throughout the execution of a critical zone.
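As a rough illustration of how such a table can drive a decision, the Python sketch below filters a coordination strategy down to the one termination state still reachable from the current execution state and lists the transitions the coordinator would have to trigger. The `CAN_REACH` relation is a simplified assumption and not the paper's state diagram; the function name and the example states are hypothetical.

```python
# Hypothetical reachability relation: for each current state of a vertex,
# the final states it may still end in. This is a simplification and does
# not reproduce the state diagram of the transactional model.
CAN_REACH = {
    "initial": {"initial"},
    "active": {"canceled", "completed", "failed"},
    "completed": {"completed", "compensated"},
    "failed": {"failed"},
}


def decide(strategy, current):
    """Pick the unique reachable termination state and the needed transitions.

    `strategy` is a list of {vertex: final_state} dicts (the set CS);
    `current` maps each vertex to its current state.
    """
    reachable = [
        ts for ts in strategy
        if all(ts[v] in CAN_REACH[current[v]] for v in ts)
    ]
    if len(reachable) != 1:
        raise ValueError(
            f"expected one reachable termination state, found {len(reachable)}"
        )
    target = reachable[0]
    # Vertices whose state must change for the zone to reach the target.
    transitions = {v: (current[v], s) for v, s in target.items() if current[v] != s}
    return target, transitions


strategy = [
    {"v1": "compensated", "v2": "canceled", "v3": "failed"},
    {"v1": "compensated", "v2": "completed", "v3": "failed"},
]
current = {"v1": "completed", "v2": "completed", "v3": "failed"}
target, transitions = decide(strategy, current)
```

In this toy run v2 has already completed, so only the second state is reachable and the single transition left is the compensation of v1.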
Once the failure of a vertex $c_a$ has been detected, the protocol coordinator reads in $TS(C_d)$ the set $CS(C_d, ts_k(C), (v_{a_i})_{i \in [1, z_a]}, c_a)$ listing the possible termination states reachable by $C_d$ whereby $c_a$ is $failed$. There is a unique element of this set that is reachable by $C_d$ with respect to its current state of execution, and $d^v_1$ sends the appropriate messages so that the overall critical zone can reach this consistent termination state.

D. Discussion

The coordination protocol integrates the semantic description of involved business partners and relies on an adaptive decision table that is computed during the assignment procedure. The coordination protocol is flexible as it completely depends on the designers' choice for the specification of ATS. This solution therefore offers full support of relaxed atomicity constraints for workflow-based applications and is also self-adaptable to the business partners' characteristics, which is not the case with recent efforts [14], [15]. The organization of the coordination is based on a simple hierarchical approach as in BTP [16]. In that respect, the central point of the coordination is the business partner $d^v_1$, on which the whole coordination relies. This is the main weakness of the protocol, as a failure of this business partner would cause the complete failure of the workflow execution. The role of critical zone initiator of the coordination is therefore reserved to business partners that are both reliable and retriable. Nonetheless, this centralized and hierarchical approach facilitates the management of the coordination process. In addition to usual coordination phases such as coordination registration, business partner completion, and failure, our protocol offers the possibility to replace participants at runtime depending on their role within the coordination and the volatility degree of data they have to modify during the workflow execution.
This makes the protocol flexible and adapted to the pervasive paradigm whereas such recovery procedure is not specified in other transactional protocols. In the protocol description, we do not specify the data recovery strategy especially for the compensated states. Different approaches can be integrated with our work to support either forward error recovery or backward error recovery [17]. The choice of the recovery strategy basically depends on the application and its fault-handling protocol. For instance, a simple backward error recovery strategy is sufficient for workflows used for payment in the example of the paper whereas a forward recovery strategy might be required for a hotel booking system. Existing mechanisms in this area can therefore be used to augment our transactional protocol to specify complex faulthandling and compensation scenarios [8], [18]. VIII. IMPLEMENTATION In this section, an implementation of the work presented in this paper is described. The overall system architecture is depicted in Fig. 15. The basic pervasive workflow infrastructure spans over the business partners taking part in a workflow instance. A local workflow engine developed on top of BPEL [8] is in charge of handling, for each involved business partner, the workflow management and control tasks that mainly consist of: 1) receiving and forwarding workflow requests; 2) issuing discovery requests; 3) invoking the appropriate local services to execute workflow tasks. In order to support the execution of pervasive workflows, we implemented in the fashion of the WS-coordination initiative [19] a transactional stack composed of the following components. 1) Transactional coordinator: This component is supported by a critical zone initiator. 
On the one hand, it implements the transaction-aware business partner assignment procedure as part of the composition manager module, and on the other hand, it is in charge of assuring the coordinator role of the transactional protocol, relying on the set $TS(C_d)$ produced by the assignment procedure.
2) Transactional submanager: This component is deployed on the other partners and is in charge of forwarding coordination messages from the local workflow to the appropriate subcoordinator or coordinator and conversely.
In the remainder of this section, we focus on the implementation of the transaction-aware partner assignment procedure.

Fig. 16. OWL-S transactional matchmaker.

A. OWL-S Transactional and Functional Matchmaker

To implement the assignment procedure presented in this paper, we augmented an existing functional OWL-S matchmaker [20] with transactional matchmaking capabilities. In order to achieve our goal, the matchmaking procedure has been split into two phases. First, the functional matchmaking based on OWL-S semantic matching is performed in order to identify subsets of the available partners that meet the functional requirements for each workflow vertex. Second, the implementation of the transaction-aware partner assignment procedure is run against the selected sets of partners in order to build an acceptable instance fulfilling defined transactional requirements. The structure of the matchmaker consists of several components whose dependencies are displayed in Fig. 16. The composition manager implements the matchmaking process and provides a Java API that can be invoked to start the selection process. It gets as input an abstract process description specifying the functional requirements for the candidate partners and a table of acceptable termination states. The registry stores OWL-S profiles of partners that are available.
Those OWL-S profiles have been augmented with the transactional properties offered by business partners. This has been done by adding to the nonfunctional information of the OWL-S profiles a new element called transactional properties that specifies three Boolean attributes, namely retriable, reliable, and compensatable, as follows.

<tp:transactionalproperties retriable="true" reliable="true" compensatable="true"/>

In the first phase of the selection procedure, the business partner manager is invoked with a set of OWL-S profiles that specify the functional requirements for each workflow vertex. The business partner manager gets access to the registry, where all published profiles are available, and to the functional matchmaker that is used to match the available profiles against the functional requirements specified in the workflow. For each workflow vertex, the business partner manager returns a set of functionally matching profiles along with their transactional properties.

The composition manager then initiates the second phase, passing these sets, along with the process description and the table of acceptable termination states, to the transactional composer. The transactional composer starts the transaction-aware business partner assignment procedure using the transactional matchmaker by first classifying those sets into six groups.

1) Sets including only business partners of type (url, rt).
2) Sets including only business partners of type (rl, rt).
3) Sets including only business partners of type (rl, p).
4) Sets including only business partners of type (rl, c).
5) Sets including business partners of types (rl, rt) and (rl, c).
6) Sets including business partners of type (rl, rt c).

Once those sets are formed, the iterative transactional composition process takes place as specified before, based on the table of acceptable termination states.
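The classification step can be illustrated with a minimal sketch. It is not the Java implementation described in this section. The class and function names are hypothetical, and the reading of the type shorthand (url/rl for unreliable/reliable, rt for retriable, c for compensatable, p for a partner that is neither retriable nor compensatable) is an assumption made for illustration only.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass(frozen=True)
class TransactionalProperties:
    """The three Boolean attributes a partner advertises in its profile."""
    retriable: bool
    reliable: bool
    compensatable: bool

    def type_label(self) -> str:
        # ASSUMPTION: url/rl = unreliable/reliable, rt = retriable,
        # c = compensatable, p = neither retriable nor compensatable.
        first = "rl" if self.reliable else "url"
        if self.retriable and self.compensatable:
            second = "rt c"
        elif self.retriable:
            second = "rt"
        elif self.compensatable:
            second = "c"
        else:
            second = "p"
        return f"({first}, {second})"


# Groups 1-4 hold partners of exactly one type; group 5 mixes two types.
_EXACT_GROUPS = {
    frozenset({"(url, rt)"}): 1,
    frozenset({"(rl, rt)"}): 2,
    frozenset({"(rl, p)"}): 3,
    frozenset({"(rl, c)"}): 4,
    frozenset({"(rl, rt)", "(rl, c)"}): 5,
}


def classify(candidates: Iterable[TransactionalProperties]) -> Optional[int]:
    """Return the group (1-6) of one vertex's candidate set, or None if no group applies."""
    labels = frozenset(p.type_label() for p in candidates)
    # Group 6 only requires that a (rl, rt c) partner is present in the set.
    if "(rl, rt c)" in labels:
        return 6
    return _EXACT_GROUPS.get(labels)


if __name__ == "__main__":
    vertex_candidates = [
        TransactionalProperties(retriable=True, reliable=True, compensatable=False),
        TransactionalProperties(retriable=False, reliable=True, compensatable=True),
    ]
    print(classify(vertex_candidates))  # 5
```

Returning None for a set that fits none of the six groups is a choice of this sketch; the text does not state how such sets are handled.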
Depending on the set of available services and the specified acceptable termination states, the algorithm may terminate without finding a solution.

IX. RELATED WORK

Transactional consistency and correctness of distributed systems such as database systems have been an active research topic over the last 15 years [21]–[23], yet they remain an open issue in the area of distributed processes within the SOC [24]–[27]. In this paper, we specified a transactional protocol for the pervasive workflow architecture presented in [4], and our solution uses and extends the results proved in [9]. The execution of distributed processes wherein business partners are not assigned at design time raises new requirements for transactional systems, such as dynamicity, semantic description, and relaxed atomicity. Existing transactional models for advanced applications and workflows [5] do not offer the flexibility to integrate these requirements [28]. Our solution allows the specification of transactional requirements supporting relaxed atomicity for an abstract workflow specification and the selection of semantically described business partners or services fulfilling the defined transactional requirements. In addition, we provide the means to compute a coordination protocol suited to the workflow instance resulting from our business partner assignment procedure.

The first approach specifying relaxed atomicity requirements for Web-service-based workflow applications using the ATS tool and a transactional semantic is presented in [10]. Despite a solid contribution, that work provides only some means to verify the consistency of composite services and does not take into account transactional requirements at the composition phase. It therefore appears to be limited when it comes to possible integration into dynamic and distributed business processes.
In this approach, transactional requirements do not play any role in the selection of component business partners, which may result in several attempts to determine a valid workflow instance. As opposed to this work, our solution provides a systematic procedure enabling the creation of valid workflow instances by means of a transaction-aware business partner assignment procedure. A transactional Web service composition framework is also presented in [29], yet this approach does not allow the definition of coordination strategies as fine-grained as the termination state model underpinning our composition algorithm.

The transactional protocol we propose offers suitable means to respond to the constraints introduced by environments where heterogeneous business partners share resources in a collaborative manner. Using relaxed atomicity features, the protocol indeed offers the flexibility for business partners to release their resources as soon as their participation in the workflow is no longer required. Moreover, using a flexible semantic, business partners are able to advertise their capabilities so that they can assume a role suited to any workflow in which their resources can be used. Current efforts in the design of transactional frameworks supporting the coordination of business processes [14], [15], [30], [31] do not offer such flexibility. They suffer from the lack of tools for the specification of transactional requirements and their integration into a dynamic business partner selection process. As opposed to our solution, the WS-BA specification and its implementation [32], for instance, do not provide designers with adequate means to specify the business logic associated with their long-running transactions. Furthermore, no recovery procedure is specified as part of the protocol for the replacement of partners in case of failure.

X. CONCLUSION

We presented an adaptive transactional protocol developed in the scope of the pervasive workflow model [4]. The contributions of the paper are threefold. First, we provide a transactional model that captures the typical transactional properties associated with Web services and makes it possible for business partners to advertise the latter as a nonfunctional attribute to potential clients. This transactional model and its associated semantic are the core of the overall approach and consider the transactional properties offered by business partners as part of the SLA. Second, we propose a composition algorithm whose goal is to combine the transactional properties offered by business partners in order to meet the transactional constraints identified by workflow application designers. This algorithm not only builds consistent workflow instances but also provides the coordination rules needed to coordinate the workflow execution. These rules are directly derived from the requirements set by designers in the first place and also from the composite application that is the outcome of the assignment procedure. Third, we propose a transactional protocol that meets the dynamicity requirements introduced by a flexible execution environment. We believe that our approach can be used to augment recent specifications [19] by increasing their flexibility to incorporate transactional properties of business partners in the definition of adaptive coordination rules. Besides, a complete transactional framework has been implemented as a proof of concept of our theoretical results. Future work will focus on the design of security solutions for the pervasive workflow model.

APPENDIX

REFERENCES

[1] A. Ranganathan and S. McFaddin, “Using workflows to coordinate web services in pervasive computing environments,” in Proc. IEEE Int. Conf. Web Serv., 2004, pp. 288–295.
[2] Web services tool kit for mobile devices. (2002) [Online]. Available: http://www.alphaworks.ibm.com/tech/wstkmd.
[3] S. Berger, S. McFaddin, C. Narayanaswami, and M. Raghunath, “Web services on mobile devices—Implementation and experience,” in Proc. 5th IEEE Workshop Mobile Comput. Syst. Appl., 2003, pp. 100–109.
[4] F. Montagut and R. Molva, “Enabling pervasive execution of workflows,” in Proc. 1st IEEE Int. Conf. Collaborative Comput.: Netw., Appl. Worksharing, CollaborateCom, 2005, p. 10.
[5] A. K. Elmagarmid, Database Transaction Models for Advanced Applications. San Mateo, CA: Morgan Kaufmann, 1992.
[6] P. Greenfield, A. Fekete, J. Jang, and D. Kuo, “Compensation is not enough,” in Proc. 7th Int. Enterprise Distrib. Object Comput. Conf. (EDOC 2003), pp. 232–239.
[7] OWL-S specifications. (2003) [Online]. Available: http://www.daml.org/services.
[8] Business process execution language for web services [Online]. Available: http://www.ibm.com/developerworks/library/ws-bpel/
[9] F. Montagut and R. Molva, “Augmenting Web services composition with transactional requirements,” in Proc. IEEE Int. Conf. Web Serv. (ICWS 2006), Chicago, IL, Sep. 18–22.
[10] S. Bhiri, O. Perrin, and C. Godart, “Ensuring required failure atomicity of composite web services,” in Proc. 14th Int. Conf. World Wide Web, 2005, pp. 138–147.
[11] S. Mehrotra, R. Rastogi, A. Silberschatz, and H. Korth, “A transaction model for multidatabase systems,” in Proc. 12th IEEE Int. Conf. Distrib. Comput. Syst. (ICDCS 1992), pp. 56–63.
[12] H. Schuldt, G. Alonso, and H. Schek, “Concurrency control and recovery in transactional process management,” in Proc. Conf. Principles Database Syst., 1999, pp. 316–326.
[13] M. Rusinkiewicz and A. Sheth, “Specification and execution of transactional workflows,” in Modern Database Syst.: The Object Model, Interoperability, Beyond. Reading, MA: Addison-Wesley, 1995, pp. 592–630.
[14] D. Langworthy, L. F. Cabrera, and G. Copeland [Online]. Available: http://docs.oasis-open.org/ws-tx/wstx-wast-1.1-spec-os.pdf
[15] D. Langworthy, L. F. Cabrera, and G. Copeland [Online]. Available: http://docs.oasis-open.org/ws-tx/wstx-wsba-1.1-spec-os.pdf
[16] M. Abbott, A. Berson, and G. Brown [Online]. Available: http://www.oasis-open.org/committees/download.php/1184/2002-06-03.BTP_cttee_spec_1.0.pdf
[17] P. A. Lee and T. Anderson, Fault Tolerance: Principles and Practice. San Mateo, CA: Morgan Kaufmann, 1990.
[18] F. Tartanoglu, V. Issarny, A. Romanovsky, and N. Levy, “Coordinated forward error recovery for composite Web services,” in Proc. 22nd Int. Symp. Reliable Dist. Sys., 2003.
[19] D. Langworthy, L. F. Cabrera, and G. Copeland [Online]. Available: http://docs.oasis-open.org/ws-tx/wstx-wascoor-1.1-spec-os.pdf
[20] P. Doshi, R. Goodwin, and R. Akkiraju, “Parameterized semantic matching for workflow composition,” International Business Machines Corporation (IBM), Armonk, NY, Tech. Rep. RC23133, Mar. 2004.
[21] J. Gray and A. Reuter, Transaction Processing: Concepts and Techniques. San Mateo, CA: Morgan Kaufmann, 1993.
[22] S. Lu, A. Bernstein, and P. Lewis, “Automatic workflow verification and generation,” Theor. Comput. Sci., vol. 353, no. 1, pp. 71–92, 2006.
[23] S. Lu, A. Bernstein, and P. Lewis, “Correct execution of transactions at different isolation levels,” IEEE Trans. Knowl. Data Eng., vol. 16, no. 9, pp. 1070–1081, Sep. 2004.
[24] F. Curbera, R. Khalaf, N. Mukhi, S. Tai, and S. Weerawarana, “The next step in web services,” Commun. ACM, vol. 46, no. 10, pp. 29–34, 2003.
[25] M. Gudgin, “Secure, reliable, transacted; innovation in Web services architecture,” presented at the ACM Int. Conf. Manag. Data, Paris, France, 2004.
[26] M. Little, “Transactions and Web services,” Commun. ACM, vol. 46, no. 10, pp. 49–54, 2003.
[27] S. Tai, R. Khalaf, and T. Mikalsen, “Composition of coordinated web services,” in Middleware 2004: Proc. 5th ACM/IFIP/USENIX Int. Conf. Middleware, New York, pp. 294–310.
[28] G. Alonso, D. Agrawal, A. E. Abbadi, M. Kamath, R. Günthör, and C. Mohan, “Advanced transaction models in workflow contexts,” in Proc. 12th Int. Conf. Data Eng., Feb. 1996, p. 574.
[29] M.-C. Fauvet, H. Duarte, M. Dumas, and B. Benatallah, “Handling transactional properties in web service composition,” in Proc. 6th Int. Conf. Web Inf. Syst. Eng. (WISE 2005), New York, pp. 273–289.
[30] G. Alonso, F. Casati, H. Kuno, and V. Machiraju, Web Services: Concepts, Architectures, and Applications. New York: Springer-Verlag, 2003.
[31] M. P. Papazoglou, “Web services and business transactions,” World Wide Web, vol. 6, no. 1, pp. 49–91, 2003.
[32] Apache Kandula. (2007). [Online]. Available: http://ws.apache.org/kandula/.

Frederic Montagut (S’05) received the Eng. Dipl. in telecommunications from Telecom International, Evry, France, in 2004, the M.Sc. degree in network and distributed systems from Nice University, Nice, France, in 2004, and the Ph.D. degree in computer science from the Systems, Applications and Products (SAP) Research Laboratory, Mougins, France, in 2007. Since October 2004, he has been with the SAP Research Laboratory. His current research interests range from workflow systems and workflow coordination to workflow security.

Refik Molva (M’88) received the B.Sc. degree in computer science from Joseph Fourier University, Grenoble, France, in 1981, and the Ph.D. degree in computer science from Paul Sabatier University, Toulouse, France, in 1986. He is currently a Full Professor and the Head of the Department of Computer Communications, Eurecom Institute, Sophia Antipolis, France. He was also a Research Staff Member in the Zurich Research Laboratory, International Business Machines Corporation (IBM), where he was a Key Designer of the KryptoKnight system. During 1997, he was a Security Consultant at the IBM Consulting Group. He was engaged in research on multicast and mobile network security, anonymity, intrusion detection, distributed multimedia applications over high-speed networks, and network interconnection.
His current research interests include security protocols for self-organizing systems and privacy.

Silvan Tecumseh Golega received the B.Sc. degree in computer science in 2006 from Hasso-Plattner-Institut, Potsdam, Germany, where he is currently working toward the Master’s degree in computer science. He was engaged in research on “composition and coordination of transactional business processes.”
