Triggered Real-Time Databases with Consistency Constraints*

Henry F. Korth, Nandit Soparkar, Abraham Silberschatz
Department of Computer Sciences
University of Texas at Austin
Austin, TX 78712-1188 USA

*Research partially supported by TARP grant 4355, NSF grant IRI-8805215, and a grant from the IBM Corporation.

Proceedings of the 16th VLDB Conference, Brisbane, Australia, 1990

Abstract

Real-time database systems incorporate the notion of a deadline into the database system model. Usually, deadlines are associated with transactions, and the system attempts to execute a given set of transactions so as to both meet the deadlines and ensure the database consistency. This paper presents an alternative model of real-time database processing in which deadlines are associated with consistency constraints rather than directly with transactions. This model leads to a predicate-based approach to transaction management that allows greater concurrency and more flexibility in modeling real-world systems.

1 Introduction

Real-time database systems (RTDBs) incorporate timing considerations into a database system. Not only must the transactions execute correctly, but also, they must complete execution within some time limit called a deadline. Systems that incorporate strict deadlines are called hard real-time systems while those that do not are called soft real-time systems.

Real-time systems are usually applied for process-control which often require a large database of information. Hence, recent efforts have aimed at integrating the real-time systems with database systems to facilitate the efficient and correct management of the resulting real-time database systems [18]. There are several difficulties in accomplishing such an integration. A database operation (read or write) takes a highly variable amount of time depending on whether disk I/O, logging, etc. are required. Furthermore, if concurrent transactions are allowed, the concurrency control may cause aborts or delays of indeterminate length.

Most previous work on real-time transactions assumes a set of transactions and associated deadlines. It is the responsibility of the transaction manager to find a correct schedule for the transactions that will ensure that the deadlines are met.

There has been extensive study of real-time systems [19]. Formal aspects of such systems have been examined from the standpoints of scheduling (e.g., [S]) and verification [9]. In the context of real-time databases, [1, 2] consider alternative queuing disciplines with lock-based concurrency control of real-time transactions, and use simulation results to compare these techniques. [16] proposes concurrency control techniques for distributed real-time systems based on a partitioning of data. [15] discusses the specific time-dependent application of stock-market trading.

The RTDB models outlined above apply time-constraints directly to transactions, but they do not model situations where the time-constraints apply directly to states of the systems. Time-constraints on the states of the system enforce similar time-constraints on transactions that are triggered by those states.

As an example, consider an RTDB application in a manufacturing environment. Suppose that the state of the information maintained in the database indicates that the temperature in a furnace has fallen below a particular threshold value. This state of the system may necessitate the triggering of some actions that restore the temperature to a value above the threshold. The application may enforce a maximum period of time for which the temperature is permitted to remain below the threshold - and that enforces a deadline on the actions triggered by the low value. Furthermore, it may be the case that several actions may be candidates for the restoration of the temperature. For instance, there may be actions that initiate more fuel getting pumped-in, or actions that increase the oxygen supply, etc. Thus, a choice may be available, and depending on the time constraints (and other factors such as the cost of the actions), one particular action may be initiated to restore the temperature value. These actions are reflected as triggered transactions within the database.

In this paper, we propose a new approach to the modeling of an RTDB. Our approach is based on a set of explicitly defined consistency constraints for the database. Each transaction ensures that upon completion, the database remains in a state that satisfies these consistency constraints. However, in addition to such transactions that maintain correct database states, transactions may be invoked to record the effects of some external event that is generated outside the system. The ensuing change in the database state may render a consistency constraint invalid, and that constraint may need to be restored within a specific deadline. The system restores constraints by choosing one or more transactions from a pre-defined library of transactions. These transactions restore certain constraints but may invalidate other constraints. In the absence of further external events, the system must eventually return the entire database to a consistent state. In a dynamic real-time system, external events may occur with sufficient frequency to prevent global consistency, but the system must seek to ensure that no constraint remains invalid for an interval longer than a specified limit, the deadline of the constraint.

2 Transaction Model

In this section, we give an informal characterization of real-time transactions and relate this to other work on extended transaction models. In Section 3, we present a formal model for reasoning about transactions and constraints.

A real-time transaction system interacts with the external world in several ways. Events in the external world are recorded in the database. Transactions in the transaction system initiate external actions. This leads us to partition the set of transactions in a real-time system into three categories:

1. External-input transactions. Such a transaction records in the database some event that has occurred in the external world. Often, such a transaction is a write-only transaction, and is usually of short-duration.

2. Internal transactions. Such a transaction accesses the database in a similar manner as any standard database transaction except that it may be of long-duration. The purpose of this type of transaction is the restoration of "consistency" that may have been violated as a result of some external-input transaction.

3. External-output transactions. Such a transaction causes some event to occur in the world external to the system. These transactions are of short-duration from a system perspective, although the external actions they trigger may take a longer time to complete. We do not permit external-output transactions to wait for the acknowledgement of completion of the external activity. Instead, we treat transactions of this type as performing only the initiation. If further action is to be taken as a result of completion of the external activity, another transaction (an external-input transaction) must record in the database the completion of the external activity, which then triggers the execution of further internal or external-output transactions.

These three types of transactions differ in their atomicity and concurrency requirements. A write-only external-input transaction should never wait. Its writes should succeed immediately unless a "newer" value has already been recorded in the database. These requirements are justified since the external-input transactions are used to record the outside world within the system. In a real-time application, such events need to be recorded in the database as soon as possible so that any resulting inconsistency may be corrected. A consequence of this is that it may not be desirable to ensure serializable executions even if multiple versions of data are retained.

The notion of transactions violating the database consistency and other transactions reading possibly inconsistent database states is a major deviation from the standard transaction model. We represent such actions using the NT/PV model of [12] by defining input and output conditions for each transaction. These conditions are predicates on the database state. The input condition is a pre-condition of transaction execution and must hold on the state that the transaction "observes." The output condition is a post-condition which the transaction guarantees on the database state at the end of the transaction provided that there is no concurrency and the database state initially seen by the transaction satisfies the input condition. Thus, in the NT/PV model, as in the standard model, transactions are assumed to be correct programs, and responsibility for correct concurrent execution lies with the transaction manager.

The actions required to restore consistency may involve more than direct database access. Internal transactions spawn external-output transactions as subtransactions to modify the outside world as part of a process of restoring consistency. Other subtransactions may be required to test the results of external-output transactions. The potential long-duration of internal transactions make a requirement of serializability impractical [10]. Furthermore, the nested nature of these transactions requires an extension of the transaction model to support nested transactions [14]. A serializability-based approach to nested transactions is discussed in [14] while correctness of nested transactions without the requirement of serializability is presented in [3, 7, 12].

The use of multiple versions of data is often indicated in real-time database applications. An obvious utility is in situations which require the monitoring of data as it assumes different values in time; that is, the "trends" exhibited by the values of the data are used to trigger actions. Examples include rising temperature of a furnace in a nuclear application and the change in the distance of an approaching aircraft in radar tracking systems.

The above considerations lead us to suggest that our real-time transaction model may include (1) nesting, (2) versions, and (3) correct concurrent execution without the requirement of traditional serializability. We use the NT/PV model of [12] as the basis for our work since this model supports the above features.

Transactions in real-time systems may be submitted either by users, or by external devices. In addition, transactions may also be triggered by the state of the system. If an external-input transaction changes the database state to an inconsistent state, an internal transaction must be run to restore consistency. These transactions are not necessarily triggered by an external-input transaction. Rather they may depend on both the external-input transaction and the database state. For a given inconsistent state, there may be several transactions that are enabled for triggering. The transaction system is free to choose a subset of those transactions that are enabled provided that subset is sufficient to restore consistency. This choice is, in its most general form, computationally complex. We explore this idea further in Section 4.

Our model of triggered transactions is related to that used in active databases [13]. However, for the purposes of this paper, the manner in which transactions are selected for execution differs from active databases in that we base the selection on the goal of consistency restoration.

The concept of triggered transactions, along with our characterization of real-time transactions above, provides five types of transactions: (1) external-input transactions (non-triggered by definition), (2) triggered internal transactions, (3) non-triggered internal transactions, (4) triggered external-output transactions, and (5) non-triggered external-output transactions.

The system model we consider in this paper may be regarded as comprising of a set $T = \{t_1, t_2, \ldots, t_n\}$ of predefined transaction-types, and a finite set $C = \{c_1, c_2, \ldots, c_m\}$ of predefined consistency constraints in the form of conjuncts. Conjuncts are formulae consisting of a disjunction of possibly negated terms. The consistency constraint for the entire database may be represented by $P \equiv \bigwedge_{i=1}^{m} c_i$. As far as the consistency constraints are concerned, for the purposes of this paper, we restrict our attention to predicate calculus rather than first-order logic since all quantifiers will be over a finite set (the database). Some instances of the transaction-types are triggered by the falsehood of a conjunct, and may function to restore the truth of the conjunct. Certain instances of the transaction-types, upon execution, may render inconsistent some conjuncts. Thus, the system may be regarded as consisting of transaction-types and conjuncts that interact with each other.

3 Predicate-Priority Graph

To facilitate the description of our model, and to make the algorithmic analyses easier, we define a predicate-priority graph (PPG). The PPG captures the relationships between the transaction-types and the conjuncts, and its annotations are used to incorporate various timing constraints. A PPG is a directed bipartite graph with a set of vertices $V = T \cup C$, where $T$ denotes the set of transaction-types, and $C$ denotes the set of conjunct vertices.
The edges in a PPG represent the triggering of transaction-types by the falsehoods of the conjuncts, and the invalidation of conjuncts by the transaction-types. If an instance of a transaction-type $t_i$ may invalidate a conjunct $c_j$, then the directed edge $(t_i, c_j)$ appears in the graph. If a transaction-type $t_k$ ensures the truth of a conjunct $c_l$ upon completion, then the directed edge $(c_l, t_k)$ appears in the graph. Thus, the PPG represents the transaction-types available to the system for restoring consistency. We exclude non-triggered transactions in order that the PPG may be a static structure. The only dynamic aspect to this graph will be the markings introduced below.

The term transaction-type was used above to emphasize that we are creating a vertex for each type of triggered transaction, not a vertex for each execution of a specific transaction. If we had a vertex for each actual execution, then the PPG would become a dynamic structure. In this paper, we restrict attention to only static PPGs. The reason is that the static situation is a subcase of the dynamic one, and hence, it indicates some problems that may be encountered in the analyses of the more general situation. As we shall see, the analysis of the static PPG itself reveals several computationally intractable problems that indicate the need for heuristic approaches - and these results also apply to the dynamic PPG.

An example of a PPG is shown in Figure 1. Transaction-types are represented by square vertices, and the round vertices correspond to conjunct vertices. In the example, an inconsistency in conjunct $c_1$ may be resolved by executing an instance of either one of the transaction-types $t_1$ or $t_2$. Furthermore, the execution of a transaction of type $t_1$ may result in the invalidation of the conjuncts $c_5$, $c_6$ and $c_7$.

Let us now examine how the PPG is used. If the database is inconsistent, the vertices corresponding to the false conjuncts are marked.
To restore consistency, it is necessary to run an instance of the transaction-type associated with the head of at least one out-edge of each marked vertex. However, running these transactions may lead to side-effects beyond restoring the truth of certain previously-false conjuncts. Possibly, these side effects will result in other conjuncts becoming false. This results in further marked vertices.

It is important to note that, given a graph and a set of marked vertices, there may exist many ways to resolve the inconsistencies. For each marked vertex, the out-degree indicates the number of potential options for restoring the truth of the corresponding conjunct. If a vertex corresponding to a conjunct is a sink (has no out-edges), then there is no way to restore the truth of this conjunct within the system. A non-triggered transaction (either an external-input or a non-triggered internal transaction) is required to restore the truth of this conjunct. Such a situation requires either human intervention or a "lucky" turn of events external to the system. Thus, we require that all sinks correspond to transaction-types.

A cycle in the PPG represents a potentially unstable situation. The situation is only potentially unstable, since an edge from a transaction-type vertex to a conjunct vertex means only that an instance of the transaction-type may make the conjunct false. Also, if in restoring the truth of a conjunct, a transaction-type vertex that is not within the cycle is chosen, the situation may not be unstable. A safe strategy (i.e., one that is not potentially unstable) for resolving an inconsistent database state can be represented by an acyclic subgraph of the PPG such that the subgraph contains all the marked conjunct vertices of the PPG, retains all out-edges in the PPG of transaction-type vertices in the subgraph, and retains at least one out-edge of each conjunct vertex in the subgraph. We shall restrict attention to strategies that are not potentially unstable.
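The bookkeeping described above (out-edges as restoration options, conjunct sinks, and cycles as potentially unstable situations) can be illustrated with a small Python sketch. This is only one possible encoding, not the authors' implementation: the two dictionaries `restores` (edges from a conjunct to a transaction-type) and `invalidates` (edges from a transaction-type to a conjunct), the function names, and the edges involving `t3` and the sink status of `c7` are assumptions made for illustration. Only the edges around `c1`, `t1`, and `t2` follow the description of Figure 1.

```python
def successors(restores, invalidates):
    """Merge both edge kinds into one adjacency map over all PPG vertices."""
    graph = {}
    for source, targets in list(restores.items()) + list(invalidates.items()):
        graph.setdefault(source, set()).update(targets)
        for target in targets:
            graph.setdefault(target, set())
    return graph


def conjunct_sinks(restores, invalidates):
    """Conjuncts with no out-edge: no triggered transaction-type restores them."""
    conjuncts = set(restores) | {c for cs in invalidates.values() for c in cs}
    return {c for c in conjuncts if not restores.get(c)}


def has_cycle(restores, invalidates):
    """Depth-first search for a directed cycle (a potentially unstable situation)."""
    graph = successors(restores, invalidates)
    state = {}  # vertex -> "active" while on the DFS stack, "done" afterwards

    def visit(v):
        state[v] = "active"
        for w in graph[v]:
            if state.get(w) == "active":
                return True
            if w not in state and visit(w):
                return True
        state[v] = "done"
        return False

    return any(v not in state and visit(v) for v in graph)


# Hypothetical PPG: c1 may be restored by t1 or t2; t1 may invalidate c5, c6, c7.
restores = {"c1": {"t1", "t2"}, "c5": {"t3"}, "c6": {"t3"}, "c7": set()}
invalidates = {"t1": {"c5", "c6", "c7"}}
marked = {"c1"}

options = {c: restores[c] for c in marked}          # out-edges of marked vertices
stuck = conjunct_sinks(restores, invalidates)       # conjuncts nobody can restore
unstable = has_cycle(restores, invalidates)         # no cycle in this example
with_feedback = dict(invalidates, t3={"c1"})        # t3 may now invalidate c1
unstable_with_feedback = has_cycle(restores, with_feedback)
```

In this toy graph `c7` is a conjunct sink, so the requirement that all sinks be transaction-types is violated until some transaction-type is added for it; adding the edge from `t3` back to `c1` creates the cycle c1, t1, c5, t3, c1.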
As an example, consider the PPG of Figure 1 again. The subgraph shown within the dotted outline in the figure provides a strategy to resolve the inconsistencies if $c_1$ and cs (and possibly any or all of $c_5$, $c_6$, and $c_7$) are the only marked vertices.

We consider sub-DAGs of the PPG that resolve an inconsistent database to have roots at all marked vertices and sinks that are transaction-types. As before, all out-edges in the PPG from transaction-types in such a DAG must be included in the DAG. The partial order on transaction-types induced by the DAG must be observed if the execution will, in fact, restore consistency (without requiring the execution of multiple instances of a transaction-type).

As described above, a DAG subgraph may be identified in a marked PPG so as to resolve the inconsistencies. The subgraph should include all the marked vertices, and all the sinks should correspond to transaction-types with no outgoing edges. We call such a subgraph an inconsistency-resolution subgraph (IRS) for a given marked PPG. An IRS provides a strategy by which the inconsistencies in the PPG may be resolved: Executions of the transactions within an IRS that obey the partial order imposed by the IRS will resolve the inconsistencies. A more formal definition of a PPG and an IRS is now provided.

Definition 1. A predicate-priority graph is a 3-tuple $(C, T, E)$ representing a bipartite graph with vertex set $C \cup T$ and edge set $E \subseteq ((C \times T) \cup (T \times C))$. A marked PPG is a PPG in which a nonempty set of vertices $X \subseteq C$ is identified as being "marked". □

The inconsistency-resolution subgraph (IRS) defined below represents a strategy for restoring consistency to the database given that the marked set of conjuncts are false.

Definition 2. Let $G = (C, T, E)$ be a PPG in which the vertices in $X \subseteq C$ are marked.
An inconsistency-resolution subgraph of $G$ is a 3-tuple $G' = (C', T', E')$ such that, (1) $X \subseteq C' \subseteq C$, $T' \subseteq T$, and $E' \subseteq E$, (2) For all edges $(t_j, c_i) \in E$ such that $t_j \in T'$, we have $c_i \in C'$ and $(t_j, c_i) \in E'$, (3) For all $c_i \in C'$, there exists a path in $G'$ from $c_i$ to $t_k$, where $t_k$ is a sink in $G$, and (4) $G'$ is acyclic. □

A natural question arises as to whether an IRS exists for a particular marked PPG. The following result implies that the question is easily settled.

Theorem 1. Let $G$ be a marked PPG. The problem of deciding whether there is an IRS $G'$ for $G$ is solvable in polynomial-time.

Proof Sketch. We provide a sketch of a requisite polynomial-time algorithm that manipulates the PPG, $G$. For the ease of presentation, introduce a (pseudo) transaction-type vertex, $t' \in T$, with out-edges $(t', c_i)$ for every $c_i \in X$, and a (pseudo) conjunct vertex, $c' \in C$, with an out-edge $(c', t')$.

1. while $c' \in C$ do
   (a) Choose a sink transaction-type vertex, $t_j$. If none exists, print "No IRS exists", and stop.
   (b) For each conjunct vertex $c_i$ such that $(c_i, t_j) \in E$, delete all edges that are adjacent to $c_i$. Hence, delete $c_i$.
   (c) Delete $t_j$.
   endwhile
2. print "IRS exists", and stop.

With appropriate data structures, the algorithm takes $O(|C| + |T| + |E|)$ time, and since it finds a way to resolve every vertex in $X$, it places the problem in polynomial-time. □

4 Timing Considerations

Timing constraints are represented in the PPG by associating a time interval with each conjunct and a time cost with each transaction-type. The value associated with each conjunct represents the maximum duration of a time interval during which the corresponding conjunct may be false. The time cost represents an estimate of the execution time of the transaction-type.

Typically, real-time analysis is based upon worst-case assumptions about execution time so as to ensure the correctness of a schedule. If we took that approach to real-time database management, we would be forced to make drastic assumptions about page-fault frequency, delays due to concurrency control requirements, and other resource-contention factors. For example, unless detailed information about the physical-level schema is made available to the real-time system, it is necessary to assume that every data item reference incurs a page fault, consisting of the write of a page frame back to disk, the reading of the data page, plus requisite disk access to support write-ahead logging and index page access. The difference between the worst case and the expected case is so large that a worst-case analysis for real-time database transactions would find a solution only for systems that have an economically unjustifiable amount of redundant computing power. Therefore, although the formal model developed in this paper is independent of the determination of the execution times, for the purposes of this paper, we consider the expected-case estimates of the transaction-type execution times. Indeed, if the database is entirely memory-resident (see, e.g., [17]), the differences between the worst-case and expected time estimates are likely to be negligible. In fact, most real-time applications have memory-resident data.

The incorporation of time into our model is achieved by the use of the functions $W_\pi$ and $W_\tau$ which denote mappings from the conjuncts $C$ and the transaction-types $T$, respectively, to the set of non-negative integers. This requires the redefinition of the PPG to incorporate the timing constraints. We term this new PPG as a weighted PPG, while the original PPG is termed an unweighted PPG. These terms will be used in case of ambiguity in referring to the different types of the PPGs.

Definition 3. A (weighted) predicate-priority graph (PPG) is a 5-tuple $(C, T, E, W_\pi, W_\tau)$ representing a bipartite graph with vertex set $C \cup T$ and edge set $E \subseteq ((C \times T) \cup (T \times C))$. $W_\pi$ and $W_\tau$ are the time interval and time cost functions, respectively. □

Notice that an unweighted PPG can be represented by a PPG in which $W_\tau$ maps all elements of $T$ to 1, and $W_\pi$ maps all elements of $C$ to $l$ (where $l$ is suitably chosen). Also, we can extend the notion of a marked unweighted PPG to a marked weighted PPG in a natural manner. Note that in the case where $W_\pi(c_i) < W_\tau(t_j)$ for a conjunct $c_i$ and a transaction-type $t_j$, it is not worthwhile to include an edge $(c_i, t_j)$ in the PPG. Hence, we shall always assume that for an edge $(c_i, t_j)$ in a PPG, it is always the case that $W_\pi(c_i) \ge W_\tau(t_j)$.

We also need to redefine the inconsistency-resolution subgraph for a weighted PPG. Again, the IRS represents a strategy for restoring consistency to the database given that the marked set of conjuncts are false.

Definition 4. Let $G = (C, T, E, W_\pi, W_\tau)$ be a weighted PPG in which the vertices $X \subseteq C$ are marked. An inconsistency-resolution subgraph of $G$ is a 5-tuple $G' = (C', T', E', W'_\pi, W'_\tau)$ such that, (1) $X \subseteq C' \subseteq C$, $T' \subseteq T$, and $E' \subseteq E$, (2) For all edges $(t_j, c_i) \in E$ such that $t_j \in T'$, we have $c_i \in C'$ and $(t_j, c_i) \in E'$, (3) For all $c_i \in C'$, there exists a path in $G'$ from $c_i$ to $t_k$, where $t_k$ is a sink in $G$, (4) $G'$ is acyclic, (5) $W'_\pi$ is the restriction of $W_\pi$ to $C'$, and $W'_\tau$ is the restriction of $W_\tau$ to $T'$, and (6) For all $c_i \in C'$, there is an edge $(c_i, t_j) \in E'$ such that $W_\tau(t_j) \le W_\pi(c_i)$. □

5 Selecting an IRS

The PPG and the IRS defined above allow us to pose several important questions regarding the algorithms that will use them. The issues related to a PPG and an IRS are two-fold. First, an efficient selection procedure is needed to identify a good IRS, where goodness is related to how profitably the IRS can be used to resolve the inconsistencies within the deadlines imposed. Second, once the IRS has been identified, efficient approaches are needed to execute the actions of the transaction-types specified by the IRS.
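Before a good IRS can be selected, one has to know whether any IRS exists at all. The sink-elimination procedure from the proof sketch of Theorem 1 can be sketched in Python as follows. This is a minimal illustration for the unweighted case under stated assumptions, not the authors' code: the dictionary encoding (`restores` for edges from a conjunct to a transaction-type, `invalidates` for edges from a transaction-type to a conjunct) and the function name `irs_exists` are hypothetical. Instead of adding the pseudo vertices $t'$ and $c'$, the sketch runs the elimination to a fixpoint and then checks that every marked conjunct was deleted, which gives the same answer.

```python
from collections import deque


def irs_exists(restores, invalidates, marked):
    """Return True if every marked conjunct can be eliminated via sink transaction-types."""
    transactions = set(invalidates) | {t for ts in restores.values() for t in ts}

    # Reverse adjacency: which conjuncts point at t, and which transactions point at c.
    triggers_of = {t: set() for t in transactions}
    for c, ts in restores.items():
        for t in ts:
            triggers_of[t].add(c)
    invalidators_of = {}
    out_degree = {t: 0 for t in transactions}
    for t, cs in invalidates.items():
        out_degree[t] = len(cs)
        for c in cs:
            invalidators_of.setdefault(c, set()).add(t)

    # Step (a): start from transaction-types that are sinks (they invalidate nothing).
    sinks = deque(t for t in transactions if out_degree[t] == 0)
    deleted = set()
    while sinks:
        t = sinks.popleft()
        # Step (b): delete every remaining conjunct that t can restore.
        for c in triggers_of[t]:
            if c in deleted:
                continue
            deleted.add(c)
            # Removing c removes the edges into it, possibly creating new sinks.
            for u in invalidators_of.get(c, ()):
                out_degree[u] -= 1
                if out_degree[u] == 0:
                    sinks.append(u)
        # Step (c): t is now spent; it is never re-queued.

    return set(marked) <= deleted


# Hypothetical examples.
acyclic = irs_exists(
    {"c1": {"t1", "t2"}, "c5": {"t3"}, "c6": {"t3"}, "c7": {"t4"}},
    {"t1": {"c5", "c6", "c7"}},
    {"c1"},
)
mutual = irs_exists(
    {"c1": {"t1"}, "c2": {"t2"}},
    {"t1": {"c2"}, "t2": {"c1"}},
    {"c1"},
)
orphan = irs_exists({}, {}, {"c9"})
```

Each conjunct is deleted once and each edge is examined a constant number of times, which matches the linear bound stated in the proof sketch. In the second example the two transaction-types invalidate each other's conjuncts, so no sink exists and no IRS is found; in the third, the marked conjunct has no restoring transaction-type at all.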
For the time being, let us disregard the effects of the PPG-imposed partial ordering among the transaction-types and concurrency control issues. An IRS can be used to decide how to resolve the inconsistencies, and it must ensure that each conjunct is false for a period no longer than its time interval, on the assumption that time costs for transaction-types are accurate.

There may exist several DAGs that may be used for a particular marked PPG and each represents an IRS as defined above. In this case, a decision needs to be made as to which particular one is to be chosen. Intuitively, the IRS that represents the best strategy to resolve the inconsistencies should be the one that is selected. Although the precise definition of a good IRS is dependent on the application, it is possible to identify certain important traits that an IRS should possess. For example, an IRS that provides a method to restore consistency promptly should be regarded as being better than one that implies a slower method. Concurrency aspects for running the restoring transaction-types need to be considered to achieve this. Another measure of goodness could be the choice of an IRS that renders the least number of consistency constraints false.

A third measure of goodness arises from the potential inaccuracy in time costs for transaction-types. This measure is related to the scheduling of transactions with regard to the available slack time which, in the case of a transaction-type $t_j$ that is chosen to resolve the inconsistency in a conjunct $c_i$, is $W_\pi(c_i) - W_\tau(t_j)$. Sufficiently large slack times "absorb" the inaccuracies of the time estimates for transactions that are scheduled sufficiently early, and this is further discussed in Section 5. Therefore, we suggest that the goodness of an IRS may also be measured as a function of the amount of slack time left for the restoration of the truth of conjuncts. The nature of this function is application-dependent. Example functions include the total slack time, the geometric mean of slack times, and the minimum of the slack times for each conjunct.

The conjunct-based model of real-time transactions represented by the PPG provides the system with additional degrees of freedom in managing a real-time database. Not only can the concurrency and recovery managers take into account the conjunct deadlines and time costs associated with the transaction-types, but also the system has some choice among the set of transaction-types to use in response to a particular collection of violated conjuncts that arise due to external events. Below, we consider the computational complexity of taking optimal advantage of these degrees of freedom.

5.1 Selection Based on Weights

Consider an unweighted, acyclic PPG, $G$. We may assume that the selection criterion for an IRS is obtaining one that includes the fewest number of transaction-type vertices.

Problem 1. (TUAP) The Transaction-weight Problem for a marked, unweighted, acyclic PPG is: Given a marked, unweighted, acyclic PPG, $G$, and an integer $K$, is there an IRS, $G'$, such that the number of elements in $T'$ is at most $K$? □

Theorem 2. The TUAP problem is NP-complete.

Proof Sketch. The proof of NP-easiness is as follows. We demonstrate how to verify in polynomial-time that a non-deterministically selected graph $G'$ is an IRS with $|T'| \le K$. Verifying that $G'$ represents an IRS is accomplished by checking that $X \subseteq C'$, and that for every $c_i \in C'$, there exists an edge $(c_i, t_j) \in E'$. Checking that $|T'| \le K$ completes the verification.

We now prove NP-hardness. An instance of the NP-complete Satisfiability problem (LO1 in [5]) is reduced to the TUAP problem. Let $P$ represent the conjunction of $m$ clauses in LO1, i.e., $P \equiv \bigwedge_{i=1}^{m} C_i$ where the clauses are formed over $n$ boolean variables $x_1, x_2, \ldots, x_n$. As shown in Figure 2, form an instance of a PPG, $G$, with $C = \{p, c, c_1, c_2, \ldots, c_m, x_1, x_2, \ldots, x_n\}$, $T = \{p', c', Fx_1, Fx_2, \ldots, Fx_n, Tx_1, Tx_2, \ldots, Tx_n\}$, and $X = \{p\}$. Besides the edges explicitly shown in Figure 2, $G$ includes an out-edge from a vertex $c_i$ to either $Tx_j$ or to $Fx_j$ for every positive or negative literal, respectively, formed using an $x_j$ occurring in the clause $C_i$ of the Satisfiability problem instance. We prove that $P$ is a satisfiable instance of LO1 if and only if $G$ contains an IRS, $G'$, with $|T'| \le (n + 2)$. Note that the construction guarantees the existence of an IRS.

Assume that a requisite IRS, $G'$, exists. $|T'| \ge (n + 2)$ since included in $G'$ are $p'$, $c'$, and at least one of $Tx_i$ or $Fx_i$ for every $x_i$. Since $G'$ is a requisite IRS, we have $|T'| = (n + 2)$. This implies that exactly one of the vertices reachable from a vertex $x_i$ is included in $T'$. Assign a boolean value of T or F to the corresponding variable $x_i$ in LO1 according as $Tx_i$ or $Fx_i$ is included, respectively, in $T'$. It is clear that every clause of LO1 will have one satisfied literal by this assignment.

If there is a truth assignment for every $x_i$ in the problem instance of LO1 that satisfies $P$, consider a subgraph $G'$ as described next. The set $T'$ consists of $p'$, $c'$, and $Tx_i$ or $Fx_i$ according as $x_i$ is assigned T or F, and the set $C' = C$. The subgraph $G'$ contains all possible edges of $G$. It is easy to see that $G'$ is an IRS with $|T'| \le (n + 2)$. □

If we introduce the timing constraints in terms of the functions $W_\pi$ and $W_\tau$, a selection criterion for an IRS could be the minimization of the sum of the time costs of the transaction-type vertices included in the IRS. This criterion is suggested by the need for the "fastest" inconsistency-resolution strategy.

Problem 2. (TWP) The Transaction-weight Problem for a marked, weighted, acyclic PPG is: Given a marked, weighted PPG, $G$, and an integer $K$, is there an IRS, $G'$, such that the sum of the weights of the elements in $T'$ is at most $K$? □

Theorem 3. The TWP problem is NP-complete.

Proof Sketch. The TUAP problem is the TWP problem with unit weight assignments to the elements of $T$. □

We consider now a different selection criterion that is based on the number of conjuncts that may be rendered false. In the case of a marked, unweighted, acyclic PPG, a related measure of goodness would be to find an IRS which minimizes the number of consistency conjuncts that it renders false.

Problem 3. (PUAP) The Predicate-weight Problem for a marked, unweighted, acyclic PPG is: Given a marked, unweighted, acyclic PPG, $G$, and an integer $K$, is there an IRS, $G'$, such that the number of elements in $C'$ is at most $K$? □

Theorem 4. The PUAP problem is NP-complete.

Proof Sketch. The proof of NP-easiness is the same as that for the TUAP problem with a verification of $|C'| \le K$ replacing $|T'| \le K$. To prove NP-hardness, we exhibit a similar reduction from the problem LO1 as we did for the TUAP problem. The instance of the PPG constructed is modified to have the additional subgraphs at the nodes $Tx_i$ and $Fx_i$ as shown in Figure 3. Set $K = (2n + m + 2)$. The proof is now clearly similar to the NP-hardness proof of the TUAP problem. □

The above theorems indicate that the selection procedures to find optimal IRS graphs for the PPG graphs is difficult. We conjecture that there exist interesting cases of the PPG problems that are both of practical interest and of polynomial complexity. Furthermore, we begin to anticipate the need for heuristic approaches to find good IRS graphs in place of the "best" IRS graph.

5.2 Selection Based on Slack Times

Large slack times allow a greater flexibility in scheduling transactions, and in time-constrained systems, this flexibility is valuable. To analyze the PPG in terms of slack times and scheduling, we first formalize some of these notions.

Definition 5. The potential slack time for a conjunct vertex $c_i$ in a PPG, $G = (C, T, E, W_\pi, W_\tau)$, is given by $S_\pi(c_i) = W_\pi(c_i) - \min_{(c_i, t_j) \in E}(W_\tau(t_j))$. □

The slack time $S_\pi(c_i)$ does not provide a precise value for a conjunct $c_i$ since there is an inherent inaccuracy associated with the $W_\tau$ values. Furthermore, unless the transaction-type vertex $t_j$ that corresponds to the minimum weight is chosen to resolve the inconsistency, the potential slack time may not be realized. However, $S_\pi$ does serve the purposes of approximation, especially if the transaction-types can be assumed to take unit time - in which case the potential slack time is always realized subject to accurate estimates for the transaction-type time costs.

5.2.1 Total Slack Time

The sum of the slack times associated with the conjunct vertices of an IRS, $G'$, is called the total slack time of the IRS, and is denoted by $\mathit{slack}(G')$. As mentioned earlier, assume that some application indicates that a selection criterion may be based on the maximization of the total slack time. With the $S_\pi$ values as provided, the IRS chosen directly would be the PPG itself - clearly an unacceptable choice. Hence, we use the method described below to limit the number of vertices chosen while retaining the criterion of total slack time maximization.

Definition 6. The inverse slack time associated with a conjunct vertex $c_i$ is given by $S'_\pi(c_i) = \eta - S_\pi(c_i)$ where $\eta \ge (1 + \max_{c_j \in C}(S_\pi(c_j)))$. □

The constraint on the value of $\eta$ is to ensure that $S'_\pi(c_i) \ge 1$ for all $c_i \in C$. The reason why $\eta$ is left unspecified in the definition is explained below.

stances of transaction-types which may be necessitated by concurrency control considerations. As mentioned earlier, if a transaction is scheduled early, the inaccuracies in the transaction execution time estimates are less likely to affect the deadline requirements on the conjunct inconsistencies. We examine slack times in more detail in Section 6.

In the discussion to follow, we assume for simplicity that an IRS exists. Suppose that an IRS, $G'_{\min}$, is chosen such that the sum of the $S'_\pi$ values associated with its conjunct vertices is the smallest among all the IRSs, $G'$, that are possible. Using the above definition, we have $\eta |C'_{\min}| - \mathit{slack}(G'_{\min}) \le \eta |C'| - \mathit{slack}(G')$. Notice that $|C'_{\min}| = |C'|$ implies that $\mathit{slack}(G'_{\min}) \ge \mathit{slack}(G')$, and that $\mathit{slack}(G'_{\min}) = \mathit{slack}(G')$ implies that $|C'_{\min}| \le |C'|$. Thus, for two IRSs, if the number of conjunct vertices in each is the same, the one with a larger total slack time is preferred by this minimization criterion. If the total slack times of the two IRSs are equal, then this criterion chooses the one with fewer conjunct vertices.

As mentioned above, attempting to maximize the total slack time without using a notion such as the inverse slack time leads to the selection of an unnecessarily large IRS with too many conjunct vertices. This is undesirable since the inclusion of a conjunct vertex in an IRS implies that the inconsistency-resolution process may cause that conjunct to become inconsistent. Thus, there exists a trade-off between increasing the total slack time, $\mathit{slack}(G')$, and decreasing the number of conjunct vertices, $|C'|$, in the IRS. It is the value of $\eta$ that determines the importance attached to each. A small value of $\eta$ gives more importance to maximizing $\mathit{slack}(G')$, whereas a large value of $\eta$ gives more importance to minimizing $|C'|$. This is clear by examining the expression $\eta |C'| - \mathit{slack}(G')$ which is the sum of the inverse slack times of the vertices in $C'$.

Consider a modified PPG, $G$, in which for all $c_i \in C$ and $t_j \in T$, we set $W_\pi(c_i) = S'_\pi(c_i)$ and $W_\tau(t_j) = 1$. By introducing inverse slack times in this manner, and choosing a desired value for $\eta$, the question of maximizing the total slack time for an IRS reduces to the following problem.

Problem 5. (IST) The Individual Slack Time Problem for a PPG is: For a given marked, weighted, acyclic PPG, $G$, and an integer $K$, is there an IRS, $G'$, such that $\max_{c_i \in C'}(S'_\pi(c_i))$ is at most $K$?
•I Theorem Proof Sketch. Add a (pseudo) transaction-type vertex t’ to T with outedges (t’,ci) to every ci E X. With each vertex v E CUT, associate two values, V(v) and tag(v). Set tag(tj) = 1 for each sink transactiontype vertex and set all the remaining V and tug values to 0. tj, 1. while (4 (cl tag(v) or as := 1. 2. if V(t’) 5 K then print and stop. “Yes” else print “No”, At the end of loop statement, a tagged vertex, v, has the value V(v) that provides the cost of the subgraph of the best IRS (in the IST sense) that is rooted at that vertex. With the use of suitable data structures, the algorithm runs in O(]Cl + ITI + IE]) time - thereby placing IST in polynomial-time. 0 5. The PWP Problem is NP-complete. Large Individual Choose vertex v with tag(v) = 0 and all successor vertices u with tag(u).= 1. value of V(v) to max(,,u)E&V(u)) max(Si(v), minc,,u)EE(V(U))) according v E T or v E C, respectively. 5.3 Proof Sketch. The PUAP problem is the PWP problem with unit weight assignments to the elements inc. 0 5.2.2 tag(t’) = 0 do (b) Set the Problem 4. (P WP) The Predicate-weight Problem for a marked, weighted PPG is: Given a marked, weighted, acyclic PPG, G, and an integer K, is there an IRS, G’, such that the sum of the weights of the elements in C’ is at most K? 0 Theorem The IST problem is in polynomial- 6. time. Discussion The significance of the intractable results is only that the optimal solutions are computationally very costly to obtain. However, as in many other situations, near optimal solutions would serve almost as well. By sacrificing optimality, we can make use of several approximation methods available in the literature (e.g., from [5]). Such h euristic methods are well-studied and provide computationally inexpensive means to obtain near-optimal solutions. The fact that formal analysis of this nature is possible in our formulation is a very encouraging indication. 
Slack Times It may be argued that it is more germane to use a selection criterion for an IRS based on the largeness of the slack times associated with the conjuncts. That is, the cost of an IRS G’ = (C’, T’, E’, WL, Wi) is max,,Ec,(SL(ei)). Large slack times provide the flexibility in scheduling the inconsistency-resolving in- 78 The PPG that we have dealt with so far may be regarded as “static”, since the only “dynamic” aspect of the PPG are the markings. It is possible to consider a more complex “dynamic” *version of a PPG where the weights may change dynamically, or the transactiontypes are replaced by transaction instances. However, the intractability of the problems encountered in the static case indicate that the dynamic version would definitely pose problems that are at least as difficult. Thus, the study of a simpler model provides a basis for directly seeking heuristics in the more complicated models. 6 Using of each other, it is the case that a single execution of t4 will not suffice to resolve both the inconsistencies. In a similar situation, the example in Figure 5 shows a conjunct vertex, cl, that may become inconsistent due to the execution of either tl or t2. Suppose that tl makes cl inconsistent, and t2 does the same within the next W&(ci) = 3 units of time. In this situation, no matter when t3 is scheduled, the time period for which cl will remain inconsistent will exceed WK(cr). The occurrence of problems such as those illustrated in the two examples above is not peculiar to our particular formulation. They will occur in general in systems with timing constraints, and the problems must be addressed if real-time databases are to be realized. Our model serves to exhibit these problems as well as to serve as a tool by which they may be analyzed. In the examples just discussed, notice that if the conjuncts have larger slack times due to larger deadlines, the problems may be alleviated. 
For example, if we changed Figure 4 to have Wn(cz) = 4, and changed Figure 5 to have WK(cl) >> 3, then the scheduling of the inconsistency-resolution transactions may be successfully accomplished. These examples show how large slack times permit the transaction-types to exceed their inherently inaccurate estimates of execution-times so long as their instances are scheduled’ sufficiently early. Large slack times are useful in other contexts as well. Before transactions begin executing, it is often the case that the resources they need must be obtained - and this could be time-consuming. Furthermore, the dura tion of this resource-gathering phase is indeterminate and it depends on the other transactions that are executing concurrently in the system. If the transactions are triggered by conjuncts with large slack times, the initial phase of the transactions could be safely accommodated by scheduling the transactions early. One way to accomplish this to a certain extent is to identify the conjuncts with large slack times, and to use the notion of nested transactions as follows. The conjuncts with small slack times that occur in the IRS are embodied within the nested transaction-types. The conjuncts that have been identified with large slack times serve as triggering conjuncts for the nested transactions. Thus, the IRS is regarded as a collection of partially ordered nested transaction-types - most of which are triggered by conjuncts with large slack times. Details regarding nested transactions are available in [12, 141. As an example, consider the PPG shown in Figure 6. We represent conjuncts that have been identified to have large slack times by triangular vertices. In the manner explained above, some vertices of the PPG are shown to be grouped together by the dotted outlines to form nested transaction-types that are denoted by ntl, nt2, nt3, and nt4. 
The conjunct vertex cl may trigger the IRS Once an IRS is chosen, the question arises as to how the actions that it implies should be scheduled. It may be argued that since the transactions are likely to interact, concurrency control requirements may render the selection criteria for the IRS untenable. However, note that the intractability of the problems encountered indicate that additional criteria will not make the problems any easier, and heuristic methods must be used. Therefore, we separate the two issues of selection and scheduling for an IRS. The detailed analysis of the use of an IRS is beyond the scope of this paper, and we restrict ourselves to indicating the important issues involved in such analyses. 6.1 Slack Scheduling, Nested Transactions Times, and Consider a subgraph of an IRS in Figure 4. The parenthesized numbers give the values of IV, and IV, for the conjunct vertices and the transaction-type vertices, respectively. Assume that cz and ca become inconsistent immediately after the completion of ti. The IRS chosen does not allow any slack time for the resolution of the inconsistency in either of these conjuncts, and hence, tz and tz are scheduled immediately. The conjuncts cd and cz may become inconsistent immediately after tz and t3 complete, respectively. Notice that since neither c4 nor cg have any slack time, and hence, as soon as either of them becomes inconsistent, t4 must be scheduled. However, in this example, c4 and cz become inconsistent within WT(t4) = 3 time units of each other (in fact, within 1 time unit) - but not simultaneously. Thus, if the same transaction from the transaction-type t4 is used to resolve the inconsistencies, irrespective of when it is scheduled, one of the two conjuncts will remain inconsistent for a period greaterthan its deadline. Furthermore, assuming that c4 and c5 do not become inconsistent within W, (t4) time units 79 avoided. The use of versions of data in this context also helps to alleviate the problem. 
Clearly, it is important to identify where the transactions may interact so as to reduce the interactions to limit the delays. Thus, our model plays the dual role of describing the transactions as well as prescribing their design to control the contention. We highlight some of the immediate facets of transaction interaction next. Let 5’ be an IRS of a PPG G. Let ckl, tj,, CL,,,i?ja, . . . , ckm,tjn be a path in S. We ZiSSllllK the NT/PV model of [12] with input and output conditions (pre- and post-conditions) for each transactiontype. Then, we expect the following to hold in many cases. For an edge e = (cki,tji) in s, since tji is triggered by the falsehood of cki, the input condition of tj; mentions all the data items occurring in ck,. Also, since tji makes cki true, the output condition of tj, mentions potentially all the data items occurring in cki. For an edge e = (tji, cki+l) in S, since tj, may invalidate cki+l, the output condition of tj, mentions potentially all the data items occurring in cki+l. It is unlikely that all the data items of a conjunct will be affected by one transaction-type. These observations can be used to identify bounds on the read and write sets of transaction-types. Such information can be used to advantage in concurrency control. Concurrency along paths in S could be managed by the preemptive protocol described in [KSM], perhaps simplified to its singleversion variant. It is valuable to identify the potential for concurrency among the transaction-types that do not lie on the same path in the IRS. Although two transactiontype8 may not have any common conjuncts in their pre- and post-condition sets, it may happen that they access common data items. This is because different conjuncts may mention common data items. Also, consider an example of a PPG in which there are two outedge8 (cl, tl) and (cl, t2) from the same conjunct vertex cl. 
It may happen that the IRS for the PPG contains both the transaction-types tl and t2, but only one of the two edges, say (cl, tl). This could happen if t2 is chosen to resolve the inconsistency for some conjunct other than cr. In such a situation, the analysis to find common data items should include consideration for both the edges (cl, tl) and (cl, 12). After this analysis is done, it become8 necessary to ensure that the instances of the transaction-types that access the common data are correctly controlled by the concurrency protocol. Even after reducing the extent to which the transaction-types interact, any reasonable real-time database system will have transactions competing for the resources. In such situations, the study of preemptive protocols to manage the timing and prior- instances of either one of the nested transaction-types ntl or nts. In ntl, the parent transaction of type tl may spawn the child transactions of type t5, tc, and t7 by making the conjuncts cs, c8, and cr inconsistent. Similarly, nt2 has a parent transaction-type t2, an instance of which may spawn child transactions of type ts and t7. Notice that an instance of nt2 could make c8 inconsistent, and this would trigger an instance of nt4 which consists of the single transaction-type ts. The nested transaction-type nt3 ha8 a parent transactiontype t4, an instance of which may spawn just a single child transaction of type t8. 6.2 Concurrency Control Issues In the execution of the transactions indicated by an IRS, beside8 correctness, the issue of the timing constraints is also of importance. We have noted above that an inconsistency-resolution subgraph induces a partial order on the set T’ of transaction-types. Given that we seek the prompt restoration of consistency in a real-time system, the need for a significant amount of concurrency among instances of the transaction-types in T’ is required. 
In this section, we briefly examine aspects of concurrency control protocols germane to our model. From the model of transactions described earlier, the use of methods that deal with nested transactions is indicated clearly. The subgraphs of an IRS are best described a8 nested transaction-types with added timing constraints. Existing work on nested transactions should be modified to handle the timing considerations to be used in this context. Obviously, the presence of timing constraints will affect the concurrency control. The increased needs for concurrency may be achieved using less restrictive correctness criteria a8 compared to the traditional serializability - for example, the correct concurrent execution criteria of [12]. Our model allows for external-output transactions and transactions with stringent timing requirements. Both of these suggest that the facility for undoing the effects of a transaction may be unavailable. Also, situations that may result in cascading aborts must be avoided - which does not necessarily preclude other transactions from “observing” data pr* duced by uncommitted transactions since our model i:not the traditional one. These factors suggest that it may be necessary to introduce the notion of compensating transactions [6, 111. Transactions that run concurrently in our system interact due to the shared data that they may access. The timing constraints imply that the delays arising a8 a result of these interactions should be minimized. For example, deadlock or livelock situations should be 80 ity constraints is important. Thus, the satisfactiorl transaction-type timing constraints may result in the sacrifice of the best overall throughput of the system. Research along these lines, is desirable for our model, and in this context, work such as [l] may be extendible. 7 ference on Very Large Databases, Los Angeles, pages 1-12, 1988. [31 C. Beeri, P. A. Bernstein, and N. Goodman. 
A model for concurrency in nested transaction sysApril terns. Journal of the ACM, 36(2):230-269, 1989. Conclusions PI C. Forgy. RETE: A fast match algorithm for the many pattern/ many object pattern match problem. Artificial Intelligence, (19):17-37, 1982. We have proposed a model of real-time transaction processing based upon deadlines associated with consistency constraints. We have demonstrated that, in general, finding a strategy for restoring database consistency is computationally intractable. This negative result does not preclude the practical use of our model. Rather, it indicates that heuristics or suitbble “protocols” are required for transaction processing. An analogous situation exists for standard transaction processing, where the set of two-phase locked schedules is usually accepted as a suitable subset of the set of serializable schedules whose recognition problem is NPcomplete. We have suggested some approaches toward the development of practical transaction management algorithms for our real-time model, but many issues remain to be addressed. For example, heuristics for the selection of an acceptable inconsistency-resolution subgraph are needed as is the development of a complete concurrency protocol that exploits the semantics of the inconsistency-resolution subgraph. The introduction of dynamic violations of consistency constraints in realtime database systems requires the system to modify its consistency restoration strategy as external events occur. Rather than recomputing a complete strategy, an incremental algorithm is desirable. Techniques of this nature are already in use in expert database systems [4]. [51 M. R. Garey and D. S. Johnson. Intractability. Computers and W. H. Freeman and Company, New York, 1979. PI J. N’. Gray. The transaction and limitations. In Proceedings concept: Virtues of the Seventh Inon Very Large Databases, ternational Conference Cannes, pages 144-154, 1981. PI T. Hadzilacos and V. Hadzilacos. 
Transaction synchronisation in object bases. In Proceedings of the Seventh ACM SIGACT-SIGMOD-SIGART Systems, Symposium on Principles of Database Austin, pages 193-200, March 1988. PI R. Holte, A. K.-L. Mok, L. Rosier, I. TulchinThe pinwheel: A realsky, and D. Varvel. time scheduling problem. In Proceedings of the 22nd Hawaii International Conference on System Sciences, Kailua-Kona, pages 693-702, January 1989. PI F. Jahanian and A. K.-L. Mok. Safety anaylsis of timing properties in real-time systems. IEEE Transactions on Software Engineering, SE 12(9):890-904, September 1986. PO1H. F. Korth, W. Kim, and F. Bancilhon. On long Sciences, duration CAD transactions. Information 46:73-107, October 1988. Acknowledgements The authors wish to thank Robert Abbott, Garcia-Molina, and Eliezer Levy for helpful sions. Hector discus- WI tional Conference on Very Large Databases, bane, pages ? -?, August 1990. References of ACM-SIGMOD on Management International 1988 International Conference of Data, Chicago, pages 379-388, June 1988. [2] R. Abbott and H. Garcia-Molina. Scheduling real time transactions: A performance evaluation. In of the Fourteenth Bris- WI H. F. Korth and G. Speegle. Formal model of correctness without serializability. In Proceedings [I] R. Abbott and H. Garcia-Molina. Scheduling realtime transactions. SIGMOD Record, 17( 1):71-81, March 1988. Proceedings II. F. Korth, E. Levy, and A. Silberschatz. A formal approach to recovery by compensating transactions. In ‘Proceedings of the Sicteenth Interna- 1131 D. R. McCarthy and U. Dayal. The architecture of an active dat,a base management system. Con- 81 In Proceedings of ACM-SIGMOD 1989 International Conference on Management of Data, Portland, Oregon, pages 215-224, June 1989. [14] J. E. B. M oss. Nested transactions: An introduction. In B. Bhargava, editor, Concurrency Control and Reliability in Distributed Systems, pages 395425. Van Nostrand Reinhold, 1987. [15) P. Peinl and A. Reuter. 
High contention in a stock trading database: A case study. In Proceedings of ACM-SIGMOD 1988 International Conference on Management of Data, Chicago, pages 260-268, June 1988.
[16] L. Sha, R. Rajkumar, and J. P. Lehoczky. Concurrency control for distributed real-time databases. SIGMOD Record, 17(1):82-98, March 1988.
[17] M. Singhal. Issues and approaches to design of real-time database systems. SIGMOD Record, 17(1):19-33, March 1988.
[18] S. H. Son, editor. SIGMOD Record: Special Issue on Real-Time Databases. ACM, March 1988.
[19] J. A. Stankovic. Misconceptions about real-time computing. IEEE Computer, pages 10-19, October 1988.

Figure 4: First Example of Scheduling Problems
Figure 5: Second Example of Scheduling Problems
Figure 1: An example PPG (legend: Transaction Vertices, Conjunct Vertices; the dotted outline shows a DAG subgraph)
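As a supplementary illustration, the slack-time definitions (Definitions 5 and 6) and the tagging procedure in the proof sketch of Theorem 6 can be sketched in Python as follows. This is a minimal sketch under stated assumptions, not the authors' implementation: the function names (`potential_slack`, `inverse_slack`, `ist_cost`), the adjacency-dictionary representation `succ`, and the small example graph are hypothetical. The tagging loop is replaced here by an equivalent memoized traversal of the acyclic PPG, and a conjunct with no resolving transaction-type is assumed to have infinite cost (the paper instead assumes that an IRS exists).

```python
import math
from functools import lru_cache


def potential_slack(w_k, w_t, succ):
    """Definition 5: S_K(c) = W_K(c) - min over edges (c, t) of W_T(t).

    Assumes every conjunct has at least one outgoing edge to a transaction-type.
    """
    return {c: w_k[c] - min(w_t[t] for t in succ[c]) for c in w_k}


def inverse_slack(s_k, eta=None):
    """Definition 6: S'_K(c) = eta - S_K(c), with eta >= 1 + max S_K."""
    bound = 1 + max(s_k.values())
    eta = bound if eta is None else eta
    if eta < bound:
        raise ValueError("eta must be at least 1 + max S_K")
    return {c: eta - s for c, s in s_k.items()}


def ist_cost(succ, inv_slack, marked):
    """Return V(t') for a pseudo vertex t' with out-edges to the marked conjuncts.

    Transaction-type vertex: max of V over its successors (0 for a sink).
    Conjunct vertex: max(S'_K(v), min of V over its successors).
    """
    @lru_cache(maxsize=None)
    def value(v):
        children = succ.get(v, ())
        if v in inv_slack:  # v is a conjunct vertex
            if not children:
                return math.inf  # assumption: no way to resolve this conjunct
            return max(inv_slack[v], min(value(u) for u in children))
        # v is a transaction-type vertex
        return max((value(u) for u in children), default=0)

    return max((value(c) for c in marked), default=0)


# Hypothetical example: c1 is resolved by t1 or t2; t1 may invalidate c2,
# which is resolved by t3.
w_k = {"c1": 5, "c2": 4}            # conjunct deadlines W_K
w_t = {"t1": 2, "t2": 1, "t3": 3}   # transaction-type time costs W_T
succ = {"c1": ["t1", "t2"], "t1": ["c2"], "t2": [], "c2": ["t3"], "t3": []}

s_k = potential_slack(w_k, w_t, succ)   # {'c1': 4, 'c2': 1}
s_inv = inverse_slack(s_k)              # eta = 5 gives {'c1': 1, 'c2': 4}
cost = ist_cost(succ, s_inv, ["c1"])    # 1: resolve c1 with t2
print("Yes" if cost <= 1 else "No")     # IST instance with K = 1
```

Because each vertex value is cached after its first evaluation, every vertex and edge is examined once, which is in line with the $O(|C| + |T| + |E|)$ bound stated in the proof sketch. In the example, choosing t2 for c1 avoids including c2 in the IRS, so the best individual inverse slack is that of c1 alone.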