Rare event provoking simulation techniques

Poul Heegaard

1995


International Teletraffic Seminar, 28 Nov-1 Dec 1995, Bangkok, Thailand.

RARE EVENT PROVOKING SIMULATION TECHNIQUES

Poul E. Heegaard, The Norwegian Institute of Technology, Department of Computer Systems and Telematics, N-7034 TRONDHEIM, NORWAY
phone: +47 7359 2890, email: Poul.Heegaard@idt.unit.no

Abstract

Importance sampling is the rare event provoking technique most frequently used for speed-up simulation of dependability and traffic models. Because of its sensitivity to the parameter settings, it is not yet ready for practical use. A number of approaches exist for finding optimal, or at least good, parameters. The applicability of these, and necessary extensions, are discussed with reference to the typical characteristics of state models used for simulation. In dependability models, failure biasing is applied to guarantee a bounded relative error. It is more flexible with respect to the model characteristics, but less efficient, than the heuristic based on large deviation theory used in queuing models. This heuristic puts severe restrictions on the model, particularly for state models with a state space of more than 1 dimension. This paper proposes a technique that is rather efficient for simulation of state models with irregular and n-dimensional state spaces. Constant rate models are discussed in detail.

1 Introduction

Due to the increase in complexity, tight logical couplings between distributed processing units, and traffic diversity, simulation is the most applicable means for performance evaluation of telecommunication systems. However, simulation can be rather inefficient in the evaluation of some extreme QoS impairments caused by rare events in the system, e.g. cell losses in ATM. A set of speed-up simulation techniques exists, see e.g. [1] for an overview, of which the rare event provoking techniques seem to be the most efficient.
Importance sampling is considered the most promising, see [2] for a comparison of three rare event provoking techniques.

The efficiency of importance sampling is observed to be very sensitive to the choice of the parameters governing the sampling, the so-called bias-parameters, see e.g. [3]. A lot of work has been done on defining (asymptotically) optimal parameters, either by using results from large deviation theory (for queuing system simulations) or bounded relative error (for dependability simulations), see Heidelberger [4] for an excellent survey. Based on this work, some authors state that importance sampling is now ready for practical use. However, before its maturity can be discussed, it is necessary to be more specific about what "practical use" is, e.g. what the model of a realistic, real-sized system looks like.

Section 2 gives a short introduction to, and an overview of, some achievements made in the application of importance sampling in rare event simulation. Section 3 takes a closer look at the characteristics of state models to discuss the applicability of different parameter optimisation methods. At the end of Section 3, existing techniques and necessary extensions are discussed. A new strategy for efficient simulation of irregular state spaces is proposed in Section 4, and finally some closing remarks are given in Section 5.

2 Importance sampling

2.1 Principle

Importance sampling (IS) is a technique for variance reduction in computer simulation where the sampling of the outcomes is made in proportion to their relative importance for the result. The method has been applied in several areas, e.g. both dependability [5,6] and teletraffic [7,8,9] assessment.

In this paper we regard systems which may be modelled by a state space, S.
In this S, we sample X, which can be regarded as a trajectory, that is, a sequence of events or state visits in S, starting and ending at a regenerative state, see the figure below for an example.

[Figure: state space S with a regenerative state and the subspace A, showing likely samples X, an unlikely sample X, and an unlikely sample X visiting A.]

X is sampled from a probability distribution f(x), determined by the product of the global state transition probabilities. The quantity of interest Y, e.g. a system failure, is some function g(X) of this trajectory. To observe non-zero Y's, X must include visits to a subspace A ⊂ S. If this happens extremely seldom (under f(x)), such a visit is a rare event. Sampling from f(x) is then very inefficient with respect to estimating Y.

The simple idea of importance sampling is to change the underlying sampling distribution to f*(X), i.e. to use bias-parameters to change the global state transition probabilities, so that visits to A are less rare and the following relation holds:

$$E_f(Y) = E_f(g(X)) = E_{f^*}(g(X) \cdot \Lambda(X)) = E_{f^*}(Y \cdot \Lambda(X)) \tag{2.1}$$

The likelihood ratio Λ(X) is the ratio between the likelihood of X under f(X) and under f*(X), i.e. $\Lambda(X) = f(X) / f^*(X)$. Thus, the property of interest Y can be estimated by taking R samples from f*(X), accumulating the corresponding likelihood ratios Λ(X), and using the following unbiased estimator:

$$\hat{Y} = \frac{1}{R} \sum_{r=1}^{R} Y_r \cdot \Lambda(X_r) \tag{2.2}$$

The challenge with importance sampling is to choose a new distribution that minimises the variance of this estimator. If a distribution far from the optimal one is used, inefficient simulations producing inaccurate results are observed. The trajectories X under the new distribution must also exist under the original distribution. The objective is to manipulate the transition probabilities to force (stress) the most likely trajectories with the largest contribution to Y to occur more frequently.
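As a brief worked step that the text does not spell out, the identity in Eq. (2.1) can be obtained by multiplying and dividing by f*(x) inside the expectation. The sketch below assumes a discrete set of trajectories x and that f*(x) > 0 wherever g(x) f(x) is non-zero; both are assumptions made here for illustration.

```latex
E_f\big(g(X)\big)
  = \sum_{x} g(x)\, f(x)
  = \sum_{x} g(x)\, \frac{f(x)}{f^*(x)}\, f^*(x)
  = E_{f^*}\big(g(X) \cdot \Lambda(X)\big),
  \qquad \Lambda(x) = \frac{f(x)}{f^*(x)} .
```

Averaging $Y_r \cdot \Lambda(X_r)$ over R independent samples drawn from f* therefore has expectation $E_f(Y)$, which is why the estimator in Eq. (2.2) is unbiased.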
Obviously, to increase the probability of visits to A, the probability of transitions towards this set must be increased. However, this is not always as simple as it seems:
- the set A and the trajectories leading to it are not always easy to identify,
- the optimal stressing strategy is not easy to obtain, e.g. the optimal increase in transition probabilities.

2.2 Optimisation

Importance sampling has been applied with success to both dependability and traffic evaluations. Rare events in these contexts are basically different, as illustrated by assuming $P(\text{rare event}) \propto \varepsilon^n$:
- System failures occur because a few events (e.g. component failures) coincide, that is, the trajectories X visiting A consist of only a few events, all unlikely to occur, i.e. $\varepsilon^n \to 0$ as $\varepsilon \to 0$.
- Traffic congestion is experienced when a large number of events n have occurred, that is, the trajectory X visiting A consists of a large number of events, each of them not necessarily unlikely to occur, i.e. $\varepsilon^n \to 0$ as $n \to \infty$.

Hence, different approaches need to be considered for dependability and queuing models to obtain optimal, or at least good, parameters that yield maximum efficiency under importance sampling, see Heidelberger [4] for an excellent survey.

For queuing systems, heuristics of Borovkov and Rudget based on results from large deviation theory [1,10] are applied. They have been applied to single queues [11, 12], tandem and parallel queues [7], and Jackson networks [13]. The idea is to use a theorem of Cramér to optimise the observed rates governing the most likely X leading to A. For a single queue, the optimal choice is to interchange the arrival and service rates, which is the same result as obtained by slow random walk [10], a more exact large deviation approach which unfortunately fails in more complex systems due to rate discontinuities.
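The rate interchange for a single queue can be illustrated with a minimal Python sketch. It is not the author's simulator: it assumes the embedded jump chain of an M/M/1 queue, starts each cycle just after the first arrival, and stops the cycle as soon as either the level N is reached (a visit to A) or the queue empties. The function names, the parameter values and the stopping rule are assumptions made for this example. The estimate follows Eq. (2.2), and the exact value used for comparison is the standard gambler's-ruin formula.

```python
import random


def is_overflow_probability(lam, mu, level, cycles, seed=1):
    """Importance sampling estimate of P(queue reaches `level` before emptying),
    starting with 1 customer, using interchanged arrival and service rates."""
    rng = random.Random(seed)
    p = lam / (lam + mu)        # original P(arrival) in the embedded chain
    p_star = mu / (lam + mu)    # biased P(arrival): rates interchanged
    total = 0.0
    for _ in range(cycles):
        state, ratio = 1, 1.0   # ratio accumulates the likelihood ratio
        while 0 < state < level:
            if rng.random() < p_star:
                state += 1
                ratio *= p / p_star
            else:
                state -= 1
                ratio *= (1.0 - p) / (1.0 - p_star)
        if state == level:      # Y_r = 1 only when the rare set is visited
            total += ratio
    return total / cycles


def exact_overflow_probability(lam, mu, level):
    """Gambler's-ruin probability of hitting `level` before 0 from state 1."""
    r = mu / lam
    return (1.0 - r) / (1.0 - r ** level)


if __name__ == "__main__":
    est = is_overflow_probability(0.1, 1.0, 10, 20000)
    print(est, exact_overflow_probability(0.1, 1.0, 10))
```

In this particular setting every trajectory that reaches the level makes exactly N - 1 more arrivals than departures, so its likelihood ratio equals (λ/µ)^(N-1) regardless of its length. That is one way to see why the interchange gives such a low variance for the single queue.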
Recent publications on optimisation of dependability models conclude that, when the probability of the rare events approaches 0, it is more important to keep the variability of the estimates under control than to minimise the empirical variance. The variability is represented by the relative error (also called the coefficient of variation), $\mathrm{r.e.}\{Y\} = S / \bar{X}$, i.e. the standard deviation divided by the mean. Shahabuddin [4] found that balanced failure biasing, i.e. the new total failure probability p is uniformly distributed over the failure transitions out of the current state, guarantees bounded relative error with respect to a finite number of samples. Furthermore, he discovered that if all failure rates are of the same order of magnitude (the system is balanced), then the more efficient simple failure biasing, where p is distributed in proportion to the original failure rates, is also bounded. Nakayama [14] found that a balanced system is a sufficient but not a necessary condition for bounded relative error when the simple failure biasing scheme is applied. Under some regularity conditions, an unbalanced system may also have bounded relative error.

Whenever the heuristic approach based on large deviation results applies, it gives far more efficient simulations than the failure biasing techniques. Unfortunately, bounded relative error cannot be guaranteed.

In a more empirical approach, short pilot studies are carried out for different simulation parameters. The choice of the optimal parameter is based on these observations, e.g. by minimising the empirical variance of Y over different parameter settings by use of mean field annealing [3]. Another very interesting approach is optimisation by derivation of symbolic expressions for the parameters, constructed from the failure distance [15].
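The difference between balanced and simple failure biasing can be made concrete with a small Python sketch of how the transition probabilities out of one state might be computed. The function name, its arguments and the example rates are hypothetical. Two details are assumptions of this sketch rather than statements from the text: the remaining probability 1 - p is shared among the repair transitions in proportion to their original rates, and no biasing is applied in a state that has only failure or only repair transitions.

```python
def failure_biasing(failure_rates, repair_rates, p, balanced=True):
    """Return (failure_probs, repair_probs) for the next embedded transition.

    p is the total probability assigned to failure transitions.
    balanced=True  : p is split uniformly over the failure transitions.
    balanced=False : p is split in proportion to the original failure rates
                     (simple failure biasing).
    """
    total_f, total_r = sum(failure_rates), sum(repair_rates)
    if total_f + total_r == 0:
        raise ValueError("state has no outgoing transitions")
    if total_f == 0 or total_r == 0:
        # Assumption: with only one kind of transition, keep the original
        # embedded-chain probabilities (nothing to bias).
        total = total_f + total_r
        return ([f / total for f in failure_rates],
                [r / total for r in repair_rates])
    if balanced:
        fail = [p / len(failure_rates)] * len(failure_rates)
    else:
        fail = [p * f / total_f for f in failure_rates]
    # Assumption: repairs share 1 - p in proportion to their original rates.
    rep = [(1.0 - p) * r / total_r for r in repair_rates]
    return fail, rep


if __name__ == "__main__":
    # Two failure transitions with rates two orders of magnitude apart.
    print(failure_biasing([1e-3, 1e-5], [1.0], 0.5, balanced=True))
    print(failure_biasing([1e-3, 1e-5], [1.0], 0.5, balanced=False))
```

With the unbalanced rates in the example, simple failure biasing still sends almost all of p to the dominant failure transition, whereas balanced failure biasing gives both failure transitions the same probability.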
3 Applicability of existing optimisation techniques

[Figure 3.1: State model relations to real system and to optimisation techniques.]

The applicability of an optimisation technique to a realistic system is restricted by its underlying assumptions. A state model is often used to represent a real system. The characteristics of the state space determine which technique applies, and whether an existing method can be used or whether extensions or new developments are needed.

This section takes a closer look at the underlying characteristics of a state model, both the structure of the state space and the driving processes. The relation between real systems and state models is also discussed.

3.1 State model

The state space of most queuing and dependability models may be depicted by a state diagram showing the (discrete) state space, with the underlying stochastic processes represented by the transitions between states. State models are the natural framework when using importance sampling for both queuing and dependability. These application areas should be kept in mind in the following discussion.

3.1.1 State space structure

A state, s, in this context is defined by a set of (integer) attribute values of the elements in the system, in a value domain given by the state space, S.

3.1.1.1 Dimensionality

The dimensionality d of a state space is determined by the number of state variables describing the system state, $\tilde{s} = (s_1, s_2, \ldots, s_d)$, where $s_i \in S_i$ ($\forall i$). The total state space is $S_1 \times S_2 \times \ldots \times S_d$, or for simplicity $S^d$.

The global state space $S^d$ spans the same number of dimensions as the number of state variables in the system state vector. Note that increasing the number of state variables d and/or the cardinality¹ of each dimension, $|S_i|$, will increase the number of states in the state space rather excessively.
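To give a feeling for this growth, the following minimal Python sketch counts the states of an untruncated product space $S_1 \times \ldots \times S_d$. The function name and the choice of 10 as the largest value in every dimension are illustrative assumptions only.

```python
from math import prod


def state_count(max_values):
    """Number of states when dimension i takes the integer values
    0..max_values[i] and no resource limitation truncates the space."""
    return prod(m + 1 for m in max_values)


if __name__ == "__main__":
    for d in range(1, 5):
        # Each added dimension multiplies the number of states by 11.
        print(d, state_count([10] * d))
```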
Consider a single queue with a finite queue length N. Let the state s be the total number of customers in the queue and under service. The state space has dimension d = 1 because only 1 state variable exists. When a new queue is added in series with the first, a new state variable is added for counting the number of customers in queue 2, and the resulting state space is $S^2$, i.e. it has dimensionality d = 2.

3.1.1.2 Resource limitations

In most cases the assessment of system properties somehow relates to a set of resources, e.g. obtaining the mean waiting time for a resource, the probability of an available resource, or the exhaustion of a resource (utilisation). In this context, a resource can be e.g. repairmen, queue positions, servers, processing units, customers, trunks, etc.

Resources influence the cardinality of each dimension in the state space, $|S_i|$, and the "shape" of the resulting total state space and its cardinality, $|S_1 \times S_2 \times \ldots \times S_d|$. However, for some resources, such as repairmen and servers, the limitations are expressed as state dependent rate functions, $\theta(s)$, rather than, or in addition to, state space truncation, see Section 3.1.2.2. For example, consider a queuing system having m servers and n queuing positions. The server rate is modelled as a linearly increasing state dependent function, $\mu(s) = \mu_b \cdot \min(s, m)$. But the number of servers also contributes to the extent of the state space, $|S| = m + n$, which is the maximum number of simultaneously allowed customers.

With no resource limitations the resulting state space is infinite. However, a simulation experiment can easily be carried out in an infinite state space as long as the state definition is unique and the transition rules are defined. The subset A must also exist (finite or infinite).

The subset A ⊂ S may be given by resource limitations like the number of processing units, trunks, queuing positions, or customers. The cardinality of A is rather different for dependability and queuing models.
Dependability models normally have a large A, e.g. in a 5-out-of-6 system |A| = 57 and |S| = 64. In contrast, a queuing model typically has a small A, e.g. for a system consisting of 2 queues, each having 10 queuing positions, if the maximum number of simultaneously allowed customers is N = 10 and the probability of a full system is to be estimated, then the state space has |A| = 11 and |S| = 55. Figure 3.2 shows 4 examples of different ways the resource limitations may truncate S and define A.

[Figure 3.2: State space resource limitations. (a) d=1, single; (b) d=2, 1 common resource (convex space); (c) d=2, convex space; (d) d=2, concave space.]

Footnote 1 (Section 3.1.1.1): The cardinality of a set A is the number of elements in the set (vector), denoted |A|.

3.1.1.3 State interrelations

In a discrete state space model where only one state variable $s_i$ ($1 \le i \le d$) is allowed to change at a time, and $s_i$ is either increased or decreased by 1, we have an n-dimensional pure birth-death relation, see Figure 3.3.

[Figure 3.3: 2-dimensional state space spanned by pure birth-death processes.]

Obviously, not all state spaces are that simple. For instance, the following will have a more complicated structure:
- Common mode failures, i.e. one fault will cause a failure in more than 1 component.
- Tandem queue, i.e. each customer is queued/served by two or more queues in series, see Figure 3.4(a).
- Deferred repair, i.e. a number of components are allowed to fail before a repair is initiated, see Figure 3.4(b).
- Batch arrivals, i.e. more than 1 queuing position will be allocated at the same time.

This means that finding all trajectories X visiting A, or at least the most likely one, is no longer a simple matter.

3.1.2 Processes

3.1.2.1 Generality

Up to now, the static structure of the state space has been considered. In addition, we need to include the dynamics of the real system. This is introduced through general stochastic processes.
[Figure 3.4: 2-dimensional state spaces. (a) tandem queue; (b) deferred repair. States are labelled by their two state variables (0,0 to 3,0 and 0,3); annotations: rejected arrival, system failure.]

In a (semi-)Markov model, the aggregated process guarantees that an X sample consists of events from an embedded¹ chain. When all processes are (non-)homogeneous Poisson processes, this condition is automatically met. If all processes are renewal (have embedded points), the aggregated process is normally not renewal.

3.1.2.2 Rate dependencies

The arrival rates, $\lambda(s, t)$, i.e. the frequency of arrivals of customers or of failures, and the departure rates, $\mu(s, t)$, i.e. the reciprocal of the mean service or repair time, need not be constant over the experiment period t or the state space S.

In homogeneous processes, $\lambda(s, t) = \lambda(s)$ and $\mu(s, t) = \mu(s)$ (for all t). Additionally, if the rates are state-independent, then $\lambda(s) = \lambda$ and $\mu(s) = \mu$ (for all s).

In a dependability model that includes wear-out of a component, non-homogeneous (time variant) processes must be applied, where the failure rates are $\lambda(t)$. When failures propagate between cooperating processing units, the failure rates will depend on the system state, $\lambda(s)$.

In a queuing system the departure rates may depend on the state variable representing the number of customers under service, that is $\mu(s)$.

Having a limited number of servers or customers in a queuing model implies that the rates have some functional state dependencies. Other resource limitations, like trunk lines or queuing positions, influence the "shape" of the state space, see Section 3.1.1.2.

3.2 Application of optimisation technique

This section focuses on the existing techniques for finding optimal parameters in importance sampling, see Section 2.2 for an overview.
Note that it is not the applicability of importance sampling that is discussed, only the current state of the art with respect to optimisation.

Generally, the techniques applied for queuing models are much more efficient than the (balanced/simple) failure biasing techniques applied in dependability models. In dependability models, because of the large dimensionality d and the more irregular structure of the state space S, the major concern when defining parameters is to keep the variability under control, at the expense of optimality with respect to minimising the variance of the estimate in Eq. (2.2).

In Appendix A, a table of the application domains of existing methods is given, identified by the state space and process characteristics. The main observation from this table is that optimisation based on large deviation results (Borovkov heuristics) has rather limited application due to rather restrictive assumptions:
- with d = 1 (e.g. single queue), general independent processes can be applied,
- with d > 1, some additional restrictions must be made:
  • with general independent processes, the problem must be reduced to a d = 1 problem, e.g. by only including the dominant dimension (e.g. the queue with maximum load in tandem queues) in the stressing,
  • assume Markovian processes (e.g. parallel queues with arbitrary limitations, see Section 4),
  • assume a Jackson network (M/M/1 queues with probabilistic routing) and only study one queue at a time.

Generally, detailed knowledge of system properties is required to apply this optimisation approach. Some extensions of existing results must be made, e.g.:
- Non-Markovian processes and/or state dependent rates when d > 1.
- Different cardinality in each dimension, $|S_i|$, defined by different resource limitations.

Footnote 1 (Section 3.1.2.1): In an embedded point, the succeeding events are independent of the history and depend only on the current state.
Some achievements for constant rate Poisson processes are reported in Section 4.

The Borovkov heuristic depends on some regularity in the state space, on the ability to identify the most likely trajectory X to A, and on the number of events in X being large. Compared to queuing models, dependability models have an irregular state space S, a large and irregular A, and only a few events in the most likely X. Hence, other approaches must be considered.

Experiments with importance sampling have shown that it is rather sensitive to its parameters. In an irregular state space with an irregular A, the optimisation requirements should be relaxed to guarantee bounded variability of the estimator in Eq. (2.2). This is the rationale for using a presumably less efficient technique in dependability models, namely (balanced or simple) failure biasing, see Nakayama [14] for some empirical results.

Failure biasing is easily applied to a state space with d > 1 and a rather complex structure. Markovian processes are assumed, with either state-independent or state-dependent rates. In non-Markovian models, a uniformization technique is applied [16], where a Poisson process with rate β is introduced to generate an embedded chain of events. Failure biasing then applies.

The failure biasing strategy includes a parameter p, which is the total probability of experiencing a failure in a given state. The estimates have bounded relative error for any value of p, so finding an optimal, or at least a good, value is still a matter that must be considered. This can probably be done by adapting some of the ideas of Devetsikiotis [3], where a set of pilot studies was made for different bias-parameters, and the parameter minimising the variance of the observations was then picked.

4 Multi-M/M/1 queues with irregular resource limitations

As discussed in the previous section, application of large deviation results has so far been demonstrated on state spaces with a regular subset A.
In all situations where A is irregular, assumptions like "only consider the maximum loaded queue" do not hold. Some initial work has been done, to be reported in [17], where the Borovkov heuristic is applied to more complicated and irregular state spaces, see Figure 4.1 for an example with dimensionality d = 3.

[Figure 4.1: Irregular resource limitations in 3D example. Labels: N1=3, N2=4, N3=6, N12=5, N123=8; rates λ1, λ2, λ3, µ1, µ2, µ3; state space.]

In the following, constant rate Markovian models are assumed. The state variables $s_i$ express the number of allocated resources, e.g. at the origin (the regenerative state) all state variables are equal to 0.

The basic idea is to use the Borovkov heuristic to express, for each dimension k, the likelihood $\alpha_k$ of observing a trajectory X visiting the nearest resource limitation for dimension k, starting from the current state $\tilde{s}$:

$$\alpha_k = \left( \frac{\lambda_k}{\mu_k} \right)^{D_k(\tilde{s})} \quad \text{where} \quad D_k(\tilde{s}) = \min_{j \in \Xi_k} \Big( N_j - \sum_{i \in \Omega_j} s_i \Big) \tag{4.1}$$

$D_k(\tilde{s})$ is the number of resource allocations needed from the current state to reach the nearest resource limitation for dimension k. Here $\lambda_k$ and $\mu_k$ denote the arrival and departure rates respectively, $N_r$ is the number of resources of type r, $\Omega_r$ is the set of dimensions limited by resource type r, and $\Xi_k$ is the set of resource limitations affecting dimension k. Under the Markovian assumption and convex resource limitations (see Figure 3.2), calculating $\alpha_k$ for each dimension has proven [17] to be sufficient to find the most likely trajectory to the nearest resource limitation from a given state $\tilde{s}$.

Define, for the current state $\tilde{s}$, a set Γ consisting of the dimension(s) which have the most likely trajectories X to A from the current state $\tilde{s}$.
A dimension is included in this set when its likelihood $\alpha_k$ is within an empirical tolerance Δ relative to the maximum likelihood, $\alpha_{opt}$:

$$\Gamma = \{\, k \mid (\alpha_{opt} - \alpha_k) / \alpha_{opt} < \Delta \,\} \quad \text{where} \quad \alpha_{opt} = \max_{\forall k}(\alpha_k) \tag{4.2}$$

For each dimension in Γ the transition rates should be manipulated to increase the original probability of arrivals, $P(\text{arrival}_k) = \lambda / (\lambda + \mu)$, and thereby the probability of X visiting A. The optimal scaling (BIAS-factor) is found by the Borovkov heuristic, which yields an interchange of the total arrival and departure rates of the dimensions in Γ (see [17] for more details):

$$\mathrm{BIAS} = \sum_{k \in \Gamma} \mu_k \Big/ \sum_{k \in \Gamma} \lambda_k \tag{4.3}$$

Every dimension k ∈ Γ must scale up its probability of arrivals by scaling the arrival and departure rates up and down, respectively, using the same BIAS-factor, yielding:

$$\lambda^*_k = \mathrm{BIAS} \cdot \lambda_k \quad \text{and} \quad \mu^*_k = \mu_k / \mathrm{BIAS} \quad \Rightarrow \quad P^*(\text{arrival}_k) = \lambda^* / (\lambda^* + \mu^*) \tag{4.4}$$

For all k ∉ Γ the probabilities are unchanged.

To illustrate the efficiency, a simulation experiment on an example like the one in Figure 4.1 is carried out. The number of resources is increased for each type, and $N_{12}$ is infinite. All parameters are given in Tables 4.1 and 4.2.

Table 4.1 Process parameters

k    λ_k      µ_k    Ξ_k
1    0.100    1.0    {1,4}
2    0.147    1.0    {2,4}
3    0.316    1.0    {3,4}

Table 4.2 Resource parameters

r    N_r    Ω_r
1    9      {1}
2    12     {2}
3    18     {3}
4    24     {1,2,3}

Table 4.3 shows the results of a simulation experiment where the time blocking with respect to all the resource limitations is estimated. An exact solution exists for this simple example [18].

Two different strategies for increasing the probability of visits to A are applied:
- S1: towards the nearest resource limitation(s) from the current state,
- S2: towards the dimension with maximum load, see [7].

S1 is the strategy proposed above. The estimates obtained with S1 agree well with all expected time blocking values.
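The biasing rule in Eqs. (4.1) to (4.4) can be sketched in a few lines of Python for the parameters of Tables 4.1 and 4.2. This is only an illustration under stated assumptions, not the simulator used for the experiment: the function and variable names are hypothetical, the state is represented as a dictionary from dimension k to $s_k$, and only the choice of Γ and the biased rates for a given state are computed (no trajectories are simulated).

```python
# Parameters copied from Tables 4.1 and 4.2.
LAM = {1: 0.100, 2: 0.147, 3: 0.316}             # arrival rates lambda_k
MU = {1: 1.0, 2: 1.0, 3: 1.0}                    # departure rates mu_k
N = {1: 9, 2: 12, 3: 18, 4: 24}                  # resources of type r
OMEGA = {1: {1}, 2: {2}, 3: {3}, 4: {1, 2, 3}}   # dimensions limited by r
XI = {1: {1, 4}, 2: {2, 4}, 3: {3, 4}}           # limitations affecting k


def distance(k, s):
    """D_k(s) of Eq. (4.1): allocations left to the nearest limitation."""
    return min(N[j] - sum(s[i] for i in OMEGA[j]) for j in XI[k])


def likelihoods(s):
    """alpha_k of Eq. (4.1) for every dimension k."""
    return {k: (LAM[k] / MU[k]) ** distance(k, s) for k in LAM}


def stressed_set(s, delta=0.05):
    """Gamma of Eq. (4.2): dimensions within tolerance delta of the maximum."""
    alpha = likelihoods(s)
    a_opt = max(alpha.values())
    return {k for k, a in alpha.items() if (a_opt - a) / a_opt < delta}


def biased_rates(s, delta=0.05):
    """Gamma, the BIAS-factor of Eq. (4.3) and the rates of Eq. (4.4)."""
    gamma = stressed_set(s, delta)
    bias = sum(MU[k] for k in gamma) / sum(LAM[k] for k in gamma)
    lam = {k: LAM[k] * bias if k in gamma else LAM[k] for k in LAM}
    mu = {k: MU[k] / bias if k in gamma else MU[k] for k in MU}
    return gamma, bias, lam, mu


if __name__ == "__main__":
    origin = {1: 0, 2: 0, 3: 0}
    print(biased_rates(origin))
```

At the origin, with Δ = 0.05, this sketch selects Γ = {1, 3}: the likelihoods of dimensions 1 and 3 are both close to 10^-9, while that of dimension 2 is roughly ten times smaller. Because the set is recomputed from the current state, it changes as the trajectory moves, e.g. to Γ = {2} once 10 resources are allocated in dimension 2.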
S2 only agrees with the time blocking value in the maximum load direction (which is exactly the dimension under stress). Note that if only the relative error is considered, S2 will incorrectly be chosen as the best strategy when estimating the total blocking (see row 4 in Table 4.3). S2 underestimates this property because significant contributions from the other dimensions are not included.

Table 4.3 Comparison of two strategies using large deviation results applied on multi-M/M/1 with irregular rare event set A. (#reg.cycles = 100 000, Δ = 0.05)

     Exact (a)     S1: proposed technique (b)     S2: previous technique [7,13] (c)
r    Y = P(N_r)    Y            s.e.(d){Y}/Y      Y            s.e.{Y}/Y
     [10^-9]       [10^-9]                        [10^-9]
1    0.9000        0.9203       0.0135            0.000014     0.4586
2    0.0869        0.0914       0.0856            0.000014     0.4586
3    0.6752        0.6647       0.0128            0.6706       0.0045
4    1.6620        1.6763       0.0098            0.6706       0.0045

a. Exact values can be obtained for this example by V. B. Iversen's convolution method [18].
b. CPU time consumption = 3:19.8 min.
c. CPU time consumption = 2:24.3 min.
d. Standard error of the sample mean, i.e. $s / \sqrt{n}$.

The S1 strategy is less computationally efficient than S2 because calculations must be made at every visited state. However, so far no effort has been put into optimising the simulation code. Using S1 only at the origin to decide the dominant dimensions will improve the calculation efficiency (it will be the same as for S2). This is better than S2, but a significant underestimation is still observed, see [17] for results.

5 Closing remarks

Importance sampling is an efficient speed-up simulation technique, but it is very sensitive to the parameter settings. Obtaining good parameters is essential for the success of importance sampling in the evaluation of real world systems. Hence, importance sampling is not ready for practical use without some good heuristic for obtaining these parameter settings. Some optimisation techniques already exist, but rather severe restrictions are put on the models.
In queuing models with 1-dimensional state spaces, the efficient optimisation technique based on large deviation theory generally applies. However, in multi-dimensional state spaces rather restrictive assumptions must be made with respect to the “shape” of the state space and the generality of the underlying processes. The failure biasing strategy is less efficient, but more flexible and applicable to irregular state spaces, like the model class handled by the SAVE tool [19].

A new strategy is proposed in this paper which applies the large deviation heuristic to irregular state spaces. Its efficiency is demonstrated on a simple example where the total time blocking depends on significant contributions from all 3 dimensions. With other techniques, where e.g. only the maximum load dimension is considered, a low relative error is observed, but the blocking is significantly underestimated because not all contributions (from other dimensions) are included.

This new strategy is limited to constant rate Poisson processes. It should be extended to handle state-dependent rates and more complex state interrelations. The applicability to non-convex state spaces must also be considered. Furthermore, it is interesting to see how this strategy relates to failure biasing, where bounded relative error is guaranteed.

Acknowledgement

I would like to thank my supervisor Prof. Bjarne E. Helvik at The Norwegian Institute of Technology for fruitful and stimulating discussions, and for constructive comments and all motivation during preparation of this paper.

References

[1] P. E. Heegaard. “Speed-up techniques for simulation.” Telektronikk, 91(2), 1995. ISSN 0085-7130.
[2] P. E. Heegaard. “Comparison of speed-up techniques for simulation.” In I. Norros and J. Virtamo, editors, The 12th Nordic Teletraffic Seminar (NTS-12), pages 407–420, Espoo, Finland, 22-24 August 1995. VTT Information Technology.
[3] M. Devetsikiotis and J. K. Townsend. “Statistical optimization of dynamic importance sampling parameters for efficient simulation of communication networks.” IEEE/ACM Transactions on Networking, 1(3):293–305, June 1993.
[4] P. Heidelberger. “Fast simulation of rare events in queuing and reliability models.” ACM Transactions on Modeling and Computer Simulation, 5(1):43–85, January 1995.
[5] A. E. Conway and A. Goyal. “Monte Carlo Simulation of Computer System Availability/Reliability Models.” In Digest of Papers, FTCS-17, The Seventeenth International Symposium on Fault-Tolerant Computing, pages 230–235, July 6-8 1987.
[6] V. F. Nicola, M. K. Nakayama, P. Heidelberger, and A. Goyal. “Fast simulation of dependability models with general failure, repair and maintenance processes.” In Proc. 20th International Symposium on Fault-Tolerant Computing, pages 491–498, 1990.
[7] S. Parekh and J. Walrand. “Quick simulation of excessive backlogs in networks of queues.” IEEE Trans. on Auto. Control, 34(1):54–66, 1986.
[8] G. Kesidis and J. Walrand. “A review of quick simulation methods for queues.” In International Workshop on Modeling, Analysis and Simulation of Computer and Telecommunication Systems (MASCOTS’93), San Diego, California, USA, January 17-20 1993.
[9] I. Norros and J. Virtamo. “Importance sampling simulation studies on the discrete time nD/D/1 queue.” In The 8th Nordic Teletraffic Seminar, pages VI.3.1–.16. Tekniska Høgskolan i Helsingfors, Aug. 29-31 1989.
[10] J. A. Bucklew. Large Deviation Techniques in Decision, Simulation, and Estimation. Wiley, 1990.
[11] J. Walrand. “Quick Simulation of Queuing Networks: An Introduction.” In 2nd International Workshop on Applied Mathematics and Performance/Reliability Models of Computer/Communication Systems, pages 275–286. University of Rome II, 1987.
[12] S. Parekh. Rare events in networks. PhD dissertation, University of California, Berkeley, November 1986.
[13] M. R. Frater. Estimation of the statistics of rare events in communication systems. PhD dissertation, The Australian National University, Department of system engineering, November 1990.
[14] M. K. Nakayama. “A characterization of the simple failure biasing method for simulation of highly reliable markovian systems.” ACM Transactions on Modeling and Computer Simulation, 4(1):52–88, January 1994.
[15] J. A. Carrasco. “Failure distance based simulation of repairable fault-tolerant computer systems.” In G. Balbo and G. Serazzi, editors, Proceedings of the Fifth International Conference on Computer Performance Evaluation. Modelling Techniques and Tools, pages 351–365. North-Holland, Feb. 15-17 1991.
[16] P. Heidelberger, P. Shahabuddin, and D. M. Nicol. “Bounded relative error in estimating transient measures of highly dependable non-markovian systems.” ACM Transactions on Modeling and Computer Simulation, 4(2):137–164, April 1994.
[17] P. E. Heegaard. “Memo 1: Importance sampling optimisation using results from large deviation theory,” 1995. Unpublished memo.
[18] V. B. Iversen. “A simple convolution algorithm for the exact evaluation of multi-service loss system with heterogeneous traffic flows and access control.” In U. Körner, editor, The 7th Nordic Teletraffic Seminar (NTS-7), pages IX.3–1–IX.3–22, Lund tekniska högskola, Sweden, 25-27 August 1987. Studentlitteratur.
[19] A. Goyal, W. Carter, E. de Souza e Silva, S. Lavenberg, and K. Trivedi. “The System Availability Estimator (SAVE).” In Proc. IEEE 16th Fault-Tolerant Computing Symposium. IEEE, July 1986.
[20] M. Cottrell, J.-C. Fort, and G. Malgouyres. “Large deviation and rare events in the study of stochastic algorithms.” IEEE Transactions on Automatic Control, AC-28(9):13–18, September 1983.
[21] J. Sadowski. “Large deviations and efficient simulation of excessive backlogs in a GI/G/m queue.” IEEE Transactions on Automatic Control, AC-36:1383–1394, 1991.
[22] G. Kesidis and J. Walrand. “Relative entropy between Markov transition rate matrixes.” IEEE Trans. on Info. Theory, 39(3):1056–1057, May 1993.
[23] M. R. Frater and B. D. Anderson. “Fast simulation of buffer overflows in tandem networks of GI/GI/1 queues.” Annals of Operations Research, 49:207–220, 1994.

Appendix A Application of asymptotic optimal importance sampling

| Application | Structure: dimensionality | Structure: resource limitations | Structure: state interrelations | Processes: generality | Processes: rate dependence | Optimisation approach | Refs |
|---|---|---|---|---|---|---|---|
| M/M/1 | 1 | 1 | birth/death | Markovian | State independent | Slow-random-walk, Borovkov heuristics | [10, 20] |
| M[x]/M[x]/1 | 1 | 1 | birth/death | Markovian | State dependent | Slow-random-walk | [10] |
| GI/GI/1 | 1 | 1 | birth/death | Renewal | Any | Borovkov heuristics | [7] |
| GI/GI/m | 1 | 1 | birth/death | Renewal | Any | Borovkov heuristics | [21] |
| GI/GI/1, d types of on-off sources | d | 1 | birth/death | Renewal | Any | State reduction + Borovkov heuristics (a) | [22] |
| Tandem queue, M/M/1, 1 traffic type | 2 | 1 - 2 (b) | complex, Fig. 2.4(a) | Markovian | State independent | Borovkov heuristics | [11] |
| d queues in tandem, d-GI/GI/1, 1 traffic type | d | 1 - d (b) | complex, Fig. 2.4(a) | Renewal | Any | Borovkov heuristics | [23] |
| 2 parallel queues, 1 resource limitation | 2 | 1 | birth/death | Markovian | State independent | Borovkov heuristics | [7 |
| n parallel queues, several resource limitations | n | min: 1, max: n(n+1)/2 | birth/death | Markovian | State independent | Borovkov heuristics | Section 4 |
| Jackson network, M/M/1 with prob. routing | n | n | birth/death | Markovian | State independent | Borovkov heuristics on max load queue | [7] |
| SAVE dependability models, n components | n | min: 1, max: n(n+1)/2 | complex | Markovian | State dependent | Balanced failure biasing => BRE (c) | [4, 14] |
| Non-Markovian dependability models, n components | n | min: 1, max: n(n+1)/2 | complex | General (d) | Nonhomogeneous | BRE when uniformization process applies | [16] |

a. Computer intensive.
b. This depends on the performance measure, e.g. P(total backlog > N) gives only 1 resource limitation.
c. Bounded Relative Error.
d. Introducing a uniformization process which is Markovian.
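As an illustration of the simplest case in the table above (M/M/1, 1 dimension, state independent rates), the following is a minimal sketch of importance sampling on the embedded random walk of an M/M/1 queue. It assumes the classical change of measure in which the arrival and service rates are interchanged, which is the well-known large deviation result for this queue; it is not the S1 strategy of Section 4. The function names, the parameter values and the choice of target (the probability that the queue reaches level N before emptying, starting from one customer) are assumptions made for the example only.

```python
import random


def hit_probability_is(lam, mu, n_level, n_cycles, seed=1):
    """Importance sampling estimate of P(queue hits n_level before 0 | start at 1).

    Simulates the embedded random walk of an M/M/1 queue with the arrival
    and service rates interchanged, and corrects each sampled path with its
    likelihood ratio. Returns (estimate, standard error of the sample mean).
    """
    rng = random.Random(seed)
    p = lam / (lam + mu)      # original probability that the next event is an arrival
    q = 1.0 - p               # original probability that the next event is a departure
    p_is, q_is = q, p         # biased probabilities (rates interchanged)

    total = 0.0
    total_sq = 0.0
    for _ in range(n_cycles):
        state = 1
        weight = 1.0          # likelihood ratio of the sampled path
        while 0 < state < n_level:
            if rng.random() < p_is:
                state += 1
                weight *= p / p_is
            else:
                state -= 1
                weight *= q / q_is
        obs = weight if state == n_level else 0.0
        total += obs
        total_sq += obs * obs

    mean = total / n_cycles
    var = max(total_sq / n_cycles - mean * mean, 0.0) * n_cycles / (n_cycles - 1)
    return mean, (var / n_cycles) ** 0.5


def hit_probability_exact(lam, mu, n_level):
    """Exact gambler's-ruin value for the same probability (requires lam != mu)."""
    rho = lam / mu
    return rho ** (n_level - 1) * (1.0 - rho) / (1.0 - rho ** n_level)


if __name__ == "__main__":
    est, se = hit_probability_is(lam=0.3, mu=1.0, n_level=20, n_cycles=10000)
    exact = hit_probability_exact(lam=0.3, mu=1.0, n_level=20)
    print(f"IS estimate = {est:.4e}, s.e./estimate = {se / est:.4f}, exact = {exact:.4e}")
```

For this particular change of measure every path that reaches level N carries the same likelihood ratio, so the only randomness left is whether a cycle reaches N at all. This is why the relative error stays small even though the target probability is very small. The multi-dimensional, irregular state spaces discussed in this paper do not share this property, which is where the choice of bias parameters becomes difficult.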
