Abstract
We survey a new approach, developed by the author and his co-workers, for formulating stochastic control problems (predominantly queueing systems) as mathematical programming problems. The central idea is to characterize the region of achievable performance in a stochastic control problem, i.e., to find linear or nonlinear constraints on the performance vectors that all policies satisfy. We present linear and nonlinear relaxations of the performance space for the following problems: indexable systems (multiclass single-station queues and multiarmed bandit problems), restless bandit problems, polling systems, and multiclass queueing and loss networks. These relaxations lead to bounds on the performance of an optimal policy. Using information from the relaxations, we construct heuristic, nearly optimal policies. The unifying theme of the paper is the thesis that better formulations lead to deeper understanding and better solution methods. Overall, the proposed approach for stochastic control problems parallels the efforts of the mathematical programming community over the last twenty years to develop sharper formulations (polyhedral combinatorics and, more recently, nonlinear relaxations). It leads to new insights, ranging from a complete characterization of, and new algorithms for, indexable systems to tight lower bounds and nearly optimal algorithms for restless bandit problems, polling systems, and multiclass queueing and loss networks.
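To make the idea of a constraint that all policies satisfy concrete, the following minimal sketch looks at the simplest indexable system: a two-class single-server queue with Poisson arrivals and nonpreemptive service. It is an illustration under stated assumptions, not code from the paper. It uses two classical results (Cobham's formula for mean delays under static priorities and Kleinrock's conservation law), and the function names and numerical values are hypothetical. The sketch checks that both priority orders give the same value of the workload-weighted delay sum, and that minimizing a linear holding cost over these extreme policies selects the order given by the cμ rule.

```python
from itertools import permutations


def priority_waits(lam, es, es2, order):
    """Mean queueing delays per class under a nonpreemptive static
    priority order (Cobham's formula for the M/G/1 queue)."""
    # Mean residual service work found by an arriving customer.
    w0 = sum(l * s2 for l, s2 in zip(lam, es2)) / 2.0
    waits = [0.0] * len(lam)
    sigma = 0.0  # cumulative load of the classes served so far
    for k in order:
        prev = sigma
        sigma += lam[k] * es[k]
        waits[k] = w0 / ((1.0 - prev) * (1.0 - sigma))
    return waits


def conserved_work(lam, es, waits):
    """Left-hand side of the conservation law: sum_i rho_i * W_i."""
    return sum(l * s * w for l, s, w in zip(lam, es, waits))


def holding_cost(lam, cost, waits):
    """Linear objective: sum_i c_i * lambda_i * W_i."""
    return sum(c * l * w for c, l, w in zip(cost, lam, waits))


# Illustrative (assumed) data: exponential services, so E[S^2] = 2 E[S]^2.
lam = [0.3, 0.2]   # arrival rates
es = [1.0, 2.0]    # mean service times
es2 = [2.0, 8.0]   # second moments of service times
cost = [1.0, 3.0]  # holding cost rates

rho = sum(l * s for l, s in zip(lam, es))
w0 = sum(l * s2 for l, s2 in zip(lam, es2)) / 2.0
law_rhs = rho * w0 / (1.0 - rho)  # policy-independent constant

orders = list(permutations(range(len(lam))))
lhs_values = [conserved_work(lam, es, priority_waits(lam, es, es2, o))
              for o in orders]

# The best extreme policy under the linear cost, and the c-mu order
# (classes sorted by decreasing c_i / E[S_i]).
best_order = min(
    orders,
    key=lambda o: holding_cost(lam, cost, priority_waits(lam, es, es2, o)),
)
cmu_order = tuple(sorted(range(len(lam)), key=lambda i: -cost[i] / es[i]))

if __name__ == "__main__":
    print("conservation law, right-hand side:", law_rhs)
    print("left-hand side per priority order:", dict(zip(orders, lhs_values)))
    print("cost-minimizing order:", best_order, "c-mu order:", cmu_order)
```

In this small case the two priority orders are the extreme points of the achievable region, which is why optimizing a linear cost over the region reduces to comparing them.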
Cite this article
Bertsimas, D. The achievable region method in the optimal control of queueing systems; formulations, bounds and policies. Queueing Syst 21, 337–389 (1995). https://doi.org/10.1007/BF01149167
DOI: https://doi.org/10.1007/BF01149167