Abstract
Dynamical systems typically involve uncertainties arising from modeling errors, measurement inaccuracies, mutations in evolutionary processes, and other sources. These stochastic deviations can compound over time and lead to catastrophic outcomes. In this chapter, we present different approaches for decision-making in systems with uncertainties. We first consider model predictive control, which learns a model of the system and uses that model to optimize control actions. In many cases, a prior model of the system is not known; for such cases, we also explain Gaussian process regression, an efficient data-driven modeling technique. Further, we present the constrained Markov decision process (CMDP) as a framework for decision-making under uncertainty, where the system is modeled as an MDP with constraints. Finally, the CMDP formalism is extended to a model-free approach based on reinforcement learning, enabling decisions in the presence of constraints without an explicit system model.
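As a concrete illustration of the Gaussian process regression technique mentioned in the abstract, the following minimal NumPy sketch computes the standard zero-mean GP posterior mean and variance under a squared-exponential kernel. The function names (rbf_kernel, gp_posterior), the hyperparameters, and the toy one-dimensional data are our own illustrative choices and are not taken from the chapter.

```python
# Minimal Gaussian process regression sketch (illustrative; names and
# hyperparameters are our own choices, not the chapter's notation).
import numpy as np

def rbf_kernel(A, B, length_scale=1.0, signal_var=1.0):
    """Squared-exponential kernel k(a, b) = s^2 exp(-|a - b|^2 / (2 l^2))."""
    sq_dists = np.sum(A**2, 1)[:, None] + np.sum(B**2, 1)[None, :] - 2 * A @ B.T
    return signal_var * np.exp(-0.5 * sq_dists / length_scale**2)

def gp_posterior(X_train, y_train, X_test, noise_var=1e-2):
    """Posterior mean and pointwise variance of a zero-mean GP at X_test."""
    K = rbf_kernel(X_train, X_train) + noise_var * np.eye(len(X_train))
    K_s = rbf_kernel(X_train, X_test)
    K_ss = rbf_kernel(X_test, X_test)
    # Cholesky factorisation instead of an explicit inverse, for stability.
    L = np.linalg.cholesky(K)
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y_train))
    mean = K_s.T @ alpha                    # K_*^T (K + s^2 I)^{-1} y
    v = np.linalg.solve(L, K_s)
    cov = K_ss - v.T @ v                    # K_** - K_*^T (K + s^2 I)^{-1} K_*
    return mean, np.diag(cov)

# Usage: learn a one-dimensional unknown function from noisy samples.
rng = np.random.default_rng(0)
X = rng.uniform(-3, 3, size=(20, 1))
y = np.sin(X).ravel() + 0.1 * rng.standard_normal(20)
X_star = np.linspace(-3, 3, 100)[:, None]
mu, var = gp_posterior(X, y, X_star)        # predictive mean and uncertainty
```

In a learning-based model predictive control loop of the kind the abstract describes, a routine like gp_posterior would be queried at candidate states to predict the unknown dynamics along the planning horizon, with the predictive variance indicating where the learned model should not be trusted.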
Copyright information
© 2023 Springer Nature Switzerland AG
About this chapter
Cite this chapter
Aggarwal, V., Agarwal, M. (2023). Control of Uncertain Systems. In: Nof, S.Y. (ed.) Springer Handbook of Automation. Springer Handbooks. Springer, Cham. https://doi.org/10.1007/978-3-030-96729-1_8
DOI: https://doi.org/10.1007/978-3-030-96729-1_8
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-96728-4
Online ISBN: 978-3-030-96729-1
eBook Packages: Intelligent Technologies and Robotics (R0)