
Worst-case Satisfaction of STL Specifications Using Feedforward Neural Network Controllers: A Lagrange Multipliers Approach

Published: 08 October 2019 Publication History

Abstract

This paper proposes a reinforcement learning approach for designing feedback neural network controllers for nonlinear systems. Given a Signal Temporal Logic (STL) specification that the system must satisfy over a set of initial conditions, the neural network parameters are tuned to maximize the satisfaction of the STL formula. The framework is based on a max-min formulation of the robustness of the STL formula: the maximization is solved through a Lagrange multipliers method, while the minimization corresponds to a falsification problem. We present results on a vehicle model and a quadrotor model and demonstrate that our approach reduces training time by more than 50 percent compared to the baseline approach.
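
The max-min structure described in the abstract can be illustrated with a small sketch. This is not the paper's method or its models. It assumes a toy scalar system, a linear gain standing in for the neural network, a grid search standing in for the falsifier, and a brute-force search over candidate gains standing in for the Lagrange multipliers step. All function names and parameter values are hypothetical.

```python
import numpy as np

def simulate(gain, x0, steps=50, dt=0.1):
    """Roll out the toy closed loop x[k+1] = x[k] + dt * u with u = -gain * x[k]."""
    xs = [x0]
    for _ in range(steps):
        xs.append(xs[-1] + dt * (-gain * xs[-1]))
    return np.array(xs)

def robustness_eventually_near_zero(xs, eps=0.1):
    """Robustness of 'eventually |x| < eps': max over time of (eps - |x|)."""
    return float(np.max(eps - np.abs(xs)))

def worst_case_robustness(gain, initial_states):
    """Inner minimization: robustness at the least robust initial condition.
    A grid search is used here as a stand-in for a real falsifier."""
    return min(
        robustness_eventually_near_zero(simulate(gain, x0))
        for x0 in initial_states
    )

def best_gain(gains, initial_states):
    """Outer maximization: the candidate gain with the largest worst-case
    robustness (brute force, assumed here only for illustration)."""
    return max(gains, key=lambda g: worst_case_robustness(g, initial_states))

# Hypothetical set of initial conditions and candidate controller parameters.
initial_states = np.linspace(-1.0, 1.0, 21)
gains = [0.1, 0.5, 1.0]

g_star = best_gain(gains, initial_states)
print("selected gain:", g_star)
print("worst-case robustness:", worst_case_robustness(g_star, initial_states))
```

A positive worst-case robustness means every sampled initial condition satisfies the specification, while a negative value means the inner search found a counterexample for that controller.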

Published In

ACM Transactions on Embedded Computing Systems, Volume 18, Issue 5s
Special Issue ESWEEK 2019, CASES 2019, CODES+ISSS 2019 and EMSOFT 2019
October 2019
1423 pages
ISSN: 1539-9087
EISSN: 1558-3465
DOI: 10.1145/3365919
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 08 October 2019
Accepted: 01 July 2019
Revised: 01 June 2019
Received: 01 April 2019
Published in TECS Volume 18, Issue 5s

Author Tags

  1. Reinforcement learning
  2. neural network controller
  3. signal temporal logic

Qualifiers

  • Research-article
  • Research
  • Refereed

Bibliometrics & Citations

Bibliometrics

Article Metrics

  • Downloads (Last 12 months): 166
  • Downloads (Last 6 weeks): 17
Reflects downloads up to 04 Oct 2024

Cited By

  • (2024) LB4TL: A Smooth Semantics for Temporal Logic to Train Neural Feedback Controllers. IFAC-PapersOnLine 58:11 (183-188). DOI: 10.1016/j.ifacol.2024.07.445. Online publication date: 2024
  • (2023) Risk-Awareness in Learning Neural Controllers for Temporal Logic Objectives. 2023 American Control Conference (ACC) (4096-4103). DOI: 10.23919/ACC55779.2023.10156345. Online publication date: 31-May-2023
  • (2023) Receding Horizon Control With Online Barrier Function Design Under Signal Temporal Logic Specifications. IEEE Transactions on Automatic Control 68:6 (3545-3556). DOI: 10.1109/TAC.2022.3195470. Online publication date: Jun-2023
  • (2023) Counter-Example Guided Imitation Learning of Feedback Controllers from Temporal Logic Specifications. 2023 62nd IEEE Conference on Decision and Control (CDC) (5339-5344). DOI: 10.1109/CDC49753.2023.10383831. Online publication date: 13-Dec-2023
  • (2023) Multilayer extreme learning machine: a systematic review. Multimedia Tools and Applications 82:26 (40269-40307). DOI: 10.1007/s11042-023-14634-4. Online publication date: 1-Nov-2023
  • (2022) Semi-Supervised Trajectory-Feedback Controller Synthesis for Signal Temporal Logic Specifications. 2022 American Control Conference (ACC) (178-185). DOI: 10.23919/ACC53348.2022.9867345. Online publication date: 8-Jun-2022
  • (2022) Backpropagation through signal temporal logic specifications: Infusing logical structure into gradient-based methods. The International Journal of Robotics Research 42:6 (356-370). DOI: 10.1177/02783649221082115. Online publication date: 28-May-2022
  • (2022) STL2vec: Signal Temporal Logic Embeddings for Control Synthesis With Recurrent Neural Networks. IEEE Robotics and Automation Letters 7:2 (5246-5253). DOI: 10.1109/LRA.2022.3155197. Online publication date: Apr-2022
  • (2022) Recurrent Neural Network Controllers for Signal Temporal Logic Specifications Subject to Safety Constraints. IEEE Control Systems Letters 6 (91-96). DOI: 10.1109/LCSYS.2021.3049917. Online publication date: 2022
  • (2022) Formal synthesis of closed-form sampled-data controllers for nonlinear continuous-time systems under STL specifications. Automatica (Journal of IFAC) 139:C. DOI: 10.1016/j.automatica.2022.110184. Online publication date: 16-May-2022
