Runtime Adaptation in Wireless Sensor Nodes Using Structured Learning

Published: 06 July 2020

Abstract

Markov Decision Processes (MDPs) provide important capabilities for facilitating the dynamic adaptation and self-optimization of cyber-physical systems (CPSs) at runtime. In recent years, this has primarily taken the form of Reinforcement Learning (RL) techniques, which eliminate some MDP components in order to reduce computational requirements. In this work, we show that recent advances in Compact MDP Models (CMMs) provide sufficient cause to question this trend when designing wireless sensor network nodes. We present a novel CMM-based approach to designing self-aware wireless sensor nodes and compare it to Q-Learning, a popular RL technique. We show that a certain class of CPS nodes is not well served by RL methods, and we contrast RL with CMM methods in this context. Through both simulation and a prototype implementation, we demonstrate that CMM methods can provide significantly better runtime adaptation performance than Q-Learning, with comparable resource requirements.
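For context on the RL baseline discussed above, the sketch below shows a minimal tabular Q-Learning loop on a toy sensor-node power-management MDP (a discretized battery level as state, radio modes as actions). This is an illustrative sketch only: the states, actions, reward values, and hyperparameters are assumptions made for exposition, not the model or parameters evaluated in the article.

```python
import random

# Hypothetical toy model (not from the article): state = battery-level
# bucket, actions = node operating modes. Sleeping recovers energy
# (e.g., via harvesting); higher transmit power earns more reward but
# drains the battery faster.
STATES = range(5)                      # battery levels: 0 (empty) .. 4 (full)
ACTIONS = ["sleep", "tx_low", "tx_high"]
ALPHA, GAMMA, EPSILON = 0.1, 0.9, 0.1  # illustrative hyperparameters

Q = {(s, a): 0.0 for s in STATES for a in ACTIONS}

def step(state, action):
    """Return (next_state, reward) for the toy environment."""
    if action == "sleep":
        return min(state + 1, 4), 0.0
    if action == "tx_low":
        return max(state - 1, 0), 1.0 if state > 0 else -1.0
    return max(state - 2, 0), 2.0 if state > 1 else -2.0   # tx_high

def choose(state):
    """Epsilon-greedy action selection."""
    if random.random() < EPSILON:
        return random.choice(ACTIONS)                       # explore
    return max(ACTIONS, key=lambda a: Q[(state, a)])        # exploit

state = 4
for _ in range(10_000):
    action = choose(state)
    nxt, reward = step(state, action)
    # Q-Learning update: Q(s,a) += alpha * (r + gamma * max_a' Q(s',a') - Q(s,a))
    best_next = max(Q[(nxt, a)] for a in ACTIONS)
    Q[(state, action)] += ALPHA * (reward + GAMMA * best_next - Q[(state, action)])
    state = nxt

# Greedy policy learned for each battery level.
print({s: max(ACTIONS, key=lambda a: Q[(s, a)]) for s in STATES})
```

Note how the agent learns from sampled transitions alone, never building the full transition and reward matrices; the article's CMM-based approach instead retains an explicit (compactly factored) MDP model, which is the design trade-off the comparison above examines.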


Cited By

  • FreeSia: A Cyber-physical System for Cognitive Assessment through Frequency-domain Indoor Locomotion Analysis. ACM Transactions on Cyber-Physical Systems 6, 2 (2022), 1--31. https://doi.org/10.1145/3470454. Online publication date: 11 Apr. 2022.
  • The Applicability of Reinforcement Learning Methods in the Development of Industry 4.0 Applications. Complexity 2021 (2021). https://doi.org/10.1155/2021/7179374. Online publication date: 30 Nov. 2021.

Published In

ACM Transactions on Cyber-Physical Systems, Volume 4, Issue 4
Special Issue on Self-Awareness in Resource Constrained CPS and Regular Papers
October 2020
293 pages
ISSN:2378-962X
EISSN:2378-9638
DOI:10.1145/3407233
  • Editor: Tei-Wei Kuo
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 06 July 2020
Online AM: 07 May 2020
Accepted: 01 November 2019
Revised: 01 August 2019
Received: 01 November 2018
Published in TCPS Volume 4, Issue 4

Author Tags

  1. LTE-M
  2. Markov decision processes
  3. adaptation
  4. reinforcement learning
  5. self-awareness

Qualifiers

  • Research-article
  • Research
  • Refereed

Funding Sources

  • US National Science Foundation
