Runtime Adaptation in Wireless Sensor Nodes Using Structured Learning

Published: 06 July 2020

Abstract

Markov Decision Processes (MDPs) provide important capabilities for facilitating the dynamic adaptation and self-optimization of cyber-physical systems (CPSs) at runtime. In recent years, this has primarily taken the form of Reinforcement Learning (RL) techniques, which eliminate some MDP components in order to reduce computational requirements. In this work, we show that recent advances in Compact MDP Models (CMMs) provide sufficient cause to question this trend when designing wireless sensor network nodes. We present a novel CMM-based approach to designing self-aware wireless sensor nodes and compare it to Q-Learning, a popular RL technique. We show that a certain class of CPS nodes is not well served by RL methods, and we contrast RL with CMM methods in this context. Through both simulation and a prototype implementation, we demonstrate that CMM methods can provide significantly better runtime adaptation performance than Q-Learning, with comparable resource requirements.
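For context on the RL baseline discussed above, the sketch below shows a minimal tabular Q-Learning loop on a toy sensor-node power-management MDP (a discretized battery level as state, radio modes as actions). This is an illustrative sketch only: the states, actions, reward values, and hyperparameters are assumptions made for exposition, not the model or parameters evaluated in the article.

```python
import random

# Hypothetical toy model (not from the article): state = battery-level
# bucket, actions = node operating modes. Sleeping recovers energy
# (e.g., via harvesting); higher transmit power earns more reward but
# drains the battery faster.
STATES = range(5)                      # battery levels: 0 (empty) .. 4 (full)
ACTIONS = ["sleep", "tx_low", "tx_high"]
ALPHA, GAMMA, EPSILON = 0.1, 0.9, 0.1  # illustrative hyperparameters

Q = {(s, a): 0.0 for s in STATES for a in ACTIONS}

def step(state, action):
    """Return (next_state, reward) for the toy environment."""
    if action == "sleep":
        return min(state + 1, 4), 0.0
    if action == "tx_low":
        return max(state - 1, 0), 1.0 if state > 0 else -1.0
    return max(state - 2, 0), 2.0 if state > 1 else -2.0   # tx_high

def choose(state):
    """Epsilon-greedy action selection."""
    if random.random() < EPSILON:
        return random.choice(ACTIONS)                       # explore
    return max(ACTIONS, key=lambda a: Q[(state, a)])        # exploit

state = 4
for _ in range(10_000):
    action = choose(state)
    nxt, reward = step(state, action)
    # Q-Learning update: Q(s,a) += alpha * (r + gamma * max_a' Q(s',a') - Q(s,a))
    best_next = max(Q[(nxt, a)] for a in ACTIONS)
    Q[(state, action)] += ALPHA * (reward + GAMMA * best_next - Q[(state, action)])
    state = nxt

# Greedy policy learned for each battery level.
print({s: max(ACTIONS, key=lambda a: Q[(s, a)]) for s in STATES})
```

Note how the agent learns from sampled transitions alone, never building the full transition and reward matrices; the article's CMM-based approach instead retains an explicit (compactly factored) MDP model, which is the design trade-off the comparison above examines.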


Cited By

  • FreeSia: A Cyber-physical System for Cognitive Assessment through Frequency-domain Indoor Locomotion Analysis. ACM Transactions on Cyber-Physical Systems 6, 2 (2022), 1--31. https://doi.org/10.1145/3470454. Online publication date: 11 Apr. 2022.
  • The Applicability of Reinforcement Learning Methods in the Development of Industry 4.0 Applications. Complexity 2021 (2021). https://doi.org/10.1155/2021/7179374. Online publication date: 30 Nov. 2021.

Published In

ACM Transactions on Cyber-Physical Systems, Volume 4, Issue 4
Special Issue on Self-Awareness in Resource Constrained CPS and Regular Papers
October 2020
293 pages
ISSN:2378-962X
EISSN:2378-9638
DOI:10.1145/3407233
  • Editor: Tei-Wei Kuo
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 06 July 2020
Online AM: 07 May 2020
Accepted: 01 November 2019
Revised: 01 August 2019
Received: 01 November 2018
Published in TCPS Volume 4, Issue 4

Author Tags

  1. LTE-M
  2. Markov decision processes
  3. adaptation
  4. reinforcement learning
  5. self-awareness

Qualifiers

  • Research-article
  • Research
  • Refereed

Funding Sources

  • US National Science Foundation
