Joint Task Offloading and Resource Allocation for Intelligent Reflecting Surface-Aided Integrated Sensing and Communication Systems Using Deep Reinforcement Learning Algorithm
Abstract
1. Introduction
1.1. Related Works
1.2. Contributions
- We propose an IRS-assisted ISAC framework that exploits the IRS to support and enhance sensing and communication in NLoS coverage areas. We formulate a comprehensive optimization objective covering sensing, communication, and computation offloading. The goal is to maximize the data sum-rate while minimizing energy consumption, subject to constraints on radar performance, transmit power budget, and offloading delay, through the joint design of the transmit beamforming and the IRS phase shifts.
- Because the optimization variables are coupled, the joint optimization problem is NP-hard and non-convex, which makes traditional mathematical methods difficult to apply. We therefore formulate it as an MDP and design two DRL schemes to solve it. Because the action space is continuous and high-dimensional, we develop a deep deterministic policy gradient (DDPG) scheme combined with prioritized experience replay (PER) to improve training efficiency. We further design a twin delayed DDPG (TD3) scheme built on the DDPG framework.
- Simulation results confirm the effectiveness and convergence of the proposed schemes. Compared with the benchmarks, the proposed DRL scheme achieves a better balance between communication and sensing performance. Moreover, the system’s energy consumption and latency are improved through proper computation offloading. Finally, the benefits and feasibility of the IRS-assisted ISAC framework are verified.
2. System Model
2.1. Communication Model
2.2. Radar Sensing Model
2.3. Computation Offloading Model
3. Problem Formulation
3.1. Transmission Performance Optimization
3.2. System Energy Consumption Optimization
3.3. System-Comprehensive Performance Optimization
4. DRL-Based Joint Task Offloading and Resource Allocation Scheme
4.1. MDP Formulation
- : the channel matrix is split into its real and imaginary parts, because the neural network cannot process complex values.
- : in the same way, is separated into two independent parts, and .
- : the transmit power of each UE, divided into two parts that are fed into the training network with .
- : the size of the computation task generated at the UE.
- : the action selected by the agent at the previous time step.
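The state construction described in the list above can be sketched as follows. This is a minimal illustration rather than the authors' implementation: the function names, the argument order, and the choice of the IRS reflection coefficients as the second complex-valued quantity are assumptions made here, and the transmit power is passed as a plain real vector instead of reproducing the paper's exact split.

```python
from typing import List, Sequence


def split_complex(values: Sequence[complex]) -> List[float]:
    """Return all real parts followed by all imaginary parts."""
    return [v.real for v in values] + [v.imag for v in values]


def build_state(
    channel: Sequence[Sequence[complex]],  # channel matrix, row-major
    reflection: Sequence[complex],         # assumed: IRS reflection coefficients
    tx_power: Sequence[float],             # transmit power per UE
    task_size: Sequence[float],            # computation task size per UE
    prev_action: Sequence[float],          # action chosen at the previous step
) -> List[float]:
    """Concatenate all state components into one flat real-valued vector."""
    flat_channel = [h for row in channel for h in row]
    state: List[float] = []
    state += split_complex(flat_channel)   # real parts, then imaginary parts
    state += split_complex(reflection)
    state += list(tx_power)
    state += list(task_size)
    state += list(prev_action)
    return state


# Toy example: a 2x2 channel, 2 IRS elements, 2 UEs, 1-dimensional action.
H = [[1 + 2j, 3 - 1j], [0.5j, -2 + 0j]]
theta = [1j, -1 + 0j]
state = build_state(H, theta, [1.0, 1.0], [2.0, 3.0], [0.0])
```

Splitting every complex entry doubles its contribution to the input dimension, so the network input size grows with twice the number of channel coefficients. In practice a tensor library would replace the plain lists used here.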
4.2. An Improved DDPG-Based Joint Optimization Algorithm
Algorithm 1 PER DDPG-based Joint Task Offloading and Resource Allocation Algorithm.
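The prioritized experience replay (PER) component named in Algorithm 1 can be illustrated with a short sketch of proportional prioritization as described by Schaul et al. The class name, the exponents `alpha` and `beta`, and the list-based storage are assumptions chosen for illustration; the steps of Algorithm 1 itself are not reproduced here.

```python
import random
from typing import Any, List, Sequence, Tuple


class PrioritizedReplayBuffer:
    """Proportional PER: P(i) = p_i^alpha / sum_k p_k^alpha."""

    def __init__(self, capacity: int, alpha: float = 0.6, eps: float = 1e-6) -> None:
        self.capacity = capacity
        self.alpha = alpha          # how strongly priorities skew sampling
        self.eps = eps              # keeps every priority strictly positive
        self.data: List[Any] = []
        self.priorities: List[float] = []
        self.pos = 0                # next slot to overwrite once full

    def add(self, transition: Any) -> None:
        # New transitions get the current maximum priority so that they
        # are likely to be replayed at least once.
        max_p = max(self.priorities, default=1.0)
        if len(self.data) < self.capacity:
            self.data.append(transition)
            self.priorities.append(max_p)
        else:
            self.data[self.pos] = transition
            self.priorities[self.pos] = max_p
        self.pos = (self.pos + 1) % self.capacity

    def sample(self, batch_size: int, beta: float = 0.4) -> Tuple[List[int], List[Any], List[float]]:
        scaled = [p ** self.alpha for p in self.priorities]
        total = sum(scaled)
        probs = [s / total for s in scaled]
        idx = random.choices(range(len(self.data)), weights=probs, k=batch_size)
        # Importance-sampling weights correct the bias of non-uniform
        # sampling; they are normalized by the largest weight in the batch.
        n = len(self.data)
        weights = [(n * probs[i]) ** (-beta) for i in idx]
        max_w = max(weights)
        weights = [w / max_w for w in weights]
        return idx, [self.data[i] for i in idx], weights

    def update_priorities(self, indices: Sequence[int], td_errors: Sequence[float]) -> None:
        # The priority of a transition is the magnitude of its TD error.
        for i, delta in zip(indices, td_errors):
            self.priorities[i] = abs(delta) + self.eps
```

In a training loop, the critic loss would be weighted by the returned importance-sampling weights, and `update_priorities` would be called with the fresh TD errors after each critic update.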
4.3. Twin Delayed DDPG (TD3)-Based Joint Optimization Algorithm
- In the Input Step, input two pairs of critic networks and , respectively. In Step 1, initialize parameters of two estimate critics and two target critics with , , , and .
- Before Step 17, the agent applies a delayed-update strategy, so that the policy networks are updated less frequently than the value networks.
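The two TD3 modifications listed above (twin critics and delayed policy updates) can be sketched in a framework-free form. The function names are hypothetical, and the smoothing-noise parameters and the policy delay of 2 are commonly used TD3 defaults assumed here, not values taken from this paper; the default discount factor 0.7 and soft update factor 0.01 match the simulation parameter table.

```python
import random
from typing import Callable, List, Sequence


def clip(x: float, lo: float, hi: float) -> float:
    return max(lo, min(hi, x))


def td3_target(
    reward: float,
    next_state: Sequence[float],
    done: bool,
    target_actor: Callable[[Sequence[float]], List[float]],
    target_critic_1: Callable[[Sequence[float], Sequence[float]], float],
    target_critic_2: Callable[[Sequence[float], Sequence[float]], float],
    gamma: float = 0.7,        # discount factor
    noise_std: float = 0.2,    # assumed target-smoothing noise scale
    noise_clip: float = 0.5,   # assumed clipping range for the noise
    act_low: float = -1.0,     # assumed action bounds
    act_high: float = 1.0,
) -> float:
    """Clipped double-Q target: bootstrap from the smaller target critic."""
    next_action = [
        clip(a + clip(random.gauss(0.0, noise_std), -noise_clip, noise_clip),
             act_low, act_high)
        for a in target_actor(next_state)
    ]
    q_min = min(target_critic_1(next_state, next_action),
                target_critic_2(next_state, next_action))
    return reward + gamma * (1.0 - float(done)) * q_min


def soft_update(target: Sequence[float], online: Sequence[float], tau: float = 0.01) -> List[float]:
    """Polyak averaging: target <- tau * online + (1 - tau) * target."""
    return [tau * w + (1.0 - tau) * w_t for w, w_t in zip(online, target)]


def should_update_policy(step: int, policy_delay: int = 2) -> bool:
    """The actor and target networks are refreshed only every policy_delay steps."""
    return step % policy_delay == 0
```

Taking the minimum of the two target critics counteracts the overestimation bias of a single critic, while the delayed actor update lets the value estimates settle before the policy is changed.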
5. Numerical Results
5.1. Convergence Performance
5.2. Performance Comparison
6. Conclusions
Author Contributions
Funding
Data Availability Statement
Conflicts of Interest
References
- ITU-R WP5D. Draft New Recommendation ITU-R M. [IMT. Framework for 2030 and Beyond]–Framework and Overall Objectives of the Future Development of IMT for 2030 and Beyond. 2023. Available online: https://www.itu.int/md/R19-WP5D-230612-TD-0905/ (accessed on 20 September 2023).
- Mishra, K.V.; Shankar, M.B.; Koivunen, V.; Ottersten, B.; Vorobyov, S.A. Toward millimeter-wave joint radar communications: A signal processing perspective. IEEE Signal Process. Mag. 2019, 36, 100–114.
- Kumari, P.; Vorobyov, S.A.; Heath, R.W. Adaptive virtual waveform design for millimeter-wave joint communication–Radar. IEEE Trans. Signal Process. 2019, 68, 715–730.
- Dokhanchi, S.H.; Mysore, B.S.; Mishra, K.V.; Ottersten, B. A mmWave automotive joint radar-communications system. IEEE Trans. Aerosp. Electron. Syst. 2019, 55, 1241–1260.
- Zhang, Q.; Sun, H.; Gao, X.; Wang, X.; Feng, Z. Time-Division ISAC Enabled Connected Automated Vehicles Cooperation Algorithm Design and Performance Evaluation. IEEE J. Sel. Areas Commun. 2022, 40, 2206–2218.
- Liu, X.; Zhang, H.; Long, K.; Zhou, M.; Li, Y.; Poor, H.V. Proximal Policy Optimization-Based Transmit Beamforming and Phase-Shift Design in an IRS-Aided ISAC System for the THz Band. IEEE J. Sel. Areas Commun. 2022, 40, 2056–2069.
- Solomitckii, D.; Heino, M.; Buddappagari, S.; Hein, M.A.; Valkama, M. Radar scheme with raised reflector for NLOS vehicle detection. IEEE Trans. Intell. Transp. Syst. 2021, 23, 9037–9045.
- Song, X.; Zhao, D.; Hua, H.; Han, T.X.; Yang, X.; Xu, J. Joint transmit and reflective beamforming for IRS-assisted integrated sensing and communication. In Proceedings of the 2022 IEEE Wireless Communications and Networking Conference (WCNC), Austin, TX, USA, 10–13 April 2022; pp. 189–194.
- Liu, F.; Cui, Y.; Masouros, C.; Xu, J.; Han, T.X.; Eldar, Y.C.; Buzzi, S. Integrated sensing and communications: Toward dual-functional wireless networks for 6G and beyond. IEEE J. Sel. Areas Commun. 2022, 40, 1728–1767.
- Rajatheva, N.; Atzeni, I.; Björnson, E.; Bourdoux, A.; Buzzi, S.; Doré, J.B.; Erkucuk, S.; Fuentes, M.; Guan, K.; Hu, Y.; et al. White paper on broadband connectivity in 6G. 2020. Available online: http://urn.fi/urn:isbn:9789526226798 (accessed on 2 October 2023).
- Shao, X.; You, C.; Ma, W.; Chen, X.; Zhang, R. Target sensing with intelligent reflecting surface: Architecture and performance. IEEE J. Sel. Areas Commun. 2022, 40, 2070–2084.
- Liu, X.; Huang, T.; Shlezinger, N.; Liu, Y.; Zhou, J.; Eldar, Y.C. Joint transmit beamforming for multiuser MIMO communications and MIMO radar. IEEE Trans. Signal Process. 2020, 68, 3929–3944.
- Jiang, Z.M.; Rihan, M.; Zhang, P.; Huang, L.; Deng, Q.; Zhang, J.; Mohamed, E.M. Intelligent Reflecting Surface Aided Dual-Function Radar and Communication System. IEEE Syst. J. 2022, 16, 475–486.
- Chu, Z.; Xiao, P.; Shojafar, M.; Mi, D.; Mao, J.; Hao, W. Intelligent Reflecting Surface Assisted Mobile Edge Computing for Internet of Things. IEEE Wirel. Commun. Lett. 2021, 10, 619–623.
- Sankar, R.P.; Chepuri, S.P. Beamforming in Hybrid RIS assisted Integrated Sensing and Communication Systems. In Proceedings of the 2022 30th European Signal Processing Conference (EUSIPCO), Belgrade, Serbia, 29 August–2 September 2022; pp. 1082–1086.
- Buzzi, S.; Grossi, E.; Lops, M.; Venturino, L. Foundations of MIMO Radar Detection Aided by Reconfigurable Intelligent Surfaces. IEEE Trans. Signal Process. 2022, 70, 1749–1763.
- Hua, M.; Wu, Q.; He, C.; Ma, S.; Chen, W. Joint Active and Passive Beamforming Design for IRS-Aided Radar-Communication. IEEE Trans. Wirel. Commun. 2023, 22, 2278–2294.
- He, Y.; Cai, Y.; Mao, H.; Yu, G. RIS-Assisted Communication Radar Coexistence: Joint Beamforming Design and Analysis. IEEE J. Sel. Areas Commun. 2022, 40, 2131–2145.
- Wang, X.; Fei, Z.; Huang, J.; Yu, H. Joint Waveform and Discrete Phase Shift Design for RIS-Assisted Integrated Sensing and Communication System Under Cramer-Rao Bound Constraint. IEEE Trans. Veh. Technol. 2022, 71, 1004–1009.
- Liu, R.; Li, M.; Liu, Y.; Wu, Q.; Liu, Q. Joint Transmit Waveform and Passive Beamforming Design for RIS-Aided DFRC Systems. IEEE J. Sel. Top. Signal Process. 2022, 16, 995–1010.
- Liao, C.; Wang, F.; Lau, V.K.N. Optimized Design for IRS-Assisted Integrated Sensing and Communication Systems in Clutter Environments. IEEE Trans. Commun. 2023, 71, 4721–4734.
- Huang, N.; Wang, T.; Wu, Y.; Wu, Q.; Quek, T.Q.S. Integrated Sensing and Communication Assisted Mobile Edge Computing: An Energy-Efficient Design via Intelligent Reflecting Surface. IEEE Wirel. Commun. Lett. 2022, 11, 2085–2089.
- Lillicrap, T.P.; Hunt, J.J.; Pritzel, A.; Heess, N.; Erez, T.; Tassa, Y.; Silver, D.; Wierstra, D. Continuous control with deep reinforcement learning. arXiv 2015, arXiv:1509.02971.
- LeCun, Y.; Bengio, Y.; Hinton, G. Deep learning. Nature 2015, 521, 436–444.
- François-Lavet, V.; Henderson, P.; Islam, R.; Bellemare, M.G.; Pineau, J. An introduction to deep reinforcement learning. Found. Trends Mach. Learn. 2018, 11, 219–354.
- Mnih, V.; Kavukcuoglu, K.; Silver, D.; Rusu, A.A.; Veness, J.; Bellemare, M.G.; Graves, A.; Riedmiller, M.; Fidjeland, A.K.; Ostrovski, G.; et al. Human-level control through deep reinforcement learning. Nature 2015, 518, 529–533.
- Chen, J.; Xing, H.; Xiao, Z.; Xu, L.; Tao, T. A DRL Agent for Jointly Optimizing Computation Offloading and Resource Allocation in MEC. IEEE Internet Things J. 2021, 8, 17508–17524.
- Meng, F.; Chen, P.; Wu, L.; Cheng, J. Power Allocation in Multi-User Cellular Networks: Deep Reinforcement Learning Approaches. IEEE Trans. Wirel. Commun. 2020, 19, 6255–6267.
- Cheng, M.; Li, J.; Nazarian, S. DRL-cloud: Deep reinforcement learning-based resource provisioning and task scheduling for cloud service providers. In Proceedings of the 2018 23rd Asia and South Pacific Design Automation Conference (ASP-DAC), Jeju, Korea, 22–25 January 2018; pp. 129–134.
- Huang, C.; Mo, R.; Yuen, C. Reconfigurable Intelligent Surface Assisted Multiuser MISO Systems Exploiting Deep Reinforcement Learning. IEEE J. Sel. Areas Commun. 2020, 38, 1839–1850.
- Pereira-Ruisánchez, D.; Fresnedo, Ó.; Pérez-Adán, D.; Castedo, L. Joint Optimization of IRS-assisted MU-MIMO Communication Systems through a DRL-based Twin Delayed DDPG Approach. In Proceedings of the 2022 IEEE International Symposium on Broadband Multimedia Systems and Broadcasting (BMSB), Bilbao, Spain, 15–17 June 2022; pp. 1–6.
- You, C.; Zhang, R. Wireless Communication Aided by Intelligent Reflecting Surface: Active or Passive? IEEE Wirel. Commun. Lett. 2021, 10, 2659–2663.
- Xu, S.; Du, Y.; Zhang, J.; Liu, J.; Wang, J.; Zhang, J. Intelligent Reflecting Surface Enabled Integrated Sensing, Communication and Computation. IEEE Trans. Wirel. Commun. 2023. early access.
- Dinh, T.Q.; Tang, J.; La, Q.D.; Quek, T.Q.S. Offloading in Mobile Edge Computing: Task Allocation and Computational Frequency Scaling. IEEE Trans. Commun. 2017, 65, 3571–3584.
- Wang, C.; Liang, C.; Yu, F.R.; Chen, Q.; Tang, L. Computation Offloading and Resource Allocation in Wireless Cellular Networks With Mobile Edge Computing. IEEE Trans. Wirel. Commun. 2017, 16, 4924–4938.
- Mao, Y.; Zhang, J.; Song, S.H.; Letaief, K.B. Stochastic Joint Radio and Computational Resource Management for Multi-User Mobile-Edge Computing Systems. IEEE Trans. Wirel. Commun. 2017, 16, 5994–6009.
- Zhou, F.; Wu, Y.; Hu, R.Q.; Qian, Y. Computation Rate Maximization in UAV-Enabled Wireless-Powered Mobile-Edge Computing Systems. IEEE J. Sel. Areas Commun. 2018, 36, 1927–1941.
- Feriani, A.; Hossain, E. Single and multi-agent deep reinforcement learning for AI-enabled wireless networks: A tutorial. IEEE Commun. Surv. Tutor. 2021, 23, 1226–1252.
- Hou, Y.; Liu, L.; Wei, Q.; Xu, X.; Chen, C. A novel DDPG method with prioritized experience replay. In Proceedings of the 2017 IEEE International Conference on Systems, Man, and Cybernetics (SMC), Banff, AB, Canada, 5–8 October 2017.
- Schaul, T.; Quan, J.; Antonoglou, I.; Silver, D. Prioritized experience replay. arXiv 2015, arXiv:1511.05952.
- Fujimoto, S.; Hoof, H.; Meger, D. Addressing function approximation error in actor-critic methods. In Proceedings of the International Conference on Machine Learning, PMLR, Stockholm, Sweden, 10–15 July 2018; pp. 1587–1596.
- Zhang, H.; Di, B.; Song, L.; Han, Z. Reconfigurable Intelligent Surfaces Assisted Communications With Limited Phase Shifts: How Many Phase Shifts Are Enough? IEEE Trans. Veh. Technol. 2020, 69, 4498–4502.
- Study on Channel Model for Frequencies from 0.5 to 100 GHz (Release 17). Document 3GPP TR 38.901. v17.0.0. 2022. Available online: https://www.3gpp.org/DynaReport/38901.htm (accessed on 10 September 2023).
- Basar, E.; Yildirim, I. Reconfigurable Intelligent Surfaces for Future Wireless Networks: A Channel Modeling Perspective. IEEE Wirel. Commun. 2021, 28, 108–114.
- Wang, Z.; Wei, Y.; Yu, F.R.; Han, Z. Utility Optimization for Resource Allocation in Multi-Access Edge Network Slicing: A Twin-Actor Deep Deterministic Policy Gradient Approach. IEEE Trans. Wirel. Commun. 2022, 21, 5842–5856.
Ref. | Phases | Users | Targets | Radar Paths | Method |
---|---|---|---|---|---|
[13] | Continuous | Single | Single | LoS, NLoS | MM |
[8] | Continuous | Single | Multiple | NLoS | SDR |
[15] | Continuous | Multiple | Multiple | LoS | AO |
[18] | Continuous | Single | Single | LoS, NLoS | PDD, BCD |
[19] | Discrete | Multiple | Multiple | LoS | AO |
[20] | Continuous | Multiple | Single | LoS, NLoS | ADMM, AO |
[21] | Discrete | Multiple | Multiple | NLoS | SDR |
[22] | Continuous | Single | Multiple | LoS, NLoS | BCD |
This paper | Continuous | Multiple | Multiple | NLoS | DRL |
Parameter | Description | Value |
---|---|---|
M | Number of antennas at BS | 8 |
 | Number of IRS elements | 64 |
K | Number of UEs | 8 |
 | Power budget of BS | 10 dB |
 | Transmit power of the UE | 30 dBm |
 | Noise variance | −85 dBm |
 | Bandwidth allocated to UE k | 2 MHz |
 | Input data size of task | Mbits |
 | Required computation cost | Kcycles/bit |
 | CPU frequency of BS server | 10 Gcycles/s |
 | CPU frequency of UE | Gcycles/s |
 | Maximum tolerable latency | 100 ms |
 | Effective capacitance coefficient | , |
 | Learning rate for actor and critic networks | 0.001, 0.001 |
 | Discount factor | 0.7 |
 | Soft update factor | 0.01 |
 | Capacity of experience buffer | 10,000 |
J | Capacity of minibatch | 16 |
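For reproducing the simulation setup, the parameters in the table above could be collected in a single configuration object. The sketch below is one possible layout: the class and field names are hypothetical, only rows whose values are fully legible in the table are included, and the BS power budget is left out because its unit is ambiguous as printed.

```python
from dataclasses import dataclass


def dbm_to_watt(p_dbm: float) -> float:
    """Convert a power level from dBm to watts."""
    return 10 ** ((p_dbm - 30.0) / 10.0)


@dataclass(frozen=True)
class SimConfig:
    # System parameters
    num_bs_antennas: int = 8         # M
    num_irs_elements: int = 64
    num_ues: int = 8                 # K
    ue_tx_power_dbm: float = 30.0
    noise_power_dbm: float = -85.0
    bandwidth_hz: float = 2e6        # per-UE bandwidth, 2 MHz
    bs_cpu_hz: float = 10e9          # 10 Gcycles/s
    max_latency_s: float = 0.1       # 100 ms
    # DRL hyperparameters
    actor_lr: float = 1e-3
    critic_lr: float = 1e-3
    discount: float = 0.7
    soft_update_tau: float = 0.01
    buffer_capacity: int = 10_000
    minibatch_size: int = 16         # J


cfg = SimConfig()
ue_tx_power_w = dbm_to_watt(cfg.ue_tx_power_dbm)   # 30 dBm corresponds to 1 W
noise_power_w = dbm_to_watt(cfg.noise_power_dbm)
```

Keeping the dataclass frozen prevents accidental changes to the parameters during a training run; rows with missing values (task size, computation cost, UE CPU frequency, capacitance coefficient) would have to be filled in from the original article.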
© 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Share and Cite
Yang, L.; Wei, Y.; Wang, X. Joint Task Offloading and Resource Allocation for Intelligent Reflecting Surface-Aided Integrated Sensing and Communication Systems Using Deep Reinforcement Learning Algorithm. Sensors 2023, 23, 9896. https://doi.org/10.3390/s23249896