research-article

Multi-UAV Reinforcement Learning for Data Collection in Cellular MIMO Networks

Authors:

Carles Diaz-Vilor,

Amr M. Abdelhady,

Ahmed M. Eltawil,

Hamid Jafarkhani

IEEE Transactions on Wireless Communications, Volume 23, Issue 10_Part_3

Pages 15462 - 15476

https://doi.org/10.1109/TWC.2024.3430228

Published: 01 October 2024

Abstract

Uncrewed aerial vehicles (UAVs) are a compelling solution for data collection in Internet of Things (IoT) networks owing to their mobility and adaptability. However, because their channels are dominated by line-of-sight propagation, UAV operations may cause severe interference to ground users. To address this, we present an optimization framework that jointly optimizes UAV trajectories and transmit powers. The framework collects data from a variety of IoT sensors while (a) minimizing the UAVs' flying time and (b) mitigating interference with terrestrial networks. Given the complexity of this optimization problem, we use reinforcement learning, specifically the twin delayed deep deterministic policy gradient (TD3) algorithm, and present a distributed learning algorithm. Experimental results validate the proposed approach, showing that it can significantly enhance data collection in IoT networks while minimizing UAV flight time and interference with ground user links.
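For readers unfamiliar with the twin delayed deep deterministic policy gradient algorithm named in the abstract, the following is a minimal sketch of two of its standard ingredients: the clipped double-Q target and target policy smoothing. It is not the authors' implementation. The function names, the default hyperparameters (`gamma`, `noise_std`, `noise_clip`) and the normalized action range of [-1, 1] are assumptions chosen for illustration, and the critic outputs are passed in as plain arrays rather than computed by neural networks.

```python
import numpy as np


def td3_target(reward, done, next_q1, next_q2, gamma=0.99):
    """Clipped double-Q target: y = r + gamma * (1 - done) * min(Q1', Q2').

    next_q1 and next_q2 stand in for the two target critics evaluated at
    the next state and the smoothed target action (assumed precomputed).
    """
    reward = np.asarray(reward, dtype=float)
    done = np.asarray(done, dtype=float)
    # Taking the minimum of the two critics counters overestimation bias.
    min_q = np.minimum(np.asarray(next_q1, dtype=float),
                       np.asarray(next_q2, dtype=float))
    return reward + gamma * (1.0 - done) * min_q


def smoothed_target_action(target_action, rng, noise_std=0.2,
                           noise_clip=0.5, low=-1.0, high=1.0):
    """Target policy smoothing: add clipped Gaussian noise to the target
    actor's action, then clip the result to the valid action range."""
    target_action = np.asarray(target_action, dtype=float)
    noise = rng.normal(0.0, noise_std, size=target_action.shape)
    noise = np.clip(noise, -noise_clip, noise_clip)
    return np.clip(target_action + noise, low, high)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Hypothetical batch of two transitions; the second one is terminal.
    y = td3_target(reward=[1.0, 0.0], done=[0, 1],
                   next_q1=[2.0, 5.0], next_q2=[3.0, 4.0], gamma=0.5)
    print(y)
    print(smoothed_target_action(np.zeros(4), rng))
```

In a setting like the one the abstract describes, each action vector would presumably encode a UAV's movement and transmit power, but that mapping is an assumption here and is not specified in the text above.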


© 2024 The Authors. This work is licensed under a Creative Commons Attribution 4.0 License. For more information, see https://creativecommons.org/licenses/by/4.0/.

Publisher

IEEE Press
