research-article

Multi-UAV Reinforcement Learning for Data Collection in Cellular MIMO Networks

Published: 01 October 2024

Abstract

Uncrewed Aerial Vehicles (UAVs) provide a compelling solution for data collection in Internet of Things (IoT) networks due to their mobility and adaptability. However, the line-of-sight dominance in their channels may cause severe interference to ground users during UAV operations. To address this, we present a framework that jointly optimizes UAV trajectories and transmit powers. Our approach efficiently collects data from a variety of IoT sensors while (a) minimizing the UAVs' flying time and (b) mitigating interference with terrestrial networks. Given the complexity of this optimization problem, this paper leverages reinforcement learning, specifically the twin delayed deep deterministic policy gradient (TD3) algorithm, and presents a distributed learning algorithm. Experimental results validate the efficacy of the proposed approach, demonstrating that it significantly enhances data collection in IoT networks while minimizing UAV flight time and interference with ground user links.
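The abstract names the twin delayed deep deterministic policy gradient (TD3) algorithm but does not describe how the paper applies it. As general background only, the following is a minimal sketch of the standard TD3 critic target, which combines target policy smoothing with the minimum over two target critics. The function and parameter names (`td3_target`, `actor_target`, `critic1_target`, `critic2_target`) and the default hyperparameters are assumptions for illustration. The paper's actual state, action, and reward definitions and its distributed learning scheme are not reproduced here.

```python
import random


def clip(x, lo, hi):
    """Clamp x to the closed interval [lo, hi]."""
    return max(lo, min(hi, x))


def td3_target(reward, done, next_state,
               actor_target, critic1_target, critic2_target,
               gamma=0.99, noise_std=0.2, noise_clip=0.5,
               max_action=1.0, rng=random):
    """Generic TD3 critic target for one transition (illustrative sketch).

    actor_target(state) -> list of floats (hypothetical target policy)
    criticX_target(state, action) -> float (hypothetical target critics)
    """
    # Target policy smoothing: add clipped Gaussian noise to the target
    # action, then clip the result to the valid action range.
    next_action = [
        clip(a + clip(rng.gauss(0.0, noise_std), -noise_clip, noise_clip),
             -max_action, max_action)
        for a in actor_target(next_state)
    ]
    # Clipped double-Q: take the smaller of the two target critic
    # estimates to reduce overestimation bias.
    q_min = min(critic1_target(next_state, next_action),
                critic2_target(next_state, next_action))
    # No bootstrapping from terminal transitions.
    return reward + gamma * (0.0 if done else 1.0) * q_min
```

In a UAV setting the action vector could plausibly encode a movement command and a transmit power level, but that mapping is an assumption and is not stated in the abstract.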

Published In

IEEE Transactions on Wireless Communications, Volume 23, Issue 10, Part 3
Oct. 2024
1050 pages

Publisher

IEEE Press
