research-article

Deep reinforcement learning-based air combat maneuver decision-making: literature review, implementation tutorial and future direction

Authors:

Jie Liu

Artificial Intelligence Review, Volume 57, Issue 1

https://doi.org/10.1007/s10462-023-10620-2

Published: 28 December 2023

Abstract

Innovative air combat paradigms that rely on unmanned aerial vehicles (UAVs), i.e., UAV swarms and UAV-manned aircraft cooperation, have received great attention worldwide. During operation, UAVs are expected to perform agile and safe maneuvers according to dynamic mission requirements and a complicated battlefield environment. Deep reinforcement learning (DRL), which is well suited to sequential decision-making processes, provides a powerful tool for air combat maneuver decision-making (ACMD), and hundreds of related research papers have been published in the last five years. However, because the topic is still emerging, a systematic review and tutorial is lacking. For this reason, this paper first provides a comprehensive literature review to help readers grasp the whole picture of this field. It starts from DRL itself and then extends to its application in ACMD. Special attention is given to the design of the reward function, which is the core of DRL-based ACMD. Then, a maneuver decision-making method based on one-to-one dogfight scenarios is proposed to enable a UAV to win short-range air combat. The model establishment, program design, training methods and performance evaluation are described in detail. The associated Python code is available at gitee.com/wangyyhhh, enabling researchers to get started quickly and build their own ACMD applications with slight modifications. Finally, limitations of the considered model, as well as possible future research directions for intelligent air combat, are discussed.


Cited By

Xia H, Ke Y, Liao R, Sun Y (2025) Fractional order calculus enhanced dung beetle optimizer for function global optimization and multilevel threshold medical image segmentation. The Journal of Supercomputing 81:1. doi:10.1007/s11227-024-06592-x. Online publication date: 1-Jan-2025
https://dl.acm.org/doi/10.1007/s11227-024-06592-x
Zheng Y, Xin B, He B, Ding Y (2024) Mean policy-based proximal policy optimization for maneuvering decision in multi-UAV air combat. Neural Computing and Applications 36:31 (19667-19690). doi:10.1007/s00521-024-10261-8. Online publication date: 1-Nov-2024
https://dl.acm.org/doi/10.1007/s00521-024-10261-8




Published In


Artificial Intelligence Review Volume 57, Issue 1

Jan 2024

569 pages


© The Author(s), under exclusive licence to Springer Nature B.V. 2023. Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

Publisher

Kluwer Academic Publishers

United States

Publication History

Published: 28 December 2023

Accepted: 01 October 2023


