Machine Learning for Communications
Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.
© 2021 by the author. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).