
Diffusion-Based Reinforcement Learning for Edge-Enabled AI-Generated Content Services

Published: 01 September 2024

Abstract

As the Metaverse emerges as the next-generation Internet paradigm, the ability to efficiently generate content is paramount. AI-Generated Content (AIGC) offers a key solution, yet the resource-intensive nature of large Generative AI (GAI) models presents challenges. To address this issue, we introduce an AIGC-as-a-Service (AaaS) architecture, which deploys AIGC models in wireless edge networks to ensure broad accessibility of AIGC services for Metaverse users. Nonetheless, providing personalized user experiences requires carefully selecting AIGC Service Providers (ASPs) capable of effectively executing user tasks, a choice complicated by environmental uncertainty and variability. Addressing this gap in current research, we introduce the AI-Generated Optimal Decision (AGOD) algorithm, a diffusion model-based approach to generating optimal ASP selection decisions. Integrating AGOD with Deep Reinforcement Learning (DRL), we develop the Deep Diffusion Soft Actor-Critic (D2SAC) algorithm, which enhances the efficiency and effectiveness of ASP selection. Comprehensive experiments demonstrate that D2SAC outperforms seven leading DRL algorithms. Furthermore, the proposed AGOD algorithm can be extended to various optimization problems in wireless networks, positioning it as a promising approach for future research on AIGC-driven services.
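To make the core idea concrete, the sketch below illustrates how a diffusion model can generate a discrete selection decision: starting from Gaussian noise, a reverse denoising process produces a vector of logits over candidate ASPs, which a softmax turns into a selection distribution. This is a minimal toy illustration under assumed parameters, not the authors' D2SAC implementation; the denoising network is replaced by a hypothetical fixed function `predict_noise`, and the number of ASPs, step count, and variance schedule are all made up for the example.

```python
import numpy as np

rng = np.random.default_rng(0)

NUM_ASPS = 5                               # hypothetical number of candidate ASPs
T = 10                                     # number of denoising steps (assumed)
betas = np.linspace(1e-4, 0.2, T)          # toy variance schedule
alphas = 1.0 - betas
alpha_bars = np.cumprod(alphas)

def predict_noise(x, t, state):
    # Stand-in for a learned denoising network eps_theta(x, t, s).
    # In a D2SAC-style actor this would be a neural network conditioned
    # on the environment state; here it is a fixed toy function.
    return 0.1 * x - 0.05 * state

def agod_sample(state):
    """Reverse diffusion: denoise Gaussian noise into ASP-selection logits."""
    x = rng.standard_normal(NUM_ASPS)
    for t in reversed(range(T)):
        eps = predict_noise(x, t, state)
        # Standard DDPM reverse-step mean, plus noise except at the last step.
        coef = betas[t] / np.sqrt(1.0 - alpha_bars[t])
        x = (x - coef * eps) / np.sqrt(alphas[t])
        if t > 0:
            x += np.sqrt(betas[t]) * rng.standard_normal(NUM_ASPS)
    # Softmax over the denoised logits gives a selection distribution.
    e = np.exp(x - x.max())
    return e / e.sum()

probs = agod_sample(state=rng.standard_normal(NUM_ASPS))
asp = int(np.argmax(probs))                # index of the selected ASP
```

In the full algorithm, the denoising network would be trained with an actor-critic objective so that the sampled distribution favors ASPs yielding high expected reward; the sketch only shows the generation side of that loop.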


Published In

IEEE Transactions on Mobile Computing, Volume 23, Issue 9, Sept. 2024, 473 pages.
Publisher: IEEE Educational Activities Department, United States.

Qualifiers

  • Research-article

Cited By

  • Wireless Metaverse Behavior Models and Optimization Based on Bandwagon Effects, IEEE Transactions on Wireless Communications, vol. 23, no. 11, pp. 17586–17601, Nov. 2024. doi:10.1109/TWC.2024.3454800
  • Energy-Efficient Resource Allocation in Generative AI-Aided Secure Semantic Mobile Networks, IEEE Transactions on Mobile Computing, vol. 23, no. 12, pp. 11422–11435, Dec. 2024. doi:10.1109/TMC.2024.3396860
  • The Age of Generative AI and AI-Generated Everything, IEEE Network, vol. 38, no. 6, pp. 501–512, Nov. 2024. doi:10.1109/MNET.2024.3422241
  • Energy-Efficient Ground-Air-Space Vehicular Crowdsensing by Hierarchical Multi-Agent Deep Reinforcement Learning With Diffusion Models, IEEE Journal on Selected Areas in Communications, vol. 42, no. 12, pp. 3566–3580, Dec. 2024. doi:10.1109/JSAC.2024.3459039
