
Diffusion-Based Reinforcement Learning for Edge-Enabled AI-Generated Content Services

Published: 01 September 2024

Abstract

As the Metaverse emerges as the next-generation Internet paradigm, the ability to efficiently generate content is paramount. AI-Generated Content (AIGC) offers a key solution, yet the resource-intensive nature of large Generative AI (GAI) models presents challenges. To address this issue, we introduce an AIGC-as-a-Service (AaaS) architecture, which deploys AIGC models in wireless edge networks to ensure broad accessibility of AIGC services for Metaverse users. Nonetheless, providing personalized user experiences requires carefully selecting AIGC Service Providers (ASPs) capable of effectively executing user tasks, a choice complicated by environmental uncertainty and variability. Addressing this gap in current research, we introduce the AI-Generated Optimal Decision (AGOD) algorithm, a diffusion model-based approach to generating optimal ASP selection decisions. Integrating AGOD with Deep Reinforcement Learning (DRL), we develop the Deep Diffusion Soft Actor-Critic (D2SAC) algorithm, which enhances the efficiency and effectiveness of ASP selection. Comprehensive experiments demonstrate that D2SAC outperforms seven leading DRL algorithms. Furthermore, the proposed AGOD algorithm can be extended to various optimization problems in wireless networks, positioning it as a promising approach for future research on AIGC-driven services.
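To make the core idea concrete, the sketch below illustrates how a diffusion model can generate a discrete selection decision: starting from Gaussian noise, a reverse denoising process produces a vector of logits over candidate ASPs, which a softmax turns into a selection distribution. This is a minimal toy illustration under assumed parameters, not the authors' D2SAC implementation; the denoising network is replaced by a hypothetical fixed function `predict_noise`, and the number of ASPs, step count, and variance schedule are all made up for the example.

```python
import numpy as np

rng = np.random.default_rng(0)

NUM_ASPS = 5                               # hypothetical number of candidate ASPs
T = 10                                     # number of denoising steps (assumed)
betas = np.linspace(1e-4, 0.2, T)          # toy variance schedule
alphas = 1.0 - betas
alpha_bars = np.cumprod(alphas)

def predict_noise(x, t, state):
    # Stand-in for a learned denoising network eps_theta(x, t, s).
    # In a D2SAC-style actor this would be a neural network conditioned
    # on the environment state; here it is a fixed toy function.
    return 0.1 * x - 0.05 * state

def agod_sample(state):
    """Reverse diffusion: denoise Gaussian noise into ASP-selection logits."""
    x = rng.standard_normal(NUM_ASPS)
    for t in reversed(range(T)):
        eps = predict_noise(x, t, state)
        # Standard DDPM reverse-step mean, plus noise except at the last step.
        coef = betas[t] / np.sqrt(1.0 - alpha_bars[t])
        x = (x - coef * eps) / np.sqrt(alphas[t])
        if t > 0:
            x += np.sqrt(betas[t]) * rng.standard_normal(NUM_ASPS)
    # Softmax over the denoised logits gives a selection distribution.
    e = np.exp(x - x.max())
    return e / e.sum()

probs = agod_sample(state=rng.standard_normal(NUM_ASPS))
asp = int(np.argmax(probs))                # index of the selected ASP
```

In the full algorithm, the denoising network would be trained with an actor-critic objective so that the sampled distribution favors ASPs yielding high expected reward; the sketch only shows the generation side of that loop.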


Published In

IEEE Transactions on Mobile Computing, Volume 23, Issue 9, Sept. 2024, 473 pages.
Publisher: IEEE Educational Activities Department, United States.

Qualifiers

  • Research-article

Cited By

  • Wireless Metaverse Behavior Models and Optimization Based on Bandwagon Effects, IEEE Transactions on Wireless Communications, vol. 23, no. 11, pp. 17586–17601, Nov. 2024. doi:10.1109/TWC.2024.3454800
  • Energy-Efficient Resource Allocation in Generative AI-Aided Secure Semantic Mobile Networks, IEEE Transactions on Mobile Computing, vol. 23, no. 12, pp. 11422–11435, Dec. 2024. doi:10.1109/TMC.2024.3396860
  • The Age of Generative AI and AI-Generated Everything, IEEE Network, vol. 38, no. 6, pp. 501–512, Nov. 2024. doi:10.1109/MNET.2024.3422241
  • Energy-Efficient Ground-Air-Space Vehicular Crowdsensing by Hierarchical Multi-Agent Deep Reinforcement Learning With Diffusion Models, IEEE Journal on Selected Areas in Communications, vol. 42, no. 12, pp. 3566–3580, Dec. 2024. doi:10.1109/JSAC.2024.3459039
