DOI: 10.1109/ASE.2019.00077
research-article

Wuji: automatic online combat game testing using evolutionary deep reinforcement learning

Published: 07 February 2020 Publication History

Abstract

Game testing has long been recognized as a notoriously challenging task; in the game industry it relies mainly on manual play and script-based testing. Even now, automated game testing remains a largely untouched niche. A key challenge is that testing a game often requires playing it as a sequential decision process: a bug may be triggered only after certain difficult intermediate tasks are completed, which requires a certain level of intelligence. The recent success of deep reinforcement learning (DRL) sheds light on advancing automated game testing without relying on skilled human players. However, existing DRL approaches mostly focus on winning the game rather than on testing it. To bridge this gap, we first perform an in-depth analysis of 1349 real bugs from four real-world commercial game products. Based on this analysis, we propose four oracles to support automated game testing, and we further propose Wuji, an on-the-fly game testing framework that leverages evolutionary algorithms, DRL, and multi-objective optimization to perform automatic game testing. Wuji balances winning the game against exploring the space of the game. Winning allows the agent to make progress in the game, while space exploration increases the chance of discovering bugs. We conduct a large-scale evaluation on a simple game and two popular commercial games. The results demonstrate the effectiveness of Wuji in exploring the space and detecting bugs. Moreover, Wuji found 3 previously unknown bugs in the commercial games, which have been confirmed by the developers.
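
As a rough illustration of the balance described in the abstract, the sketch below shows how a multi-objective selection step might keep policies that trade off winning against exploration. It is not the authors' implementation: the two scores per policy (`win_score`, `explore_score`), the sample population, and the function names are all assumptions made for this example. The only technique it shows is standard Pareto dominance filtering.

```python
from typing import List, Tuple

# Hypothetical per-policy scores: (win_score, explore_score), both to be maximized.
Score = Tuple[float, float]


def dominates(a: Score, b: Score) -> bool:
    """True if a is at least as good as b on both objectives and strictly better on one."""
    return a[0] >= b[0] and a[1] >= b[1] and (a[0] > b[0] or a[1] > b[1])


def pareto_front(scores: List[Score]) -> List[int]:
    """Return the indices of policies that no other policy dominates."""
    return [
        i
        for i, s in enumerate(scores)
        if not any(dominates(t, s) for j, t in enumerate(scores) if j != i)
    ]


# Made-up population of five policies, for illustration only.
population = [(0.9, 0.1), (0.5, 0.5), (0.1, 0.9), (0.4, 0.4), (0.9, 0.05)]
front = pareto_front(population)
print(front)  # indices of the non-dominated policies
```

In this toy population the strong winner, the balanced policy, and the strong explorer all survive, while policies that are worse on both objectives are filtered out. A single weighted sum of the two scores would not guarantee this.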


Information

Published In

ASE '19: Proceedings of the 34th IEEE/ACM International Conference on Automated Software Engineering
November 2019
1333 pages
ISBN: 9781728125084

In-Cooperation

  • IEEE CS

Publisher

IEEE Press

Author Tags

  1. artificial intelligence
  2. deep reinforcement learning
  3. evolutionary multi-objective optimization
  4. game testing

Qualifiers

  • Research-article

Acceptance Rates

Overall Acceptance Rate 82 of 337 submissions, 24%

Cited By

  • (2024) Effective, Platform-Independent GUI Testing via Image Embedding and Reinforcement Learning. ACM Transactions on Software Engineering and Methodology, 33:7 (1-27). DOI: 10.1145/3674728. Online publication date: 21-Jun-2024
  • (2024) DinoDroid: Testing Android Apps Using Deep Q-Networks. ACM Transactions on Software Engineering and Methodology, 33:5 (1-24). DOI: 10.1145/3652150. Online publication date: 4-Jun-2024
  • (2024) Enhancing Multi-agent System Testing with Diversity-Guided Exploration and Adaptive Critical State Exploitation. Proceedings of the 33rd ACM SIGSOFT International Symposium on Software Testing and Analysis (1491-1503). DOI: 10.1145/3650212.3680376. Online publication date: 11-Sep-2024
  • (2024) Test Optimization in DNN Testing: A Survey. ACM Transactions on Software Engineering and Methodology, 33:4 (1-42). DOI: 10.1145/3643678. Online publication date: 27-Jan-2024
  • (2024) Practical Non-Intrusive GUI Exploration Testing with Visual-based Robotic Arms. Proceedings of the IEEE/ACM 46th International Conference on Software Engineering (1-13). DOI: 10.1145/3597503.3639161. Online publication date: 20-May-2024
  • (2023) Fostering Collaboration and Advancing Research in Software Engineering and Game Development for Serious Contexts. ACM SIGSOFT Software Engineering Notes, 48:4 (46-50). DOI: 10.1145/3617946.3617955. Online publication date: 17-Oct-2023
  • (2023) LaF: Labeling-free Model Selection for Automated Deep Neural Network Reusing. ACM Transactions on Software Engineering and Methodology, 33:1 (1-28). DOI: 10.1145/3611666. Online publication date: 31-Jul-2023
  • (2023) A Unified Framework for Mini-game Testing: Experience on WeChat. Proceedings of the 31st ACM Joint European Software Engineering Conference and Symposium on the Foundations of Software Engineering (1623-1634). DOI: 10.1145/3611643.3613868. Online publication date: 30-Nov-2023
  • (2023) Games and Software Engineering. ACM SIGSOFT Software Engineering Notes, 48:1 (85-89). DOI: 10.1145/3573074.3573096. Online publication date: 17-Jan-2023
  • (2023) Studying the Influence and Distribution of the Human Effort in a Hybrid Fitness Function for Search-Based Model-Driven Engineering. IEEE Transactions on Software Engineering, 49:12 (5189-5202). DOI: 10.1109/TSE.2023.3329730. Online publication date: 1-Dec-2023
