WEFix: Intelligent Automatic Generation of Explicit Waits for Efficient Web End-to-End Flaky Tests

Authors:

Weihang Wang

WWW '24: Proceedings of the ACM Web Conference 2024

Pages 3043 - 3052

https://doi.org/10.1145/3589334.3645628

Published: 13 May 2024

Abstract

Web end-to-end (e2e) testing evaluates the workflow of a web application. It simulates real-world user scenarios to ensure that the application flows behave as expected. However, web e2e tests are notorious for being flaky, i.e., the tests can produce inconsistent results despite no changes to the code. One common type of flakiness is caused by nondeterministic execution order between the test code and the client-side code under test. In particular, UI-based flakiness is a notably prevalent issue that is challenging to fix, because the test code has limited knowledge of the client-side code execution. In this paper, we propose WEFix, a technique that automatically generates fixes for UI-based flakiness in web e2e testing. The core of our approach is to leverage browser UI changes to predict the client-side code execution and generate proper wait oracles. We evaluate the effectiveness and efficiency of WEFix on 122 flaky web e2e tests from seven popular real-world projects. Our results show that WEFix dramatically reduces the overhead (from 3.7× to 1.25×) while achieving high correctness (98%).
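
To make the trade-off between correctness and overhead concrete, the following is a minimal sketch in TypeScript that compares a fixed sleep with a condition-based (explicit) wait. It is an illustration only, not WEFix's implementation: the function names (`fixedSleep`, `explicitWait`), the `WaitResult` type, and the virtual-time model in which the UI update lands at `renderAtMs` are all assumptions introduced here, and the numbers are made up for the demo rather than taken from the paper.

```typescript
// Illustrative virtual-time model (an assumption, not the paper's method).
// No real timers are used, so every run gives the same result.

interface WaitResult {
  passed: boolean;  // did the test observe the UI update before asserting?
  waitedMs: number; // virtual time spent waiting before the assertion
}

// Fixed sleep: wait a constant amount of time, then assert exactly once.
// Too short a sleep fails; too long a sleep wastes time on every run.
function fixedSleep(renderAtMs: number, sleepMs: number): WaitResult {
  return { passed: sleepMs >= renderAtMs, waitedMs: sleepMs };
}

// Explicit wait: re-check a condition every `intervalMs` and stop as soon
// as it holds, giving up once `timeoutMs` has elapsed.
function explicitWait(
  renderAtMs: number,
  intervalMs: number,
  timeoutMs: number
): WaitResult {
  if (intervalMs <= 0) {
    throw new RangeError("intervalMs must be positive");
  }
  let now = 0;
  while (now <= timeoutMs) {
    if (now >= renderAtMs) {
      return { passed: true, waitedMs: now };
    }
    now += intervalMs;
  }
  return { passed: false, waitedMs: timeoutMs };
}

// Hypothetical scenario: the client-side code updates the UI after 120 ms.
console.log(fixedSleep(120, 100));
console.log(fixedSleep(120, 1000));
console.log(explicitWait(120, 50, 1000));
```

In this toy model the short sleep misses the update, the long sleep passes but always pays its full duration, and the explicit wait stops at the first poll after the update. The hard part in practice, which the sketch sidesteps, is knowing which condition to wait for.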



Cited By

Zhang H, Liao L, Ding Z, Shang W, Narula N, Sporea C, Toma A, Sajedi S, Filkov V, Ray B, Zhou M (2024). Towards a Robust Waiting Strategy for Web GUI Testing for an Industrial Software System. Proceedings of the 39th IEEE/ACM International Conference on Automated Software Engineering, pp. 2065-2076. DOI: 10.1145/3691620.3695269. Online publication date: 27-Oct-2024.
https://dl.acm.org/doi/10.1145/3691620.3695269

Index Terms

1. Software and its engineering
  1. Software creation and management
    1. Software verification and validation

Published In

WWW '24: Proceedings of the ACM Web Conference 2024

May 2024

4826 pages

ISBN: 9798400701719

DOI: 10.1145/3589334

General Chairs: Tat-Seng Chua (National University of Singapore), Chong-Wah Ngo (Singapore Management University)

Proceedings Chair: Roy Ka-Wei Lee (Singapore University of Technology and Design)

Program Chairs: Ravi Kumar (Google), Hady W. Lauw (Singapore Management University)

Copyright © 2024 Owner/Author.

This work is licensed under a Creative Commons Attribution International 4.0 License.

Sponsors

SIGWEB: ACM Special Interest Group on Hypertext, Hypermedia, and Web

Publisher

Association for Computing Machinery

New York, NY, United States

Qualifiers

Research-article

Funding Sources

US National Science Foundation

Conference

WWW '24: The ACM Web Conference 2024

Sponsor: SIGWEB

May 13 - 17, 2024

Singapore, Singapore

Acceptance Rates

Overall Acceptance Rate 1,899 of 8,196 submissions, 23%
