DOI: 10.1145/3643991.3644873
research-article

gawd: A Differencing Tool for GitHub Actions Workflows

Published: 02 July 2024

Abstract

The GitHub social coding platform introduced GitHub Actions as a way to automate different aspects of collaborative software development through the use of workflow files. It is the most popular CI/CD and workflow automation tool for GitHub. To maintain workflow code over time, it is useful to rely on differencing tools to identify the changes made during successive commits. Unfortunately, existing code differencing tools are not able to correctly identify changes made to workflow files. We therefore implemented gawd, a syntactic differencing tool for GitHub Actions workflows. The tool is capable of reporting the addition, deletion, modification and move of syntactic components in workflow files, taking into account the specific syntax of workflows. gawd has been evaluated on manually classified sets of workflow changes taken from existing commits in 40 different GitHub repositories, and was able to successfully identify these changes. gawd is publicly released as an open source Python tool distributed on PyPI.
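To make the four change kinds named in the abstract concrete, here is a minimal sketch in Python. It is not gawd's algorithm or API: `flatten` and `toy_diff` are hypothetical helpers, the workflows are assumed to be already parsed from YAML into plain dictionaries, and a "move" is approximated naively as a removed value reappearing unchanged at a new path.

```python
from typing import Any


def flatten(node: Any, path: tuple = ()) -> dict:
    """Map the path (tuple of keys and list indices) of each leaf to its value."""
    if isinstance(node, dict) and node:
        out = {}
        for key, value in node.items():
            out.update(flatten(value, path + (key,)))
        return out
    if isinstance(node, list) and node:
        out = {}
        for index, value in enumerate(node):
            out.update(flatten(value, path + (index,)))
        return out
    return {path: node}


def toy_diff(old: dict, new: dict) -> list:
    """Return (kind, old_path, new_path) tuples; hypothetical, not gawd's output format."""
    a, b = flatten(old), flatten(new)
    changes = []
    # Modified: the path exists on both sides but the value differs.
    for path in a:
        if path in b and a[path] != b[path]:
            changes.append(("modified", path, path))
    removed = {p: v for p, v in a.items() if p not in b}
    added = {p: v for p, v in b.items() if p not in a}
    # Moved: a removed value that reappears unchanged at a new path.
    for old_path, value in list(removed.items()):
        match = next((p for p, w in added.items() if w == value), None)
        if match is not None:
            changes.append(("moved", old_path, match))
            del removed[old_path]
            del added[match]
    changes += [("removed", p, None) for p in removed]
    changes += [("added", None, p) for p in added]
    return changes


# Two versions of a small, made-up workflow (already parsed from YAML).
old = {
    "name": "CI",
    "on": "push",
    "jobs": {"build": {
        "runs-on": "ubuntu-latest",
        "env": {"PY": "3.11"},
        "steps": [{"uses": "actions/checkout@v3"}, {"run": "make test"}],
    }},
}
new = {
    "on": "push",
    "env": {"PY": "3.11"},
    "jobs": {"build": {
        "runs-on": "ubuntu-latest",
        "timeout-minutes": 10,
        "steps": [{"uses": "actions/checkout@v4"}, {"run": "make test"}],
    }},
}

changes = toy_diff(old, new)
for change in changes:
    print(change)
```

A path-based comparison like this one treats a renamed job or an inserted step as many unrelated removals and additions, which hints at why a differencing tool needs to take the specific syntax of workflows into account.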



Published In

MSR '24: Proceedings of the 21st International Conference on Mining Software Repositories
April 2024
788 pages
ISBN:9798400705878
DOI:10.1145/3643991
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].


Publisher

Association for Computing Machinery

New York, NY, United States


Author Tags

  1. workflow automation
  2. diff tool
  3. software repository mining
  4. GitHub
  5. software changes
  6. software evolution

Qualifiers

  • Research-article
