Towards Better Comprehension of Breaking Changes in the NPM Ecosystem

Online AM: 02 November 2024

Abstract

Code evolution is prevalent in software ecosystems. It brings many benefits, such as new features, bug fixes, and security patches, but it can also introduce breaking changes that cause downstream projects to fail. Breaking changes impose considerable effort on both downstream and upstream developers: downstream developers must adapt to them, and upstream developers are responsible for identifying and documenting them. In the NPM ecosystem, which is characterized by frequent code changes and a high tolerance for breaking changes, this effort is even greater.
To better understand breaking changes in the NPM ecosystem and to improve breaking change detection tools, we conduct a large-scale empirical study. We construct a dataset of explicitly documented breaking changes from 381 popular NPM projects. We find that 95.4% of the detected breaking changes can be covered by developers’ documentation, and that 19% of the breaking changes cannot be detected by regression testing. By investigating the source code of the collected breaking changes, we derive a taxonomy of JavaScript- and TypeScript-specific syntactic breaking changes and a taxonomy of the major types of behavioral breaking changes. Additionally, we investigate why developers make breaking changes in NPM and find three major reasons, each containing several sub-items: to reduce code redundancy, to improve identifier names, and to improve API design.
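To make the notion of a behavioral breaking change that escapes regression testing concrete, the following is a minimal sketch and is not taken from the study's dataset. The functions sortTagsV1 and sortTagsV2, the regression test, and the client are all hypothetical: the two versions share a signature, but the second one mutates its argument.

```typescript
// Hypothetical upstream API, version 1:
// returns a sorted copy and leaves the caller's array untouched.
function sortTagsV1(tags: string[]): string[] {
  return [...tags].sort();
}

// Hypothetical version 2: same signature, but sorts the input in place.
// Nothing changes syntactically; only the behavior differs.
function sortTagsV2(tags: string[]): string[] {
  return tags.sort();
}

// An assumed upstream regression test that only inspects the return value.
function regressionTestPasses(sortTags: (tags: string[]) => string[]): boolean {
  const result = sortTags(["b", "a"]);
  return result.join(",") === "a,b";
}

// An assumed downstream client that relies on its own array keeping its order.
function clientFirstTag(sortTags: (tags: string[]) => string[]): string {
  const tags = ["b", "a"];
  sortTags(tags);
  return tags[0];
}

// The regression test passes for both versions...
console.log(regressionTestPasses(sortTagsV1), regressionTestPasses(sortTagsV2)); // true true
// ...but the client observes different results.
console.log(clientFirstTag(sortTagsV1), clientFirstTag(sortTagsV2)); // b a
```

In this sketch the test suite cannot distinguish the two versions because it never checks the input after the call, which is one plausible way for a behavioral change to go unnoticed until a downstream project fails.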
We provide actionable implications for future research. For example, automatic naming and renaming techniques should be applied in JavaScript projects to improve identifier names, and future research can try to detect more types of behavioral breaking changes. In presenting these implications, we also discuss the weaknesses of automatic renaming and breaking change detection approaches, such as their lack of support for public identifiers and for various types of breaking changes.
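As an illustration of why renaming a public identifier is harder than renaming a local one, here is a small sketch under stated assumptions. The export objects libV1 and libV2, the names normalise and normalize, and the client function are hypothetical. The loose Exports type stands in for an untyped JavaScript client that gets no compile-time warning.

```typescript
// Loosely typed export object, standing in for what an untyped JavaScript client sees.
type Exports = Record<string, (input: string) => string>;

// Hypothetical version 1 exposes the public identifier `normalise`.
const libV1: Exports = {
  normalise: (input) => input.trim().toLowerCase(),
};

// Hypothetical version 2 improves the name to `normalize`; the body is unchanged.
const libV2: Exports = {
  normalize: (input) => input.trim().toLowerCase(),
};

// An assumed downstream client written against version 1.
function clientCall(lib: Exports): string {
  const { normalise } = lib; // destructuring by the old public name
  if (typeof normalise !== "function") {
    return "broken: normalise is not exported";
  }
  return normalise("  Hello ");
}

console.log(clientCall(libV1)); // hello
console.log(clientCall(libV2)); // broken: normalise is not exported
```

A rename of a local variable inside the library would be invisible to this client, whereas the rename of the exported key changes what every downstream destructuring or property access resolves to.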

Published In

ACM Transactions on Software Engineering and Methodology, Just Accepted
EISSN: 1557-7392
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org.

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Online AM: 02 November 2024
Accepted: 11 October 2024
Revised: 03 October 2024
Received: 17 April 2024

Author Tags

  1. Breaking Change
  2. NPM
  3. JavaScript
  4. Code Evolution

Qualifiers

  • Research-article

Bibliometrics & Citations

Bibliometrics

Article Metrics

  • Total Citations: 0
  • Total Downloads: 143
  • Downloads (Last 12 months): 143
  • Downloads (Last 6 weeks): 42
Reflects downloads up to 11 Feb 2025
