research-article

A multi-dimensional analysis of technical lag in Debian-based Docker images

Published: 01 March 2021

Abstract

Container-based solutions such as Docker have become increasingly relevant in the software industry to facilitate deploying and maintaining software systems. Little is known, however, about how outdated such containers are at the moment of their release or when used in production. This article addresses this question by measuring and comparing five dimensions of technical lag that Docker container images can face: package lag, time lag, version lag, vulnerability lag, and bug lag. We instantiate the formal technical lag framework from previous work to operationalise these dimensions of lag on Docker Hub images based on the Debian Linux distribution. We carry out a large-scale empirical study of such technical lag, over a three-year period, in 140,498 Debian images. We compare official and community images, as well as images based on different Debian distributions: OldStable, Stable, or Testing. The analysis shows that the different dimensions of technical lag are complementary, providing multiple insights. Official Debian images consistently have a lower lag than community images for all considered lag dimensions. The amount of lag incurred depends on the type of Debian distribution and the considered lag dimension. Our research offers empirical evidence that developers and deployers of Docker images can benefit from identifying to what extent their containers are outdated according to the considered dimensions, and from mitigating the risks related to such outdatedness.
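To make the five dimensions concrete, the following is a minimal sketch of how they could be aggregated for a single image. It is not the article's implementation: the abstract does not define the dimensions formally, so the definitions used here (each lag measured against the most recent release of a package, then summed over installed packages) are assumptions, and the package names, versions, dates, and counts are hypothetical.

```python
from __future__ import annotations

from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Release:
    """One released version of a package (all values below are hypothetical)."""
    version: str
    released: date
    vulnerabilities: int  # known vulnerabilities affecting this version
    bugs: int             # known bugs affecting this version


def image_lag(installed: dict[str, str],
              history: dict[str, list[Release]]) -> dict[str, int]:
    """Sum five assumed lag dimensions over the packages installed in an image.

    history[name] lists the releases of a package, oldest first; the last
    entry is treated as the ideal (most recent) version.
    """
    lag = {"package": 0, "time_days": 0, "version": 0,
           "vulnerability": 0, "bug": 0}
    for name, version in installed.items():
        releases = history[name]
        index = next(i for i, r in enumerate(releases) if r.version == version)
        used, latest = releases[index], releases[-1]
        missed = len(releases) - 1 - index
        if missed == 0:
            continue  # package is up to date and contributes no lag
        lag["package"] += 1                   # one more outdated package
        lag["version"] += missed              # newer releases not adopted
        lag["time_days"] += (latest.released - used.released).days
        lag["vulnerability"] += used.vulnerabilities - latest.vulnerabilities
        lag["bug"] += used.bugs - latest.bugs
    return lag


# Hypothetical release histories for two made-up packages.
history = {
    "pkg-a": [
        Release("1.0", date(2019, 1, 1), vulnerabilities=5, bugs=12),
        Release("1.1", date(2019, 7, 1), vulnerabilities=2, bugs=9),
        Release("1.2", date(2020, 1, 1), vulnerabilities=0, bugs=7),
    ],
    "pkg-b": [
        Release("3.4", date(2019, 3, 1), vulnerabilities=1, bugs=3),
    ],
}
installed = {"pkg-a": "1.0", "pkg-b": "3.4"}

result = image_lag(installed, history)
print(result)
```

In this toy image only `pkg-a` is outdated, so it alone accounts for every dimension. The same package can score very differently per dimension (two missed versions, a year of time lag, five extra vulnerabilities), which is one way to read the abstract's claim that the dimensions are complementary.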



Published In

Empirical Software Engineering, Volume 26, Issue 2
Mar 2021
678 pages

Publisher

Kluwer Academic Publishers

United States

Publication History

Published: 01 March 2021
Accepted: 23 September 2020

Author Tags

  1. Technical lag
  2. Container images
  3. Docker
  4. Outdated packages
  5. Security vulnerabilities
  6. Bugs
  7. Debian
  8. Empirical analysis

Qualifiers

  • Research-article


Cited By

  • (2024) Quantifying Security Issues in Reusable JavaScript Actions in GitHub Workflows. Proceedings of the 21st International Conference on Mining Software Repositories, pp 692-703. https://doi.org/10.1145/3643991.3644899. Online publication date: 15-Apr-2024
  • (2024) Mitigating Security Issues in GitHub Actions. Proceedings of the 2024 ACM/IEEE 4th International Workshop on Engineering and Cybersecurity of Critical Systems (EnCyCriS) and 2024 IEEE/ACM Second International Workshop on Software Vulnerability, pp 6-11. https://doi.org/10.1145/3643662.3643961. Online publication date: 15-Apr-2024
  • (2023) DRIVE: Dockerfile Rule Mining and Violation Detection. ACM Transactions on Software Engineering and Methodology. https://doi.org/10.1145/3617173. Online publication date: 21-Aug-2023
  • (2022) Understanding and Predicting Docker Build Duration: An Empirical Study of Containerized Workflow of OSS Projects. Proceedings of the 37th IEEE/ACM International Conference on Automated Software Engineering, pp 1-13. https://doi.org/10.1145/3551349.3556940. Online publication date: 10-Oct-2022
  • (2022) Task assignment to counter the effect of developer turnover in software maintenance. Information and Software Technology 143:C. https://doi.org/10.1016/j.infsof.2021.106786. Online publication date: 1-Mar-2022
  • (2022) On the impact of security vulnerabilities in the npm and RubyGems dependency networks. Empirical Software Engineering 27:5. https://doi.org/10.1007/s10664-022-10154-1. Online publication date: 30-May-2022
