Do Developers Really Know How to Use Git Commands? A Large-scale Study Using Stack Overflow

Authors:

Zhiqiu Huang

ACM Transactions on Software Engineering and Methodology (TOSEM), Volume 31, Issue 3

Article No.: 44, Pages 1 - 29

https://doi.org/10.1145/3494518

Published: 09 April 2022

Abstract

Git, a cross-platform, open-source distributed version control tool, provides strong support for non-linear development and is capable of handling everything from small to large projects with speed and efficiency. It has become an indispensable tool for millions of software developers and is the de facto standard for version control in software development. However, despite its widespread use, developers still frequently face difficulties when using Git commands to manage projects and collaborate. To help developers use Git more effectively, it is necessary to understand the issues and difficulties they encounter. Unfortunately, this problem has not yet been comprehensively studied. To fill this knowledge gap, in this article we conduct a large-scale study on Stack Overflow, a popular Q&A forum for developers. We extract and analyze 80,370 relevant questions from Stack Overflow and report the increasing popularity of Git command questions. By analyzing these questions, we identify the Git commands that are frequently asked about and those associated with difficult questions, which helps clarify the difficulties developers may encounter when using Git commands. In addition, we conduct a survey to understand how developers learn Git commands in practice, which shows that self-learning is the primary learning approach. These findings provide a range of actionable implications for researchers, educators, and developers.

Index Terms

1. General and reference
  1. Cross-computing tools and techniques
    1. Empirical studies
2. Software and its engineering
  1. Software creation and management
    1. Software development process management
  2. Software notations and tools
    1. Software configuration management and version control systems

Information & Contributors

Information

Published In

ACM Transactions on Software Engineering and Methodology Volume 31, Issue 3

July 2022

912 pages

ISSN:1049-331X

EISSN:1557-7392

DOI:10.1145/3514181

Editor:
Mauro Pezzè
USI Università della Svizzera italiana and SIT Schaffhausen Institute of Technology, Switzerland

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 09 April 2022

Online AM: 31 January 2022

Accepted: 01 October 2021

Revised: 01 August 2021

Received: 01 May 2021

Published in TOSEM Volume 31, Issue 3

Qualifiers

Research-article
Refereed

Funding Sources

National Natural Science Foundation of China
Leading-edge Technology Program of Jiangsu Natural Science Foundation
Natural Science Foundation of Jiangsu Province

Bibliometrics & Citations

Bibliometrics

Article Metrics

Total Citations: 10
Total Downloads: 1,761
Downloads (Last 12 months): 209
Downloads (Last 6 weeks): 37

Reflects downloads up to 10 Nov 2024

Citations

Cited By

O. Choquehuallpa-Hurtado, J. Bustinza Mendoza, and M. Aquino Cruz. 2024. Desarrollo de una aplicación de escritorio para la impresión de tickets en los sistemas de punto de venta de la empresa PUYU (Development of a desktop application for printing tickets in the point of sale systems of the PUYU Company). Micaela Revista de Investigación - UNAMBA 5, 2 (2024), 1–8. Online publication date: 29-Oct-2024.
https://doi.org/10.57166/micaela.v5.n2.2024.145
T. Bi, B. Xia, Z. Xing, Q. Lu, and L. Zhu. 2024. On the Way to SBOMs: Investigating Design Issues and Solutions in Practice. ACM Transactions on Software Engineering and Methodology 33, 6 (2024), 1–25. Online publication date: 27-Jun-2024.
https://dl.acm.org/doi/10.1145/3654442
Y. Yang, X. Hu, X. Xia, and X. Yang. 2024. The Lost World: Characterizing and Detecting Undiscovered Test Smells. ACM Transactions on Software Engineering and Methodology 33, 3 (2024), 1–32. Online publication date: 15-Mar-2024.
https://dl.acm.org/doi/10.1145/3631973
S. Khatoonabadi, A. Abdellatif, D. Costa, and E. Shihab. 2024. Predicting the First Response Latency of Maintainers and Contributors in Pull Requests. IEEE Transactions on Software Engineering 50, 10 (2024), 2529–2543. Online publication date: 1-Oct-2024.
https://dl.acm.org/doi/10.1109/TSE.2024.3443741
M. Jean de Dieu, P. Liang, M. Shahin, C. Yang, and Z. Li. 2024. Mining architectural information: A systematic mapping study. Empirical Software Engineering 29, 4 (2024). Online publication date: 4-Jun-2024.
https://dl.acm.org/doi/10.1007/s10664-024-10480-6
C. Shen, W. Yang, H. Jia, M. Pan, and Y. Zhou. 2024. Richen: Automated enrichment of Git documentation with usage examples and scenarios. Journal of Software: Evolution and Process (2024). Online publication date: 13-Mar-2024.
https://doi.org/10.1002/smr.2662
Z. Codabux, K. Zakia Sultana, and M. Chowdhury. 2024. A catalog of metrics at source code level for vulnerability prediction. Journal of Software: Evolution and Process 36, 7 (2024). Online publication date: 14-Jul-2024.
https://dl.acm.org/doi/10.1002/smr.2639
C. Shen, W. Yang, M. Pan, and Y. Zhou. 2023. Git Merge Conflict Resolution Leveraging Strategy Classification and LLM. In 2023 IEEE 23rd International Conference on Software Quality, Reliability, and Security (QRS). 228–239. Online publication date: 22-Oct-2023.
https://doi.org/10.1109/QRS60937.2023.00031
H. Jia, W. Yang, C. Shen, M. Pan, and Y. Zhou. 2023. Git command recommendations using crowd-sourced knowledge. Information and Software Technology 159, C (2023). Online publication date: 1-Jul-2023.
https://dl.acm.org/doi/10.1016/j.infsof.2023.107199
S. Alqahtani. 2023. Security bug reports classification using fasttext. International Journal of Information Security 23, 2 (2023), 1347–1358. Online publication date: 22-Dec-2023.
https://dl.acm.org/doi/10.1007/s10207-023-00793-w
