Abstract
Maintenance consumes 40% to 80% of software development costs, so writing source code that is easy to understand is essential to reducing maintenance costs. Improving code understanding is important because developers often mistake the meaning of code and misjudge program behavior, which can lead to errors. Certain patterns in source code, such as reliance on operator precedence and use of the comma operator, have been shown to influence code understanding negatively. Despite these initial results, the patterns have not yet been evaluated in a real-world setting. Thus, it is not clear whether developers agree that the patterns studied by researchers can cause substantial misunderstandings in real-world practice. To better understand the relevance of misunderstanding patterns, we applied a mixed-methods research approach, combining repository mining with a survey of developers, to evaluate misunderstanding patterns in 50 C open-source projects, including Apache, OpenSSL, and Python. Overall, we found more than 109K occurrences of the 12 patterns in practice. Our study shows that, according to developers, only some of the patterns previously considered by researchers may cause misunderstandings. Our results complement previous studies by taking the perception of developers into account.
Acknowledgments
We would like to thank Dan Gopstein for the useful feedback regarding our study. Apel’s work has been supported by the German Research Foundation (AP 206/6). This work was funded by CNPq (308380/2016-9, 477943/2013-6, 460883/2014-3, 465614/2014-0, 306610/2013-2, 307190/2015-3, and also CNPq 409335/2016-9), FAPEAL (PPG 14/2016), and CAPES grants (175956 and 117875).
Additional information
Communicated by: Christoph Treude
Appendix A: Survey with Developers
We are investigating specific C constructions (code patterns) in source code. This survey presents some code patterns and asks you about their influence on understanding the source code. For each question, we will present the code pattern on the Left-Hand Side (LHS) and an alternative on the Right-Hand Side (RHS).
You should be able to answer our survey in around 10-15 minutes. We will use your answers to understand the practical use of code patterns and develop supporting tools. We really appreciate your help. Thanks!
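To illustrate the LHS/RHS format described above, the following is a minimal sketch of two pattern/alternative pairs for the two patterns named in the abstract (operator precedence and the comma operator). These are not the actual survey questions: the function names and code are hypothetical and chosen only for illustration.

```c
#include <assert.h>

/* Pattern (LHS): relies on && binding more tightly than ||.
 * A reader who groups it as (a || b) && c misjudges the result. */
int precedence_pattern(int a, int b, int c) {
    return a || b && c;
}

/* Alternative (RHS): same semantics, grouping made explicit. */
int precedence_alternative(int a, int b, int c) {
    return a || (b && c);
}

/* Pattern (LHS): the comma operator evaluates x += 2 first,
 * discards its value, and yields x * 3 as the result. */
int comma_pattern(int x) {
    int y = (x += 2, x * 3);
    return y;
}

/* Alternative (RHS): the two steps written as separate statements. */
int comma_alternative(int x) {
    x += 2;
    int y = x * 3;
    return y;
}

/* Hypothetical helper: checks that each pattern and its alternative
 * agree on a small set of inputs. */
void check_pairs(void) {
    for (int a = 0; a <= 1; a++)
        for (int b = 0; b <= 1; b++)
            for (int c = 0; c <= 1; c++)
                assert(precedence_pattern(a, b, c) ==
                       precedence_alternative(a, b, c));
    for (int x = -3; x <= 3; x++)
        assert(comma_pattern(x) == comma_alternative(x));
}
```

In each pair, both sides compute the same value; for instance, `precedence_pattern(1, 0, 0)` and `precedence_alternative(1, 0, 0)` both return 1, whereas the mistaken grouping `(a || b) && c` would give 0. The question in such a pair is only which side is easier to understand.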
Cite this article
Medeiros, F., Lima, G., Amaral, G. et al. An investigation of misunderstanding code patterns in C open-source software projects. Empir Software Eng 24, 1693–1726 (2019). https://doi.org/10.1007/s10664-018-9666-x
DOI: https://doi.org/10.1007/s10664-018-9666-x