Causes of merge conflicts: A case study of elasticsearch

W Mahmood, M Chagama, T Berger… - Proceedings of the 14th …, 2020 - dl.acm.org
Proceedings of the 14th International Working Conference on Variability …, 2020 - dl.acm.org
Software branching and merging allow collaborative development and the creation of software variants, an approach commonly referred to as clone & own. While simple and cheap, this approach carries a trade-off: code must be merged and merge conflicts resolved, and such conflicts occur frequently in practice. When resolving conflicts, a key challenge for developers is to understand the changes that led to the conflict. While merge conflicts and their characteristics are reasonably well understood, the same is not true of the actual changes that cause them.
We present a case study of the changes that lead to conflicts, both at the code level and at the project level (e.g., feature addition, refactoring, feature improvement). We analyzed the development history of ElasticSearch, a large open-source project that relies heavily on branching (forking) and merging. We inspected 40 merge conflicts in detail, sampled from 534 conflicts that a semi-structured merge tool could not resolve. At the code (structural) level, we classified the semantics of the changes made. At the project level, we categorized the decisions that motivated these changes. We contribute a categorization of code-level and project-level changes and a detailed dataset of 40 conflict resolutions, each described at both levels. As in prior studies, most of our conflicts are small, but our categorization of code-level changes surprisingly differs from that of prior work. Refactoring, feature additions, and feature enhancements are the most common causes of merge conflicts, and most of these conflicts could potentially be avoided with better development tooling.