Comparing State- and Operation-based Change Tracking
Maximilian Koegel, Markus Herrmannsdoerfer, Yang Li, Jonas Helming, Joern David
Institut für Informatik, Technische Universität München
Boltzmannstrasse 3, 85748 Garching, Germany
koegel@in.tum.de, herrmama@in.tum.de, liya@in.tum.de, helming@in.tum.de, david@in.tum.de
ABSTRACT
In recent years, models are increasingly used throughout the entire lifecycle in software engineering projects. As a result, the need for managing these models in terms of change tracking and versioning emerged. However, many researchers have shown that existing approaches for Version Control (VC) do not work well on graph-like models, and therefore proposed alternative techniques and methods. They can be categorized into two different classes: state-based and operation-based approaches. Existing research shows advantages of operation-based over state-based approaches in selected use cases. However, there are no results available on the advantages of operation-based approaches in the most common use case of a VC system: review and understand change. In this paper, we present and discuss both approaches and their use cases. Moreover, we present the results of an empirical study to compare a state-based with an operation-based approach in the use case of reviewing and understanding change.

Categories and Subject Descriptors
D.2.7 [Software Engineering]: Distribution, Maintenance, and Enhancement—Version control; D.2.2 [Software Engineering]: Design Tools and Techniques—Computer-aided software engineering (CASE); D.2.9 [Software Engineering]: Management—Software configuration management

General Terms
Management, Documentation, Design, Experimentation

Keywords
Change Tracking, State-based, Operation-based, Change-based, Version Control System

International Conference on Software Engineering 2010, Cape Town, South Africa

1. INTRODUCTION
Today, models are an essential artifact throughout the entire lifecycle in software engineering projects. Model-driven development is putting even more emphasis on models, since they are not only an abstraction of the system under development, but the system is (partly) generated from its models. Consequently, models are about to cover the whole development process from requirements through design to deployment, including management of the process itself. With the adoption of model-driven development in industry, the need for managing these models in terms of change tracking and versioning emerged.

Version Control (VC), also commonly known as Software Configuration Management (SCM), is already in wide-spread use for textual artifacts such as source code. However, many publications, e.g. [1, 6, 11, 12, 18, 19, 21], recognized that existing VC approaches do not work well on models, which essentially are attributed graphs. The traditional VC systems are geared towards supporting textual artifacts such as source code, managing them on a line-oriented level. In contrast, many software engineering artifacts including models are not managed on a line-oriented level, and thus a line-oriented change management is not adequate. For example, adding an association between two classes in a UML class diagram is a structural change, which is neither line-oriented, nor should be managed in a line-oriented way. However, a single structural change in the diagram is managed as multiple line changes by traditional VC systems. Nguyen et al. describe this problem as the impedance mismatch between the flat textual data models of traditional VC systems and graph-based software models [18].
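This impedance mismatch can be illustrated with a toy example (Python; the XML-like serialization is our own invention, not an actual XMI fragment): a single structural change, adding one association that is stored at both of its ends, surfaces as several line-level changes.

```python
import difflib

# Toy serialization of two classes, before and after adding ONE association
# (stored redundantly at both ends, as serializations often do).
before = ["<class name='A'/>",
          "<class name='B'/>"]
after = ["<class name='A'>",
         "  <association target='B'/>",
         "</class>",
         "<class name='B'>",
         "  <association target='A'/>",
         "</class>"]

# A line-oriented VC system sees a list of added/removed lines,
# not one structural change.
line_changes = [d for d in difflib.ndiff(before, after) if d[0] in "+-"]
```

In this toy example, one conceptual change produces eight line-level edits, which is exactly the mismatch described above.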
Different approaches have been proposed to cope with the shortcomings of existing methods and techniques, to better support change tracking and versioning of graph-based models. They can be categorized into two different classes: state-based and change-based approaches [4].

State-based approaches only store states of a model, and thus need to derive differences by comparing two states, e.g. a version and its successor, after the changes occurred [4]. This activity is often referred to as diffing. The diffing process can be viewed as a calculation to derive the change post-mortem, and is generally expensive in computation time.

Change-based approaches record the changes while they occur, and store them in a repository. There is no need for diffing, since the changes are recorded and stored, and thus do not need to be derived later on. Operation-based approaches are a special class of change-based approaches which represent the changes as transformation operations on a state [4]. The recorded operations can be applied to a state to transform it into the successor state.

Figure 1: Use cases of a VC system (UML use case diagram)

Several publications exist that show advantages of change-based and in particular operation-based approaches over state-based approaches in use cases such as conflict detection and merging [6, 16, 17], repository mining [23], inconsistency detection [2], and coupled evolution [9, 28]. However, there are no results available on the advantages of operation-based approaches in the most common use case of a VC system: reviewing and understanding change. We claim that understanding change is the most important use case of a VC system from a user's point of view, as it is required for almost any other use case, e.g. commit, update, merge, etc.
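The contrast between the two classes can be made concrete with a minimal sketch (Python; the model is reduced to a dictionary of element attributes, and all names are our own, not taken from any of the cited tools): a state-based system derives changes by comparing two snapshots, while an operation-based system records them as they happen.

```python
# A "model" is a dict mapping element ids to attribute dicts.

def state_based_diff(old, new):
    """Derive changes post-mortem by comparing two states (matching by id)."""
    changes = []
    for eid in old.keys() - new.keys():
        changes.append(("delete", eid))
    for eid in new.keys() - old.keys():
        changes.append(("create", eid))
    for eid in old.keys() & new.keys():  # matched elements
        for attr in old[eid].keys() | new[eid].keys():
            if old[eid].get(attr) != new[eid].get(attr):
                changes.append(("set", eid, attr, new[eid].get(attr)))
    return changes  # temporal order and grouping are lost

class Recorder:
    """Operation-based tracking: record operations while they occur."""
    def __init__(self, model):
        self.model, self.log = model, []
    def set_attr(self, eid, attr, value):
        self.model[eid][attr] = value
        self.log.append(("set", eid, attr, value))  # exact order preserved
```

Note that if an attribute is changed twice between the two states, the diff reports only the net effect, while the recorder's log preserves both changes and their order — precisely the information that a state-based system loses.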
Therefore, we believe that it is essential to conduct experiments on how well this use case is supported by the state-based and operation-based approaches.

In this paper, we discuss representatives of the different approaches as well as the advantages and disadvantages of each type of approach in general. To qualitatively compare them, we present frequent use cases of a VC system, and show how they are supported by the respective approach. Finally, we present the results of an empirical study we conducted to quantitatively compare a state-based with an operation-based approach for the purpose of reviewing and understanding change.

Outline. Section 2 introduces common use cases of a VC system. Section 3 compares the state-based approach and the operation-based approach to change tracking in a VC system. Related work is mentioned in the form of inline citations within these two sections. Section 4 presents the design of the empirical study to compare both approaches, and Section 5 presents its results. Section 6 concludes the paper with a short summary.

2. USE CASES OF A VERSION CONTROL SYSTEM
A VC system has to fulfill a lot of use cases, many of which do not differ for state-based or change-based systems. Consider the use case of baselining, in which the user marks a certain approved version (e.g. a release). Since changes are not at all considered in this use case, both approaches appear identical from a user's point of view. Consequently, we only focus on use cases where a difference between state-based and change-based systems arises. We derived these use cases from well-known tools, such as the Revision Control System (RCS) [25], the Concurrent Versioning System (CVS) [20], and Subversion (SVN) [26], from research tools such as SiDiff [24] and UNICASE [10], and from the survey publications by Conradi and Westfechtel [4] and Dart [5]. Figure 1 illustrates the use cases that we consider important for discussing the differences between state-based and change-based change tracking. For every use case, we provide a short name and a description:

Update. The users retrieve changes between their local version and a target version (mostly the current head version) from the repository. These incoming changes can be reviewed by the user before they are incorporated into the local working copy of the model. If the user accepts the changes, and they do not conflict with local changes, the version of the local working copy is set to the target version, and the changes are incorporated into the local copy.

Commit. The user decides to share the changes from their local working copy with the repository. The user reviews the changes before the commit to ensure that only intentional changes are sent to the repository. If the user proceeds, the changes are sent to the repository to create a new version. In case they conflict with other commits that occurred since the last update, the commit is canceled by the VC system, and an update must occur first.

Merge. When incoming changes conflict with existing local changes in the update use case, the merge use case is initiated. The goal of a merge operation is to filter or transform the incoming and/or local changes so that they do not conflict anymore. The result will be incorporated into the local workspace. The merging process involves manual work in most cases, requiring the user to review the changes. Merging may also occur if two branches in the repository are synchronized or rejoined, which essentially requires the same steps.

Blame. To find out how and by whom a problem or an inconsistency was created, the user is interested in finding recent changes on a certain part of the model. Typically, the last n changes on a selected set of model elements need to be retrieved. The user reviews these changes to find the change that causes the problem.

Show History. The user (often a project manager) reviews the history to get an impression of the current activities in a project. Mostly, the user is not interested in individual changes, but in an overview of how many and which type of changes on how many artifacts occurred.

Show Differences. The user is interested in reviewing the differences between two versions of a model, for example two releases. The two versions are typically not very close in terms of the number of changes between them.

Revert. The user wants to undo some changes in the local working copy. To ensure that the right changes are undone, they need to be reviewed beforehand.

Review and Understand Changes. The user reviews changes to understand what was changed, and most importantly, how it was changed.

Interestingly, the first seven use cases are placed most prominently in many VC systems and their clients [10, 20, 26]. We claim this is due to the fact that these are the most frequently executed use cases for the majority of users. In all of these seven use cases, the user reviews changes in one or another way. As is shown in Figure 1, all use cases thus include the use case "Review and Understand Changes".

3. COMPARISON OF CHANGE TRACKING
In this section, we compare different approaches to change tracking in a VC system. Sections 3.1 and 3.2 introduce the state-based and operation-based approach, respectively. Section 3.3 contrasts the advantages and disadvantages of both approaches.

3.1 State-based Change Tracking
State-based approaches derive differences by comparing two states, e.g. a version and its successor, after the changes occurred. This activity is often referred to as diffing, and is performed in two phases: matching and comparison. In the matching phase, for each node in a certain state, the corresponding node in the other state is found. The matching can be based on the similarity of the node's content or on the graph structure it is connected to [13]. If the model supports unique identifiers, the matching can be found in O(1); otherwise, O(n²) is required for n nodes in a model [15, 27]. Chawathe et al. even claim that the matching problem for two states is NP-hard in its full generality [3]. In the comparison phase, each node is compared with its matching partner from the other state to derive potential changes. The comparison calculation requires O(n). The space complexity for the whole diffing process is 2n, since both states need to be present. In a state-based system, changes are not persisted in a way that reflects how they were actually performed. The diffing process can be viewed as a calculation to derive an approximation of the changes post-mortem.

Since the VC system is not required to be able to observe the changes while they occur, a total separation of the modeling tools and the VC system is possible. This is a clear advantage over change-based systems. It is even possible to use line-oriented VC systems, and to perform diffing on the client side. However, the diffing tool must at least know the models' meta model, based on which the changes are calculated and represented. For example, EMF Compare [8] is a diffing tool for meta models defined with the meta modeling language Ecore of the Eclipse Modeling Framework (EMF) [7].

There are three main disadvantages of the diffing concept: (1) The temporal order of changes is lost, and it cannot be perfectly derived. For understanding changes, the temporal order might be important. Moreover, the temporal order is useful for conflict detection and merging [16]. (2) Groupings of changes to composite changes are lost. Refactoring operations, for example, cause many changes that can be grouped. This reduces the number of changes, and represents the change at a higher level of abstraction. Deriving composite changes, e.g. to detect refactorings, is difficult and in some cases even impossible due to masking problems [29]. (3) The computational complexity for diffing is high, especially if changes between many states need to be retrieved, or the model is of a large size [15, 27]. By means of the empirical study, the design of which is presented in Section 4, we want to find out whether and how the disadvantages (1) and (2) will affect the ability of users to understand change.

Considering the use cases presented in Section 2, we made the following observations for the state-based approach:

Merge. The merge result cannot be as accurate, since composite changes are not available [11, 16, 17]. Refactoring operations, for example, might only be partly reflected if not all their caused changes are accepted.

Show History. The computational complexity for diffing could result in a severe performance problem, especially when looking at many versions and the changes that occurred in between them. Diffing is required for every such version.

Review and Understand Changes. We believe that the disadvantages (1) and (2) are impacting the ability of humans to understand change. The temporal order of the changes, which is lost using state-based approaches, could help to understand the context in which the changes were performed. Composite changes could group many changes that look unrelated, and thus have to be grouped in the user's mind.

3.2 Operation-based Change Tracking
In contrast to state-based approaches, change-based approaches record the changes while they occur, and store them in a repository. This implies that change-based systems persist changes in a way that reflects how they were actually performed. There is no need for diffing, since the changes are already available by design. Operation-based approaches are a special class of change-based approaches which represent the changes as transformation operations on a state [16]. An operation can be applied to a state to transform it into the successor state [4]. Figure 2 shows the simplified taxonomy of operations from the UNICASE system [10, 11]. All operations refer to one ModelElement that is being changed by the operation, and that is unambiguously identified by a unique identifier. A ModelElement is a node of the graph that can be linked to other nodes, and that has values for a number of attributes. An AttributeOperation changes the value of an attribute of a model element. A ReferenceOperation creates or removes one or several links between model elements. A CreateDeleteOperation creates or deletes a model element. The created or deleted model element is contained in the operation in the case of operation-based change tracking. A CompositeOperation allows grouping several related operations, to represent a refactoring, for example.

Figure 2: Taxonomy of operations (UML class diagram)

The change-based approaches have one disadvantage in common: they require the VC system to be present when the changes occur, i.e. when the modeling tool is manipulating the model. This requires an integration of the VC system into the modeling tool. However, this does not imply that the system must instrument the modeling tool, but may only use the infrastructure on which the tool is built. For change recording, observer mechanisms can be used; for composite detection, the command pattern can be used. In case of EMF models, one can rely on the EMF notifications and command stack [7]. This effectively decouples the VC system from the modeling tool.

In general, change-based approaches can preserve the exact temporal order in which the changes occurred. This is important information for understanding changes, but is also useful to improve applications such as conflict detection and merging [16, 17]. Moreover, the exact times at which the changes occurred can be recorded. Operation-based systems can record composite operations which express the fact that the contained operations occurred in a common context. For example, a refactoring can be captured in a composite operation. This can help to understand changes, but is also helpful for conflict detection and merging [14, 17]. Since operations are essentially a command pattern with persistent commands, the operations can also be used to implement undo and redo functionality [17, 22]. Operation-based systems can provide a filter method to canonize a sequence of operations, which hides operations that are fully masked by later operations. For example, a Pull up to Superclass refactoring is fully masked by a later deletion of all the involved classes.

Robbes et al. even claim that only an operation-based VC system allows for effective research on evolution, since it provides all the required information [23]. Moreover, many researchers rely on information from a VC system to derive quantitative data to evaluate their approaches. This technique is in wide-spread use and is commonly known as repository mining. For some approaches, for example recommendation for traceability links, it is not only necessary to be able to retrieve every version, but also to recover intermediate states in between two versions. For example, for a realistic simulation of a recommendation use case in a post-mortem analysis, it is necessary to restore the state just before the recommendation. This scenario can only be supported by operation-based VC systems, since the temporal order of changes is available.

Considering the use cases presented in Section 2, we made the following observations for the operation-based approach:

Update. The operations incoming from the repository are presented to the user. If the difference between local and target version is large, the system can canonize the operations to get a more compact representation. Conflict detection can fully rely on the operations, their temporal order and composites to supply a more accurate result and avoid unnecessary conflicts [17]. In general, conflict detectors apply a conservative estimation: if they are unsure about a potential conflict, they raise a conflict to avoid later data corruption.

Commit. The recorded operations can be presented to the user, possibly after canonization. No diffing is required. Conflict detection may again profit from the additional information.

Merge. The merge can operate on the top-level operations. Therefore, fewer decisions are needed, since many operations are contained in a composite operation. Moreover, the decisions cannot partially mask a refactoring, as opposed to the state-based case [29].

Show Differences. This use case is best served by a state-based representation. It can be implemented directly by relying on an existing state-based approach or by deriving it from the recorded operations.

Review and Understand Changes. The changes can be presented as operations in the correct temporal order and can be grouped as composite operations.

3.3 Summary
State-based approaches exhibit the advantage that they are independent of the tool used for changing the models. State-based approaches derive an approximation of the exact change, which is sufficient for certain kinds of changes and for certain use cases. However, the need to derive the changes from the states is a disadvantage of state-based approaches: due to the graph isomorphism problem, calculating the difference is a computationally complex endeavor. Moreover, state-based approaches can neither completely and correctly derive the exact temporal order of the changes, nor are they able to derive composite changes.

Operation-based approaches have the advantage that the changes are explicitly recorded. Therefore, no computation effort is necessary to derive the changes when they are required in the different use cases. Moreover, operation-based approaches retain the exact temporal order of the changes as well as composite changes. Operation-based approaches exhibit the disadvantage that they need to be integrated into the tool used for changing the models. As a consequence, they cannot be used for existing tools which do not provide such a functionality.

The operation-based approach might seem very different from a state-based approach, but actually, it is only an enhancement. It records additional information that is lost in the state-based approach. In an operation-based approach, we can perform everything that can be done in a state-based approach by just ignoring the additional information. This boils down to the question whether the additional effort for recording the changes is justified by its advantages. Therefore, we have conducted an empirical study to compare state-based with operation-based change tracking for the use case of reviewing and understanding change.
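The canonization idea described in Section 3.2 can be sketched as follows (Python; the class names follow Figure 2, while everything else — the state representation, method names, and masking rules — is our own simplification): walking the operation sequence backwards, an operation is dropped when a later operation fully masks it.

```python
# Operations transform a model state (dict of element-id -> attribute dict).
# A heavily simplified rendering of two classes from Figure 2.

class AttributeOperation:
    def __init__(self, eid, attr, value):
        self.eid, self.attr, self.value = eid, attr, value
    def apply(self, state):
        state[self.eid][self.attr] = self.value

class CreateDeleteOperation:
    def __init__(self, eid, element=None):  # element=None means delete
        self.eid, self.element = eid, element
    def apply(self, state):
        if self.element is None:
            del state[self.eid]
        else:
            state[self.eid] = dict(self.element)

def canonize(ops):
    """Drop attribute operations fully masked by later operations, i.e.
    by a later write to the same attribute or a later delete of the
    whole element (only these two masking rules are sketched here)."""
    kept, seen_attr, deleted = [], set(), set()
    for op in reversed(ops):  # walk backwards, keep the last effect only
        if isinstance(op, AttributeOperation):
            if op.eid in deleted or (op.eid, op.attr) in seen_attr:
                continue  # fully masked by a later operation
            seen_attr.add((op.eid, op.attr))
        elif isinstance(op, CreateDeleteOperation) and op.element is None:
            deleted.add(op.eid)
        kept.append(op)
    return list(reversed(kept))
```

Replaying the canonized sequence on the source state still yields the target state, but with fewer operations for the user to review.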
4. DESIGN OF THE EMPIRICAL STUDY
In this section, we present the design of the empirical study to compare state-based and operation-based change tracking. Section 4.1 lists the research questions that un- derlie the empirical study. Section 4.2 presents the tools we Figure 3: State-based representation (EMF Com- chose as representatives for state-based and operation-based pare) change tracking. Section 4.3 describes the data we used as input for the evaluation. Section 4.4 enumerates the differ- ent steps we carried out to conduct the empirical study. the same unified model and stored in the same repository as 4.1 Research Questions project model elements such as tasks or users. UNICASE is implemented based on EMF, and realizes operation-based We conduct the empirical study to answer the following change-tracking, conflict detection and merging [11]. research questions: Figure 4 depicts the example change as represented by 1. Do users better understand the changes in a state- UNICASE. The representation shows the sequence of oper- based or an operation-based representation? ations which have been executed on the source version to get to the target version. Note that the temporal order is from 2. Which factors influence the user in understanding a bottom to top. For each operation, its affected elements state-based or an operation-based representation? are shown below the operation. When an affected element 4.2 Setup is selected in the operation-based representation, it is auto- matically selected in a view of the target version (which is We chose representatives for the two different approaches not shown in the figure). to change tracking. State-based change tracking is represented by the open-source tool EMF Compare [8], which is the state-of- the-art diffing and merging implementation for EMF (Eclipse Modeling Framework) [7]. It is regularly delivered with the Eclipse Modeling Tools, which is one for several official Eclipse products. 
As we do not want to disadvantage EMF Compare by design, the used matching strategy is based on unique identifiers to ensure correct matchings. Figure 3 depicts an example of a change as represented by Figure 4: Operation-based representation (UNI- EMF Compare. The upper part shows the changes between CASE) two versions structured according to the target version. The lower part shows the target and the source version, respec- tively, and highlights affected elements. When a change is selected in the upper part, the affected elements are auto- 4.3 Input matically selected in the lower part. UNICASE was used to record operation histories, which Operation-based change tracking is represented by we use as an input to the empirical study. UNICASE also the open-source CASE tool UNICASE [10], which is based provides an Empirical Project Analysis Framework (EPAF). on a unified model. It consists of a set of editors to ma- Iterators can be reused to run through all revisions of a nipulate instances of a unified model, and a repository to model in a predefined way. Analyzers are used to analyze persist and version the model as well as to collaborate on and extract data per revision, and exporters write the data the model. The unified model covers the whole develop- to files. We used this framework to retrieve and analyze the ment process from requirements over design to deployment, data from the UNICASE VC system. including project management artifacts. System model el- The version model (see Figure 5) of UNICASE is a tree ements such as requirements or UML elements, are part of of versions with revision links [12]. 
Every version contains revises (3) The commit category determines what kind of op- 1 erations a commit contains: Category 1 contains only createdBy 1 Version 1 1 ChangePackage AttributeOperations and ReferenceOperations, category 1 2 also contains CreateDeleteOperations, and category 3 {ordered} also contains CompositeOperations (for the taxonomy, 0..1 * see Figure 2). ModelState Operation 2. Choose Users. We choose a number of users which are familiar with the input, as well as a number of users which are not familiar with the input. We record for Figure 5: Version Model (UML class diagram) all users whether they are familiar with the input using the attribute internal for every data set. 3. Extract data for users For each user, we randomly a change package and may contain a full version state rep- select 18 commits from the operation-based repository. resentation. A change package contains all operations that We select only commits with a size greater than 5 and transformed the previous version into this version along with smaller than 30. We exclude the shorter commits, as administrative information such as the modifying user, a we do not expect any difference between both repre- time stamp and a log message. sentations. We exclude the longer commits, as un- UNICASE was employed in a project named DOLLI2 derstanding them would take too much time in the (Distributed Online Logistics and Location Infrastructure 2) interview. at a major European airport. The objective of DOLLI2 was For each commit, we randomly decide whether the integrating facility management and telemetry data into the user is shown either the operation-based or state-based tracking and locating infrastructure developed in the previ- representation. Also we randomly determine the or- ous project, together with expanding the 3D visualization der in which the commits are presented to the user. 
on desktop computers as well as porting it to mobile de- We only take one representation for the same commit, vices. More than 20 developers worked on the project for a as the first representation might ease the understand- period of five months. All modeling was performed in the ing of the second representation. Moreover, we ensure UNICASE tool. This resulted in a comprehensive model that each user is presented roughly the same number consisting of about 1000 model elements and a history of of commits in operation-based and state-based repre- over 600 versions. sentation. To be able to correlate the answers with the complexity of the contained operations, we ran- 4.4 Conducting the Empirical Study domly sample 6 commits for each of the three above- We apply the following process to conduct the empiri- mentioned categories. For each user, there is a so- cal study, which consists of two phases. In the preparation called shadow user which is presented the same com- phase, we randomly select a number of commits from the mits in the opposite representation. We generate the repository. In the interview phase, we present these com- state-based representation for all the sample commits mits in different representations to a number of users. using EMF Compare. For the operation-based rep- The preparation phase consists of the following steps: resentation we just store a change package from the UNICASE VC system. 1. Extract commits from repository We query the UNICASE VC system for all commits of the DOLLI2 The interview phase is limited to a total of one hour per project and extract the project state after the commit user including a short training in the different representa- as well as the change package of the commit. Also tions. Since the order of the shown commits is random, the we preserve for any version the state of its predecessor interview can be ended at any time. 
If the interviewee com- version to be able to create a state-based diff in the pleted the interview on all 18 data sets in less than one hour, following data extraction. For each commit, we addi- the interview was also stopped. In the interview phase, we tionally record the following data: (1) commit size, (2) performed the following steps for each user and each commit: commit complexity and (3) commit category. 1. Present the commit to the user. Based on the (1) The commit size is measured in the number of data extracted before, we present the commit to the primitive changes. user in a state-based or an operation-based represen- tation. The user is shown the representation in the (2) The commit complexity is measured by a depen- respective tool which can be used to navigate through dency depth value. This value is supposed to measure the changes and the project (see Figures 3 and 4). how much the changes of one commit depend on each The user should try to understand and memorize the other. Operation a requires operation b means that “a changes at their best within a given time limit of 2 is not applicable on a project without b”. The requires minutes. We determined the time limit by experiment- relation is transitive [11]. We calculate the transitive ing with several test subjects. The time limit is sup- closure of the binary relation, requires, on an opera- posed to prevent the user from memorizing the changes tion sequence, and compute the 1-norm of the corre- rather than understanding them, and building a men- sponding adjacency matrix of this transitive closure. tal abstraction of the changes. We call this 1-norm dependency depth of an opera- tion sequence. In other words, dependency depth is 2. Question the user about the commit. We assess the longest path of requires relations in an operation the understanding of the user by means of exam ques- sequence. tions. 
Once the user decides to take the exam, or once the time for understanding the changes is up, the user can no longer look at the change representations; in other words, the exam is closed book. The exam consists of the following question types:

(a) Understanding the impact of the changes: The user is confronted with a randomly selected element that was changed by the commit, in two versions: the version before and the version after the commit. The user should try to answer the following question: "Which is the version of the element after the commit?" Since elements that were deleted or created by the commit are not present in both versions, we show a different question for deleted or created elements: "Was this element created, deleted, or neither deleted nor created?" We generate 5 instances of the first question type and 2 of the latter, if the commit has enough elements to generate as many questions. The questions are supposed to determine whether the user has truly understood the impact of the presented changes.

(b) Understanding the overall intention of the changes: The user is presented with 10 commit messages and is asked to select the message that is assigned to the commit. The other messages are randomly selected from all commits of the respective project; duplicates are removed. Out of these 10 messages, the user can select and prioritize a maximum of three candidates for the message of the presented commit.

During the interview, we recorded the following metrics that can be used for statistical evaluation:

• the time taken for understanding the changes, which is between 0 and 2 minutes.

• the compare score based on the exam questions of type (a). The compare score is the sum of the evaluation of all questions divided by the number of questions. A question evaluates to 1 in case it was answered correctly, and to −1 otherwise. As a consequence, the aggregated result is also in the range of −1 to 1.

• the log message score based on the exam questions of type (b). The user obtains 4 points if her first candidate is the correct commit message, 2 points for the second, 1 point for the third, and 0 points if none of her candidates is correct.

• the self-assessment of the user reflecting the difficulty she felt in understanding the changes. For this measure, we use a scale with the following five values: very difficult (=1), difficult (=2), OK (=3), easy (=4), very easy (=5).

5. STUDY RESULT

In this section, we present the results of the empirical study. Section 5.1 evaluates the results by means of statistical tests. Section 5.2 interprets the results in terms of the research questions. Section 5.3 lists threats to the study's validity along with their mitigation.

5.1 Evaluation

We perform a number of statistical tests to evaluate the measurements. 162 change assessments from 14 different users were recorded and made available for statistical tests.

To get a first glance of the understanding process of the user when confronted with the two representations, we divided the commits into two groups: those which result from the state-based approach (n1 = 75 items) and those which were operation-based (n2 = 87 items). We chose the variables compare score and log message score as the key variables, because they describe to which extent the user has correctly understood the performed change. Our hypothesis was that the means of the compare score and/or the log message score, respectively, differ significantly between the two groups, since the operation-based representation should be more understandable. Assuming a normal distribution of both values, we applied the T-test for independent samples to check the hypothesis of equal mean values. However, the test results showed that there was no significant difference between the two groups for either variable. We then applied the T-test to the taken time and the self-assessment, also ending up without any significant result.

Based on the advantages of the operation-based representation, we expected that it should be easier to understand more complicated commits in the operation-based representation. As described earlier, we recorded different measures of complexity: commit size, commit complexity and commit category. Thus, we decided to narrow down the commit samples to the commits with higher complexity according to the provided measures. We tried all measures as decision criteria, individually and in combination with others. The statistical results showed that the most significant differences between the state- and the operation-based representation are in the group of changes from categories (2) and (3) (see 4.4), given a commit size greater than or equal to 10.

In the following, we detail the statistical results of this group. We separated the filtered commits into two subgroups: those which result from the state-based approach (n1 = 34 items) and those which were operation-based (n2 = 37 items). Our hypothesis was that the means of the compare score differ significantly between the two groups. Assuming a normal distribution of both values, we applied the T-test for independent samples to check the hypothesis of equal mean values. The 95%-confidence interval for the difference of the mean values m_state − m_op = −0.195 is [−0.389, −0.002]. The T-test returned a T-value of −2.019, which means that the critical value −c(df = 60, α = 5%) = −1.67 for 60 degrees of freedom is exceeded in absolute value, and thus the null hypothesis has to be rejected at the 5% level of significance. Thereby, the variances of both groups cannot be assumed to be equal, since the F-test returned an F-value of 4.0, which is greater than the critical value c(df1 = 34, df2 = 37, α = 5%) = 1.74 for that level of significance and the given degrees of freedom. Thus we used the T-test for different variances in both groups, implying a lower number of degrees of freedom (only about 60, in contrast to¹ n1 + n2 − 2 = 69). The variance of the user assessment regarding the operation-based representation is significantly lower than for the state-based representation. This might argue for a higher robustness of the operation-based representation, since the users are more consistent in their assessment of the changes, as illustrated by Figure 6.

¹ Two degrees of freedom are subtracted, since the expected value and the variance have to be estimated from the data sample.

Figure 6: Boxplot of the compare score distribution of state-based and operation-based representation

Besides the discriminating variable compare score, we also performed tests on the other variables, i.e. taken time, log message score and self-assessment, but no substantial difference was found. Especially, the distribution of the binary variable internal, which indicates whether the respective change was assessed by a member of the development team or by an external person (see 4.4), did not significantly differ between the two groups of state-based and operation-based changes.

Even if one argues that the number of changes in each group is not normally distributed, the non-parametric Mann-Whitney U test can be used to recheck the result. Due to its non-parametric character, the Mann-Whitney U test is weaker than the T-test, but it still succeeds in rejecting the null hypothesis of equal means at the 10% level of significance. The test returns a Z-value of Z = −1.739 (standard-normal approximation), which makes the probability of erroneously rejecting the null hypothesis relatively small: by the symmetry of Φ, Φ(Z) + (1 − Φ(−Z)) = 2·Φ(Z) ≈ 0.082 (asymptotic significance).
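The unequal-variance group comparison used here can be reproduced with a few lines of code. The sketch below is our own illustration, not the study's tooling, and the two score vectors are made up for demonstration (the raw study data are not reproduced here); it computes Welch's T-statistic together with the Welch-Satterthwaite degrees of freedom:

```python
from math import sqrt

def mean_var(xs):
    """Sample mean and unbiased sample variance."""
    n = len(xs)
    m = sum(xs) / n
    v = sum((x - m) ** 2 for x in xs) / (n - 1)
    return m, v

def welch_t(a, b):
    """Welch's T-statistic and degrees of freedom for two
    independent samples with possibly unequal variances."""
    (ma, va), (mb, vb) = mean_var(a), mean_var(b)
    na, nb = len(a), len(b)
    se2 = va / na + vb / nb
    t = (ma - mb) / sqrt(se2)
    # Welch-Satterthwaite approximation of the degrees of freedom.
    df = se2 ** 2 / ((va / na) ** 2 / (na - 1) + (vb / nb) ** 2 / (nb - 1))
    return t, df

# Hypothetical compare-score samples (illustrative values only):
state = [-0.2, 0.0, 0.2, -0.4, 0.1, -0.1]
op = [0.3, 0.5, 0.2, 0.4, 0.6, 0.1]
t, df = welch_t(state, op)
print(round(t, 3), round(df, 1))
```

With unequal variances, the Welch-Satterthwaite estimate yields fewer degrees of freedom than n1 + n2 − 2, which is the effect behind the drop from 69 to roughly 60 degrees of freedom reported above.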
5.2 Interpretation

In this section, we interpret the results in terms of the research questions posed in section 4.1:

Do users better understand the changes in a state-based or an operation-based representation? The statistical result shows no significant overall difference between the respective representations in the variables compare score, log message score, taken time and self-assessment. In general, we observed that the log message score, taken time and self-assessment mostly showed a similar distribution in both representations, while the compare score often deviated, but below a reasonable significance level. We had already supposed that the operation-based approach might provide better results with increasing complexity of the changes. Therefore, we also recorded different measures of complexity for each commit. The results showed that by partitioning the data sets according to category and size, we obtain a set of more complex commits for which the operation-based representation is significantly better than the state-based representation. By design, the compare score required a more detailed understanding of the changes, while the log message score required a birds-eye-view understanding. We believe this is why only the compare score shows significant differences between the two representations. The log message can often be guessed, for example by memorizing frequent words, which is equally well supported in both representations. With respect to the research question, we conclude that users better understand changes in an operation-based representation if the changes are sufficiently complex and in-depth understanding is required. We also conclude that in the case of less complex changes, neither representation shows big advantages, since no significant difference in the variables' distribution could be determined.

Which factors influence the user in understanding change in a state-based or an operation-based representation? Based on the results from grouping the commits by the measures introduced above, we made the following observations on factors relevant for understanding the change. The dependency depth does not seem to be relevant, since its individual and combined use for grouping did not improve significance for any of the variables in either representation. The category, however, seems to be a relevant factor. Both categories 2 and 3 only contain changes that are composed of potentially many atomic changes. A delete operation, for example, removes an element and its children, along with all cross references targeting them, from the model. Using commits from these categories only already showed promising results, however below a reasonable significance level. Only when combined with a restriction on the commit size (≥ 10) did significant differences show up. In our opinion, the reason for this observation is that a commit with complex operations, but with a size of less than 10 primitive changes, is not very difficult to understand in either representation. We conclude that commit size and category are factors influencing the understanding of the change in the two given representations.

5.3 Threats to Validity

We are aware of the following internal and external threats to validity, which might have affected our results without being detected.

Internal Validity. The results might be influenced by the design we chose for the empirical study. In particular, the results might be affected by the way we prepared the data and performed the interviews.

Commit selection. When preparing the data for the evaluation, we might have chosen commits which favor one representation over the other. Consequently, one approach might perform better than the other, which poses a threat to the validity of the results. To mitigate this threat, we chose the commits randomly and sampled them according to different types of changes.

Understanding vs. memorization. One could argue that the users only need to fully memorize the commit to answer the questions. As a consequence, we would be examining the users' ability to memorize a number of changes instead of their ability to understand the changes. To force the users to really understand the commit, we limited the time for looking at each commit.

External Validity. The results might be influenced by the fact that we questioned only a restricted number of users about a restricted number of commits. Thereby, the results might not be representative for all developers or for all possible model changes on average.

Prior knowledge. The users might have prior knowledge about a certain representation, which favors the respective representation. This poses a threat to the transferability of the results to all users in general. To mitigate this threat, we trained all participants of the empirical study on a number of samples for both representations in advance.

Specific modeling language. The results might be affected by the modeling language of UNICASE in which the models are developed. As a consequence, the results might not be transferable to the evolution of models created with other modeling languages. However, the UNICASE modeling language is similar to UML, whose different dialects are widely used in software engineering practice.

6. CONCLUSION AND FUTURE WORK

We reviewed both state-based and operation-based approaches for change tracking. We compared both approaches by means of typical use cases of VC systems in a qualitative way. We also conducted a case study to compare the two approaches in the use case of reviewing and understanding changes in a quantitative way. The results from our case study show that operation-based change tracking exhibits advantages in understanding more complex changes. In addition, we found the size and the type of changes of a commit to be relevant factors for understanding changes. Based on these results, we believe that the additional information recorded by the operation-based approach can be valuable in certain use cases. Further research is required to find further determining factors and to decide in which use cases and under which conditions each representation should be used.

7. ACKNOWLEDGMENTS

We would like to thank all the users who agreed to participate in the empirical study. This work is partially supported by grants from the BMBF (Federal Ministry of Education and Research, Innovationsallianz SPES 2020).

8. REFERENCES

[1] C. Bartelt. Consistence preserving model merge in collaborative development processes. In CVSM '08: Proceedings of the 2008 International Workshop on Comparison and Versioning of Software Models, pages 13–18, New York, NY, USA, 2008. ACM.
[2] X. Blanc, I. Mounier, A. Mougenot, and T. Mens. Detecting model inconsistency through operation-based model construction. In ICSE '08: Proceedings of the 30th International Conference on Software Engineering, pages 511–520, New York, NY, USA, 2008. ACM.
[3] S. S. Chawathe and H. Garcia-Molina. Meaningful change detection in structured data. In SIGMOD '97: Proceedings of the 1997 ACM SIGMOD International Conference on Management of Data, pages 26–37, New York, NY, USA, 1997. ACM.
[4] R. Conradi and B. Westfechtel. Version models for software configuration management. ACM Comput. Surv., 30(2):232–282, 1998.
[5] S. Dart. Spectrum of functionality in configuration management systems. Technical report, CMU/SEI, 1990.
[6] D. Dig, K. Manzoor, R. Johnson, and T. N. Nguyen. Refactoring-aware configuration management for object-oriented programs. In ICSE '07: Proceedings of the 29th International Conference on Software Engineering, pages 427–436, Washington, DC, USA, 2007. IEEE Computer Society.
[7] Eclipse. Eclipse Modeling Framework. http://www.eclipse.org/emf.
[8] Eclipse. EMF Compare. http://wiki.eclipse.org/EMF_Compare.
[9] M. Herrmannsdoerfer, S. Benz, and E. Juergens. COPE - automating coupled evolution of metamodels and models. In ECOOP 2009 - Object-Oriented Programming, volume 5653 of Lecture Notes in Computer Science, pages 52–76. Springer Berlin / Heidelberg, 2009.
[10] J. Helming and M. Koegel. UNICASE. http://unicase.org.
[11] M. Koegel, J. Helming, and S. Seyboth. Operation-based conflict detection and resolution. In CVSM '09: Proceedings of the 2009 ICSE Workshop on Comparison and Versioning of Software Models, pages 43–48, Washington, DC, USA, 2009. IEEE Computer Society.
[12] M. Kögel. Towards software configuration management for unified models. In CVSM '08: Proceedings of the 2008 International Workshop on Comparison and Versioning of Software Models, pages 19–24, New York, NY, USA, 2008. ACM.
[13] D. S. Kolovos, D. Di Ruscio, A. Pierantonio, and R. F. Paige. Different models for model matching: An analysis of approaches to support model differencing. In CVSM '09: Proceedings of the 2009 ICSE Workshop on Comparison and Versioning of Software Models, pages 1–6, Washington, DC, USA, 2009. IEEE Computer Society.
[14] A. Lie, R. Conradi, T. M. Didriksen, and E.-A. Karlsson. Change oriented versioning in a software engineering database. In Proceedings of the 2nd International Workshop on Software Configuration Management, pages 56–65, New York, NY, USA, 1989. ACM.
[15] T. Lindholm, J. Kangasharju, and S. Tarkoma. Fast and simple XML tree differencing by sequence alignment. In DocEng '06: Proceedings of the 2006 ACM Symposium on Document Engineering, pages 75–84, New York, NY, USA, 2006. ACM.
[16] E. Lippe and N. van Oosterom. Operation-based merging. In SDE 5: Proceedings of the Fifth ACM SIGSOFT Symposium on Software Development Environments, pages 78–87, New York, NY, USA, 1992. ACM.
[17] T. Mens. A state-of-the-art survey on software merging. IEEE Trans. Softw. Eng., 28(5):449–462, 2002.
[18] T. N. Nguyen, E. V. Munson, J. T. Boyland, and C. Thao. An infrastructure for development of object-oriented, multi-level configuration management services. In ICSE '05: Proceedings of the 27th International Conference on Software Engineering, pages 215–224, New York, NY, USA, 2005. ACM.
[19] D. Ohst. A fine-grained version and configuration model in analysis and design. In ICSM '02: Proceedings of the International Conference on Software Maintenance (ICSM'02), page 521, Washington, DC, USA, 2002. IEEE Computer Society.
[20] D. Prince. Concurrent Versioning System. http://www.nongnu.org/cvs.
[21] J. Rho and C. Wu. An efficient version model of software diagrams. In APSEC '98: Proceedings of the Fifth Asia Pacific Software Engineering Conference, page 236, Washington, DC, USA, 1998. IEEE Computer Society.
[22] R. Robbes and M. Lanza. A change-based approach to software evolution. Electron. Notes Theor. Comput. Sci., 166:93–109, 2007.
[23] R. Robbes, M. Lanza, and M. Lungu. An approach to software evolution based on semantic change. In Fundamental Approaches to Software Engineering, volume 4422 of Lecture Notes in Computer Science, pages 27–41. Springer Berlin / Heidelberg, 2007.
[24] U. Siegen. SiDiff. http://sidiff.org.
[25] W. F. Tichy. Design, implementation, and evaluation of a revision control system. In ICSE '82: Proceedings of the 6th International Conference on Software Engineering, pages 58–67, Los Alamitos, CA, USA, 1982. IEEE Computer Society Press.
[26] Tigris. Subversion VC System. http://subversion.tigris.org.
[27] C. Treude, S. Berlik, S. Wenzel, and U. Kelter. Difference computation of large models. In ESEC-FSE '07: Proceedings of the 6th Joint Meeting of the European Software Engineering Conference and the ACM SIGSOFT Symposium on the Foundations of Software Engineering, pages 295–304, New York, NY, USA, 2007. ACM.
[28] G. Wachsmuth. Metamodel adaptation and model co-adaptation. In ECOOP 2007 - Object-Oriented Programming, volume 4609 of Lecture Notes in Computer Science, pages 600–624. Springer Berlin / Heidelberg, 2007.
[29] Z. Xing and E. Stroulia. Refactoring detection based on UMLDiff change-facts queries. In WCRE '06: Proceedings of the 13th Working Conference on Reverse Engineering, pages 263–274, Washington, DC, USA, 2006. IEEE Computer Society.