Automatic Identification of Bug-Introducing Changes: Sunghun Kim, Thomas Zimmermann, Kai Pan, E. James Whitehead, Jr.
Figure 8 shows the difference in identified bug-introducing change sets by ignoring comment and blank line changes. This approach removes 14%~20% of false positives.

Figure 11 compares the results of the SZZ approach with the improved approach that identifies bug-introducing changes by ignoring format changes in bug-fix hunks. Overall, ignoring source code format changes removes 18%~25% of false positives and 13%~14% of false negatives.

Some of the changes are method name and parameter name changes. For example, one parameter type changed from ‘TypeDeclaration’ to ‘LocalTypeDeclaration’, and hence the revision contains 7 file changes related to this change, as shown in Figure 13.
- public boolean visit(TypeDeclaration
- typeDeclaration, BlockScope scope){
+ public boolean visit(LocalTypeDeclaration
+ typeDeclaration, BlockScope scope){
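The filtering idea described above, treating a hunk as noise when only comments, blank lines, or formatting differ, can be sketched as follows. The normalization below (stripping `//` line comments and collapsing whitespace) is an illustrative assumption, not the paper's exact algorithm:

```java
import java.util.List;
import java.util.stream.Collectors;

public class FormatChangeFilter {

    // Normalize a source line: drop line comments and remove all whitespace.
    // (A real implementation would also handle /* ... */ block comments.)
    static String normalize(String line) {
        String noComment = line.replaceAll("//.*$", "");
        return noComment.replaceAll("\\s+", "");
    }

    // A hunk is "format only" if the deleted and added sides are identical
    // after normalization, i.e. no code token actually changed.
    static boolean isFormatOnly(List<String> deleted, List<String> added) {
        String oldCode = deleted.stream().map(FormatChangeFilter::normalize)
                                .collect(Collectors.joining());
        String newCode = added.stream().map(FormatChangeFilter::normalize)
                              .collect(Collectors.joining());
        return oldCode.equals(newCode);
    }
}
```

Hunks classified as format-only would then be excluded before tracing back to bug-introducing changes, which is what removes the false positives reported above.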
4.6. Summary

We applied the steps described in Figure 2 to remove false positive and false negative bug-introducing changes. In this section we compare the identified bug-introducing change sets gathered using the original SZZ algorithm [16] and those from our new algorithm (steps 1-5 in Figure 2). Overall, Figure 17 shows that applying our algorithms removes about 38%~51% of false positives and 14%~15% of false negatives, a substantial error reduction.

Figure 16. Bug-introducing change sets after manual fix hunk validation.

5. Discussion

In this section, we discuss the relationship between identified bug-fixes and true bug-fixes. We also discuss the relationship between identified bug-introducing changes and true bugs.

5.1. Are All Identified Fixes True Fixes?

We used two approaches to identify bug-fixes: searching for keywords such as "Fixed" or "Bug" [12] and searching for references to bug reports like “#42233” [2, 4, 16]. The accuracy of bug-fix identification depends on the quality of change logs and linkages between SCM and bug tracking systems. The two open source projects we examined have, to the best of our knowledge, the highest quality change log and linkage information of any open source project. In addition, two human judges manually validated all bug-fix hunks. We believe the identified bug-fix hunks are, in almost all cases, real fixes. Still, there might be false negatives. For example, even though a change log does not indicate a given change is a fix, it is possible that the change includes a fix. To measure false negative fix changes, we would need to manually inspect all hunks in all revisions, a daunting task. This remains future work.

Manual fix hunk verification may also include errors. Even though we selected two human judges who have multiple years of Java programming experience, their manual fix hunk validation may contain errors.
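The two identification heuristics above, keyword search and bug-report references, can be sketched as a simple log-message classifier. The exact regular expressions are assumptions for illustration; the cited approaches [2, 4, 12, 16] each define their own patterns:

```java
import java.util.regex.Pattern;

public class FixIdentifier {
    // Heuristic 1: fix-related keywords such as "Fixed" or "Bug" [12].
    private static final Pattern KEYWORDS =
        Pattern.compile("\\b(fix(ed|es)?|bug)\\b", Pattern.CASE_INSENSITIVE);

    // Heuristic 2: references to bug reports, e.g. "#42233" [2, 4, 16].
    private static final Pattern BUG_REF = Pattern.compile("#(\\d{4,6})");

    // A change is flagged as a candidate bug-fix if either heuristic matches.
    static boolean looksLikeFix(String logMessage) {
        return KEYWORDS.matcher(logMessage).find()
            || BUG_REF.matcher(logMessage).find();
    }
}
```

As the discussion above notes, such heuristics can only be as accurate as the change logs themselves: a fix committed with an empty or unrelated log message is a false negative no pattern can recover.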
5.2. Are Bug-Introducing Changes True Bugs?

Are all identified bug-introducing changes real bugs? It may depend on the definition of ‘bug’. IEEE defines anomaly, a synonym of fault, bug, or error, as “any condition that departs from the expected” [8]. Verifying whether all identified bug-introducing changes meet a given definition of bug remains future work. More importantly, we propose algorithms to remove false positives and false negatives in the identified bugs. As shown in Figure 19, even though we do not know the exact set of real bugs, our algorithms can identify a set that is closer to the real bug set than the set identified by the original SZZ algorithm [16]. Even if not perfect, our approach is better than the current state of the art.

6. Applications

In the first part of this paper, we presented an approach for identifying bug-introducing changes more accurately than SZZ. In this section, we discuss possible applications of these bug-introducing changes.

6.1. Bug-Introduction Statistics

Information about bug-introducing changes can be used to help understand software bugs. Unlike bug-fix information, bug-introducing changes provide the exact time a bug occurs. For example, it is possible to determine the day on which bugs are most often introduced. We can also determine the most bug-prone authors. When combined with bug-fix information, we can determine how long it took to fix a bug after it was introduced. Sliwerski et al. performed an experiment to find the most bug-prone day by computing bug-introducing change rates over all changes [16]. They found that Friday is the most bug-prone day in the projects examined.
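The weekday statistic described above can be sketched as follows: given the timestamps of identified bug-introducing changes, count introductions per weekday and report the maximum. The input data is hypothetical; a real study, like [16], would normalize by the total number of changes per day rather than use raw counts:

```java
import java.time.DayOfWeek;
import java.time.LocalDateTime;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

public class BugDayStats {
    // Count bug-introducing changes per weekday and return the most frequent.
    static DayOfWeek mostBugProneDay(List<LocalDateTime> bugIntroTimes) {
        Map<DayOfWeek, Long> counts = bugIntroTimes.stream()
            .collect(Collectors.groupingBy(LocalDateTime::getDayOfWeek,
                                           Collectors.counting()));
        return counts.entrySet().stream()
            .max(Map.Entry.comparingByValue())
            .map(Map.Entry::getKey)
            .orElseThrow();
    }
}
```

The same grouping pattern extends to the other statistics mentioned above, e.g. grouping by author instead of weekday, or pairing each introduction timestamp with its fix timestamp to measure bug lifetime.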