DOI: 10.1145/2024445.2024455
research-article

Using the Gini coefficient for bug prediction in Eclipse

Published: 05 September 2011

Abstract

The Gini coefficient is a prominent measure for quantifying the inequality of a distribution. It is often used in economics to describe how goods, e.g., wealth or farmland, are distributed among people. We use the Gini coefficient to measure code ownership by investigating how changes made to source code are distributed among the developer population. The results of our study with data from the Eclipse platform show that fewer bugs can be expected if a large share of all changes is accumulated, i.e., carried out, by relatively few developers.
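
To make the ownership measure concrete, the following is a minimal sketch of how a Gini coefficient could be computed over per-developer change counts. It uses the standard rank-based formula for a sorted sample; the abstract does not state which formulation the study uses, so this is an assumption. The function name `gini`, the developer names, and the change counts are all hypothetical and purely illustrative.

```python
def gini(values):
    """Gini coefficient of non-negative numbers.

    0.0 means changes are spread evenly; values approaching 1.0 mean
    a few developers account for most changes.
    """
    xs = sorted(values)
    n = len(xs)
    total = sum(xs)
    if n == 0 or total == 0:
        return 0.0  # convention chosen here: no data means no inequality
    # Rank-weighted sum over ascending values, with ranks starting at 1.
    weighted = sum(rank * x for rank, x in enumerate(xs, start=1))
    return 2.0 * weighted / (n * total) - (n + 1.0) / n


# Hypothetical change counts per developer for a single file.
even = {"dev_a": 10, "dev_b": 10, "dev_c": 10, "dev_d": 10}
skewed = {"dev_a": 37, "dev_b": 1, "dev_c": 1, "dev_d": 1}

print(gini(even.values()))    # evenly shared ownership
print(gini(skewed.values()))  # ownership concentrated in one developer
```

In this sketch the evenly shared file scores 0, while the file dominated by one developer scores roughly 0.675. Under the abstract's finding, the second kind of file would be the one for which fewer bugs are expected.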

Published In

IWPSE-EVOL '11: Proceedings of the 12th International Workshop on Principles of Software Evolution and the 7th annual ERCIM Workshop on Software Evolution
September 2011
140 pages
ISBN:9781450308489
DOI:10.1145/2024445
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org.

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. bug prediction
  2. code ownership
  3. gini coefficient

Qualifiers

  • Research-article

Conference

ESEC/FSE'11

Cited By

  • (2020) Hub-Periphery Hierarchy in Bus Transportation Networks: Gini Coefficients and the Seoul Bus System. Sustainability 12:18 (7297). DOI: 10.3390/su12187297. Online publication date: 6-Sep-2020
  • (2020) How Well Do Change Sequences Predict Defects? Sequence Learning from Software Changes. IEEE Transactions on Software Engineering 46:11 (1155-1175). DOI: 10.1109/TSE.2018.2876256. Online publication date: 1-Nov-2020
  • (2019) An empirical comparison of dependency network evolution in seven software packaging ecosystems. Empirical Software Engineering 24:1 (381-416). DOI: 10.1007/s10664-017-9589-y. Online publication date: 1-Feb-2019
  • (2017) The Use of Summation to Aggregate Software Metrics Hinders the Performance of Defect Prediction Models. IEEE Transactions on Software Engineering 43:5 (476-491). DOI: 10.1109/TSE.2016.2599161. Online publication date: 1-May-2017
  • (2016) An Ecosystemic and Socio-Technical View on Software Maintenance and Evolution. 2016 IEEE International Conference on Software Maintenance and Evolution (ICSME) (1-8). DOI: 10.1109/ICSME.2016.19. Online publication date: Oct-2016
  • (2015) Which process metrics can significantly improve defect prediction models? An empirical study. Software Quality Journal 23:3 (393-422). DOI: 10.1007/s11219-014-9241-7. Online publication date: 1-Sep-2015
  • (2014) Defect Prediction between Software Versions with Active Learning and Dimensionality Reduction. Proceedings of the 2014 IEEE 25th International Symposium on Software Reliability Engineering (312-322). DOI: 10.1109/ISSRE.2014.35. Online publication date: 3-Nov-2014
  • (2013) Replicating mining studies with SOFAS. Proceedings of the 10th Working Conference on Mining Software Repositories (363-372). DOI: 10.5555/2487085.2487152. Online publication date: 18-May-2013
  • (2013) Replicating mining studies with SOFAS. 2013 10th Working Conference on Mining Software Repositories (MSR) (363-372). DOI: 10.1109/MSR.2013.6624050. Online publication date: May-2013
  • (2013) On the Application of Inequality Indices in Comparative Software Analysis. Proceedings of the 2013 22nd Australian Conference on Software Engineering (117-126). DOI: 10.1109/ASWEC.2013.23. Online publication date: 4-Jun-2013
