research-article

Analyzing developer contributions using artifact traceability graphs

Published: 01 May 2022

Abstract

Context

In a software project, properly analyzing developers' contributions can provide valuable insights for decision-makers. Contributions take many forms, such as committing and reviewing code or opening and resolving issues. Previous approaches mainly consider commit-based contributions, which provide an incomplete picture of developer contributions.

Objective

Unlike traditional commit-based approaches to analyzing developer contributions, we aim to provide a more holistic approach that reflects the rich set of software development activities using artifact traceability graphs.

Method

To analyze developer contributions, we propose a novel categorization of developers in a software project (Jacks, Mavens and Connectors). We introduce a set of algorithms on artifact traceability graphs to identify key developers, recommend replacements for leaving developers, and evaluate knowledge distribution among developers.
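The abstract does not spell out the algorithms, so the following is only an illustrative sketch of what an artifact traceability graph could look like and how a simple breadth measure might be computed on it. The node prefixes (`dev:`, `commit:`, `issue:`, `file:`), the toy edges, the hop limit, and the function names `build_graph` and `reachable_files` are all assumptions made for this example; this is not the authors' method.

```python
from collections import defaultdict, deque


def build_graph(edges):
    """Build an undirected adjacency map from (node, node) pairs."""
    graph = defaultdict(set)
    for a, b in edges:
        graph[a].add(b)
        graph[b].add(a)
    return graph


def reachable_files(graph, developer, max_hops=2):
    """Return the file nodes within max_hops of a developer (breadth-first search)."""
    seen = {developer}
    queue = deque([(developer, 0)])
    files = set()
    while queue:
        node, dist = queue.popleft()
        if node.startswith("file:"):
            files.add(node)
        if dist == max_hops:
            continue  # do not expand beyond the hop limit
        for neighbour in graph[node]:
            if neighbour not in seen:
                seen.add(neighbour)
                queue.append((neighbour, dist + 1))
    return files


# Hypothetical toy project: developers linked to commits and issues,
# commits linked to the files they change, issues linked to fixing commits.
EDGES = [
    ("dev:alice", "commit:c1"),
    ("commit:c1", "file:a.py"),
    ("commit:c1", "file:b.py"),
    ("dev:bob", "commit:c2"),
    ("commit:c2", "file:b.py"),
    ("dev:bob", "issue:i1"),
    ("issue:i1", "commit:c1"),
]

if __name__ == "__main__":
    graph = build_graph(EDGES)
    for dev in ("dev:alice", "dev:bob"):
        print(dev, sorted(reachable_files(graph, dev)))
```

Raising `max_hops` lets non-commit activity count: with three hops, the hypothetical developer `dev:bob` also reaches `file:a.py` through the issue linked to another developer's commit, which is the kind of contribution a commit-only view would miss.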

Results

We evaluate the proposed algorithms on six open-source projects and demonstrate that the identified key developers match the top commenters by up to 98%, the recommended replacements are correct in up to 91% of cases, and the identified knowledge distribution labels are, on average, 94% compatible with the baseline approaches.

Conclusions

The proposed algorithms using artifact traceability graphs for analyzing developer contributions could be used by software project decision-makers in several scenarios: (1) identifying different types of key developers, (2) finding a replacement developer in large teams, and (3) evaluating the overall knowledge distribution among developers to take early precautions.

Information

Published In

Empirical Software Engineering, Volume 27, Issue 3
May 2022
844 pages

Publisher

Kluwer Academic Publishers

United States

Publication History

Published: 01 May 2022
Accepted: 02 February 2022

Author Tags

  1. Key developers
  2. Social networks
  3. Artifact traceability graphs
  4. Developer replacement
  5. Developer turnover
  6. Knowledge distribution

Qualifiers

  • Research-article
