research-article

Source Code Recommender Systems: The Practitioners' Perspective

Authors:

Matteo Ciniselli,

Luca Pascarella,

Simone Scalabrino,

Gabriele Bavota

ICSE '23: Proceedings of the 45th International Conference on Software Engineering

Pages 2161 - 2172

https://doi.org/10.1109/ICSE48619.2023.00182

Published: 26 July 2023

Abstract

The automatic generation of source code is one of the long-lasting dreams in software engineering research. Several techniques have been proposed to speed up the writing of new code. For example, code completion techniques can recommend the next few tokens developers are likely to type, while retrieval-based approaches can suggest code snippets relevant to the task at hand. Deep learning has also been used to automatically generate code statements from a natural language description. While research in this field is very active, no study has investigated what the users of code recommender systems (i.e., software practitioners) actually need from these tools. We present a study involving 80 software developers to investigate the characteristics of code recommender systems they consider important. The output of our study is a taxonomy of 70 "requirements" that should be considered when designing code recommender systems. For example, developers would like the recommended code to use the same coding style as the code under development. Practitioners would also appreciate code recommenders that are "aware" of the developers' knowledge (e.g., which frameworks/libraries they have already used) and can customize recommendations based on this knowledge. The resulting taxonomy points to a wide set of future research directions for code recommenders.

Published In

ICSE '23: Proceedings of the 45th International Conference on Software Engineering

May 2023

2713 pages

ISBN:9781665457019

General Chair:
John Grundy, Department of Software Systems and Cybersecurity, Faculty of IT, Monash University, Australia

Program Co-chairs:
Lori Pollock, University of Delaware, DE, USA
Massimiliano Di Penta, University of Sannio, Italy

Sponsors

SIGSOFT: ACM Special Interest Group on Software Engineering

In-Cooperation

IEEE CS

Publisher

IEEE Press

Qualifiers

Research-article

Conference

ICSE '23

Sponsor:

SIGSOFT

ICSE '23: 45th International Conference on Software Engineering

May 14 - 20, 2023

Melbourne, Victoria, Australia

Acceptance Rates

Overall Acceptance Rate 276 of 1,856 submissions, 15%
