DOI: 10.1109/ICSE48619.2023.00182
research-article

Source Code Recommender Systems: The Practitioners' Perspective

Published: 26 July 2023

Abstract

The automatic generation of source code is one of the long-standing dreams of software engineering research. Several techniques have been proposed to speed up the writing of new code. For example, code completion techniques can recommend the next few tokens a developer is likely to type, while retrieval-based approaches can suggest code snippets relevant to the task at hand. Deep learning has also been used to automatically generate code statements from a natural language description. While research in this field is very active, no study has investigated what the users of code recommender systems (i.e., software practitioners) actually need from these tools. We present a study involving 80 software developers that investigates which characteristics of code recommender systems they consider important. The output of our study is a taxonomy of 70 "requirements" that should be considered when designing code recommender systems. For example, developers would like the recommended code to use the same coding style as the code under development. Practitioners would also appreciate code recommenders that are "aware" of the developers' knowledge (e.g., which frameworks/libraries they have already used in the past) and able to customize the recommendations based on this knowledge. The taxonomy resulting from our study points to a wide set of future research directions for code recommenders.

Cited By

  • (2024) On the Generalizability of Deep Learning-based Code Completion Across Programming Language Versions. Proceedings of the 32nd IEEE/ACM International Conference on Program Comprehension, pp. 99-111. DOI: 10.1145/3643916.3644411. Online publication date: 15-Apr-2024

Published In

ICSE '23: Proceedings of the 45th International Conference on Software Engineering
May 2023
2713 pages
ISBN:9781665457019
  • General Chair: John Grundy
  • Program Co-chairs: Lori Pollock, Massimiliano Di Penta

Sponsors

In-Cooperation

  • IEEE CS

Publisher

IEEE Press

Author Tags

  1. code recommender systems
  2. empirical study
  3. practitioners' survey

Qualifiers

  • Research-article

Conference

ICSE '23
ICSE '23: 45th International Conference on Software Engineering
May 14 - 20, 2023
Melbourne, Victoria, Australia

Acceptance Rates

Overall Acceptance Rate 276 of 1,856 submissions, 15%

Article Metrics

  • Downloads (Last 12 months): 44
  • Downloads (Last 6 weeks): 3
Reflects downloads up to 09 Nov 2024
