Code completion of multiple keywords from abbreviated input

Authors:

David R. Wallace,

Robert C. Miller

Automated Software Engineering, Volume 18, Issue 3-4

Pages 363 - 398

https://doi.org/10.1007/s10515-011-0083-2

Published: 01 December 2011

Abstract

Abbreviation Completion is a novel technique that improves the efficiency of code writing by completing multiple keywords at once from non-predefined abbreviated input. This differs from conventional code completion, which finds one keyword at a time based on an exact character match. Abbreviated input, consisting of abbreviated keywords and the non-alphanumeric characters between them (e.g. pb st nm), is expanded into a full expression (e.g. public String name) by a Hidden Markov Model learned from a corpus of existing code and abbreviation examples. The technique does not require the user to memorize abbreviations, and it provides incremental feedback showing the most likely completions.
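The expansion step described above can be illustrated with a minimal sketch of Viterbi decoding over a toy Hidden Markov Model, in which the hidden states are full keywords and the observations are the typed abbreviations. Everything in it is an assumption made for illustration rather than the authors' implementation: the five-line CORPUS, the add-k smoothed bigram transitions, and the hand-written emission score (the paper learns its model from real code and abbreviation examples). The sketch also treats whitespace as the only separator and ignores other non-alphanumeric characters.

```python
import math
from collections import defaultdict

# Hypothetical toy corpus of tokenized code lines (stand-in for a real code corpus).
CORPUS = [
    ["public", "String", "name"],
    ["public", "String", "name"],
    ["private", "String", "name"],
    ["public", "static", "int", "number"],
    ["private", "int", "number"],
]
START = "<s>"

def train_bigrams(corpus):
    """Count keyword bigrams; these play the role of HMM transitions."""
    counts = defaultdict(lambda: defaultdict(int))
    vocab = set()
    for line in corpus:
        prev = START
        for tok in line:
            counts[prev][tok] += 1
            vocab.add(tok)
            prev = tok
    return counts, sorted(vocab)

def transition(counts, vocab, prev, word, k=0.1):
    """Add-k smoothed estimate of P(word | prev)."""
    total = sum(counts[prev].values())
    return (counts[prev][word] + k) / (total + k * len(vocab))

def is_subsequence(abbr, word):
    """True if the letters of abbr occur in word in order (case-insensitive)."""
    it = iter(word.lower())
    return all(ch in it for ch in abbr.lower())

def emission(abbr, word):
    """Assumed emission score: same first letter, subsequence match, longer is better."""
    if not abbr or abbr[0].lower() != word[0].lower():
        return 0.0
    if not is_subsequence(abbr, word):
        return 0.0
    return len(abbr) / len(word)

def expand(abbrs, counts, vocab):
    """Viterbi search for the most likely keyword sequence, in log space."""
    best = {START: (0.0, [])}  # state -> (log score, best path ending in that state)
    for abbr in abbrs:
        nxt = {}
        for word in vocab:
            e = emission(abbr, word)
            if e == 0.0:
                continue  # this keyword cannot produce the abbreviation
            cands = [
                (score + math.log(transition(counts, vocab, prev, word)) + math.log(e),
                 path + [word])
                for prev, (score, path) in best.items()
            ]
            nxt[word] = max(cands, key=lambda c: c[0])
        if not nxt:
            return None  # no keyword matches this abbreviation
        best = nxt
    return max(best.values(), key=lambda c: c[0])[1]

counts, vocab = train_bigrams(CORPUS)
print(" ".join(expand(["pb", "st", "nm"], counts, vocab)))  # public String name
```

In this toy model "st" matches both String and static equally well at the emission level, so it is the transition probabilities (String follows public more often, and name follows String) that settle the choice. That is the sense in which decoding all keywords jointly differs from matching one keyword at a time.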

In addition to code completion by disabbreviation of multiple keywords, abbreviation completion can predict the keywords and non-alphanumeric characters that follow a code completion candidate, a technique called code completion by extrapolation. The system finds the most likely next keywords and non-alphanumeric characters using an n-gram model of the programming language. This enables a code completion scenario in which a user first types a short abbreviated expression to complete the beginning of a desired full expression and then uses the extrapolation feature to complete the remainder without further typing.
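As a rough illustration of extrapolation, the sketch below trains a trigram model on a tiny hypothetical corpus and greedily appends the most frequent follower of the last two tokens. The corpus, the choice of n = 3, and the greedy strategy are all assumptions made for the example; the abstract does not specify the n-gram order or how candidates are ranked.

```python
from collections import Counter, defaultdict

# Hypothetical tokenized code lines; punctuation is kept as separate tokens
# so that non-alphanumeric characters can be predicted too.
CORPUS = [
    ["public", "static", "void", "main", "(", "String", "[", "]", "args", ")"],
    ["public", "static", "void", "run", "(", ")"],
    ["public", "static", "void", "main", "(", "String", "[", "]", "args", ")"],
]

def train(corpus, n=3):
    """Map each (n-1)-token context to a Counter of the tokens that follow it."""
    model = defaultdict(Counter)
    for line in corpus:
        for i in range(len(line) - n + 1):
            context = tuple(line[i:i + n - 1])
            model[context][line[i + n - 1]] += 1
    return model

def extrapolate(model, tokens, steps=2, n=3):
    """Greedily predict up to `steps` next tokens after an already completed prefix."""
    out = list(tokens)
    for _ in range(steps):
        context = tuple(out[-(n - 1):])
        followers = model.get(context)
        if not followers:
            break  # unseen context: stop rather than guess
        out.append(followers.most_common(1)[0][0])
    return out[len(tokens):]

model = train(CORPUS)
print(extrapolate(model, ["public", "static"]))  # ['void', 'main']
```

Here the prefix public static could itself be the result of expanding an abbreviation such as pb st, after which the two predicted tokens complete more of the line with no additional keystrokes.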

This paper presents the algorithm for abbreviation completion, integrated with a new user interface for multiple-keyword code completion. We tested the system by sampling 4919 code lines from open source projects and found that more than 99% of the code lines could be resolved from acronym-like abbreviations. The system could also extrapolate code completion candidates to complete the next one or two keywords with accuracies of 96% and 82%, respectively. A user study of code completion by disabbreviation found a 30% reduction in time and a 41% reduction in keystrokes compared with conventional code completion.


ISSN:0928-8910

Copyright © 2011 Springer Science+Business Media, LLC.

Publisher

Kluwer Academic Publishers

United States
