Code completion of multiple keywords from abbreviated input

Published: 01 December 2011

Abstract

Abbreviation Completion is a novel technique that improves the efficiency of code writing by supporting code completion of multiple keywords from non-predefined abbreviated input. This differs from conventional code completion, which finds one keyword at a time based on an exact character match. Abbreviated input, consisting of abbreviated keywords and the non-alphanumeric characters between them (e.g., pb st nm), is expanded into a full expression (e.g., public String name) by a Hidden Markov Model learned from a corpus of existing code and abbreviation examples. The technique does not require the user to memorize abbreviations, and it provides incremental feedback on the most likely completions.
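
The expansion step described above can be illustrated with a minimal Viterbi sketch. This is not the paper's implementation: the toy corpus, the add-one smoothing, the subsequence-based emission score, and the names `train_transitions`, `emission_logp`, and `expand` are all assumptions made for illustration. The sketch also splits input on whitespace only and ignores the non-alphanumeric characters that the real system handles.

```python
import math
from collections import defaultdict

# Toy corpus of tokenized code lines (an assumption, not the paper's data).
CORPUS = [
    ["public", "String", "name"],
    ["public", "String", "number"],
    ["private", "static", "int", "count"],
    ["public", "static", "void", "main"],
]
START = "<s>"  # hypothetical start-of-line state


def train_transitions(corpus):
    """Count keyword bigrams; return the vocabulary and a smoothed log-prob function."""
    counts = defaultdict(lambda: defaultdict(int))
    vocab = set()
    for line in corpus:
        prev = START
        for tok in line:
            counts[prev][tok] += 1
            vocab.add(tok)
            prev = tok
    vocab = sorted(vocab)

    def logp(prev, tok):
        total = sum(counts[prev].values())
        # Add-one smoothing so unseen transitions stay possible.
        return math.log((counts[prev][tok] + 1) / (total + len(vocab)))

    return vocab, logp


def is_subsequence(abbr, word):
    """True if the letters of abbr appear in word in the same order."""
    it = iter(word)
    return all(ch in it for ch in abbr)


def emission_logp(abbr, keyword):
    """Hypothetical emission score: abbr must keep the first letter and be a subsequence."""
    a, k = abbr.lower(), keyword.lower()
    if not a or a[0] != k[0] or not is_subsequence(a, k):
        return float("-inf")
    # Favor abbreviations that retain a larger share of the keyword.
    return math.log(len(a) / len(k))


def expand(abbrs, vocab, trans_logp):
    """Viterbi search for the most likely keyword sequence given the abbreviations."""
    best = {START: (0.0, [])}  # state -> (log-score, best path ending in that state)
    for abbr in abbrs:
        nxt = {}
        for kw in vocab:
            e = emission_logp(abbr, kw)
            if e == float("-inf"):
                continue
            score, path = max(
                ((s + trans_logp(prev, kw) + e, p) for prev, (s, p) in best.items()),
                key=lambda t: t[0],
            )
            nxt[kw] = (score, path + [kw])
        if not nxt:
            return None  # no keyword can explain this abbreviation
        best = nxt
    return max(best.values(), key=lambda t: t[0])[1]


VOCAB, TRANS = train_transitions(CORPUS)
print(expand("pb st nm".split(), VOCAB, TRANS))
```

With this toy corpus, `st` is resolved to `String` rather than `static` because `String` follows `public` more often, which is the kind of context sensitivity a per-keyword exact match cannot provide.
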
In addition to code completion by disabbreviation of multiple keywords, Abbreviation Completion supports predicting the next keywords and non-alphanumeric characters of a code completion candidate, a technique called code completion by extrapolation. The system finds the most likely next keywords and non-alphanumeric characters using an n-gram model of the programming language. This enables a code completion scenario in which a user first types a short abbreviated expression to complete the beginning of a desired full expression and then uses the extrapolation feature to complete the rest without further typing.
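
Extrapolation can be sketched in the same spirit with a bigram model that greedily appends the most frequent next token. The abstract only states that an n-gram model is used, so the bigram order, the greedy strategy, the toy corpus, and the names `train_bigrams` and `extrapolate` are assumptions for illustration.

```python
from collections import Counter, defaultdict

# Toy corpus of tokenized code lines, including non-alphanumeric tokens
# (an assumption, not the paper's data).
CORPUS = [
    ["public", "static", "void", "main", "(", "String", "[", "]", "args", ")"],
    ["public", "static", "void", "run", "(", ")"],
    ["public", "static", "void", "main", "(", ")"],
    ["private", "static", "int", "count", ";"],
]


def train_bigrams(corpus):
    """Map each token to a Counter of the tokens that follow it."""
    follow = defaultdict(Counter)
    for line in corpus:
        for prev, tok in zip(line, line[1:]):
            follow[prev][tok] += 1
    return follow


def extrapolate(prefix, follow, steps=2):
    """Greedily extend prefix by up to `steps` tokens using bigram counts."""
    out = list(prefix)
    if not out:
        return out
    for _ in range(steps):
        candidates = follow.get(out[-1])
        if not candidates:
            break  # nothing was ever observed after this token
        out.append(candidates.most_common(1)[0][0])
    return out


FOLLOW = train_bigrams(CORPUS)
print(extrapolate(["public", "static"], FOLLOW, steps=2))
```

A user who has already expanded an abbreviation to `public static` could then accept the predicted continuation instead of typing it.
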
This paper presents the algorithm for abbreviation completion, integrated with a new user interface for multiple-keyword code completion. We tested the system by sampling 4919 code lines from open source projects and found that more than 99% of the code lines could be resolved from acronym-like abbreviations. The system could also extrapolate code completion candidates to complete the next one or two keywords with accuracies of 96% and 82%, respectively. A user study of code completion by disabbreviation found a 30% reduction in time and a 41% reduction in keystrokes compared with conventional code completion.

Cited By

  • (2024) Which Syntactic Capabilities Are Statistically Learned by Masked Language Models for Code? Proceedings of the 2024 ACM/IEEE 44th International Conference on Software Engineering: New Ideas and Emerging Results, pp. 72-76. doi:10.1145/3639476.3639768. Online publication date: 14-Apr-2024
  • (2024) A survey on machine learning techniques applied to source code. Journal of Systems and Software, 209:C. doi:10.1016/j.jss.2023.111934. Online publication date: 14-Mar-2024
  • (2022) To what extent do deep learning-based code recommenders generate predictions by cloning code from the training set? Proceedings of the 19th International Conference on Mining Software Repositories, pp. 167-178. doi:10.1145/3524842.3528440. Online publication date: 23-May-2022

Published In

Automated Software Engineering  Volume 18, Issue 3-4
December 2011
172 pages

Publisher

Kluwer Academic Publishers

United States

Author Tags

  1. Abbreviation
  2. Abbreviation completion
  3. Code assistants
  4. Code completion
  5. Data mining
  6. Extrapolation
  7. Hidden Markov model
  8. Multiple keywords

Qualifiers

  • Article
