How well do pre-trained contextual language representations recommend labels for GitHub issues?

Published: 28 November 2021

Abstract

Motivation:

Open-source organizations use issues to collect user feedback, software bugs, and feature requests on GitHub. Many issues lack labels, leaving maintainers with the time-consuming task of labeling them manually. Recently, some researchers have used deep learning to improve the performance of automated tagging for software objects. However, these approaches rely on static pre-trained word vectors, which cannot represent the semantics of the same word in different contexts. Pre-trained contextual language representations have been shown to achieve outstanding performance on many NLP tasks.

Description:

In this paper, we study whether pre-trained contextual language models are really better than earlier language models at recommending labels for GitHub issues, and we offer suggestions for fine-tuning pre-trained contextual language representation models. First, we compare four deep learning models, three of which use traditional pre-trained word embeddings. Furthermore, we compare their performance when different corpora are used for pre-training.

Results:

The experimental results show that: (1) with large training data, the BERT model outperforms other deep learning language models such as Bi-LSTM, CNN, and RCNN, whereas with a small training set, CNN performs better than BERT; (2) further pre-training on domain-specific data can indeed improve model performance.
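Comparisons of this kind are commonly scored with top-k recommendation metrics such as Precision@k, Recall@k, and F1@k over the labels each model recommends per issue. A minimal, self-contained sketch of that evaluation follows; the metric definitions, the value of k, and the example labels are illustrative assumptions, not taken from the paper:

```python
# Sketch of top-k label recommendation metrics for issue labeling.
# For each issue, a model produces a ranked list of recommended labels,
# which is scored against the set of labels the maintainers applied.

def topk_metrics(recommended, actual, k):
    """Precision@k, Recall@k, F1@k for one issue.

    recommended: ranked list of predicted labels (best first)
    actual:      set of ground-truth labels
    """
    hits = sum(1 for label in recommended[:k] if label in actual)
    precision = hits / k
    recall = hits / len(actual) if actual else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return precision, recall, f1

def mean_topk(predictions, k):
    """Average per-issue metrics over (recommended, actual) pairs."""
    scores = [topk_metrics(rec, act, k) for rec, act in predictions]
    n = len(scores)
    return tuple(sum(s[i] for s in scores) / n for i in range(3))

# Example: two issues, top-3 recommendations vs. maintainer labels.
data = [
    (["bug", "ui", "docs"], {"bug", "docs"}),
    (["feature", "bug", "help-wanted"], {"feature"}),
]
p, r, f = mean_topk(data, k=3)  # dataset-level Precision@3, Recall@3, F1@3
```

Averaging the per-issue scores over the whole test set yields the dataset-level numbers on which models such as BERT, CNN, Bi-LSTM, and RCNN can be compared.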

Conclusions:

When recommending labels for GitHub issues, pre-trained contextual language representations are the better choice if the training dataset is large enough. Moreover, we discuss the experimental results and provide implications for improving label recommendation performance for GitHub issues.




Published In

Knowledge-Based Systems, Volume 232, Issue C
Nov 2021
572 pages

Publisher

Elsevier Science Publishers B. V.

Netherlands


Author Tags

  1. Deep learning
  2. Issue labeling
  3. Data analysis
  4. Language model

Qualifiers

  • Research-article


Cited By

  • (2024) LEGION: Harnessing Pre-trained Language Models for GitHub Topic Recommendations with Distribution-Balance Loss. Proceedings of the 28th International Conference on Evaluation and Assessment in Software Engineering, pp. 181–190. DOI: 10.1145/3661167.3661168. Online publication date: 18-Jun-2024.
  • (2024) Leveraging GPT-like LLMs to Automate Issue Labeling. Proceedings of the 21st International Conference on Mining Software Repositories, pp. 469–480. DOI: 10.1145/3643991.3644903. Online publication date: 15-Apr-2024.
  • (2024) Impact of data quality for automatic issue classification using pre-trained language models. Journal of Systems and Software, 210:C. DOI: 10.1016/j.jss.2023.111838. Online publication date: 1-Apr-2024.
  • (2023) What Is the Intended Usage Context of This Model? An Exploratory Study of Pre-Trained Models on Various Model Repositories. ACM Transactions on Software Engineering and Methodology, 32(3):1–57. DOI: 10.1145/3569934. Online publication date: 3-May-2023.
  • (2022) Amalgamation of Embeddings With Model Explainability for Sentiment Analysis. International Journal of Applied Evolutionary Computation, 13(1):1–24. DOI: 10.4018/IJAEC.315629. Online publication date: 23-Dec-2022.
  • (2022) How to Choose a Task? Mismatches in Perspectives of Newcomers and Existing Contributors. Proceedings of the 16th ACM/IEEE International Symposium on Empirical Software Engineering and Measurement, pp. 114–124. DOI: 10.1145/3544902.3546236. Online publication date: 19-Sep-2022.
  • (2022) Issue report classification using pre-trained language models. Proceedings of the 1st International Workshop on Natural Language-based Software Engineering, pp. 29–32. DOI: 10.1145/3528588.3528659. Online publication date: 21-May-2022.
  • (2022) Diagnosing crop diseases based on domain-adaptive pre-training BERT of electronic medical records. Applied Intelligence, 53(12):15979–15992. DOI: 10.1007/s10489-022-04346-x. Online publication date: 1-Dec-2022.
