DOI: 10.5555/2820282.2820296
research-article

Synonym suggestion for tags on Stack Overflow

Published: 16 May 2015

Abstract

The number of distinct tags used to classify posts on Stack Overflow has grown in recent years to more than 38,000. Many of these tags have the same or a similar meaning. Stack Overflow offers a way to reduce the number of tags by allowing privileged users to create synonyms manually. However, only 2,765 synonym pairs currently exist on Stack Overflow, which is quite low compared to the total number of tags.
To understand how synonym pairs are built, we manually analyzed the tags and examined how the synonyms could be created automatically. Based on our findings, we present TSST, a tag synonym suggestion tool that outputs a ranked list of possible synonyms for each input tag.
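To make the idea of a ranked list of candidate synonyms concrete, the following is a minimal sketch. It is not TSST's method, which the abstract does not describe. It assumes a generic character-level similarity from Python's standard `difflib` module, and the function name `rank_synonym_candidates` and the sample tags are hypothetical.

```python
from difflib import SequenceMatcher


def rank_synonym_candidates(tag, all_tags, top_k=10):
    """Return up to top_k (candidate, score) pairs, most similar first.

    Assumption: plain string similarity stands in for whatever
    scoring a real synonym suggestion tool would use.
    """
    scored = [
        (other, SequenceMatcher(None, tag, other).ratio())
        for other in all_tags
        if other != tag  # a tag is not a synonym candidate for itself
    ]
    # Highest score first; ties are broken alphabetically for determinism.
    scored.sort(key=lambda pair: (-pair[1], pair[0]))
    return scored[:top_k]


# Hypothetical tag vocabulary, for illustration only.
tags = ["javascript", "java-script", "java", "python", "js"]
ranking = rank_synonym_candidates("javascript", tags)
for candidate, score in ranking:
    print(f"{candidate}: {score:.3f}")
```

A purely string-based score like this one ranks the spelling variant `java-script` first but gives the abbreviation `js` a low score, which hints at why a practical tool would likely need more than one similarity signal.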
We first evaluated TSST with the 2,765 approved synonym pairs of Stack Overflow. For 88.4% of the tags, TSST finds the correct synonym; for 72.2%, the correct synonym is within the top 10 suggestions. In addition, we applied TSST to 10 randomly selected Android-related tags and evaluated the suggested synonyms with 20 Android app developers in an online survey. Overall, in 80% of their ratings, developers found an adequate synonym suggested by TSST.
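The "within the top 10 suggestions" figure is a top-k hit rate. The sketch below shows one way such a rate could be computed against a set of approved synonym pairs. The function name `top_k_hit_rate` and the toy data are hypothetical and do not come from the paper's evaluation.

```python
def top_k_hit_rate(suggestions, approved, k=10):
    """Fraction of tags whose approved synonym is among the first k suggestions.

    suggestions: dict mapping a tag to its ranked list of suggested synonyms.
    approved:    dict mapping a tag to its single approved synonym.
    """
    if not approved:
        return 0.0
    hits = sum(
        1
        for tag, synonym in approved.items()
        if synonym in suggestions.get(tag, [])[:k]
    )
    return hits / len(approved)


# Toy data, invented for illustration only.
approved = {"js": "javascript", "py": "python", "c-sharp": "c#", "golang": "go"}
suggestions = {
    "js": ["javascript", "json"],
    "py": ["pypi", "python"],
    "c-sharp": ["c", "c++"],
    "golang": ["go"],
}

rate_top10 = top_k_hit_rate(suggestions, approved, k=10)
rate_top1 = top_k_hit_rate(suggestions, approved, k=1)
print(rate_top10, rate_top1)
```

Reporting the rate at more than one cutoff, as the abstract does with "finds the correct synonym" versus "within the top 10", separates whether a correct synonym is found at all from how highly it is ranked.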


Published In

ICPC '15: Proceedings of the 2015 IEEE 23rd International Conference on Program Comprehension
May 2015
325 pages

Publisher

IEEE Press

Conference

ICSE '15
Cited By

  • (2019) Man vs machine. Proceedings of the 16th International Conference on Mining Software Repositories, pp. 205-209. DOI: 10.1109/MSR.2019.00041. Online publication date: 26-May-2019
  • (2018) FastTagRec. Automated Software Engineering 25:4, pp. 675-701. DOI: 10.5555/3288647.3288711. Online publication date: 1-Dec-2018
  • (2017) What are Software Engineers asking about Android Testing on Stack Overflow? Proceedings of the XXXI Brazilian Symposium on Software Engineering, pp. 104-113. DOI: 10.1145/3131151.3131157. Online publication date: 20-Sep-2017
  • (2017) Recommending Answerers for Stack Overflow with LDA Model. Proceedings of the 12th Chinese Conference on Computer Supported Cooperative Work and Social Computing, pp. 80-86. DOI: 10.1145/3127404.3127426. Online publication date: 22-Sep-2017
  • (2017) Unsupervised software-specific morphological forms inference from informal discussions. Proceedings of the 39th International Conference on Software Engineering, pp. 450-461. DOI: 10.1109/ICSE.2017.48. Online publication date: 20-May-2017
  • (2016) Grouping android tag synonyms on stack overflow. Proceedings of the 13th International Conference on Mining Software Repositories, pp. 430-440. DOI: 10.1145/2901739.2901750. Online publication date: 14-May-2016
