Can We Use SE-specific Sentiment Analysis Tools in a Cross-Platform Setting?

Authors: Nicole Novielli, Fabio Calefato, Davide Dongiovanni, Daniela Girardi, Filippo Lanubile

MSR '20: Proceedings of the 17th International Conference on Mining Software Repositories

Pages 158 - 168

https://doi.org/10.1145/3379597.3387446

Published: 18 September 2020

Abstract

In this paper, we address the problem of using sentiment analysis tools 'off-the-shelf', that is, when a gold standard is not available for retraining. We evaluate the performance of four SE-specific tools in a cross-platform setting, i.e., on a test set collected from a data source different from the one used for training. We find that (i) the lexicon-based tools outperform the supervised approaches retrained in a cross-platform setting and (ii) retraining can be beneficial in within-platform settings in the presence of robust gold standard datasets, even with a minimal training set. Based on our empirical findings, we derive guidelines for the reliable use of sentiment analysis tools in software engineering.

Published In

MSR '20: Proceedings of the 17th International Conference on Mining Software Repositories

June 2020

675 pages

ISBN: 9781450375177

DOI: 10.1145/3379597

Copyright © 2020 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States

Qualifiers

Research-article
Research
Refereed limited

Conference

MSR '20

Sponsor:

SIGSOFT

MSR '20: 17th International Conference on Mining Software Repositories

June 29 - 30, 2020

Seoul, Republic of Korea

Bibliometrics & Citations

Article Metrics

Total Citations: 35
Total Downloads: 357
Downloads (Last 12 months): 72
Downloads (Last 6 weeks): 0

Reflects downloads up to 30 Aug 2024

Citations

Cited By

Coutinho D, Cito L, Lima M, Arantes B, Alves Pereira J, Arriel J, Godinho J, Martins V, Libório P, Leite L, Garcia A, Assunção W, Steinmacher I, Baffa A, Fonseca B. (2024). "Looks Good To Me ;-)": Assessing Sentiment Analysis Tools for Pull Request Discussions. Proceedings of the 28th International Conference on Evaluation and Assessment in Software Engineering, 211-221. Online publication date: 18-Jun-2024. https://dl.acm.org/doi/10.1145/3661167.3661189

Jansen J, Cassee N, Serebrenik A. (2024). Sentiment of Technical Debt Security Questions on Stack Overflow: A Replication Study. 2024 IEEE International Conference on Software Analysis, Evolution and Reengineering (SANER), 821-829. Online publication date: 12-Mar-2024. https://doi.org/10.1109/SANER60148.2024.00089

Shafikuzzaman M, Islam M, Rolli A, Akhter S, Seliya N. (2024). An Empirical Evaluation of the Zero-Shot, Few-Shot, and Traditional Fine-Tuning Based Pretrained Language Models for Sentiment Analysis in Software Engineering. IEEE Access 12, 109714-109734. Online publication date: 2024. https://doi.org/10.1109/ACCESS.2024.3439450

Claes M, Farooq U, Salman I, Teern A, Isomursu M, Halonen R. (2024). Sentiment Analysis of Finnish Twitter Discussions on COVID-19 During the Pandemic. SN Computer Science 5:2. Online publication date: 13-Feb-2024. https://doi.org/10.1007/s42979-023-02595-2

Cassee N, Agaronian A, Constantinou E, Novielli N, Serebrenik A. (2024). Transformers and meta-tokenization in sentiment analysis for software engineering. Empirical Software Engineering 29:4. Online publication date: 3-Jun-2024. https://dl.acm.org/doi/10.1007/s10664-024-10468-2

Specht A, Obaidi M, Nagel L, Stess M, Klünder J. (2024). What is Needed to Apply Sentiment Analysis in Real Software Projects: A Feasibility Study in Industry. Human-Centered Software Engineering, 105-129. Online publication date: 1-Jul-2024. https://doi.org/10.1007/978-3-031-64576-1_6

Anders M, Paech B, Bockstaller L. (2024). Exploring the Automatic Classification of Usage Information in Feedback. Requirements Engineering: Foundation for Software Quality, 267-283. Online publication date: 30-Mar-2024. https://doi.org/10.1007/978-3-031-57327-9_17

Schilpp J, Pethig F, Hoehle H. (2023). Analytics Dashboards and User Behavior: Evidence from GitHub. 2023 46th MIPRO ICT and Electronics Convention (MIPRO), 56-61. Online publication date: 22-May-2023. https://doi.org/10.23919/MIPRO57284.2023.10159843

V J, S A. (2023). Cross-lingual Sentiment Analysis of Tamil Language Using a Multi-stage Deep Learning Architecture. ACM Transactions on Asian and Low-Resource Language Information Processing 22:12, 1-28. Online publication date: 1-Nov-2023. https://dl.acm.org/doi/10.1145/3631391

Chen Z, Zhang J, Sarro F, Harman M. (2023). A Comprehensive Empirical Study of Bias Mitigation Methods for Machine Learning Classifiers. ACM Transactions on Software Engineering and Methodology 32:4, 1-30. Online publication date: 27-May-2023. https://dl.acm.org/doi/10.1145/3583561
