DOI: 10.1145/3510003.3510201
research-article

Domain-specific analysis of mobile app reviews using keyword-assisted topic models

Published: 05 July 2022

Abstract

Mobile application (app) reviews contain valuable information for app developers. A plethora of supervised and unsupervised techniques have been proposed in the literature to synthesize useful user feedback from app reviews. However, traditional supervised classification algorithms require extensive manual effort to label ground truth data, while unsupervised text mining techniques, such as topic models, often produce suboptimal results due to the sparsity of useful information in the reviews. To overcome these limitations, in this paper, we propose a fully automatic and unsupervised approach for extracting useful information from mobile app reviews. The proposed approach is based on keyATM, a keyword-assisted approach for generating topic models. keyATM overcomes the problem of data sparsity by using seeding keywords extracted directly from the review corpus. These keywords are then used to generate meaningful domain-specific topics. Our approach is evaluated over two datasets of mobile app reviews sampled from the domains of Investing and Food Delivery apps. The results show that our approach produces significantly more coherent topics than traditional topic modeling techniques.
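
The abstract compares approaches by topic coherence but does not say which measure is used. As a minimal sketch, assuming a document-level normalized pointwise mutual information (NPMI) score, the following shows how the coherence of one topic's top words could be computed over a review corpus. The function name `npmi_coherence`, the whitespace tokenization, and the sample reviews are illustrative assumptions and are not taken from the paper.

```python
import math
from itertools import combinations


def npmi_coherence(top_words, documents):
    """Average NPMI over all pairs of a topic's top words.

    Probabilities are estimated from document-level co-occurrence:
    P(w) is the fraction of documents that contain w.
    """
    # Assumption: lowercase whitespace tokenization is good enough for a sketch.
    doc_sets = [set(doc.lower().split()) for doc in documents]
    n_docs = len(doc_sets)
    if n_docs == 0 or len(top_words) < 2:
        raise ValueError("need at least one document and two words")

    def prob(*words):
        # Fraction of documents that contain every word in `words`.
        return sum(all(w in d for w in words) for d in doc_sets) / n_docs

    scores = []
    for w1, w2 in combinations(top_words, 2):
        p1, p2, p12 = prob(w1), prob(w2), prob(w1, w2)
        if p12 == 0.0:
            # The words never co-occur: use the NPMI lower bound.
            scores.append(-1.0)
        elif p12 == 1.0:
            # Both words occur in every document, so NPMI is 0/0;
            # scoring it 0.0 is an assumed convention.
            scores.append(0.0)
        else:
            pmi = math.log(p12 / (p1 * p2))
            scores.append(pmi / -math.log(p12))
    return sum(scores) / len(scores)


# Hypothetical reviews, for illustration only.
reviews = [
    "app crashes on login",
    "login crashes again",
    "great food fast delivery",
    "delivery was late",
]
print(npmi_coherence(["crashes", "login"], reviews))
print(npmi_coherence(["crashes", "delivery"], reviews))
```

Scores range from -1 (the words never appear together) to 1 (the words only appear together), so a topic whose top words tend to share reviews scores higher than one that mixes unrelated terms.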

Published In

ICSE '22: Proceedings of the 44th International Conference on Software Engineering
May 2022
2508 pages
ISBN:9781450392211
DOI:10.1145/3510003
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

In-Cooperation

  • IEEE CS

Publisher

Association for Computing Machinery

New York, NY, United States

Acceptance Rates

Overall Acceptance Rate 276 of 1,856 submissions, 15%

Bibliometrics

Article Metrics

  • Downloads (last 12 months): 297
  • Downloads (last 6 weeks): 41
Download counts are as of 08 Feb 2025.

Cited By

  • (2025) RPerf: Mining user reviews using topic modeling to assist performance testing: An industrial experience report. Journal of Systems and Software 222 (112283). DOI: 10.1016/j.jss.2024.112283. Online publication date: Apr-2025.
  • (2024) From Customer’s Voice to Decision-Maker Insights: Textual Analysis Framework for Arabic Reviews of Saudi Arabia’s Super App. Applied Sciences 14:16 (6952). DOI: 10.3390/app14166952. Online publication date: 8-Aug-2024.
  • (2024) Keyword-assisted topic models reveal the dynamics in the main media frames of the Grand Ethiopian Renaissance Dam (2011–2022). Media, War & Conflict. DOI: 10.1177/17506352241241159. Online publication date: 2-May-2024.
  • (2024) Unveiling User Perspectives: Exploring Themes in Femtech Mobile App Reviews for Enhanced Usability and Privacy. Proceedings of the ACM on Human-Computer Interaction 8:MHCI (1-21). DOI: 10.1145/3676530. Online publication date: 24-Sep-2024.
  • (2024) Generating Rate Features for Mobile Applications. Proceedings of the IEEE/ACM 11th International Conference on Mobile Software Engineering and Systems (54-64). DOI: 10.1145/3647632.3647986. Online publication date: 14-Apr-2024.
  • (2024) Factors Influencing Mobile App User Experience: An Analysis of Education App User Reviews. 2024 4th International Conference on Advanced Research in Computing (ICARC) (223-228). DOI: 10.1109/ICARC61713.2024.10499727. Online publication date: 21-Feb-2024.
  • (2024) How to effectively mine app reviews concerning software ecosystem? A survey of review characteristics. Journal of Systems and Software 213 (112040). DOI: 10.1016/j.jss.2024.112040. Online publication date: Jul-2024.
  • (2023) The Impact of YouTube on Present and Future Firm Value: Using Unstructured Text Analysis. Sustainability 15:5 (4346). DOI: 10.3390/su15054346. Online publication date: 28-Feb-2023.
  • (2023) A Study of Gender Discussions in Mobile Apps. 2023 IEEE/ACM 20th International Conference on Mining Software Repositories (MSR) (598-610). DOI: 10.1109/MSR59073.2023.00086. Online publication date: May-2023.
  • (2023) Empirical Evaluation of ChatGPT on Requirements Information Retrieval Under Zero-Shot Setting. 2023 International Conference on Intelligent Computing and Next Generation Networks (ICNGN) (1-6). DOI: 10.1109/ICNGN59831.2023.10396810. Online publication date: 17-Nov-2023.
