Mining non-functional requirements from App store reviews

Empirical Software Engineering

Abstract

User reviews obtained from mobile application (app) stores contain technical feedback that can be useful for app developers. Recent research has focused on mining and categorizing such feedback into actionable software maintenance requests, such as bug reports and functional feature requests. However, little attention has been paid to extracting and synthesizing the Non-Functional Requirements (NFRs) expressed in these reviews. NFRs describe a set of high-level quality constraints that a software system should exhibit (e.g., security, performance, usability, and dependability). Meeting these requirements is a key factor in achieving user satisfaction and, ultimately, surviving in the app market. To bridge this gap, we present a two-phase study aimed at mining NFRs from user reviews available on mobile app stores. In the first phase, we conduct a qualitative analysis using a dataset of 6,000 user reviews, sampled from a broad range of iOS app categories. Our results show that 40% of the reviews in our dataset express at least one type of NFR. The results also show that users in different app categories tend to raise different types of NFRs. In the second phase, we devise an optimized dictionary-based multi-label classification approach to automatically capture NFRs in user reviews. Evaluating the proposed approach over a dataset of 1,100 reviews, sampled from a set of iOS and Android apps, shows that it achieves an average precision of 70% (range [66%, 80%]) and an average recall of 86% (range [69%, 98%]).
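
To make the second phase more concrete, the following is a minimal sketch of what a dictionary-based multi-label classifier and its per-label evaluation could look like. It is not the authors' optimized approach: the four labels are taken from the examples in the abstract, while the indicator terms in `NFR_DICTIONARY`, the regular-expression tokenizer, and the function names `classify` and `precision_recall` are all hypothetical choices made for illustration.

```python
import re
from typing import Dict, List, Set, Tuple

# Hypothetical indicator terms per NFR category (assumption: these are
# illustrative only and are NOT the dictionary derived in the paper).
NFR_DICTIONARY: Dict[str, Set[str]] = {
    "dependability": {"crash", "crashes", "freeze", "freezes"},
    "performance": {"slow", "lag", "laggy", "battery"},
    "usability": {"confusing", "interface", "layout", "navigate"},
    "security": {"privacy", "password", "permission", "hacked"},
}


def tokenize(review: str) -> List[str]:
    """Lowercase the review and split it into alphabetic tokens."""
    return re.findall(r"[a-z']+", review.lower())


def classify(review: str) -> Set[str]:
    """Return every NFR label whose term set shares a token with the review.

    A review can receive zero, one, or several labels (multi-label).
    """
    tokens = set(tokenize(review))
    return {label for label, terms in NFR_DICTIONARY.items() if tokens & terms}


def precision_recall(
    predicted: List[Set[str]], actual: List[Set[str]], label: str
) -> Tuple[float, float]:
    """Compute precision and recall for a single label over a set of reviews."""
    pairs = list(zip(predicted, actual))
    tp = sum(1 for p, a in pairs if label in p and label in a)
    fp = sum(1 for p, a in pairs if label in p and label not in a)
    fn = sum(1 for p, a in pairs if label not in p and label in a)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall
```

Under these assumptions, `classify("The app crashes and is so slow")` would assign both the dependability and the performance labels, while a review with no indicator terms receives none. Averaging the per-label values returned by `precision_recall` over all categories is one plausible way to arrive at average figures of the kind reported above, though the paper's exact averaging procedure is not stated in the abstract.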

Notes

  1. https://www.statista.com/topics/1729/app-stores/

  2. https://www.apple.com/itunes/charts

  3. https://rss.itunes.apple.com/us/?urlDesc=

  4. https://github.com/seelprojects/ManualReviewClassifier

  5. The data is available at: http://seel.cse.lsu.edu/data/emse19.zip

  6. https://www.nltk.org/

  7. Dataset is available at: http://seel.cse.lsu.edu/data/emse19.zip

  8. https://github.com/seelprojects/MARC-3.0

Acknowledgements

We would like to extend our gratitude to Dr. Daniel M. Berry from the University of Waterloo for his contribution to this work. This work was supported in part by the Louisiana Board of Regents Research Competitiveness Subprogram (LA BoR-RCS), contract number: LEQSF(2015-18)-RD-A-07 and by the LSU Economic Development Assistantships (EDA) program.

Author information

Corresponding author

Correspondence to Anas Mahmoud.

Additional information

Communicated by: David Lo, Meiyappan Nagappan, Fabio Palomba, and Sebastiano Panichella

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Jha, N., Mahmoud, A. Mining non-functional requirements from App store reviews. Empir Software Eng 24, 3659–3695 (2019). https://doi.org/10.1007/s10664-019-09716-7

  • DOI: https://doi.org/10.1007/s10664-019-09716-7
