Search | arXiv e-print repository

A Systematic Literature Review on Task Recommendation Systems for Crowdsourced Software Engineering

Authors: Shashiwadana Nirmani, Mojtaba Shahin, Hourieh Khalajzadeh, Xiao Liu

Abstract: Context: Crowdsourced Software Engineering (CSE) offers outsourcing work to software practitioners by leveraging a global online workforce. However, these software practitioners struggle to identify suitable tasks due to the variety of options available. Hence, there has been a growing number of studies on introducing recommendation systems to recommend CSE tasks to software practitioners. Objective: The goal of this study is to analyze the existing CSE task recommendation systems, investigating their extracted data, recommendation methods, key advantages and limitations, recommended task types, the use of human factors in recommendations, popular platforms, and features used to make recommendations. Method: This systematic literature review (SLR) was conducted according to the Kitchenham and Charters guidelines. We used both manual and automatic search strategies without putting any time limitation for searching the relevant papers. Results: We selected 63 primary studies for data extraction, analysis, and synthesis based on our predefined inclusion and exclusion criteria. From the results of the data analysis, we classified the extracted data into 4 categories based on the data extraction source, categorized the proposed recommendation systems to fit into a taxonomy, and identified the key advantages and limitations of these systems. Our results revealed that human factors play a major role in CSE task recommendation. Further, we identified five popular task types recommended, popular platforms, and their features used in task recommendation. We also provided recommendations for future research directions. Conclusion: This SLR provides insights into current trends, gaps, and future research directions in CSE task recommendation systems.

Submitted 13 July, 2024; originally announced July 2024.

Comments: 33 pages, 3 figures

arXiv:2407.08145 [pdf, other]

A Comprehensive Study of Disaster Support Mobile Apps

Authors: Muhamad Syukron, Anuradha Madugalla, Mojtaba Shahin, John Grundy

Abstract: Context: Disasters are a common global occurrence, with climate change leading to an increase in both their frequency and intensity. To reduce the impact of these disasters on lives and livelihoods, it is important to provide accurate warnings and information about recovery and mitigation. Today most emergency management agencies deliver this information via mobile apps. Objective: There is a large collection of disaster mobile apps available across the globe. However, no detailed study has yet been conducted on these apps and their reviews to understand their key features and user feedback. In this paper we present a comprehensive analysis to address this research gap. Method: We conducted a detailed analysis of 45 disaster apps and 28,161 reviews on these apps. We manually analysed the features of these 45 apps and, for review analysis, employed topic modelling and sentiment analysis techniques. Results: We identified 13 key features in these apps and categorised them into the 4 stages of the disaster life cycle. Our analysis revealed 22 topics, with the most discussion being on app alert functionality, app satisfaction, and use of maps. Sentiment analysis of reviews showed that while 22% of users provided positive feedback, 9.5% were negative and 6.8% were neutral. It also showed that signup/signin issues, network issues, and app configuration issues were the most frustrating to users. These impacted user safety, as they prevented users from accessing the app when it mattered most. Conclusions: We provide a set of practical recommendations for future disaster app developers. Our findings will help emergency agencies develop better disaster apps by ensuring key features are supported in their apps and by understanding commonly discussed user issues. This will help to improve the disaster app eco-system and lead to more user friendly and supportive disaster support apps in the future.

Submitted 10 July, 2024; originally announced July 2024.

arXiv:2406.18959 [pdf, other]

How Do Users Revise Architectural Related Questions on Stack Overflow: An Empirical Study

Authors: Musengamana Jean de Dieu, Peng Liang, Mojtaba Shahin, Arif Ali Khan

Abstract: Technical Questions and Answers (Q&A) sites, such as Stack Overflow (SO), accumulate a significant variety of information related to software development in posts from users. To ensure the quality of this information, SO encourages its users to review posts through various mechanisms (e.g., question and answer revision processes). Although Architecture Related Posts (ARPs) communicate architectural information that has a system-wide impact on development, little is known about how SO users revise information shared in ARPs. To fill this gap, we conducted an empirical study to understand how users revise Architecture Related Questions (ARQs) on SO. We manually checked 13,205 ARPs and finally identified 4,114 ARQs that contain revision information. Our main findings are that: (1) The revision of ARQs is not prevalent in SO, and an ARQ revision starts soon after this question is posted (i.e., from 1 minute onward). Moreover, the revision of an ARQ occurs before and after this question receives its first answer/architecture solution, with most revisions beginning before the first architecture solution is posted. Both Question Creators (QCs) and non-QCs actively participate in ARQ revisions, with most revisions being made by QCs. (2) A variety of information (14 categories) is missing and further provided in ARQs after being posted, among which design context, component dependency, and architecture concern are dominant information. (3) Clarify the understanding of architecture under design and improve the readability of architecture problem are the two major purposes of the further provided information in ARQs. (4) The further provided information in ARQs has several impacts on the quality of answers/architecture solutions, including making architecture solution useful, making architecture solution informative, making architecture solution relevant, among others.

Submitted 27 June, 2024; originally announced June 2024.

arXiv:2404.05041 [pdf, other]

How Do OSS Developers Utilize Architectural Solutions from Q&A Sites: An Empirical Study

Authors: Musengamana Jean de Dieu, Peng Liang, Mojtaba Shahin

Abstract: Developers utilize programming-related knowledge (e.g., code snippets) on Q&A sites (e.g., Stack Overflow) that functionally matches the programming problems they encounter in their development. Despite extensive research on Q&A sites, being a high-level and important type of development-related knowledge, architectural solutions (e.g., architecture tactics) and their utilization are rarely explored. To fill this gap, we conducted a mixed-methods study that includes a mining study and a survey study. For the mining study, we mined 984 commits and issues (i.e., 821 commits and 163 issues) from 893 Open-Source Software (OSS) projects on GitHub that explicitly referenced architectural solutions from Stack Overflow (SO) and Software Engineering Stack Exchange (SWESE). For the survey study, we identified practitioners involved in the utilization of these architectural solutions and surveyed 227 of them to further understand how practitioners utilize architectural solutions from Q&A sites in their OSS development. Our main findings are that: (1) OSS practitioners use architectural solutions from Q&A sites to solve a large variety (15 categories) of architectural problems, wherein Component design issue, Architectural anti-pattern, and Security issue are dominant; (2) Seven categories of architectural solutions from Q&A sites have been utilized to solve those problems, among which Architectural refactoring, Use of frameworks, and Architectural tactic are the three most utilized architectural solutions; (3) Using architectural solutions from SO comes with a variety of challenges, e.g., OSS practitioners complain that they need to spend significant time to adapt such architectural solutions to address design concerns raised in their OSS development, and it is challenging to use architectural solutions that are not tailored to the design context of their OSS projects.

Submitted 7 April, 2024; originally announced April 2024.

arXiv:2401.16310 [pdf, other]

Security Code Review by Large Language Models

Authors: Jiaxin Yu, Peng Liang, Yujia Fu, Amjed Tahir, Mojtaba Shahin, Chong Wang, Yangxiao Cai

Abstract: Security code review, as a time-consuming and labour-intensive process, typically requires integration with automated security defect detection tools to ensure code security. Despite the emergence of numerous security analysis tools, those tools face challenges in terms of their poor generalization, high false positive rates, and coarse detection granularity. A recent development with Large Language Models (LLMs) has made them a promising candidate to support security code review. To this end, we conducted the first empirical study to understand the capabilities of LLMs in security code review, delving into the performance, quality problems, and influential factors of LLMs to detect security defects in code reviews. Specifically, we compared the performance of 6 LLMs under five different prompts with the state-of-the-art static analysis tools to detect and analyze security defects. For the best-performing LLM, we conducted a linguistic analysis to explore quality problems in its responses, as well as a regression analysis to investigate the factors influencing its performance. The results are that: (1) existing pre-trained LLMs have limited capability in detecting security defects during code review but significantly outperform the state-of-the-art static analysis tools. (2) GPT-4 performs best among all LLMs when provided with a CWE list for reference. (3) GPT-4 makes few factual errors but frequently generates unnecessary content or responses that are not compliant with the task requirements given in the prompts. (4) GPT-4 is more adept at identifying security defects in code files with fewer tokens, containing functional logic and written by developers with less involvement in the project.

Submitted 8 June, 2024; v1 submitted 29 January, 2024; originally announced January 2024.

arXiv:2401.12768 [pdf, other]

doi 10.1145/3643991.3644909

What Can Self-Admitted Technical Debt Tell Us About Security? A Mixed-Methods Study

Authors: Nicolás E. Díaz Ferreyra, Mojtaba Shahin, Mansooreh Zahedi, Sodiq Quadri, Ricardo Scandariato

Abstract: Self-Admitted Technical Debt (SATD) encompasses a wide array of sub-optimal design and implementation choices reported in software artefacts (e.g., code comments and commit messages) by developers themselves. Such reports have been central to the study of software maintenance and evolution over the last decades. However, they can also be deemed as dreadful sources of information on potentially exploitable vulnerabilities and security flaws. This work investigates the security implications of SATD from a technical and developer-centred perspective. On the one hand, it analyses whether security pointers disclosed inside SATD sources can be used to characterise vulnerabilities in Open-Source Software (OSS) projects and repositories. On the other hand, it delves into developers' perspectives regarding the motivations behind this practice, its prevalence, and its potential negative consequences. We followed a mixed-methods approach consisting of (i) the analysis of a preexisting dataset containing 8,812 SATD instances and (ii) an online survey with 222 OSS practitioners. We gathered 201 SATD instances through the dataset analysis and mapped them to different Common Weakness Enumeration (CWE) identifiers. Overall, 25 different types of CWEs were spotted across commit messages, pull requests, code comments, and issue sections, from which 8 appear among MITRE's Top-25 most dangerous ones. The survey shows that software practitioners often place security pointers across SATD artefacts to promote a security culture among their peers and help them spot flaky code sections, among other motives. However, they also consider such a practice risky as it may facilitate vulnerability exploits. Our findings suggest that preserving the contextual integrity of security pointers disseminated across SATD artefacts is critical to safeguard both commercial and OSS solutions against zero-day attacks.

Submitted 2 March, 2024; v1 submitted 23 January, 2024; originally announced January 2024.

Comments: Accepted in the 21st International Conference on Mining Software Repositories (MSR '24)

arXiv:2401.08097 [pdf, other]

Fairness Concerns in App Reviews: A Study on AI-based Mobile Apps

Authors: Ali Rezaei Nasab, Maedeh Dashti, Mojtaba Shahin, Mansooreh Zahedi, Hourieh Khalajzadeh, Chetan Arora, Peng Liang

Abstract: Fairness is one of the socio-technical concerns that must be addressed in AI-based systems. Unfair AI-based systems, particularly unfair AI-based mobile apps, can pose difficulties for a significant proportion of the global population. This paper aims to analyze fairness concerns in AI-based app reviews. We first manually constructed a ground-truth dataset, including 1,132 fairness and 1,473 non-fairness reviews. Leveraging the ground-truth dataset, we developed and evaluated a set of machine learning and deep learning models that distinguish fairness reviews from non-fairness reviews. Our experiments show that our best-performing model can detect fairness reviews with a precision of 94%. We then applied the best-performing model on approximately 9.5M reviews collected from 108 AI-based apps and identified around 92K fairness reviews. Next, applying the K-means clustering technique to the 92K fairness reviews, followed by manual analysis, led to the identification of six distinct types of fairness concerns (e.g., 'receiving different quality of features and services in different platforms and devices' and 'lack of transparency and fairness in dealing with user-generated content'). Finally, the manual analysis of 2,248 app owners' responses to the fairness reviews identified six root causes (e.g., 'copyright issues') that app owners report to justify fairness concerns.

Submitted 20 June, 2024; v1 submitted 15 January, 2024; originally announced January 2024.

Comments: 30 pages, 5 images, 6 tables, Manuscript submitted to a Journal (2024)

arXiv:2311.07037 [pdf, other]

Phonological Level wav2vec2-based Mispronunciation Detection and Diagnosis Method

Authors: Mostafa Shahin, Julien Epps, Beena Ahmed

Abstract: The automatic identification and analysis of pronunciation errors, known as Mispronunciation Detection and Diagnosis (MDD) plays a crucial role in Computer Aided Pronunciation Learning (CAPL) tools such as Second-Language (L2) learning or speech therapy applications. Existing MDD methods relying on analysing phonemes can only detect categorical errors of phonemes that have an adequate amount of training data to be modelled. With the unpredictable nature of the pronunciation errors of non-native or disordered speakers and the scarcity of training datasets, it is unfeasible to model all types of mispronunciations. Moreover, phoneme-level MDD approaches have a limited ability to provide detailed diagnostic information about the error made. In this paper, we propose a low-level MDD approach based on the detection of speech attribute features. Speech attribute features break down phoneme production into elementary components that are directly related to the articulatory system leading to more formative feedback to the learner. We further propose a multi-label variant of the Connectionist Temporal Classification (CTC) approach to jointly model the non-mutually exclusive speech attributes using a single model. The pre-trained wav2vec2 model was employed as a core model for the speech attribute detector. The proposed method was applied to L2 speech corpora collected from English learners from different native languages. The proposed speech attribute MDD method was further compared to the traditional phoneme-level MDD and achieved a significantly lower False Acceptance Rate (FAR), False Rejection Rate (FRR), and Diagnostic Error Rate (DER) over all speech attributes compared to the phoneme-level equivalent.

Submitted 12 November, 2023; originally announced November 2023.

arXiv:2311.01020 [pdf, other]

Exploring the Problems, their Causes and Solutions of AI Pair Programming: A Study with Practitioners of GitHub Copilot

Authors: Xiyu Zhou, Peng Liang, Beiqi Zhang, Zengyang Li, Aakash Ahmad, Mojtaba Shahin, Muhammad Waseem

Abstract: With the recent advancement of Artificial Intelligence (AI) and Large Language Models (LLMs), AI-based code generation tools become a practical solution for software development. GitHub Copilot, the AI pair programmer, utilizes machine learning models trained on a large corpus of code snippets to generate code suggestions using natural language processing. Despite its popularity in software development, there is limited empirical evidence on the actual experiences of practitioners who work with Copilot. To this end, we conducted an empirical study to understand the problems that practitioners face when using Copilot, as well as their underlying causes and potential solutions. We collected data from 476 GitHub issues, 706 GitHub discussions, and 142 Stack Overflow posts. Our results reveal that (1) Operation Issue and Compatibility Issue are the most common problems faced by Copilot users, (2) Copilot Internal Error, Network Connection Error, and Editor/IDE Compatibility Issue are identified as the most frequent causes, and (3) Bug Fixed by Copilot, Modify Configuration/Setting, and Use Suitable Version are the predominant solutions. Based on the results, we discuss the potential areas of Copilot for enhancement, and provide the implications for the Copilot users, the Copilot team, and researchers.

Submitted 28 April, 2024; v1 submitted 2 November, 2023; originally announced November 2023.

arXiv:2310.02059 [pdf, other]

Security Weaknesses of Copilot Generated Code in GitHub

Authors: Yujia Fu, Peng Liang, Amjed Tahir, Zengyang Li, Mojtaba Shahin, Jiaxin Yu, Jinfu Chen

Abstract: Modern code generation tools, utilizing AI models like Large Language Models (LLMs), have gained popularity for producing functional code. However, their usage presents security challenges, often resulting in insecure code merging into the code base. Evaluating the quality of generated code, especially its security, is crucial. While prior research explored various aspects of code generation, the focus on security has been limited, mostly examining code produced in controlled environments rather than real-world scenarios. To address this gap, we conducted an empirical study, analyzing code snippets generated by GitHub Copilot from GitHub projects. Our analysis identified 452 snippets generated by Copilot, revealing a high likelihood of security issues, with 32.8% of Python and 24.5% of JavaScript snippets affected. These issues span 38 different Common Weakness Enumeration (CWE) categories, including significant ones like CWE-330: Use of Insufficiently Random Values, CWE-78: OS Command Injection, and CWE-94: Improper Control of Generation of Code. Notably, eight CWEs are among the 2023 CWE Top-25, highlighting their severity. Our findings confirm that developers should be careful when adding code generated by Copilot and should also run appropriate security checks as they accept the suggested code. It also shows that practitioners should cultivate corresponding security awareness and skills.

Submitted 4 April, 2024; v1 submitted 3 October, 2023; originally announced October 2023.

arXiv:2309.06161 [pdf]

Towards an Understanding of Developers' Perceptions of Transparency in Software Development: A Preliminary Study

Authors: Humphrey O. Obie, Juliet Ukwella, Kashumi Madampe, John Grundy, Mojtaba Shahin

Abstract: Software applications play an increasingly critical role in various aspects of our lives, from communication and entertainment to business and healthcare. As these applications become more pervasive, the importance of considering human values in software development has gained significant attention. In this preliminary study, we investigate developers' perceptions and experiences related to human values, with a focus on the human value of transparency. We interviewed five experienced developers and conducted thematic analysis to explore how developers perceive transparency, violations of transparency, and the process of fixing reported violations of transparency. Our findings reveal the significance of transparency as a fundamental value in software development, with developers recognising its importance for building trust, promoting accountability, and fostering ethical practices. Developers recognise the negative consequences of the violation of the human value of transparency and follow a systematic process to fix reported violations. This includes investigation, root cause analysis, corrective action planning, collaborative problem-solving, and testing and verification. These preliminary findings contribute to the understanding of transparency in software development and provide insights for promoting ethical practices.

Submitted 12 September, 2023; originally announced September 2023.

Comments: 2023 Workshop on Human Centric Software Engineering and Cyber Security

arXiv:2307.02326 [pdf, other]

Security Defect Detection via Code Review: A Study of the OpenStack and Qt Communities

Authors: Jiaxin Yu, Liming Fu, Peng Liang, Amjed Tahir, Mojtaba Shahin

Abstract: Background: Despite the widespread use of automated security defect detection tools, software projects still contain many security defects that could result in serious damage. Such tools are largely context-insensitive and may not cover all possible scenarios in testing potential issues, which makes them susceptible to missing complex security defects. Hence, thorough detection entails a synergistic cooperation between these tools and human-intensive detection techniques, including code review. Code review is widely recognized as a crucial and effective practice for identifying security defects. Aim: This work aims to empirically investigate security defect detection through code review. Method: To this end, we conducted an empirical study by analyzing code review comments derived from four projects in the OpenStack and Qt communities. Through manually checking 20,995 review comments obtained by keyword-based search, we identified 614 comments as security-related. Results: Our results show that (1) security defects are not prevalently discussed in code review, (2) more than half of the reviewers provided explicit fixing strategies/solutions to help developers fix security defects, (3) developers tend to follow reviewers' suggestions and action the changes, (4) Not worth fixing the defect now and Disagreement between the developer and the reviewer are the main causes for not resolving security defects. Conclusions: Our research results demonstrate that (1) software security practices should combine manual code review with automated detection tools, achieving a more comprehensive coverage to identifying and addressing security defects, and (2) promoting appropriate standardization of practitioners' behaviors during code review remains necessary for enhancing software security.

Submitted 5 July, 2023; originally announced July 2023.

Comments: The 17th ACM/IEEE International Symposium on Empirical Software Engineering and Measurement (ESEM)

arXiv:2303.16140 [pdf]

Machine learning tools to improve nonlinear modeling parameters of RC columns

Authors: Hamid Khodadadi Koodiani, Elahe Jafari, Arsalan Majlesi, Mohammad Shahin, Adolfo Matamoros, Adel Alaeddini

Abstract: Modeling parameters are essential to the fidelity of nonlinear models of concrete structures subjected to earthquake ground motions, especially when simulating seismic events strong enough to cause collapse. This paper addresses two of the most significant barriers to improving nonlinear modeling provisions in seismic evaluation standards using experimental data sets: identifying the most likely mode of failure of structural components, and implementing data fitting techniques capable of recognizing interdependencies between input parameters and nonlinear relationships between input parameters and model outputs. Machine learning tools in the scikit-learn and PyTorch libraries were used to calibrate equations and black-box numerical models for nonlinear modeling parameters (MP) a and b of reinforced concrete columns defined in the ASCE 41 and ACI 369.1 standards, and to estimate their most likely mode of failure. It was found that machine learning regression models and machine learning black-boxes were more accurate than current provisions in the ACI 369.1/ASCE 41 Standards. Among the regression models, Regularized Linear Regression was the most accurate for estimating MP a, and Polynomial Regression was the most accurate for estimating MP b. The two black-box models evaluated, namely the Gaussian Process Regression and the Neural Network (NN), provided the most accurate estimates of MPs a and b. The NN model was the most accurate machine learning tool of all evaluated. A multi-class classification tool from the scikit-learn machine learning library correctly identified column mode of failure with 79% accuracy for rectangular columns and with 81% accuracy for circular columns, a substantial improvement over the classification rules in ASCE 41-13.

Submitted 8 March, 2023; originally announced March 2023.

arXiv:2303.09808 [pdf, other]

A Study of Gender Discussions in Mobile Apps

Authors: Mojtaba Shahin, Mansooreh Zahedi, Hourieh Khalajzadeh, Ali Rezaei Nasab

Abstract: Mobile software apps ("apps") are one of the prevailing digital technologies that our modern life heavily depends on. A key issue in the development of apps is how to design gender-inclusive apps. Apps that do not consider gender inclusion, diversity, and equality in their design can create barriers (e.g., excluding some of the users because of their gender) for their diverse users. While there have been some efforts to develop gender-inclusive apps, a lack of deep understanding regarding user perspectives on gender may prevent app developers and owners from identifying issues related to gender and proposing solutions for improvement. Users express many different opinions about apps in their reviews, from sharing their experiences, and reporting bugs, to requesting new features. In this study, we aim at unpacking gender discussions about apps from the user perspective by analysing app reviews. We first develop and evaluate several Machine Learning (ML) and Deep Learning (DL) classifiers that automatically detect gender reviews (i.e., reviews that contain discussions about gender). We apply our ML and DL classifiers on a manually constructed dataset of 1,440 app reviews from the Google App Store, composing 620 gender reviews and 820 non-gender reviews. Our best classifier achieves an F1-score of 90.77%. Second, our qualitative analysis of a randomly selected 388 out of 620 gender reviews shows that gender discussions in app reviews revolve around six topics: App Features, Appearance, Content, Company Policy and Censorship, Advertisement, and Community. Finally, we provide some practical implications and recommendations for developing gender-inclusive apps.

Submitted 17 March, 2023; originally announced March 2023.

Comments: Preprint. Accepted for publication in 2023 IEEE/ACM 20th International Conference on Mining Software Repositories (MSR)

arXiv:2302.01894 [pdf, other]

Understanding the Issues, Their Causes and Solutions in Microservices Systems: An Empirical Study

Authors: Muhammad Waseem, Peng Liang, Aakash Ahmad, Arif Ali Khan, Mojtaba Shahin, Pekka Abrahamsson, Ali Rezaei Nasab, Tommi Mikkonen

Abstract: Many small to large organizations have adopted the Microservices Architecture (MSA) style to develop and deliver their core businesses. Despite the popularity of MSA in the software industry, there is a limited evidence-based and thorough understanding of the types of issues (e.g., errors, faults, failures, and bugs) that microservices system developers experience, the causes of the issues, and the solutions as potential fixing strategies to address the issues. To address this gap, we conducted a mixed-methods empirical study that collected data from 2,641 issues from the issue tracking systems of 15 open-source microservices systems on GitHub, 15 interviews, and an online survey completed by 150 practitioners from 42 countries across 6 continents. Our analysis led to comprehensive taxonomies for the issues, causes, and solutions. The findings of this study indicate that Technical Debt, Continuous Integration and Delivery, Exception Handling, Service Execution and Communication, and Security are the most dominant issues in microservices systems. Furthermore, General Programming Errors, Missing Features and Artifacts, and Invalid Configuration and Communication are the main causes behind the issues. Finally, we found 177 types of solutions that can be applied to fix the identified issues. Based on our study results, we formulated future research directions that could help researchers and practitioners to engineer emergent and next-generation microservices systems.

Submitted 11 July, 2023; v1 submitted 3 February, 2023; originally announced February 2023.

Comments: 35 pages, 5 images, 7 tables, Manuscript submitted to a Journal (2023)

arXiv:2301.00943 [pdf, other]

Characterizing Architecture Related Posts and Their Usefulness in Stack Overflow

Authors: Musengamana Jean de Dieu, Peng Liang, Mojtaba Shahin, Arif Ali Khan

Abstract: Context: Stack Overflow (SO) has won the attention of software engineers (e.g., architects) as a place to learn, practice, and utilize development knowledge, such as Architectural Knowledge (AK). But little is known about AK communicated in SO, which is a type of high-level but important knowledge in development. Objective: This study aims to investigate the AK in SO posts in terms of their categories and characteristics as well as their usefulness from the point of view of SO users. Method: We conducted an exploratory study by qualitatively analyzing a statistically representative sample of 968 Architecture Related Posts (ARPs) from SO. Results: The main findings are: (1) architecture related questions can be classified into 9 core categories, in which "architecture configuration" is the most common category, followed by the "architecture decision" category, and (2) architecture related questions that provide clear descriptions together with architectural diagrams increase their likelihood of getting more than one answer, while poorly structured architecture questions tend to only get one answer. Conclusions: Our findings suggest that future research can focus on enabling automated approaches and tools that could facilitate the search and (re)use of AK in SO. SO users can refer to our proposed guidelines to compose architecture related questions with the likelihood of getting more responses in SO.

Submitted 5 January, 2023; v1 submitted 2 January, 2023; originally announced January 2023.

Comments: Preprint accepted for publication in Journal of Systems and Software, 2023

arXiv:2212.13866 [pdf, other]

Architecture Decisions in AI-based Systems Development: An Empirical Study

Authors: Beiqi Zhang, Tianyang Liu, Peng Liang, Chong Wang, Mojtaba Shahin, Jiaxin Yu

Abstract: Artificial Intelligence (AI) technologies have been developed rapidly, and AI-based systems have been widely used in various application domains with opportunities and challenges. However, little is known about the architecture decisions made in AI-based systems development, which has a substantial impact on the success and sustainability of these systems. To this end, we conducted an empirical study by collecting and analyzing the data from Stack Overflow (SO) and GitHub. More specifically, we searched on SO with six sets of keywords and explored 32 AI-based projects on GitHub, and finally we collected 174 posts and 128 GitHub issues related to architecture decisions. The results show that in AI-based systems development (1) architecture decisions are expressed in six linguistic patterns, among which Solution Proposal and Information Giving are most frequently used, (2) Technology Decision, Component Decision, and Data Decision are the main types of architecture decisions made, (3) Game is the most common application domain among the eighteen application domains identified, (4) the dominant quality attribute considered in architecture decision-making is Performance, and (5) the main limitations and challenges encountered by practitioners in making architecture decisions are Design Issues and Data Issues. Our results suggest that the limitations and challenges when making architecture decisions in AI-based systems development are highly specific to the characteristics of AI-based systems and are mainly of technical nature, which need to be properly confronted.

Submitted 28 December, 2022; originally announced December 2022.

Comments: The 30th IEEE International Conference on Software Analysis, Evolution, and Reengineering (SANER)

arXiv:2212.13179 [pdf, other]

Mining Architectural Information: A Systematic Mapping Study

Authors: Musengamana Jean de Dieu, Peng Liang, Mojtaba Shahin, Chen Yang, Zengyang Li

Abstract: Mining Software Repositories (MSR) has become an essential activity in software development. Mining architectural information to support architecting activities, such as architecture understanding, has received significant attention in recent years. However, there is a lack of clarity on what literature on mining architectural information is available. Consequently, this may create difficulty for practitioners to understand and adopt the state-of-the-art research results, such as what approaches should be adopted to mine what architectural information in order to support architecting activities. It also hinders researchers from being aware of the challenges and remedies for the identified research gaps. We aim to identify, analyze, and synthesize the literature on mining architectural information in terms of architectural information and sources mined, architecting activities supported, approaches and tools used, and challenges faced. A Systematic Mapping Study (SMS) has been conducted on the literature published between January 2006 and December 2022. Of the 104 primary studies selected, 7 categories of architectural information have been mined, among which architectural description is the most mined architectural information; 11 categories of sources have been leveraged for mining architectural information, among which version control system is the most popular source; 11 architecting activities can be supported by the mined architectural information, among which architecture understanding is the most supported activity; 95 approaches and 56 tools were proposed and employed in mining architectural information; and 4 types of challenges in mining architectural information were identified. This SMS provides researchers with future directions and helps practitioners be aware of what approaches and tools can be used to mine what architectural information from what sources to support various architecting activities.

Submitted 4 April, 2024; v1 submitted 26 December, 2022; originally announced December 2022.

Comments: Preprint accepted for publication in Empirical Software Engineering, 2024

arXiv:2211.07769 [pdf, other]

Improving Children's Speech Recognition by Fine-tuning Self-supervised Adult Speech Representations

Authors: Renee Lu, Mostafa Shahin, Beena Ahmed

Abstract: Children's speech recognition is a vital, yet largely overlooked domain when building inclusive speech technologies. The major challenge impeding progress in this domain is the lack of adequate child speech corpora; however, recent advances in self-supervised learning have created a new opportunity for overcoming this problem of data scarcity. In this paper, we leverage self-supervised adult speech representations and use three well-known child speech corpora to build models for children's speech recognition. We assess the performance of fine-tuning on both native and non-native children's speech, examine the effect of cross-domain child corpora, and investigate the minimum amount of child speech required to fine-tune a model which outperforms a state-of-the-art adult model. We also analyze speech recognition performance across children's ages. Our results demonstrate that fine-tuning with cross-domain child corpora leads to relative improvements of up to 46.08% and 45.53% for native and non-native child speech respectively, and absolute improvements of 14.70% and 31.10%. We also show that with as little as 5 hours of transcribed children's speech, it is possible to fine-tune a children's speech recognition system that outperforms a state-of-the-art adult model fine-tuned on 960 hours of adult speech.

Submitted 14 November, 2022; originally announced November 2022.

Comments: Under review at Speech Communication Journal

arXiv:2211.07142 [pdf, other]

Automated Detection, Categorisation and Developers' Experience with the Violations of Honesty in Mobile Apps

Authors: Humphrey O. Obie, Hung Du, Kashumi Madampe, Mojtaba Shahin, Idowu Ilekura, John Grundy, Li Li, Jon Whittle, Burak Turhan, Hourieh Khalajzadeh

Abstract: Human values such as honesty, social responsibility, fairness, privacy, and the like are things considered important by individuals and society. Software systems, including mobile software applications (apps), may ignore or violate such values, leading to negative effects in various ways for individuals and society. While some works have investigated different aspects of human values in software engineering, this mixed-methods study focuses on honesty as a critical human value. In particular, we studied (i) how to detect honesty violations in mobile apps, (ii) the types of honesty violations in mobile apps, and (iii) the perspectives of app developers on these detected honesty violations. We first develop and evaluate 7 machine learning (ML) models to automatically detect violations of the value of honesty in app reviews from an end user perspective. The most promising was a Deep Neural Network model with an F1 score of 0.921. We then conducted a manual analysis of 401 reviews containing honesty violations and characterised honesty violations in mobile apps into 10 categories: unfair cancellation and refund policies; false advertisements; delusive subscriptions; cheating systems; inaccurate information; unfair fees; no service; deletion of reviews; impersonation; and fraudulent looking apps. A developer survey and interview study with mobile developers then identified 7 key causes behind honesty violations in mobile apps and 8 strategies to avoid or fix such violations. The findings of our developer study also articulate the negative consequences that honesty violations might bring for businesses, developers, and users. Finally, the app developers' feedback shows that our prototype ML-based models can have promising benefits in practice.

Submitted 14 November, 2022; originally announced November 2022.

Comments: Submitted to Empirical Software Engineering Journal. arXiv admin note: substantial text overlap with arXiv:2203.07547

arXiv:2210.10231 [pdf, other]

Speaker- and Age-Invariant Training for Child Acoustic Modeling Using Adversarial Multi-Task Learning

Authors: Mostafa Shahin, Beena Ahmed, Julien Epps

Abstract: One of the major challenges in acoustic modelling of child speech is the rapid changes that occur in the children's articulators as they grow up, their differing growth rates and the subsequent high variability in the same age group. These high acoustic variations along with the scarcity of child speech corpora have impeded the development of a reliable speech recognition system for children. In this paper, a speaker- and age-invariant training approach based on adversarial multi-task learning is proposed. The system consists of one generator shared network that learns to generate speaker- and age-invariant features connected to three discrimination networks, for phoneme, age, and speaker. The generator network is trained to minimize the phoneme-discrimination loss and maximize the speaker- and age-discrimination losses in an adversarial multi-task learning fashion. The generator network is a Time Delay Neural Network (TDNN) architecture while the three discriminators are feed-forward networks. The system was applied to the OGI speech corpora and achieved a 13% reduction in the WER of the ASR.

Submitted 6 November, 2022; v1 submitted 18 October, 2022; originally announced October 2022.

Comments: Submitted to ICASSP2023

arXiv:2209.14055 [pdf, other]

Dealing with Data Challenges when Delivering Data-Intensive Software Solutions

Authors: Ulrike M. Graetsch, Hourieh Khalajzadeh, Mojtaba Shahin, Rashina Hoda, John Grundy

Abstract: The predicted increase in demand for data-intensive solution development is driving the need for software, data, and domain experts to effectively collaborate in multi-disciplinary data-intensive software teams (MDSTs). We conducted a socio-technical grounded theory study through interviews with 24 practitioners in MDSTs to better understand the challenges these teams face when delivering data-intensive software solutions. The interviews provided perspectives across different types of roles including domain, data and software experts, and covered different organisational levels from team members, team managers to executive leaders. We found that the key concern for these teams is dealing with data-related challenges. In this paper, we present the theory of dealing with data challenges that explains the challenges faced by MDSTs including gaining access to data, aligning data, understanding data, and resolving data quality issues; the context in and condition under which these challenges occur, the causes that lead to the challenges, and the related consequences such as having to conduct remediation activities, inability to achieve expected outcomes and lack of trust in the delivered solutions. We also identified contingencies or strategies applied to address the challenges including high-level strategic approaches such as implementing data governance, implementing new tools and techniques such as data quality visualisation and monitoring tools, as well as building stronger teams by focusing on people dynamics, communication skill development and cross-skilling. Our findings have direct implications for practitioners and researchers to better understand the landscape of data challenges and how to deal with them.

Submitted 24 March, 2023; v1 submitted 27 September, 2022; originally announced September 2022.

Comments: Submitted to IEEE Transactions on Software Engineering, 22 pages, 4 Figures, 1 Table

arXiv:2205.05043 [pdf, ps, other]

doi 10.18293/SEKE2022-171

Decisions in Continuous Integration and Delivery: An Exploratory Study

Authors: Yajing Luo, Peng Liang, Mojtaba Shahin, Zengyang Li, Chen Yang

Abstract: In recent years, Continuous Integration (CI) and Continuous Delivery (CD) have been heatedly discussed and widely used in part or all of the software development life cycle as the practices and pipeline to deliver software products in an efficient way. There are many tools, such as Travis CI, that offer various features to support the CI/CD pipeline, but there is a lack of understanding about what decisions are frequently made in CI/CD. In this work, we explored one popular open-source project on GitHub, Budibase, to provide insights on the types of decisions made in CI/CD from a practitioners' perspective. We first explored the GitHub Trending page, conducted a pilot repository extraction, and identified the Budibase repository as the case for our study. We then crawled all the closed issues from the repository and got 1,168 closed issues. Irrelevant issues were filtered out based on certain criteria, and 370 candidate issues that contain decisions were obtained for data extraction. We analyzed the issues using a hybrid approach combining pre-defined types and the Constant Comparison method to get the categories of decisions. The results show that the major type of decisions in the Budibase closed issues is Functional Requirement Decision (67.6%), followed by Architecture Decision (11.1%). Our findings encourage developers to put more effort into the issues and into making decisions related to CI/CD, and provide researchers with a reference classification of decisions made in CI/CD.

Submitted 10 May, 2022; originally announced May 2022.

Comments: The 34th International Conference on Software Engineering and Knowledge Engineering (SEKE)

arXiv:2203.12212 [pdf, other]

Supporting Developers in Addressing Human-centric Issues in Mobile Apps

Authors: Hourieh Khalajzadeh, Mojtaba Shahin, Humphrey O. Obie, Pragya Agrawal, John Grundy

Abstract: Failure to consider the characteristics, limitations, and abilities of diverse end-users during mobile apps development may lead to problems for end-users such as accessibility and usability issues. We refer to this class of problems as human-centric issues. Despite their importance, there is a limited understanding of the types of human-centric issues that are encountered by end-users and taken into account by the developers of mobile apps. In this paper, we examine what human-centric issues end-users report through Google App Store reviews, which human-centric issues are a topic of discussion for developers on GitHub, and whether end-users and developers discuss the same human-centric issues. We then investigate whether an automated tool might help detect such human-centric issues and whether developers would find such a tool useful. To do this, we conducted an empirical study by extracting and manually analysing a random sample of 1,200 app reviews and 1,200 issue comments from 12 diverse projects that exist on both Google App Store and GitHub. Our analysis led to a taxonomy of human-centric issues that categorises human-centric issues into three high-level categories: App Usage, Inclusiveness, and User Reaction. We then developed machine learning and deep learning models that are promising in automatically identifying and classifying human-centric issues from app reviews and developer discussions. A survey of mobile app developers shows that the automated detection of human-centric issues has practical applications. Guided by our findings, we highlight some implications and possible future work to further understand and incorporate human-centric issues in mobile apps development.

Submitted 3 October, 2022; v1 submitted 23 March, 2022; originally announced March 2022.

arXiv:2203.10551 [pdf, other]

Human Values Violations in Stack Overflow: An Exploratory Study

Authors: Sara Krishtul, Mojtaba Shahin, Humphrey O. Obie, Hourieh Khalajzadeh, Fan Gai, Ali Rezaei Nasab, John Grundy

Abstract: A growing number of software-intensive systems are being accused of violating or ignoring human values (e.g., privacy, inclusion, and social responsibility), and this poses great difficulties to individuals and society. Such violations often occur due to the solutions employed and decisions made by developers of such systems that are misaligned with user values. Stack Overflow is the most popular Q&A website among developers to share their issues, solutions (e.g., code snippets), and decisions during software development. We conducted an exploratory study to investigate the occurrence of human values violations in Stack Overflow posts. As comments under posts are often used to point out the possible issues and weaknesses of the posts, we analyzed 2000 Stack Overflow comments and their corresponding posts (1980 unique questions or answers) to identify the types of human values violations and the reactions of Stack Overflow users to such violations. Our study finds that 315 out of 2000 comments contain concerns indicating their associated posts (313 unique posts) violate human values. Leveraging Schwartz's theory of basic human values as the most widely used values model, we show that hedonism and benevolence are the most violated value categories. We also find the reaction of Stack Overflow commenters to perceived human values violations is very quick, yet the majority of posts (76.35%) accused of human values violation do not get downvoted at all. Finally, we find that the original posters rarely react to the concerns of potential human values violations by editing their posts. At the same time, they usually are receptive when responding to these comments in follow-up comments of their own.

Submitted 20 March, 2022; originally announced March 2022.

Comments: Preprint. Accepted for publication in 25th International Conference on Evaluation and Assessment in Software Engineering (EASE2022)

arXiv:2203.10382 [pdf, other]

Investigating End-Users' Values in Agriculture Mobile Applications Development: An Empirical Study on Bangladeshi Female Farmers

Authors: Rifat Ara Shams, Mojtaba Shahin, Gillian Oliver, Harsha Perera, Jon Whittle, Arif Nurwidyantoro, Waqar Hussain

Abstract: The omnipresent nature of mobile applications (apps) in all aspects of daily lives raises the necessity of reflecting end-users' values (e.g., fairness, honesty, etc.) in apps. However, there are limited considerations of end-users' values in apps development. Value violations by apps have been reported in the media and are responsible for end-users' dissatisfaction and negative socio-economic consequences. Value violations may bring more severe and lasting problems for marginalized and vulnerable end-users of apps, which have been explored less (if at all) in the software engineering community. However, understanding the values of the end-users of apps is the essential first step towards values-based apps development. This research aims to fill this gap by investigating the human values of Bangladeshi female farmers as a marginalized and vulnerable group of end-users of Bangladeshi agriculture apps. We conducted an empirical study that collected and analyzed data from a survey with 193 Bangladeshi female farmers to explore the underlying factor structure of the values of Bangladeshi female farmers and the significance of demographics on their values. The results identified three underlying factors of Bangladeshi female farmers. The first factor comprises five values: benevolence, security, conformity, universalism, and tradition. The second factor consists of two values: self-direction and stimulation. The third factor includes three values: power, achievement, and hedonism. We also identified strong influences of demographics on some of the values of Bangladeshi female farmers. For example, area has significant impacts on three values: hedonism, achievement, and tradition. Similarly, there are also strong influences of household income on power and security.

Submitted 19 March, 2022; originally announced March 2022.

Comments: 44 pages, 7 figures, 8 tables, Journal of Systems and Software

arXiv:2203.07547 [pdf]

On the Violation of Honesty in Mobile Apps: Automated Detection and Categories

Authors: Humphrey O. Obie, Idowu Ilekura, Hung Du, Mojtaba Shahin, John Grundy, Li Li, Jon Whittle, Burak Turhan

Abstract: Human values such as integrity, privacy, curiosity, security, and honesty are guiding principles for what people consider important in life. Such human values may be violated by mobile software applications (apps), and the negative effects of such human value violations can be seen in various ways in society. In this work, we focus on the human value of honesty. We present a model to support the automatic identification of violations of the value of honesty from app reviews from an end-user perspective. Beyond the automatic detection of honesty violations by apps, we also aim to better understand different categories of honesty violations expressed by users in their app reviews. The result of our manual analysis of our honesty violations dataset shows that honesty violations can be characterised into ten categories: unfair cancellation and refund policies; false advertisements; delusive subscriptions; cheating systems; inaccurate information; unfair fees; no service; deletion of reviews; impersonation; and fraudulent-looking apps. Based on these results, we argue for a conscious effort in developing more honest software artefacts including mobile apps, and the promotion of honesty as a key value in software development practices. Furthermore, we discuss the role of app distribution platforms as enforcers of ethical systems supporting human values, and highlight some proposed next steps for human values in software engineering (SE) research.

Submitted 14 March, 2022; originally announced March 2022.

Comments: 12 pages, Accepted for publication in 2022 IEEE/ACM 19th International Conference on Mining Software Repositories (MSR)

arXiv:2201.05927 [pdf, other]

How are Diverse End-user Human-centric Issues Discussed on GitHub?

Authors: Hourieh Khalajzadeh, Mojtaba Shahin, Humphrey O. Obie, John Grundy

Abstract: Many software systems fail to meet the needs of the diverse end-users in society and are prone to pose problems, such as accessibility and usability issues. Some of these problems (partially) stem from the failure to consider the characteristics, limitations, and abilities of diverse end-users during software development. We refer to this class of problems as human-centric issues. Despite their importance, there is a limited understanding of the types of human-centric issues encountered by developers. In-depth knowledge of these human-centric issues is needed to design software systems that better meet their diverse end-users' needs. This paper aims to provide insights for the software development and research communities on which human-centric issues are a topic of discussion for developers on GitHub. We conducted an empirical study by extracting and manually analysing 1,691 issue comments from 12 diverse projects, ranging from small to large-scale projects, including projects designed for challenged end-users, e.g., visually impaired and dyslexic users. Our analysis shows that eight categories of human-centric issues are discussed by developers. These include Inclusiveness, Privacy & Security, Compatibility, Location & Language, Preference, Satisfaction, Emotional Aspects, and Accessibility. Guided by our findings, we highlight some implications and possible future paths to further understand and incorporate human-centric issues in software development to be able to design software that meets the needs of diverse end users in society.

Submitted 15 January, 2022; originally announced January 2022.

arXiv:2201.05825 [pdf, other]

Decision Models for Selecting Patterns and Strategies in Microservices Systems and their Evaluation by Practitioners

Authors: Muhammad Waseem, Peng Liang, Aakash Ahmad, Mojtaba Shahin, Arif Ali Khan, Gastón Márquez

Abstract: Researchers and practitioners have recently proposed many Microservices Architecture (MSA) patterns and strategies covering various aspects of microservices system life cycle, such as service design and security. However, selecting and implementing these patterns and strategies can entail various challenges for microservices practitioners. To this end, this study proposes decision models for selecting patterns and strategies covering four MSA design areas: application decomposition into microservices, microservices security, microservices communication, and service discovery. We used peer-reviewed and grey literature to identify the patterns, strategies, and quality attributes for creating these decision models. To evaluate the familiarity, understandability, completeness, and usefulness of the decision models, we conducted semi-structured interviews with 24 microservices practitioners from 12 countries across five continents. Our evaluation results show that the practitioners found the decision models as an effective guide to select microservices patterns and strategies.

Submitted 15 January, 2022; originally announced January 2022.

Comments: The 44th International Conference on Software Engineering (ICSE) SEIP Track. arXiv admin note: text overlap with arXiv:2110.03889

arXiv:2112.14927 [pdf, other]

An Empirical Study of Security Practices for Microservices Systems

Authors: Ali Rezaei Nasab, Mojtaba Shahin, Seyed Ali Hoseyni Raviz, Peng Liang, Amir Mashmool, Valentina Lenarduzzi

Abstract: Despite the numerous benefits of microservices systems, security has been a critical issue in such systems. Several factors explain this difficulty, including a knowledge gap among microservices practitioners on properly securing a microservices system. To (partially) bridge this gap, we conducted an empirical study. We first manually analyzed 861 microservices security points, including 567 issues, 9 documents, and 3 wiki pages from 10 GitHub open-source microservices systems and 306 Stack Overflow posts concerning security in microservices systems. In this study, a microservices security point is referred to as "a GitHub issue, a Stack Overflow post, a document, or a wiki page that entails 5 or more microservices security paragraphs". Our analysis led to a catalog of 28 microservices security practices. We then ran a survey with 74 microservices practitioners to evaluate the usefulness of these 28 practices. Our findings demonstrate that the survey respondents affirmed the usefulness of the 28 practices. We believe that the catalog of microservices security practices can serve as a valuable resource for microservices practitioners to more effectively address security issues in microservices systems. It can also inform the research community of the required or less explored areas to develop microservices-specific security practices and tools.

Submitted 18 November, 2022; v1 submitted 30 December, 2021; originally announced December 2021.

Comments: Preprint accepted for publication in Journal of Systems and Software, 2022

arXiv:2112.10920 [pdf, other]

How Do Developers Search for Architectural Information? An Industrial Survey

Authors: Musengamana Jean de Dieu, Peng Liang, Mojtaba Shahin

Abstract: Building software systems often requires knowledge and skills beyond what developers already possess. In such cases, developers have to leverage different sources of information to seek help. A growing number of researchers and practitioners have started investigating what programming-related information developers seek during software development. However, being a high level and a type of the most important development-related information, architectural information search activity is seldom explored. To fill this gap, we conducted an industrial survey completed by 103 participants to understand how developers search for architectural information to solve their architectural problems in development. Our main findings are: (1) searching for architectural information to learn about the pros and cons of certain architectural solutions (e.g., patterns, tactics) and to make an architecture decision among multiple choices are the most frequent purposes or tasks; (2) developers find difficulties mostly in getting relevant architectural information for addressing quality concerns and making design decisions among multiple choices when seeking architectural information; (3) taking too much time to go through architectural information retrieved from various sources and feeling overwhelmed due to the dispersion and abundance of architectural information in various sources are the top two major challenges developers face when searching for architectural information. Our findings (1) provide researchers with future directions, such as the design and development of approaches and tools for searching architectural information from multiple sources, and (2) can be used to provide guidelines for practitioners to refer to when seeking architectural information and providing architectural information that could be considered useful.

Submitted 20 December, 2021; originally announced December 2021.

Comments: The 19th IEEE International Conference on Software Architecture (ICSA)

arXiv:2111.15293 [pdf, other]

The Impact of Considering Human Values during Requirements Engineering Activities

Authors: Harsha Perera, Rashina Hoda, Rifat Ara Shams, Arif Nurwidyantoro, Mojtaba Shahin, Waqar Hussain, Jon Whittle

Abstract: Human values, or what people hold important in their life, such as freedom, fairness, and social responsibility, often remain unnoticed and unattended during software development. Ignoring values can lead to values violations in software that can result in financial losses, reputation damage, and widespread social and legal implications. However, embedding human values in software is not only non-trivial but also generally an unclear process. Commencing as early as during the Requirements Engineering (RE) activities promises to ensure fit-for-purpose and quality software products that adhere to human values. But what is the impact of considering human values explicitly during early RE activities? To answer this question, we conducted a scenario-based survey where 56 software practitioners contextualised requirements analysis towards a proposed mobile application for the homeless and suggested values-laden software features accordingly. The suggested features were qualitatively analysed. Results show that explicit considerations of values can help practitioners identify applicable values, associate purpose with the features they develop, think outside-the-box, and build connections between software features and human values. Finally, drawing from the results and experiences of this study, we propose a scenario-based values elicitation process -- a simple four-step takeaway as a practical implication of this study.

Submitted 30 November, 2021; originally announced November 2021.

Comments: 17 pages, 8 images, 5 tables

arXiv:2110.05150 [pdf, other]

Human Values in Mobile App Development: An Empirical Study on Bangladeshi Agriculture Mobile Apps

Authors: Rifat Ara Shams, Mojtaba Shahin, Gillian Oliver, Jon Whittle, Waqar Hussain, Harsha Perera, Arif Nurwidyantoro

Abstract: Given the ubiquity of mobile applications (apps) in daily lives, understanding and reflecting end-users' human values (e.g., transparency, privacy, social recognition etc.) in apps has become increasingly important. Violations of end users' values by software applications have been reported in the media and have resulted in a wide range of difficulties for end users. Value violations may bring more and lasting problems for marginalized and vulnerable groups of end-users. This research aims to understand the extent to which the values of Bangladeshi female farmers, marginalized and vulnerable end-users, who are less studied by the software engineering community, are reflected in agriculture apps in Bangladesh. Further to this, we aim to identify possible strategies to embed their values in those apps. To this end, we conducted a mixed-methods empirical study consisting of 13 interviews with app practitioners and four focus groups with 20 Bangladeshi female farmers. The accumulated results from the interviews and focus groups identified 22 values of Bangladeshi female farmers, which the participants expect to be reflected in the agriculture apps. Among these 22 values, 15 values (e.g., accuracy, independence) are already reflected and 7 values (e.g., accessibility, pleasure) are ignored/violated in the existing agriculture apps. We also identified 14 strategies (e.g., "applying human-centered approaches to elicit values", "establishing a dedicated team/person for values concerns") to address Bangladeshi female farmers' values in agriculture apps.

Submitted 11 October, 2021; originally announced October 2021.

Comments: 18 pages, 6 figures, Manuscript submitted to IEEE Transactions on Software Engineering (2021)

arXiv:2110.03889 [pdf, other]

A Decision Model for Selecting Patterns and Strategies to Decompose Applications into Microservices

Authors: Muhammad Waseem, Peng Liang, Gastón Márquez, Mojtaba Shahin, Arif Ali Khan, Aakash Ahmad

Abstract: Microservices Architecture (MSA) style is a promising design approach to develop software applications consisting of multiple small and independently deployable services. Over the past few years, researchers and practitioners have proposed many MSA patterns and strategies covering various aspects of microservices design, such as application decomposition. However, selecting appropriate patterns and strategies can entail various challenges for practitioners. To this end, this study proposes a decision model for selecting patterns and strategies to decompose applications into microservices. We used peer-reviewed and grey literature to collect the patterns, strategies, and quality attributes for creating this decision model.

Submitted 8 October, 2021; originally announced October 2021.

Comments: The 19th International Conference on Service Oriented Computing (ICSOC)

arXiv:2110.01832 [pdf, ps, other]

Does Domain Change the Opinion of Individuals on Human Values? A Preliminary Investigation on eHealth Apps End-users

Authors: Humphrey Obie, Mojtaba Shahin, John Grundy, Burak Turhan, Li Li, Waqar Hussain, Jon Whittle

Abstract: The elicitation of end-users' human values - such as freedom, honesty, transparency, etc. - is important in the development of software systems. We carried out two preliminary Q-studies to understand (a) the general human value opinion types of eHealth applications (apps) end-users, (b) the eHealth domain human value opinion types of eHealth apps end-users, and (c) whether there are differences between the general and eHealth domain opinion types. Our early results show three value opinion types using generic value instruments: (1) fun-loving, success-driven and independent end-user, (2) security-conscious, socially-concerned, and success-driven end-user, and (3) benevolent, success-driven, and conformist end-user. Our results also show two value opinion types using domain-specific value instruments: (1) security-conscious, reputable, and honest end-user, and (2) success-driven, reputable and pain-avoiding end-user. Given these results, consideration should be given to domain context in the design and application of values elicitation instruments.

Submitted 5 October, 2021; originally announced October 2021.

Comments: Preprint accepted to appear in 28th Asia-Pacific Software Engineering Conference (APSEC 2021). 5 Pages

arXiv:2110.00812 [pdf]

How Secondary School Girls Perceive Computational Thinking Practices through Collaborative Programming with the Micro:bit

Authors: Mojtaba Shahin, Chris Gonsalvez, Jon Whittle, Chunyang Chen, Li Li, Xin Xia

Abstract: Computational Thinking (CT) has been investigated from different perspectives. This research aims to investigate how secondary school girls perceive CT practices -- the problem-solving practices that students apply while they are engaged in programming -- when using the micro:bit device in a collaborative setting. This study also explores the collaborative programming process of secondary school girls with the micro:bit device. We conducted mixed-methods research with 203 secondary school girls (in the state of Victoria, Australia) and 31 mentors attending a girls-only CT program (OzGirlsCT program). The girls were grouped into 52 teams and collaboratively developed computational solutions around realistic, important problems to them and their communities. We distributed two surveys (with 193 responses each) to the girls. Further, we surveyed the mentors (with 31 responses) who monitored the girls, and collected their observation reports on their teams. Our study indicates that the girls found "debugging" the most difficult type of CT practice to apply, while collaborative practices of CT were the easiest. We found that prior coding experience significantly reduced the difficulty level of only one CT practice - "debugging". Our study also identified six challenges the girls faced and six best practices they adopted when working on their computational solutions.

Submitted 2 October, 2021; originally announced October 2021.

Comments: Preprint accepted for publication in Journal of Systems and Software, Elsevier, 2021. 33 Pages, 8 Tables, 11 Figures

arXiv:2109.09286 [pdf, other]

Pandemic Software Development: The Student Experiences from Developing a COVID-19 Information Dashboard

Authors: Benjamin Koh, Mojtaba Shahin, Annette Ong, Soo Ying Yeap, Priyanka Saxena, Manvendra Singh, Chunyang Chen

Abstract: The COVID-19 pandemic has birthed a wealth of information through many publicly accessible sources, such as news outlets and social media. However, gathering and understanding the content can be difficult due to inaccuracies or inconsistencies between the different sources. To alleviate this challenge in Australia, a team of 48 student volunteers developed an open-source COVID-19 information dashboard to provide accurate, reliable, and real-time COVID-19 information for Australians. The students developed this software while working under legislative restrictions that required social isolation. The goal of this study is to characterize the experiences of the students throughout the project. We conducted an online survey completed by 39 of the volunteering students contributing to the COVID-19 dashboard project. Our results indicate that playing a positive role in the COVID-19 crisis and learning new skills and technologies were the most cited motivating factors for the students to participate in the project. While working on the project, some students struggled to maintain a work-life balance due to working from home. However, the students generally did not express strong sentiment towards general project challenges. The students expressed more strongly that data collection was a significant challenge as it was difficult to collect reliable, accurate, and up-to-date data from various government sources. The students have been able to mitigate these challenges by establishing a systematic data collection process in the team, leveraging frequent and clear communication through text, and appreciating and encouraging each other's efforts. By participating in the project, the students boosted their technical (e.g., front-end development) and non-technical (e.g., task prioritization) skills. Our study discusses several implications for students, educators, and policymakers.

Submitted 14 November, 2021; v1 submitted 19 September, 2021; originally announced September 2021.

Comments: 11 Pages. Accepted for publication in 28th Asia-Pacific Software Engineering Conference (APSEC 2021), IEEE, 2021 (Preprint)

arXiv:2108.06705 [pdf, other]

A Qualitative Study of Architectural Design Issues in DevOps

Authors: Mojtaba Shahin, Ali Rezaei Nasab, Muhammad Ali Babar

Abstract: Software architecture is critical in succeeding with DevOps. However, designing software architectures that enable and support DevOps (DevOps-driven software architectures) is a challenge for organizations. We assert that one of the essential steps towards characterizing DevOps-driven architectures is to understand architectural design issues raised in DevOps. At the same time, some of the architectural issues that emerge in the DevOps context (and their corresponding architectural practices or tactics) may stem from the context (i.e., domain) and characteristics of software organizations. To this end, we conducted a mixed-methods study that consists of a qualitative case study of two teams in a company during their DevOps transformation and a content analysis of Stack Overflow and DevOps Stack Exchange posts to understand architectural design issues in DevOps. Our study found eight specific and contextual architectural design issues faced by the two teams and classified architectural design issues discussed in Stack Overflow and DevOps Stack Exchange into 11 groups. Our aggregated results reveal that the main characteristics of DevOps-driven architectures are: being loosely coupled and prioritizing deployability, testability, supportability, and modifiability over other quality attributes. Finally, we discuss some concrete implications for research and practice.

Submitted 12 November, 2021; v1 submitted 15 August, 2021; originally announced August 2021.

Comments: Preprint accepted for publication in Journal of Software: Evolution and Process, 2021. 38 Pages, 6 Tables, 11 Figures. This article is an extended version of the ICSSP2020 paper (the preprint is available at arXiv:2003.06108). arXiv admin note: text overlap with arXiv:2003.06108

arXiv:2108.05624 [pdf, other]

doi 10.1109/ACCESS.2022.3190975

Operationalizing Human Values in Software Engineering: A Survey

Authors: Mojtaba Shahin, Waqar Hussain, Arif Nurwidyantoro, Harsha Perera, Rifat Shams, John Grundy, Jon Whittle

Abstract: Human values (e.g., pleasure, privacy, and social justice) are what a person or a society considers important. The inability to address them in software-intensive systems can result in numerous undesired consequences (e.g., financial losses) for individuals and communities. Various solutions (e.g., methodologies, techniques) are developed to help "operationalize values in software". The ultimate goal is to ensure building software (better) reflects and respects human values. In this survey, "operationalizing values" is referred to as the process of identifying human values and translating them to accessible and concrete concepts so that they can be implemented, validated, verified, and measured in software. This paper provides a deep understanding of the research landscape on operationalizing values in software engineering, covering 51 primary studies. It also presents an analysis and taxonomy of 51 solutions for operationalizing values in software engineering. Our survey reveals that most solutions attempt to help operationalize values in the early phases (requirements and design) of the software development life cycle. However, the later phases (implementation and testing) and other aspects of software development (e.g., "team organization") still need adequate consideration. We outline implications for research and practice and identify open issues and future research directions to advance this area.

Submitted 25 July, 2022; v1 submitted 12 August, 2021; originally announced August 2021.

Comments: Accepted for publication in IEEE Access Journal, IEEE - 27 Pages - 14 Tables, 7 Figures

arXiv:2108.03384 [pdf, other]

doi 10.1016/j.jss.2021.111061

Design, Monitoring, and Testing of Microservices Systems: The Practitioners' Perspective

Authors: Muhammad Waseem, Peng Liang, Mojtaba Shahin, Amleto Di Salle, Gastón Márquez

Abstract: Context: Microservices Architecture (MSA) has received significant attention in the software industry. However, little empirical evidence exists on design, monitoring, and testing of microservices systems. Objective: This research aims to gain a deep understanding of how microservices systems are designed, monitored, and tested in the industry. Method: A mixed-methods study was conducted with 106 survey responses and 6 interviews from microservices practitioners. Results: The main findings are: (1) a combination of domain-driven design and business capability is the most used strategy to decompose an application into microservices, (2) over half of the participants used architecture evaluation and architecture implementation when designing microservices systems, (3) API gateway and Backend for frontend patterns are the most used MSA patterns, (4) resource usage and load balancing as monitoring metrics, log management and exception tracking as monitoring practices are widely used, (5) unit and end-to-end testing are the most used testing strategies, and (6) the complexity of microservices systems poses challenges for their design, monitoring, and testing, for which there are no dedicated solutions. Conclusions: Our findings reveal that more research is needed to (1) deal with microservices complexity at the design level, (2) handle security in microservices systems, and (3) address the monitoring and testing challenges through dedicated solutions.

Submitted 21 August, 2021; v1 submitted 7 August, 2021; originally announced August 2021.

Comments: Preprint accepted for publication in Journal of Systems and Software, 2021

arXiv:2107.11273 [pdf, other]

Towards a Human Values Dashboard for Software Development: An Exploratory Study

Authors: Arif Nurwidyantoro, Mojtaba Shahin, Michel Chaudron, Waqar Hussain, Harsha Perera, Rifat Ara Shams, Jon Whittle

Abstract: Background: There is a growing awareness of the importance of human values (e.g., inclusiveness, privacy) in software systems. However, there are no practical tools to support the integration of human values during software development. We argue that a tool that can identify human values from software development artefacts and present them to varying software development roles can (partially) address this gap. We refer to such a tool as human values dashboard. Further to this, our understanding of such a tool is limited. Aims: This study aims to (1) investigate the possibility of using a human values dashboard to help address human values during software development, (2) identify possible benefits of using a human values dashboard, and (3) elicit practitioners' needs from a human values dashboard. Method: We conducted an exploratory study by interviewing 15 software practitioners. A dashboard prototype was developed to support the interview process. We applied thematic analysis to analyse the collected data. Results: Our study finds that a human values dashboard would be useful for the development team (e.g., project manager, developer, tester). Our participants acknowledge that development artefacts, especially requirements documents and issue discussions, are the most suitable source for identifying values for the dashboard. Our study also yields a set of high-level user requirements for a human values dashboard (e.g., it shall allow determining values priority of a project). Conclusions: Our study suggests that a values dashboard is potentially used to raise awareness of values and support values-based decision-making in software development. Future work will focus on addressing the requirements and using issue discussions as potential artefacts for the dashboard.

Submitted 23 July, 2021; originally announced July 2021.

Comments: 12 Pages. Accepted to appear in 15th ACM/IEEE International Symposium on Empirical Software Engineering and Measurement (ESEM). Preprint

arXiv:2107.10059 [pdf, other]

Automated Identification of Security Discussions in Microservices Systems: Industrial Surveys and Experiments

Authors: Ali Rezaei Nasab, Mojtaba Shahin, Peng Liang, Mohammad Ehsan Basiri, Seyed Ali Hoseyni Raviz, Hourieh Khalajzadeh, Muhammad Waseem, Amine Naseri

Abstract: Lack of awareness and knowledge of microservices-specific security challenges and solutions often leads to ill-informed security decisions in microservices system development. We claim that identifying and leveraging security discussions scattered in existing microservices systems can partially close this gap. We define security discussion as "a paragraph from developer discussions that includes design decisions, challenges, or solutions relating to security". We first surveyed 67 practitioners and found that securing microservices systems is a unique challenge and that having access to security discussions is useful for making security decisions. The survey also confirms the usefulness of potential tools that can automatically identify such security discussions. We developed fifteen machine/deep learning models to automatically identify security discussions. We applied these models on a manually constructed dataset consisting of 4,813 security discussions and 12,464 non-security discussions. We found that all the models can effectively identify security discussions: an average precision of 84.86%, recall of 72.80%, F1-score of 77.89%, AUC of 83.75% and G-mean 82.77%. DeepM1, a deep learning model, performs the best, achieving above 84% in all metrics and significantly outperforms three baselines. Finally, the practitioners' feedback collected from a validation survey reveals that security discussions identified by DeepM1 have promising applications in practice.

Submitted 21 July, 2021; originally announced July 2021.

Comments: 24 Pages, Accepted to appear in Journal of Systems and Software (JSS), 2021. Preprint

arXiv:2107.07482 [pdf, ps, other]

doi 10.1145/3475716.3475782

Characteristics and Challenges of Low-Code Development: The Practitioners' Perspective

Authors: Yajing Luo, Peng Liang, Chong Wang, Mojtaba Shahin, Jing Zhan

Abstract: Background: In recent years, Low-code development (LCD) is growing rapidly, and Gartner and Forrester have predicted that the use of LCD is very promising. Giant companies, such as Microsoft, Mendix, and Outsystems have also launched their LCD platforms. Aim: In this work, we explored two popular online developer communities, Stack Overflow (SO) and Reddit, to provide insights on the characteristics and challenges of LCD from a practitioners' perspective. Method: We used two LCD related terms to search the relevant posts in SO and extracted 73 posts. Meanwhile, we explored three LCD related subreddits from Reddit and collected 228 posts. We extracted data from these posts and applied the Constant Comparison method to analyze the descriptions, benefits, and limitations and challenges of LCD. For platforms and programming languages used in LCD, implementation units in LCD, supporting technologies of LCD, types of applications developed by LCD, and domains that use LCD, we used descriptive statistics to analyze and present the results. Results: Our findings show that: (1) LCD may provide a graphical user interface for users to drag and drop with little or even no code; (2) the equipment of out-of-the-box units (e.g., APIs and components) in LCD platforms makes them easy to learn and use as well as speeds up the development; (3) LCD is particularly favored in the domains that have the need for automated processes and workflows; and (4) practitioners have conflicting views on the advantages and disadvantages of LCD.
Conclusions: Our findings suggest that researchers should clearly define the terms when they refer to LCD, and developers should consider whether the characteristics of LCD are appropriate for their projects.

Submitted 15 July, 2021; originally announced July 2021.

Comments: 15th ACM/IEEE International Symposium on Empirical Software Engineering and Measurement (ESEM)

arXiv:2104.12192 [pdf, other]

doi 10.1145/3463274.3463337

On the Nature of Issues in Five Open Source Microservices Systems: An Empirical Study

Authors: Muhammad Waseem, Peng Liang, Mojtaba Shahin, Aakash Ahmad, Ali Rezaei Nasab

Abstract: Due to its enormous benefits, the research and industry communities have shown an increasing interest in the Microservices Architecture (MSA) style over the last few years. Despite this, there is a limited evidence-based and thorough understanding of the types of issues (e.g., faults, errors, failures, mistakes) faced by microservices system developers and causes that trigger the issues. Such evidence-based understanding of issues and causes is vital for long-term, impactful, and quality research and practice in the MSA style. To that end, we conducted an empirical study on 1,345 issue discussions extracted from five open source microservices systems hosted on GitHub. Our analysis led to the first of its kind taxonomy of the types of issues in open source microservices systems, informing that the problems originating from Technical debt (321, 23.86%), Build (145, 10.78%), Security (137, 10.18%), and Service execution and communication (119, 8.84%) are prominent. We identified that "General programming errors", "Poor security management", "Invalid configuration and communication", and "Legacy versions, compatibility and dependency" are the predominant causes for the leading four issue categories. Study results streamline a taxonomy of issues, their mapping with underlying causes, and present empirical findings that could facilitate research and development on emerging and next-generation microservices systems.

Submitted 4 May, 2021; v1 submitted 25 April, 2021; originally announced April 2021.

Comments: The 25th International Conference on Evaluation and Assessment in Software Engineering (EASE)

arXiv:2102.12107 [pdf, other]

How Can Human Values Be Addressed in Agile Methods? A Case Study on SAFe

Authors: Waqar Hussain, Mojtaba Shahin, Rashina Hoda, Jon Whittle, Harsha Perera, Arif Nurwidyantoro, Rifat Ara Shams, Gillian Oliver

Abstract: Agile methods are predominantly focused on delivering business values. But can Agile methods be adapted to effectively address and deliver human values such as social justice, privacy, and sustainability in the software they produce? Human values are what an individual or a society considers important in life. Ignoring these human values in software can pose difficulties or risks for all stakeholders (e.g., user dissatisfaction, reputation damage, financial loss). To answer this question, we selected the Scaled Agile Framework (SAFe), one of the most commonly used Agile methods in the industry, and conducted a qualitative case study to identify possible intervention points within SAFe that are the most natural to address and integrate human values in software. We present five high-level empirically-justified sets of interventions in SAFe: artefacts, roles, ceremonies, practices, and culture. We elaborate how some current Agile artefacts (e.g., user story), roles (e.g., product owner), ceremonies (e.g., stand-up meeting), and practices (e.g., business-facing testing) in SAFe can be modified to support the inclusion of human values in software. Further, our study suggests new and exclusive values-based artefacts (e.g., legislative requirement), ceremonies (e.g., values conversation), roles (e.g., values champion), and cultural practices (e.g., induction and hiring) to be introduced in SAFe for this purpose. Guided by our findings, we argue that existing Agile methods can account for human values in software delivery with some evolutionary adaptations.

Submitted 12 November, 2021; v1 submitted 24 February, 2021; originally announced February 2021.

Comments: Preprint - Accepted to be published in IEEE Transactions on Software Engineering (2021), 18 Pages, 5 Figures, 3 Tables

arXiv:2012.10095 [pdf, other]

A First Look at Human Values-Violation in App Reviews

Authors: Humphrey O. Obie, Waqar Hussain, Xin Xia, John Grundy, Li Li, Burak Turhan, Jon Whittle, Mojtaba Shahin

Abstract: Ubiquitous technologies such as mobile software applications (mobile apps) have a tremendous influence on the evolution of the social, cultural, economic, and political facets of life in society. Mobile apps fulfil many practical purposes for users including entertainment, transportation, financial management, etc. Given the ubiquity of mobile apps in the lives of individuals and the consequent effect of these technologies on society, it is essential to consider the relationship between human values and the development and deployment of mobile apps. The many negative consequences of violating human values such as privacy, fairness or social justice by technology have been documented in recent times. If we can detect these violations in a timely manner, developers can look to better address them. To understand the violation of human values in a range of common mobile apps, we analysed 22,119 app reviews from Google Play Store using natural language processing techniques. We base our values violation detection approach on a widely accepted model of human values; the Schwartz theory of basic human values. The results of our analysis show that 26.5% of the reviews contained text indicating user perceived violations of human values. We found that benevolence and self-direction were the most violated value categories, and conformity and tradition were the least violated categories.
Our results also highlight the need for a proactive approach to the alignment of values amongst stakeholders and the use of app reviews as a valuable additional source for mining values requirements.

Submitted 18 December, 2020; originally announced December 2020.

Comments: 10 pages, Accepted for publication in IEEE/ACM 43rd International Conference on Software Engineering: Software Engineering in Society (ICSE-SEIS), IEEE, 2021

arXiv:2012.03746 [pdf]

doi 10.1007/s11365-020-00713-7

The Impact of a STEM-based Entrepreneurship Program on the Entrepreneurial Intention of Secondary School Female Students

Authors: Mojtaba Shahin, Olivia Ilic, Chris Gonsalvez, Jon Whittle

Abstract: Despite dedicated effort and research in the last two decades, the entrepreneurship field is still limited by little evidence-based knowledge of the impacts of entrepreneurship programs on the entrepreneurial intention of students in pre-university levels of study. Further, gender equity continues to be an issue in the entrepreneurial sector, particularly in STEM-focused entrepreneurship. In this context, this study was designed to explore the effects of a one-day female-focused STEM-based entrepreneurship program (for brevity, we call it the OzGirlsEntrepreneurship program) on the entrepreneurial intention of secondary school female students. The study collected data from two surveys completed by 193 secondary school female students, aged 14-16 years, who participated in the OzGirlsEntrepreneurship program. This program encouraged girls to develop and implement creative computational solutions to socially relevant problems, with an Internet of Things (IoT) component using the micro:bit device. The findings reveal that a key factor in the development of entrepreneurial attitudes in young female students is associated with soft-skills development, particularly in the areas of creative thinking, risk-taking, problem-solving, and leadership development. The importance of meaningful human connections, including positive role modelling and peer to peer learning were also important factors in fostering entrepreneurial intent.
With these factors in mind, our findings highlight that the OzGirlsEntrepreneurship program substantially increased the entrepreneurial intention of secondary school female students. In addition, this study offers actionable implications and recommendations to develop and deliver entrepreneurship education programs for secondary school level students.

Submitted 16 January, 2021; v1 submitted 4 December, 2020; originally announced December 2020.

Comments: 20 Pages, Accepted to appear in International Entrepreneurship and Management Journal (IEMJ), Springer, 2020

Journal ref: International Entrepreneurship and Management Journal (2021)

arXiv:2012.01268 [pdf]

Measuring Bangladeshi Female Farmers' Values for Agriculture Mobile Applications Development

Authors: Rifat Ara Shams, Mojtaba Shahin, Gillian Oliver, Waqar Hussain, Harsha Perera, Arif Nurwidyantoro, Jon Whittle

Abstract: The ubiquity of mobile applications (apps) in daily life raises the imperative that the apps should reflect users' values. However, users' values are not usually taken into account in app development. Thus there is significant potential for user dissatisfaction and negative socio-economic consequences. To be cognizant of values in apps, the first step is to find out what those values are, and that was the objective of this study conducted in Bangladesh. Our focus was on rural women, specifically female farmers. The basis for our study was Schwartz's universal human values theory, and we used an associated survey instrument, the Portrait Values Questionnaire (PVQ). Our survey of 193 Bangladeshi female farmers showed that Conformity and Security were regarded as the most important values, while Power, Hedonism, and Stimulation were the least important. This finding would be helpful for developers to take into account when developing agriculture apps for this market. In addition, the methodology we used provides a model to follow to elicit the values of apps' users in other communities.

Submitted 22 November, 2020; originally announced December 2020.

Comments: 10 Pages, Accepted to appear in 54th Hawaii International Conference on System Sciences, 2021

arXiv:2008.07729 [pdf]

A Systematic Mapping Study on Microservices Architecture in DevOps

Authors: Muhammad Waseem, Peng Liang, Mojtaba Shahin

Abstract: Context: Applying Microservices Architecture (MSA) in DevOps has received significant attention in recent years. However, there exists no comprehensive review of the state of research on this topic. Objective: This work aims to systematically identify, analyze, and classify the literature on MSA in DevOps. Method: A Systematic Mapping Study (SMS) has been conducted on the literature published between January 2009 and July 2018. Results: Forty-seven studies were finally selected and the key results are: (1) Three themes on the research on MSA in DevOps are "microservices development and operations in DevOps", "approaches and tool support for MSA based systems in DevOps", and "MSA migration experiences in DevOps". (2) 24 problems with their solutions regarding implementing MSA in DevOps are identified. (3) MSA is mainly described by using boxes and lines. (4) Most of the quality attributes are positively affected when employing MSA in DevOps. (5) 50 tools that support building MSA based systems in DevOps are collected. (6) The combination of MSA and DevOps has been applied in a wide range of application domains. Conclusions: The results and findings will benefit researchers and practitioners to conduct further research and bring more dedicated solutions for the issues of MSA in DevOps.

Submitted 17 August, 2020; originally announced August 2020.

Comments: 50 Pages, Accepted to appear in Journal of Systems and Software (JSS), 2020

arXiv:2005.07883 [pdf]

Architectural Design Space for Modelling and Simulation as a Service: A Review

Authors: Mojtaba Shahin, M. Ali Babar, Muhammad Aufeef Chauhan

Abstract: Modelling and Simulation as a Service (MSaaS) is a promising approach to deploy and execute Modelling and Simulation (M&S) applications quickly and on-demand. An appropriate software architecture is essential to deliver quality M&S applications following the MSaaS concept to a wide range of users. This study aims to characterize the state-of-the-art MSaaS architectures by conducting a systematic review of 31 papers published from 2010 to 2018. Our findings reveal that MSaaS applications are mainly designed using layered architecture style, followed by service-oriented architecture, component-based architecture, and pluggable component-based architecture. We also found that interoperability and deployability have the greatest importance in the architecture of MSaaS applications. In addition, our study indicates that the current MSaaS architectures do not meet the critical user requirements of modern M&S applications appropriately. Based on our results, we recommend that there is a need for more effort and research to (1) design the user interfaces that enable users to build and configure simulation models with minimum effort and limited domain knowledge, (2) provide mechanisms to improve the deployability of M&S applications, and (3) gain a deep insight into how M&S applications should be architected to respond to the emerging user requirements in the military domain.

Submitted 31 July, 2020; v1 submitted 16 May, 2020; originally announced May 2020.

Comments: 38 Pages, To appear in Journal of Systems and Software (JSS), 2020

Showing 1–50 of 55 results for author: Shahin, M