Decade-long Utilization Patterns of ICSE Technical Papers and Associated Artifacts

Authors: Sharif Ahmed, Rey Ortiz, Nasir U. Eisty

Abstract: Context: Annually, ICSE acknowledges a range of papers, a subset of which are paired with research artifacts such as source code, datasets, and supplementary materials, adhering to the Open Science Policy. However, no prior systematic inquiry dives into gauging the influence of ICSE papers using artifact attributes. Objective: We explore the mutual impact between artifacts and their associated papers presented at ICSE over ten years. Method: We collect data on usage attributes from papers and their artifacts, conduct a statistical assessment to identify differences, and analyze the top five papers in each attribute category. Results: There is a significant difference between paper citations and the usage of associated artifacts. While statistical analyses show no notable difference between paper citations and GitHub stars, variations exist in views and/or downloads of papers and artifacts. Conclusion: We provide a thorough overview of ICSE's accepted papers from the last decade, emphasizing the intricate relationship between research papers and their artifacts. To enhance the assessment of artifact influence in software research, we recommend considering key attributes that may be present in one platform but not in another.

Submitted 8 April, 2024; originally announced April 2024.

Comments: This paper has been accepted for publication and presentation at The 22nd IEEE/ACIS International Conference on Software Engineering, Management and Applications (SERA 2024) to be held in Honolulu, USA on May 30-June 1, 2024

arXiv:2404.05823 [pdf, other]

Exploiting CPU Clock Modulation for Covert Communication Channel

Authors: Shariful Alam, Jidong Xiao, Nasir U. Eisty

Abstract: Covert channel attacks represent a significant threat to system security, leveraging shared resources to clandestinely transmit information from highly secure systems, thereby violating the system's security policies. These attacks exploit shared resources as communication channels, necessitating resource partitioning and isolation techniques as countermeasures. However, mitigating attacks exploiting modern processors' hardware features to leak information is challenging because successful attacks can conceal the channel's existence. In this paper, we unveil a novel covert channel exploiting the duty cycle modulation feature of modern x86 processors. Specifically, we illustrate how two collaborating processes, a sender and a receiver, can manipulate this feature to transmit sensitive information surreptitiously. Our live system implementation demonstrates that this covert channel can achieve a data transfer rate of up to 55.24 bits per second.

Submitted 8 April, 2024; originally announced April 2024.

Comments: This paper has been accepted for publication at The 22nd IEEE/ACIS International Conference on Software Engineering, Management and Applications (SERA 2024)

arXiv:2401.12959 [pdf]

doi 10.1145/3643787.3648035

Understanding Emojis :) in Useful Code Review Comments

Authors: Sharif Ahmed, Nasir U. Eisty

Abstract: Emojis and emoticons serve as non-verbal cues and are increasingly prevalent across various platforms, including Modern Code Review. These cues often carry emotive or instructive weight for developers. Our study dives into the utility of Code Review comments (CR comments) by scrutinizing the sentiments and semantics conveyed by emojis within these comments. To assess the usefulness of CR comments, we augment traditional 'textual' features and pre-trained embeddings with 'emoji-specific' features and pre-trained embeddings. To fortify our inquiry, we expand an existing dataset with emoji annotations, guided by existing research on GitHub emoji usage, and re-evaluate the CR comments accordingly. Our models, which incorporate textual and emoji-based sentiment features and semantic understandings of emojis, substantially outperform baseline metrics. The often-overlooked emoji elements in CR comments emerge as key indicators of usefulness, suggesting that these symbols carry significant weight.

Submitted 23 January, 2024; originally announced January 2024.

Comments: This paper has been accepted for inclusion in the Proceedings of the 3rd Intl. Workshop on NL-based Software Engineering co-located at 46th International Conference on Software Engineering (NLBSE@ICSE 2024)

arXiv:2307.00692 [pdf, other]

Exploring the Advances in Identifying Useful Code Review Comments

Authors: Sharif Ahmed, Nasir U. Eisty

Abstract: Effective peer code review in collaborative software development necessitates useful reviewer comments and supportive automated tools. Code review comments are a central component of the Modern Code Review process in the industry and open-source development. Therefore, it is important to ensure these comments serve their purposes. This paper reflects the evolution of research on the usefulness of code review comments. It examines papers that define the usefulness of code review comments, mine and annotate datasets, study developers' perceptions, analyze factors from different aspects, and use machine learning classifiers to automatically predict the usefulness of code review comments. Finally, it discusses the open problems and challenges in recognizing useful code review comments for future research.

Submitted 6 July, 2023; v1 submitted 2 July, 2023; originally announced July 2023.

Comments: This paper has been accepted for inclusion in the Proceedings of the 17th ACM/IEEE International Symposium on Empirical Software Engineering and Measurement (ESEM 2023)

arXiv:2304.07482 [pdf, other]

Documentation Practices in Agile Software Development: A Systematic Literature Review

Authors: Md Athikul Islam, Rizbanul Hasan, Nasir U. Eisty

Abstract: Context: Agile development methodologies in the software industry have increased significantly over the past decade. Although one of the main aspects of agile software development (ASD) is less documentation, there have always been conflicting opinions about what to document in ASD. Objective: This study aims to systematically identify what to document in ASD, which documentation tools and methods are in use, and how those tools can overcome documentation challenges. Method: We performed a systematic literature review of the studies published between 2010 and June 2021 that discuss agile documentation. Then, we systematically selected a pool of 74 studies using particular inclusion and exclusion criteria. After that, we conducted a quantitative and qualitative analysis using the data extracted from these studies. Results: We found nine primary vital factors to add to agile documentation from our pool of studies. Our analysis shows that agile practitioners have primarily developed their documentation tools and methods focusing on these factors. The results suggest that the tools and techniques in agile documentation are not in sync, and they separately solve different challenges. Conclusions: Based on our results and discussion, researchers and practitioners will better understand how current agile documentation tools and practices perform. In addition, investigation of the synchronization of these tools will be helpful in future research and development.

Submitted 15 April, 2023; originally announced April 2023.

Comments: Accepted to 21st IEEE/ACIS International Conference on Software Engineering, Management and Applications (SERA 2023). May 23-25, 2023, Orlando, USA

arXiv:2304.01523 [pdf, other]

Analysis of Software Engineering Practices in General Software and Machine Learning Startups

Authors: Bishal Lakha, Kalyan Bhetwal, Nasir U. Eisty

Abstract: Context: On top of the inherent challenges startup software companies face applying proper software engineering practices, the non-deterministic nature of machine learning techniques makes it even more difficult for machine learning (ML) startups. Objective: Therefore, the objective of our study is to understand the whole picture of software engineering practices followed by ML startups and identify additional needs. Method: To achieve our goal, we conducted a systematic literature review study on 37 papers published in the last 21 years. We selected papers on both general software startups and ML startups. We collected data to understand software engineering (SE) practices in five phases of the software development life-cycle: requirement engineering, design, development, quality assurance, and deployment. Results: We find some interesting differences in software engineering practices in ML startups and general software startups. The data management and model learning phases are the most prominent among them. Conclusion: While ML startups face many similar challenges to general software startups, the additional difficulties of using stochastic ML models require different strategies in using software engineering practices to produce high-quality products.

Submitted 4 April, 2023; originally announced April 2023.

Comments: Accepted at the 21st IEEE/ACIS International Conference on Software Engineering Research, Management and Applications (SERA 2023)

arXiv:2303.16393 [pdf, other]

Analyzing the Effects of CI/CD on Open Source Repositories in GitHub and GitLab

Authors: Jeffrey Fairbanks, Akshharaa Tharigonda, Nasir U. Eisty

Abstract: Numerous articles emphasize the benefits of implementing Continuous Integration and Delivery (CI/CD) pipelines in software development. These pipelines are expected to improve the reputation of a project and decrease the number of commits and issues in the repository. Although CI/CD adoption may be slow initially, it is believed to accelerate service delivery and deployment in the long run. This study aims to investigate the impact of CI/CD on commit velocity and issue counts in two open-source repositories, GitLab and GitHub. By analyzing more than 12,000 repositories and recording every commit and issue, it was discovered that CI/CD enhances commit velocity by 141.19 percent, but also increases the number of issues by 321.21 percent.

Submitted 28 March, 2023; originally announced March 2023.

Comments: This paper has been accepted at the 20th IEEE/ACIS International Conference on Software Engineering, Management and Applications (SERA 2022)

arXiv:2205.15982 [pdf, other]

doi 10.1007/s10664-022-10184-9

Testing Research Software: A Survey

Authors: Nasir U. Eisty, Jeffrey C. Carver

Abstract: Background: Research software plays an important role in solving real-life problems, empowering scientific innovations, and handling emergency situations. Therefore, the correctness and trustworthiness of research software are of absolute importance. Software testing is an important activity for identifying problematic code and helping to produce high-quality software. However, testing of research software is difficult due to the complexity of the underlying science, relatively unknown results from scientific algorithms, and the culture of the research software community. Aims: The goal of this paper is to better understand current testing practices, identify challenges, and provide recommendations on how to improve the testing process for research software development. Method: We surveyed members of the research software developer community to collect information regarding their knowledge about and use of software testing in their projects. Results: We analysed 120 responses and identified that even though research software developers report they have an average level of knowledge about software testing, they still find it difficult due to the numerous challenges involved. However, there are a number of ways, such as proper training, that can improve the testing process for research software. Conclusions: Testing can be challenging for any type of software. This difficulty is especially present in the development of research software, where software engineering activities are typically given less attention. To produce trustworthy results from research software, there is a need for a culture change so that testing is valued and teams devote appropriate effort to writing and executing tests.

Submitted 31 May, 2022; originally announced May 2022.

Comments: Accepted for publication in Empirical Software Engineering

Journal ref: Empirical Soft. Eng. 27(2022) 138

arXiv:2204.08702 [pdf, other]

doi 10.1145/3528227.3528569

Software Engineering Approaches for TinyML based IoT Embedded Vision: A Systematic Literature Review

Authors: Shashank Bangalore Lakshman, Nasir U. Eisty

Abstract: Internet of Things (IoT) has catapulted human ability to control our environments through ubiquitous sensing, communication, computation, and actuation. Over the past few years, IoT has joined forces with Machine Learning (ML) to embed deep intelligence at the far edge. TinyML (Tiny Machine Learning) has enabled the deployment of ML models for embedded vision on extremely lean edge hardware, bringing the power of IoT and ML together. However, TinyML powered embedded vision applications are still in a nascent stage, and they are just starting to scale to widespread real-world IoT deployment. To harness the true potential of IoT and ML, it is necessary to provide product developers with robust, easy-to-use software engineering (SE) frameworks and best practices that are customized for the unique challenges faced in TinyML engineering. Through this systematic literature review, we aggregated the key challenges reported by TinyML developers and identified state-of-the-art SE approaches in large-scale Computer Vision, Machine Learning, and Embedded Systems that can help address key challenges in TinyML based IoT embedded vision. In summary, our study draws synergies between SE expertise that embedded systems developers and ML developers have independently developed to help address the unique challenges in the engineering of TinyML based IoT embedded vision.

Submitted 19 April, 2022; originally announced April 2022.

Comments: 8 pages, 3 figures

arXiv:2204.00932 [pdf, other]

doi 10.1109/SERA54885.2022.9806783

Automatic Transformation of Natural to Unified Modeling Language: A Systematic Review

Authors: Sharif Ahmed, Arif Ahmed, Nasir U. Eisty

Abstract: Context: Processing Software Requirement Specifications (SRS) manually takes a much longer time for requirement analysts in software engineering. Researchers have been working on making an automatic approach to ease this task. Most of the existing approaches require some intervention from an analyst or are challenging to use. Some automatic and semi-automatic approaches were developed based on heuristic rules or machine learning algorithms. However, there are various constraints to the existing approaches of UML generation, such as restriction on ambiguity, length or structure, anaphora, incompleteness, atomicity of input text, requirements of domain ontology, etc. Objective: This study aims to better understand the effectiveness of existing systems and provide a conceptual framework with further improvement guidelines. Method: We performed a systematic literature review (SLR). We conducted our study selection in two phases and selected 70 papers. We conducted quantitative and qualitative analyses by manually extracting information, cross-checking, and validating our findings. Result: We described the existing approaches and revealed the issues observed in these works. We identified and clustered both the limitations and benefits of selected articles. Conclusion: This research upholds the necessity of a common dataset and evaluation framework to extend the research consistently. It also describes the significance of natural language processing obstacles researchers face. In addition, it creates a path forward for future research.

Submitted 28 May, 2022; v1 submitted 2 April, 2022; originally announced April 2022.

Comments: Accepted to 20th IEEE/ACIS International Conference on Software Engineering, Management and Applications (SERA 2022). May 25-27, 2022, Las Vegas, USA

Journal ref: 2022 IEEE/ACIS 20th International Conference on Software Engineering Research, Management and Applications (SERA), 2022, pp. 112-119

arXiv:2109.10971 [pdf, other]

doi 10.1007/s10664-021-10053-x

Developers Perception of Peer Code Review in Research Software Development

Authors: Nasir U. Eisty, Jeffrey C. Carver

Abstract: Background: Research software is software developed by and/or used by researchers, across a wide variety of domains, to perform their research. Because of the complexity of research software, developers cannot conduct exhaustive testing. As a result, researchers have lower confidence in the correctness of the output of the software. Peer code review, a standard software engineering practice, has helped address this problem in other types of software. Aims: Peer code review is less prevalent in research software than it is in other types of software. In addition, the literature does not contain any studies about the use of peer code review in research software. Therefore, through analyzing developers' perceptions, the goal of this work is to understand the current practice of peer code review in the development of research software, identify challenges and barriers associated with peer code review in research software, and present approaches to improve the peer code review in research software. Method: We conducted interviews and a community survey of research software developers to collect information about their current peer code review practices, difficulties they face, and how they address those difficulties. Results: We received 84 unique responses from the interviews and surveys. The results show that while research software teams review a large amount of their code, they lack formal process, proper organization, and adequate people to perform the reviews. Conclusions: Use of peer code review is promising for improving the quality of research software and thereby improving the trustworthiness of the underlying research results. In addition, by using peer code review, research software developers produce more readable and understandable code, which will be easier to maintain.

Submitted 22 September, 2021; originally announced September 2021.

Comments: Accepted for publication in Empirical Software Engineering

Journal ref: Empirical Software Engineering, 27(1), 2022

Showing 1–11 of 11 results for author: Eisty, N U