Markus Borg
Swedish Institute of Computer Science, Software and Systems Engineering Lab, Department Member
- Ph.D. Software Engineeringedit
[Context] It is an enigma that agile projects can succeed "without requirements" when weak requirements engineering is a known cause for project failures. While agile development projects often manage well without extensive requirements... more
[Context] It is an enigma that agile projects can succeed "without requirements" when weak requirements engineering is a known cause for project failures. While agile development projects often manage well without extensive requirements test cases are commonly viewed as requirements and detailed requirements are documented as test cases. [Objective] We have investigated this agile practice of using test cases as requirements to understand how test cases can support the main requirements activities, and how this practice varies. [Method] We performed an iterative case study at three companies and collected data through 14 interviews and 2 focus groups. [Results] The use of test cases as requirements poses both benefits and challenges when eliciting, validating, verifying, and managing requirements, and when used as a documented agreement. We have identified five variants of the test-cases-as-requirements practice, namely de facto, behaviour-driven, story-test driven, stand-alone strict and stand-alone manual for which the application of the practice varies concerning the time frame of requirements documentation, the requirements format, the extent to which the test cases are a machine executable specification and the use of tools which provide specific support for the practice of using test cases as requirements. [Conclusions] The findings provide empirical insight into how agile development projects manage and communicate requirements. The identified variants of the practice of using test cases as requirements can be used to perform in-depth investigations into agile requirements engineering. Practitioners can use the provided recommendations as a guide in designing and improving their agile requirements practices based on project characteristics such as number of stakeholders and rate of change.
Research Interests:
Issue management, a central part of software maintenance, requires much effort for complex software systems. The continuous inflow of issue reports makes it hard for developers to stay on top of the situation, and the threatening... more
Issue management, a central part of software maintenance, requires much effort for complex software systems. The continuous inflow of issue reports makes it hard for developers to stay on top of the situation, and the threatening information overload makes activities such as duplicate management,
Issue Assignment (IA), and Change Impact Analysis (CIA) tedious and error-prone. Still, most practitioners work with tools that act as little more than issue containers. Machine Learning encompasses approaches that identify patterns or make predictions based on empirical data. While humans have limited ability to work with big data, ML instead tends to
improve the more training data that is available. Consequently, we argue that the challenge of information overload in issue management appears to be particularly suitable for ML-based tool support. While others have initially explored the area, we develop two ML-based tools, and evaluate them in proprietary software engineering contexts. We replicated [1] for five projects in two companies, and our automated IA obtains an accuracy matching the current manual processes. Thus, as our solution delivers instantaneous IA, an organization can potentially save considerable analysis effort. Moreover, for the most comprehensive of the five projects, we implemented automated CIA in the tool ImpRec [3]. We evaluated the tool
in a longitudinal in situ study, i.e., deployment in two development teams in industry. Based on log analysis and complementary interviews using the QUPER model [2] for utility assessment, we conclude that ImpRec offered helpful support in the CIA task.
Issue Assignment (IA), and Change Impact Analysis (CIA) tedious and error-prone. Still, most practitioners work with tools that act as little more than issue containers. Machine Learning encompasses approaches that identify patterns or make predictions based on empirical data. While humans have limited ability to work with big data, ML instead tends to
improve the more training data that is available. Consequently, we argue that the challenge of information overload in issue management appears to be particularly suitable for ML-based tool support. While others have initially explored the area, we develop two ML-based tools, and evaluate them in proprietary software engineering contexts. We replicated [1] for five projects in two companies, and our automated IA obtains an accuracy matching the current manual processes. Thus, as our solution delivers instantaneous IA, an organization can potentially save considerable analysis effort. Moreover, for the most comprehensive of the five projects, we implemented automated CIA in the tool ImpRec [3]. We evaluated the tool
in a longitudinal in situ study, i.e., deployment in two development teams in industry. Based on log analysis and complementary interviews using the QUPER model [2] for utility assessment, we conclude that ImpRec offered helpful support in the CIA task.
Research Interests:
Students working in groups is a commonly used method of instruction in higher education, popularized by the introduction of problem based learning. As a result, management of small groups of people has become an important skill for... more
Students working in groups is a commonly used method of instruction in higher education, popularized by the introduction of problem based learning. As a result, management of small groups of people has become an important skill for teachers. The objective of our study is to investigate why conflicts arise in student groups at the Faculty of Engineering at Lund University and how teachers manage them. We have conducted an exploratory interdepartmental interview study on teachers' views on this matter, interviewing ten university teachers with different levels of seniority. Our results show that conflicts frequently arise in group work, most commonly caused by different levels of ambition among students. We also found that teachers prefer to work proactively against conflicts and stress the student’s responsibility. Finally, we show that teachers at our faculty tend to avoid the more drastic conflict resolution strategies suggested by previous research. The outcome of our study could be used as input to future guidelines on conflict management in student groups.
Research Interests:
It is a conundrum that agile projects can succeed 'without requirements' when weak requirements engineering is a known cause for project failures. While Agile development projects often manage well without extensive requirements... more
It is a conundrum that agile projects can succeed 'without requirements' when weak requirements engineering is a known cause for project failures. While Agile development projects often manage well without extensive requirements documentation, test cases are commonly used as requirements. We have investigated this agile practice at three companies in order to understand how test cases can fill the role of requirements. We performed a case study based on twelve interviews performed in a previous study. The findings include a range of benefits and challenges in using test cases for eliciting, validating , verifying, tracing and managing requirements. In addition, we identified three scenarios for applying the practice, namely as a mature practice, as a de facto practice and as part of an agile transition. The findings provide insights into how the role of requirements may be met in agile development including challenges to consider.
Research Interests:
Since software development is of a dynamic nature, the impact analysis is an inevitable work task. Traceability is known as one factor that supports this task, and several researchers have proposed traceability recovery tools to propose... more
Since software development is of a dynamic nature, the impact analysis is an inevitable work task. Traceability is known as one factor that supports this task, and several researchers have proposed traceability recovery tools to propose trace links in an existing system. However, these semi-automatic tools have not yet proven useful in industrial applications. Based on an established automation model, we analyzed the potential value of such a tool. We based our analysis on a pilot case study of an impact analysis process in a safety-critical development context, and argue that traceability recovery should be considered an investment in findability. Moreover, several risks involved in an increased level of impact analysis automation are already plaguing the state-of-practice work flow. Consequently, deploying a traceability recovery tool involves a lower degree of change than has previously been acknowledged.
Research Interests:
About a hundred studies on traceability recovery have been published in software engineering fora. In roughly half of them, software artifacts developed by students have been used as input. To what extent student artifacts differ from... more
About a hundred studies on traceability recovery have been published in software engineering fora. In roughly half of them, software artifacts developed by students have been used as input. To what extent student artifacts differ from industrial counterparts has not been fully explored in the literature. We conducted a survey among authors of studies on traceability recovery, including both academics and practitioners, to explore their perspectives on the matter. Our results indicate that a majority of authors consider software artifacts originating from student projects to be only partly representative to industrial artifacts. Moreover, only few respondents validated student artifacts for industrial representativeness. Furthermore, our respondents made suggestions for improving the description of artifact sets used in studies by adding contextual, domain-specificand artifact-centric information. Example suggestions include adding descriptions of processes used for artifact development,meaning of traceability links, and the structure of artifacts. Our findings call for further research on characterization and validation of software artifacts to support aggregation of results from empirical studies.
Research Interests:
This paper describes a system to extract events and time information from football match reports generated through minute-by-minute reporting. We describe a method that uses regular expressions to find the events and divides them into... more
This paper describes a system to extract events and time information from football match reports generated through minute-by-minute reporting. We describe a method that uses regular expressions to find the events and divides them into different types to determine in which order they occurred. In addition, our system detects time expressions and we present a way to structure the collected data using XML.
Research Interests:
Quality requirements is a difficult concept in software projects, and testing software qualities is a well-known challenge. Without proper management of quality requirements, there is an increased risk that the software product under... more
Quality requirements is a difficult concept in software projects, and testing software qualities is a well-known challenge. Without proper management of quality requirements, there is an increased risk that the software product under development will not meet the expectations of its future users. In this paper, we share experiences from testing quality requirements when developing a large system-of-systems in the public sector in Sweden. We complement the experience reporting by analyzing documents from the case under study. As a final step, we match the identified challenges with solution proposals from the literature. We report five main challenges covering inadequate requirements engineering and disconnected test managers. Finally, we match the challenges to solutions proposed in the scientific literature, including integrated requirements engineering, the twin peaks model, virtual plumblines, the QUPER model , and architecturally significant requirements. Our experiences are valuable to other large development projects struggling with testing of quality requirements. Furthermore, the report could be used by as input to process improvement activities in the case under study.
Research Interests:
Successful coordination of Requirements Engineering and Testing (RET) is crucial in large-scale software engineering. If the activities involved in RET are not aligned, effort is inevitably wasted, and the probability of delivering high... more
Successful coordination of Requirements Engineering and Testing (RET) is crucial in large-scale software engineering. If the activities involved in RET are not aligned, effort is inevitably wasted, and the probability of delivering high quality software products in time decreases. Previous work has identified sixteen challenges in aligning RET in a case study of six companies. However, all six case companies selected for the study are active in proprietary software engineering. In this experience report, we discuss to what extent the identified RET alignment challenges apply to the development of a large information system for managing grants from the European Union. We confirm that most of the findings from previous work also apply to the public sector, including the challenges of aligning goals within an organization, specifying high-quality requirements, and verifying quality aspects. Furthermore, we emphasize that the public sector might be impacted by shifting political power, and that several RET alignment challenges are amplified in multi-project environments.
Engineers working on safety critical software development must explicitly specify trace links as part of Impact Analyses (IA), both to code and non-code development artifacts. In large-scale projects, constituting information spaces of... more
Engineers working on safety critical software development must explicitly specify trace links as part of Impact Analyses (IA), both to code and non-code development artifacts. In large-scale projects, constituting information spaces of thousands of artifacts, conducting IA is tedious work relying on extensive system understanding. We propose to support this activity by enabling engineers to reuse knowledge from previously completed IAs. We do this by mining the trace links in documented IA reports, creating a semantic network of the resulting traceability, and rendering the resulting network amenable to visual analyses. We studied an Issue Management System (IMS), from within a company in the power and automation domain, containing 4,845 IA reports from 9 years of development relating to a single safety critical system. The domain has strict process requirements guiding the documented IAs. We used link mining to extract trace links, from these IA reports to development artifacts, and to determine their link semantics. We constructed a semantic network of the interrelated development artifacts, containing 6,104 non-code artifacts and 9,395 trace links, and we used two visualizations to examine the results. We provide initial suggestions as to how the knowledge embedded in such a network can be (re-)used to advance support for IA.
Research Interests:
Despite the widely recognized importance of replications in software engineering, industrial replications in software engineering are still rarely reported. Although the literature provides some evidence about the issues and challenges... more
Despite the widely recognized importance of replications in software engineering, industrial replications in software engineering are still rarely reported. Although the literature provides some evidence about the issues and challenges related to conducting experiments and replications the practitioner’s view of the issues and challenges has not been fully explored. This paper reports an industrial practitioner’s review of a replicated experiment on linguistic tool support for consolidation of requirements from multiple sources. The review identified potential confounding factors from a perspective that differed significantly from that of the designers of the experiment. The results suggest that industrial practice may focus upon specific process aspects that are not necessarily reflected in academic practice.
Research Interests:
The amount of data in large-scale software engineering contexts continues to grow and challenges efficiency of software engineering efforts. At the same time, information related to requirements plays a vital role in the success of... more
The amount of data in large-scale software engineering contexts continues to grow and challenges efficiency of software engineering efforts. At the same time, information related to requirements plays a vital role in the success of software products and projects. To face the current challenges in software engineering information management, software companies need to reconsider the current models of information. In this paper, we present a modeling frame-work for requirements artifacts dedicated to a large-scale market-driven re-quirements engineering context. The underlying meta-model is grounded in a clear industrial need for improved flexible models for storing requirements engineering information. The presented framework is created in collaboration with industry and initially evaluated by industry practitioners from three large companies. Participants of the evaluation positively evaluated the presented modeling framework as well as pointed out directions for further research and improvements.
Research Interests:
While tennis, table tennis, badminton, and squash have several characteristics in common, they all have different scoring systems. Due to the different scoring systems, direct comparisons of match scores across racket sport have not yet... more
While tennis, table tennis, badminton, and squash have several characteristics in common, they all have different scoring systems. Due to the different scoring systems, direct comparisons of match scores across racket sport have not yet been possible. Instead, comparative studies between racket sports have so far required scaling of match scores to a common format. However, such rescaling does not capture the mental aspects of different scoring systems, e.g., how the hierarchical scoring in tennis influences players differently as compared to the point-a-rally scoring of squash. In 2015, the English Racketlon Association introduced a new ranking system based on a ratings system, i.e., a player’s ranking depends on the individual performance in each match played. Moreover, as the rules of racketlon harmonizes the scoring systems of the racket sports, and the UK ranking system is based on match results from more than a year of international racketlon tournaments, a unique comparison of set scores across racket sports is now possible. We present an initial analysis of how set scores compare for different racket sports when the harmonized scoring system of racketlon is used. Our analysis shows that table tennis is the sport in which big wins are most likely. Furthermore, we report that while the variance in badminton and squash set scores decreases for better players, the opposite applies to table tennis matches. Our comparative study provides a novel perspective on set scores in competitive matches in the world’s four major racket sports, and our findings reveal that table tennis is particularly different. Future work should explore a larger set of match results, but our insights might already be used to inspire future match tactics and training regimes in racket sports.
Research Interests:
Research Interests:
Modern large-scale software development is a complex undertaking and coordinating various processes is crucial to achieve efficiency. The alignment between requirements and test activities is one very important aspect. Production and... more
Modern large-scale software development is a complex undertaking and coordinating various processes is crucial to achieve efficiency. The alignment between requirements and test activities is one very important aspect. Production and maintenance of software result in an ever-increasing amount of information. To be able to work efficiently under such circumstances, navigation in all available data needs support. Maintaining traceability links between software artifacts is one approach to structure the information space and support this challenge. Several researchers proposed traceability recovery by applying information retrieval (IR) methods, utilizing the fact that artifacts often have textual content in natural language. Case studies have showed promising results, but no large-scale in vivo evaluations have been made. Currently, there is a trend among our industrial partners to move to a specific new software engineering tool. Their aim is to collect different pieces of information in one system. Our ambition is to develop an IR-based traceability recovery plug-in to this tool. From this position, right in the middle of a real industrial setting, many interesting observations could be made. This would allow a unique evaluation of the usefulness of the IR-based approach.
Research Interests:
Large-scale software development is a complex undertaking and generates an ever-increasing amount of information. To be able to work efficiently under such circumstances, navigation in all available data needs support. Maintaining... more
Large-scale software development is a complex undertaking and generates an ever-increasing amount of information. To be able to work efficiently under such circumstances, navigation in all available data needs support. Maintaining traceability links between software artefacts is one approach to structure the information space and support this challenge. Several researchers have proposed traceability recovery by applying IR methods, based on textual similarities between artefacts. Early studies have shown promising results, but no large-scale in vivo evaluations have been made. Currently, there is a trend among our industrial partners to collect artefacts in a specific new software engineering tool. Our goal is to develop an IR-based traceability recovery plugin to this tool. From this position, in the environment of possible future users, the usefulness of supported findability in a software engineering context could be explored with an industrial validity.
Research Interests:
Software developers in large projects work in complex information landscapes and staying on top of all relevant software artifacts is an acknowledged challenge. As software systems often evolve over many years, a large number of issue... more
Software developers in large projects work in complex information landscapes and staying on top of all relevant software artifacts is an acknowledged challenge. As software systems often evolve over many years, a large number of issue reports is typically managed during the lifetime of a system, representing the units of work needed for its improvement, e.g., defects to fix, requested features, or missing documentation. Efficient management of incoming issue reports requires the successful navigation of the information landscape of a project.
In this thesis, we address two tasks involved in issue management: Issue Assignment (IA) and Change Impact Analysis (CIA). IA is the early task of allocating an issue report to a development team, and CIA is the subsequent activity of identifying how source code changes affect the existing software artifacts. While IA is fundamental in all large software projects, CIA is particularly important to safety-critical development.
Our solution approach, grounded on surveys of industry practice as well as scientific literature, is to support navigation by combining information retrieval and machine learning into Recommendation Systems for Software Engineering (RSSE). While the sheer number of incoming issue reports might challenge the overview of a human developer, our techniques instead benefit from the availability of ever-growing training data. We leverage the volume of issue reports to develop accurate decision support for software evolution.
We evaluate our proposals both by deploying an RSSE in two development teams, and by simulation scenarios, i.e., we assess the correctness of the RSSEs' output when replaying the historical inflow of issue reports. In total, more than 60,000 historical issue reports are involved in our studies, originating from the evolution of five proprietary systems for two companies. Our results show that RSSEs for both IA and CIA can help developers navigate large software projects, in terms of locating development teams and software artifacts. Finally, we discuss how to support the transfer of our results to industry, focusing on addressing the context dependency of our tool support by systematically tuning parameters to a specific operational setting.
In this thesis, we address two tasks involved in issue management: Issue Assignment (IA) and Change Impact Analysis (CIA). IA is the early task of allocating an issue report to a development team, and CIA is the subsequent activity of identifying how source code changes affect the existing software artifacts. While IA is fundamental in all large software projects, CIA is particularly important to safety-critical development.
Our solution approach, grounded on surveys of industry practice as well as scientific literature, is to support navigation by combining information retrieval and machine learning into Recommendation Systems for Software Engineering (RSSE). While the sheer number of incoming issue reports might challenge the overview of a human developer, our techniques instead benefit from the availability of ever-growing training data. We leverage the volume of issue reports to develop accurate decision support for software evolution.
We evaluate our proposals both by deploying an RSSE in two development teams, and by simulation scenarios, i.e., we assess the correctness of the RSSEs' output when replaying the historical inflow of issue reports. In total, more than 60,000 historical issue reports are involved in our studies, originating from the evolution of five proprietary systems for two companies. Our results show that RSSEs for both IA and CIA can help developers navigate large software projects, in terms of locating development teams and software artifacts. Finally, we discuss how to support the transfer of our results to industry, focusing on addressing the context dependency of our tool support by systematically tuning parameters to a specific operational setting.
Research Interests:
Successful development of software systems involves the efficient navigation of software artifacts. However, as artifacts are continuously produced and modified, engineers are typically plagued by challenging information landscapes. One... more
Successful development of software systems involves the efficient navigation of software artifacts. However, as artifacts are continuously produced and modified, engineers are typically plagued by challenging information landscapes. One state-of-practice approach to structure information is to establish trace links between artifacts; a practice that is also enforced by several development standards. Unfortunately, manually maintaining trace links in an evolving system is a tedious task. To tackle this issue, several researchers have proposed treating the capture and recovery of trace links as an Information Retrieval (IR) problem. The goal of this thesis is to contribute to the evaluation of IR-based trace recovery, both by presenting new empirical results and by suggesting how to increase the strength of evidence in future evaluative studies.
This thesis is based on empirical software engineering research. In a Systematic Literature Review (SLR) we show that a majority of previous evaluations of IR-based trace recovery have been technology-oriented, conducted in "the cave of IR evaluation", using small datasets as experimental input. Also, software artifacts originating from student projects have frequently been used in evaluations. We conducted a survey among traceability researchers and found that while a majority consider student artifacts to be only partly representative of industrial counterparts, such artifacts were typically not validated for industrial representativeness. Our findings call for additional case studies to evaluate IR-based trace recovery within the full complexity of an industrial setting. Thus, we outline future research on IR-based trace recovery in an industrial study on safety-critical impact analysis.
Also, this thesis contributes to the body of empirical evidence of IR-based trace recovery in two experiments with industrial software artifacts. The technology-oriented experiment highlights the clear dependence between datasets and the accuracy of IR-based trace recovery, in line with findings from the SLR. The human-oriented experiment investigates how different quality levels of tool output affect the tracing accuracy of engineers. While the results are not conclusive, there are indications that it is worthwhile further investigating into the actual value of improving tool support for IR-based trace recovery. Finally, we present how tools and methods are evaluated in the general field of IR research, and propose a taxonomy of evaluation contexts tailored for IR-based trace recovery in software engineering.
This thesis is based on empirical software engineering research. In a Systematic Literature Review (SLR) we show that a majority of previous evaluations of IR-based trace recovery have been technology-oriented, conducted in "the cave of IR evaluation", using small datasets as experimental input. Also, software artifacts originating from student projects have frequently been used in evaluations. We conducted a survey among traceability researchers and found that while a majority consider student artifacts to be only partly representative of industrial counterparts, such artifacts were typically not validated for industrial representativeness. Our findings call for additional case studies to evaluate IR-based trace recovery within the full complexity of an industrial setting. Thus, we outline future research on IR-based trace recovery in an industrial study on safety-critical impact analysis.
Also, this thesis contributes to the body of empirical evidence of IR-based trace recovery in two experiments with industrial software artifacts. The technology-oriented experiment highlights the clear dependence between datasets and the accuracy of IR-based trace recovery, in line with findings from the SLR. The human-oriented experiment investigates how different quality levels of tool output affect the tracing accuracy of engineers. While the results are not conclusive, there are indications that it is worthwhile further investigating into the actual value of improving tool support for IR-based trace recovery. Finally, we present how tools and methods are evaluated in the general field of IR research, and propose a taxonomy of evaluation contexts tailored for IR-based trace recovery in software engineering.
Research Interests:
More than 90 % of all computers are embedded in different types of systems, for example mobile phones and industrial robots. Some of these systems are real-time systems; they have to produce their output within certain time constraints.... more
More than 90 % of all computers are embedded in different types of systems, for example mobile phones and industrial robots. Some of these systems are real-time systems; they have to produce their output within certain time constraints. They can also be safety critical; if something goes wrong, there is a risk that a great deal of damage is caused. Industrial Extended Automation System 800xA, developed by ABB, is a realtime control system intended for industrial use within a wide variety of applications where a certain focus on safety is required, for example power plants and oil platforms. The software is currently written in C and C++, languages that are not optimal from a safety point of view. In this master’s thesis, it is investigated whether there are any plausible alternatives to using C/C++ for safety critical real-time systems. A number of requirements that programming languages used in this area have to fulfill are stated and it is evaluated if some candidate languages fulfill these requirements. The candidate languages, Java and Ada, are compared to C and C++. It is determined that the Java-to-C compiler LJRT (Lund Java-based Real Time) is a suitable alternative. The practical part of this thesis is concerned with the introduction of Java in 800xA. A module of the system is ported to Java and executed together with the original C/C++ solution. The functionality of the system is tested using a formal test suite and the performance and memory footprint of our solution is measured. The results show that it is possible to gradually introduce Java in 800xA using LJRT, which is the main contribution of this thesis.