
RSMM: A Framework to Assess Maturity of Research Software Project

Submitted to IEEE eScience 2024 Conference

Deekshitha (ORCID 0000-0003-1831-8941), Utrecht University, Utrecht, The Netherlands, and Netherlands eScience Center, Amsterdam, The Netherlands, d.deekshitha@uu.nl
Rena Bakhshi (ORCID 0000-0002-2932-3028), Netherlands eScience Center, Amsterdam, The Netherlands, r.bakhshi@esciencecenter.nl
Jason Maassen (ORCID 0000-0002-8172-4865), Netherlands eScience Center, Amsterdam, The Netherlands, j.maassen@esciencecenter.nl
Carlos Martinez Ortiz (ORCID 0000-0001-5565-7577), Netherlands eScience Center, Amsterdam, The Netherlands, c.martinez@esciencecenter.nl
Rob van Nieuwpoort (ORCID 0000-0002-2947-9444), Leiden University, Leiden, The Netherlands, r.v.van.nieuwpoort@liacs.leidenuniv.nl
Slinger Jansen (ORCID 0000-0003-3752-2868), Utrecht University, Utrecht, The Netherlands, slinger.jansen@uu.nl

Abstract

Organizations and researchers producing research software face the common problem of making their software sustainable beyond the funding provided by a single research project. Research software engineers address this problem by building communities around their software, providing appropriate licensing, creating reliable and reproducible research software, making it sustainable and impactful, promoting it, and ensuring that it is easy to adopt in research workflows. As a result, numerous practices and guidelines exist to enhance research software quality, reusability, and sustainability. However, there is a lack of a unified framework to systematically integrate these practices and help organizations and research software developers refine their development and management processes. Our paper aims at bridging this gap by introducing a novel framework: RSMM. It is designed through a systematic literature review and insights from interviews with research software project experts. In short, RSMM offers a structured pathway for evaluating and refining research software project management by categorizing 79 best practices into 17 capabilities across 4 focus areas. From assessing code quality and security to measuring impact, sustainability, and reproducibility, the model provides a complete evaluation of a research software project’s maturity. With RSMM, individuals as well as organizations involved in research software development gain a systematic approach to tackling various research software engineering challenges. By utilizing RSMM as a comprehensive checklist, organizations can systematically evaluate and refine their project management practices and organizational structure.

Keywords: research software engineering, focus area maturity model, research software project management

I Introduction

Open science is a movement that strives for greater openness and collaboration in research practices. It promotes the open sharing of publications, data, software, and various academic outputs as early as possible, making them accessible for reuse. As a result, open science leads to greater scientific and societal impact [1]. Research software plays an important role in the open science movement [2]. It includes source code files, algorithms, scripts, computational workflows, and executables created during the research process or for a research purpose [3]. For good scientific practice, the resulting research software should be open and adhere to the FAIR principles [4, 5] to allow repeatability, reproducibility, and reuse. Compared to research data, research software should be both archived for reproducibility and actively maintained for reusability [6, 7].

Traditionally, researchers focus on maximizing their production process towards scientific publications rather than the quality and sustainability of their research software. This is due to the pressure on researchers to continually publish their work to advance their careers (also known as publish-or-perish [8]). Long-term sustainability strategies such as software quality and community building require resources in terms of time and expertise. Researchers may lack the necessary skills and often do not prioritize these activities because these efforts are rarely acknowledged by research organisations and funding agencies [9]. For the same reasons, the peer review process, which serves as a quality check for scientific publications and in open source software development, is rarely applied to research software.

Over the past few years, however, the significance of research software has grown exponentially with its increased usage [10, 11]. The Research Software Engineering (RSE) movement, which started in the UK, has experienced substantial growth and rapid emergence [12]. This movement aims to recognize the vital importance of software for research, and the role of people, policy, and infrastructure in its development, support, and maintenance [13]. As a result, there is currently significant interest from researchers, RSEs, and even funding organizations in establishing best practices for research software engineering [14, 4] and in ensuring that research software becomes sustainable and reproducible and has a community around it. However, while numerous practices and guidelines exist, there is a lack of a unified framework to systematically integrate these practices and assist researchers and organizations in refining their development and management processes.

We present the Research Software focus area Maturity Model (RSMM), targeted at organizations that produce research software. Our model aims to provide a structured analysis of research software engineering practices that can help researchers and research supporting organizations produce research software. RSMM takes the FAIR principles as the baseline but goes further into software sustainability, community building, the software role in reproducibility, usability, and other quality factors relevant to the open science movement. This model includes 4 focus areas: Software Project Management, Research Software Management, Community Engagement, and Software Adoptability. Each focus area covers several best practices to improve its capabilities. RSMM facilitates an incremental pathway for organizations to enhance their research software project management. By utilizing RSMM as a comprehensive checklist, organizations can systematically evaluate and refine their project management practices and organizational structure.

The contributions of this study are as follows:

  • Maturity model for research software: We present RSMM to assess the maturity of research software projects by considering best practices of research software project management. The framework includes 79 best practices covering code quality and security, sustainability, reproducibility, community building, and many others required to improve the research software project management.

  • Community-based approach: The framework is designed based on interviews with researchers, research software engineers, and project managers.

  • The published dataset: The model itself, along with a complete description of the 79 practices, is publicly available as a dataset [15]. This description includes when the practices are implemented and is organized based on the MoSCoW prioritization (Must have, Should have, Could have, Won’t have). Additionally, it details the resources required for execution, dependencies among neighboring practices, and references.

The remainder of the paper is structured as follows: Section II presents maturity models and a comparative analysis between our model and the existing ones. In Section III, we provide an in-depth exploration of the research methodology employed in designing RSMM. Section IV explains our novel model RSMM and Section V illustrates the method of assessing research software project maturity using RSMM, supported by two examples. Section VI concludes the paper by summarizing the findings and contributions made in this work.

II Background and related work

II-A Maturity Models Concept

Maturity is an evolutionary progress in demonstrating a specific ability or accomplishing a target from an initial stage to a desired end stage. Maturity models capture this process from an initial state to the desired end state and can guide organizations in assessing and developing organizational capabilities [16]. Maturity models are tools developed for organizations to evaluate and compare, providing a basis for improvement and informed strategies to enhance specific areas within the organization [17].

To reach a particular maturity level, an organization must meet specific criteria and characteristics related to capabilities or process performance. The initial stage represents a starting point where the organization may have limited capabilities in the investigated domain. Organizations with the most advanced capabilities reach the highest maturity level [18, 19, 20]. This maturity level assessment of capabilities helps derive and prioritize improvement measures. The following subsection compares RSMM with the existing models.

II-B Landscape of Maturity models

A few existing maturity models are used for software capability management, but they are not specific to research software development. Therefore, they do not cover aspects such as sustainability, reproducibility, impact measurement, promotion, visibility, and adoptability.

Capability Maturity Model Integration (CMMI) and its predecessor, the Capability Maturity Model (CMM), are industry-standard maturity models [21]. They include 5 maturity levels. CMM is used to assess the maturity of an organization’s software development processes. It helps developers to enhance software quality and the overall software development process. CMMI v3.0 [21] goes beyond software development and includes process quality assurance, configuration management, monitor and control, planning, estimating, requirements development and management, governance, implementation infrastructure, organizational training, process management, verification and validation, and more. Therefore, it considers developers as well as other departments such as marketing, finance, and purchasing. This broader scope might be deemed unnecessarily complex when applied to Agile software development practices [22] such as Extreme Programming, Scrum, and Lean development, which are typical for research software projects.

Figure 1: Design phases of RSMM: The steps involved in developing RSMM v1.0. In the Scope, Design, and Populate phases, we conducted a systematic literature review to collect and explore academic and grey literature, resulting in the creation of RSMM v0.1. In the Test phase, we included Expert Interviews and Expert Confirmations. Each stage in this phase leads to the evolution of RSMM, also producing intermediate versions v0.2 and v0.3 (available upon request from authors). Case studies are conducted in the Deploy phase to validate the model’s applicability.

The focus area maturity model (FAMM) is one type of maturity model [20]. It helps organizations to measure their performance in a particular functional domain. A functional domain consists of different focus areas, each with its capabilities. These capabilities are arranged in a maturity matrix, which helps to identify different maturity levels. Each capability includes various improvement actions, known as practices, that support the organization in gradually improving in that functional domain. Unlike other maturity models, FAMM does not have fixed maturity levels; maturity levels can start from 0 and end at any positive integer. Each focus area is evaluated separately and has its maturity levels. The most relevant examples of FAMM are:

SEG-M2 [18] is a focus area maturity model designed to enhance the governance of software ecosystems within organizations. The model consists of 168 practices collected from structured literature reviews and desk studies, following the maturity model creation steps outlined by de Bruin et al. [17]. However, SEG-M2 focuses on many aspects of software ecosystem governance, such as ecosystem health, open markets, and market and sale-related capabilities. Thus, these are mainly applicable to industry-based software projects.

API-m-FAMM [19] is a model designed for managing APIs. The study integrates the methodology of de Bruin et al. [17] with Design Science Research, using the card sorting technique and multiple evaluation cycles. This model includes capabilities such as version management, documentation, community engagement, and resource management.

Many maturity models exist for the management process of open-source projects [23, 24, 25]. While there are many similarities between those models and ours, a significant difference is that researchers commonly use different criteria for assessing open-source research software (e.g., reproducibility, visibility, impact). Furthermore, open-source projects focus more on testing, reputation, and generic integration. In contrast, RSMM focuses on aspects such as sustainability, visibility, impact measurement [7], integration of research software into research workflows, adoptability, fostering partnerships, improving developers’ skill sets, and recognizing their contributions to research software projects. This model is designed from the perspectives of users, developers, funders, and partners, giving equal importance to all these stakeholders. As mentioned above, our dataset is publicly available [15] and includes a complete description of all the model components. This dataset can help even individuals with limited software engineering training to understand the practices and the resources required to implement them.

In summary, most of these models or frameworks serve industry-level projects; however, none specifically focus on research software project management. This management process includes community building, visibility, conference promotion, and more. Another limitation is that many of these models are restricted to a particular domain or capability. RSMM is designed to address these problems by focusing on research software project management and covering a number of capabilities within a single framework.

To the best of our knowledge, RSMM is the first research software maturity model that combines the best practices from software development and open-source software development while giving importance to the research software engineering practices. To this end, the focus of RSMM extends beyond the developmental aspects of research software, giving equal importance to non-code considerations such as building a community around the research software and enhancing its impact, sustainability, and adoption within the research community.

III Designing the Maturity Model

Inspired by the previous FAMMs [18, 19], we have developed our model RSMM. This section covers its design phases.

Figure 2: RSMM v0.1 (not the final model): The initial model includes 4 focus areas, 18 capabilities, and 61 practices. We included the practices collected through the systematic literature review from both academic and grey literature in the model, with maturity levels from 1 to 7. These practices are not placed based on their maturity.

III-A Research Method

We used De Bruin’s 5-phase approach [17] to design RSMM with the following phases (see Fig. 1):

  • Scope: The scope of RSMM is to evaluate and improve research software project management.

  • Design: The design phase focuses on the questions “why,” “how,” and “who”, as outlined below:

    • The Why: The purpose of RSMM is to help an organization that produces research software to improve their research software project management by assessing and improving the maturity of their projects.

    • The How: RSMM acts as a guideline and helps organizations to understand and implement the best practices effectively to achieve their desired maturity level.

    • The Who: The intended audience of RSMM comprises researchers, research software engineers, research software project managers, funders, and policy makers.

  • Populate: We identified the focus areas, capabilities, and practices of research software project management through a systematic literature review, resulting in RSMM v0.1. Section III-B explains the literature review process to collect components of RSMM.

  • Test: We conducted expert interviews during this phase to position practices within the maturity matrix. Furthermore, we sent the resulting model, RSMM v1.0, to the interview experts for confirmation. Section III-C describes how the collected practices are positioned based on their maturity in the matrix (maturity matrix).

  • Deploy: In this phase, as part of future work, we will conduct a comprehensive case study to validate and verify the applicability of the model. Section V discusses how to use this model to evaluate research software project management using two examples.

The following subsections briefly explain the Populate, Test and Deploy phases.

III-B Literature Review: Scope, Design, and Populate

As discussed in Section II, existing methods for evaluating and improving research software project management need to be revised. We therefore began with a systematic literature review of academic and grey literature, which helped us identify the practices, capabilities, and focus areas of research software project management needed to design the model. We used the snowballing method (both backward and forward snowballing), which expands a literature review by identifying further articles relevant to the topic of interest [26]. We conducted our literature review using the ACM Digital Library and Google Scholar. We used the search string, “(“software management” OR “software project management”) AND (“research software” OR “research software project management”) AND (“community development” OR “community engagement” OR “community participation”) AND (“research software” OR “scientific software”) AND (“FAIR principles” OR “best practices”)” to collect articles related to our topic. Our search was limited to the past 10 years and yielded 36 results in total from both databases. After removing duplicates and irrelevant entries, we retained 19 sources (academic papers and grey literature). Next, we examined the references of these papers to find older relevant sources, adding 5 more papers. After that, we looked for newer documents that cite the original ones to obtain the latest insights. We then identified practices, capabilities, and focus areas from the collected documents. Based on these findings, we started a second round of searching, this time focusing specifically on the identified capabilities and focus areas. As a result of this refined search, 66 new sources (both academic papers and grey literature) were added to the pool of resources. Then, we grouped practices into capabilities and capabilities into focus areas and vice versa (following both a top-down approach and a bottom-up approach for grouping and collecting focus areas, capabilities, and practices [17]).
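
Backward and forward snowballing can be pictured as one step of traversal over a citation graph. The sketch below is only an illustration, assuming a hypothetical `references` mapping from each paper to the papers it cites; the paper identifiers are made up, and this is not the tooling used in the study.

```python
from collections import defaultdict


def snowball(seeds, references):
    """Return (backward, forward) candidate sets for one snowballing round.

    seeds      -- set of paper ids already retained
    references -- dict: paper id -> set of paper ids it cites (hypothetical data)
    """
    # Invert the reference lists to learn who cites whom.
    cited_by = defaultdict(set)
    for paper, cited in references.items():
        for target in cited:
            cited_by[target].add(paper)

    # Backward snowballing: follow the reference lists of the seed papers.
    backward = set()
    for seed in seeds:
        backward |= references.get(seed, set())

    # Forward snowballing: find papers that cite a seed paper.
    forward = set()
    for seed in seeds:
        forward |= cited_by.get(seed, set())

    # Only papers not already in the pool are new candidates for screening.
    return backward - seeds, forward - seeds


# Made-up citation data for illustration only.
references = {
    "P1": {"P0"},
    "P2": {"P1"},
    "P3": {"P1", "P2"},
}
backward, forward = snowball({"P1"}, references)
print(sorted(backward))  # ['P0']
print(sorted(forward))   # ['P2', 'P3']
```

Each candidate returned this way would still need manual screening for relevance before being added to the pool.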

From this method, we identified 61 best practices and grouped these practices into capabilities and focus areas. The described process is a part of the Populate phase, which results in 4 focus areas, 18 capabilities, and 61 practices (indicated as A, B in Figure 1). As depicted in the figure, these practices and capabilities together form the focus areas that represent the functional domain of research software project management. In RSMM, a capability defines the ability to achieve a goal related to research software project management by executing two or more interrelated practices. A practice, in turn, is an action that needs to be taken to improve research software project management. As shown in Fig. 2, we placed the practices of RSMM v0.1 without considering their maturity. Next, we conducted 12 expert interviews to determine the positioning of these practices. The following subsection provides details about these expert interviews.

III-C Expert Interviews and confirmation: Test Phase

The Test phase of the design methodology includes semi-structured expert interviews and confirmation steps for designing RSMM v1.0. We consulted experts in research software development and project management to find the positioning of the practices in the maturity matrix and validate the completeness and correctness of the model. We followed 5 steps in the Test phase, described in detail below:

III-C1 Connecting with experts through Dutch and German RSE groups

We initiated the interview process by approaching experts through Dutch and German RSE groups, providing a brief project overview, and inviting them to express their interest in participating. Participation in the interview process was entirely voluntary.

III-C2 Expert selection

We received 15 positive responses. Following this, we shared the interview protocol with these interested participants, resulting in 12 confirmations from individuals who met the interview selection criteria of our study. We set specific criteria for expert selection to ensure participants had the experience and knowledge in research software project management to obtain valuable results. The expert selection criteria are as follows:

  1) Participants must indicate they are knowledgeable on a minimum of two out of the 4 focus areas of the RSMM;

  2) Participants must have a minimum of 3 years of experience in either developing or managing research software projects;

  3) Participants must work at an organization as a Research Software Engineer, researcher, or project manager as part of a team working on research software projects, or in any comparable role related to research software.

The shared interview protocol outlines procedural steps, rules, and regulations, including the questions to address during the interview. Additionally, the file includes a description of the focus areas, capabilities, and practices used in the model.

III-C3 Conducting expert interviews

We scheduled online meetings based on the participants’ availability. The interviews took between 45 minutes and 1 hour and 36 minutes. Before the interview, we shared our maturity model without including maturity levels. During the expert interview, we asked the experts to arrange practices into the corresponding capability groups based on the order of maturity, seeking their insights.

After completing this task for each focus area, we asked for suggestions about any missing capabilities or practices within that specific focus area and whether any practices had been classified into the wrong capability. This systematic approach was repeated for all 4 focus areas. Figure 3 compares the maturity model templates provided by 3 of the experts, showing how each of them ranked the practices of the first capability, Requirements. The final placement of practices was determined by considering the dependencies between different practices (shown in Fig. 2 as arrows). For example, the practice Provide executable tests must be implemented before the practice Execute tests in a public workflow. We refer the interested reader to the dataset [15] for detailed information.
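
The dependency constraint described above amounts to requiring that a practice never sits at a lower maturity level than a practice it depends on. A minimal sketch of such a check follows; the function name, the `depends_on` mapping, and the level numbers are illustrative assumptions, not values taken from the dataset [15].

```python
def dependency_violations(levels, depends_on):
    """List (practice, prerequisite) pairs whose placement breaks the ordering.

    levels     -- dict: practice name -> maturity level (illustrative values)
    depends_on -- dict: practice name -> list of prerequisite practice names
    Assumption: a prerequisite must not sit at a higher level than its dependant.
    """
    violations = []
    for practice, prerequisites in depends_on.items():
        for prerequisite in prerequisites:
            if levels[prerequisite] > levels[practice]:
                violations.append((practice, prerequisite))
    return violations


depends_on = {"Execute tests in a public workflow": ["Provide executable tests"]}

# Hypothetical placements: the first respects the dependency, the second does not.
ok_levels = {"Provide executable tests": 4, "Execute tests in a public workflow": 5}
bad_levels = {"Provide executable tests": 6, "Execute tests in a public workflow": 5}

print(dependency_violations(ok_levels, depends_on))   # []
print(dependency_violations(bad_levels, depends_on))  # one violating pair
```

A check of this kind could be run after each round of expert input to flag placements that need to be revisited.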

Figure 3: Maturity Model template by experts: The screenshot depicts 3 experts’ ranking of practices within the Requirements capability, illustrating their placements in the RSMM v0.1a expert’s dashboard.

Upon completion of this task, we asked experts questions to evaluate the model applicability, ease of use, and feasibility.

III-C4 Interview data processing

We mapped the practices in each expert’s dashboard to maturity levels, then determined the position of each practice by taking the mode of all experts’ responses. For example, 4 of the 12 experts recommended positioning the Provide executable tests practice at level 4. Consequently, we assigned this practice to maturity level 4, also taking into account its dependency with another practice, Execute tests in a public workflow. We revised the practice names to follow the ‘Verb Object (Qualifier)’ pattern. We also made small adjustments to the names of focus areas and capabilities.
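
Taking the mode of the expert placements can be sketched as follows. The vote list is invented for illustration (only the fact that 4 of 12 experts chose level 4 comes from the text), and the tie-breaking rule of preferring the lowest level is an assumption of this sketch rather than the procedure used in the study.

```python
from collections import Counter


def modal_level(votes):
    """Return the most frequently proposed maturity level.

    Ties are broken in favour of the lowest level (an assumption of this sketch).
    """
    counts = Counter(votes)
    highest = max(counts.values())
    return min(level for level, n in counts.items() if n == highest)


# Invented placements by 12 experts for "Provide executable tests";
# four of them choose level 4, as in the example above.
votes = [4, 4, 4, 4, 3, 3, 3, 5, 5, 2, 6, 7]
print(modal_level(votes))  # 4
```

Because the mode can be reached with a minority of votes, as in this example, the result is best read as a starting placement that is then adjusted for dependencies.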

III-C5 Expert confirmation

We incorporated the suggestions from the interviewees into a new version of the model, RSMM v1.0. During the interviews, experts added 18 new practices. We then validated the new model by sending it back to all interviewees for a second round of feedback. We are still awaiting final feedback from all the experts, but one of the first comments we received was “Yes, I will use input from this model in my future work”, which we perceive as encouraging.

In total, we received approval from 8 experts to publish RSMM v1.0, along with a few suggestions to modify the dataset. These suggestions will be considered in future versions of the model.

III-D Evaluation of research software projects: Deploy Phase

We assessed the applicability of RSMM v1.0 by evaluating two research software projects in the Deploy phase. Section V presents the evaluation results for the research software projects GGIR [27] and ESMValTool [28].

IV Our Maturity Model

This section presents the resulting model RSMM v1.0, which includes 4 focus areas related to Research Software Project Management: Software Project Management, Research Software Management, Community Engagement, and Software Adoptability. Figure 4 illustrates the focus areas, their associated capabilities, and their respective code names. The following subsections briefly describe the 4 focus areas of RSMM v1.0; the complete model is depicted in Fig. 5.

Figure 4: RSMM Focus areas and capabilities: The 4 focus areas and 17 capabilities of the RSMM v1.0 are shown in the figure.
Figure 5: RSMM v1.0: The updated and final version of RSMM. It includes 4 focus areas, 17 capabilities, and 79 practices. These practices are placed between maturity levels 1 to 10. The first column of the table includes the capability codes.

IV-A Software project management

Software project management manages the resources and work activities needed to develop and modify software-intensive systems [29]. The software project is a highly people-intensive effort that extends over a considerable period, significantly impacting the work and performance of various stakeholders, including project managers. The primary success criteria for software managers are delivery of systems that satisfy specified needs and requirements, on time and within budget [29]. Software development involves challenges, including evolving technology, immature technology, sloppy development practices, and staff changes. Software project management addresses these complexities by promoting stakeholder involvement, managing risks, and fostering transparent communication, helping project managers navigate and overcome these obstacles.

Thus, various factors influence the software project management process, and they are almost the same for research software development. We included 3 capabilities and 19 practices related to this focus area. The dataset consists of descriptions of all the practices.

IV-B Research software management

Like software management, research software management is the process of managing research software. The increasing usage of research software in various research domains highlights the importance of adhering to best practices [4]. In this focus area, we included 4 capabilities (Impact measurement, Sustainability, Visibility, and Usage cost and Ethics) and 22 practices.

The 3 practices of RSMM v1.0 for improving research software visibility are Make code citable, Enable indexing of project meta-data [4], and Publish in a research software directory [30]. Figure 6 describes the practice Publish in a research software directory. The practice code (2.3.5) is generated by combining the focus area (2), capability (3), and maturity level (5). The description set states when the practice counts as implemented, organized by MoSCoW category (with W omitted), and lists the resources required to complete it, its dependencies on other practices, and references.

Practice Code: 2.3.5
Practice Name: Publish in a research software directory
Description: Publishing research software projects in a research software directory facilitates discovery, citation, and collaboration among researchers, research software engineers, and users within the research community.
When implemented:
  • (M) Identifying and selecting suitable research software directories that align with the project focus area, target audience, and visibility goals, considering factors such as reputation, coverage, and accessibility.
  • (M) Preparation and submission of comprehensive meta-data and documentation for the research software project to the selected directory.
  • (S) Compliance with directory-specific submission guidelines, metadata standards, and Quality Assurance criteria to ensure accurate representation, indexing, and visibility of the research software project within the directory database and search interface.
  • (C) Monitoring and updating of project listings and meta-data in the research software directory as needed to reflect changes, updates, or new releases of the software.
Resources required:
  • Time: Required for researching and selecting appropriate research software directories, preparing and submitting project listings, and maintaining directory entries over time. Ongoing time and effort may be needed to monitor directory performance, respond to inquiries, and update project information.
  • Knowledge or expertise in directory submission processes, metadata standards, and documentation requirements for research software projects.
  • Collaboration with the research software directory team to publish projects on their website.
Dependencies:
References: [30]

Figure 6: Description set for the practice Publish in a research software directory.
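
The practice code scheme is straightforward to handle programmatically. The following sketch, with the hypothetical names `PracticeCode` and `parse_practice_code`, splits a code such as 2.3.5 into its focus area, capability, and maturity level; it is an illustration and not part of the published dataset.

```python
from typing import NamedTuple


class PracticeCode(NamedTuple):
    focus_area: int
    capability: int
    maturity_level: int

    def __str__(self):
        return f"{self.focus_area}.{self.capability}.{self.maturity_level}"


def parse_practice_code(code):
    """Split a code like '2.3.5' into focus area, capability, and maturity level."""
    parts = code.split(".")
    if len(parts) != 3:
        raise ValueError(f"expected three dot-separated numbers, got {code!r}")
    return PracticeCode(*(int(part) for part in parts))


code = parse_practice_code("2.3.5")
print(code.focus_area, code.capability, code.maturity_level)  # 2 3 5
print(f"capability {code.focus_area}.{code.capability}")       # capability 2.3
print(str(code))                                               # 2.3.5
```

The first two components identify the capability row in the maturity matrix, and the third gives the column in which the practice is placed.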

IV-C Community Engagement

A community generally evolves and maintains research software, creating an ecosystem of competing and collaborative products. It is influenced by the open-source movement culture of sharing and collaboration [31]. Consequently, we have identified 4 capabilities and 16 practices associated with this focus area to foster community development around research software. Below, we describe one of these practices.

Develop code of conduct: A code of conduct outlines how participants should communicate, enforces consequences for violations, and reflects the community values. It creates an inclusive space where everyone can contribute comfortably, regardless of gender, ethnicity, or sexual orientation [32]. Thus, a code of conduct helps effective collaboration, leading to successful project deliverables.

IV-D Software Adoptability

The focus area Software Adoptability concerns how easily and effectively research software can be adopted and utilized by users. This focus area is aimed at understanding and enhancing the user-friendliness, accessibility, and overall adoption strategies of research software by the research community. It includes 6 capabilities and 22 practices. Below are descriptions of one capability and one of its practices.

Documentation: Good documentation is needed for the reuse of research software [33]. For example, including examples in the documentation (Provide a common example usage) offers users a starting point for experimentation. When examples demonstrate the software’s functionality, users can quickly execute the code and enhance their understanding. All practices within the Documentation capability are viewed from the user perspective and emphasize the importance of clarity and accessibility in helping users effectively understand and reuse a research software project.

[Table I: two panels, (a) GGIR and (b) ESMValTool. Rows list the 17 capabilities grouped by focus area (1.1–1.3, 2.1–2.4, 3.1–3.4, 4.1–4.6); columns are numbered 1 to 10. The tick and cross marks in the cells are not reproduced here.]
TABLE I: Case study: the evaluation results of the research software GGIR (left) and ESMValTool (right) using RSMM v1.0. The tick mark in the practices box indicates that the tool follows those practices, while the cross mark indicates that it did not. Grey shading is added to the cells to show the longest path of each focus area that achieves maturity.

V Validation and Discussion

In this section, we illustrate with 2 case studies how our model can be used to evaluate research software project management. Such an evaluation can help researchers and RSEs identify the next steps for improving the maturity of their own research software. It also assists them in choosing existing software for adoption and allows them to track progress and improvements over time. For research organisations and funding agencies, our model can be used to assess the maturity and viability of research software projects, thereby allowing them to make informed decisions on which projects to support and fund.

For our case study, we selected 2 research software projects: GGIR and ESMValTool. GGIR is an R package to process and analyze multi-day data collected with wearable raw-data accelerometers for physical activity and sleep research. ESMValTool is a community-developed climate model diagnostics and evaluation software package. We collected data related to GGIR and ESMValTool from their websites and GitHub repositories. Based on these data, we evaluated both research software projects using RSMM v1.0, resulting in maturity scores of 4-3-6-7 for GGIR and 5-4-8-8 for ESMValTool across the 4 focus areas, in order. The details of these evaluations are presented in Table I.
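To make the scoring step concrete, the following minimal Python sketch shows one plausible way to derive a focus area score from assessed practices. It assumes that the score is the highest level for which every practice placed at that level or below is implemented; this is our reading of the "longest path" in the caption of Table I and not necessarily the exact rule used in RSMM v1.0. The function name, the data layout, and all input values are hypothetical and do not reproduce the GGIR or ESMValTool assessments.

```python
from typing import List, Tuple

# One assessed practice: (capability id, level at which the practice is placed, implemented?).
# Capability ids mimic the row labels of Table I; the levels and outcomes
# below are hypothetical and are NOT taken from the GGIR or ESMValTool evaluation.
AssessedPractice = Tuple[str, int, bool]


def focus_area_maturity(practices: List[AssessedPractice], max_level: int = 10) -> int:
    """Return the highest level L such that every practice placed at
    level L or below is implemented (assumed scoring rule)."""
    missing_levels = [level for _, level, implemented in practices if not implemented]
    if not missing_levels:
        # Nothing is missing, so the focus area reaches the top level.
        return max_level
    # The first missing practice caps the score just below its level.
    return min(missing_levels) - 1


hypothetical_focus_area = [
    ("1.1", 1, True),
    ("1.1", 3, True),
    ("1.2", 2, True),
    ("1.2", 5, False),  # first missing practice, at level 5
    ("1.3", 4, True),
    ("1.3", 7, True),
]

print(focus_area_maturity(hypothetical_focus_area))  # prints 4
```

Applying such a function to each of the 4 focus areas in turn would yield a four-part score of the same form as those reported above.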

GGIR achieves a moderate level of maturity in Software Project Management but requires further improvement in code quality and security practices to achieve higher maturity. GGIR lacks a structured community and is developed through voluntary efforts [34]. It shows a high level of maturity in Software Adoptability, indicating well-defined methodologies in this focus area.

ESMValTool also achieves a moderate level of maturity in Software Project Management, due to its lack of adherence to security practices. However, being an open-source community-developed tool for climate science, it excels in Community Engagement and Software Adoptability and has implemented numerous practices in these focus areas to improve its capabilities.

Thus, using this model, users can evaluate their research software project management by identifying areas that need improvement and practices that need to be followed to achieve higher maturity. For example, in the case of GGIR, evaluating the code quality and security capability within the Software Project Management focus area reveals the absence of several key practices, such as Use crash reporting, Conduct security reviews, Define code coverage targets, and Follow industry standard for security. This currently prevents GGIR from achieving a higher maturity level.
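Such a gap analysis can be sketched in a few lines of Python. The sketch below lists the practices that are not implemented and are placed at or below a chosen target level. The 4 practice names are those mentioned above for GGIR; the levels assigned to them, the additional implemented entry, and the names `Practice` and `blocking_practices` are hypothetical and serve illustration only.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Practice:
    name: str
    level: int         # level at which the practice is placed (hypothetical values below)
    implemented: bool  # outcome of the assessment


def blocking_practices(practices: List[Practice], target_level: int) -> List[Practice]:
    """Return the missing practices placed at or below target_level,
    ordered by level and then by name."""
    missing = [p for p in practices if not p.implemented and p.level <= target_level]
    return sorted(missing, key=lambda p: (p.level, p.name))


# Practice names come from the text; the levels are invented for this example.
code_quality_and_security = [
    Practice("Example implemented practice", 4, True),  # hypothetical placeholder
    Practice("Use crash reporting", 5, False),
    Practice("Define code coverage targets", 5, False),
    Practice("Conduct security reviews", 6, False),
    Practice("Follow industry standard for security", 7, False),
]

for practice in blocking_practices(code_quality_and_security, target_level=6):
    print(f"level {practice.level}: {practice.name}")
```

Sorting by level gives a simple ordering in which the missing practices could be addressed, starting with those that block the lowest level.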

Van Nieuwpoort and Katz [35] noted, however, the importance of roles and context in research software development, stating that not all software is created equal, and thus, not all software must follow the same practices. That is, research software should simply be mature enough to be fit for its intended purpose. In this light, it is understandable that both GGIR and ESMValTool do not adhere to extensive security practices. These tools consist of scripts and libraries designed to support data analysis in their respective research fields, and are not typically used in an online environment or with sensitive data. Thus, future versions of RSMM need a refinement that takes the role and context of the software into account.

VI Conclusion and Future work

In this paper, we have presented the maturity model RSMM, which provides a comprehensive framework for research software project management. RSMM combines best practices from 3 different areas, namely, software engineering practices (such as requirements analysis, code quality, and security), open-source software development practices (including code review, public repository storage with version control, and community building), and research software practices (such as making code citable and findable). To that end, we have collected the components of RSMM using a systematic literature review, categorized them, and further refined the model through expert interviews.

RSMM is aimed at helping research and research support organizations to systematically improve their management processes and achieve the desired maturity in their research software projects. As we uncovered in our interviews with research software engineers, researchers, and project managers, RSMM fits the needs of researchers, research software engineers, funders, policy makers, and organizations that produce research software alike. Using RSMM, a researcher, a research software engineer, or a project manager can identify areas for improvement to achieve higher maturity levels, thereby improving their research project management. Our model allows for benchmarking project management processes across different projects and can help in making informed decisions about project funding. Funders can use these evaluation results to decide which projects to support. Additionally, policy makers can use the model to update and improve policies based on evaluations of previous projects.
Future work: As a final step, we will conduct case studies to evaluate the applicability of RSMM. In these studies, we will assess the maturity of 50 research software projects, examining the relationships between concepts such as FAIR principles, quality, maturity, and impact, which are different but related aspects of research software. Additionally, we aim to refine our maturity model based on feedback from the experts. Lastly, we will investigate software verification practices, which are important for software and research quality.

Acknowledgement

We want to acknowledge the experts who contributed to refining RSMM (in no particular order): Bernadette Fritzsch, Jayesh Badwaik, Michael Schlottke-Lakemper, Pablo Lopez-Tarifa, Raoul Schram, Nicolas Renaud, Arend Rensink, Jan Philipp Dietrich, Axel Loewe, Aljen Uitbeijerse, Tomas Turner-Zwinkels, and Martine de Vos.

References

  • [1] “Open science – embrace it before it’s too late,” Nature, vol. 626, no. 7998, p. 233, 2024.
  • [2] R. Heumüller, S. Nielebock, J. Krüger, and F. Ortmeier, “Publish or perish, but do not forget your software artifacts,” Empirical Software Engineering, vol. 25, no. 6, pp. 4585–4616, 2020.
  • [3] N. P. Chue Hong, D. S. Katz, M. Barker, A.-L. Lamprecht, Y. Yehudi, and RDA FAIR4RS WG. (2022) FAIR Principles for Research Software (FAIR4RS Principles). Version: 1.0. [Online]. Available: https://doi.org/10.15497/RDA00068
  • [4] M. Barker, N. P. Chue Hong, D. S. Katz, A.-L. Lamprecht, C. Martinez-Ortiz, F. Psomopoulos, J. Harrow, L. J. Castro, M. Gruenpeter, P. A. Martinez et al., “Introducing the FAIR Principles for research software,” Sci. Data, vol. 9, no. 1, p. 622, 2022.
  • [5] L. Fortunato and M. Galesic, “The case for free and open source software in research and scholarship,” Phil. Trans. R. Soc. A, vol. 379, p. 20200079, 2021. doi: http://doi.org/10.1098/rsta.2020.0079
  • [6] W. Hasselbring, L. Carr, S. Hettrick, H. Packer, and T. Tiropanis, “From FAIR research data toward FAIR and open research software,” it-Information Technology, vol. 62, no. 1, pp. 39–47, 2020.
  • [7] Deekshitha, S. Farshidi, J. Maassen, R. Bakhshi, R. Van Nieuwpoort, and S. Jansen, “FAIRSECO: An Extensible Framework for Impact Measurement of Research Software,” in Proc. IEEE Conf. on e-Science (e-Science), 2023, pp. 1–10.
  • [8] C. Saunders and M. Wiener, “Making an Impact in A Publish-or-perish World,” in ECIS, 2017, pp. 3255–3259. [Online]. Available: https://aisel.aisnet.org/ecis2017_panels/1/
  • [9] C. Merow, B. Boyle, B. J. Enquist et al., “Better incentives are needed to reward academic software development,” Nat Ecol Evol, vol. 7, pp. 626–627, 2023. doi: https://doi.org/10.1038/s41559-023-02008-w
  • [10] M. Barker, N. P. Chue Hong, D. S. Katz, M. Leggott, A. Treloar, J. van Eijnatten, and S. Aragon, “Research software is essential for research data, so how should governments respond?” Dec. 2021. [Online]. Available: https://doi.org/10.5281/zenodo.5762703
  • [11] J. Carver, N. Weber, K. Ram, S. Gesing, and D. Katz, “A survey of the state of the practice for research software in the United States,” PeerJ Comput Sci., vol. 8, p. e963, May 2022. doi: 10.7717/peerj-cs.963
  • [12] J. Cohen, D. S. Katz, M. Barker, N. Chue Hong, R. Haines, and C. Jay, “The four pillars of research software engineering,” IEEE Software, vol. 38, no. 1, pp. 97–105, 2021. doi: 10.1109/MS.2020.2973362
  • [13] A.-L. Lamprecht, C. Martinez-Ortiz, M. Barker, S. L. Bartholomew, J. Barton, N. C. Hong, J. Cohen, S. Druskat, J. Forest, J.-N. Grad et al., “What do we (not) know about research software engineering?” Journal of Open Research Software, vol. 10, 2022.
  • [14] C. Martinez-Ortiz, M. Kuzak, J. H. Spaaks, J. Maassen, and T. Bakker, “Five recommendations for ‘FAIR software’,” Dec. 2020. [Online]. Available: https://doi.org/10.5281/zenodo.4310217
  • [15] Deekshitha, R. Bakhshi, J. Maassen, C. Martinez-Ortiz, R. van Nieuwpoort, and S. Jansen, “Research Software focus area Maturity Model (RSMM) dataset,” May 2024. [Online]. Available: https://doi.org/10.5281/zenodo.11371911
  • [16] J. Poeppelbuss, B. Niehaves, A. Simons, and J. Becker, “Maturity models in information systems research: literature search and analysis,” Communications of the Association for Information Systems, vol. 29, no. 1, p. 27, 2011. doi: https://doi.org/10.17705/1CAIS.02927
  • [17] T. De Bruin, M. Rosemann, R. Freeze, and U. Kulkarni, “Understanding the main phases of developing a maturity assessment model,” in Proc. Australasian Conf. on Information Systems (ACIS).   AAIS, 2005, pp. 8–19.
  • [18] S. Jansen, “A focus area maturity model for software ecosystem governance,” Inf Softw Technol, vol. 118, p. 106219, 2020.
  • [19] M. Overeem, M. Mathijssen, and S. Jansen, “API-m-FAMM: A focus area maturity model for API Management,” Inf Softw Technol, vol. 147, p. 106890, 2022.
  • [20] M. van Steenbergen, R. Bos, S. Brinkkemper, I. van de Weerd, and W. Bekkers, “The Design of Focus Area Maturity Models,” in Proc. Conf. Global Perspectives on Design Science Research (DESRIST 2010).   Springer Berlin Heidelberg, 2010, pp. 317–332.
  • [21] ITG Consulting, “CMMI v3 and the transition from CMMI v2,” https://consulting.itgonline.com/cmmi-consulting/cmmi-v3-and-the-transition-from-cmmi-v2/, accessed: May 1, 2024.
  • [22] C. Patel and M. Ramachandran, “Agile maturity model (AMM): a software process improvement framework for agile software development practices,” J. Software Engineering (IJSE), vol. 2, no. 1, pp. 3–28, 2009.
  • [23] Opensource.com, “The open organization maturity model,” accessed: January 25, 2024. [Online]. Available: https://opensource.com/open-organization/resources/open-org-maturity-model
  • [24] M. C. Paulk, B. Curtis, M. B. Chrissis, and C. V. Weber, “Capability maturity model, version 1.1,” IEEE software, vol. 10, no. 4, pp. 18–27, 1993.
  • [25] J. K. Crawford, Project management maturity model.   Auerbach Publications, 2006.
  • [26] C. Wohlin, “Guidelines for snowballing in systematic literature studies and a replication in software engineering,” in Proc. Conf. on evaluation and assessment in software engineering, 2014, pp. 1–10.
  • [27] V. van Hees, Z. Fang, E. Mirkes, J. Heywood, J. H. Zhao, C. P. Joan, S. Sabia, and J. H. Migueles, “GGIR,” https://cran.r-project.org/web/packages/GGIR/vignettes/GGIR.html, Sep. 2022, version 2.7-6.
  • [28] B. Andela, B. Broetz, L. de Mora, N. Drost, V. Eyring, N. Koldunov, A. Lauer, B. Mueller et al., “ESMValTool,” Dec. 2023. [Online]. Available: https://github.com/ESMValGroup/ESMValTool/
  • [29] R. E. Fairley, Software Project Management.   John Wiley and Sons Ltd., 2003, pp. 1634–1636.
  • [30] J. H. Spaaks, T. Klaver, S. Verhoeven, F. Diblen, J. Maassen, E. Tjong Kim Sang, P. Pawar, C. Meijer, L. Ridder, L. Kulik, T. Bakker, V. van Hees, L. Bogaardt, A. Mendrik, B. van Es, J. Attema, W. van Hage, E. Ranguelova, R. van Nieuwpoort, R. Gey, and H. Zach, “Research Software Directory,” 2020, version: 3.0.1. [Online]. Available: https://github.com/research-software-directory/research-software-directory
  • [31] D. S. Katz, L. C. McInnes, D. E. Bernholdt, A. C. Mayes, N. P. C. Hong et al., “Community Organizations: Changing the Culture in Which Research Software Is Developed and Sustained,” Computing in Science & Engineering, vol. 21, no. 2, pp. 8–24, 2019. doi: 10.1109/MCSE.2018.2883051
  • [32] P. Tourani, B. Adams, and A. Serebrenik, “Code of conduct in open source projects,” in Proc. IEEE Conf. on software analysis, evolution and reengineering (SANER).   IEEE, 2017, pp. 24–33.
  • [33] S. Hermann and J. Fehr, “Documenting research software in engineering science,” Sci. Rep., vol. 12, no. 1, p. 6567, 2022.
  • [34] “GGIR Software,” accessed: May 27, 2024. [Online]. Available: https://www.accelting.com/ggir-software/
  • [35] R. van Nieuwpoort and D. S. Katz, “Defining the roles of research software,” Upstream, Jun. 2023. [Online]. Available: https://doi.org/10.54900/9akm9y5-5ject5y