Introduction

Mobile robots are increasingly used to perform critical missions in extreme environments, which are inaccessible or hazardous to humans1,2,3,4. These missions range from the inspection and maintenance of offshore wind-turbine mooring chains and high-voltage cables to nuclear reactor repair and deep-space exploration5,6.

Using robots for such missions poses major challenges2,7. First and foremost, the robots need to operate with high levels of autonomy, as in these harsh environments their interaction and communication with human operators are severely restricted. Additionally, they frequently need to make complex mission-critical decisions, with errors endangering not just the robot (itself an expensive asset) but also the important system or environment being inspected, repaired or explored. Last but not least, they need to cope with the considerable uncertainty associated with these missions, which often comprise one-off tasks or are carried out in settings not encountered before.

Addressing these major challenges is the focus of intense research worldwide. In the UK alone, a recent £44.5M research programme has tackled technical and certification challenges associated with the use of robotics and AI in the extreme environments encountered in offshore energy (https://orcahub.org), space exploration (https://www.fairspacehub.org), nuclear infrastructure (https://rainhub.org.uk), and management of nuclear waste (https://www.ncnr.org.uk). This research has initiated a step change in the assurance and certification of autonomous robots—not least through the emergence of new concepts such as dynamic assurance8 and self-certification9 for robotic systems.

Dynamic assurance requires a robot to respond to failures, environmental changes and other disruptions not only by reconfiguring accordingly10, but also by producing new assurance evidence which guarantees that the reconfigured robot will continue to achieve its mission goals8. Self-certifying robots must continually verify their health and ability to complete missions in dynamic, risk-prone environments9. In line with the “defence in depth” safety engineering paradigm11, this runtime verification has to be performed independently of the front-end planning and control engine of the robot.

Despite these advances, current dynamic assurance and self-certification methods rely on quantitative verification techniques (e.g., probabilistic12,13 and statistical14 model checking) that do not handle well the parametric uncertainty that autonomous robots encounter in extreme environments. Indeed, quantitative verification operates with stochastic models that demand single-point estimates of uncertain parameters such as task execution and failure rates. These estimates capture neither epistemic nor aleatory parametric uncertainty. As such, they are affected by arbitrary estimation errors which—because stochastic models are often nonlinear—can be amplified in the verification process15, and may lead to invalid robot reconfiguration decisions, dynamic assurance and self-certification.

In this paper, we present a robust quantitative verification framework that employs Bayesian learning techniques to overcome this limitation. Our framework requires only partial and limited prior knowledge about the verified robotic system, and exploits its runtime observations (or lack thereof) to learn ranges of values for the system parameters. These parameter ranges are then used to compute the quantitative properties that underpin the robot’s decision making (e.g., probability of mission success, and expected energy usage) as intervals that—unique to our framework—capture the parametric uncertainty of the mission. Our framework is underpinned by probabilistic model checking, a technique that is broadly used to assess quantitative properties, e.g., reliability, performance and energy cost of systems exhibiting stochastic behaviour. Such systems include autonomous robots from numerous domains16, e.g., mobile service robots17, spacecraft18, drones19 and robotic swarms20. While we present a case study involving an autonomous underwater vehicle (AUV), the generalisability of our approach stems from the broad adoption of probabilistic model checking for the modelling and verification of this wide range of autonomous robots. As such, we anticipate that our results are applicable to autonomous agents across all these domains.

We start by introducing our robust verification framework, which comprises Bayesian techniques for learning the occurrence rates of both singular events (e.g., catastrophic failures and completion of one-off tasks) and events observed regularly during system operation. Next, we describe the use of the framework for an offshore wind turbine inspection and maintenance robotic mission. Finally, we discuss the framework in the context of related work, and we suggest directions for further research.

Results

Proposed framework

We developed an end-to-end verification framework for the online computation of bounded intervals for continuous-time Markov chain (CTMC) properties that correspond to key dependability and performance properties of autonomous robots. The verification framework integrates interval CTMC model checking21 with two new interval Bayesian inference techniques that we introduce in the Methods section. The first technique, Bayesian inference using partial priors (BIPP), computes bounded interval estimates for the occurrence rates of singular events such as the successful completion of one-off robot tasks, or catastrophic failures. The second technique, Bayesian inference using imprecise probability with sets of priors (IPSP), produces bounded interval estimates for the occurrence rates of regular events encountered by an autonomous robot.

As shown in Fig. 1, the verification process underpinning the application of our framework involves devising a parametric CTMC model that captures the structural aspects of the system under verification through a System Modeller. This activity is typically performed once at design time (i.e., before the system is put into operation) by engineers with modelling expertise. By monitoring the system under verification after deployment, our framework enables observing both the occurrence of regular events and the lack of singular events during times when such events could have occurred (e.g., a catastrophic failure not happening when the system performs a dangerous operation). Our online BIPP Estimator and IPSP Estimator use these observations to calculate expected ranges for the rates of the monitored events, enabling a Model Generator to continually synthesise up-to-date interval CTMCs that model the evolving behaviour of the system.

Fig. 1: Robust Bayesian verification framework.

The integration of Bayesian inference using partial priors (BIPP) and Bayesian inference using imprecise probability with sets of priors (IPSP) with interval continuous-time Markov chain (CTMC) model checking supports the online robust quantitative verification and reconfiguration of autonomous systems under parametric uncertainty.

The interval CTMCs, which are synthesised from the parametric CTMC model, are then continually verified by the PRISM-PSY Model Checker22 to compute value intervals for key system properties. As shown in Fig. 1 and illustrated in the next section, these properties range from dependability (e.g., safety, reliability and availability)23 and performance (e.g., response time and throughput) properties to resource use and system utility. Finally, changes in the value ranges of these properties may prompt the dynamic reconfiguration of the system by a Controller module responsible for ensuring that the system requirements are satisfied at all times.

Offshore infrastructure maintenance

We demonstrate how our online robust verification and reconfiguration framework can support an AUV executing a structural health inspection and cleaning mission on the substructure of an offshore wind farm. Similar scenarios for AUV use in remote, subsea environments have been described in other large-scale robotic demonstration projects, such as the PANDORA EU FP7 project24. Compared to remotely operated vehicles, which must be tethered to expensive oceanographic surface vessels run by specialised personnel, AUVs bring important advantages, including a reduced environmental footprint (since no fuel-consuming surface vessel is needed), reduced cognitive fatigue for the involved personnel, increased frequency of mission execution, and reduced operational and maintenance costs.

The offshore wind farm comprises multiple floating wind turbines, each consisting of a buoyant foundation structure secured to the sea bed with floating chains tethered to anchors weighing several tons. Wind farms with floating wind turbines offer increased wind exploitation (since they can be installed in deeper waters where winds are stronger and more consistent), reduced installation costs (since there is no need to build solid foundations), and reduced visual impact and impact on marine life (since they are further from the shore)25.

The AUV is deployed to collect data about the condition of k ≥ 1 floating chains to enable the post-mission identification of problems that could affect the structural integrity of the asset (floating chain). When the visual inspection of a chain is hindered due to accumulated biofouling or marine growth, the AUV can use its on-board high-pressure water jet to clean the chain and continue with the inspection task24.

High degrees of aleatoric uncertainty in navigation and in the perception of the marine environment mean that the AUV might fail to clean a chain. This uncertainty originates from the dynamic conditions of the underwater medium, which include unexpected water perturbations, coupled with difficulties in scene understanding due to reduced visibility and the need to operate close to the floating chains. When a cleaning attempt fails, the AUV can retry the cleaning task or skip the chain and move to the next.

Stochastic mission modelling

Figure 2 shows the parametric CTMC model of the floating chain inspection and cleaning mission. The AUV inspects the ith chain with rate \(r^{\mathrm{inspect}}\) and consumes energy \(e^{\mathrm{ins}}\). The chain is clean with probability \(p_c\), in which case the AUV travels to the next chain with rate \(r^{\mathrm{travel}}\), consuming energy \(e^{\mathrm{t}}\); otherwise, the chain needs cleaning, with probability \(1 - p_c\). When the AUV attempts the cleaning (\(x_i = 1\)), the task succeeds with chain-dependent rate \(r_i^{\mathrm{clean}}\), causes catastrophic damage to the floating chain or to the AUV itself with rate \(r^{\mathrm{damage}}\), or fails with chain-dependent rate \(r_i^{\mathrm{fail}}\). If the cleaning fails, the AUV prepares to retry with known, fixed rate \(r^{\mathrm{prepare}}\), requiring energy \(e^{\mathrm{p}}\), and it either retries the cleaning (\(x_i = 1\)) or skips the current chain and moves to chain i + 1 (\(x_i = 0\)). After executing the tasks on the kth chain, the AUV returns to its base and completes the mission.

Fig. 2: Floating chain continuous-time Markov chain (CTMC) model of an autonomous underwater vehicle (AUV).

CTMC of the floating chain cleaning and inspection mission, where \(e_1, e_2, \ldots, e_k\) represent the mean energy required to clean chains 1, 2, …, k, respectively. The AUV inspects a chain with rate \(r^{\mathrm{inspect}}\), consuming energy \(e^{\mathrm{ins}}\), prepares to retry the chain cleaning task with rate \(r^{\mathrm{prepare}}\), consuming energy \(e^{\mathrm{p}}\), and travels to the next chain with rate \(r^{\mathrm{travel}}\), consuming energy \(e^{\mathrm{t}}\). During an inspection, the chain is clean with probability \(p_c\), and \(x_1, x_2, \ldots, x_k \in \{0, 1\}\) denote the control parameters used by the AUV controller to synthesise a plan. The rate \(r_i^{\mathrm{fail}}\) corresponds to a regular event and is therefore modelled using Bayesian inference using imprecise probability with sets of priors (IPSP) from (27) and (28). The rates \(r_i^{\mathrm{clean}}\) and \(r^{\mathrm{damage}}\) correspond to singular events and are thus modelled using Bayesian inference using partial priors (BIPP) from (7) and (8).

Since the AUV can fail to clean the ith chain with non-negligible probability and multiple times, this is a regular event whose transition rate \(r_i^{\mathrm{fail}}\) is modelled using the IPSP estimator from (27) and (28). In contrast, the successful cleaning of a chain is a one-off task, and the AUV is expected not to cause catastrophic damage but may, with extremely low probability, do so only once (after which the AUV and/or its mission are likely to be revised); thus, the corresponding transition rates \(r_i^{\mathrm{clean}}\) and \(r^{\mathrm{damage}}\) are modelled using the BIPP estimator from (7) and (8). The other transition rates, i.e., those for inspection (\(r^{\mathrm{inspect}}\)), travelling (\(r^{\mathrm{travel}}\)) and preparation (\(r^{\mathrm{prepare}}\)), are less influenced by the chain conditions and are therefore assumed to be known, e.g., from previous trials and missions; hence, we fixed these transition rates.

When cleaning is needed for the ith chain, the AUV controller synthesises a plan by determining the control parameter \(x_i \in \{0, 1\}\) for all remaining chains \(i, i+1, \ldots, k\) so that the system requirements in Table 1 are satisfied.

Table 1 System requirements for the AUV floating chain inspection and cleaning mission

Robust verification results

We demonstrate our solution for robust verification and adaptation using a mission in which the AUV was deployed to inspect and, if needed, clean six chains placed in a hexagonal arrangement (Fig. 3). We used m = 3 and the BIPP estimator (7) and (8) for the transition rates \(r_i^{\mathrm{clean}}\) and \(r^{\mathrm{damage}}\), which correspond to singular events. For \(r_i^{\mathrm{clean}}\), we used \(\epsilon_1 = 0.12 + \mathcal{U}(0, 0.12)\), \(\theta_1 = 0.10 + \mathcal{U}(0, 0.001)\), \(\epsilon_2 = 0.90 + \mathcal{U}(0, 0.90)\), \(\theta_2 = 0.85 + \mathcal{U}(0, 0.0085)\), where \(\mathcal{U}(x, y)\) denotes a continuous uniform distribution with minimum x and maximum y. For \(r^{\mathrm{damage}}\), we used \(\epsilon_1 = 10^{-8} + \mathcal{U}(0, 10^{-8})\), \(\theta_1 = 0.88 + \mathcal{U}(0, 0.0088)\), \(\epsilon_2 = 10^{-7} + \mathcal{U}(0, 10^{-7})\), \(\theta_2 = 0.10 + \mathcal{U}(0, 0.001)\). For \(r_i^{\mathrm{fail}}\), we used \(t^{(0)} = [10 + \mathcal{U}(0, 10)]\) and \(\lambda^{(0)} = [0.0163 + \mathcal{U}(0, 0.00163)]\). During the mission execution, the AUV performs the model checking at every cleaning attempt, so that runtime observations are incorporated into the decision-making process; this also entails that the currently synthesised plan is not necessarily used at subsequent chains. Hence, the AUV only needs to check system configurations in which at least the current chain is to be cleaned, thus halving the number of configurations to be checked (since configurations with \(x_i = 0\) need not be checked). If all of these checks that consider \(x_i = 1\) fail to satisfy the requirements from Table 1, then the AUV decides to skip the current chain and proceed to inspect and clean the next chain.
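For intuition, the configuration checking and selection performed at each chain can be sketched in a few lines of Python; this is a deliberate simplification written for illustration only, with satisfies_r1_r2 standing in for the interval model checking of requirements R1 and R2:

```python
import random
from itertools import product

def candidate_configurations(i, k):
    """Plans (x_i, ..., x_k) with x_i = 1: since the current chain is to be
    cleaned, configurations with x_i = 0 need not be model checked."""
    return [(1,) + tail for tail in product((0, 1), repeat=k - i)]

def select_configuration(i, k, satisfies_r1_r2):
    """satisfies_r1_r2(x) is a placeholder for the interval model checking
    of requirements R1 and R2 for configuration x."""
    feasible = [x for x in candidate_configurations(i, k) if satisfies_r1_r2(x)]
    if not feasible:
        return None  # no feasible plan: skip the current chain
    most_cleaned = max(sum(x) for x in feasible)  # requirement R3
    return random.choice([x for x in feasible if sum(x) == most_cleaned])
```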

Fig. 3: Demonstration of autonomous underwater vehicle (AUV) inspection and cleaning mission.

a Simulated AUV mission involving the inspection of six wind farm chains and, if required, their cleaning. i Start of mission; ii cleaning chain 3; iii cleaning the final chain. By this point, the AUV had cleaned three chains and skipped one, while one chain did not require cleaning. b Plots of the outcome of the model checking carried out by the AUV at chain 3. Each row shows the configurations against one of the requirements. i–iii Results during the first attempt at cleaning chain 3. iv–vi Results during the second attempt at cleaning. vii–ix Results during the third and successful attempt at cleaning the chain. Configurations marked with a red cross violate the energy requirement R2, while configurations highlighted in yellow denote the configuration chosen for the corresponding attempt.

If a cleaning attempt at chain i fails, the AUV integrates this observation in (27) and (28), and performs model checking to determine whether to retry the cleaning or skip the chain. Since the AUV has consumed energy for the failed cleaning attempt, the energy available is reduced accordingly, which in turn can reduce the number of possible system configurations that can be employed and need checking. The observation of a failed attempt reduces the lower bound for the reliability of cleaning chain i, and may result in a violation of the reliability requirement R1 (Table 1), which may further reduce the number of feasible configurations. If the AUV fails to clean chain i repeatedly, this lower bound will continue to decrease, potentially leaving the AUV with no feasible configuration and forcing it to skip the current chain. Although skipping a chain decreases the overall risk of a catastrophic failure (as the number of cleaning attempts is reduced), leaving chains uncleaned incurs additional cost, as a new inspection mission will need to be launched, e.g., using another AUV or human personnel.

Figure 3 shows a simulated run of the AUV performing an inspection and cleaning mission (Fig. 3a). At each chain that requires cleaning, the AUV decides whether to attempt to clean or skip the current chain. Figure 3b provides details of the probabilistic model checking carried out during the inspection and cleaning of chain 3 (Fig. 3a, ii). Overall, the AUV performed multiple attempts to clean chain 3, succeeding on the third attempt.

The results of the model checking analyses for these attempts are shown in successive columns in Fig. 3b, while each row depicts the analysis of one of the requirements from Table 1. A system configuration is feasible if it satisfies requirement R1 (the AUV avoids a catastrophic failure with probability at least 0.95) and requirement R2 (the expected energy consumption does not exceed the remaining AUV energy). Lastly, if multiple configurations satisfy requirements R1 and R2, the winner is the configuration that maximises the number of chains cleaned (requirement R3). If there is still a tie, the configuration is chosen randomly from those that clean the most chains.

In the AUV’s first attempt at chain 3 (Fig. 3b (i–iii)), all the configurations are feasible, so configuration 1 (highlighted, and corresponding to the highest number of chains cleaned) is selected. This attempt fails, and a second assessment is made (Fig. 3b (iv–vi)). This time, only system configurations 2–8 are feasible, and as configurations 2, 3, and 5 maximise R3, a configuration is chosen randomly from this subset (in this case, configuration 3). This attempt also fails, and on the third attempt (Fig. 3b (vii–ix)), only configurations 4–8 are feasible, with 5 maximising R3, and the AUV adopts this configuration and succeeds in cleaning the chain.

In this AUV mission instance, the AUV controller is concerned with cleaning the maximum number of chains and ensuring the AUV returns safely. In other variants of our AUV mission, the system properties from requirements R1 and R2 could also be used to determine a winning configuration in the event of a tie between multiple feasible configurations. For example, it might be optimal for the AUV to consume minimal energy in this scenario. Thus, the energy consumption from requirement R2 can be used as a tie-breaking metric for choosing a configuration.

We also measured the overheads associated with executing the online verification process. Figure 4 shows the computation overheads incurred by our robust Bayesian verification (RBV) framework when executing the AUV mission. The values comprising each boxplot were collected over 10 independent runs. Each value denotes the time consumed by a single online robust quantitative verification and reconfiguration step when the AUV attempts to clean the indicated chain. For instance, the boxplot labelled 'Chain 1' ('Chain 2') on the x-axis corresponds to the AUV attempting to clean chain 1 (chain 2), and to the time consumed by the RBV framework to analyse 64 (32) configurations. Overall, the time overheads are reasonable for the purpose of this mission. Since the AUV has more configurations to analyse at the earlier stages of the mission (e.g., when inspecting chain 1), the results follow the anticipated exponential pattern: the number of configurations halves each time the AUV progresses further into the mission and moves to the next chain. Another interesting observation is that each boxplot is short, i.e., the lower and upper quartiles are very close, indicating that the RBV framework's execution time is consistent across runs.

Fig. 4: Verification time overheads.

Time taken by our robust Bayesian verification framework to execute the online quantitative verification and reconfiguration step over 10 independent runs when the robot attempts to clean the indicated chain.

The consumed time comprises (1) the time required to compute the posterior estimate bounds of the modelled transition rates, \(r_i^{\mathrm{clean}}\), \(r_i^{\mathrm{fail}}\), \(1 \le i \le k\), and \(r^{\mathrm{damage}}\), using the BIPP and IPSP estimators; (2) the time required to compute the value intervals for requirements R1 and R2 using the probabilistic model checker PRISM-PSY22; and (3) the time needed to find the best configuration satisfying requirements R1 and R2, and maximising requirement R3. Our empirical analysis provided evidence that the execution of the BIPP and IPSP estimators and the selection of the best configuration have negligible overheads, with almost all time incurred by PRISM-PSY. This outcome is not surprising and is aligned with the results reported in22 concerning the execution overheads of the model checker.

Additional information about the offshore infrastructure maintenance experiments, including details about the experimental methodology, is provided in Supplementary Methods 2 of the Supplementary material. The simulator used for the AUV mission, developed on top of the open-source MOOS-IvP middleware26, and a video showing the execution of this AUV mission instance are available at http://github.com/gerasimou/RBV.

Discussion

Unlike single-point estimators of Markov model parameters27,28,29,30, our Bayesian framework provides interval estimates that capture the inherent uncertainty of these parameters, enabling the robust quantitative verification of systems such as autonomous robots. Through its ability to exploit prior knowledge, the framework differs fundamentally from, and is superior to, a recently introduced approach to synthesising intervals for unknown transition parameters based on the frequentist theory of simultaneous confidence intervals15,31,32. Furthermore, instead of applying the same estimator to all Markov model transition parameters like existing approaches, our framework is the first to handle parameters corresponding to singular and regular events differently. This is an essential distinction, especially for the former type of parameter, for which the absence of observations violates a key premise of existing estimators. Our BIPP estimator avoids this invalid premise, and computes two-sided bounded estimates for singular CTMC transition rates—a considerable extension of our preliminary work to devise one-sided bounded estimates for the singular transition probabilities of discrete-time Markov chains33.

The proposed Bayesian framework is underpinned by the theoretical foundations of imprecise probabilities34,35 and Conservative Bayesian Inference (CBI)36,37,38 integrated with recent advances in the verification of interval CTMCs22. In particular, our BIPP theorems for singular events extend CBI significantly in several ways. First, BIPP operates in the continuous domain for a Poisson process, while previous CBI theorems are applicable to Bernoulli processes in the discrete domain. As such, BIPP enables the runtime quantitative verification of interval CTMCs, and thus the analysis of important properties that are not captured by discrete-time Markov models. Second, CBI is one-sided (upper) bounded, and therefore only supports the analysis of undesirable singular events (e.g., catastrophic failures). In contrast, BIPP provides two-sided bounded estimates, therefore also enabling the analysis of “positive” singular events (e.g., the completion of difficult one-off tasks). Finally, BIPP can operate with any arbitrary number of confidence bounds as priors, which greatly increases the flexibility of exploiting different types of prior knowledge.

As illustrated by its application to an AUV infrastructure maintenance mission, our robust quantitative verification framework removes the need for precise prior beliefs, which are typically unavailable in many real-world verification tasks that require Bayesian inference. Instead, the framework enables the exploitation of Bayesian combinations of partial or imperfect prior knowledge, which it uses to derive informed estimation errors (i.e., intervals) for the predicted model parameters. Combined with existing techniques for obtaining this prior knowledge, e.g., the Delphi method and its variants39 or reference class forecasting40, the framework increases the trustworthiness of Bayesian inference in highly uncertain scenarios such as those encountered in the verification of autonomous robots.

Recent survey papers41,42,43 that provide in-depth discussions of the challenges and opportunities in the field of autonomous robot verification reveal a common taxonomy revolving around two key dimensions. The first dimension centres on the specification of the properties under verification, which includes various types of temporal logic languages41. The second dimension pertains to how system behaviours are modelled and structured. In this regard, formal models such as Belief Desire Intention, Petri nets, and finite state machines, along with their diverse extensions, have emerged as popular approaches to capturing the intricate dynamics of autonomous systems. Our approach falls within the category of methods utilising continuous stochastic logic (CSL) and CTMCs for the verification of robots. However, unlike the existing methods from this category44,45, we introduced treatments of model parameter uncertainty via robust Bayesian learning methods, and integrated them with recent research on interval CTMC model checking.

Another important approach for verifying the behaviour of autonomous agents under uncertainty uses hidden Markov models (HMMs)46,47,48. HMM-based verification supports the analysis of stochastic systems whose true state is not observable, and can only be estimated (with aleatoric uncertainty given by a predefined probability distribution) through monitoring a separate process whose observable state depends on the unknown state of the system. In contrast, our verification framework supports the analysis of autonomous agents whose true state is observable but for which the rates of transition between these known states are affected by epistemic uncertainty and need to be learnt from system observations (as shown in Fig. 1). As such, HMM-based verification and our robust verification framework differ substantially by tackling different types of autonomous agent uncertainty. Because autonomous agents may be affected by both types of uncertainty, the complementarity of the two verification approaches can actually be leveraged by using our BIPP and IPSP Bayesian estimators in conjunction with HMM-based verification, i.e., to learn the transition rates associated with continuous-time HMMs that model the behaviour of an autonomous agent. Nevertheless, achieving this integration will first require the development of generic continuous-time HMM verification techniques since, to the best of our knowledge, only techniques and tools for the verification of discrete-time HMMs are currently available.

Although our method demonstrates promising potential, it is not without limitations. One limitation is scalability—as the complexity of the robot’s behaviour and the environment grow, the number of unknown parameters to be estimated at runtime may increase, leading to increased computational overheads for our Bayesian estimators. Additionally, the method requires a certain level of expertise to construct the underlying CTMC model structure. This demands understanding both the robot’s dynamics and the environment in order to model them as a CTMC, making the approach less accessible to those without specialised knowledge. Last but not least, a challenge inherent to all Bayesian methods involves the acquisition of appropriate priors. While our robust Bayesian estimators mitigate this issue by eliminating the need for complete and precise prior knowledge, establishing the required partial and vague priors can still pose challenges. These limitations suggest important areas for future work.

Methods

Quantitative verification

Quantitative verification is a mathematically based technique for analysing the correctness, reliability, performance and other key properties of systems with stochastic behaviour49,50. The technique captures this behaviour into Markov models, formalises the properties of interest as probabilistic temporal logic formulae over these models, and employs efficient algorithms for their analysis. Examples of such properties include the probability of mission failure for an autonomous robot, and the expected battery energy required to complete a robotic mission.

In this paper, we focus on the quantitative verification of continuous-time Markov chains (CTMCs). CTMCs are Markov models for continuous-time stochastic processes over countable state spaces, comprising (i) a finite set of states corresponding to real-world states of the system that are relevant for the analysed properties; and (ii) the rates of transition between these states. We use the following definition, adapted from the probabilistic model checking literature49,50.

Definition 1

A continuous-time Markov chain is a tuple

$$\mathcal{M} = (S, s_0, \mathbf{R}),$$
(1)

where \(S\) is a finite set of states, \(s_0 \in S\) is the initial state, and \(\mathbf{R} : S \times S \to \mathbb{R}\) is a transition rate matrix such that the probability that the CTMC will leave state \(s_i \in S\) within \(t > 0\) time units is \(1 - e^{-t \cdot \sum_{s_k \in S \setminus \{s_i\}} \mathbf{R}(s_i, s_k)}\), and the probability that the new state is \(s_j \in S \setminus \{s_i\}\) is \(p_{ij} = \mathbf{R}(s_i, s_j) / \sum_{s_k \in S \setminus \{s_i\}} \mathbf{R}(s_i, s_k)\).
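To make the semantics of Definition 1 concrete, the following minimal Python sketch (an illustration only, with an invented three-state rate matrix; not part of our framework implementation) computes these two probabilities:

```python
import numpy as np

# Hypothetical 3-state CTMC rate matrix R (diagonal entries unused).
R = np.array([[0.0, 2.0, 1.0],
              [0.5, 0.0, 0.0],
              [0.0, 0.0, 0.0]])  # state 2 is absorbing

def leave_probability(R, i, t):
    """Probability of leaving state i within t time units (Definition 1)."""
    exit_rate = R[i].sum() - R[i, i]
    return 1.0 - np.exp(-t * exit_rate)

def jump_probabilities(R, i):
    """Probability p_ij of moving to each state j != i, given a transition occurs."""
    rates = R[i].copy()
    rates[i] = 0.0
    total = rates.sum()
    return rates / total if total > 0 else rates

print(leave_probability(R, 0, 1.0))  # 1 - e^{-3}, since the exit rate of state 0 is 3
print(jump_probabilities(R, 0))      # [0, 2/3, 1/3]
```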

The range of properties that can be verified using CTMCs can be extended by annotating the states and transitions with non-negative quantities called rewards.

Definition 2

A reward structure over a CTMC \(\mathcal{M} = (S, s_0, \mathbf{R})\) is a pair of functions \((\underline{\rho}, \boldsymbol{\iota})\) such that \(\underline{\rho} : S \to \mathbb{R}_{\ge 0}\) is a state reward function (a vector), and \(\boldsymbol{\iota} : S \times S \to \mathbb{R}_{\ge 0}\) is a transition reward function (a matrix).

CTMCs support the verification of quantitative properties expressed in CSL51 extended with rewards50.

Definition 3

Given a set of atomic propositions \(AP\), \(a \in AP\), \(p \in [0, 1]\), \(I \subseteq \mathbb{R}_{\ge 0}\), \(r, t \in \mathbb{R}_{\ge 0}\) and \(\bowtie\; \in \{\ge, >, <, \le\}\), a CSL formula Φ is defined by the grammar:

$$\Phi ::= \mathit{true} \mid a \mid \Phi \wedge \Phi \mid \neg\Phi \mid P_{\bowtie p}[X\,\Phi] \mid P_{\bowtie p}[\Phi\, U^{I}\, \Phi] \mid \mathcal{S}_{\bowtie p}[\Phi] \mid R_{\bowtie r}[I^{=t}] \mid R_{\bowtie r}[C^{\le t}] \mid R_{\bowtie r}[F\,\Phi] \mid R_{\bowtie r}[S].$$

Given a CTMC \(\mathcal{M} = (S, s_0, \mathbf{R})\) with states labelled with atomic propositions from \(AP\) by a function \(L : S \to 2^{AP}\), and a reward structure \((\underline{\rho}, \boldsymbol{\iota})\) over \(\mathcal{M}\), the CSL semantics is defined with a satisfaction relation ⊧ over the states and paths (i.e., feasible sequences of successive states) of \(\mathcal{M}\)49. The notation \(s \vDash \Phi\) means “Φ is satisfied in state s”. For any state \(s \in S\), we have:

  • \(s \vDash \mathit{true}\); \(s \vDash a\) iff \(a \in L(s)\); \(s \vDash \neg\Phi\) iff \(\neg(s \vDash \Phi)\); and \(s \vDash \Phi_1 \wedge \Phi_2\) iff \(s \vDash \Phi_1\) and \(s \vDash \Phi_2\);

  • \(s \vDash P_{\bowtie p}[X\,\Phi]\) iff the probability x that Φ holds in the state following s satisfies \(x \bowtie p\) (probabilistic next formula);

  • \(s \vDash P_{\bowtie p}[\Phi_1\, U^{I}\, \Phi_2]\) iff, across all paths starting at s, the probability x of going through only states where \(\Phi_1\) holds until reaching a state where \(\Phi_2\) holds at a time \(t \in I\) satisfies \(x \bowtie p\) (probabilistic until formula);

  • \(s \vDash \mathcal{S}_{\bowtie p}[\Phi]\) iff, having started in state s, the probability x of \(\mathcal{M}\) reaching a state where Φ holds in the long run satisfies \(x \bowtie p\) (probabilistic steady-state formula);

  • the instantaneous (\(R_{\bowtie r}[I^{=t}]\)), cumulative (\(R_{\bowtie r}[C^{\le t}]\)), future-state (\(R_{\bowtie r}[F\,\Phi]\)) and steady-state (\(R_{\bowtie r}[S]\)) reward formulae hold iff, having started in state s, the expected reward x at time instant t, cumulated up to time t, cumulated until reaching a state where Φ holds, and achieved at steady state, respectively, satisfies \(x \bowtie r\).

Probabilistic model checkers such as PRISM52 and Storm53 use efficient analysis techniques to compute the actual probabilities and expected rewards associated with probabilistic and reward formulae, respectively. The formulae are then verified by comparing the computed values to the bounds p and r. Furthermore, the extended CSL syntax \(P_{=?}[X\,\Phi]\), \(P_{=?}[\Phi_1\, U^{I}\, \Phi_2]\), \(R_{=?}[I^{=t}]\), etc. can be used to obtain these values from the model checkers.
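For instance, a purely illustrative property (with invented atomic propositions rather than ones taken from a specific model) is

$$P_{\ge 0.95}\left[\neg\mathit{damaged}\ U^{[0,\infty)}\ \mathit{done}\right],$$

which requires the probability of reaching a state where done holds, passing only through states where damaged does not hold, to be at least 0.95; similarly, \(R_{=?}[F\ \mathit{done}]\) queries the expected cumulative reward (e.g., energy consumed) until a done-state is reached.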

While the transition rates of the CTMCs verified in this way must be known and constant, advanced quantitative verification techniques21 support the analysis of CTMCs whose transition rates are specified as intervals. These techniques have been used to synthesise CTMCs corresponding to process configurations and system designs that satisfy quantitative constraints and optimisation criteria22,32,54, under the assumption that these bounded intervals are available. Here we introduce a Bayesian framework for computing these intervals in ways that reflect the parametric uncertainty of real-world systems such as autonomous robots.

Bayesian learning of CTMC transition rates

Given two states \(s_i\) and \(s_j\) of a CTMC such that transitions from \(s_i\) to \(s_j\) are possible and occur with rate λ, each transition from \(s_i\) to \(s_j\) is independent of how state \(s_i\) was reached (the Markov property). Furthermore, the occurrence of these transitions over the time spent in state \(s_i\) is modelled by a homogeneous Poisson process of rate λ. Accordingly, the likelihood that ‘data’ collected by observing the CTMC shows n such transitions occurring within a combined time t spent in state \(s_i\) is given by the conditional probability:

$$l(\lambda) = \Pr(\mathrm{data} \mid \lambda) = \frac{(\lambda t)^n}{n!}\, e^{-\lambda t}$$
(2)

In practice, the rate λ is typically unknown, but prior beliefs about its value are available (e.g., from domain experts or from past missions performed by the system modelled by the CTMC) in the form of a probability (density or mass) function f(λ). In this common scenario, Bayes' theorem can be used to derive a posterior probability function that combines the likelihood l(λ) and the prior f(λ) into a better estimate for λ at time t:

$$f(\lambda \mid \mathrm{data}) = \frac{l(\lambda)\, f(\lambda)}{\int_0^\infty l(\lambda)\, f(\lambda)\, \mathrm{d}\lambda}$$
(3)

where the Lebesgue-Stieltjes integral in the denominator is introduced to ensure that \(f(\lambda \mid \mathrm{data})\) is a probability function. We note that we use Lebesgue-Stieltjes integration to cover both continuous and discrete prior distributions f(λ) in a compact way, as these integrals naturally reduce to sums for discrete distributions. We calculate the posterior estimate for the rate λ at time t as the expectation of (3):

$$\lambda^{(t)} = \mathbb{E}\left[\Lambda \mid \mathrm{data}\right] = \frac{\int_0^\infty \lambda\, l(\lambda)\, f(\lambda)\, \mathrm{d}\lambda}{\int_0^\infty l(\lambda)\, f(\lambda)\, \mathrm{d}\lambda},$$
(4)

where we use capital letters for random variables and lower case for their realisations.
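For illustration, the posterior estimate (4) can be evaluated by direct numerical quadrature for any continuous prior; the sketch below is our own (the exponential prior and the observation counts are invented for the example):

```python
import math
from scipy.integrate import quad

def likelihood(lam, n, t):
    """Poisson likelihood (2): n transitions observed over a combined time t."""
    return (lam * t) ** n / math.factorial(n) * math.exp(-lam * t)

def posterior_rate(prior_pdf, n, t, upper=100.0):
    """Posterior estimate (4): E[Lambda | data] under the given prior pdf.
    'upper' truncates the integration range; the integrand is negligible beyond it."""
    num, _ = quad(lambda lam: lam * likelihood(lam, n, t) * prior_pdf(lam), 0.0, upper)
    den, _ = quad(lambda lam: likelihood(lam, n, t) * prior_pdf(lam), 0.0, upper)
    return num / den

# Exponential prior with rate 2 (a Gamma(1, 2) prior); 4 events observed in t = 10.
prior = lambda lam: 2.0 * math.exp(-2.0 * lam)
print(posterior_rate(prior, n=4, t=10.0))  # ~0.4167 = (4 + 1) / (10 + 2)
```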

Interval Bayesian inference for singular events

In the autonomous-robot missions considered in our paper, certain events are extremely rare, and treated as unique from a modelling viewpoint. These events include major failures (after each of which the system is modified to remove or mitigate the cause of the failure), and the successful completion of difficult one-off tasks. Using Bayesian inference to estimate the CTMC transition rates associated with such events is challenging because, with no observations of these events, the posterior estimate is highly sensitive to the choice of a suitable prior distribution. Furthermore, only limited domain knowledge is often available to select and justify a prior distribution for these singular events.

To address this challenge, we develop a Bayesian inference using partial priors (BIPP) estimator that requires only limited, partial prior knowledge instead of the complete prior distribution typically needed for Bayesian inference. For one-off events, such knowledge is both more likely to be available and easier to justify. BIPP provides bounded posterior estimates that are robust in the sense that the ground truth rate values are within the estimated intervals.

To derive the BIPP estimator, we note that for one-off events the likelihood (2) becomes

$$l(\lambda) = \Pr(\mathrm{data} \mid \lambda) = e^{-\lambda t}$$
(5)

because n = 0. Instead of a prior distribution f(λ) (required to compute the posterior expectation (4)), we assume that we only have limited partial knowledge consisting of m ≥ 2 confidence bounds on f(λ):

$$\Pr(\epsilon_{i-1} < \lambda \le \epsilon_i) = \theta_i$$
(6)

where \(1 \le i \le m\), \(\theta_i > 0\), and \(\sum_{i=1}^m \theta_i = 1\). The use of such bounds is common practice for safety-critical systems. As an example, the IEC 61508 safety standard55 defines safety integrity levels (SILs) for the critical functions of a system based on bounds for their probability of failure on demand (pfd): a pfd between \(10^{-2}\) and \(10^{-1}\) corresponds to SIL 1, a pfd between \(10^{-3}\) and \(10^{-2}\) corresponds to SIL 2, etc.; and testing can be used to estimate the probabilities that a critical function has different SILs. We note that \(\Pr(\lambda \ge \epsilon_0) = \Pr(\lambda \le \epsilon_m) = 1\) and that, when no specific information is available, we can use \(\epsilon_0 = 0\) and \(\epsilon_m = +\infty\).

The partial knowledge encoded by the constraints (6) is far from a complete prior distribution: an infinite number of distributions f(λ) satisfy these constraints, and the result below provides bounds for the posterior estimate rate (4) across all these distributions.

Theorem 1

The set \(\mathcal{S}_\lambda\) of posterior estimate rates (4) computed for all prior distributions f(λ) that satisfy (6) has an infimum \(\lambda_l\) and a supremum \(\lambda_u\) given by:

$$\lambda_l = \min\left\{\frac{\sum_{i=1}^m \left[\epsilon_i l(\epsilon_i)(1 - x_i)\theta_i + \epsilon_{i-1} l(\epsilon_{i-1})\, x_i \theta_i\right]}{\sum_{i=1}^m \left[l(\epsilon_i)(1 - x_i)\theta_i + l(\epsilon_{i-1})\, x_i \theta_i\right]} \;\middle|\; \forall 1 \le i \le m.\ x_i \in [0, 1]\right\},$$
(7)
$$\lambda_u = \max\left\{\frac{\sum_{i=1}^m \lambda_i l(\lambda_i)\theta_i}{\sum_{i=1}^m l(\lambda_i)\theta_i} \;\middle|\; \forall 1 \le i \le m.\ \lambda_i \in (\epsilon_{i-1}, \epsilon_i]\right\}.$$
(8)

Before providing a proof for Theorem 1, we note that the values λl and λu can be computed using numerical optimisation software packages available, for instance, within widely used mathematical computing tools like MATLAB and Maple. For applications where computational resources are limited or the BIPP estimator is used online with tight deadlines, the following corollaries (whose proofs are provided in our supplementary material) give closed-form estimator bounds for m = 3 (with m = 2 as a subcase).
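As an illustration of this numerical route, the sketch below is a minimal implementation of Theorem 1 rather than our production code; it assumes finite ϵ0, …, ϵm and uses a local optimiser, so multistart or grid refinement may be needed in practice:

```python
import numpy as np
from scipy.optimize import minimize

def bipp_bounds(eps, theta, t):
    """Numerical BIPP bounds (7)-(8) for a singular event unobserved over time t.
    eps = [eps_0, ..., eps_m] (finite), theta = [theta_1, ..., theta_m], sum = 1."""
    l = lambda lam: np.exp(-lam * t)  # likelihood (5), n = 0
    m = len(theta)

    def lower_obj(x):  # objective of (7), with x[i] in [0, 1]
        num = sum(eps[i + 1] * l(eps[i + 1]) * (1 - x[i]) * theta[i]
                  + eps[i] * l(eps[i]) * x[i] * theta[i] for i in range(m))
        den = sum(l(eps[i + 1]) * (1 - x[i]) * theta[i]
                  + l(eps[i]) * x[i] * theta[i] for i in range(m))
        return num / den

    def neg_upper_obj(lams):  # negated objective of (8), lams[i] in (eps[i], eps[i+1]]
        num = sum(lams[i] * l(lams[i]) * theta[i] for i in range(m))
        den = sum(l(lams[i]) * theta[i] for i in range(m))
        return -num / den

    res_l = minimize(lower_obj, x0=np.full(m, 0.5), bounds=[(0.0, 1.0)] * m)
    res_u = minimize(neg_upper_obj, x0=[(eps[i] + eps[i + 1]) / 2 for i in range(m)],
                     bounds=[(eps[i], eps[i + 1]) for i in range(m)])
    return res_l.fun, -res_u.fun

# Invented example: three confidence bounds, no event observed over 1000 time units.
print(bipp_bounds(eps=[0.0, 2e-4, 1e-3, 5e-3], theta=[0.3, 0.3, 0.4], t=1000.0))
```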

Corollary 1

When m = 3, the bounds (7) and (8) satisfy:

$$\lambda_l \ge \begin{cases} \dfrac{\epsilon_1 l(\epsilon_1)\theta_2}{\theta_1 + l(\epsilon_1)\theta_2}, & \text{if } \dfrac{\theta_2(\epsilon_1 - \epsilon_2)}{\theta_1} > \dfrac{\epsilon_2 l(\epsilon_2) - \epsilon_1 l(\epsilon_1)}{l(\epsilon_1)\, l(\epsilon_2)} \\[2ex] \dfrac{\epsilon_2 l(\epsilon_2)\theta_2}{\theta_1 + l(\epsilon_2)\theta_2}, & \text{otherwise} \end{cases}$$
(9)

and

$$\lambda_u < \begin{cases} \dfrac{\epsilon_1 l(\epsilon_1)\theta_1 + \epsilon_2 l(\epsilon_2)\theta_2 + \frac{1}{t} l(\frac{1}{t})(1 - \theta_1 - \theta_2)}{l(\epsilon_1)\theta_1}, & \text{if } t < \frac{1}{\epsilon_2} \\[2ex] \dfrac{\epsilon_1 l(\epsilon_1)\theta_1 + \frac{1}{t} l(\frac{1}{t})\theta_2 + \epsilon_2 l(\epsilon_2)(1 - \theta_1 - \theta_2)}{l(\epsilon_1)\theta_1}, & \text{if } \frac{1}{\epsilon_2} \le t \le \frac{1}{\epsilon_1} \\[2ex] \dfrac{\epsilon_1 l(\epsilon_1)(\theta_1 + \theta_2) + \epsilon_2 l(\epsilon_2)(1 - \theta_1 - \theta_2)}{l(\epsilon_1)\theta_1}, & \text{otherwise} \end{cases}$$
(10)

Corollary 2

Closed-form BIPP bounds for m = 2 can be obtained by setting ϵ2 = ϵ1 and θ2 = 0 in (9) and (10).
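A direct transcription of the m = 3 bounds (9) and (10) into code could look as follows (our own sketch, assuming ϵ1 < ϵ2 and no observed events):

```python
import math

def bipp_bounds_m3(eps1, eps2, theta1, theta2, t):
    """Closed-form BIPP bounds for m = 3 from (9) and (10); l is the likelihood (5)."""
    l = lambda lam: math.exp(-lam * t)
    # Lower bound (9)
    if theta2 * (eps1 - eps2) / theta1 > (eps2 * l(eps2) - eps1 * l(eps1)) / (l(eps1) * l(eps2)):
        lam_l = eps1 * l(eps1) * theta2 / (theta1 + l(eps1) * theta2)
    else:
        lam_l = eps2 * l(eps2) * theta2 / (theta1 + l(eps2) * theta2)
    # Upper bound (10)
    rest = 1.0 - theta1 - theta2
    if t < 1.0 / eps2:
        lam_u = (eps1 * l(eps1) * theta1 + eps2 * l(eps2) * theta2 + (1 / t) * l(1 / t) * rest) / (l(eps1) * theta1)
    elif t <= 1.0 / eps1:
        lam_u = (eps1 * l(eps1) * theta1 + (1 / t) * l(1 / t) * theta2 + eps2 * l(eps2) * rest) / (l(eps1) * theta1)
    else:
        lam_u = (eps1 * l(eps1) * (theta1 + theta2) + eps2 * l(eps2) * rest) / (l(eps1) * theta1)
    return lam_l, lam_u

# The m = 2 subcase (Corollary 2), eps2 = eps1 and theta2 = 0, gives lam_l = 0.
print(bipp_bounds_m3(1 / 5000, 1 / 1000, 0.1, 0.1, t=2000.0))
```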

To prove Theorem 1, we require the following Lemma and Propositions.

Lemma 1

If \(l(\cdot)\) is the likelihood function defined in (5), then \(g : (0, \infty) \to \mathbb{R}\), \(g(w) = w \cdot l^{-1}(w)\), is a concave function.

Proof

Since \(g(w) = w \cdot \left(-\frac{\ln w}{t}\right)\) and \(t > 0\), the second derivative of g satisfies

$$\frac{\mathrm{d}^2 g}{\mathrm{d}w^2} = \frac{\mathrm{d}}{\mathrm{d}w}\left[-\frac{\ln w}{t} - \frac{1}{t}\right] = -\frac{1}{wt} < 0.$$
(11)

Thus, g(w) is concave. □

Proposition 1

With the notation from Theorem 1, there exist m values \(\lambda_1 \in (\epsilon_0, \epsilon_1]\), \(\lambda_2 \in (\epsilon_1, \epsilon_2]\), …, \(\lambda_m \in (\epsilon_{m-1}, \epsilon_m]\) such that \(\sup \mathcal{S}_\lambda\) is the posterior estimate (4) obtained by using as prior the m-point discrete distribution with probability mass \(f(\lambda_i) = \Pr(\lambda = \lambda_i) = \theta_i\) for \(i = 1, 2, \ldots, m\).

Proof

Since f(λ) = 0 for λ ∉ [ϵ0, ϵm], the Lebesgue-Stieltjes integration from the objective function (4) can be rewritten as:

$$\mathbb{E}(\Lambda \mid \mathrm{data}) = \frac{\sum_{i=1}^m \int_{\epsilon_{i-1}}^{\epsilon_i} \lambda\, l(\lambda)\, f(\lambda)\, \mathrm{d}\lambda}{\sum_{i=1}^m \int_{\epsilon_{i-1}}^{\epsilon_i} l(\lambda)\, f(\lambda)\, \mathrm{d}\lambda}$$
(12)

The first mean value theorem for integrals (e.g., ref. 56, p. 249) ensures that, for every \(i = 1, 2, \ldots, m\), there are points \(\lambda_i, \lambda_i' \in [\epsilon_{i-1}, \epsilon_i]\) such that:

$$\int_{\epsilon_{i-1}}^{\epsilon_i} l(\lambda)\, f(\lambda)\, \mathrm{d}\lambda = l(\lambda_i) \int_{\epsilon_{i-1}}^{\epsilon_i} f(\lambda)\, \mathrm{d}\lambda = l(\lambda_i)\,\theta_i$$
(13)
$$\int_{\epsilon_{i-1}}^{\epsilon_i} \lambda\, l(\lambda)\, f(\lambda)\, \mathrm{d}\lambda = \lambda_i'\, l(\lambda_i') \int_{\epsilon_{i-1}}^{\epsilon_i} f(\lambda)\, \mathrm{d}\lambda = \lambda_i'\, l(\lambda_i')\,\theta_i$$
(14)

or, after simple algebraic manipulations of the previous results,

$$l(\lambda_i) = \mathbb{E}[l(\Lambda) \mid \epsilon_{i-1} \le \Lambda \le \epsilon_i]$$
(15)
$$\lambda_i'\, l(\lambda_i') = \mathbb{E}[\Lambda \cdot l(\Lambda) \mid \epsilon_{i-1} \le \Lambda \le \epsilon_i]$$
(16)

Using the shorthand notation \(w = l(\lambda)\) for the likelihood function (5) (hence \(w > 0\)), we define \(g : (0, \infty) \to \mathbb{R}\), \(g(w) = w \cdot l^{-1}(w)\). According to Lemma 1, \(g(\cdot)\) is a concave function, and thus we have:

$$\begin{aligned} \lambda_i'\, l(\lambda_i') &= \mathbb{E}[\Lambda \cdot l(\Lambda) \mid \epsilon_{i-1} \le \Lambda \le \epsilon_i] \\ &= \mathbb{E}[W \cdot l^{-1}(W) \mid \epsilon_{i-1} \le l^{-1}(W) \le \epsilon_i] \\ &= \mathbb{E}[g(W) \mid \epsilon_{i-1} \le l^{-1}(W) \le \epsilon_i] \\ &\le g\left(\mathbb{E}[W \mid \epsilon_{i-1} \le l^{-1}(W) \le \epsilon_i]\right) \end{aligned}$$
(17)
$$\begin{aligned} &= \mathbb{E}[W \mid \epsilon_{i-1} \le l^{-1}(W) \le \epsilon_i] \cdot l^{-1}\left(\mathbb{E}[W \mid \epsilon_{i-1} \le l^{-1}(W) \le \epsilon_i]\right) \\ &= \mathbb{E}[l(\Lambda) \mid \epsilon_{i-1} \le \Lambda \le \epsilon_i] \cdot l^{-1}\left(\mathbb{E}[l(\Lambda) \mid \epsilon_{i-1} \le \Lambda \le \epsilon_i]\right) \\ &= l(\lambda_i) \cdot l^{-1}(l(\lambda_i)) = \lambda_i \cdot l(\lambda_i), \end{aligned}$$
(18)

where the inequality step (17) is obtained by applying Jensen’s inequality36,57.

We can now use (13), (14) and (18) to establish an upper bound for the objective function (12):

$$\mathbb{E}(\Lambda \mid \mathrm{data}) = \frac{\sum_{i=1}^m \lambda_i'\, l(\lambda_i')\,\theta_i}{\sum_{i=1}^m l(\lambda_i)\,\theta_i} \le \frac{\sum_{i=1}^m \lambda_i\, l(\lambda_i)\,\theta_i}{\sum_{i=1}^m l(\lambda_i)\,\theta_i}$$
(19)

This upper bound is attained by selecting an m-point discrete distribution \(f_u(\lambda)\) with probability mass \(\theta_i\) at \(\lambda = \lambda_i\), for \(i = 1, 2, \ldots, m\) (since substituting \(f(\cdot)\) from (12) with this \(f_u(\cdot)\) yields the right-hand side of (19)). As such, maximising this bound reduces to an optimisation problem in the m-dimensional space of \((\lambda_1, \lambda_2, \ldots, \lambda_m) \in (\epsilon_0, \epsilon_1] \times (\epsilon_1, \epsilon_2] \times \cdots \times (\epsilon_{m-1}, \epsilon_m]\). This optimisation problem can be solved numerically, yielding a supremum (rather than a maximum) for \(\mathcal{S}_\lambda\) in the case when the optimised prior distribution has points located at \(\lambda_i = \epsilon_{i-1}\) for \(i = 1, 2, \ldots, m\). □

Proposition 2

With the notation from Theorem 1, there exist m values \(x_1, x_2, \ldots, x_m \in [0, 1]\) such that \(\inf \mathcal{S}_\lambda\) is the posterior estimate (4) obtained by using as prior the (m+1)-point discrete distribution with probability mass \(f(\epsilon_0) = \Pr(\lambda = \epsilon_0) = x_1\theta_1\), \(f(\epsilon_i) = \Pr(\lambda = \epsilon_i) = (1 - x_i)\theta_i + x_{i+1}\theta_{i+1}\) for \(1 \le i < m\), and \(f(\epsilon_m) = \Pr(\lambda = \epsilon_m) = (1 - x_m)\theta_m\).

Proof

We reuse the reasoning steps from Proposition 1 up to inequality (17), which we replace with the following alternative inequality derived from the Converse Jensen’s Inequality58,59 and the fact that g(w) is a concave function (cf. Lemma 1):

$$\begin{aligned} \lambda_i'\, l(\lambda_i') &= \mathbb{E}[g(W) \mid \epsilon_{i-1} \le l^{-1}(W) \le \epsilon_i] \\ &\ge \frac{l(\epsilon_{i-1}) - \mathbb{E}[W \mid \epsilon_{i-1} \le l^{-1}(W) \le \epsilon_i]}{l(\epsilon_{i-1}) - l(\epsilon_i)}\, g(l(\epsilon_i)) + \frac{\mathbb{E}[W \mid \epsilon_{i-1} \le l^{-1}(W) \le \epsilon_i] - l(\epsilon_i)}{l(\epsilon_{i-1}) - l(\epsilon_i)}\, g(l(\epsilon_{i-1})) \\ &= \frac{l(\epsilon_{i-1}) - l(\lambda_i)}{l(\epsilon_{i-1}) - l(\epsilon_i)}\, \epsilon_i l(\epsilon_i) + \frac{l(\lambda_i) - l(\epsilon_i)}{l(\epsilon_{i-1}) - l(\epsilon_i)}\, \epsilon_{i-1} l(\epsilon_{i-1}) \end{aligned}$$
(20)

We can now establish a lower bound for (12):

$$\mathbb{E}(\Lambda \mid \mathrm{data}) = \frac{\sum_{i=1}^m \lambda_i'\, l(\lambda_i')\,\theta_i}{\sum_{i=1}^m l(\lambda_i)\,\theta_i} \ge \frac{\sum_{i=1}^m \left(\frac{l(\epsilon_{i-1}) - l(\lambda_i)}{l(\epsilon_{i-1}) - l(\epsilon_i)}\, \epsilon_i l(\epsilon_i) + \frac{l(\lambda_i) - l(\epsilon_i)}{l(\epsilon_{i-1}) - l(\epsilon_i)}\, \epsilon_{i-1} l(\epsilon_{i-1})\right)\theta_i}{\sum_{i=1}^m l(\lambda_i)\,\theta_i}$$
(21)
$$= \frac{\sum_{i=1}^m \left[\epsilon_i l(\epsilon_i)(1 - x_i)\theta_i + \epsilon_{i-1} l(\epsilon_{i-1})\, x_i \theta_i\right]}{\sum_{i=1}^m \left[l(\epsilon_i)(1 - x_i)\theta_i + l(\epsilon_{i-1})\, x_i \theta_i\right]}$$
(22)

where xi is defined as:

$$x_i = \frac{l(\lambda_i) - l(\epsilon_i)}{l(\epsilon_{i-1}) - l(\epsilon_i)}$$
(23)

The result (22) is essentially in the same form as the result obtained by using a 2m-point distribution in which, for each interval \([\epsilon_{i-1}, \epsilon_i]\), there are two points located at \(\lambda = \epsilon_{i-1}\) and \(\lambda = \epsilon_i\), and the probability mass associated with these points is \(x_i\theta_i\) and \((1 - x_i)\theta_i\), respectively. Intuitively, \(x_i\) is the ratio for splitting the probability mass \(\theta_i\) between the two points since, according to (23), \(x_i \in [0, 1]\).

Furthermore, the points on the boundaries of two successive intervals overlap, which effectively reduces the number of points from 2m to m + 1. Expanding (22) yields an (m+1)-point discrete distribution \(f_l(\lambda)\) with probability mass \(f_l(\epsilon_0) = x_1\theta_1\), \(f_l(\epsilon_i) = (1 - x_i)\theta_i + x_{i+1}\theta_{i+1}\) for \(1 \le i < m\), and \(f_l(\epsilon_m) = (1 - x_m)\theta_m\). As such, minimising (22) reduces to an m-dimensional optimisation problem in \(x_1, x_2, \ldots, x_m\), which can be solved numerically given the other model parameters. Finally, since (6) requires that \(\epsilon_{i-1} < \lambda_i \le \epsilon_i\), we have \(0 \le x_i < 1\), and thus the posterior estimate is an infimum (rather than a minimum) of \(\mathcal{S}_\lambda\) when the solution of the optimisation problem corresponds to a combination of \(x_1, x_2, \ldots, x_m\) values that includes one or more values of 1. □

We can now prove Theorem 1. In the Supplementary Methods 1 of the supplementary material, we use this result to prove Corollaries 1 and 2.

Proof

Proof of Theorem 1. Propositions 1 and 2 imply that the set of posterior estimates λ over all priors that satisfy the constraints (6) has:

  1. the infimum \(\lambda_l\) from (7), obtained by using the prior f(λ) from Proposition 2 in (4);

  2. the supremum \(\lambda_u\) from (8), obtained by using the prior f(λ) from Proposition 1 in (4). □

BIPP estimator evaluation

Figure 5 shows the results of experiments we carried out to evaluate the BIPP estimator in scenarios with m = 3 (Fig. 5a–c) and m = 2 (Fig. 5d) confidence bounds by varying the characteristics of the partial prior knowledge. For m = 3, the upper bound computed by the estimator exhibits a three-stage behaviour as the time over which no singular event occurs increases. These stages correspond to the three \(\lambda_u\) regions from (10). They start with a steep \(\lambda_u\) decrease for \(t < \frac{1}{\epsilon_2}\) in stage 1, followed by a slower \(\lambda_u\) decreasing trend for \(\frac{1}{\epsilon_2} \le t \le \frac{1}{\epsilon_1}\) in stage 2, and approach the asymptotic value \(\frac{\epsilon_1(\theta_1 + \theta_2)}{\theta_1}\) as the mission progresses through stage 3. Similarly, the lower bound \(\lambda_l\) demonstrates a two-stage behaviour, as expected given its two-part definition (9), with the overall value approaching 0 as the mission continues and no singular event modelled by this estimator (e.g., a catastrophic failure) occurs.

Fig. 5: Experimental analysis of the Bayesian inference using partial priors (BIPP) estimator.

Systematic experimental analysis of the BIPP estimator showing the bounds \(\lambda_l\) and \(\lambda_u\) of the posterior estimates for the occurrence probability of singular events for the duration of a mission. Each plot shows the effect of different partial prior knowledge encoded in (6) on the calculation of the lower (7) and upper (8) posterior estimate bounds. The red circles indicate the time points when the different formulae for the lower and upper bounds in (9) and (10), respectively, become active. a BIPP estimator for m = 3, \(\theta_1 \in \{0.1, 0.3, 0.6, 0.8\}\), \(\theta_2 = 0.1\), \(\epsilon_1 = \frac{1}{5000}\), \(\epsilon_2 = \frac{1}{1000}\). b BIPP estimator for m = 3, \(\theta_1 = 0.1\), \(\theta_2 \in \{0.1, 0.3, 0.6, 0.8\}\), \(\epsilon_1 = \frac{1}{5000}\), \(\epsilon_2 = \frac{1}{1000}\). c BIPP estimator for m = 3, \(\theta_1 = 0.3\), \(\theta_2 = 0.3\), \((\epsilon_1, \epsilon_2) \in \left\{\left(\frac{1}{500}, \frac{1}{100}\right), \left(\frac{1}{1000}, \frac{1}{500}\right), \left(\frac{1}{2000}, \frac{1}{1000}\right), \left(\frac{1}{5000}, \frac{1}{2000}\right)\right\}\). d BIPP estimator for m = 2, \(\theta_1 \in \{0.3, 0.5\}\), \(\theta_2 = 0\), \((\epsilon_1, \epsilon_2) \in \left\{\left(\frac{1}{500}, \frac{1}{500}\right), \left(\frac{1}{5000}, \frac{1}{5000}\right)\right\}\).

Figure 5a shows the behaviour of the estimator for different θ1 values and fixed θ2, ϵ1, and ϵ2 values. For higher θ1 values, more probability mass is allocated to the confidence bound (ϵ0, ϵ1], yielding a steeper decrease in the upper bound λu and a lower λu value at the end of the mission. The lower bound λl presents limited variability across the different θ1 values, becoming almost constant and close to 0 as θ1 increases.

A similar decreasing pattern is observed in Fig. 5b, which depicts the results of experiments with \(\theta_1\), \(\epsilon_1\), and \(\epsilon_2\) fixed, and \(\theta_2\) variable. The long-term upper bound \(\lambda_u\) is larger for higher \(\theta_2\) values, resulting in a wider posterior estimate interval as \(\lambda_u\) converges towards its theoretical asymptotic value.

Allocating the same probability mass to the confidence bounds (i.e., \(\theta_1 = \theta_2 = 0.3\)) and changing the prior knowledge bounds \(\epsilon_1\) and \(\epsilon_2\) greatly affects the behaviour of the BIPP estimator (Fig. 5c). When \(\epsilon_1\) and \(\epsilon_2\) have relatively high values compared to the duration of the mission (e.g., see the first three plots in Fig. 5c), the upper bound \(\lambda_u\) of the BIPP estimator rapidly converges to its asymptotic value, leaving no room for subsequent improvement as the mission progresses. Similarly, the earlier the triggering point for switching between the two parts of the lower bound \(\lambda_l\) calculation (9), the earlier \(\lambda_l\) reaches a plateau close to 0.

Finally, Fig. 5d shows experimental results for the special scenario comprising only m = 2 confidence bounds. In this scenario, setting \(\theta_2 = 0\) in (9), as required by Corollary 2, gives a constant lower bound \(\lambda_l = 0\) irrespective of the other BIPP estimator parameters. As expected, the upper bound \(\lambda_u\) demonstrates a twofold behaviour, featuring a rapid decrease until \(t = \frac{1}{\epsilon_1}\), followed by a steady-state behaviour where \(\lambda_u = \frac{\epsilon_1}{\theta_1}\).

Interval Bayesian inference for regular events

For CTMC transitions that correspond to regular events within the modelled system, we follow the common practice60 of using a Gamma prior distribution for each uncertain transition rate λ:

$$f(\lambda) = \Gamma[\lambda; \alpha, \beta] = \frac{\beta^\alpha}{(\alpha - 1)!}\, \lambda^{\alpha - 1} e^{-\beta\lambda}.$$
(24)

The Gamma distribution is a frequently adopted conjugate prior distribution for the likelihood (2) and, if the prior knowledge assumes an initial value \(\lambda^{(0)}\) for the transition rate, the parameters \(\alpha > 0\) and \(\beta > 0\) must satisfy

$$\mathbb{E}(\Gamma[\lambda; \alpha, \beta]) = \frac{\alpha}{\beta} = \lambda^{(0)}.$$
(25)

The posterior value \(\lambda^{(t)}\) for the transition rate after observing n transitions within t time units is then obtained by using the prior (24) in the expectation (4), as in the following derivation adapted from classical Bayesian theory60:

$$\begin{aligned} \lambda^{(t)} &= \frac{\int_0^\infty \lambda \left(\frac{(\lambda t)^n}{n!}\, e^{-\lambda t}\right) \left(\frac{\beta^\alpha}{(\alpha - 1)!}\, \lambda^{\alpha - 1} e^{-\beta\lambda}\right) \mathrm{d}\lambda}{\int_0^\infty \left(\frac{(\lambda t)^n}{n!}\, e^{-\lambda t}\right) \left(\frac{\beta^\alpha}{(\alpha - 1)!}\, \lambda^{\alpha - 1} e^{-\beta\lambda}\right) \mathrm{d}\lambda} = \frac{\int_0^\infty \lambda^{n+\alpha}\, e^{-\lambda(t+\beta)}\, \mathrm{d}\lambda}{\int_0^\infty \lambda^{n+\alpha-1}\, e^{-\lambda(t+\beta)}\, \mathrm{d}\lambda} = \frac{\int_0^\infty \lambda^{n+\alpha} \left(\frac{e^{-\lambda(t+\beta)}}{-(t+\beta)}\right)^{\!\prime} \mathrm{d}\lambda}{\int_0^\infty \lambda^{n+\alpha-1}\, e^{-\lambda(t+\beta)}\, \mathrm{d}\lambda} \\ &= \frac{\left.\left(\lambda^{n+\alpha}\, \frac{e^{-\lambda(t+\beta)}}{-(t+\beta)}\right)\right|_0^\infty - \int_0^\infty (n+\alpha)\, \lambda^{n+\alpha-1}\, \frac{e^{-\lambda(t+\beta)}}{-(t+\beta)}\, \mathrm{d}\lambda}{\int_0^\infty \lambda^{n+\alpha-1}\, e^{-\lambda(t+\beta)}\, \mathrm{d}\lambda} = \frac{0 + \frac{n+\alpha}{t+\beta} \int_0^\infty \lambda^{n+\alpha-1}\, e^{-\lambda(t+\beta)}\, \mathrm{d}\lambda}{\int_0^\infty \lambda^{n+\alpha-1}\, e^{-\lambda(t+\beta)}\, \mathrm{d}\lambda} = \frac{n+\alpha}{t+\beta} = \frac{n + \beta\lambda^{(0)}}{t+\beta} \\ &= \frac{\beta}{t+\beta}\,\lambda^{(0)} + \frac{t}{t+\beta}\cdot\frac{n}{t} = \frac{t^{(0)}}{t+t^{(0)}}\,\lambda^{(0)} + \frac{t}{t+t^{(0)}}\cdot\frac{n}{t}, \end{aligned}$$
(26)

where \({t}^{(0)}=\beta\). This notation reflects the way in which the posterior rate \({\lambda }^{(t)}\) is computed as a weighted sum of the mean rate \(\frac{n}{t}\) observed over a time period t, and of the prior \({\lambda }^{(0)}\), deemed as trustworthy as a mean rate calculated from observations over a time period \({t}^{(0)}\). When \({t}^{(0)}\ll t\) (either because we have low trust in the prior \({\lambda }^{(0)}\) and thus \({t}^{(0)}\simeq 0\), or because the system was observed for a time period t much longer than \({t}^{(0)}\)), the posterior (26) reduces to the maximum likelihood estimator, i.e., \({\lambda }^{(t)}\simeq \frac{n}{t}\). In this scenario, the observations fully dominate the estimator (26), with no contribution from the prior.
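To make this weighting explicit, here is a minimal sketch (ours, with hypothetical numbers) that evaluates the closed-form posterior (26) directly:

```python
# Minimal sketch of the conjugate posterior update (26): a weighted average
# of the prior rate lambda^(0) and the observed mean rate n/t.
def posterior_rate(n: int, t: float, lambda0: float, t0: float) -> float:
    """Posterior transition rate after observing n transitions in t time units."""
    return (t0 * lambda0 + n) / (t0 + t)

# Weak trust in the prior (t0 << t): the observations dominate.
print(posterior_rate(n=290, t=100.0, lambda0=3.0, t0=1.0))     # ~2.901 (n/t = 2.9)
# Strong trust in the prior: the estimate stays close to lambda0 = 3.0.
print(posterior_rate(n=290, t=100.0, lambda0=3.0, t0=1000.0))  # ~2.991
```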

Selecting suitable values for the parameters \({t}^{(0)}\) and \({\lambda }^{(0)}\) of the traditional Bayesian estimator (26) is very challenging: what constitutes a suitable choice often depends on unknown attributes of the environment, and different domain experts may each propose different values for these parameters. In line with recent advances in imprecise probabilistic modelling34,35,61, we address this challenge by defining a robust transition rate estimator for Bayesian inference using imprecise probability with sets of priors (IPSP). The IPSP estimator uses ranges \([{\underline{t}}^{(0)},\overline{t}^{(0)}]\) and \([{\underline{\lambda }}^{(0)},\overline{\lambda }^{(0)}]\) (capturing the environmental uncertainty, or the input obtained from multiple domain experts) for the two parameters instead of point values.

The following theorem quantifies the uncertainty that the use of parameter ranges for \({t}^{(0)}\) and \({\lambda }^{(0)}\) induces on the posterior rate (26). This theorem specialises and builds on generalised Bayesian inference results34 that we adapt for the estimation of CTMC transition rates.

Theorem 2

Given uncertain prior parameters \({t}^{(0)}\in [{\underline{t}}^{(0)},\overline{t}^{(0)}]\) and \({\lambda }^{(0)}\in [{\underline{\lambda }}^{(0)},\overline{\lambda }^{(0)}]\), the posterior rate \({\lambda }^{(t)}\) from (26) lies in the interval \([{\underline{\lambda }}^{(t)},\overline{\lambda }^{(t)}]\), where:

$${\underline{\lambda }}^{(t)}=\left\{\begin{array}{ll}\frac{\overline{t}^{(0)}{\underline{\lambda }}^{(0)}+n}{\overline{t}^{(0)}+t},\quad & if\,\frac{n}{t}\ge {\underline{\lambda }}^{(0)}\\ \frac{{\underline{t}}^{(0)}{\underline{\lambda }}^{(0)}+n}{{\underline{t}}^{(0)}+t}, & {\rm{otherwise}}\end{array}\right.$$
(27)

and

$${\overline{\lambda }}^{(t)}=\left\{\begin{array}{ll}\frac{\overline{t}^{(0)}\overline{\lambda }^{(0)}+n}{\overline{t}^{(0)}+t}, &if\,\frac{n}{t}\le \overline{\lambda }^{(0)}\\ \frac{{\underline{t}}^{(0)}\overline{\lambda }^{(0)}+n}{{\underline{t}}^{(0)}+t}, &{\rm{otherwise}}\end{array}\right..$$
(28)

Proof

To find the extrema of the posterior rate \({\lambda }^{(t)}\), we first differentiate (26) with respect to \({\lambda }^{(0)}\):

$$\frac{d}{d{\lambda }^{(0)}}\left({\lambda }^{(t)}\right)=\frac{{t}^{(0)}}{t+{t}^{(0)}}.$$

As \({t}^{(0)} > 0\) and t > 0, this derivative is always positive, so \({\lambda }^{(t)}\) is increasing in \({\lambda }^{(0)}\); its minimum and maximum are therefore attained at \({\lambda }^{(0)}={\underline{\lambda }}^{(0)}\) and \({\lambda }^{(0)}={\overline{\lambda }}^{(0)}\), respectively:

$${\underline{\lambda }}^{(t)}=\mathop{\min }\limits_{{t}^{(0)}\in [{\underline{t}}^{(0)},{\overline{t}}^{(0)}]}\frac{{t}^{(0)}{\underline{\lambda }}^{(0)}+n}{{t}^{(0)}+t}$$
(29)

and

$${\overline{\lambda }}^{(t)}=\mathop{\max }\limits_{{t}^{(0)}\in [{\underline{t}}^{(0)},{\overline{t}}^{(0)}]}\frac{{t}^{(0)}{\overline{\lambda }}^{(0)}+n}{{t}^{(0)}+t}.$$
(30)

We now differentiate the quantity that needs to be minimised in (29) with respect to \({t}^{(0)}\):

$$\frac{d}{d{t}^{(0)}}\left(\frac{{t}^{(0)}{\underline{\lambda }}^{(0)}+n}{{t}^{(0)}+t}\right)= \frac{{\underline{\lambda }}^{(0)}({t}^{(0)}+t)-({t}^{(0)}{\underline{\lambda }}^{(0)}+n)}{{({t}^{(0)}+t)}^{2}}\\ = \frac{{\underline{\lambda }}^{(0)}t-n}{{({t}^{(0)}+t)}^{2}}.$$

As this derivative is non-positive for \({\underline{\lambda }}^{(0)}\in \left(0,\frac{n}{t}\right]\) and positive for \({\underline{\lambda }}^{(0)} > \frac{n}{t}\), the minimum from (29) is attained for \({t}^{(0)}={\overline{t}}^{(0)}\) in the former case and for \({t}^{(0)}={\underline{t}}^{(0)}\) in the latter case, which yields the result from (27). Similarly, the derivative of the quantity to maximise in (30), i.e.,

$$\frac{d}{d{t}^{(0)}}\left(\frac{{t}^{(0)}{\overline{\lambda }}^{(0)}+n}{{t}^{(0)}+t}\right)=\frac{{\overline{\lambda }}^{(0)}t-n}{{({t}^{(0)}+t)}^{2}},$$

is non-positive for \({\overline{\lambda }}^{(0)}\in \left(0,\frac{n}{t}\right]\) and positive for \({\overline{\lambda }}^{(0)} > \frac{n}{t}\), so the maximum from (30) is attained for \({t}^{(0)}={\underline{t}}^{(0)}\) in the former case and for \({t}^{(0)}={\overline{t}}^{(0)}\) in the latter case, which yields the result from (28) and completes the proof. □
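To make the theorem concrete, the following sketch (ours; the function and variable names are illustrative) implements the case analysis of (27) and (28) directly:

```python
# Minimal sketch of the IPSP interval estimator (Theorem 2): bounds on the
# posterior rate (26) when the prior parameters range over intervals.
def ipsp_interval(n: int, t: float,
                  t0_lo: float, t0_hi: float,
                  lam0_lo: float, lam0_hi: float) -> tuple[float, float]:
    """Return the posterior rate bounds given by equations (27) and (28)."""
    # Lower bound (27): for n/t >= lam0_lo the posterior is non-increasing
    # in t^(0), so the minimum is attained at the upper endpoint t0_hi.
    t0 = t0_hi if n / t >= lam0_lo else t0_lo
    lam_lo = (t0 * lam0_lo + n) / (t0 + t)
    # Upper bound (28): for n/t <= lam0_hi the posterior is non-decreasing
    # in t^(0), so the maximum is attained at the upper endpoint t0_hi.
    t0 = t0_hi if n / t <= lam0_hi else t0_lo
    lam_hi = (t0 * lam0_hi + n) / (t0 + t)
    return lam_lo, lam_hi
```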

IPSP estimator evaluation

Figure 6 shows the results of experiments we performed to analyse the behaviour of the IPSP estimator in scenarios with varying ranges for the prior knowledge \([{\underline{t}}^{(0)},\overline{t}^{(0)}]\) and \([{\underline{\lambda }}^{(0)},\overline{\lambda }^{(0)}]\). A general observation is that the posterior rate intervals \([{\underline{\lambda }}^{(t)},{\overline{\lambda }}^{(t)}]\) become narrower as the mission progresses, irrespective of the level of trust assigned to the prior knowledge, i.e., across all columns of plots (which correspond to different \([{\underline{t}}^{(0)},\overline{t}^{(0)}]\) intervals) from Fig. 6a. Nevertheless, this trust level affects how the estimator incorporates observations into the calculation of the posterior interval. When the trust in the prior knowledge is weak (in the plots from the leftmost columns of Fig. 6a), the impact of the prior knowledge on the posterior estimation is low, and the IPSP calculation is heavily influenced by the observations, resulting in a narrow interval. In contrast, when the trust in the prior knowledge is stronger (in the plots from the rightmost columns), the contribution of the prior knowledge to the posterior estimation becomes higher, and the IPSP estimator produces a wider interval.

Fig. 6: Experimental analysis of the Bayesian inference using imprecise probability with sets of priors (IPSP) estimator showing the bounded posterior estimation for regular events.

a IPSP estimator results showing the impact of different sets of priors \([{\underline{t}}^{(0)},\overline{t}^{(0)}]\) and \([{\underline{\lambda }}^{(0)},\overline{\lambda }^{(0)}]\). In each plot, the blue dotted line ( ⋅⋅⋅⋅ ) and green dashed line ( − − − ) show the posterior estimation bounds \({\underline{\lambda }}^{(t)}\) and \(\overline{\lambda }^{(t)}\) for narrow and wide \([{\underline{\lambda }}^{(0)},\overline{\lambda }^{(0)}]\) intervals, respectively. Each column of plots corresponds to a different strength assigned to the prior knowledge, ranging from uninformative (leftmost column) to strong belief (rightmost column). The first row shows scenarios in which the actual rate \(\overline{\lambda }=3\) belongs to the prior knowledge interval \([{\underline{\lambda }}^{(0)},\overline{\lambda }^{(0)}]\); in the second and third rows, the prior intervals overestimate and underestimate \(\overline{\lambda }\), respectively. b IPSP estimator results illustrating the behaviour of IPSP across different actual rate values \(\overline{\lambda }\in \{0.03,0.3,3,30\}\). The experiments were carried out for \([{\underline{t}}^{(0)},\overline{t}^{(0)}]=[1000,1000]\) and included both narrow and wide \([{\underline{\lambda }}^{(0)},\overline{\lambda }^{(0)}]\) intervals, shown as blue dotted lines ( ⋅⋅⋅⋅ ) and green dashed lines ( − − − ), respectively. In all experiments, the unknown actual rate \(\overline{\lambda }\) was in the prior interval \([{\underline{\lambda }}^{(0)},\overline{\lambda }^{(0)}]\).

In the experiments from the first row of plots in Fig. 6a, the (unknown) actual rate \(\overline{\lambda }=3\) belongs to the prior knowledge interval \([{\underline{\lambda }}^{(0)},\overline{\lambda }^{(0)}]\). As a result, the posterior rate interval \([{\underline{\lambda }}^{(t)},\overline{\lambda }^{(t)}]\) progressively becomes narrower, approximating \(\overline{\lambda }\) with high accuracy. As expected, the narrower prior knowledge (blue dotted line) produces a narrower posterior rate interval than the wider and more conservative prior knowledge (green dashed line).

When the prior knowledge interval \([{\underline{\lambda }}^{(0)},\overline{\lambda }^{(0)}]\) overestimates or underestimates the actual rate \(\overline{\lambda }\) (second and third rows of plots from Fig. 6a, respectively), the ability of IPSP to adapt its estimations to reflect the observations depends heavily on the characteristics of the sets of priors. For example, if the width of the prior knowledge interval \([{\underline{\lambda }}^{(0)},\overline{\lambda }^{(0)}]\) is close to \(\overline{\lambda }\) and \({t}^{(0)}\ll t\), then IPSP approaches \(\overline{\lambda }\) more easily, as shown by the narrow prior knowledge (blue dotted line) in Fig. 6a for \([{\underline{t}}^{(0)},\overline{t}^{(0)}]\in \{[5,15],[75,125],[750,1250]\}\). In contrast, wider prior knowledge (green dashed line) combined with higher levels of trust in the prior, e.g., \([{\underline{t}}^{(0)},\overline{t}^{(0)}]=[1500,2500]\), entails that more observations are needed for the posterior rate to approach the actual rate \(\overline{\lambda }\). When the actual rate is, in addition, nonstationary, change-point detection methods can be employed to identify such changes62,63 and recalibrate the IPSP estimator. Finally, Fig. 6b shows the behaviour of IPSP for different actual rate values \(\overline{\lambda }\in \{0.03,0.3,3,30\}\). As \(\overline{\lambda }\) increases, more observations are produced in the same time period, resulting in a smoother and narrower posterior bound estimate.
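As a closing illustration (a hypothetical experiment in the spirit of Fig. 6, reusing the ipsp_interval sketch above), simulating a Poisson process with actual rate 3.0 shows the posterior interval narrowing as observations accumulate:

```python
# Hypothetical experiment: the IPSP interval narrows as the mission progresses,
# here for an actual rate of 3.0 inside the prior interval [2.0, 4.0].
import numpy as np

rng = np.random.default_rng(0)
actual_rate = 3.0
t0_lo, t0_hi = 75.0, 125.0      # moderate trust in the prior knowledge
lam0_lo, lam0_hi = 2.0, 4.0     # prior interval containing the actual rate

for t in (10, 100, 1000, 10000):
    n = rng.poisson(actual_rate * t)   # transitions observed within t time units
    lo, hi = ipsp_interval(n, t, t0_lo, t0_hi, lam0_lo, lam0_hi)
    print(f"t={t:6d}: posterior rate in [{lo:.3f}, {hi:.3f}]")
```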