1. Introduction
Degradation processes are ubiquitous in many physical, engineering, biological, and social systems. Modeling the degradation is crucial for lifetime prediction and has drawn increasing attention in the field of reliability and risk analysis [1]. Reliable and accurate lifetime prediction remains a great challenge due to the time-varying and stochastic nature of degradation processes.
In reliability theory, the hazard rate function characterizes the failure probability in the degradation process and determines the probability distribution of the lifetime. To estimate the hazard rate function, the lifetime distribution is usually presumed to take a certain form and is fitted to lifetime testing data. Alternatively, given a large amount of lifetime data, an empirical curve can be established directly by interpolation. Both methods require sufficient data samples to ensure the accuracy and reliability of the results. For systems or parts that demand high reliability, however, the sample size is usually small. To alleviate this difficulty, a previous study [2] proposed a method based on the maximum entropy principle (MaxEnt) [3,4,5] to estimate the hazard rate function and the lifetime distribution from limited lifetime testing data of the whole system.
However, forecasting an ongoing aging process of a multi-component system is still challenging. For most complex multi-component systems, it is difficult to obtain enough system-level lifetime data because of trial costs, limited observability, very low degradation rates, and so on. An alternative is to estimate degradation at the component level, leveraging the fact that component-level degradation is closely associated with the aging of the whole system. The association is defined by the structural function [6], which can be represented by the reliability block diagram [7]. Existing studies, including but not limited to [6,7], neglect the correlations between the components. However, in a complex system the failure of an individual component usually redistributes load to the remaining normal components, which influences their degradation. Ignoring these correlations may therefore introduce unquantified risk. There are no formal rules for dealing with the interaction of degradations among connected components [8].
The network approach is widely used to model the spreading dynamics of epidemics and information in society [9,10,11,12,13], and such spreading dynamics resemble degradation propagation in a complex system. The network approach has demonstrated its advantage in modeling systems with multiple correlated components [14,15], such as an electrical circuit with multiple electronic components, a mechanical system involving multiple parts, a living organism consisting of multiple organs [16,17,18,19,20,21,22,23], and many others.
By incorporating the network approach, this study develops a MaxEnt-based reliability method for general multi-component systems. The basic idea is to represent the entropy of the system as a function of the hazard rate functions of the participating components. The connectivity of the components in the network can subsequently be recast as an equivalent reliability block diagram of the system. In particular, non-repairable and repairable models are used to motivate the development of the proposed method. The former represents a network of multiple interconnected components that only undergo degradation. The latter allows for the recovery or replacement of failed components, with which the forecast of an ongoing aging-recovery process is demonstrated. To capture degradation propagation, the failure of one component alters the hazard rate function of neighboring components in both models. By incorporating the reliability block diagram, the components are hierarchically organized in a parallel-series diagram. Statistical moments are used in the macroscopic model to reduce the inherent noise in early-stage data. Furthermore, under the assumption of a homogeneous hazard rate, one-shot data can be transformed into equivalent moment data by means of the reliability block diagram.
This paper is organized as follows. In Section 2, the multi-component system is briefly reviewed. In Section 3, the microscopic model for the non-repairable system is developed. The MaxEnt is used to infer the (inhomogeneous and homogeneous) hazard rates of the components for different network topologies. In Section 4, the reliability block diagram is employed as a tool to aggregate different types of information. In Section 5, the microscopic model for repairable systems is discussed in detail. The repairable-component model on the Watts–Strogatz small-world network [24] is adopted to demonstrate the proposed method. Different limitations on the accessible information, such as local observation and one-shot observation, are taken into account.
2. Modeling the Degradation and the Recovery Processes
In this paper, multi-component systems are modeled as networks in which nodes denote the components. Each component has two possible states, namely, the normal state and the failed state.
The propagation of degradation is driven by one or more failed components in the system. The degradation process of a component is triggered by a failed neighboring component with the transition rate
, where
t is the duration that the component connects with at least one failed component. Its remaining lifetime
T is a random variable with a probability
. For a normal component connected to more than one failed component, the transition rate is assumed to be the same. The transition rate function is defined by:
which is also called the hazard rate function in reliability theory. With the degradation process defined for an individual component, the joint distributions of all the components’ lifetimes are directly constructed by the hazard functions.
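For reference, the standard reliability-theory relations between a lifetime density, its survival function, and the hazard rate can be written as follows. The symbols $p$, $S$, and $h$ are notation assumed here for illustration and may differ from the notation used elsewhere in this paper.

```latex
% Assumed notation: p(t) lifetime density, S(t) survival function, h(t) hazard rate.
\begin{align}
  S(t) &= \Pr(T > t) = \int_t^{\infty} p(s)\,\mathrm{d}s, \\
  h(t) &= \frac{p(t)}{S(t)} = -\frac{\mathrm{d}}{\mathrm{d}t}\ln S(t), \\
  p(t) &= h(t)\exp\!\left(-\int_0^{t} h(s)\,\mathrm{d}s\right).
\end{align}
```

In particular, a constant hazard rate $h(t) = h_0$ corresponds to the exponential density $p(t) = h_0 e^{-h_0 t}$.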
The repairable-component model is built by adding the recovery process. Similar to the degradation process, the recovery time
is assumed to be a random variable with the cumulative distribution
. The recovery rate function is defined by:
Recovered components are assumed to undergo further degradation.
3. The Non-Repairable System
The standard MaxEnt provides a method to construct the most probable distribution under linear constraints, e.g., moment constraints, or convex constraints [25]. In practice, a small number of constraints may lead to an imprecise inference. For example, for a two-dimensional distribution, if the constraints are the first moments of the two random variables, the standard MaxEnt yields only an uncorrelated distribution, since the first moments contain no information about correlations. Constructing correlated distributions requires more constraints, which in turn demands more observations.
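The factorization in the two-dimensional example can be made explicit with a short derivation. The notation (joint density $p(t_1,t_2)$ on $t_1,t_2\ge 0$, observed means $\mu_i$, Lagrange multipliers $\lambda_i$) is assumed here for illustration and is not taken from this paper.

```latex
% Assumed notation: p(t_1,t_2) joint density, \mu_i observed first moments,
% \lambda_0, \lambda_1, \lambda_2 Lagrange multipliers.
\begin{align}
  \mathcal{L}[p] &= -\int p\ln p\,\mathrm{d}t_1\mathrm{d}t_2
     - \lambda_0\Big(\int p\,\mathrm{d}t_1\mathrm{d}t_2 - 1\Big)
     - \sum_{i=1}^{2}\lambda_i\Big(\int t_i\,p\,\mathrm{d}t_1\mathrm{d}t_2 - \mu_i\Big), \\
  \frac{\delta\mathcal{L}}{\delta p} = 0
  \;&\Rightarrow\;
  p(t_1,t_2) = \frac{1}{Z}\,e^{-\lambda_1 t_1-\lambda_2 t_2}
             = \prod_{i=1}^{2}\lambda_i e^{-\lambda_i t_i},
  \qquad \lambda_i = \frac{1}{\mu_i}.
\end{align}
```

The result is a product of two one-dimensional densities, so the covariance of the two variables vanishes whatever values the observed means take.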
To reduce the amount of information required, an alternative is proposed that combines the MaxEnt with the degradation model, which is regarded as prior knowledge and constrains the probability distribution. That is, the variation is carried out in a physical subset of the functional space of probability distributions. In this section, the variational probability distributions depend on the network structure and the model. A standard MaxEnt with moment constraints is equivalent to maximum likelihood estimation, whereas the MaxEnt based on the degradation model used here differs from maximum likelihood estimation.
To begin with, the double-component systems presented in Figure 1a,b are studied to illustrate the inference of the components’ hazard rates via MaxEnt. The two components are labeled by
and
with lifetimes
and
. The joint probability distribution of lifetimes
is associated with the hazard rates
with
. The system may degrade in two different possible ways:
degrades first and
follows, and the opposite. The joint distribution of the lifetimes is written as:
where
denotes the step function.
are the functions depending on the structure of the graph, which will be explicitly defined in different cases. The structure-dependent joint distribution implies the physical subset in which the variation is done.
In the following, the inference of the lifetime distribution
given different types of information is developed based on the MaxEnt principle. Both the degradation sequence and the lifetimes are considered in the joint distribution in Equation (3). The Shannon entropy [26] of the joint distribution is written as:
The linear constraints are:
where
s are the Lagrange multipliers corresponding to the averages of
with
, and the averages are either the moments or the correlations of the components’ lifetimes. These constraints are the same as those considered in the standard MaxEnt.
The most probable probability distribution is obtained through maximizing the entropy with the constraints:
which also gives the most probable hazard rate.
3.1. MaxEnt for Double-Component Non-Repairable Model: Independent Degradation
Figure 1a shows independent degradations of the two components. The joint probability distribution of lifetimes
is determined by:
with the hazard rate function
of each component and
,
. In this case,
is the same as
, i.e.,
. The hazard rate function is directly used here, because in general a one-dimensional distribution
(defined with
) can be expressed as
.
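As a numerical illustration of how a hazard rate function fixes a one-dimensional lifetime density, the sketch below uses the standard relation p(t) = h(t) exp(-∫₀ᵗ h(s) ds) on t ≥ 0 and checks that the resulting density is normalized. The function names, the grid parameters, and the two example hazard rates are assumptions made for illustration; they are not taken from this paper.

```python
import math


def density_from_hazard(h, t_max, n_steps=20000):
    """Return grid times and p(t) = h(t) * exp(-int_0^t h(s) ds) on [0, t_max]."""
    dt = t_max / n_steps
    times = [i * dt for i in range(n_steps + 1)]
    cumulative = 0.0  # running trapezoidal estimate of the cumulative hazard
    density = [h(0.0)]  # at t = 0 the exponential factor equals 1
    for i in range(1, n_steps + 1):
        cumulative += 0.5 * (h(times[i - 1]) + h(times[i])) * dt
        density.append(h(times[i]) * math.exp(-cumulative))
    return times, density


def trapezoid(values, dt):
    """Trapezoidal rule on a uniform grid."""
    return dt * (sum(values) - 0.5 * (values[0] + values[-1]))


# Example 1 (assumed): constant hazard, which gives an exponential density.
rate = 2.0
times, p_const = density_from_hazard(lambda t: rate, t_max=20.0)
dt = times[1] - times[0]
norm_const = trapezoid(p_const, dt)
mean_const = trapezoid([t * p for t, p in zip(times, p_const)], dt)

# Example 2 (assumed): linearly increasing hazard h(t) = 2t (an aging component).
times_inc, p_inc = density_from_hazard(lambda t: 2.0 * t, t_max=10.0)
norm_inc = trapezoid(p_inc, times_inc[1] - times_inc[0])

print(norm_const, mean_const, norm_inc)
```

For the constant hazard the mean lifetime comes out close to the inverse rate, as expected for an exponential distribution.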
By defining a function
, Equation (6) is rewritten as
. With the Euler–Lagrange equations, it follows from Equation (6) that:
where
,
, and
.
If the average lifetimes
are taken as the constraints,
i.e.,
, then the solutions become
for Equation (8).
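As a plausibility check on mean-constrained MaxEnt for a single lifetime, recall the standard result that, among densities on [0, ∞) with a fixed mean, the exponential density (constant hazard rate equal to the inverse mean) has the largest differential entropy. The sketch below compares three densities sharing the same mean. The comparison densities, the mean value, and the function names are assumptions chosen for illustration only.

```python
import math


def differential_entropy(pdf, t_max, n_steps=100000):
    """Midpoint-rule estimate of -int_0^t_max p(t) ln p(t) dt."""
    dt = t_max / n_steps
    total = 0.0
    for i in range(n_steps):
        p = pdf((i + 0.5) * dt)
        if p > 0.0:
            total -= p * math.log(p) * dt
    return total


mean = 2.0  # assumed observed average lifetime


def exponential(t):
    # constant hazard rate 1/mean
    return math.exp(-t / mean) / mean


def gamma_shape2(t):
    # gamma density with shape 2 and scale mean/2, so its mean equals `mean`
    scale = mean / 2.0
    return t * math.exp(-t / scale) / scale**2


def uniform(t):
    # uniform density on [0, 2*mean], so its mean equals `mean`
    return 1.0 / (2.0 * mean) if t <= 2.0 * mean else 0.0


entropies = {
    "exponential": differential_entropy(exponential, t_max=80.0),
    "gamma_shape2": differential_entropy(gamma_shape2, t_max=80.0),
    "uniform": differential_entropy(uniform, t_max=80.0),
}
best = max(entropies, key=entropies.get)
print(entropies, best)
```

The exponential entry should agree with the closed form 1 + ln(mean) and exceed the other two.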
Note that not all correlations and moments can be fused by Equation (8). For example, one considers
and determines the Lagrange multipliers by these observed values
,
, and
. No solution exists for the Lagrange multipliers
in Equation (8) when
, because the distribution in Equation (7) implies
is independent of
which conflicts with the available information. To remove this conflict, one could modify the degradation model (i.e., modify the physical subset) or select other constraints, for example,
, and the solution to Equation (6) becomes:
where Z is the partition function and
,
with
.
3.2. MaxEnt for Double-Component Non-Repairable Model: Correlated Degradation Case
In Figure 1b, the degradation processes of the two components are correlated. The joint probability distribution of lifetimes is
. Combining the degradation-propagation rule with the network structure,
degrades due to connection with the degradation source and the degradation of
follows. As a result,
becomes:
where
is the marginal probability distribution, and
is the conditional probability distribution.
is normalized if
. The time-dependent hazard rate function implies a non-Markovian degradation process for the correlated systems. The difference between the distributions in Equations (7) and (9) is caused by the different network structures.
Equation (6) becomes:
with
and
.
Equation (9) implies that the random variables
and
are statistically independent. By observing the average lifetimes
, the hazard rates are inferred as
. In the above two cases, the entropy functions depend on the structure of the graphs, which leads to different dynamics of degradation processes.
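The correlated two-component case can be illustrated with a small Monte Carlo sketch. It assumes constant hazard rates and the propagation rule described above: the first component degrades from the source, and the second starts to degrade only after the first has failed. The rate values, sample size, and function names are hypothetical choices for illustration, not quantities from this paper.

```python
import random


def simulate_correlated_pair(rate_1, rate_2, n_samples, seed=0):
    """Sample (T1, T2) where component 2 starts degrading once component 1 fails."""
    rng = random.Random(seed)
    samples = []
    for _ in range(n_samples):
        t1 = rng.expovariate(rate_1)       # component 1 degrades from the source
        t2 = t1 + rng.expovariate(rate_2)  # remaining lifetime of component 2
        samples.append((t1, t2))
    return samples


def mean(values):
    return sum(values) / len(values)


def covariance(xs, ys):
    mx, my = mean(xs), mean(ys)
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / len(xs)


samples = simulate_correlated_pair(rate_1=1.0, rate_2=0.5, n_samples=50000)
t1s = [t1 for t1, _ in samples]
t2s = [t2 for _, t2 in samples]
increments = [t2 - t1 for t1, t2 in samples]

# Under the constant-hazard assumption, the rates follow from the average lifetimes.
rate_1_hat = 1.0 / mean(t1s)
rate_2_hat = 1.0 / (mean(t2s) - mean(t1s))

cov_lifetimes = covariance(t1s, t2s)          # nonzero: the lifetimes are correlated
cov_increment = covariance(t1s, increments)   # near zero: T1 and T2 - T1 are independent

print(rate_1_hat, rate_2_hat, cov_lifetimes, cov_increment)
```

In this sketch the two lifetimes are correlated, while the first lifetime and the increment between the two failures are not, so first-moment data suffice to recover both rates.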
4. System Hierarchy by Reliability Block Diagram
In this section, the information constraints for the MaxEnt are considered. The lifetime of a component is the sum of two intervals: the shortest lifetime among its neighboring components, and the remaining lifetime of the component itself. The former relates to path information, and the latter to single-component information. The reliability block diagram is introduced to classify the different types of information constraints.
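The decomposition of a component's lifetime into the earliest failure time among its neighbors plus its own remaining lifetime can be sketched as a shortest-path computation with node weights. The graph, the remaining-lifetime values, and the function name below are hypothetical and serve only to illustrate the rule; the source is assumed to fail at time zero.

```python
import heapq


def failure_times(neighbors, remaining_lifetime, source):
    """Failure time of each node: min over failed neighbors plus own remaining lifetime."""
    times = {source: 0.0}  # the degradation source is taken as failed at t = 0
    heap = [(0.0, source)]
    while heap:
        t_fail, node = heapq.heappop(heap)
        if t_fail > times.get(node, float("inf")):
            continue  # stale heap entry
        for nxt in neighbors[node]:
            if times.get(nxt, float("inf")) <= t_fail:
                continue  # nxt failed no later than node (this also skips the source)
            candidate = t_fail + remaining_lifetime[nxt]
            if candidate < times.get(nxt, float("inf")):
                times[nxt] = candidate
                heapq.heappush(heap, (candidate, nxt))
    return times


# Hypothetical loop network: source "s" feeds "a" and "b", which both feed "c".
neighbors = {
    "s": ["a", "b"],
    "a": ["s", "c"],
    "b": ["s", "c"],
    "c": ["a", "b"],
}
remaining_lifetime = {"a": 1.0, "b": 3.0, "c": 0.5}  # assumed sampled values

times = failure_times(neighbors, remaining_lifetime, source="s")
print(times)
```

Here component "c" fails at 1.5 because its clock starts when its earliest-failing neighbor "a" fails at 1.0, which is the path information referred to in the text.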
4.1. System-Level Observation and Coarse-Grained Information
The system-level observation is defined in the following way. For an n-component system, a system failure occurs if more than k of the n components degrade. In reliability theory, such a system is called the ‘
system’ [7]. The lifetime of the entire system is a coarse-graining of the component-level information. For the models considered in this paper, the lifetimes of
systems depend on the degradation path, and the path information is also coarse-grained information.
Specifically, the
(
) system is called the series (parallel) system, the reliability block diagram of which is shown in Figure 2. In the diagrams, the blocks stand for the components. The diagrams illustrate the relationship between the system-level data and the component-level data.
The reliability block diagram explicitly presents the observed data. In what follows, it is shown that the reliability block diagram can be reduced according to the degradation rule and the network structure in some particular cases.
4.2. Tree-Type Networks
Consider a semi-infinite chain with n components, as shown in Figure 1c. From the left side to the right side, the components are labeled by
. The filled circle stands for a source of degradation. The degradation starts with
and ends with
. The joint probability distribution of lifetimes is:
For a parallel system, the average lifetime of
is observed, which gives the constraint
. According to MaxEnt, the most probable distribution becomes:
The lifetime distribution of the parallel system follows:
The gamma distribution is retained by MaxEnt with the system-level information.
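The gamma form can be checked numerically under one explicit assumption: every component on the chain has the same constant hazard rate, so the lifetime of the last component is a sum of n independent exponential intervals (an Erlang distribution, i.e., a gamma distribution with integer shape). The observed mean, the number of components, and the function names below are hypothetical.

```python
import math
import random


def erlang_cdf(t, n, rate):
    """P(T <= t) for a sum of n independent exponential intervals with the same rate."""
    x = rate * t
    term, partial = 1.0, 0.0
    for k in range(n):
        partial += term       # adds x**k / k!
        term *= x / (k + 1)
    return 1.0 - math.exp(-x) * partial


def simulate_last_failure(n, rate, n_runs, seed=1):
    """Failure time of the last component on a chain, one value per simulated system."""
    rng = random.Random(seed)
    return [sum(rng.expovariate(rate) for _ in range(n)) for _ in range(n_runs)]


n_components = 5
observed_mean = 10.0                  # assumed system-level average lifetime
rate = n_components / observed_mean   # homogeneous rate that reproduces this mean

lifetimes = simulate_last_failure(n_components, rate, n_runs=20000)
sample_mean = sum(lifetimes) / len(lifetimes)

t_check = 12.0
empirical = sum(1 for x in lifetimes if x <= t_check) / len(lifetimes)
theoretical = erlang_cdf(t_check, n_components, rate)

print(sample_mean, empirical, theoretical)
```

The simulated distribution function should match the Erlang expression within sampling error, which is consistent with a single system-level mean fixing the gamma-distributed lifetime in this simplified setting.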
In the chain-type network, the lifetime of
can be decomposed into the remaining lifetime of each component as
. Each interval in the summation is associated with the hazard rate of the corresponding component. This implies that if the
n components are on one path, the parallel-type diagram can be further reduced according to Figure 3. Since the path is unique in any tree graph, the path information further reduces to a collection of single-component information.
4.3. Homogeneous Hazard Assumption
For degradation propagating on more complex networks, it is difficult to apply the above approach to infer the hazard rates of all components, since the required information, such as lifetime moments of specific components and subsystems, increases rapidly with the number of components. This information can only be obtained by observing an ensemble of systems. However, ensemble data are difficult to obtain for a complex system in practice. In particular, it is impossible to make a precise component-dependent inference based on one-shot degradation data. If the components can be sorted into several classes according to their degrees or other characteristics, with negligible differences within each class, a class-dependent inference becomes possible. For example, in epidemic models, it is usually assumed that all individuals share the same infection and recovery rates. In this way, it is feasible to infer a homogeneous hazard rate from one-shot degradation data by MaxEnt.
Under the homogeneous hazard assumption, the variational joint probability distribution of the system in Figure 1c is:
where
is the identical remaining lifetime distribution of all the components.
The constraints are the first and the second moment of
:
.
where
are the Lagrange multipliers. With
, the MaxEnt by Equation (6) leads to the Euler–Lagrange equation for the hazard rate:
The above equation has the solution:
where Z is the normalization constant,
. The parameters are determined by:
These results reduce to the single-component case [2], which agrees with the reduction of the reliability block diagram in Figure 3.
For a one-shot observation, it is difficult to infer the component-dependent hazard rate because moment information is lacking. The homogeneous hazard assumption provides an alternative way to rebuild the joint distribution from one-shot data.
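One way to picture the conversion of one-shot data into moment data is the chain network under the homogeneous hazard assumption: the intervals between successive failures in a single run are then independent samples of the same remaining-lifetime distribution, and their sample moments can serve as constraints. The sketch below is a minimal illustration of that idea; the function name and the failure times are hypothetical, and the source is assumed to fail at time zero.

```python
def interval_moments(failure_times):
    """First and second sample moments of the intervals between successive failures."""
    ordered = sorted(failure_times)
    # The first interval is measured from the source failure at t = 0 (assumption).
    intervals = [ordered[0]] + [b - a for a, b in zip(ordered, ordered[1:])]
    n = len(intervals)
    first = sum(intervals) / n
    second = sum(x * x for x in intervals) / n
    return first, second


# Hypothetical one-shot record of failure times along a three-component chain.
one_shot = [1.0, 3.0, 4.0]
m1, m2 = interval_moments(one_shot)
print(m1, m2)
```

With more components on the chain, a single run supplies more intervals, so the moment estimates improve without requiring an ensemble of systems.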
4.4. Loop Networks and Parallel-Series Type Diagram
With a loop structure in a network, the degradation path is not unique, which leads to a parallel-series type diagram. The diagram cannot be reduced to single blocks. Consequently, the constraints for the MaxEnt are no longer linear. Assume there are m paths from the target component to the source, and the lengths of the paths are denoted by
. The reliability block diagram is presented in Figure 4.
Taking the network in Figure 1d as an example, the components are labeled by
,
, and
. The joint distribution of the lifetime is:
where
is the lifetime distribution of component
.
The constraints, for example, are the first-order moments for each individual component, namely:
where
is the survival probability of the homogeneous distribution
, and
are the Lagrange multipliers with
. The factor 3 in Equation (20) is added to simplify the following calculation; it does not affect the distribution determined by the MaxEnt. The first term in Equation (20) is the single-component moment, and the second term is the moment for the series structure of two components. These constraints are obtained directly from the reliability block diagram presented in Figure 5.
To see the non-linearity of the constraint in Equation (20), rewrite the entropy explicitly:
which shows that the entropy of the three-dimensional distribution in Equation (19) is proportional to that of the one-dimensional distribution
. It follows from Equation (20) that:
where the constraint is nonlinear in
, although still linear in p. The degradation model converts the linear constraint on the high-dimensional distribution into a non-linear constraint on the low-dimensional distribution.
From the structure-dependent joint probability distribution in Equation (19) and the constraints in Equation (20), the hazard rate for each component is inferred as:
with
. The solution is:
The above discussion presents the reduction of the reliability block diagram. It is worth mentioning that the reduction depends on the network structure and the rule of degradation propagation.
6. Conclusions and Discussion
In this work, a novel MaxEnt-based approach for multi-component systems was proposed to assess the reliability of non-repairable and repairable systems. The developed approach provides a rational way to estimate the hazard rates of a system consisting of correlated degrading components. Combined with the reliability block diagram, one-shot data can be used for the estimation. The case study shows that the developed approach can yield reliable results with limited and noisy data at the early stage.
The application of the approach generally involves the following steps, as presented in Figure 8: (1) form a network with nodes representing the multi-component system, (2) build the variational joint distribution based on the network, (3) collect the observed lifetime (recovery duration) data of the components as testable information, (4) process the observed data according to the reliability block diagram and calculate the moments, and (5) maximize the entropy of the variational joint distribution with the moment constraints. For many artificial systems, the network structure is usually known, and the network among the components can be constructed accordingly. For systems with an unknown structure, the network structure must also be inferred. Such inference is not the subject of this paper. Relevant discussions of network modeling can be found in [29,30]. Combining the inference of network structure and dynamics will be studied in future work.