2.1. The Proposed Framework for an Adaptive Sampling Strategy
In response to the four segment interval problems (P1–P4), an adaptive sampling framework is proposed by adding targeted components to deal with the issue of degradation. The additions are marked by red dashed lines in Figure 5.
2.1.1. Hyperparameter Initialization
In addition to the traditional hyperparameters, those related to the distribution of sampling objects are introduced to clarify the sampling needs in P1, namely the target segment interval and the target sample quantity. The target segment interval refers to the desired degradation indicator intervals, and the target sample quantity denotes the desired number of samples in each target segment interval.
The most widely used distribution is the uniform distribution, which has a target sample quantity of 1 in each interval and equal-length target segment intervals; in this case, only the degradation interval needs to be specified. In addition, considering that the same amount of degradation may have a different influence under different conditions, just as people have different sensitivity to pain at different ages, these distributions must be determined on a case-by-case basis.
2.1.2. Time Series Collection
The collection of the initial time series relies on an initial sampling strategy, whereas the subsequent sampling is based on the results of condition predictions in the sampling regulator.
2.1.3. Transforming from a Time Series to a Degradation Series
The main difference between the proposed framework and traditional strategies lies in the added operation of the variable swap, i.e., converting the time series to a degradation series. Time series prediction enables numerical inference at specific future times; however, it cannot estimate the time at which a specific degradation value will be reached. To eliminate the theoretical error of time estimation in P4, a variable swap is designed to exchange the independent and dependent variables of the time series.
Before this conversion, the time series must be monotonic, a condition that is satisfied in most cases, since degradation itself is irreversible. For the fluctuations caused by noise and measurement error, smoothing and monotonization can be used to calibrate them to approximate the actual degradation.
2.1.4. Degradation Prediction
As most degradation laws are nonlinear, the degradation series obtained after the variable swap is commonly an irregular series, regardless of whether the collected samples form an irregular time series; its prediction is therefore the ITSP issue of P2.
One possible approach is to transform the irregular series into a regular series and utilize regular time series forecasting methods for prediction. It would be preferable to predict the irregular series directly, avoiding the transformation error; however, existing solutions of this sort are applied only in a few areas, such as astronomy [25]. The feasibility of existing ITSP methods can be explored for specific datasets. In addition, machine learning methods are very promising options. Meanwhile, the realization of prediction settles the time lag problem of P3.
2.1.5. Segment Interval Calculation
After obtaining the forecasted segment interval, the actual segment interval still requires the consideration of certain time boundaries to avoid possible surprises, as the laws on which our predictions are based may change or even mutate. These boundaries should be assigned values during the hyperparameter initialization of Section 2.1.1. Subsequently, based on the actual segment interval, a new sample is obtained in a loop until the failure threshold is reached.
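The loop formed by Sections 2.1.2–2.1.5 can be sketched as follows. Everything here is an illustrative stand-in rather than the paper's implementation: the sensor model, the linear extrapolation used in place of a real degradation prediction, and all parameter values are assumptions.

```python
def measure(t):
    """Hypothetical sensor: returns the degradation indicator at time t."""
    return 0.01 * t  # stand-in for a real, noisy measurement


def predict_next_time(times, values, target_step):
    """Stand-in for the degradation-series prediction (Section 2.1.4):
    linearly extrapolates when the indicator will grow by target_step."""
    slope = (values[-1] - values[0]) / (times[-1] - times[0])
    return times[-1] + target_step / slope


def adaptive_sampling(target_step=0.05, failure_threshold=0.5,
                      i_min=1.0, i_max=20.0, n_init=3, init_interval=5.0):
    # Section 2.1.2: initial time-based sampling to accumulate data
    times = [k * init_interval for k in range(n_init)]
    values = [measure(t) for t in times]
    # loop until the failure threshold is reached
    while values[-1] < failure_threshold:
        t_next = predict_next_time(times, values, target_step)
        # Section 2.1.5: bound the segment interval by the time boundaries
        interval = min(max(t_next - times[-1], i_min), i_max)
        times.append(times[-1] + interval)
        values.append(measure(times[-1]))
    return times, values


times, values = adaptive_sampling()
```

With the linear stand-ins above, each predicted interval equals 5 time units, so the loop reproduces uniform sampling of the degradation indicator, which is the intended behavior when the target distribution is uniform.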
2.2. A Proposed Method for Mechanical Systems
Mechanical equipment is pervasive in industry, and a great deal of effort has been dedicated to its reliability and safe operation. At present, condition-based maintenance (CBM) is the state-of-the-art solution for counteracting the influence of mechanical degradation on reliability and safety. The realization of CBM includes the following three steps [26,27]: data acquisition, data processing, and maintenance decision-making. The latter two have received the most attention. Taking the popular statistical learning techniques as an example, many methods have been applied in degradation monitoring, including supervised learning [28], unsupervised learning [29], transfer learning [30], statistical models [31], integrated learning [32], etc. For existing data problems, a large number of studies have also been carried out to reduce their impacts in the tasks of classification [33,34] and regression [5,35]. However, they all deal with this problem from the perspective of methodology, and solutions from a data perspective have been underestimated. Considering its significance in practice, mechanical degradation monitoring was chosen as the object of the framework's application.
Combined with the specific characteristics of mechanical degradation, a concrete method is advanced based on the proposed framework. The method can be applied to available monitoring data in numerical formats, including pressure, temperature, acoustic emission, wear amount, voltage, current, etc. The applicability of the method depends on whether the data type can effectively reflect the condition degradation, which needs to be determined according to the specific scenario. Since the method is implemented based on condition prediction, the most favorable conditions are closely related to the predictability of the degradation, as follows: (1) the degradation indicator can effectively represent the degradation; (2) the degradation process is stable; (3) the degradation law is consistent and no mutation occurs. The method flow is illustrated in Figure 6.
Step 1. Hyperparameter initialization. The first step is the determination of the target sampling distribution. Generally, the existing condition-based methods tend to accelerate sampling for poor health conditions and reduce it for good ones. Although not explicitly mentioned, the logic behind this is to make the difference between adjacent samples as small as possible; in other words, the implied target sampling distribution is the uniform distribution. Occasionally, the target may be a non-uniform distribution, which needs to be determined specifically.
In addition, the initial sampling strategy and the sampling boundaries should be specified. The former generally adopts a time-based strategy for the data accumulation process before the loop. For the latter, the upper sampling limit guards against a sudden change in the condition law as well as possibly large prediction errors, and the lower sampling limit prevents the predicted sampling moment from falling earlier than the moment at which the prediction is completed.
Step 2. Convert the time series to a degradation series. After obtaining the samples, we need to check whether they have reached the failure threshold. If so, sampling is finished and an output is obtained from the samples. Otherwise, the time series is smoothed via robust locally weighted regression, which balances the trend well and is especially useful for the robust handling of outliers. If the time series does not satisfy monotonicity at this point, additional monotonic processing is required. Finally, the independent and dependent variables of the time series are exchanged to obtain the degradation series.
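A minimal sketch of this step, using a centered moving average as a lightweight stand-in for robust locally weighted regression (the running maximum provides the monotonization, and the swap returns degradation as the independent variable):

```python
import numpy as np

# Step 2 sketch: smooth the noisy time series, enforce monotonicity, then
# swap variables to obtain the degradation series. The moving average below
# is an illustrative substitute for robust locally weighted regression
# (e.g. LOWESS); the monotonization and swap are as described in the text.

def to_degradation_series(t, h, window=5):
    t = np.asarray(t, dtype=float)
    h = np.asarray(h, dtype=float)
    kernel = np.ones(window) / window
    # pad the edges so the smoothed series keeps its original length
    padded = np.concatenate([h[:1].repeat(window // 2), h,
                             h[-1:].repeat(window // 2)])
    smoothed = np.convolve(padded, kernel, mode="valid")
    # monotonization: degradation is irreversible, so take the running maximum
    mono = np.maximum.accumulate(smoothed)
    # variable swap: degradation becomes the independent variable
    return mono, t

rng = np.random.default_rng(0)
t = np.arange(50.0)
h_noisy = 0.02 * t + rng.normal(0, 0.01, size=t.size)
deg, time = to_degradation_series(t, h_noisy)
```

After the swap, `deg` plays the role of the independent variable and `time` the dependent one, so the degradation series is {(deg_i, time_i)}.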
Step 3. Degradation series prediction. After the variable swap, the obtained sequences are basically irregular series. In this method, a regular time series forecasting method is selected to realize irregular time series forecasting, on the basis of interpolating the irregular time series to obtain a regular series. Before this, the minimum count of the degradation series, V_1, must be set to meet the data volume requirement for prediction. Accordingly, the maximum allowable degradation interval can be calculated as

MADI = (h_{last} − h_1)/(V_1 − 1),

where MADI is the abbreviation for the maximum degradation interval, and h_1 and h_{last} are the first and last items of the degradation indicator, respectively.
On this basis, an actual segment interval can be determined and further utilized for scale transformation with Equation (2), that is, the segment interval selection and degradation series transformation:

SI_s = min(SI_d, MADI),  S_t = S/SI_s.   (2)

Here, SI_s and SI_d are the selected and target segment intervals, respectively; S and S_t respectively indicate the sequence before and after transformation; and min( ) is the minimum function.
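A small numerical sketch of the interval selection and scale transformation. The forms used here (MADI = (h_last − h_1)/(V_1 − 1), SI_s = min(SI_d, MADI), S_t = S/SI_s) are reconstructions from the surrounding definitions rather than the original equations:

```python
import numpy as np

# Segment interval selection and scale transformation (assumed forms):
# MADI spreads the observed degradation span over the minimum series
# length V1, the selected interval SIs is capped by it, and the series
# is rescaled so that SIs maps to unit length.

def select_and_transform(h, si_target, v1):
    h = np.asarray(h, dtype=float)
    madi = (h[-1] - h[0]) / (v1 - 1)   # assumed form of MADI
    si_s = min(si_target, madi)        # segment interval selection
    th = h / si_s                      # scale transformation
    return si_s, th

si_s, th = select_and_transform([0.0, 0.3, 0.9, 1.2], si_target=0.5, v1=5)
```

In this example MADI = 1.2/4 = 0.3 < 0.5, so MADI is selected, and the rescaled sequence has the selected interval as its unit length.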
Let the transformed degradation sequence be expressed as {th_i: i = 1, 2, …, n} and construct a series of interpolation points {H_j}. Since the degradation sequence was normalized by the selected segment interval via scale transformation, the actual segment interval has been converted to the unit length in {th_i}. Consequently, {H_j} is an arithmetic sequence whose last term is th_{last}, whose common difference is 1, and whose index runs j = 1, 2, …, floor(th_{last} − th_1), where floor( ) only outputs the integer part of the value in parentheses. Then, the piecewise cubic Hermite interpolating polynomial is selected to obtain the regular series, as it preserves the data's shape and the corresponding monotonicity, which is exactly what we want. For a subinterval [h_k, h_{k+1}], let

δ_k = h_{k+1} − h_k,  d_k = (T_{k+1} − T_k)/δ_k,

where s_k is the slope at the point h_k, set from the secant slopes d_k of the piecewise linear interpolation so that monotonicity is preserved. The fitted cubic polynomial F(h) can be represented as follows, for h_k ≤ h ≤ h_{k+1}:

F(h) = T_k + s_k(h − h_k) + c_k(h − h_k)^2 + e_k(h − h_k)^3,

with c_k = (3d_k − 2s_k − s_{k+1})/δ_k and e_k = (s_k + s_{k+1} − 2d_k)/δ_k^2.
Substituting {H_j} into F(h) on the corresponding subintervals yields the new time sequence {T_j}; thus, the time series {(t_j, h_j)} is transformed into a degradation series {(H_j, T_j)}.
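This interpolation step can be sketched with SciPy's `PchipInterpolator`, a standard implementation of the piecewise cubic Hermite interpolating polynomial; the sample series below is invented for illustration:

```python
import numpy as np
from scipy.interpolate import PchipInterpolator

# Regularization sketch: build the unit-spaced interpolation points {Hj}
# on the rescaled degradation axis and evaluate a shape- and
# monotonicity-preserving piecewise cubic Hermite interpolant to obtain
# the regular series {(Hj, Tj)}.

# irregular degradation series: th (rescaled indicator), t (sampling times)
th = np.array([0.0, 0.7, 1.9, 3.2, 4.4, 5.0])
t = np.array([0.0, 3.0, 8.0, 15.0, 24.0, 30.0])

n_pts = int(np.floor(th[-1] - th[0]))    # floor(th_last - th_1)
H = th[-1] - np.arange(n_pts, -1, -1.0)  # difference 1, last term th_last
F = PchipInterpolator(th, t)             # monotone cubic Hermite interpolant
T = F(H)                                 # regular time sequence {Tj}
```

Because PCHIP preserves monotonicity, the resulting {T_j} remains increasing whenever the degradation series is, which keeps the later time prediction physically meaningful.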
Afterwards, the autoregressive integrated moving average (ARIMA) model is utilized for sampling time prediction, which has been widely adopted and proven effective for mechanical degradation processes. This model can be represented by ARIMA(p, d, q), where p is the lag order, denoting the number of lag observations in the model; d is the differencing degree, referring to the number of times the raw observations are differenced; and q is the moving average order, which means the size of the moving average window. The model is expressed as follows:

φ(B)(1 − B)^d T_t = θ(B)ε_t,

where ε_t is the random error at time t; φ( ) is the p-order autoregressive coefficient polynomial and θ( ) is the q-order moving average coefficient polynomial, and they are expressed by Equation (7):

φ(B) = 1 − φ_1 B − φ_2 B^2 − … − φ_p B^p,
θ(B) = 1 − θ_1 B − θ_2 B^2 − … − θ_q B^q.   (7)
Here, B is the backshift operator, defined as

B T_t = T_{t−1},

where T_t and T_{t−1} represent the t-th and (t−1)-th elements in {T_i}.
The Box–Jenkins methodology is utilized to set up an ARIMA model that only needs a one-step prediction to forecast the sampling time of the target segment interval.
Step 4. Segment interval output. The upper and lower sampling limits are denoted as I_max and I_min, and the actual segment interval is decided as follows:

SI_a = I_min, if SI_p < I_min;  SI_a = SI_p, if I_min ≤ SI_p ≤ I_max;  SI_a = I_max, if SI_p > I_max,

where SI_a denotes the final segment interval and SI_p is the predicted interval. In this way, we can sample with SI_a and update the sample set of the time series. Afterwards, we return to Step 2 and loop until the termination condition is reached.
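The decision rule above amounts to clamping the predicted interval to the sampling limits, which can be written as a one-line helper:

```python
# Step 4 sketch: clamp the predicted segment interval SIp to the sampling
# limits [Imin, Imax] fixed during hyperparameter initialization.

def final_interval(si_p, i_min, i_max):
    return min(max(si_p, i_min), i_max)

print(final_interval(5.0, 1.0, 10.0))   # within limits -> 5.0
print(final_interval(0.2, 1.0, 10.0))   # below lower limit -> 1.0
print(final_interval(50.0, 1.0, 10.0))  # above upper limit -> 10.0
```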