1. Introduction
Using predictive analytics to foresee business outcomes can save time, money, and effort, because more information becomes available to support better decisions. By providing precise and trustworthy insights, it also helps organizations uncover opportunities and address their own challenges; for example, aggregate data can be analyzed to find new prospects for customer acquisition. Prediction, which is important in many domains, has drawn greater attention in recent years. For instance, in business, an experimenter may wish to estimate the lifespan of an unseen future unit using data from a present sample. The producer or experimenter may then release goods into the market with the intention of capturing customers’ attention and increasing demand by lowering the thresholds for their warranties. See [1,2,3] for additional details regarding the applicability of predictions in business and industry.
In parametric estimation, we usually assume that a random sample from a preset number of units (n) is available and that the observations follow a certain distribution. In practice, however, not all unit observations may be accessible, as some are either not captured or are lost during intermediate transmission. Such data are referred to as censored samples, and the literature offers techniques for estimation based on them. From a theoretical standpoint, censoring may not be the most efficient way of conducting experiments because it reduces the number of observations. However, administrative, logistical, and budgetary limitations often mean that only restricted observations are available. In other words, censoring schemes trade some loss of estimator efficiency for administrative simplicity and lower cost.
Numerous censoring methods have been examined in the statistical literature, including time censoring (Type I censoring), item censoring (Type II censoring), Type I and Type II hybrid censoring, and progressive censoring. Under Type I censoring, the experiment is stopped at a preset time (T), so the length of the test is fixed, but the number of observations in the censored sample is random. Under Type II censoring, the test is halted after a predefined number of observations, which guarantees a certain efficiency, but the duration of the test becomes random. Hybrid censoring, created by combining the Type I and Type II censoring methods, provides a more flexible and useful life-testing technique.
Under Type I hybrid censoring, the test is ended as soon as either a predetermined time T has elapsed or a predetermined number (R) of observations has been acquired. For example, if $X_{R:n}$ stands for the R-th ordered failure time, the experiment will end at $\min\{X_{R:n}, T\}$. This ensures that the test will take no longer than T, but the number of observations is random and cannot exceed R. Epstein [4] first presented the Type I hybrid censoring method, which was utilized in a subsequent reliability acceptance test [5]. With respect to Type I hybrid censored data, several authors have studied the estimation of the unknown parameters for different probability distributions (see, for example, [6,7,8,9]).
In several fields, such as medicine, the military, and aeronautics, efficiency levels are more important than experiment duration. Therefore, a censoring strategy may be proposed in which the test is stopped only when both a predetermined number of observations has been gathered and a predetermined time has been reached, i.e., the experiment is stopped at $\max\{X_{R:n}, T\}$. This method is called the Type II hybrid censoring technique. The total number of observations under this scheme is random, but it will not be less than R. This ensures a minimal efficiency, although the length of the test can vary and continue beyond T. A thorough analysis of hybrid censoring schemes, with generalizations and applications in competing-risks and step-stress modeling, was provided by [10]. For parameter estimation under Type II hybrid censoring schemes, see [11,12,13].
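To make the two stopping rules concrete, the following Python sketch applies them to a list of complete lifetimes. The function name `hybrid_censor` and the toy lifetimes are illustrative assumptions rather than part of the cited methods, and ties are ignored, as is usual for continuous lifetimes.

```python
def hybrid_censor(lifetimes, r, t, scheme):
    """Return (stop_time, observed_failures) for a hybrid censored life test.

    lifetimes : failure times of all n units (any order)
    r         : predetermined number of failures (R in the text)
    t         : predetermined time (T in the text)
    scheme    : "I"  -> stop at min(X_{R:n}, T)
                "II" -> stop at max(X_{R:n}, T)
    """
    x = sorted(lifetimes)
    if not 1 <= r <= len(x):
        raise ValueError("r must satisfy 1 <= r <= n")
    x_r = x[r - 1]  # R-th ordered failure time
    if scheme == "I":
        stop = min(x_r, t)
    elif scheme == "II":
        stop = max(x_r, t)
    else:
        raise ValueError("scheme must be 'I' or 'II'")
    # Only failures that occur up to the stopping time are observed.
    observed = [v for v in x if v <= stop]
    return stop, observed


# Hypothetical lifetimes, chosen only to illustrate the two rules.
lifetimes = [0.4, 1.1, 1.9, 2.5, 3.2, 4.8]
print(hybrid_censor(lifetimes, r=4, t=2.0, scheme="I"))   # (2.0, [0.4, 1.1, 1.9])
print(hybrid_censor(lifetimes, r=4, t=2.0, scheme="II"))  # (2.5, [0.4, 1.1, 1.9, 2.5])
```

In the toy run, Type I hybrid censoring stops at T with fewer than R failures, whereas Type II hybrid censoring continues until the R-th failure.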
The prediction of future order statistics arises naturally in a variety of real-world circumstances. Here, we focus on predicting future order statistics within the Bayesian paradigm. The predictive posterior distribution was first discussed by [14] in relation to prediction problems. Since then, various censoring schemes have been incorporated into prediction tasks. Ebrahimi [15] provided two examples of Type I hybrid censoring prediction problems for exponential distributions. The one-sample and two-sample prediction problems based on Type I hybrid censored samples were studied by [16] for a general class of distributions and for the generalized Lindley distribution. Based on a Type II hybrid censored sample from a general class of distributions, Balakrishnan and Shafay [17] developed an approach for the one-sample and two-sample prediction problems.
The alpha-power Weibull (APW) distribution is significant because it extends the Weibull distribution and can model monotone and non-monotone failure rate functions, which are crucial in reliability research. In fact, the widespread application of the Weibull distribution in reliability theory and the flexibility of its generalizations in lifetime data analysis were the main motivations for the APW distribution. Nassar et al. [18] developed the APW model as a new generalization of the Weibull distribution. The model is described by its probability density function (pdf), cumulative distribution function (cdf), hazard rate function (hrf), and reversed hazard rate function (rhrf), together with the mean time to failure (MTF).
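For reference, the functions named above can be sketched in the parametrization commonly used for the alpha power transformation of a Weibull baseline. The symbols $\alpha$, $\beta$, and $\lambda$ are assumptions here and may differ from the notation used in the remainder of this article; the hrf, rhrf, and MTF lines follow from the pdf and cdf by their general definitions. For $x > 0$, $\alpha > 0$, $\alpha \neq 1$, $\beta > 0$, and $\lambda > 0$:

```latex
\begin{align*}
F(x) &= \frac{\alpha^{\,1-e^{-\lambda x^{\beta}}} - 1}{\alpha - 1}, \\
f(x) &= \frac{\log\alpha}{\alpha - 1}\,\lambda\beta\,x^{\beta-1}\,
        e^{-\lambda x^{\beta}}\,\alpha^{\,1-e^{-\lambda x^{\beta}}}, \\
h(x) &= \frac{f(x)}{1-F(x)}
      = \frac{\log\alpha\;\lambda\beta\,x^{\beta-1}\,e^{-\lambda x^{\beta}}\,
        \alpha^{\,1-e^{-\lambda x^{\beta}}}}{\alpha - \alpha^{\,1-e^{-\lambda x^{\beta}}}}, \\
r(x) &= \frac{f(x)}{F(x)}
      = \frac{\log\alpha\;\lambda\beta\,x^{\beta-1}\,e^{-\lambda x^{\beta}}\,
        \alpha^{\,1-e^{-\lambda x^{\beta}}}}{\alpha^{\,1-e^{-\lambda x^{\beta}}} - 1}, \\
\mathrm{MTF} &= \int_{0}^{\infty} \bigl[1 - F(x)\bigr]\,dx .
\end{align*}
```

The MTF is written here only through its defining integral of the reliability function; no closed-form or series expression is assumed.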
Despite the wealth of research on estimation under hybrid censoring, and although the APW distribution has been shown to be superior to many competing distributions, including the Weibull, alpha power exponential, McDonald Weibull, beta Weibull, transmuted Weibull, gamma Lomax, Zografos–Balakrishnan log-logistic, and exponentiated Weibull distributions, no work has discussed the prediction of future order statistics based on hybrid censored samples for the APW model. In addition, little is available on classical and Bayesian estimation of the unknown parameters, reliability, hazard rate functions, and the MTF of the APW distribution under Type II hybrid censoring. These gaps in the literature prompted this work, which has three major purposes. The first goal is to estimate the unknown parameters, reliability, hazard rate functions, and the MTF of the APW distribution using classical and Bayesian methods under Type II hybrid censoring, and to calculate the corresponding interval estimates for the model parameters. The second goal is to address the one- and two-sample Bayesian prediction problems and to construct the corresponding Bayesian prediction interval limits based on Type II hybrid censored data. The third goal is to perform a simulation study to assess the theoretical results of the paper and to illustrate the proposed methodology with real data.
The rest of the article is organized as follows. In Section 2, both classical and Bayesian approaches are used to estimate the parameters, reliability, hazard rate, and the MTF under Type II hybrid censoring, and the associated interval limits of the unknown parameters are determined. Section 3 derives the one- and two-sample Bayesian predictive posterior densities and constructs the prediction bounds for future samples. The results of the Monte Carlo simulation are studied in Section 4 to compare the performance of the suggested estimators. In Section 5, real data are examined as examples. Concluding remarks are provided in Section 6.
4. Simulation
In this section, we present simulation results to examine how the various approaches behave for different sample sizes and censoring times. We estimate the unknown parameters using the MLE and the Bayes estimators computed via the MCMC approach, and we compare the estimators in terms of relative absolute bias (RAB) and mean squared error (MSE). In addition, Bayes predictions are obtained for future order statistics in both the one-sample and two-sample prediction problems, based on the available hybrid censored data. Specifically, we wish to estimate the posterior predictive density of the future observation given the data and to construct a predictive interval for it at a prescribed level. We consider these two cases separately.
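The two accuracy measures can be sketched as follows. The RAB convention used here (mean absolute deviation divided by the true value) is an assumption, since other definitions appear in the literature; the function names and the toy estimates are hypothetical.

```python
def rab(estimates, true_value):
    """Relative absolute bias: mean |estimate - true| divided by |true|.

    Assumed convention; some authors instead use |mean(estimate) - true| / true.
    """
    n = len(estimates)
    return sum(abs(e - true_value) for e in estimates) / (n * abs(true_value))


def mse(estimates, true_value):
    """Mean squared error of the replicated estimates around the true value."""
    n = len(estimates)
    return sum((e - true_value) ** 2 for e in estimates) / n


# Hypothetical replicated estimates of a parameter whose true value is 2.0.
estimates = [1.8, 2.1, 2.3]
print(rab(estimates, 2.0), mse(estimates, 2.0))
```

In a simulation study, `estimates` would hold one estimate per Monte Carlo replication for a given design.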
We generate hybrid censored samples from the APW distribution 5000 times for four settings of the true parameter values, with censoring times T = 3 and 4; T = 0.3 and 0.6; T = 0.15 and 0.4; and T = 0.5 and 0.8, respectively. We consider the sample sizes n = 50 and 100, with hybrid censored sample sizes r = 35 and 45 for n = 50, and r = 70 and 85 for n = 100. For Bayes prediction, the predicted observation indices are 40 and 45 for r = 35, 46 and 48 for r = 45, 80 and 85 for r = 70, and 88 and 95 for r = 85.
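One replication of such a design can be sketched in Python by inverting the cdf. The parametrization (alpha, beta, lam), the parameter values, and the function names below are assumptions for illustration; they are not the settings used in the study.

```python
import math
import random


def apw_cdf(x, alpha, beta, lam):
    """APW cdf in the assumed (alpha, beta, lam) parametrization."""
    if x <= 0:
        return 0.0
    return (alpha ** (1.0 - math.exp(-lam * x ** beta)) - 1.0) / (alpha - 1.0)


def apw_quantile(u, alpha, beta, lam):
    """Inverse of apw_cdf for 0 <= u < 1 (alpha > 0, alpha != 1)."""
    # w equals 1 - exp(-lam * x**beta), i.e. the baseline Weibull cdf at x.
    w = math.log(1.0 + u * (alpha - 1.0)) / math.log(alpha)
    w = min(w, 1.0 - 2.0 ** -53)  # guard against rounding to exactly 1
    return (-math.log(1.0 - w) / lam) ** (1.0 / beta)


def simulate_type2_hybrid(n, r, t, alpha, beta, lam, rng):
    """Draw n APW lifetimes and censor them at max(X_{r:n}, t)."""
    x = sorted(apw_quantile(rng.random(), alpha, beta, lam) for _ in range(n))
    stop = max(x[r - 1], t)
    observed = [v for v in x if v <= stop]
    return stop, observed


# Hypothetical design: n = 50, r = 35, T = 1.0, with made-up parameter values.
rng = random.Random(2023)
stop, observed = simulate_type2_hybrid(50, 35, 1.0, 2.0, 1.5, 0.5, rng)
print(stop, len(observed))
```

Repeating the last call many times and estimating the parameters from each `observed` sample would give the replicated estimates on which RAB and MSE are computed.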
The Bayes estimates are computed under informative priors. The hyper-parameters are elicited from the MLE information, using the asymptotic distribution of the MLE and Gibbs samples. For interval estimation, we compare the average lower and upper limits of the asymptotic confidence intervals (ACI) based on the MLE with the highest posterior density (HPD) intervals of the Bayesian estimators, both at the 95% level, and we also report approximate 95% confidence intervals for the unknown parameters. The procedure is repeated 1000 times, and the average confidence/credible limits are reported.
Table 1, Table 2, Table 3 and Table 4 present the complete results. Because the theoretical findings of d and q obtained using the suggested estimation methods cannot be represented in closed form, the estimates were computed numerically in the R programming language: the “maxLik” package was used for the MLE, and the “coda” package was used for the Bayesian MCMC estimation based on the Metropolis–Hastings (MH) algorithm.
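As an illustration of the objective that a numerical optimizer or an MH sampler would evaluate, the following Python sketch writes a Type II hybrid censored log-likelihood, up to an additive constant. It is a stand-in for, not a transcription of, the R code used in the study; the (alpha, beta, lam) parametrization and all function names are assumptions.

```python
import math


def apw_logpdf(x, alpha, beta, lam):
    """Log of the APW pdf in the assumed (alpha, beta, lam) parametrization."""
    e = math.exp(-lam * x ** beta)
    return (math.log(math.log(alpha) / (alpha - 1.0))
            + math.log(lam * beta)
            + (beta - 1.0) * math.log(x)
            - lam * x ** beta
            + (1.0 - e) * math.log(alpha))


def apw_logsf(x, alpha, beta, lam):
    """Log of the APW reliability (survival) function 1 - F(x)."""
    e = math.exp(-lam * x ** beta)
    return math.log((alpha - alpha ** (1.0 - e)) / (alpha - 1.0))


def hybrid_loglik(params, observed, n, stop):
    """Censored log-likelihood: observed failures plus n - d survivors at `stop`.

    observed : the d failure times recorded up to the stopping time
    n        : total number of units on test
    stop     : stopping time, max(X_{r:n}, T) under Type II hybrid censoring
    """
    alpha, beta, lam = params
    if alpha <= 0 or alpha == 1 or beta <= 0 or lam <= 0:
        return float("-inf")  # outside the parameter space
    d = len(observed)
    ll = sum(apw_logpdf(x, alpha, beta, lam) for x in observed)
    if n > d:
        ll += (n - d) * apw_logsf(stop, alpha, beta, lam)
    return ll
```

Maximizing `hybrid_loglik` over `params` gives the MLE numerically; adding a log-prior to it gives an unnormalized log-posterior that an MH sampler could target.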
The simulation results lead to the following findings:
As the sample size increases, the RAB and MSE of both the MLE and Bayes estimates of the APW parameters decrease;
The Bayes estimates consistently outperform the MLE in terms of RAB, MSE, and interval estimates;
In most cases, the HPD intervals are shorter than the ACIs;
When the censored sample size r increases while the sample size n and the censoring time T are kept constant, the performance improves;
When the sample size n and the censored sample size r are kept constant, the performance improves as the censoring time T lengthens.
5. Application
Reference [18] used two real-life data sets and confirmed that the APW distribution fits better than many competing distributions, such as the Weibull, alpha power exponential, McDonald Weibull, beta Weibull, transmuted Weibull, gamma Lomax, Zografos–Balakrishnan log-logistic, and exponentiated Weibull distributions. The APW distribution has further applications, such as inference and engineering applications based on progressive Type II censoring (see [27]); the optimal test plan of step-stress models under progressively Type II censored samples (see [28]); and the discrete APW distribution and its applications (see [29]). In this section, we consider two real-life data sets, which are discussed by [18], and illustrate the methods proposed in the previous sections.
The first data set, taken from [18] as an application of the APW distribution, represents the survival times (in days) of 109 successive coal-mining disasters in Great Britain within the period of 1875–1951. The summary statistics of these data are as follows: the minimum value is 1; the first quartile is 54; the median is 145; the mean is 233.3; the third quartile is 312; and the maximum value is 1630. Before moving on, we examine the data set using a scaled Total Time on Test (TTT) plot, a strip plot, a violin plot, and the empirical hazard function of the observed data.
Figure 1 gives a clear picture of the shape of the hazard function, which is decreasing, with the scaled TTT curve lying below the diagonal. Figure 1 and Figure 2 indicate that the APW is a good model to describe these data. Figure 2 shows the theoretical and empirical pdf, Q-Q, cdf, and P-P plots of the APW distribution for the survival times data, and it can be seen that the APW is suitable and reliable for fitting this data set.
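The scaled TTT transform behind such a plot can be computed with a short Python sketch. The function name and the toy data are illustrative assumptions; the transform itself is the standard one, in which a curve below the diagonal suggests a decreasing hazard rate and a curve above it suggests an increasing one.

```python
def scaled_ttt(data):
    """Return the points (i/n, TTT_i / TTT_n) of the scaled TTT plot.

    TTT_i = sum of the i smallest observations + (n - i) * i-th smallest,
    so TTT_n is simply the sum of all observations.
    """
    x = sorted(data)
    n = len(x)
    total = sum(x)
    points = [(0.0, 0.0)]
    running = 0.0
    for i, v in enumerate(x, start=1):
        running += v
        points.append((i / n, (running + (n - i) * v) / total))
    return points


# Hypothetical data; here every point lies above the diagonal (increasing hazard).
for u, ttt in scaled_ttt([4, 1, 3, 2]):
    print(u, ttt)
```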
Reference [18] obtained the MLEs of the APW parameters as 0.01956, 1.04985, and 0.00106, with a Kolmogorov–Smirnov distance (KSD) of 0.0592 and a p-value of 0.8383.
Table 5 reports the MLE and Bayesian estimates of the APW parameters under hybrid censoring for different sample sizes, together with the standard errors (SE), the lower and upper limits of the 95% confidence intervals, and the reliability and hazard values.
Figure 3 plots the likelihood for the survival times data in order to check the existence and uniqueness of the maximum likelihood estimates. For r = 60 and T = 110, the likelihood attains its maximum at parameter values that coincide with the MLEs in Table 5, which supports these estimates.
The MCMC draws are approximately normal, with symmetric posterior density histograms, and their convergence is confirmed by the Brooks–Gelman–Rubin (BGR) statistic, as shown in Figure 4 and Figure 5. According to Figure 6, the values for the MCMC series, which began with zero and ended with one, do not exhibit any autocorrelation.
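A convergence check of this kind can be sketched with the basic Gelman–Rubin potential scale reduction factor. This is a simplified illustration and not necessarily the exact BGR variant computed by the “coda” package; the function name and the toy chains are assumptions.

```python
import math
import statistics


def gelman_rubin(chains):
    """Basic potential scale reduction factor for m chains of equal length n.

    Values close to 1 suggest that the chains have mixed; values well above 1
    suggest that they have not converged to a common distribution.
    """
    m = len(chains)
    n = len(chains[0])
    if m < 2 or n < 2 or any(len(c) != n for c in chains):
        raise ValueError("need at least two chains of equal length >= 2")
    chain_means = [statistics.fmean(c) for c in chains]
    # W: average within-chain variance.
    w = statistics.fmean([statistics.variance(c) for c in chains])
    # B / n: variance of the chain means (between-chain component).
    b_over_n = statistics.variance(chain_means)
    var_hat = (n - 1) / n * w + b_over_n
    return math.sqrt(var_hat / w)


# Hypothetical chains: the first pair agrees, the second pair clearly does not.
print(gelman_rubin([[1.0, 2.0, 3.0], [1.0, 2.0, 3.0]]))
print(gelman_rubin([[0.0, 1.0, 2.0], [10.0, 11.0, 12.0]]))
```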
We now turn to the one- and two-sample prediction problems. Based on the observed sample, Figure 7 presents the point predictions of the s-th order statistic for the one-sample and two-sample cases. Accordingly, the s-th failure will occur between the r-th and 109th observations, based on the observed sample. Based on the results in Table 6 and Figure 7, the two prediction methods give close results; to determine the better method, we rely on the KS test for two independent samples. The first method performs best because it has the smallest distance from the data and the largest p-value of the KS test.
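The distance used in such a comparison is the two-sample Kolmogorov–Smirnov statistic, that is, the largest gap between the two empirical cdfs. The following Python sketch computes only this distance (not the p-value); the function name and the toy samples are illustrative assumptions.

```python
def ks_two_sample_distance(a, b):
    """Maximum absolute difference between the empirical cdfs of a and b."""
    a, b = sorted(a), sorted(b)
    i = j = 0
    d = 0.0
    while i < len(a) and j < len(b):
        v = min(a[i], b[j])
        # Advance both empirical cdfs past every observation equal to v.
        while i < len(a) and a[i] <= v:
            i += 1
        while j < len(b) and b[j] <= v:
            j += 1
        d = max(d, abs(i / len(a) - j / len(b)))
    return d


# Hypothetical observed and predicted values.
observed = [1.0, 3.0, 5.0]
predicted = [2.0, 4.0, 6.0]
print(ks_two_sample_distance(observed, predicted))
```

A smaller distance between the predicted and the observed values indicates closer agreement, which is the criterion applied when comparing the two prediction methods.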
The authors of [30] provided the second data set. The data are the strengths of 1.5 cm glass fibers, as determined by the National Physical Laboratory in England. The summary statistics of these data are as follows: the minimum value is 0.55; the first quartile is 1.375; the median is 1.590; the mean is 1.507; the third quartile is 1.685; and the maximum value is 2.240. Reference [18] obtained the MLEs of the APW parameters for the glass fiber data as 10.8558, 4.48362, and 0.194777, with a KSD of 0.10661 and a p-value of 0.47107.
Figure 8 gives a clear picture of the shape of the hazard function, which is decreasing, with the scaled TTT curve lying below the diagonal. Based on Figure 8 and Figure 9, the APW is a good model to describe these data. Figure 9 shows the theoretical and empirical pdf, Q-Q, cdf, and P-P plots of the APW distribution for the glass fiber data, and it can be seen that the APW is suitable and reliable for fitting the glass fiber data.
Table 7 reports the MLE and Bayesian estimates of the APW parameters under hybrid censoring for different sample sizes, together with the SE, the lower and upper limits of the 95% confidence intervals, and the reliability and hazard values.
Figure 10 plots the likelihood for the glass fiber data in order to check the existence and uniqueness of the maximum likelihood estimates. For r = 40 and T = 1.5, the likelihood attains its maximum at parameter values that coincide with the MLEs in Table 7, which supports these estimates.
The MCMC draws for the APW parameters with the glass fiber data are approximately normal, with symmetric posterior density histograms, and the convergence measures are satisfactory, as shown in Figure 11 and Figure 12. According to Figure 13, the values for the MCMC series, which began with zero and ended with one, do not exhibit any autocorrelation for the APW parameters with the glass fiber data.
We now turn to the one- and two-sample prediction problems. Based on the observed sample, Figure 14 presents the point predictions of the s-th order statistic for the one-sample and two-sample cases with the glass fiber data. Accordingly, the s-th failure will occur between the r-th and 63rd observations, based on the observed sample. Based on the results in Table 8 and Figure 14, the two prediction methods give close results; to suggest the better method, we rely on the KS test for two independent samples. The first method performs best because it has the smallest distance from the data and the largest p-value of the KS test.