dbo:abstract
|
- Response modeling methodology (RMM) is a general platform for statistical modeling of a linear or nonlinear relationship between a response variable (dependent variable) and a linear predictor (a linear combination of predictors/effects/factors/independent variables), often called the linear predictor function. It is generally assumed that the modeled relationship is monotone convex (delivering a monotone convex function) or monotone concave (delivering a monotone concave function). However, many non-monotone functions, like the quadratic function, are special cases of the general model. RMM was initially developed as a series of extensions to the original inverse Box–Cox transformation: y = (1 + λz)^(1/λ), where y is a percentile of the modeled response, Y (the modeled random variable), z is the respective percentile of a normal variate and λ is the Box–Cox parameter. As λ goes to zero, the inverse Box–Cox transformation becomes y = exp(z), an exponential model. Therefore, the original inverse Box–Cox transformation contains a trio of models: linear (λ = 1), power (λ ≠ 1, λ ≠ 0) and exponential (λ = 0). This implies that, when λ is estimated from sample data, the final model is not determined in advance (prior to estimation) but rather emerges as a result of the estimation. In other words, the data alone determine the final model. Extensions to the inverse Box–Cox transformation were developed by Shore (2001a) and were denoted Inverse Normalizing Transformations (INTs). They were applied to model monotone convex relationships in various engineering areas, mostly to model physical properties of chemical compounds (Shore et al., 2001a, and references therein). Once it was realized that INT models may be perceived as special cases of a much broader general approach for modeling nonlinear monotone convex relationships, the new Response Modeling Methodology was initiated and developed (Shore, 2005a, 2011 and references therein).
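The trio of models contained in the inverse Box–Cox transformation can be illustrated with a minimal sketch. It assumes the standard form y = (1 + λz)^(1/λ), with y = exp(z) as the λ → 0 limit; the function name `inverse_box_cox` and the tolerance `tol` are illustrative choices, not part of RMM itself.

```python
import math


def inverse_box_cox(z, lam, tol=1e-12):
    """Inverse Box-Cox transformation: y = (1 + lam*z)**(1/lam).

    For lam close to zero the limiting exponential model y = exp(z) is used.
    The name and the tolerance are illustrative assumptions.
    """
    if abs(lam) < tol:
        # Exponential model (lam = 0)
        return math.exp(z)
    base = 1.0 + lam * z
    if base <= 0:
        raise ValueError("1 + lam*z must be positive")
    # Linear model when lam = 1, power model otherwise
    return base ** (1.0 / lam)


if __name__ == "__main__":
    for lam in (1.0, 0.5, 0.0):
        print(lam, inverse_box_cox(0.5, lam))
```

With λ = 1 the function returns 1 + z (linear), with other nonzero λ a power model, and with λ = 0 the exponential model, so the estimated value of λ selects the model form.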
The RMM model expresses the relationship between a response, Y (the modeled random variable), and two components that deliver variation to Y:
* The linear predictor function, LP (denoted η): η = β0 + β1X1 + ... + βkXk, where {X1,...,Xk} are regressor variables (“affecting factors”) that deliver systematic variation to the response;
* Normal errors, delivering random variation to the response.
The basic RMM model describes Y in terms of the LP, two possibly correlated zero-mean normal errors, ε1 and ε2 (with correlation ρ and standard deviations σε1 and σε2, respectively) and a vector of parameters {α,λ,μ} (Shore, 2005a, 2011): W = log(Y) = μ + (α/λ)[(η + ε1)^λ − 1] + ε2, where ε1 represents uncertainty (measurement imprecision or otherwise) in the explanatory variables (included in the LP). This is in addition to the uncertainty associated with the response (ε2). Expressing ε1 and ε2 in terms of standard normal variates, Z1 and Z2, respectively, having correlation ρ, so that ε1 = σε1 Z1 and ε2 = σε2 Z2, and conditioning Z2 | Z1 = z1 (Z2 given that Z1 is equal to a given value z1), we may write ε2 in terms of a single error, ε: ε2 = ρσε2 z1 + σε2 (1 − ρ²)^(1/2) Z = d z1 + ε, where Z is a standard normal variate, independent of both Z1 and Z2, ε is a zero-mean error and d is a parameter.
From these relationships, the associated RMM quantile function is (Shore, 2011): w = log(y) = μ + (α/λ)[(η + σε1 z)^λ − 1] + dz + ε, or, after re-parameterization: w = log(y) = log(MY) + (a/b)[(η + cz)^b − η^b] + dz + ε, where y is the percentile of the response (Y), z is the respective standard normal percentile, ε is the model's zero-mean normal error with constant variance, σ, {a,b,c,d} are parameters and MY is the response median (z = 0), dependent on the values of the parameters and the value of the LP, η: log(MY) = μ + (a/b)(η^b − 1) = log(m) + (a/b)(η^b − 1), where μ (or m) is an additional parameter. If it may be assumed that cz << η, the term (η + cz)^b in the above quantile function can be approximated, giving a simpler model. The parameter “c” cannot be “absorbed” into the parameters of the LP (η) since “c” and the LP are estimated in two separate stages (as expounded below).
If the response data used to estimate the model contain values that change sign, or if the lowest response value is far from zero (for example, when data are left-truncated), a location parameter, L, may be introduced so that the expressions for the quantile function and for the median become, respectively: w = log(y − L) = log(MY − L) + (a/b)[(η + cz)^b − η^b] + dz + ε and log(MY − L) = μ + (a/b)(η^b − 1). (en)
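The way the LP, the parameters {a,b,c,d}, the median and the location parameter L fit together can be sketched numerically. The sketch below is not a definitive implementation: it assumes that the re-parameterized quantile function has the form log(y − L) = log(MY − L) + (a/b)[(η + cz)^b − η^b] + dz + ε, with log(MY − L) = log(m) + (a/b)(η^b − 1), and that the LP is β0 + β1X1 + ... + βkXk. All function and argument names are hypothetical, and the b → 0 branch uses the logarithmic limit of the power term.

```python
import math


def linear_predictor(x, beta0, betas):
    """eta = beta0 + beta1*x1 + ... + betak*xk (coefficient names are illustrative)."""
    if len(x) != len(betas):
        raise ValueError("x and betas must have the same length")
    return beta0 + sum(b * xi for b, xi in zip(betas, x))


def _scaled_power_diff(u, v, a, b, tol=1e-12):
    """(a/b) * (u**b - v**b), using the limit a*log(u/v) when b is near zero."""
    if abs(b) < tol:
        return a * math.log(u / v)
    return (a / b) * (u ** b - v ** b)


def rmm_log_median(eta, a, b, m):
    """Assumed median relation: log(M_Y - L) = log(m) + (a/b) * (eta**b - 1)."""
    if eta <= 0 or m <= 0:
        raise ValueError("eta and m must be positive")
    return math.log(m) + _scaled_power_diff(eta, 1.0, a, b)


def rmm_quantile(z, eta, a, b, c, d, m, L=0.0, eps=0.0):
    """Response percentile y for a standard normal percentile z (assumed model form)."""
    shifted = eta + c * z
    if eta <= 0 or shifted <= 0:
        raise ValueError("eta and eta + c*z must be positive")
    w = (rmm_log_median(eta, a, b, m)
         + _scaled_power_diff(shifted, eta, a, b)
         + d * z
         + eps)
    # Undo the log and the location shift
    return L + math.exp(w)


if __name__ == "__main__":
    eta = linear_predictor([1.0, 2.0], beta0=0.5, betas=[1.0, 0.25])
    for z in (-1.0, 0.0, 1.0):
        print(z, rmm_quantile(z, eta, a=1.0, b=1.0, c=0.1, d=0.2, m=1.0))
```

Setting z = 0 and ε = 0 returns the median, and a nonzero L simply shifts the response, which mirrors the role of the location parameter described above.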
|