Summary
We present three methods based on conditional mean imputation for the case where binary explanatory variables are incomplete. Alongside single imputation and multiple imputation, the so-called pi imputation is introduced as a new procedure. Seven procedures are compared in a simulation experiment in which missing data are confined to one binary independent variable: complete case analysis, zero order regression, categorical zero order regression, pi imputation, single imputation, multiple imputation, and modified first order regression. After a brief theoretical description of the simulation experiment, the MSE-ratio, variance, and bias are used to illustrate differences within and between the approaches.
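The kind of comparison described above can be illustrated with a small simulation. The sketch below is not the authors' design: the data-generating model, the missingness mechanism (completely at random), the sample size, the coefficient values, and all function names are assumptions chosen for illustration. It covers only complete case analysis, mean replacement (taken here as a stand-in for zero order regression), and a simple linear conditional mean imputation clipped to [0, 1]. The MSE-ratio is taken as the scalar MSE of the complete case estimator divided by that of the alternative, with the scalar MSE assumed to be the summed squared error over all coefficients.

```python
import numpy as np

TRUE_BETA = np.array([1.0, 2.0, -1.0])  # assumed: intercept, binary x, continuous z


def ols(X, y):
    """Least-squares coefficients for y = X @ beta + noise."""
    return np.linalg.lstsq(X, y, rcond=None)[0]


def simulate_once(rng, n=100, p_miss=0.3):
    """One replication; returns estimates from three missing-data strategies."""
    z = rng.normal(size=n)                      # fully observed covariate
    p = 1.0 / (1.0 + np.exp(-z))                # assumed P(x = 1 | z)
    x = rng.binomial(1, p).astype(float)        # binary covariate
    y = TRUE_BETA[0] + TRUE_BETA[1] * x + TRUE_BETA[2] * z + rng.normal(size=n)

    miss = rng.random(n) < p_miss               # assumed: missing completely at random
    obs = ~miss
    ones = np.ones(n)

    # Complete case analysis: drop every row where x is missing.
    b_cc = ols(np.column_stack([ones[obs], x[obs], z[obs]]), y[obs])

    # Mean replacement: fill missing x with the observed mean of x.
    x_mean = x.copy()
    x_mean[miss] = x[obs].mean()
    b_mean = ols(np.column_stack([ones, x_mean, z]), y)

    # Conditional mean imputation: regress x on z in the complete cases,
    # then fill missing x with the fitted value, clipped to [0, 1].
    g = ols(np.column_stack([ones[obs], z[obs]]), x[obs])
    x_cond = x.copy()
    x_cond[miss] = np.clip(g[0] + g[1] * z[miss], 0.0, 1.0)
    b_cond = ols(np.column_stack([ones, x_cond, z]), y)

    return b_cc, b_mean, b_cond


def mse_ratios(n_rep=500, seed=0, **kwargs):
    """MSE of complete case analysis divided by MSE of each alternative."""
    rng = np.random.default_rng(seed)
    est = np.array([simulate_once(rng, **kwargs) for _ in range(n_rep)])
    # est has shape (n_rep, 3 methods, 3 coefficients)
    mse = ((est - TRUE_BETA) ** 2).sum(axis=2).mean(axis=0)
    return {"mean_replacement": mse[0] / mse[1],
            "conditional_mean": mse[0] / mse[2]}


if __name__ == "__main__":
    print(mse_ratios())
```

Under this definition, a ratio above 1 means the alternative method had a smaller MSE than complete case analysis in the simulated setting. The ratio alone does not separate bias from variance, which is why those two quantities are reported alongside it.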
Notes
1. Ratio between the mean square error of the complete case analysis and the mean square error of the alternative method.
Cite this article
Toutenburg, H., Nittner, T. Linear Regression Models with Incomplete Categorical Covariates. Computational Statistics 17, 215–232 (2002). https://doi.org/10.1007/s001800200103
DOI: https://doi.org/10.1007/s001800200103