
Linear Regression Models with Incomplete Categorical Covariates

Computational Statistics

Summary

We present three different methods based on conditional mean imputation for the case where binary explanatory variables are incomplete. In addition to single imputation and multiple imputation, the so-called pi imputation is presented as a new procedure. Seven procedures are compared in a simulation experiment in which missing data are confined to one independent binary variable: complete case analysis, zero order regression, categorical zero order regression, pi imputation, single imputation, multiple imputation, and modified first order regression. After a brief theoretical description of the simulation experiment, the MSE-ratio, variance and bias are used to illustrate differences within and between the approaches.
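To make the setting concrete, the following minimal sketch contrasts two of the simpler strategies named above on simulated data: complete case analysis and an unconditional mean fill, which is the textbook reading of zero order regression. Everything in it is an assumption made for illustration and is not taken from the article: the sample size, the coefficients, the missing-completely-at-random mechanism, and the function names `simulate`, `complete_case` and `mean_imputation`. The article's own definitions of the seven procedures may differ in detail.

```python
import numpy as np

def simulate(n=200, p_missing=0.3, beta=(1.0, 2.0, -1.5), seed=0):
    """Draw y = b0 + b1*x1 + b2*x2 + noise, where x2 is binary and
    a random share of its values is deleted (assumed MCAR mechanism)."""
    rng = np.random.default_rng(seed)
    x1 = rng.normal(size=n)
    x2 = rng.binomial(1, 0.4, size=n).astype(float)
    y = beta[0] + beta[1] * x1 + beta[2] * x2 + rng.normal(size=n)
    missing = rng.random(n) < p_missing
    x2_obs = x2.copy()
    x2_obs[missing] = np.nan  # NaN marks a missing covariate value
    return y, x1, x2_obs

def ols(y, x1, x2):
    """Ordinary least squares fit of y on an intercept, x1 and x2."""
    X = np.column_stack([np.ones_like(x1), x1, x2])
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef

def complete_case(y, x1, x2_obs):
    """Drop every row in which the binary covariate is missing."""
    keep = ~np.isnan(x2_obs)
    return ols(y[keep], x1[keep], x2_obs[keep])

def mean_imputation(y, x1, x2_obs):
    """Replace missing x2 values by the mean of the observed ones.
    For a binary covariate this fill value is the observed share of
    ones, so it is generally neither 0 nor 1."""
    filled = np.where(np.isnan(x2_obs), np.nanmean(x2_obs), x2_obs)
    return ols(y, x1, filled)

y, x1, x2_obs = simulate()
beta_cc = complete_case(y, x1, x2_obs)
beta_zor = mean_imputation(y, x1, x2_obs)
print("complete case:", beta_cc)
print("mean imputed :", beta_zor)
```

Repeating such a run over many seeds and collecting the estimates is one way to obtain the empirical bias, variance and MSE that a comparison of this kind relies on.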



Notes

  1. MSE-ratio: the ratio between the mean square error of the complete case analysis and the mean square error of the alternative method.
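The ratio described in this note could be computed from simulation output roughly as follows. This is a sketch under stated assumptions: the scalar MSE is taken here as the average squared Euclidean distance between the estimates and the true parameter vector, which may not match the exact criterion used in the article, and the helper names `mse` and `mse_ratio` as well as the numbers are hypothetical.

```python
import numpy as np

def mse(estimates, beta_true):
    """Empirical scalar MSE over replications: the average squared
    Euclidean distance between each estimate and beta_true."""
    estimates = np.asarray(estimates, dtype=float)
    diff = estimates - np.asarray(beta_true, dtype=float)
    return float(np.mean(np.sum(diff ** 2, axis=1)))

def mse_ratio(estimates_cc, estimates_alt, beta_true):
    """MSE of complete case analysis divided by MSE of the alternative."""
    return mse(estimates_cc, beta_true) / mse(estimates_alt, beta_true)

# Two hypothetical replications per method, one row per replication.
beta_true = [1.0, 2.0]
est_cc = [[1.2, 2.0], [0.8, 2.0]]
est_alt = [[1.1, 2.0], [0.9, 2.0]]
ratio = mse_ratio(est_cc, est_alt, beta_true)
print(ratio)
```

With this orientation, a ratio above 1 indicates that the alternative method has a smaller MSE than complete case analysis, and a ratio below 1 indicates the opposite.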



Cite this article

Toutenburg, H., Nittner, T. Linear Regression Models with Incomplete Categorical Covariates. Computational Statistics 17, 215–232 (2002). https://doi.org/10.1007/s001800200103


  • DOI: https://doi.org/10.1007/s001800200103
