
On EM Estimation for Mixture of Multivariate t-Distributions

Neural Processing Letters

Abstract

This paper formulates a novel expectation-maximization (EM) algorithm for mixtures of multivariate t-distributions. By introducing a new kind of “missing” data, we show that the empirically improved iterative algorithm from the literature for such mixtures is in fact a type of EM algorithm. This establishes a theoretical analysis which guarantees that the empirical algorithm converges to the maximum likelihood estimates of the mixture parameters. A simulated experiment and real-data experiments on classification and image segmentation confirm the effectiveness of the improved EM algorithm.
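The abstract does not state the update equations, so the following is only a minimal sketch of the conventional EM iteration for a t-mixture with the degrees of freedom held fixed. It is not the improved algorithm analysed in the paper. The function name `t_mixture_em`, the fixed value `nu=4.0`, the initialisation scheme and the small ridge added to each scale matrix are all assumptions made for illustration.

```python
from math import lgamma

import numpy as np


def t_mixture_em(X, k, nu=4.0, n_iter=100, seed=0):
    """Conventional EM for a k-component multivariate t mixture.

    Assumptions (not from the paper): nu is fixed and shared by all
    components, and a small ridge keeps each scale matrix invertible.
    """
    X = np.asarray(X, dtype=float)
    n, p = X.shape
    rng = np.random.default_rng(seed)

    # Initialise with random data points as locations, the pooled
    # covariance as every scale matrix, and equal mixing weights.
    mu = X[rng.choice(n, size=k, replace=False)].copy()
    pooled = np.atleast_2d(np.cov(X, rowvar=False)) + 1e-6 * np.eye(p)
    sigma = np.stack([pooled] * k)
    weights = np.full(k, 1.0 / k)

    log_norm = lgamma((nu + p) / 2) - lgamma(nu / 2) - 0.5 * p * np.log(nu * np.pi)

    for _ in range(n_iter):
        # E-step: squared Mahalanobis distances and log densities.
        maha = np.empty((n, k))
        log_dens = np.empty((n, k))
        for j in range(k):
            chol = np.linalg.cholesky(sigma[j])
            z = np.linalg.solve(chol, (X - mu[j]).T)
            maha[:, j] = np.sum(z ** 2, axis=0)
            log_det = 2.0 * np.sum(np.log(np.diag(chol)))
            log_dens[:, j] = (log_norm - 0.5 * log_det
                              - 0.5 * (nu + p) * np.log1p(maha[:, j] / nu))

        # Posterior membership probabilities tau, computed stably.
        log_w = np.log(weights) + log_dens
        log_w -= log_w.max(axis=1, keepdims=True)
        tau = np.exp(log_w)
        tau /= tau.sum(axis=1, keepdims=True)

        # Expected latent scale weights; outliers get small u.
        u = (nu + p) / (nu + maha)

        # M-step: weighted location and scale updates.
        for j in range(k):
            w = tau[:, j] * u[:, j]
            mu[j] = w @ X / w.sum()
            diff = X - mu[j]
            sigma[j] = (w[:, None] * diff).T @ diff / tau[:, j].sum()
            sigma[j] += 1e-6 * np.eye(p)  # ridge: illustrative safeguard
        weights = tau.mean(axis=0)

    return weights, mu, sigma, tau


# Usage on synthetic data: two Gaussian clusters plus uniform outliers.
rng = np.random.default_rng(1)
X = np.vstack([
    rng.normal(0.0, 1.0, size=(150, 2)),
    rng.normal(6.0, 1.0, size=(150, 2)),
    rng.uniform(-20.0, 20.0, size=(10, 2)),
])
weights, mu, sigma, tau = t_mixture_em(X, k=2)
```

The weights `u` are what make the t-mixture robust: points far from a component's location contribute less to its location and scale updates than they would under a Gaussian mixture.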



Author information

Corresponding author

Correspondence to Haixian Wang.

About this article

Cite this article

Wang, H., Hu, Z. On EM Estimation for Mixture of Multivariate t-Distributions. Neural Process Lett 30, 243–256 (2009). https://doi.org/10.1007/s11063-009-9121-5

  • DOI: https://doi.org/10.1007/s11063-009-9121-5
