DOI: 10.1007/978-3-642-34487-9_4
Article

Nonparametric localized feature selection via a Dirichlet process mixture of generalized Dirichlet distributions

Published: 12 November 2012

Abstract

In this paper, we propose a novel Bayesian nonparametric approach to simultaneous clustering and localized feature selection for unsupervised learning. The proposed model is a Dirichlet process mixture of generalized Dirichlet (GD) distributions, which can also be viewed as an infinite GD mixture model. Because the approach is Bayesian and nonparametric, it guards against both overfitting and underfitting. Moreover, the number of clusters need not be fixed in advance, since the model assumes an infinite number of clusters. The model parameters and the local feature saliencies are estimated simultaneously by variational inference. We report experimental results from applying our model to two challenging clustering problems: web pages and tissue samples containing gene expression data.
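
As an illustration of how an "infinite" mixture can be handled in practice, the sketch below draws mixing weights from a truncated stick-breaking construction, a standard representation of the Dirichlet process. The abstract does not say which representation or truncation level the paper uses, so the function name `stick_breaking_weights`, the concentration value 1.0, and the truncation level 20 are assumptions made purely for illustration. The sketch covers only the mixing weights, not the GD components, the feature saliencies, or the variational updates.

```python
import random


def stick_breaking_weights(concentration, truncation, rng):
    """Draw mixing weights from a truncated stick-breaking construction.

    Each step breaks off a Beta(1, concentration) fraction of the stick
    that remains; the final component absorbs whatever mass is left, so
    the weights sum to 1 at any finite truncation level.
    """
    if concentration <= 0:
        raise ValueError("concentration must be positive")
    if truncation < 1:
        raise ValueError("truncation must be at least 1")

    weights = []
    remaining = 1.0  # length of the stick not yet assigned to a component
    for _ in range(truncation - 1):
        fraction = rng.betavariate(1.0, concentration)
        weights.append(remaining * fraction)
        remaining *= 1.0 - fraction
    weights.append(remaining)  # last component takes the leftover mass
    return weights


# Hypothetical settings, chosen only for illustration.
rng = random.Random(0)
weights = stick_breaking_weights(concentration=1.0, truncation=20, rng=rng)
active = sum(1 for w in weights if w > 0.01)
print(f"total mass: {sum(weights):.6f}; components above 1% mass: {active}")
```

With a small concentration value, most of the mass typically falls on the first few components, which is why the number of clusters does not have to be chosen in advance: unused components simply receive negligible weight.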

Published In

ICONIP'12: Proceedings of the 19th International Conference on Neural Information Processing, Volume Part III
November 2012
709 pages
ISBN: 9783642344862
Editors: Tingwen Huang, Zhigang Zeng, Chuandong Li, Chi Sing Leung

Sponsors

  • ExxonMobil
  • QAPCO
  • United Development Co.
  • Qatar Petroleum

Publisher

Springer-Verlag, Berlin, Heidelberg

Author Tags

  1. clustering
  2. Dirichlet process
  3. generalized Dirichlet
  4. localized feature selection
  5. mixture models
  6. nonparametric Bayesian
  7. variational inference

Qualifiers

  • Article
