High-Dimensional Unsupervised Selection and Estimation of a Finite Generalized Dirichlet Mixture Model Based on Minimum Message Length

Published: 01 October 2007

Abstract

We consider the problem of determining the structure of high-dimensional data without prior knowledge of the number of clusters. Data are represented by a finite mixture model based on the generalized Dirichlet distribution. The generalized Dirichlet distribution has a more general covariance structure than the Dirichlet distribution and offers high flexibility and ease of use for the approximation of both symmetric and asymmetric distributions, which makes it more practical and useful. An important problem in mixture modeling is the determination of the number of clusters: a mixture with too many or too few components may fail to approximate the true model. Here, we apply the minimum message length (MML) principle to determine the number of clusters. The MML criterion is derived so as to choose the number of clusters in the mixture model that best describes the data. A comparison with other selection criteria is performed. The validation involves synthetic data, real data clustering, and two real applications: classification of web pages, and texture database summarization for efficient retrieval.
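
For concreteness, the two ingredients the abstract names can be sketched in standard notation. This is a reference sketch only, not a reproduction of the paper's derivation: the paper's specific choices of prior and Fisher information are omitted here. The generalized Dirichlet density of Connor and Mosimann, for proportions $x_1,\dots,x_D > 0$ with $\sum_{i=1}^{D} x_i < 1$ and shape parameters $\alpha_i, \beta_i > 0$, is

\[
p(\vec{x}\mid\vec{\alpha},\vec{\beta}) \;=\; \prod_{i=1}^{D}
\frac{\Gamma(\alpha_i+\beta_i)}{\Gamma(\alpha_i)\,\Gamma(\beta_i)}\,
x_i^{\alpha_i-1}\Bigl(1-\sum_{j=1}^{i}x_j\Bigr)^{\gamma_i},
\qquad
\gamma_i =
\begin{cases}
\beta_i-\alpha_{i+1}-\beta_{i+1}, & i<D,\\
\beta_D-1, & i=D,
\end{cases}
\]

so each component carries $2D$ shape parameters rather than the Dirichlet's $D+1$, which is the source of its more general covariance structure. A standard Wallace-Freeman (MML87) two-part message length, minimized over candidate numbers of clusters, has the form

\[
\mathrm{MessLen} \;\approx\; -\log h(\Theta) \;-\; \log p(\mathcal{X}\mid\Theta)
\;+\; \tfrac{1}{2}\log\lvert F(\Theta)\rvert
\;+\; \tfrac{N_p}{2}\bigl(1+\log\kappa_{N_p}\bigr),
\]

where $h(\Theta)$ is the prior over the $N_p$ free mixture parameters $\Theta$, $F(\Theta)$ is the expected Fisher information matrix, and $\kappa_{N_p}$ is the optimal quantization lattice constant.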

Published In

IEEE Transactions on Pattern Analysis and Machine Intelligence  Volume 29, Issue 10
October 2007
190 pages

Publisher

IEEE Computer Society

United States

Publication History

Published: 01 October 2007

Author Tags

  1. AIC
  2. EM
  3. Finite mixture models
  4. LEC
  5. MDL
  6. MML
  7. data clustering
  8. generalized Dirichlet mixture
  9. image database summarization
  10. information theory
  11. webmining

Qualifiers

  • Research-article

Cited By

  • (2024) "Libby-Novick Beta-Liouville Distribution for Enhanced Anomaly Detection in Proportional Data," ACM Transactions on Intelligent Systems and Technology, vol. 15, no. 5, pp. 1-26, https://doi.org/10.1145/3675405
  • (2024) "Data Clustering with Libby-Novick Beta-Liouville Mixture Models: A Minimum Message Length Approach," Proceedings of the 2024 9th International Conference on Intelligent Information Technology, pp. 314-321, https://doi.org/10.1145/3654522.3654551
  • (2024) "Robust Clustering with McDonald's Beta-Liouville Mixture Models for Proportional Data," Artificial Neural Networks in Pattern Recognition, pp. 49-60, https://doi.org/10.1007/978-3-031-71602-7_5
  • (2023) "Finite Multivariate McDonald's Beta Mixture Model Learning Approach in Medical Applications," Proceedings of the 38th ACM/SIGAPP Symposium on Applied Computing, pp. 1143-1150, https://doi.org/10.1145/3555776.3577650
  • (2023) "Unsupervised nested Dirichlet finite mixture model for clustering," Applied Intelligence, vol. 53, no. 21, pp. 25232-25258, https://doi.org/10.1007/s10489-023-04888-8
  • (2023) "Bounded multivariate generalized Gaussian mixture model using ICA and IVA," Pattern Analysis & Applications, vol. 26, no. 3, pp. 1223-1252, https://doi.org/10.1007/s10044-023-01148-w
  • (2022) "Multivariate bounded support asymmetric generalized Gaussian mixture model with model selection using minimum message length," Expert Systems with Applications, vol. 204, https://doi.org/10.1016/j.eswa.2022.117516
  • (2021) "A Hybrid of Interactive Learning and Predictive Modeling for Occupancy Estimation in Smart Buildings," IEEE Transactions on Consumer Electronics, vol. 67, no. 4, pp. 285-293, https://doi.org/10.1109/TCE.2021.3131943
  • (2021) "Unsupervised Learning Using Variational Inference on Finite Inverted Dirichlet Mixture Models with Component Splitting," Wireless Personal Communications, vol. 119, no. 2, pp. 1817-1844, https://doi.org/10.1007/s11277-021-08308-3
  • (2021) "Entropy-Based Variational Learning of Finite Generalized Inverted Dirichlet Mixture Model," Intelligent Information and Database Systems, pp. 130-143, https://doi.org/10.1007/978-3-030-73280-6_11
