Abstract
This chapter is concerned with unsupervised classification, that is, the analysis of data sets for which no (or very little) training data is available. The main goals of this data-driven type of analysis are to discover a data set’s underlying structure and to identify groups (or clusters) of homogeneous data items, a process commonly referred to as cluster analysis.
Copyright information
© 2006 Springer
About this chapter
Cite this chapter
Handl, J., Knowles, J. (2006). Multi-Objective Clustering and Cluster Validation. In: Jin, Y. (eds) Multi-Objective Machine Learning. Studies in Computational Intelligence, vol 16. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-33019-4_2
DOI: https://doi.org/10.1007/3-540-33019-4_2
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-30676-4
Online ISBN: 978-3-540-33019-6
eBook Packages: Engineering, Engineering (R0)