The K-Means Clustering Architecture in the Multi-stage Data Mining Process

Gerardo, Bobby D.; Lee, Jae-Wan; Choi, Yeon-Sung; Lee, Malrey

doi:10.1007/11424826_8

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 3481)

Included in the following conference series:

International Conference on Computational Science and Its Applications

Abstract

In this paper, we used software engineering principles to develop models and proposed a K-Means clustering architecture implemented within the multi-stage data mining process. We developed a modified architecture and expanded it by showing refinements at every step of the clustering and knowledge discovery stages. We used the aforementioned hierarchical clustering model to partition the data into smaller groups of attributes so that the data structure could be determined before applying the data mining tools. The experiment shows that the model using clustering yielded isolated but imperative association rules based on the clustered data, which in turn could be practically explained for decision-making purposes. Shorter processing times were observed when computing on smaller clusters, implying faster processing than when dealing with the entire dataset.
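
The two-stage idea described in the abstract (cluster first, then mine association rules inside each cluster) can be illustrated with a minimal sketch. This is not the authors' implementation: the toy basket data, the names `ITEMS`, `TRANSACTIONS`, `kmeans`, `pair_rules` and `mine_by_cluster`, the choice to cluster records rather than attributes, the deterministic farthest-first seeding, and the support and confidence thresholds are all assumptions made for illustration.

```python
from collections import Counter
from itertools import combinations

# Hypothetical toy data: eight baskets over five items (not from the paper).
ITEMS = ["bread", "butter", "milk", "beer", "chips"]
TRANSACTIONS = [
    {"bread", "butter"}, {"bread", "butter", "milk"},
    {"bread", "butter", "milk"}, {"bread", "milk"},
    {"beer", "chips"}, {"beer", "chips"},
    {"beer", "chips", "milk"}, {"beer"},
]

def sq_dist(p, q):
    """Squared Euclidean distance between two equal-length tuples."""
    return sum((x - y) ** 2 for x, y in zip(p, q))

def kmeans(points, k, max_iters=50):
    """Lloyd's k-means; returns one cluster label per point."""
    # Assumption: farthest-first seeding, chosen only to keep the run deterministic.
    centroids = [points[0]]
    while len(centroids) < k:
        centroids.append(
            max(points, key=lambda p: min(sq_dist(p, c) for c in centroids))
        )
    labels = None
    for _ in range(max_iters):
        # Assignment step: each point goes to its nearest centroid.
        new_labels = [
            min(range(k), key=lambda c: sq_dist(p, centroids[c])) for p in points
        ]
        if new_labels == labels:
            break  # assignments are stable, so the algorithm has converged
        labels = new_labels
        # Update step: each centroid becomes the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centroids[c] = tuple(sum(col) / len(members) for col in zip(*members))
    return labels

def pair_rules(transactions, min_support=0.5, min_confidence=0.8):
    """Single-item rules lhs -> rhs as (lhs, rhs, support, confidence) tuples."""
    n = len(transactions)
    item_counts = Counter(item for t in transactions for item in t)
    pair_counts = Counter(
        pair for t in transactions for pair in combinations(sorted(t), 2)
    )
    rules = []
    for (a, b), count in pair_counts.items():
        support = count / n
        if support < min_support:
            continue
        for lhs, rhs in ((a, b), (b, a)):
            confidence = count / item_counts[lhs]
            if confidence >= min_confidence:
                rules.append((lhs, rhs, support, confidence))
    return sorted(rules)

def mine_by_cluster(transactions, items, k):
    """Stage 1: cluster binary-encoded records. Stage 2: mine rules per cluster."""
    vectors = [tuple(int(i in t) for i in items) for t in transactions]
    labels = kmeans(vectors, k)
    clusters = {}
    for t, lab in zip(transactions, labels):
        clusters.setdefault(lab, []).append(t)
    return labels, {lab: pair_rules(group) for lab, group in clusters.items()}
```

On this toy data, calling `pair_rules(TRANSACTIONS)` on all eight baskets returns no rules at the assumed thresholds, whereas `mine_by_cluster(TRANSACTIONS, ITEMS, 2)` returns rules for each of the two clusters. That mirrors, on a very small scale, the abstract's point that rules local to a cluster can be hidden when the whole dataset is mined at once, and that each mining pass works on fewer records.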

Author information

Authors and Affiliations

School of Electronic and Information Engineering, Kunsan National University, 68 Miryong-dong, Kunsan, Chonbuk, 573-701, South Korea
Bobby D. Gerardo, Jae-Wan Lee & Yeon-Sung Choi
School of Electronic and Information Engineering, Chonbuk National University, 664-14 Deokjin-dong, Jeonju, Chonbuk, 561-756, South Korea
Malrey Lee

Editor information

Editors and Affiliations

Department of Mathematics and Computer Science, University of Perugia, via Vanvitelli, 1, I-06123, Perugia, Italy
Osvaldo Gervasi
Department of Computer Science, University of Calgary, 2500 University Drive N.W., T2N 1N4, Calgary, AB, Canada
Marina L. Gavrilova
William Norris Professor, Head of the Computer Science and Engineering Department, University of Minnesota, USA
Vipin Kumar
Department of Chemistry, University of Perugia, Via Elce di Sotto, 8, P.O. Box, I-06123, Perugia, Italy
Antonio Laganà
Institute of High Performance Computing, IHCP, 1 Science Park Road, 01-01 The Capricorn, Singapore Science Park II, 117528, Singapore
Heow Pueh Lee
School of Computing, Soongsil University, Seoul, Korea
Youngsong Mun
Clayton School of IT, Monash University, 3800, Clayton, Australia
David Taniar
OptimaNumerics Ltd, P.O. Box, Belfast, United Kingdom
Chih Jeng Kenneth Tan

About this paper

Cite this paper

Gerardo, B.D., Lee, J.-W., Choi, Y.-S., Lee, M. (2005). The K-Means Clustering Architecture in the Multi-stage Data Mining Process. In: Gervasi, O., et al. (eds.) Computational Science and Its Applications – ICCSA 2005. Lecture Notes in Computer Science, vol 3481. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11424826_8

DOI: https://doi.org/10.1007/11424826_8
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-25861-2
Online ISBN: 978-3-540-32044-9
eBook Packages: Computer Science, Computer Science (R0)
