Using Weighted Hybrid Discretization Method to Analyze Climate Changes

Jung, Yong-Gyu; Kim, Kyoung Min; Kwon, Young Man

doi:10.1007/978-3-642-35600-1_28

Part of the book series: Communications in Computer and Information Science ((CCIS,volume 351))

Abstract

Data mining is the process of posing queries to large quantities of data and extracting information, often previously unknown, using mathematical, statistical, and machine learning techniques. However, some data mining techniques, such as classification and clustering, cannot handle numeric attributes directly, even though most real datasets contain them. Continuous attributes must therefore be divided into a small number of distinct ranges, which are treated as nominal values, before such techniques can be applied. Correct discretization makes the dataset succinct and contributes to the high performance of classification algorithms. Several discretization methods have been proposed and applied, but their effectiveness often depends on the application area. In this paper, we propose a weighted hybrid discretization technique based on entropy and the contingency coefficient. We also compare its performance with that of well-known discretization techniques such as equal-width binning, 1R, MDLP, and ChiMerge.
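The abstract names its ingredients (equal-width binning as a baseline, and entropy and the contingency coefficient as the basis of the proposed technique) without giving formulas. The following is a minimal sketch of those three building blocks, assuming the standard textbook definitions: Shannon entropy in bits, and Pearson's contingency coefficient C = sqrt(χ² / (χ² + n)). All function names and the toy data are illustrative assumptions, not taken from the paper. The weighting scheme that combines the two measures is not described in the abstract and is not reproduced here.

```python
import math
from collections import Counter


def equal_width_bins(values, k):
    """Assign each value to one of k equal-width intervals (indices 0..k-1)."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0] * len(values)
    width = (hi - lo) / k
    # min() keeps the maximum value inside the last bin.
    return [min(int((v - lo) / width), k - 1) for v in values]


def class_entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())


def partition_entropy(bins, labels):
    """Class entropy averaged over bins, each bin weighted by its size."""
    n = len(labels)
    groups = {}
    for b, y in zip(bins, labels):
        groups.setdefault(b, []).append(y)
    return sum(len(g) / n * class_entropy(g) for g in groups.values())


def contingency_coefficient(bins, labels):
    """Pearson's C = sqrt(chi2 / (chi2 + n)) for the bin-by-class table."""
    n = len(labels)
    cell = Counter(zip(bins, labels))
    row = Counter(bins)
    col = Counter(labels)
    chi2 = 0.0
    for b in row:
        for y in col:
            expected = row[b] * col[y] / n
            chi2 += (cell[(b, y)] - expected) ** 2 / expected
    return math.sqrt(chi2 / (chi2 + n))


# Hypothetical toy data: eight temperature readings with a class label each.
temps = [1.0, 2.0, 3.0, 4.0, 11.0, 12.0, 13.0, 14.0]
labels = ["cold"] * 4 + ["warm"] * 4

bins = equal_width_bins(temps, k=2)
h = partition_entropy(bins, labels)
c = contingency_coefficient(bins, labels)
```

A lower partition entropy means the intervals are purer with respect to the class, while a higher contingency coefficient means a stronger association between interval and class. A hybrid criterion would trade these two signals off against each other when choosing cut points.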



Author information

Authors and Affiliations

Department of Medical IT and Marketing, Eulji University, 553 Sansungdaero Sujeong, Seongnam, Gyonggi, 462-731, Korea
Yong-Gyu Jung, Kyoung Min Kim & Young Man Kwon

Editor information

Editors and Affiliations

GVSA and University of Tasmania, Hobart, TAS, Australia
Tai-hoon Kim
Chungwoon University, 350-701, Chungnam, Republic of Korea
Hyun-seob Cho
Department of Mathematics and Computer Science, University of Perugia, Via Vanvitelli, 1, 06123, Perugia, Italy
Osvaldo Gervasi
Information Assurance Center, Arizona State University, P.O. Box 878809, 85287-8809, Tempe, AZ, USA
Stephen S. Yau

About this paper

Cite this paper

Jung, YG., Kim, K.M., Kwon, Y.M. (2012). Using Weighted Hybrid Discretization Method to Analyze Climate Changes. In: Kim, Th., Cho, Hs., Gervasi, O., Yau, S.S. (eds) Computer Applications for Graphics, Grid Computing, and Industrial Environment. CGAG GDC IESH 2012 2012 2012. Communications in Computer and Information Science, vol 351. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-35600-1_28

DOI: https://doi.org/10.1007/978-3-642-35600-1_28
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-35599-8
Online ISBN: 978-3-642-35600-1
eBook Packages: Computer Science, Computer Science (R0)
