Better Decision Tree Induction for Limited Data Sets of Liver Disease

Sug, Hyontai

doi:10.1007/978-3-642-35521-9_12

Part of the book series: Communications in Computer and Information Science (CCIS, volume 353)

Abstract

Decision trees can be very useful data mining tools for human experts diagnosing disease, because the knowledge structure is represented as a tree. However, we may not obtain a satisfactory decision tree if the data sets do not contain enough consistent instances. Two relatively small data sets on liver disorders, one from America and one from India, have recently become available. To generate more accurate and useful decision trees for the disease, this paper suggests appropriate sampling of the data instances that belong to the class with the higher error rate. Experiments with the two public-domain data sets and a representative decision tree algorithm, C4.5, show very successful results.
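
The abstract does not spell out the sampling procedure, so the following is only a minimal sketch of the general idea, not the paper's method. It assumes that sampling for the class with the higher error rate means measuring per-class error from a first-pass classifier and then drawing extra copies, with replacement, of instances from the worst class. The function names and the `extra_fraction` parameter are hypothetical and introduced purely for illustration; any classifier (C4.5 or otherwise) could supply the predictions.

```python
import random
from collections import Counter


def per_class_error_rate(y_true, y_pred):
    """Return {label: fraction of that label's instances that were misclassified}."""
    totals = Counter(y_true)
    errors = Counter(t for t, p in zip(y_true, y_pred) if t != p)
    return {label: errors[label] / totals[label] for label in totals}


def oversample_worst_class(rows, y_true, y_pred, extra_fraction=0.5, seed=0):
    """Add randomly drawn copies of instances from the class with the highest error rate.

    extra_fraction is a hypothetical knob (an assumption of this sketch):
    the number of added copies as a fraction of the worst class's size.
    """
    rates = per_class_error_rate(y_true, y_pred)
    worst = max(rates, key=rates.get)

    # Candidate instances: everything whose true label is the worst class.
    pool = [(row, label) for row, label in zip(rows, y_true) if label == worst]

    rng = random.Random(seed)
    n_extra = int(len(pool) * extra_fraction)
    extra = [rng.choice(pool) for _ in range(n_extra)]  # sampling with replacement

    new_rows = list(rows) + [row for row, _ in extra]
    new_labels = list(y_true) + [label for _, label in extra]
    return worst, new_rows, new_labels


# Toy data: eight instances, two classes, predictions from some first-pass tree.
rows = [[0.1], [0.2], [0.3], [0.4], [0.5], [0.6], [0.7], [0.8]]
y_true = [1, 1, 1, 1, 2, 2, 2, 2]
y_pred = [1, 1, 1, 2, 1, 1, 2, 2]

worst, new_rows, new_labels = oversample_worst_class(rows, y_true, y_pred)
```

In this toy run, class 2 has the higher error rate, so two extra class-2 instances are appended before a tree would be retrained on `new_rows` and `new_labels`. Whether duplicating instances, as here, or some other sampling scheme matches the paper cannot be determined from the abstract alone.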

Author information

Authors and Affiliations

Division of Computer & Information Engineering, Dongseo University, 47 Jurye-ro, Sa-sang-gu, Busan, 617-716, Korea
Hyontai Sug

Editor information

Editors and Affiliations

GVSA and University of Tasmania, Hobart, TAS, Australia
Tai-hoon Kim
Dong Seoul University, Bokjeong-dong Sujeong-gu, 461-714, Seongnam-si, Gyeonggi-do, Korea
Jeong-Jin Kang
Department of Computer and Information Science, University of Michigan, 4901 Evergreen Road, 48128, Dearborn, MI, USA
William I. Grosky
Engineering and Electronics, Edinburgh University, King’s Buildings, Faraday, rm 3.101, Mayfield Road, EH9 3JL, Edinburgh, UK
Tughrul Arslan
College of Engineering & Computing, Florida International University, 33174, Miami, FL, USA
Niki Pissinou

Cite this paper

Sug, H. (2012). Better Decision Tree Induction for Limited Data Sets of Liver Disease. In: Kim, Th., Kang, JJ., Grosky, W.I., Arslan, T., Pissinou, N. (eds) Computer Applications for Bio-technology, Multimedia, and Ubiquitous City. BSBT 2012, MulGraB 2012, IUrC 2012. Communications in Computer and Information Science, vol 353. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-35521-9_12

DOI: https://doi.org/10.1007/978-3-642-35521-9_12
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-35520-2
Online ISBN: 978-3-642-35521-9
eBook Packages: Computer Science, Computer Science (R0)
