Abstract
Knowledge extraction is an important element of e-Health systems. In this paper, we introduce a new method for decision rule extraction, called Graph-based Rules Inducer, to support medical interviews in diabetes treatment. The emphasis is on the ability to track changes in hidden context, where the context is understood as the set of all factors affecting the patient's condition. To follow context changes, the proposed algorithm implements a forgetting mechanism controlled by a forgetting factor. Moreover, a graph representation is used to aggregate data, and a limitation of the search space is proposed to protect against overfitting. We demonstrate the advantages of our approach over other methods through an empirical study on the Electricity benchmark data set in a classification task. Subsequently, our method is applied in diabetes treatment as a tool supporting medical interviews.
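The idea of a forgetting factor can be illustrated with a minimal sketch. It is not the Graph-based Rules Inducer itself: it assumes a simple exponential decay applied to weights of (feature value, class) pairs, which could be read as edge weights in a graph, and the class name `DecayedCounts`, its methods, and the parameter values are hypothetical choices made only for this illustration.

```python
from collections import defaultdict


class DecayedCounts:
    """Weights of (feature value, class) pairs with exponential forgetting.

    Illustrative assumption only; the paper's actual update rule may differ.
    """

    def __init__(self, forgetting_factor=0.9):
        # forgetting_factor in (0, 1]; 1.0 means nothing is ever forgotten.
        self.forgetting_factor = forgetting_factor
        self.weights = defaultdict(float)

    def update(self, feature_value, label):
        # Down-weight everything seen so far, then add the new observation.
        for key in self.weights:
            self.weights[key] *= self.forgetting_factor
        self.weights[(feature_value, label)] += 1.0

    def predict(self, feature_value):
        # Pick the class with the largest decayed weight for this value.
        candidates = {
            lbl: w for (val, lbl), w in self.weights.items() if val == feature_value
        }
        return max(candidates, key=candidates.get) if candidates else None


def run_stream(forgetting_factor):
    counts = DecayedCounts(forgetting_factor)
    for _ in range(5):
        counts.update("high", "A")  # old context: "high" maps to class A
    for _ in range(3):
        counts.update("high", "B")  # context change: "high" now maps to class B
    return counts.predict("high")


with_forgetting = run_stream(0.5)
without_forgetting = run_stream(1.0)
```

With a factor below 1, the three recent observations outweigh the five older ones, so the prediction follows the new context; with a factor of 1.0 the older majority still dominates.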
Open Access
This article is distributed under the terms of the Creative Commons Attribution License which permits any use, distribution, and reproduction in any medium, provided the original author(s) and the source are credited.
Rights and permissions
Open Access This article is distributed under the terms of the Creative Commons Attribution 2.0 International License (https://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
About this article
Cite this article
Tomczak, J.M., Gonczarek, A. Decision rules extraction from data stream in the presence of changing context for diabetes treatment. Knowl Inf Syst 34, 521–546 (2013). https://doi.org/10.1007/s10115-012-0488-7
DOI: https://doi.org/10.1007/s10115-012-0488-7