Abstract
A brief history of decision tree induction algorithms is followed by a review of techniques for handling missing attribute values in these methods. Dynamic path generation is described in the context of tree-based classification. The waste of data that can result from casewise deletion of missing values in statistical algorithms is discussed, and alternatives are proposed.
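To give a rough sense of the data waste the abstract refers to, the following Python sketch estimates how many cases survive casewise deletion. It is an illustration only and is not taken from the paper: the function names, the assumption that values go missing independently with the same probability for every attribute, and the example figures (20 attributes, 5% missing) are all hypothetical.

```python
import random


def expected_retention(n_attributes, p_missing):
    """Expected fraction of complete cases, assuming each attribute value
    is missing independently with probability p_missing (an assumption)."""
    return (1.0 - p_missing) ** n_attributes


def simulated_retention(n_cases, n_attributes, p_missing, seed=0):
    """Simulate casewise deletion and return the fraction of cases kept."""
    rng = random.Random(seed)
    kept = 0
    for _ in range(n_cases):
        # A case survives casewise deletion only if no attribute is missing.
        if all(rng.random() >= p_missing for _ in range(n_attributes)):
            kept += 1
    return kept / n_cases


if __name__ == "__main__":
    # Illustrative figures only: 20 attributes, 5% of values missing.
    print(round(expected_retention(20, 0.05), 3))
    print(simulated_retention(10_000, 20, 0.05))
```

Under these assumptions only about 36% of cases remain complete, even though just 5% of individual values are missing, which is the kind of loss that motivates alternatives to casewise deletion.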
© 1997 Springer-Verlag
About this paper
Cite this paper
Liu, W.Z., White, A.P., Thompson, S.G., Bramer, M.A. (1997). Techniques for dealing with missing values in classification. In: Liu, X., Cohen, P., Berthold, M. (eds) Advances in Intelligent Data Analysis Reasoning about Data. IDA 1997. Lecture Notes in Computer Science, vol 1280. Springer, Berlin, Heidelberg. https://doi.org/10.1007/BFb0052868
DOI: https://doi.org/10.1007/BFb0052868
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-63346-4
Online ISBN: 978-3-540-69520-2
eBook Packages: Springer Book Archive