Abstract
In this paper, we analyse the nature of knowledge discovery in databases (KDD). We conclude that it is similar to knowledge acquisition, yet unique in that it employs pre-existing data collected for reasons other than analysis. The post-hoc nature of KDD means that the database is often unfit for analysis using traditional machine-learning techniques. We present a methodology for KDD that attempts to overcome this problem. Knowledge elicitation techniques are employed to define the structure of an appropriate learning dataset and to relate this structure to the raw database. The raw database is then redescribed in terms of the new structure before machine learning tools are applied. We also present CASTLE, a software workbench designed to support this methodology, and illustrate its use with a worked example drawn from the Sisyphus-I room allocation problem.
This work was supported by the award of a PhD studentship from the Department of Psychology, Nottingham University.
© 1996 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Cupit, J., Shadbolt, N. (1996). Knowledge discovery in databases: Exploiting knowledge-level redescription. In: Shadbolt, N., O'Hara, K., Schreiber, G. (eds) Advances in Knowledge Acquisition. EKAW 1996. Lecture Notes in Computer Science, vol 1076. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-61273-4_16
DOI: https://doi.org/10.1007/3-540-61273-4_16
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-61273-5
Online ISBN: 978-3-540-68391-9
eBook Packages: Springer Book Archive