Abstract
Text mining is defined as knowledge discovery in large text collections. It detects interesting patterns such as clusters, associations, deviations, similarities, and differences in sets of texts. Current text mining methods use simplistic representations of text contents, such as keyword vectors, which seriously limit the kind and meaningfulness of possible discoveries. We show how to perform some typical mining tasks using conceptual graphs as a formal but meaningful representation of texts. Our methods involve qualitative and quantitative comparison of conceptual graphs, conceptual clustering, building a conceptual hierarchy, and application of data mining techniques to this hierarchy in order to detect interesting associations and deviations. Our experiments show that, despite a widespread belief to the contrary, detailed meaningful mining with conceptual graphs is computationally affordable.
Work done under partial support of CONACyT, CGEPI-IPN, and SNI, Mexico.
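To make the idea of qualitative and quantitative comparison of conceptual graphs more concrete, the following is a minimal sketch and not the measure used in the paper. It assumes a toy representation in which a graph is a set of concept labels plus a set of (source, relation, target) triples, takes the shared part of two graphs as the qualitative result, and uses the Dice coefficient as the quantitative score. The names `ConceptualGraph`, `dice`, and `compare` are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConceptualGraph:
    """Toy conceptual graph (assumed representation, not the paper's)."""
    concepts: frozenset   # concept labels, e.g. "cat"
    relations: frozenset  # (source, relation, target) triples


def dice(a, b):
    """Dice coefficient of two sets; 0.0 when both sets are empty."""
    total = len(a) + len(b)
    return 2 * len(a & b) / total if total else 0.0


def compare(g1, g2):
    """Return the shared part (qualitative) and Dice scores (quantitative)."""
    overlap = ConceptualGraph(
        g1.concepts & g2.concepts,
        g1.relations & g2.relations,
    )
    scores = {
        "concepts": dice(g1.concepts, g2.concepts),
        "relations": dice(g1.relations, g2.relations),
    }
    return overlap, scores


# Two small illustrative graphs: "a cat chases a mouse" and "a dog chases a cat".
g1 = ConceptualGraph(
    frozenset({"cat", "chase", "mouse"}),
    frozenset({("chase", "agent", "cat"), ("chase", "patient", "mouse")}),
)
g2 = ConceptualGraph(
    frozenset({"dog", "chase", "cat"}),
    frozenset({("chase", "agent", "dog"), ("chase", "patient", "cat")}),
)

overlap, scores = compare(g1, g2)
print(sorted(overlap.concepts))
print(scores)
```

In this toy case the two graphs share concepts but no relation triples, which shows how a graph representation can separate texts that a keyword vector would treat as close: the same words appear, but in different roles.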
© 2002 Springer-Verlag Berlin Heidelberg
Cite this paper
Montes-y-Gómez, M., Gelbukh, A., López-López, A. (2002). Text Mining at Detail Level Using Conceptual Graphs. In: Priss, U., Corbett, D., Angelova, G. (eds) Conceptual Structures: Integration and Interfaces. ICCS 2002. Lecture Notes in Computer Science(), vol 2393. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-45483-7_10
DOI: https://doi.org/10.1007/3-540-45483-7_10
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-43901-1
Online ISBN: 978-3-540-45483-0
eBook Packages: Springer Book Archive