Abstract
Log files generated by digital systems can serve management information systems as a source of important information about the condition of those systems. However, log files are not exploited exhaustively for information extraction. Classical information extraction methods, such as terminology extraction methods, are ill-suited to this context because of the specific characteristics of log files: their heterogeneous structure, their special vocabulary, and the fact that they do not follow a natural language grammar. In this paper, we introduce Exterlog, our approach to extracting terminology from log files, and detail how it handles the particularities of such textual data.
Copyright information
© 2009 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Saneifar, H., Bonniol, S., Laurent, A., Poncelet, P., Roche, M. (2009). Terminology Extraction from Log Files. In: Bhowmick, S.S., Küng, J., Wagner, R. (eds) Database and Expert Systems Applications. DEXA 2009. Lecture Notes in Computer Science, vol 5690. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-03573-9_65
DOI: https://doi.org/10.1007/978-3-642-03573-9_65
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-03572-2
Online ISBN: 978-3-642-03573-9
eBook Packages: Computer Science, Computer Science (R0)