Abstract
Many approaches to Information Extraction (IE) have been proposed in the literature that are capable of finding and extracting specific facts in relatively unstructured documents. Applied to a large information space, they make data ready for post-processing, which is crucial in many contexts such as Web mining and search tools. This paper proposes a new IE strategy, based on symbolic and neural techniques, and tests it experimentally within the price comparison service domain. In particular, the strategy seeks to locate a set of atomic elements in free text that has been preliminarily extracted from web documents, and subsequently to classify them by assigning a class label representing a specific product.
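The two-stage idea in the abstract (locate atomic elements, then assign a product label) can be illustrated with a minimal sketch. This is not the authors' method: the lexicon, the product table, the function names, and the 0.75 threshold are all hypothetical, the typo-tolerant matching uses Python's difflib similarity ratio, and a simple set-overlap score stands in for the neural classifier the paper describes.

```python
from difflib import SequenceMatcher

# Hypothetical lexicon of atomic elements (illustrative, not from the paper).
LEXICON = ["nokia", "n95", "samsung", "galaxy", "8gb", "black"]

# Hypothetical product classes, each described by the elements it contains.
PRODUCTS = {
    "Nokia N95 8GB": {"nokia", "n95", "8gb"},
    "Samsung Galaxy": {"samsung", "galaxy"},
}


def similarity(a, b):
    """Similarity ratio in [0, 1] between two strings."""
    return SequenceMatcher(None, a, b).ratio()


def locate_elements(text, threshold=0.75):
    """Stage 1: map each token to its closest lexicon entry, tolerating typos."""
    found = []
    for token in text.lower().split():
        best = max(LEXICON, key=lambda entry: similarity(token, entry))
        if similarity(token, best) >= threshold:
            found.append(best)
    return found


def classify(elements):
    """Stage 2: choose the product whose element set overlaps most (Jaccard).

    Assumption: this overlap score replaces the neural classifier for brevity.
    """
    observed = set(elements)

    def score(name):
        reference = PRODUCTS[name]
        union = observed | reference
        return len(observed & reference) / len(union) if union else 0.0

    return max(PRODUCTS, key=score)


if __name__ == "__main__":
    elements = locate_elements("Nokai N95 8GB black new")
    print(elements, "->", classify(elements))
```

In this sketch the misspelled token "Nokai" is still mapped to the lexicon entry "nokia", and the unknown token "new" is discarded because no entry is similar enough. A trained classifier would replace `classify` in a realistic setting.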
Copyright information
© 2007 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Gallo, I., Binaghi, E. (2007). Information Extraction and Classification from Free Text Using a Neural Approach. In: Rueda, L., Mery, D., Kittler, J. (eds) Progress in Pattern Recognition, Image Analysis and Applications. CIARP 2007. Lecture Notes in Computer Science, vol 4756. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-76725-1_95
DOI: https://doi.org/10.1007/978-3-540-76725-1_95
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-76724-4
Online ISBN: 978-3-540-76725-1
eBook Packages: Computer Science, Computer Science (R0)