Abstract
Dependence analysis is often key to improving the performance of text retrieval. Recognizing the type of relation between concepts is more meaningful than a statistical measure of how strongly they are related. In this paper, we explore a bootstrapping method that automatically extracts semantic patterns from a large-scale corpus to identify the geographical “be part of” relationship between Chinese location concepts in context. Our method differs from other bootstrapping methods in two ways: (1) it uses a bi-sequence alignment algorithm from bioinformatics to generate candidate patterns, and (2) it introduces a new metric for evaluating pattern confidence, which improves extraction quality in the next iteration. In experiments on automatic recognition of the “be part of” relationship, the pattern set generated by our method achieved higher coverage and precision than the one produced by DIPRE.
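To make the idea of generating a candidate pattern by aligning two sequences more concrete, here is a minimal sketch. The abstract does not specify the paper's alignment algorithm, so this uses a standard Needleman-Wunsch global alignment as a stand-in. The function name `align_to_pattern`, the `<LOC1>`/`<LOC2>` placeholders, the `*` wildcard, the scoring values, and the English example tokens are all assumptions made for illustration.

```python
from typing import List

WILDCARD = "*"  # hypothetical marker for positions where the two contexts differ


def align_to_pattern(a: List[str], b: List[str],
                     match: int = 2, mismatch: int = -1, gap: int = -1) -> List[str]:
    """Globally align two token sequences (Needleman-Wunsch) and return a
    pattern that keeps shared tokens and replaces the rest with a wildcard."""
    n, m = len(a), len(b)
    # score[i][j] = best alignment score of a[:i] against b[:j]
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            score[i][j] = max(score[i - 1][j - 1] + sub,
                              score[i - 1][j] + gap,
                              score[i][j - 1] + gap)

    # Trace back from the bottom-right cell to recover one optimal alignment.
    pattern: List[str] = []
    i, j = n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0:
            sub = match if a[i - 1] == b[j - 1] else mismatch
            if score[i][j] == score[i - 1][j - 1] + sub:
                pattern.append(a[i - 1] if a[i - 1] == b[j - 1] else WILDCARD)
                i, j = i - 1, j - 1
                continue
        if i > 0 and score[i][j] == score[i - 1][j] + gap:
            i -= 1  # token of `a` aligned to a gap
        else:
            j -= 1  # token of `b` aligned to a gap
        pattern.append(WILDCARD)
    pattern.reverse()

    # Collapse runs of wildcards into a single wildcard.
    collapsed: List[str] = []
    for tok in pattern:
        if tok == WILDCARD and collapsed and collapsed[-1] == WILDCARD:
            continue
        collapsed.append(tok)
    return collapsed


if __name__ == "__main__":
    ctx1 = ["<LOC1>", "is", "located", "in", "<LOC2>"]
    ctx2 = ["<LOC1>", "is", "a", "city", "in", "<LOC2>"]
    print(align_to_pattern(ctx1, ctx2))
```

In a bootstrapping loop, contexts of known seed pairs would be aligned pairwise in this way, and the resulting patterns would then be scored before being used to extract new pairs. The scoring metric itself is not described in the abstract and is not sketched here.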
References
Brin, S.: Extracting patterns and relations from the World Wide Web. In: Proc. of the 1998 International Workshop on the Web and Databases (1998)
Riloff, E., Jones, R.: Learning dictionaries for information extraction by multi-level bootstrapping. In: Proc. of the Sixteenth National Conference on Artificial Intelligence (1999)
Agichtein, E., Gravano, L.: Snowball: Extracting relations from large plain-text collections. In: Proc. of the 5th ACM International Conference on Digital Libraries (2000)
Zhang, Y., Zhou Joe, F.: A trainable method for extracting Chinese entity names and their relations. In: Proc. of the second Chinese Language Processing Workshop (2000)
Thelen, M., Riloff, E.: A Bootstrapping Method for Learning Semantic Lexicons using Extraction Pattern Contexts. In: Proc. of the 2002 Conference on Empirical Methods in Natural Language Processing (2002)
Lin, W., Yangarber, R., Grishman, R.: Bootstrapped Learning of Semantic Classes from Positive and Negative Examples. In: Proc. of the ICML-2003 Workshop on the Continuum from Labeled to Unlabeled Data, Washington DC (2003)
Etzioni, O., Cafarella, M., Downey, D., Popescu, A.M., et al.: Methods for Domain-Independent Information Extraction from the Web: An Experimental Comparison. In: Proc. of the AAAI Conference (2004)
Han, H., Elmasri, R.: Learning Rules for Conceptual Structure on the Web. Journal of Intelligent Information Systems 22(3), 237–256 (2004)
Fisher, D., Soderland, S., McCarthy, J., Feng, F., Lehnert, W.: Description of the Umass systems as used for MUC-6. In: Proc. of the 6th Message Understanding Conference. Columbia, MD (1995)
Gao, J., Nie, J.-Y., Wu, G., Cao, G.: Dependence language model for information retrieval. In: Proc. of the 27th annual international ACM SIGIR conference on Research and development in information retrieval, pp. 170–177 (2004)
Nallapati, R., Allan, J.: Capturing term dependencies using a language model based on sentence trees. In: Proc. of CIKM 2002, pp. 383–390 (2002)
Genest, D., Chein, M.: A Content-Search Information Retrieval Process Based on Conceptual Graphs. Knowledge and Information Systems Journal 8, 292–309 (2005)
Roussey, C., Calabretto, S., Pinon, J.-M.: A New Conceptual Graph Formalism Adapted for Multilingual Information Retrieval Purposes. In: Mayr, H.C., Lazanský, J., Quirchmayr, G., Vogel, P. (eds.) DEXA 2001. LNCS, vol. 2113, pp. 92–101. Springer, Heidelberg (2001)
Sammeth, M., Morgenstern, B., Stoye, J.: Divide-and-conquer multiple alignment with segment-based constraints. Bioinformatics 19(2), 189–195 (2003)
Copyright information
© 2007 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Hu, Y., Lu, R., Chen, Y., Chen, X., Duan, J. (2007). The Bootstrapping Based Recognition of Conceptual Relationship for Text Retrieval. In: Kedad, Z., Lammari, N., Métais, E., Meziane, F., Rezgui, Y. (eds) Natural Language Processing and Information Systems. NLDB 2007. Lecture Notes in Computer Science, vol 4592. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-73351-5_23
DOI: https://doi.org/10.1007/978-3-540-73351-5_23
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-73350-8
Online ISBN: 978-3-540-73351-5
eBook Packages: Computer Science (R0)