Abstract
In this paper we demonstrate and quantify the advantage gained by allowing relation extraction algorithms to use information about the cardinality of the target relation. The two algorithms presented herein differ only in their assumption about the nature of the target relation (one-to-many or many-to-many). Both algorithms are tested on the same relation to show the degree of advantage gained from their differing assumptions. Comparing the performance of the two algorithms on a one-to-many domain reveals several previously undocumented behaviors that can be used to improve the performance of relation extraction algorithms. The first is a distinct inverted U-shape in the initial portion of the recall curve of the many-to-many algorithm. The second is that, as the number of seeds increases, the rate of improvement of the two algorithms decreases, approaching the rate at which new information is added via the seeds.
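To make the cardinality assumption concrete, the following is a minimal illustrative sketch, not the algorithms evaluated in the paper. It assumes a hypothetical upstream extractor has already produced scored candidate pairs; the function names, the threshold, and the sample data are all invented for illustration. Under a many-to-many assumption, every sufficiently confident pair is kept, whereas under a one-to-many assumption each entity on the "many" side may keep only its single best-scoring partner.

```python
# Illustrative sketch only: names, threshold, and data are hypothetical,
# and this is not the method described in the paper.

def filter_many_to_many(candidates, threshold=0.5):
    """Keep every candidate pair whose score meets the threshold."""
    return {(one, many) for one, many, score in candidates if score >= threshold}


def filter_one_to_many(candidates, threshold=0.5):
    """Keep, for each 'many'-side entity, only its best-scoring partner."""
    best = {}  # maps 'many'-side entity -> (one-side entity, score)
    for one, many, score in candidates:
        if score < threshold:
            continue
        if many not in best or score > best[many][1]:
            best[many] = (one, score)
    return {(one, many) for many, (one, _) in best.items()}


# Hypothetical scored candidates as (one_side, many_side, score).
candidates = [
    ("France", "Paris", 0.9),
    ("Italy", "Paris", 0.6),    # conflicting candidate for the same city
    ("France", "Lyon", 0.8),
    ("Germany", "Berlin", 0.4), # below the threshold in both variants
]

print(sorted(filter_many_to_many(candidates)))
print(sorted(filter_one_to_many(candidates)))
```

In this toy setting, the one-to-many variant discards the lower-scoring conflicting pair that the many-to-many variant retains, which is one way knowledge of cardinality could prune incorrect extractions.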
Copyright information
© 2009 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Normand, E., Grant, K., Ioup, E., Sample, J. (2009). Improving Relation Extraction by Exploiting Properties of the Target Relation. In: Winslett, M. (eds) Scientific and Statistical Database Management. SSDBM 2009. Lecture Notes in Computer Science, vol 5566. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-02279-1_39
DOI: https://doi.org/10.1007/978-3-642-02279-1_39
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-02278-4
Online ISBN: 978-3-642-02279-1
eBook Packages: Computer Science, Computer Science (R0)