DOI: 10.5555/1889788.1889799

Modeling relations and their mentions without labeled text

Published: 20 September 2010

Abstract

Several recent works on relation extraction apply the distant supervision paradigm: instead of relying on annotated text to learn how to predict relations, they employ existing knowledge bases (KBs) as a source of supervision. Crucially, these approaches are trained on the assumption that each sentence that mentions the two related entities is an expression of the given relation. Here we argue that this leads to noisy patterns that hurt precision, in particular if the knowledge base is not directly related to the text we are working with. We present a novel approach to distant supervision that can alleviate this problem based on the following two ideas: first, we use a factor graph to explicitly model the decision of whether two entities are related, and the decision of whether this relation is mentioned in a given sentence; second, we apply constraint-driven semi-supervision to train this model without any knowledge about which sentences express the relations in our training KB. We apply our approach to extract relations from the New York Times corpus and use Freebase as the knowledge base. When compared to a state-of-the-art approach for relation extraction under distant supervision, we achieve a 31% error reduction.
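
To make the criticized assumption concrete, the following is a minimal sketch of the standard distant supervision labeling heuristic, not the factor graph model proposed in the paper. The toy knowledge base, the relation name `born_in`, the example sentences, and the function `distant_labels` are all assumptions introduced for illustration.

```python
# Toy knowledge base of (entity1, relation, entity2) facts.
# The relation name and the single fact are illustrative assumptions.
KB = {("Barack Obama", "born_in", "Honolulu")}

# Invented example sentences standing in for a news corpus.
SENTENCES = [
    "Barack Obama was born in Honolulu.",
    "Barack Obama visited Honolulu last week.",
    "Honolulu is the capital of Hawaii.",
]


def distant_labels(kb, sentences):
    """Apply the standard assumption: every sentence that mentions both
    entities of a KB fact is labeled as expressing that fact's relation."""
    labeled = []
    for e1, rel, e2 in sorted(kb):
        for sentence in sentences:
            # Plain substring matching stands in for real entity linking.
            if e1 in sentence and e2 in sentence:
                labeled.append((sentence, e1, rel, e2))
    return labeled


labels = distant_labels(KB, SENTENCES)
for sentence, e1, rel, e2 in labels:
    print(f"{rel}({e1}, {e2}) <- {sentence}")
```

Under this heuristic the second sentence is labeled `born_in` even though it only reports a visit, which is the kind of noisy pattern the abstract describes. The approach in the abstract instead treats the per-sentence decision of whether the relation is mentioned as a separate variable, rather than fixing it to true for every co-occurrence.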

Published In

ECML PKDD'10: Proceedings of the 2010 European conference on Machine learning and knowledge discovery in databases: Part III
September 2010
631 pages
ISBN: 3642159389

Sponsors

  • PASCAL2 - Pattern Analysis, Statistical Modelling and Computational Learning
  • Google Inc.
  • Nokia
  • Yahoo! Research Labs
  • INRIA: Institut Natl de Recherche en Info et en Automatique

Publisher

Springer-Verlag, Berlin, Heidelberg

Qualifiers

  • Article
