Abstract
The natural language processing community has recently experienced a growth of interest in domain-independent shallow semantic parsing, the process of assigning a Who did What to Whom, When, Where, Why, How, etc. structure to plain text. This process entails identifying groups of words in a sentence that represent these semantic arguments and assigning specific labels to them. It could play a key role in NLP tasks such as Information Extraction, Question Answering and Summarization. We propose a machine learning algorithm for semantic role parsing, extending the work of Gildea and Jurafsky (2002), Surdeanu et al. (2003) and others. Our algorithm is based on Support Vector Machines, which we show give a large improvement in performance over earlier classifiers. We show performance improvements through a number of new features designed to improve generalization to unseen data, such as automatic clustering of verbs. We also report on various analytic studies examining which features are most important, comparing our classifier to other machine learning algorithms in the literature, and testing its generalization to a new test set from a different genre. On the task of assigning semantic labels to the PropBank (Kingsbury, Palmer, & Marcus, 2002) corpus, our final system has a precision of 84% and a recall of 75%, which are the best results currently reported for this task. Finally, we explore a completely different architecture which does not require a deep syntactic parse. We reformulate the task as a combined chunking and classification problem, thus allowing our algorithm to be applied to new languages or genres of text for which statistical syntactic parsers may not be available.
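The reformulation of role labeling as a chunking problem can be illustrated with a small sketch. The abstract does not say which chunk encoding is used, so the IOB2 scheme below is an assumption, as are the function name `spans_to_iob`, the example sentence, and its PropBank-style labels. The sketch only shows how labeled argument spans can become one tag per token, which a per-token classifier could then be trained to predict.

```python
from typing import List, Tuple


def spans_to_iob(n_tokens: int, spans: List[Tuple[int, int, str]]) -> List[str]:
    """Encode labeled spans as IOB2 tags, one tag per token.

    Each span is (start, end_exclusive, label). Tokens outside every
    span get the tag "O". The encoding scheme is an assumption made
    for illustration, not a detail taken from the abstract.
    """
    tags = ["O"] * n_tokens
    for start, end, label in spans:
        if not (0 <= start < end <= n_tokens):
            raise ValueError(f"span ({start}, {end}) is out of range")
        if any(tag != "O" for tag in tags[start:end]):
            raise ValueError(f"span ({start}, {end}) overlaps another span")
        tags[start] = f"B-{label}"          # first token of the argument
        for i in range(start + 1, end):
            tags[i] = f"I-{label}"          # remaining tokens of the argument
    return tags


# Hypothetical example sentence with PropBank-style labels for "gave".
tokens = ["John", "gave", "Mary", "a", "book", "yesterday"]
spans = [(0, 1, "ARG0"), (2, 3, "ARG2"), (3, 5, "ARG1"), (5, 6, "ARGM-TMP")]
tags = spans_to_iob(len(tokens), spans)

for token, tag in zip(tokens, tags):
    print(f"{token}\t{tag}")
```

In this encoding the predicate itself ("gave") is left as "O", and a multi-word argument such as "a book" becomes a `B-` tag followed by `I-` tags, so no syntactic parse is needed to represent the argument boundaries.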
References
Allwein, E. L., Schapire, R. E., & Singer, Y. (2000). Reducing multiclass to binary: A unifying approach for margin classifiers. In Proceedings of the 17th International Conference on Machine Learning (pp. 9–16). San Francisco, CA: Morgan Kaufmann.
Baker, C. F., Fillmore, C. J., & Lowe, J. B. (1998). The Berkeley Framenet Project. In Proceedings of the International Conference on Computational Linguistics (COLING/ACL-98) (pp. 86–90). Montreal.
Bikel, D. M., Schwartz, R., & Weischedel, R. M. (1999). An algorithm that learns what’s in a name. Machine Learning, 34, 211–231.
Blaheta, D., & Charniak, E. (2000). Assigning function tags to parsed text. In Proceedings of the 1st Annual Meeting of the North American Chapter of the ACL(NAACL) (pp. 234–240). Seattle, Washington.
Burges, C. J. C. (1998). A tutorial on support vector machines for pattern recognition. Data Mining and Knowledge Discovery, 2:2, 121–167.
Charniak, E. (2001). Immediate-head parsing for language models. In Proceedings of the 39th Annual Conference of the Association for Computational Linguistics (ACL-01). Toulouse, France.
Chen, J., & Rambow, O. (2003). Use of deep linguistics features for the recognition and labeling of semantic arguments. In Proceedings of the Conference on Empirical Methods in Natural Language Processing. Sapporo, Japan.
Collins, M. J. (1999). Head-driven statistical models for natural language parsing. Ph.D. thesis, University of Pennsylvania, Philadelphia.
Daniel, K., Schabes, Y., Zaidel, M., & Egedi, D. (1992). A freely available wide coverage morphological analyzer for English. In Proceedings of the 14th International Conference on Computational Linguistics (COLING-92). Nantes, France.
Fleischman, M., & Hovy, E. (2003). A maximum entropy approach to framenet tagging. In Proceedings of the Human Language Technology Conference. Edmonton, Canada.
Gildea, D., & Hockenmaier, J. (2003). Identifying semantic roles using combinatory categorial grammar. In Proceedings of the Conference on Empirical Methods in Natural Language Processing. Sapporo, Japan.
Gildea, D., & Jurafsky, D. (2000). Automatic labeling of semantic roles. In Proceedings of the 38th Annual Conference of the Association for Computational Linguistics (ACL-00) (pp. 512–520). Hong Kong.
Gildea, D., & Jurafsky, D. (2002). Automatic labeling of semantic roles. Computational Linguistics, 28:3, 245–288.
Gildea, D., & Palmer, M. (2002). The necessity of syntactic parsing for predicate argument recognition. In Proceedings of the 40th Annual Conference of the Association for Computational Linguistics (ACL-02). Philadelphia, PA.
Hacioglu, K., Pradhan, S., Ward, W., Martin, J., & Jurafsky, D. (2003). Shallow semantic parsing using support vector machines. Technical Report TR-CSLR-2003-1, Center for Spoken Language Research, Boulder, Colorado.
Hacioglu, K., & Ward, W. (2003). Target word detection and semantic role chunking using support vector machines. In Proceedings of the Human Language Technology Conference. Edmonton, Canada.
Hearst, M. (1999). Untangling text data mining. In Proceedings of the 37th Annual Meeting of the ACL (pp. 3–10). College Park, Maryland.
Hofmann, T., & Puzicha, J. (1998). Statistical models for co-occurrence data. Memo, Massachusetts Institute of Technology Artificial Intelligence Laboratory.
Joachims, T. (1998). Text categorization with support vector machines: Learning with many relevant features. In Proceedings of the European Conference on Machine Learning (ECML).
Kingsbury, P., Palmer, M., & Marcus, M. (2002). Adding semantic annotation to the Penn Treebank. In Proceedings of the Human Language Technology Conference. San Diego, CA.
Kressel, U. H. G. (1999). Pairwise classification and support vector machines. In B. Scholkopf, C. Burges, & A. J. Smola (Eds.), Advances in kernel methods. The MIT Press.
Kudo, T., & Matsumoto, Y. (2000). Use of support vector learning for chunk identification. In Proceedings of the 4th Conference on CoNLL-2000 and LLL-2000 (pp. 142–144).
Kudo, T., & Matsumoto, Y. (2001). Chunking with support vector machines. In Proceedings of the 2nd Meeting of the North American Chapter of the Association for Computational Linguistics (NAACL-2001).
LDC. (2002). The AQUAINT Corpus of English News Text, Catalog no. LDC2002T31.
Lin, D. (1998). Automatic retrieval and clustering of similar words. In Proceedings of the International Conference on Computational Linguistics (COLING/ACL-98). Montreal, Canada.
Lodhi, H., Saunders, C., Shawe-Taylor, J., Cristianini, N., & Watkins, C. (2002). Text classification using string kernels. Journal of Machine Learning Research, 2:Feb, 419–444.
Magerman, D. (1994). Natural language parsing as statistical pattern recognition. Ph.D. thesis, Stanford University, CA.
Marcus, M., Kim, G., Marcinkiewicz, M.A., MacIntyre, R., Bies, A., Ferguson, M., Katz, K., & Schasberger, B. (1994). The Penn treebank: Annotating predicate argument structure.
Platt, J. (2000). Probabilities for support vector machines. In A. Smola, P. Bartlett, B. Scholkopf, & D. Schuurmans (Eds.), Advances in large margin classifiers. Cambridge, MA: MIT Press.
Pradhan, S., Hacioglu, K., Ward, W., Martin, J., & Jurafsky, D. (2003). Semantic role parsing: Adding semantic structure to unstructured text. In Proceedings of the International Conference on Data Mining (ICDM 2003). Melbourne, Florida.
Pradhan, S., Ward, W., Hacioglu, K., Martin, J., & Jurafsky, D. (2004). Shallow semantic parsing using support vector machines. In Proceedings of the Human Language Technology Conference/North American chapter of the Association of Computational Linguistics (HLT/NAACL). Boston, MA.
Quinlan, J. R. (1986). Induction of decision trees. Machine Learning, 1:1, 81–106.
Quinlan, R. (2003). Data Mining Tools See5 and C5.0. http://www.rulequest.com.
Ramshaw, L. A., & Marcus, M. P. (1995). Text chunking using transformation-based learning. In Proceedings of the Third Annual Workshop on Very Large Corpora (pp. 82–94).
Sang, E. F. T. K., & Veenstra, J. (1999). Representing text chunks. In Proceedings of the EACL (pp. 173–179).
Surdeanu, M., Harabagiu, S., Williams, J., & Aarseth, P. (2003). Using predicate-argument structures for information extraction. In Proceedings of the 41st Annual Meeting of the Association for Computational Linguistics. Sapporo, Japan.
Thompson, C. A., Levy, R., & Manning, C. D. (2003). A generative model for semantic role labeling. In Proceedings of the European Conference on Machine Learning (ECML).
Vapnik, V. (1998). Statistical learning theory. New York: John Wiley and Sons Inc.
Wallis, S., & Nelson, G. (2001). Knowledge discovery in grammatically analysed corpora. Data Mining and Knowledge Discovery, 5:4, 305–335.
Additional information
Editors:
Dan Roth and Pascale Fung
This research was partially supported by the ARDA AQUAINT program via contract OCG4423B and by the NSF via grant IIS-9978025.
About this article
Cite this article
Pradhan, S., Hacioglu, K., Krugler, V. et al. Support Vector Learning for Semantic Argument Classification. Mach Learn 60, 11–39 (2005). https://doi.org/10.1007/s10994-005-0912-2
DOI: https://doi.org/10.1007/s10994-005-0912-2