
A survey of paraphrasing and textual entailment methods

Published: 01 May 2010


Paraphrasing methods recognize, generate, or extract phrases, sentences, or longer natural language expressions that convey almost the same information. Textual entailment methods, on the other hand, recognize, generate, or extract pairs of natural language expressions, such that a human who reads (and trusts) the first element of a pair would most likely infer that the other element is also true. Paraphrasing can be seen as bidirectional textual entailment, and methods from the two areas are often similar. Both kinds of methods are useful, at least in principle, in a wide range of natural language processing applications, including question answering, summarization, text generation, and machine translation. We summarize key ideas from the two areas by considering in turn recognition, generation, and extraction methods, also pointing to prominent articles and resources.
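
The view of paraphrasing as bidirectional textual entailment can be illustrated with a minimal sketch. It assumes some entailment recognizer is available as a function of a text and a hypothesis. The names `toy_entails` and `is_paraphrase` are hypothetical, and the word-overlap test is only a crude stand-in for a real recognizer, not a method taken from the survey.

```python
from typing import Callable

# An entailment recognizer maps (text, hypothesis) to a yes/no decision.
Entails = Callable[[str, str], bool]


def toy_entails(text: str, hypothesis: str) -> bool:
    """Crude placeholder (an assumption for illustration only):
    the text 'entails' the hypothesis if every hypothesis word
    also occurs in the text, ignoring case and word order."""
    return set(hypothesis.lower().split()) <= set(text.lower().split())


def is_paraphrase(a: str, b: str, entails: Entails = toy_entails) -> bool:
    """Treat two expressions as paraphrases when entailment
    holds in both directions."""
    return entails(a, b) and entails(b, a)


if __name__ == "__main__":
    longer = "the cat sat on the mat"
    shorter = "the cat sat"
    print(toy_entails(longer, shorter))    # one direction only
    print(is_paraphrase(longer, shorter))  # fails the reverse direction
```

Any recognizer with the same signature could be passed as `entails`; the bidirectional check itself does not depend on how entailment is decided.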


Alpaydin, E. (2004). Introduction to Machine Learning. MIT Press.
Androutsopoulos, I., Oberlander, J., & Karkaletsis, V. (2007). Source authoring for multilingual generation of personalised object descriptions. Nat. Lang. Engineering, 13(3), 191-233.
Androutsopoulos, I., Ritchie, G. D., & Thanisch, P. (1995). Natural language interfaces to databases -an introduction. Nat. Lang. Engineering, 1(1), 29-81.
Baeza-Yates, R., & Ribeiro-Neto, B. (1999). Modern Information Retrieval. Addison Wesley.
Baker, C. F., Fillmore, C. J., & Lowe, J. B. (1998). The Berkeley FrameNet project. In Proc. of the 17th Int. Conf. on Comp. Linguistics, pp. 86-90, Montreal, Quebec, Canada.
Bannard, C., & Callison-Burch, C. (2005). Paraphrasing with bilingual parallel corpora. In Proc. of the 43rd Annual Meeting of ACL, pp. 597-604, Ann Arbor, MI.
Bar-Haim, R., Berant, J., & Dagan, I. (2009). A compact forest for scalable inference over entailment and paraphrase rules. In Proc. of the Conf. on EMNLP, pp. 1056-1065, Singapore.
Bar-Haim, R., Dagan, I., Dolan, B., Ferro, L., Giampiccolo, D., Magnini, B., & Szpektor, I. (2006). The 2nd PASCAL recognising textual entailment challenge. In Proc. of the 2nd PASCAL Challenges Workshop on Recognising Textual Entailment, Venice, Italy.
Bar-Haim, R., Dagan, I., Greental, I., & Shnarch, E. (2007). Semantic inference at the lexical-syntactic level. In Proc. of the 22nd Conf. on Artificial Intelligence, pp. 871-876, Vancouver, BC, Canada.
Barzilay, R., & Elhadad, N. (2003). Sentence alignment for monolingual comparable corpora. In Proc. of the Conf. on EMNLP, pp. 25-32, Sapporo, Japan.
Barzilay, R., & Lee, L. (2002). Bootstrapping lexical choice via multiple-sequence alignment. In Proc. of the Conf. on EMNLP, pp. 164-171, Philadelphia, PA.
Barzilay, R., & Lee, L. (2003). Learning to paraphrase: an unsupervised approach using multiple-sequence alignment. In Proc. of the HLT Conf. of NAACL, pp. 16-23, Edmonton, Canada.
Barzilay, R., & McKeown, K. (2001). Extracting paraphrases from a parallel corpus. In Proc. of the 39th Annual Meeting of ACL, pp. 50-57, Toulouse, France.
Barzilay, R., & McKeown, K. R. (2005). Sentence fusion for multidocument news summarization. Comp. Linguistics, 31(3), 297-327.
Bateman, J., & Zock, M. (2003). Natural language generation. In Mitkov, R. (Ed.), The Oxford Handbook of Comp. Linguistics, chap. 15, pp. 284-304. Oxford University Press.
Bensley, J., & Hickl, A. (2008). Workshop: Application of LCC's GROUNGHOG system for RTE-4. In Proc. of the Text Analysis Conference, Gaithersburg, MD.
Bergmair, R. (2009). A proposal on evaluation measures for RTE. In Proc. of the ACL Workshop on Applied Textual Inference, pp. 10-17, Singapore.
Berwick, R. C. (1991). Principles of principle-based parsing. In Berwick, R. C., Abney, S. P., & Tenny, C. (Eds.), Principle-Based Parsing: Computation and Psycholinguistics, pp. 1-37. Kluwer, Dordrecht, Netherlands.
Bhagat, R., Pantel, P., & Hovy, E. (2007). LEDIR: An unsupervised algorithm for learning directionality of inference rules. In Proc. of the Conf. on EMNLP and the Conf. on Computational Nat. Lang. Learning, pp. 161-170, Prague, Czech Republic.
Bhagat, R., & Ravichandran, D. (2008). Large scale acquisition of paraphrases for learning surface patterns. In Proc. of the 46th Annual Meeting of ACL: HLT, pp. 674-682, Columbus, OH.
Bikel, D. M., Schwartz, R. L., & Weischedel, R. M. (1999). An algorithm that learns what's in a name. Machine Learning, 34(1-3), 211-231.
Blum, A., & Mitchell, T. (1998). Combining labeled and unlabeled data with co-training. In Proc. of the 11th Annual Conf. on Computational Learning Theory, pp. 92-100, Madison, WI.
Bos, J., & Markert, K. (2005). Recognising textual entailment with logical inference. In Proc. of the Conf. on HLT and EMNLP, pp. 628-635, Vancouver, BC, Canada.
Brill, E. (1992). A simple rule-based part of speech tagger. In Proc. of the 3rd Conf. on Applied Nat. Lang. Processing, pp. 152-155, Trento, Italy.
Brockett, C., & Dolan,W. (2005). Support Vector Machines for paraphrase identification and corpus construction. In Proc. of the 3rd Int. Workshop on Paraphrasing, pp. 1-8, Jeju island, Korea.
Brown, P. F., Della Pietra, S. A., Della Pietra, V. J., & Mercer, R. L. (1993). The mathematics of statistical machine translation: Parameter estimation. Comp. Linguistics, 19(2), 263-311.
Budanitsky, A., & Hirst, G. (2006). Evaluating WordNet-based measures of lexical semantic relatedness. Comp. Linguistics, 32(1), 13-47.
Burchardt, A., & Pennacchiotti, M. (2008). FATE: A FrameNet-annotated corpus for textual entailment. In Proc. of the 6th Language Resources and Evaluation Conference, Marrakech, Marocco.
Burchardt, A., Pennacchiotti, M., Thater, S., & Pinkal, M. (2009). Assessing the impact of frame semantics on textual entailment. Nat. Lang. Engineering, 15(4).
Burchardt, A., Reiter, N., Thater, S., & Frank, A. (2007). A semantic approach to textual entailment: System evaluation and task analysis. In Proc. of the ACL-PASCAL Workshop on Textual Entailment and Paraphrasing, pp. 10-15, Prague, Czech Republic. ACL.
Califf, M., & Mooney, R. (2003). Bottom-up relational learning of pattern matching rules for information extraction. Journal of Machine Learning Research, 4, 177-210.
Callison-Burch, C. (2008). Syntactic constraints on paraphrases extracted from parallel corpora. In Proc. of the Conf. on EMNLP, pp. 196-205, Honolulu, HI.
Callison-Burch, C., Cohn, T., & Lapata, M. (2008). ParaMetric: An automatic evaluation metric for paraphrasing. In Proc. of the 22nd Int. Conf. on Comp. Linguistics, pp. 97-104, Manchester, UK.
Callison-Burch, C., Dagan, I., Manning, C., Pennacchiotti, M., & Zanzotto, F. M. (Eds.). (2009). Proc. of the ACL-IJCNLP Workshop on Applied Textual Inference. Singapore.
Callison-Burch, C., Koehn, P., & Osborne, M. (2006a). Improved statistical machine translation using paraphrases. In Proc. of the HLT Conf. of the NAACL, pp. 17-24, New York, NY.
Callison-Burch, C., Osborne, M., & Koehn, P. (2006b). Re-evaluating the role of BLEU in machine translation research. In Proc. of the 11th Conf. of EACL, pp. 249-256, Trento, Italy.
Carnap, R. (1952). Meaning postulates. Philosophical Studies, 3(5).
Charniak, E. (2000). A maximum-entropy-inspired parser. In Proc. of the 1st Conf. of NAACL, pp. 132-139, Seattle, WA.
Chevelu, J., Lavergne, T., Lepage, Y., & Moudenc, T. (2009). Introduction of a new paraphrase generation tool based on Monte-Carlo sampling. In Proc. of the 47th Annual Meeting of ACL and the 4th Int. Joint Conf. on Nat. Lang. Processing of AFNLP, pp. 249-252, Singapore.
Clarke, D. (2009). Context-theoretic semantics for natural language: an overview. In Proc. of the EACL workshop on Geometrical Models of Nat. Lang. Semantics, pp. 112-119, Athens, Greece.
Clarke, J., & Lapata, M. (2008). Global inference for sentence compression: An integer linear programming approach. Journal of Artificial Intelligence Research, 31(1), 399-429.
Cohn, T., Callison-Burch, C., & Lapata, M. (2008). Constructing corpora for the development and evaluation of paraphrase systems. Comp. Linguistics, 34(4), 597-614.
Cohn, T., & Lapata, M. (2008). Sentence compression beyond word deletion. In Proc. of the 22nd Int. Conf. on Comp. Linguistics, Manchester, UK.
Cohn, T., & Lapata, M. (2009). Sentence compression as tree transduction. Journal of Artificial Intelligence Research, 34(1), 637-674.
Collins, M. (2003). Head-driven statistical models for natural language parsing. Comput. Linguistics , 29(4), 589-637.
Corley, C., & Mihalcea, R. (2005). Measuring the semantic similarity of texts. In Proc. of the ACL Workshop on Empirical Modeling of Semantic Equivalence and Entailment, pp. 13-18, Ann Arbor, MI.
Cristianini, N., & Shawe-Taylor, J. (2000). An Introduction to Support Vector Machines and Other Kernel-based Learning Methods. Cambridge University Press.
Culicover, P. (1968). Paraphrase generation and information retrieval from stored text. Mechanical Translation and Computational Linguistics, 11(1-2), 78-88.
Dagan, I., Dolan, B., Magnini, B., & Roth, D. (2009). Recognizing textual entailment: Rational, evaluation and approaches. Nat. Lang. Engineering, 15(4), i-xvii. Editorial of the special issue on Textual Entailment.
Dagan, I., Glickman, O., & Magnini, B. (2006). The PASCAL recognising textual entailment challenge. In Quiñonero-Candela, J., Dagan, I., Magnini, B., & d'Alche' Buc, F. (Eds.), Machine Learning Challenges. Lecture Notes in Computer Science, Vol. 3944, pp. 177-190. Springer-Verlag.
Das, D., & Smith, N. A. (2009). Paraphrase identification as probabilistic quasi-synchronous recognition. In Proc. of the 47th Annual Meeting of ACL and the 4th Int. Joint Conf. on Nat. Lang. Processing of AFNLP, pp. 468-476, Singapore.
de Marneffe, M., Rafferty, A., & Manning, C. (2008). Finding contradictions in text. In Proc. of the 46th Annual Meeting of ACL: HLT, pp. 1039-1047, Columbus, OH.
Deléger, L., & Zweigenbaum, P. (2009). Extracting lay paraphrases of specialized expressions from monolingual comparable medical corpora. In Proc. of the 2nd Workshop on Building and Using Comparable Corpora: from Parallel to Non-parallel Corpora, pp. 2-10, Singapore.
Dolan, B., & Dagan, I. (Eds.). (2005). Proc. of the ACL workshop on Empirical Modeling of Semantic Equivalence and Entailment. Ann Arbor, MI.
Dolan, B., Quirk, C., & Brockett, C. (2004). Unsupervised construction of large paraphrase corpora: Eploiting massively parallel news sources. In Proc. of the 20th Int. Conf. on Comp. Linguistics, pp. 350-356, Geneva, Switzerland.
Dolan,W. B., & Brockett, C. (2005). Automatically constructing a corpus of sentential paraphrases. In Proc. of the 3rd Int. Workshop on Paraphrasing, pp. 9-16, Jeju island, Korea.
Dras, M. (1998). Search in constraint-based paraphrasing. In Proc. of the 2nd Int. Conf. on Natural Lang. Processing and Industrial Applications, pp. 213-219, Moncton, Canada.
Drass, M., & Yamamoto, K. (Eds.). (2005). Proc. of the 3rd Int. Workshop on Paraphrasing. Jeju island, Korea.
Duboue, P. A., & Chu-Carroll, J. (2006). Answering the question you wish they had asked: The impact of paraphrasing for question answering. In Proc. of the HLT Conf. of NAACL, pp. 33-36, New York, NY.
Duclaye, F., Yvon, F., & Collin, O. (2003). Learning paraphrases to improve a question-answering system. In Proc. of the EACL Workshop on Nat. Lang. Processing for Question Answering, pp. 35-41, Budapest, Hungary.
Durbin, R., Eddy, S., Krogh, A., & Mitchison, G. (1998). Biological Sequence Analysis. Cambridge University Press.
Elhadad, N., & Sutaria, K. (2007). Mining a lexicon of technical terms and lay equivalents. In Proc. of the Workshop on BioNLP, pp. 49-56, Prague, Czech Republic.
Erk, K., & Padó, S. (2006). Shalmaneser - a toolchain for shallow semantic parsing. In Proc. of the 5th Language Resources and Evaluation Conference, Genoa, Italy.
Erk, K., & Padó S. (2009). Paraphrase assessment in structured vector space: Exploring parameters and datasets. In Proc. of the EACL Workshop on Geometrical Models of Nat. Lang. Semantics, pp. 57-65, Athens, Greece.
Fellbaum, C. (1998). WordNet: An Electronic Lexical Database. MIT Press.
Finch, A., Hwang, Y. S., & Sumita, E. (2005). Using machine translation evaluation techniques to determine sentence-level semantic equivalence. In Proc. of the 3rd Int. Workshop on Paraphrasing , pp. 17-24, Jeju Island, Korea.
Freund, Y., & Schapire, R. E. (1995). A decision-theoretic generalization of on-line learning and an application to boosting. In Proc. of the 2nd European Conf. on Computational Learning Theory, pp. 23-37, Barcelona, Spain.
Friedman, J., Hastie, T., & Tibshirani, R. (2000). Additive logistic regression: a statistical view of boosting. Annals of Statistics, 28(2), 337-374.
Fung, P., & Cheung, P. (2004). Multi-level bootstrapping for extracting parallel sentences from a quasi-comparable corpus. In Proc. of the 20th Int. Conf. on Comp. Linguistics, pp. 1051- 1057, Geneva, Switzerland.
Galanis, D., & Androutsopoulos, I. (2010). An extractive supervised two-stage method for sentence compression. In Proc. of the HLT Conf. of NAACL, Los Angeles, CA.
Gale, W., & Church, K. (1993). A program for aligning sentences in bilingual corpora. Comp. Linguistics, 19(1), 75-102.
Germann, U., Jahr, M., Knight, K., Marcu, D., & Yamada, K. (2001). Fast decoding and optimal decoding for machine translation. In Proc. of the 39th Annual Meeting on ACL, pp. 228-235, Toulouse, France.
Giampiccolo, D., Dang, H., Magnini, B., Dagan, I., & Dolan, B. (2008). The fourth PASCAL recognizing textual entailment challenge. In Proc. of the Text Analysis Conference, pp. 1-9, Gaithersburg, MD.
Giampiccolo, D., Magnini, B., Dagan, I., & Dolan, B. (2007). The third PASCAL recognizing textual entailment challenge. In Proc. of the ACL-Pascal Workshop on Textual Entailment and Paraphrasing, pp. 1-9, Prague, Czech Republic.
Glickman, O., & Dagan, I. (2004). Acquiring lexical paraphrases from a single corpus. In Nicolov, N., Bontcheva, K., Angelova, G., & Mitkov, R. (Eds.), Recent Advances in Nat. Lang. Processing III, pp. 81-90. John Benjamins.
Grishman, R. (2003). Information extraction. In Mitkov, R. (Ed.), The Oxford Handbook of Comp. Linguistics, chap. 30, pp. 545-559. Oxford University Press.
Habash, N., & Dorr, B. (2003). A categorial variation database for english. In Proc. of the HLT Conf. of NAACL, pp. 17-23, Edmonton, Canada.
Haghighi, A. D. (2005). Robust textual inference via graph matching. In Proc. of the Conf. on EMNLP, pp. 387-394, Vancouver, BC, Canada.
Harabagiu, S., & Hickl, A. (2006). Methods for using textual entailment in open-domain question answering. In Proc. of the 21st Int. Conf. on Comp. Linguistics and the 44th Annual Meeting of ACL, pp. 905-912, Sydney, Australia.
Harabagiu, S., Hickl, A., & Lacatusu, F. (2006). Negation, contrast and contradiction in text processing. In Proc. of the 21st National Conf. on Artificial Intelligence, pp. 755-762, Boston, MA.
Harabagiu, S., & Moldovan, D. (2003). Question answering. In Mitkov, R. (Ed.), The Oxford Handbook of Comp. Linguistics, chap. 31, pp. 560-582. Oxford University Press.
Harabagiu, S. M., Maiorano, S. J., & Pasca, M. A. (2003). Open-domain textual question answering techniques. Nat. Lang. Engineering, 9(3), 231-267.
Harmeling, S. (2009). Inferring textual entailment with a probabilistically sound calculus. Nat. Lang. Engineering, 15(4), 459-477.
Harris, Z. (1964). Distributional Structure. In Katz, J., & Fodor, J. (Eds.), The Philosphy of Linguistics , pp. 33-49. Oxford University Press.
Hashimoto, C., Torisawa, K., Kuroda, K., De Saeger, S., Murata, M., & Kazama, J. (2009). Large-scale verb entailment acquisition from the Web. In Proc. of the Conf. on EMNLP, pp. 1172- 1181, Singapore.
Hearst, M. (1998). Automated discovery of Wordnet relations. In Fellbaum, C. (Ed.), WordNet: An Electronic Lexical Database. MIT Press.
Herbelot, A. (2009). Finding word substitutions using a distributional similarity baseline and immediate context overlap. In Proc. of the Student Research Workshop of the 12th Conf. of EACL, pp. 28-36, Athens, Greece.
Hickl, A. (2008). Using discourse commitments to recognize textual entailment. In Proc. of the 22nd Int. Conf. on Comp. Linguistics, pp. 337-344, Manchester, UK.
Hobbs, J. (1986). Resolving pronoun references. In Readings in Nat. Lang. Processing, pp. 339- 352. Morgan Kaufmann.
Hovy, E. (2003). Text summarization. In Mitkov, R. (Ed.), The Oxford Handbook of Comp. Linguistics , chap. 32, pp. 583-598. Oxford University Press.
Huffman, S. (1995). Learning information extraction patterns from examples. In Proc. of the IJCAI Workshop on New Approaches to Learning for Nat. Lang. Processing, pp. 127-142, Montreal, Quebec, Canada.
Ibrahim, A., Katz, B., & Lin, J. (2003). Extracting structural paraphrases from aligned monolingual corpora. In Proc. of the ACL Workshop on Paraphrasing, pp. 57-64, Sapporo, Japan.
Iftene, A. (2008). UAIC participation at RTE4. In Proc. of the Text Analysis Conference, Gaithersburg, MD.
Iftene, A., & Balahur-Dobrescu, A. (2007). Hypothesis transformation and semantic variability rules used in recognizing textual entailment. In Proc. of the ACL-PASCAL Workshop on Textual Entailment and Paraphrasing, pp. 125-130, Prague, Czech Republic.
Inui, K., & Hermjakob, U. (Eds.). (2003). Proc. of the 2nd Int. Workshop on Paraphrasing: Paraphrase Acquisition and Applications. Sapporo, Japan.
Jacquemin, C., & Bourigault, D. (2003). Term extraction and automatic indexing. In Mitkov, R. (Ed.), The Oxford Handbook of Comp. Linguistics, chap. 33, pp. 599-615. Oxford University Press.
Joachims, T. (2002). Learning to Classify Text Using Support Vector Machines: Methods, Theory, Algorithms. Kluwer.
Jurafsky, D., & Martin, J. H. (2008). Speech and Language Processing (2nd edition). Prentice Hall.
Kauchak, D., & Barzilay, R. (2006). Paraphrasing for automatic evaluation. In Proc. of the HLT Conf. of NAACL, pp. 455-462, New York, NY.
Klein, D., & Manning, C. D. (2003). Accurate unlexicalized parsing. In Proc. of the 41st Annual Meeting of ACL, pp. 423-430, Sapporo, Japan.
Knight, K., & Marcu, D. (2002). Summarization beyond sentence extraction: A probalistic approach to sentence compression. Artificial Intelligence, 139(1), 91-107.
Koehn, P. (2004). Pharaoh: a beam search decoder for phrase-based statistical machine translation models. In Proc. of the 6th Conf. of the Association for Machine Translation in the Americas, pp. 115-124, Washington, DC.
Koehn, P. (2009). Statistical Machine Translation. Cambridge University Press.
Koehn, P., Och, F. J., & Marcu, D. (2003). Statistical phrase-based translation. In Proc. of the HLT Conf. of NAACL, pp. 48-54, Edmonton, Canada. ACL.
Kohomban, U., & Lee, W. (2005). Learning semantic classes for word sense disambiguation. In Proc. of the 43rd Annual Meeting of ACL, pp. 34-41, Ann Arbor, MI.
Kouylekov, M., & Magnini, B. (2005). Recognizing textual entailment with tree edit distance algorithms. In Proc. of the PASCAL Recognising Textual Entailment Challenge.
Kubler, S., McDonald, R., & Nivre, J. (2009). Dependency Parsing. Synthesis Lectures on HLT. Morgan and Claypool Publishers.
Lappin, S., & Leass, H. (1994). An algorithm for pronominal anaphora resolution. Comp. Linguistics, 20(4), 535-561.
Leacock, C., Miller, G., & Chodorow, M. (1998). Using corpus statistics and WordNet relations for sense identification. Comp. Linguistics, 24(1), 147-165.
Lepage, Y., & Denoual, E. (2005). Automatic generation of paraphrases to be used as translation references in objective evaluation measures of machine translation. In Proc. of the 3rd Int. Workshop on Paraphrasing, pp. 57-64, Jesu Island, Korea.
Levenshtein, V. (1966). Binary codes capable of correcting deletions, insertions, and reversals. Soviet Physice-Doklady, 10, 707-710.
Lin, D. (1994). PRINCIPAR: an efficient, broad-coverage, principle-based parser. In Proc. of the 15th Conf. on Comp. Linguistics, pp. 482-488, Kyoto, Japan. ACL.
Lin, D. (1998a). Automatic retrieval and clustering of similar words. In Proc. of the the 36th Annual Meeting of ACL and 17th Int. Conf. on Comp. Linguistics, pp. 768-774, Montreal, Quebec, Canada.
Lin, D. (1998b). An information-theoretic definition of similarity. In Proc. of the 15th Int. Conf. on Machine Learning, pp. 296-304, Madison, WI. Morgan Kaufmann, San Francisco, CA.
Lin, D. (1998c). An information-theoretic definition of similarity. In Proc. of the 15th Int. Conf. on Machine Learning, pp. 296-304, Madison, WI.
Lin, D., & Pantel, P. (2001). Discovery of inference rules for question answering. Nat. Lang. Engineering, 7, 343-360.
Lonneker-Rodman, B., & Baker, C. (2009). The FrameNet model and its applications. Nat. Lang. Engineering, 15(3), 414-453.
MacCartney, B., Galley, M., & Manning, C. (2008). A phrase-based alignment model for natural language inference. In Proc. of the Conf. on EMNLP, pp. 802-811, Honolulu, Hawaii.
MacCartney, B., & Manning, C. (2009). An extended model of natural logic. In Proc. of the 8th Int. Conf. on Computational Semantics, pp. 140-156, Tilburg, The Netherlands.
Madnani, N., Ayan, F., Resnik, P., & Dorr, B. J. (2007). Using paraphrases for parameter tuning in statistical machine translation. In Proc. of 2nd Workshop on Statistical Machine Translation, pp. 120-127, Prague, Czech Republic.
Malakasiotis, P. (2009). Paraphrase recognition using machine learning to combine similarity measures. In Proc. of the 47th Annual Meeting of ACL and the 4th Int. Joint Conf. on Nat. Lang. Processing of AFNLP, Singapore.
Malakasiotis, P., & Androutsopoulos, I. (2007). Learning textual entailment using SVMs and string similarity measures. In Proc. of the ACL-PASCAL Workshop on Textual Entailment and Paraphrasing , pp. 42-47, Prague. ACL.
Mani, I. (2001). Automatic Summarization. John Benjamins.
Manning, C. D. (2008). Introduction to Information Retrieval. Cambridge University Press.
Manning, C. D., & Schuetze, H. (1999). Foundations of Statistical Natural Language Processing. MIT press.
Màrquez, L., Carreras, X., Litkowski, K. C., & Stevenson, S. (2008). Semantic role labeling: an introduction to the special issue. Comp. Linguistics, 34(2), 145-159.
Marton, Y., Callison-Burch, C., & Resnik, P. (2009). Improved statistical machine translation using monolingually-derived paraphrases. In Proc. of Conf. on EMNLP, pp. 381-390, Singapore.
McCarthy, D., & Navigli, R. (2009). The English lexical substitution task. Lang. Resources & Evaluation, 43, 139-159.
McDonald, R. (2006). Discriminative sentence compression with soft syntactic constraints. In Proc. of the 11th Conf. of EACL, pp. 297-304, Trento, Italy.
McKeown, K. (1983). Paraphrasing questions using given and new information. Comp. Linguistics, 9(1).
Mehdad, Y. (2009). Automatic cost estimation for tree edit distance using particle swarm optimization. In Proc. of the 47th Annual Meeting of ACL and the 4th Int. Joint Conf. on Nat. Lang. Processing of AFNLP, pp. 289-292, Singapore.
Melamed, D. (1999). Bitext maps and alignment via pattern recognition. Comp. Linguistics, 25(1), 107-130.
Melcuk, I. (1987). Dependency Syntax: Theory and Practice. State University of New York Press.
Meyers, A., Macleod, C., Yangarber, R., Grishman, R., Barrett, L., & Reeves, R. (1998). Using NOMLEX to produce nominalization patterns for information extraction. In Proc. of the COLING-ACL workshop on the Computational Treatment of Nominals, Montreal, Quebec, Canada.
Mintz, M., Bills, S., Snow, R., & Jurafsky, D. (2009). Distant supervision for relation extraction without labeled data. In Proc. of the 47th Annual Meeting of ACL and the 4th Int. Joint Conf. on Nat. Lang. Processing of AFNLP, pp. 1003-1011, Singapore.
Mirkin, S., Dagan, I., & Shnarch, E. (2009a). Evaluating the inferential utility of lexical-semantic resources. In Proc. of the 12th Conf. of EACL, pp. 558-566, Athens, Greece.
Mirkin, S., Specia, L., Cancedda, N., Dagan, I., Dymetman, M., & Szpektor, I. (2009b). Source-language entailment modeling for translating unknown terms. In Proc. of the 47th Annual Meeting of ACL and the 4th Int. Joint Conf. on Nat. Lang. Processing of AFNLP, pp. 791- 799, Singapore.
Mitchell, J., & Lapata, M. (2008). Vector-based models of semantic composition. In Proc. of the 46th Annual Meeting of ACL: HLT, pp. 236-244, Columbus, OH.
Mitchell, T. (1997). Machine Learning. Mc-Graw Hill.
Mitkov, R. (2003). Anaphora resolution. In Mitkov, R. (Ed.), The Oxford Handbook of Comp. Linguistics, chap. 14, pp. 266-283. Oxford University Press.
Moens, M. (2006). Information Extraction: Algorithms and Prospects in a Retrieval Context. Springer.
Moldovan, D., & Rus, V. (2001). Logic form transformation of WordNet and its applicability to question answering. In Proc. of the 39th Annual Meeting of ACL, pp. 402-409, Toulouse, France.
Mollá, D., Schwitter, R., Rinaldi, F., Dowdall, J., & Hess, M. (2003). Anaphora resolution in EX-TRANS. In Proc. of the Int. Symposium on Reference Resolution and Its Applications to Question Answering and Summarization, pp. 23-25, Venice, Italy.
Mollá D., & Vicedo, J. (2007). Question answering in restricted domains: An overview. Comp. Linguistics, 33(1), 41-61.
Moore, R. C. (2001). Towards a simple and accurate statistical approach to learning translation relationships among words. In Proc. of the ACL Workshop on Data-Driven Machine Translation, Toulouse, France.
Moschitti, A. (2009). Syntactic and semantic kernels for short text pair categorization. In Proc. of the 12th Conf. of EACL, pp. 576-584, Athens, Greece.
Munteanu, D. S., & Marcu, D. (2006). Improving machine translation performance by exploiting non-parallel corpora. Comp. Linguistics, 31(4), 477-504.
Muslea, I. (1999). Extraction patterns for information extraction tasks: a survey. In Proc. of the AAAI Workshop on Machine Learning for Information Extraction, Orlando, FL.
Navigli, R. (2008). A structural approach to the automatic adjudication of word sense disagreements. Nat. Lang. Engineering, 14(4), 547-573.
Nelken, R., & Shieber, S. M. (2006). Towards robust context-sensitive sentence alignment for monolingual corpora. In Proc. of the 11th Conf. of EACL, pp. 161-168, Trento, Italy.
Nielsen, R., Ward, W., & Martin, J. (2009). Recognizing entailment in intelligent tutoring systems. Nat. Lang. Engineering, 15(4), 479-501.
Nivre, J., Hall, J., Nilsson, J., Chanev, A., Eryigit, G., Kuebler, S., Marinov, S., & Marsi, E. (2007). MALTPARSER: a language-independent system for data-driven dependency parsing. Nat. Lang. Engineering, 13(2), 95-135.
Och, F. J., & Ney, H. (2003). A systematic comparison of various stat. alignment models. Comp. Ling., 29(1), 19-21.
O'Donnell, M., Mellish, C., Oberlander, J., & Knott, A. (2001). ILEX: An architecture for a dynamic hypertext generation system. Nat. Lang. Engineering, 7(3), 225-250.
padó, S., Galley, M., Jurafsky, D., & Manning, C. D. (2009). Robust machine translation evaluation with entailment features. In Proc. of the 47th Annual Meeting of ACL and the 4th Int. Joint Conf. on Nat. Lang. Processing of AFNLP, pp. 297-305, Singapore.
Padó, S., & Lapata, M. (2007). Dependency-based construction of semantic space models. Comp. Ling., 33(2), 161-199.
Palmer, M., Gildea, D., & Kingsbury, P. (2005). The Propositional Bank: an annotated corpus of semantic roles. Comp. Linguistics, 31(1), 71-105.
Pang, B., Knight, K., & Marcu, D. (2003). Syntax-based alignment of multiple translations: extracting paraphrases and generating new sentences. In Proc. of the Human Lang. Techn. Conf. of NAACL, pp. 102-109, Edmonton, Canada.
Pantel, P., Bhagat, R., Coppola, B., Chklovski, T., & Hovy, E. H. (2007). ISP: Learning inferential selectional preferences. In Proc. of the HLT Conf. of NAACL, pp. 564-571, Rochester, NY.
Papineni, K., Roukos, S.,Ward, T., & Zhu,W. J. (2002). BLEU: a method for automatic evaluation of machine translation. In Proc. of the 40th Annual Meeting on ACL, pp. 311-318, Philadelphia, PA.
Pasca, M. (2003). Open-domain question answering from large text collections(2nd edition). Center for the Study of Language and Information.
Pasca, M., & Dienes, P. (2005). Aligning needles in a haystack: Paraphrase acquisition across the Web. In Proc. of the 2nd Int. Joint Conf. on Nat. Lang. Processing, pp. 119-130, Jeju Island, Korea.
Perez, D., & Alfonseca, E. (2005). Application of the BLEU algorithm for recognizing textual entailments. In Proc. of the PASCAL Challenges Worshop on Recognising Textual Entailment, Southampton, UK.
Porter, M. F. (1997). An algorithm for suffix stripping. In Jones, K. S., & Willet, P. (Eds.), Readings in Information Retrieval, pp. 313-316. Morgan Kaufmann.
Power, R., & Scott, D. (2005). Automatic generation of large-scale paraphrases. In Proc. of the 3rd Int. Workshop on Paraphrasing, pp. 73-79, Jesu Island, Korea.
Qiu, L., Kan, M. Y., & Chua, T. (2006). Paraphrase recognition via dissimilarity significance classification. In Proc. of the Conf. on EMNLP, pp. 18-26, Sydney, Australia.
Quinlan, J. R. (1993). C4.5: Programs for Machine Learning. Morgan Kaufmann.
Quirk, C., Brockett, C., & Dolan, W. B. (2004). Monolingual machine translation for paraphrase generation. In Proc. of the Conf. on EMNLP, pp. 142-149, Barcelona, Spain.
Ravichandran, D., & Hovy, E. (2002). Learning surface text patterns for a question answering system. In Proc. of the 40th Annual Meeting on ACL, pp. 41-47, Philadelphia, PA.
Ravichandran, D., Ittycheriah, A., & Roukos, S. (2003). Automatic derivation of surface text patterns for a maximum entropy based question answering system. In Proc. of the HLT Conf. of NAACL, pp. 85-87, Edmonton, Canada.
Reiter, E., & Dale, R. (2000). Building Natural Language Generation Systems. Cambridge University Press.
Resnik, P. (1999). Semantic similarity in a taxonomy: An information-based measure and its application to problems of ambiguity in natural language. Journal of Artificial Intelligence Research, 11, 95-130.
Riezler, S., Vasserman, A., Tsochantaridis, I., Mittal, V., & Liu, Y. (2007). Statistical machine translation for query expansion in answer retrieval. In Proc. of the 45th Annual Meeting of ACL, pp. 464-471, Prague, Czech Republic.
Riloff, E. (1996a). Automatically generating extraction patterns from untagged text. In Proc. of the 13th National Conf. on Artificial Intelligence, pp. 1044-1049, Portland, OR.
Riloff, E. (1996b). An empirical study of automated dictionary construction for information extraction in three domains. Artificial Intelligence, 85(1-2), 101-134.
Riloff, E., & Jones, R. (1999). Learning dictionaries for information extraction by multi-level bootstrapping. In Proc. of the 16th National Conf. on Artificial Intelligence, pp. 474-479, Orlando, FL.
Rinaldi, F., Dowdall, J., Kaljurand, K., Hess, M., & Molla, D. (2003). Exploiting paraphrases in a question answering system. In Proc. of the 2nd Int. Workshop in Paraphrasing, pp. 25-32, Saporo, Japan.
Sato, S., & Nakagawa, H. (Eds.). (2001). Proc. of the Workshop on Automatic Paraphrasing. Tokyo, Japan.
Schohn, G., & Cohn, D. (2000). Less is more: active learning with Support Vector Machines. In Proc. of the 17th Int. Conf. on Machine Learning, pp. 839-846, Stanford, CA.
Schuler, K. K. (2005). VerbNet: A Broad-Coverage, Comprehensive Verb Lexicon. Ph.D. thesis, Univ. of Pennsylvania.
Sekine, S., Inui, K., Dagan, I., Dolan, B., Giampiccolo, D., & Magnini, B. (Eds.). (2007). Proc. of the ACL-PASCAL Workshop on Textual Entailment and Paraphrasing. Prague. Czech Republic.
Sekine, S., & Ranchhod, E. (Eds.). (2009). Named Entities - Recognition, Classification and Use. John Benjamins.
Selkow, S. (1977). The tree-to-tree editing problem. Information Processing Letters, 6(6), 184-186.
Shinyama, Y., & Sekine, S. (2003). Paraphrase acquisition for information extraction. In Proc. of the ACL Workshop on Paraphrasing, Sapporo, Japan.
Siblini, R., & Kosseim, L. (2008). Using ontology alignment for the TAC RTE challenge. In Proc. of the Text Analysis Conference, Gaithersburg, MD.
Sleator, D. D., & Temperley, D. (1993). Parsing English with a link grammar. In Proc. of the 3rd Int. Workshop on Parsing Technologies, pp. 277-292, Tilburg, Netherlands and Durbuy, Belgium.
Soderland, S. (1999). Learning information extraction rules for semi-structured and free text. Mach. Learning, 34(1-3), 233-272.
Soderland, S., Fisher, D., Aseltine, J., & Lehnert, W. G. (1995). CRYSTAL: Inducing a conceptual dictionary. In Proc. of the 14th Int. Joint Conf. on Artificial Intelligence, pp. 1314-1319, Montreal, Quebec, Canada.
Stevenson, M., & Wilks, Y. (2003). Word sense disambiguation. In Mitkov, R. (Ed.), The Oxford Handbook of Comp. Linguistics, chap. 13, pp. 249-265. Oxford University Press.
Stolcke, A. (2002). SRILM - an extensible language modeling toolkit. In Proc. of the 7th Int. Conf. on Spoken Language Processing, pp. 901-904, Denver, CO.
Szpektor, I., & Dagan, I. (2007). Learning canonical forms of entailment rules. In Proc. of Recent Advances in Natural Lang. Processing, Borovets, Bulgaria.
Szpektor, I., & Dagan, I. (2008). Learning entailment rules for unary templates. In Proc. of the 22nd Int. Conf. on Comp. Linguistics, pp. 849-856, Manchester, UK.
Szpektor, I., Dagan, I., Bar-Haim, R., & Goldberger, J. (2008). Contextual preferences. In Proc. of the 46th Annual Meeting of ACL: HLT, pp. 683-691, Columbus, OH.
Szpektor, I., Shnarch, E., & Dagan, I. (2007). Instance-based evaluation of entailment rule acquisition. In Proc. of the 45th Annual Meeting of ACL, pp. 456-463, Prague, Czech Republic.
Szpektor, I., Tanev, H., Dagan, I., & Coppola, B. (2004). Scaling Web-based acquisition of entailment relations. In Proc. of the Conf. on EMNLP, Barcelona, Spain.
Tai, K.-C. (1979). The tree-to-tree correction problem. Journal of the ACM, 26(3), 422-433.
Tatu, M., Iles, B., Slavick, J., Novischi, A., & Moldovan, D. (2006). COGEX at the second recognizing textual entailment challenge. In Proc. of the 2nd PASCAL Challenges Workshop on Recognising Textual Entailment, Venice, Italy.
Tatu, M., & Moldovan, D. (2005). A semantic approach to recognizing textual entailment. In Proc. of the Conf. on HLT and EMNLP, pp. 371-378, Vancouver, Canada.
Tatu, M., & Moldovan, D. (2007). COGEX at RTE 3. In Proc. of the ACL-PASCAL Workshop on Textual Entailment and Paraphrasing, pp. 22-27, Prague, Czech Republic.
Tomuro, N. (2003). Interrogative reformulation patterns and acquisition of question paraphrases. In Proc. of the 2nd Int. Workshop on Paraphrasing, pp. 33-40, Sapporo, Japan.
Tong, S., & Koller, D. (2002). Support Vector Machine active learning with applications to text classification. Journal of Machine Learning Research, 2, 45-66.
Toutanova, K., Klein, D., Manning, C. D., & Singer, Y. (2003). Feature-rich part-of-speech tagging with a cyclic dependency network. In Proc. of the HLT Conf. of NAACL, pp. 173-180, Edmonton, Canada.
Tsatsaronis, G. (2009). Word Sense Disambiguation and Text Relatedness Based on Word Thesauri. Ph.D. thesis, Department of Informatics, Athens University of Economics and Business.
Tsatsaronis, G., Varlamis, I., & Vazirgiannis, M. (2010). Text relatedness based on a word thesaurus. Journal of Artificial Intelligence Research, 37, 1-39.
Turney, P., & Pantel, P. (2010). From frequency to meaning: Vector space models of semantics. Journal of Artificial Intelligence Research, 37, 141-188.
Vapnik, V. (1998). Statistical learning theory. John Wiley.
Vendler, Z. (1967). Verbs and Times. In Linguistics in Philosophy, chap. 4, pp. 97-121. Cornell University Press.
Vogel, S., Ney, H., & Tillmann, C. (1996). HMM-based word alignment in statistical translation. In Proc. of the 16th Conf. on Comp. Linguistics, pp. 836-841, Copenhagen, Denmark.
Voorhees, E. (2001). The TREC QA track. Nat. Lang. Engineering, 7(4), 361-378.
Voorhees, E. (2008). Contradictions and justifications: Extensions to the textual entailment task. In Proc. of the 46th Annual Meeting of ACL: HLT, pp. 63-71, Columbus, OH.
Wan, S., Dras, M., Dale, R., & Paris, C. (2006). Using dependency-based features to take the "parafarce" out of paraphrase. In Proc. of the Australasian Language Technology Workshop, pp. 131-138, Sydney, Australia.
Wang, R., & Neumann, G. (2008). An divide-and-conquer strategy for recognizing textual entailment. In Proc. of the Text Analysis Conference, Gaithersburg, MD.
Wang, X., Lo, D., Jiang, J., Zhang, L., & Mei, H. (2009). Extracting paraphrases of technical terms from noisy parallel software corpora. In Proc. of the 47th Annual Meeting of ACL and the 4th Int. Joint Conf. on Nat. Lang. Processing of AFNLP, pp. 197-200, Singapore.
Witten, I. H., & Frank, E. (2005). Data Mining: Practical Machine Learning Tools and Techniques. Morgan Kaufmann.
Wu, D. (2000). Alignment. In Dale, R., Moisl, H., & Somers, H. (Eds.), Handbook of Nat. Lang. Processing, pp. 415-458. Marcel Dekker.
Wubben, S., van den Bosch, A., Krahmer, E., & Marsi, E. (2009). Clustering and matching headlines for automatic paraphrase acquisition. In Proc. of the 12th European Workshop on Nat. Lang. Generation, pp. 122-125, Athens, Greece.
Xu, F., Uszkoreit, H., & Li, H. (2007). A seed-driven bottom-up machine learning framework for extracting relations of various complexity. In Proc. of the 45th Annual Meeting of the Association of Comp. Linguistics, pp. 584-591, Prague, Czech Republic.
Yang, X., Su, J., & Tan, C. L. (2008). A twin-candidate model for learning-based anaphora resolution. Comp. Linguistics, 34(3), 327-356.
Yarowsky, D. (2000). Word-sense disambiguation. In Dale, R., Moisl, H., & Somers, H. (Eds.), Handbook of Nat. Lang. Processing, pp. 629-654. Marcel Dekker.
Zaenen, A., Karttunen, L., & Crouch, R. (2005). Local textual inference: Can it be defined or circumscribed? In Proc. of the ACL Workshop on Empirical Modeling of Semantic Equivalence and Entailment, pp. 31-36, Ann Arbor, MI.
Zanzotto, F. M., & Dell'Arciprete, L. (2009). Efficient kernels for sentence pair classification. In Proc. of the Conf. on EMNLP, pp. 91-100, Singapore.
Zanzotto, F. M., Pennacchiotti, M., & Moschitti, A. (2009). A machine-learning approach to textual entailment recognition. Nat. Lang. Engineering, 15(4), 551-582.
Zhang, K., & Shasha, D. (1989). Simple fast algorithms for the editing distance between trees and related problems. SIAM Journal on Computing, 18(6), 1245-1262.
Zhang, Y., & Patrick, J. (2005). Paraphrase identification by text canonicalization. In Proc. of the Australasian Language Technology Workshop, pp. 160-166, Sydney, Australia.
Zhang, Y., & Yamamoto, K. (2005). Paraphrasing spoken Chinese using a paraphrase corpus. Nat. Lang. Engineering, 11(4), 417-434.
Zhao, S., Lan, X., Liu, T., & Li, S. (2009). Application-driven statistical paraphrase generation. In Proc. of the 47th Annual Meeting of ACL and the 4th Int. Joint Conf. on Nat. Lang. Processing of AFNLP, pp. 834-842, Singapore.
Zhao, S., Wang, H., Liu, T., & Li, S. (2008). Pivot approach for extracting paraphrase patterns from bilingual corpora. In Proc. of the 46th Annual Meeting of ACL: HLT, pp. 780-788, Columbus, OH.
Zhitomirsky-Geffet, M., & Dagan, I. (2009). Bootstrapping distributional feature vector quality. Computational Linguistics, 35, 435-461.
Zhou, L., Lin, C.-Y., & Hovy, E. (2006a). Re-evaluating machine translation results with paraphrase support. In Proc. of the Conf. on EMNLP, pp. 77-84.
Zhou, L., Lin, C.-Y., Munteanu, D. S., & Hovy, E. (2006b). PARAEVAL: Using paraphrases to evaluate summaries automatically. In Proc. of the HLT Conf. of NAACL, pp. 447-454, New York, NY.

Published In

Journal of Artificial Intelligence Research, Volume 38, Issue 1
May 2010
744 pages

AI Access Foundation, El Segundo, CA, United States
