Abstract
We discuss the nature and scope of the linguistic (morphological, syntactic, and semantic) variation of terms and its impact on two information retrieval tasks: term acquisition and automatic indexing. We review existing natural language processing techniques in these two areas and present in depth FASTR, a corpus processor for the recognition, normalization, and acquisition of multi-word terms.
Copyright information
© 2003 Springer-Verlag Berlin Heidelberg
About this chapter
Cite this chapter
Savary, A., Jacquemin, C. (2003). Reducing Information Variation in Text. In: Renals, S., Grefenstette, G. (eds) Text- and Speech-Triggered Information Access. Lecture Notes in Computer Science, vol 2705. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-45115-0_6
DOI: https://doi.org/10.1007/978-3-540-45115-0_6
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-40635-8
Online ISBN: 978-3-540-45115-0
eBook Packages: Springer Book Archive