Abstract
We report on the design of a system for correcting spelling errors that result in non-existent words. The system aims to improve the editing of medical reports. Unlike traditional systems, it takes both semantic and syntactic context into account. The system is organized into three modules. The first module is based on a context-independent string-to-string edit distance computation. The second module, based on the morpho-syntactic context, attempts to rank the candidate set provided by the first module more relevantly. Finally, a third contextual module processes words with the same part of speech by applying contextual word-sense disambiguation. Modules 2 and 3 use both hand-written rules and data-driven Markovian matrices. A final evaluation shows a significant improvement over context-free spelling correction.
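As a rough illustration of what the first, context-independent module might look like, the sketch below ranks lexicon entries by plain Levenshtein distance to a non-word. This is a minimal sketch under stated assumptions, not the authors' implementation: the abstract does not specify the cost model, so uniform costs are assumed, and the names `edit_distance`, `rank_candidates`, `max_distance`, and the toy lexicon are hypothetical.

```python
from typing import Iterable, List


def edit_distance(source: str, target: str) -> int:
    """Levenshtein distance with uniform costs (an assumption, not the paper's model)."""
    # previous[j] holds the distance between the processed prefix of source
    # and the first j characters of target.
    previous = list(range(len(target) + 1))
    for i, s_char in enumerate(source, start=1):
        current = [i]
        for j, t_char in enumerate(target, start=1):
            cost = 0 if s_char == t_char else 1
            current.append(min(
                previous[j] + 1,         # delete s_char
                current[j - 1] + 1,      # insert t_char
                previous[j - 1] + cost,  # substitute, or match at no cost
            ))
        previous = current
    return previous[-1]


def rank_candidates(misspelling: str, lexicon: Iterable[str],
                    max_distance: int = 2) -> List[str]:
    """Return lexicon words within max_distance, closest first (ties alphabetical)."""
    scored = sorted((edit_distance(misspelling, word), word) for word in lexicon)
    return [word for distance, word in scored if distance <= max_distance]


if __name__ == "__main__":
    # Hypothetical toy lexicon; a real system would use a medical dictionary.
    toy_lexicon = ["hepatitis", "hepatic", "heparin"]
    print(rank_candidates("hepatitus", toy_lexicon))
```

With the toy lexicon above, `rank_candidates("hepatitus", toy_lexicon)` returns `["hepatitis"]`. In the pipeline the abstract describes, such a candidate list would then be re-ranked by the morpho-syntactic and word-sense modules, which this sketch does not attempt to reproduce.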
Copyright information
© 2001 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Ruch, P., Baud, R., Geissbühler, A., Lovis, C., Rassinoux, AM., Rivière, A. (2001). Using Part-of-Speech and Word-Sense Disambiguation for Boosting String-Edit Distance Spelling Correction. In: Quaglini, S., Barahona, P., Andreassen, S. (eds) Artificial Intelligence in Medicine. AIME 2001. Lecture Notes in Computer Science, vol 2101. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-48229-6_36
DOI: https://doi.org/10.1007/3-540-48229-6_36
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-42294-5
Online ISBN: 978-3-540-48229-1
eBook Packages: Springer Book Archive