Abstract
To support pattern recognition tasks, we aim to learn an unbiased stochastic edit distance, in the form of a finite-state transducer, from a corpus of (input, output) pairs of strings. Contrary to state-of-the-art methods, we learn a transducer independently of the marginal probability distribution of the input strings. Proceeding in this unbiased way requires optimizing the parameters of a conditional transducer instead of a joint one. Such a transducer can be very useful in pattern recognition, particularly in the presence of noisy data. Two types of experiments are carried out in this article. The first shows that our algorithm is able to correctly assess simulated theoretical target distributions. The second shows its practical interest in a handwritten character recognition task, in comparison with a standard edit distance using a priori fixed edit costs.
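To make the contrast in the abstract concrete, the following minimal sketch places a standard edit distance with fixed costs next to a conditional stochastic edit distance, taken here as -log p(y | x). It is an illustration under stated assumptions, not the authors' implementation: it assumes a memoryless (single-state) conditional transducer, a particular normalization of the edit probabilities, and hand-picked toy parameter values. All function and parameter names are hypothetical, and the learning step that would estimate the parameters from (input, output) pairs is not shown.

```python
import math

def levenshtein(x, y, ins=1.0, dele=1.0, sub=1.0):
    """Baseline: edit distance with a priori fixed edit costs."""
    prev = [j * ins for j in range(len(y) + 1)]
    for i in range(1, len(x) + 1):
        cur = [i * dele]
        for j in range(1, len(y) + 1):
            match_cost = 0.0 if x[i - 1] == y[j - 1] else sub
            cur.append(min(prev[j] + dele,            # delete x[i-1]
                           cur[j - 1] + ins,          # insert y[j-1]
                           prev[j - 1] + match_cost)) # substitute or match
        prev = cur
    return prev[len(y)]

def conditional_prob(x, y, p_sub, p_del, p_ins, p_end):
    """Forward pass for p(y | x) under an ASSUMED memoryless conditional model.

    p_sub[(a, b)]: probability of emitting b while consuming input symbol a
    p_del[a]:      probability of consuming a and emitting nothing
    p_ins[b]:      probability of emitting b without consuming input
    p_end:         probability of stopping once the input is exhausted
    Assumed normalization: for every input symbol a,
      sum_b p_ins[b] + p_del[a] + sum_b p_sub[(a, b)] == 1,
    and sum_b p_ins[b] + p_end == 1.
    """
    n, m = len(x), len(y)
    alpha = [[0.0] * (m + 1) for _ in range(n + 1)]
    alpha[0][0] = 1.0
    for i in range(n + 1):
        for j in range(m + 1):
            if i == 0 and j == 0:
                continue
            total = 0.0
            if j > 0:  # insertion of y[j-1]
                total += alpha[i][j - 1] * p_ins.get(y[j - 1], 0.0)
            if i > 0:  # deletion of x[i-1]
                total += alpha[i - 1][j] * p_del.get(x[i - 1], 0.0)
            if i > 0 and j > 0:  # substitution (or match)
                total += alpha[i - 1][j - 1] * p_sub.get((x[i - 1], y[j - 1]), 0.0)
            alpha[i][j] = total
    return alpha[n][m] * p_end

def stochastic_edit_distance(x, y, params):
    """Dissimilarity defined as -log p(y | x); infinite if p(y | x) is zero."""
    p = conditional_prob(x, y, **params)
    return math.inf if p == 0.0 else -math.log(p)

# Toy parameters over the alphabet {a, b}; the values are illustrative only.
TOY_PARAMS = {
    "p_ins": {"a": 0.05, "b": 0.05},
    "p_del": {"a": 0.1, "b": 0.1},
    "p_sub": {("a", "a"): 0.7, ("a", "b"): 0.1,
              ("b", "b"): 0.7, ("b", "a"): 0.1},
    "p_end": 0.9,
}

if __name__ == "__main__":
    print(levenshtein("ab", "bb"))
    print(stochastic_edit_distance("ab", "ab", TOY_PARAMS))
    print(stochastic_edit_distance("ab", "bb", TOY_PARAMS))
```

Because the parameters are normalized for each input symbol separately, the probabilities p(y | x) sum to one over all output strings y for any fixed x, whatever the distribution of the inputs. This is one way to read the abstract's statement that the model is learned independently of the marginal distribution of the input strings.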
This work was supported in part by the IST Programme of the European Community, under the Pascal Network of Excellence, IST-2002-506778. This publication only reflects the authors’ views.
Copyright information
© 2006 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Oncina, J., Sebban, M. (2006). Using Learned Conditional Distributions as Edit Distance. In: Yeung, DY., Kwok, J.T., Fred, A., Roli, F., de Ridder, D. (eds) Structural, Syntactic, and Statistical Pattern Recognition. SSPR /SPR 2006. Lecture Notes in Computer Science, vol 4109. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11815921_44
DOI: https://doi.org/10.1007/11815921_44
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-37236-3
Online ISBN: 978-3-540-37241-7
eBook Packages: Computer Science, Computer Science (R0)