A Model for Evaluating the Performance of a Multiple Keywords Spotting System for the Transcription of Historical Handwritten Documents
Abstract
1. Introduction
2. The Model
- We are using a segmentation-based KWS system. This requires that the collection of documents we want to transcribe has been segmented to extract images, each containing one word;
- The KWS system provides, for each word image, an output list containing a ranked list of k possible transcriptions;
- nTS images of the data collection have been manually transcribed and used as the training set (TS), so that the number of samples that compose the data set (DS) to transcribe for completing the task is nDS. When a Query-by-Example KWS system is used, the transcriptions are needed so that they can be automatically associated with the images retrieved by the system, while for a Query-by-String KWS system, they are needed to train the system during its supervised learning step. In the following, we will denote with T_TS the time needed to choose the images of TS and to enter their transcriptions;
- Unlike what is customary in the KWS performance evaluation literature, the query list, i.e., the complete list of keywords to spot, is not available; the only available information is obtained by transcribing the training set. Denoting with NDC and NTS the number of keywords, i.e., the number of entries in the vocabulary associated with DS and TS, respectively, this means that NTS is known, because of the manual transcription of the samples in the training set, while NDC is not known.
- and are the recall@k and precision@k of the KWS for the i-th keyword computed on DS;
- niDS is the number of word images of the i-th keyword in DS;
- is the number of correct samples, i.e., the number of word images of DS that are instances of the i-th keyword and whose output list includes that keyword;
- is the number of wrong samples, i.e., the number of word images of DS that are not instances of the i-th keyword, but whose output list includes that keyword;
- is the number of missed samples, i.e., the number of word images in DS that are instances of the i-th keyword, but whose output list does not include that keyword;
- is the number of out-of-vocabulary samples, i.e., the number of word images of DS that are instances of the NDS = NDC − NTS unknown entries of the keywords list of the data set.
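The per-keyword quantities listed above are linked by the usual retrieval definitions: recall is the fraction of a keyword's instances that are spotted, and precision is the fraction of retrieved images that are actual instances. The following is a minimal sketch under the assumption that these standard definitions apply here; the field names `correct`, `wrong`, and `missed` are illustrative and are not the paper's notation, and the numbers in the example are made up.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class KeywordCounts:
    """Per-keyword counts on DS (illustrative names, not the paper's symbols)."""

    correct: int  # instances of the keyword whose output list includes it
    wrong: int    # non-instances whose output list includes the keyword
    missed: int   # instances whose output list does not include it

    @property
    def instances(self) -> int:
        # Every instance of the keyword is either spotted or missed.
        return self.correct + self.missed

    @property
    def recall_at_k(self) -> float:
        # Assumed standard definition: spotted instances / all instances.
        return self.correct / self.instances if self.instances else 0.0

    @property
    def precision_at_k(self) -> float:
        # Assumed standard definition: spotted instances / all retrieved images.
        retrieved = self.correct + self.wrong
        return self.correct / retrieved if retrieved else 0.0


# Made-up counts for a single keyword, only to exercise the definitions.
counts = KeywordCounts(correct=13, wrong=5, missed=7)
print(counts.instances, counts.recall_at_k, counts.precision_at_k)
```

Guarding the divisions keeps the sketch usable for keywords that have no instances in DS or that are never retrieved; returning 0.0 in those cases is a convention chosen here, not something the text specifies.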
2.1. Lexicon-Based Systems
2.2. Lexicon-Free Systems
3. The Model at Work
3.1. Transcription of the Training Data
3.2. Training of the System and Feasibility Check
3.3. Keyword Spotting on the Test Set
3.3.1. Estimating the User Time: Lexicon-Based System
- the distribution of the values of and computed on TTS and DS is similar;
- the distribution of the length of the keywords is similar on each set, and because depends mostly on the number of characters rather than on the actual character of the keyword, it is independent of the actual keyword;
- the values of the model parameters are normally distributed;
- all the samples of the data set are instances of the keywords obtained from the training set, i.e., that .
3.3.2. Estimating the User Time: Lexicon-Free System
3.4. Computing the Gain
4. Model Validation
4.1. The Validation Tool
4.2. Experimental Results
5. Conclusions
Author Contributions
Funding
Conflicts of Interest
nDC | nTS | NTS | nTTS | NTTS | nDS |
---|---|---|---|---|---|
10,733 | 1089 | 354 | 942 | 391 | 8702 |
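The sample counts in the table above are mutually consistent if the data collection is read as being split into three disjoint parts, TS, TTS, and DS. That reading is an assumption made here, and the variable names are illustrative; the short check below only verifies the arithmetic.

```python
# Sample counts copied from the table above.
n_dc, n_ts, n_tts, n_ds = 10_733, 1_089, 942, 8_702

# Assumption: DC is partitioned into TS, TTS and DS with no overlap,
# so what remains after removing TS and TTS should be DS.
remaining = n_dc - n_ts - n_tts
print(remaining == n_ds)

# Share of the data collection taken by each part under that assumption.
shares = {
    name: count / n_dc
    for name, count in (("TS", n_ts), ("TTS", n_tts), ("DS", n_ds))
}
for name, share in shares.items():
    print(f"{name}: {share:.2%}")
```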
TTS | TTTS | TDS | Tman |
---|---|---|---|
6240 | 5472 | 52,459 | 61,534 |
tv (μ) | tv (σ) | tw (μ) | tw (σ) | tm (μ) | tm (σ) | tMw (μ) | tMw (σ) | rk | pk | noovc | noovw |
---|---|---|---|---|---|---|---|---|---|---|---|
1024 | 359 | 3152 | 1045 | 2543 | 682 | 5903 | 2611 | 0.65 | 0.71 | 39 | 247 |
On TTS | Lexicon-Based: Tuser | Lexicon-Based: G (%) | Lexicon-Free: Tuser | Lexicon-Free: G (%) |
---|---|---|---|---|
estimated | 62:12 | 14.86 | 51:21 | 20.41 |
actual | 64:48 | 13.52 | 61:30 | 15.62 |
On DS | Lexicon-Based: Tuser | Lexicon-Based: G (%) | Lexicon-Free: Tuser | Lexicon-Free: G (%) |
---|---|---|---|---|
estimated | 11:27:37 | 13.91 | 10:30:30 | 19.25 |
actual | 11:31:02 | 12.30 | 11:10:14 | 15.23 |
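To compare the estimated and actual user times in the two tables above on a common scale, the clock strings can be converted to seconds and the signed relative error of each estimate computed. This is a rough sketch, assuming the Tuser entries are in mm:ss format for TTS and hh:mm:ss format for DS; the tables do not state the format, and the helper names `to_seconds` and `relative_error` are hypothetical.

```python
def to_seconds(clock: str) -> int:
    """Convert 'mm:ss' or 'hh:mm:ss' into seconds (format is an assumption)."""
    seconds = 0
    for part in clock.split(":"):
        seconds = seconds * 60 + int(part)
    return seconds


def relative_error(estimated: str, actual: str) -> float:
    """Signed error of the estimate, relative to the measured time."""
    est, act = to_seconds(estimated), to_seconds(actual)
    return (est - act) / act


# (estimated, actual) user-time pairs copied from the two tables above.
t_user = {
    ("TTS", "lexicon-based"): ("62:12", "64:48"),
    ("TTS", "lexicon-free"): ("51:21", "61:30"),
    ("DS", "lexicon-based"): ("11:27:37", "11:31:02"),
    ("DS", "lexicon-free"): ("10:30:30", "11:10:14"),
}

for (dataset, system), (estimated, actual) in t_user.items():
    error = relative_error(estimated, actual)
    print(f"{dataset:>3} {system:<13} {error:+.2%}")
```

Under these assumptions every estimate falls below the measured time, and on both sets the lexicon-free estimate is further from the measurement than the lexicon-based one.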
© 2020 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).
Marcelli, A.; De Gregorio, G.; Santoro, A. A Model for Evaluating the Performance of a Multiple Keywords Spotting System for the Transcription of Historical Handwritten Documents. J. Imaging 2020, 6, 117. https://doi.org/10.3390/jimaging6110117