Location via proxy:   [ UP ]  
[Report a bug]   [Manage cookies]                
Skip to main content

Text Line Segmentation in Historical Newspapers

  • Conference paper
  • First Online:
Artificial Intelligence and Soft Computing (ICAISC 2022)

Part of the book series: Lecture Notes in Computer Science ((LNAI,volume 13589))

Included in the following conference series:

  • 409 Accesses

Abstract

This paper deals with page segmentation into individual text lines used as an input of a line-based OCR system. This task is usually solved in one step which directly identifies text lines in whole documents. However, a direct approach may jeopardize the reading order of the lines and thus deteriorate the overall transcription result.

We propose a novel approach which decomposes this problem into two steps: text-block and text-line segmentation. The particular tasks are handled by algorithms based on fully convolutional neural networks.

The proposed method is evaluated on two standard corpora, Europeana and RDCL 2019, and on a novel dataset created from data available in Porta fontium portal. This dataset is freely available for research purposes.

This is a preview of subscription content, log in via an institution to check access.

Access this chapter

Subscribe and save

Springer+ Basic
$34.99 /Month
  • Get 10 units per month
  • Download Article/Chapter or eBook
  • 1 Unit = 1 Article or 1 Chapter
  • Cancel anytime
Subscribe now

Buy Now

Chapter
USD 29.95
Price excludes VAT (USA)
  • Available as PDF
  • Read on any device
  • Instant download
  • Own it forever
eBook
USD 64.99
Price excludes VAT (USA)
  • Available as EPUB and PDF
  • Read on any device
  • Instant download
  • Own it forever
Softcover Book
USD 84.99
Price excludes VAT (USA)
  • Compact, lightweight edition
  • Dispatched in 3 to 5 business days
  • Free shipping worldwide - see info

Tax calculation will be finalised at checkout

Purchases are for personal use only

Institutional subscriptions

Similar content being viewed by others

Notes

  1. 1.

    http://www.portafontium.cz/.

  2. 2.

    http://ocr-corpus.kiv.zcu.cz/.

  3. 3.

    http://www.portafontium.eu/.

  4. 4.

    https://scriptnet.iit.demokritos.gr/competitions/11/.

  5. 5.

    https://github.com/Transkribus/TranskribusBaseLineEvaluationScheme.

References

  1. Achanta, R., Shaji, A., Smith, K., Lucchi, A., Fua, P., Süsstrunk, S.: Slic superpixels. Technical report (2010)

    Google Scholar 

  2. Breuel, T.M., Ul-Hasan, A., Azawi, M.I.A.A., Shafait, F.: High-performance OCR for printed English and fraktur using LSTM networks. In: 2013 12th International Conference on Document Analysis and Recognition, pp. 683–687 (2013)

    Google Scholar 

  3. Chen, K., Seuret, M., Hennebert, J., Ingold, R.: Convolutional neural networks for page segmentation of historical document images. In: 2017 14th IAPR International Conference on Document Analysis and Recognition (ICDAR), vol. 1, pp. 965–970. IEEE (2017)

    Google Scholar 

  4. Chen, K., Seuret, M., Liwicki, M., Hennebert, J., Ingold, R.: Page segmentation of historical document images with convolutional autoencoders. In: 2015 13th International Conference on Document Analysis and Recognition (ICDAR), pp. 1011–1015. IEEE (2015)

    Google Scholar 

  5. Clausner, C., Papadopoulos, C., Pletschacher, S., Antonacopoulos, A.: The ENP image and ground truth dataset of historical newspapers. In: 2015 13th International Conference on Document Analysis and Recognition (ICDAR), pp. 931–935. IEEE (2015)

    Google Scholar 

  6. Clausner, C., Pletschacher, S., Antonacopoulos, A.: Aletheia-an advanced document layout and text ground-truthing system for production environments. In: 2011 International Conference on Document Analysis and Recognition, pp. 48–52. IEEE (2011)

    Google Scholar 

  7. Clausner, C., Pletschacher, S., Antonacopoulos, A.: Scenario driven in-depth performance evaluation of document layout analysis methods. In: 2011 International Conference on Document Analysis and Recognition, pp. 1404–1408. IEEE (2011)

    Google Scholar 

  8. Diem, M., Kleber, F., Fiel, S., Grüning, T., Gatos, B.: CBAD: ICDAR 2017 competition on baseline detection. In: 2017 14th IAPR International Conference on Document Analysis and Recognition (ICDAR), vol. 1, pp. 1355–1360. IEEE (2017)

    Google Scholar 

  9. Galibert, O., Kahn, J., Oparin, I.: The zonemap metric for page segmentation and area classification in scanned documents. In: 2014 IEEE International Conference on Image Processing (ICIP), pp. 2594–2598. IEEE (2014)

    Google Scholar 

  10. Grüning, T., Labahn, R., Diem, M., Kleber, F., Fiel, S.: Read-bad: a new dataset and evaluation scheme for baseline detection in archival documents. In: 2018 13th IAPR International Workshop on Document Analysis Systems (DAS), pp. 351–356. IEEE (2018)

    Google Scholar 

  11. Grüning, T., Leifert, G., Strauß, T., Labahn, R.: A Two-Stage Method for Text Line Detection in Historical Documents (2019). arxiv.org/abs/1802.03345

  12. Krizhevsky, A., Sutskever, I., Hinton, G.E.: Imagenet classification with deep convolutional neural networks. In: Advances in Neural Information Processing Systems, pp. 1097–1105 (2012)

    Google Scholar 

  13. Lenc, L., Martínek, J., Král, P.: Tools for semi-automatic preparation of training data for OCR. In: MacIntyre, J., Maglogiannis, I., Iliadis, L., Pimenidis, E. (eds.) AIAI 2019. IAICT, vol. 559, pp. 351–361. Springer, Cham (2019). https://doi.org/10.1007/978-3-030-19823-7_29

    Chapter  Google Scholar 

  14. Li, X.H., Yin, F., Xue, T., Liu, L., Ogier, J.M., Liu, C.L.: Instance aware document image segmentation using label pyramid networks and deep watershed transformation. In: 2019 International Conference on Document Analysis and Recognition (ICDAR), pp. 514–519. IEEE (2019)

    Google Scholar 

  15. Long, J., Shelhamer, E., Darrell, T.: Fully convolutional networks for semantic segmentation. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 3431–3440 (2015)

    Google Scholar 

  16. Mechi, O., Mehri, M., Ingold, R., Amara, N.E.B.: Text line segmentation in historical document images using an adaptive U-net architecture. In: 2019 International Conference on Document Analysis and Recognition (ICDAR), pp. 369–374. IEEE (2019)

    Google Scholar 

  17. Novikov, A.A., Lenis, D., Major, D., Hladuvka, J., Wimmer, M., Bühler, K.: Fully convolutional architectures for multiclass segmentation in chest radiographs. IEEE Trans. Med. Imaging 37(8), 1865–1876 (2018)

    Article  Google Scholar 

  18. Pletschacher, S., Antonacopoulos, A.: The page (page analysis and ground-truth elements) format framework. In: 2010 20th International Conference on Pattern Recognition, pp. 257–260. IEEE (2010)

    Google Scholar 

  19. Powers, D.: Evaluation: from precision, recall and F-measure to ROC, informedness, markedness & correlation. J. Mach. Learn. Technol. 2(1), 37–63 (2011)

    Google Scholar 

  20. Rezatofighi, H., Tsoi, N., Gwak, J., Sadeghian, A., Reid, I., Savarese, S.: Generalized intersection over union: a metric and a loss for bounding box regression. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 658–666 (2019)

    Google Scholar 

  21. Ronneberger, O., Fischer, P., Brox, T.: U-Net: convolutional networks for biomedical image segmentation. In: Navab, N., Hornegger, J., Wells, W.M., Frangi, A.F. (eds.) MICCAI 2015. LNCS, vol. 9351, pp. 234–241. Springer, Cham (2015). https://doi.org/10.1007/978-3-319-24574-4_28

    Chapter  Google Scholar 

  22. Sherrah, J.: Fully convolutional networks for dense semantic labelling of high-resolution aerial imagery. arXiv preprint arXiv:1606.02585 (2016)

  23. Shi, B., Bai, X., Yao, C.: An end-to-end trainable neural network for image-based sequence recognition and its application to scene text recognition. IEEE Trans. Pattern Anal. Mach. Intell. 39(11), 2298–2304 (2016)

    Article  Google Scholar 

  24. Simistira, F., Ul-Hassan, A., Papavassiliou, V., Gatos, B., Katsouros, V., Liwicki, M.: Recognition of historical greek polytonic scripts using LSTM networks. In: 2015 13th International Conference on Document Analysis and Recognition (ICDAR), pp. 766–770. IEEE (2015)

    Google Scholar 

  25. Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014)

  26. Wick, C., Puppe, F.: Fully convolutional neural networks for page segmentation of historical document images. In: 2018 13th IAPR International Workshop on Document Analysis Systems (DAS), pp. 287–292. IEEE (2018)

    Google Scholar 

  27. Xu, Y., He, W., Yin, F., Liu, C.L.: Page segmentation for historical handwritten documents using fully convolutional networks. In: 2017 14th IAPR International Conference on Document Analysis and Recognition (ICDAR), vol. 1, pp. 541–546. IEEE (2017)

    Google Scholar 

  28. Zhong, X., Tang, J., Yepes, A.J.: Publaynet: largest dataset ever for document layout analysis. In: 2019 International Conference on Document Analysis and Recognition (ICDAR), pp. 1015–1022. IEEE (2019)

    Google Scholar 

Download references

Acknowledgements

This work has been partly supported from ERDF “Research and Development of Intelligent Components of Advanced Technologies for the Pilsen Metropolitan Area (InteCom)” (no.: CZ.02.1.01/0.0/0.0/17_048/0007267).

Author information

Authors and Affiliations

Authors

Corresponding author

Correspondence to Ladislav Lenc .

Editor information

Editors and Affiliations

Rights and permissions

Reprints and permissions

Copyright information

© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper

Check for updates. Verify currency and authenticity via CrossMark

Cite this paper

Lenc, L., Martínek, J., Král, P. (2023). Text Line Segmentation in Historical Newspapers. In: Rutkowski, L., Scherer, R., Korytkowski, M., Pedrycz, W., Tadeusiewicz, R., Zurada, J.M. (eds) Artificial Intelligence and Soft Computing. ICAISC 2022. Lecture Notes in Computer Science(), vol 13589. Springer, Cham. https://doi.org/10.1007/978-3-031-23480-4_3

Download citation

  • DOI: https://doi.org/10.1007/978-3-031-23480-4_3

  • Published:

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-23479-8

  • Online ISBN: 978-3-031-23480-4

  • eBook Packages: Computer ScienceComputer Science (R0)

Publish with us

Policies and ethics