An efficient and universal polygon prediction method based on derivable analytic geometry for arbitrary-shaped text detection

Zhang, Xiangnan; Tian, Chunna; Gao, Xinbo

doi:10.1007/s00371-023-03081-9

Original article
Published: 21 September 2023

Volume 40, pages 4273–4285 (2024)

The Visual Computer

Abstract

A polygon can represent the boundary of curved text more compactly than a rectangle. However, few solutions exist for predicting reasonable polygons, because having more vertices leads to complex spatial relationships. The two main challenges are how to satisfy the constraints between vertices and how to cope with data conflicts caused by inconsistent annotation standards. To address these problems, we propose a divide-and-conquer methodology in which a polygon is treated as a set of convex quadrangles. By predicting the quadrangles in sequence, the vertices of the polygon are obtained consecutively, each constrained by the previous ones. We then propose a measure of the overlap between convex quadrangles, with which the IoU between two polygons is calculated densely. Our method is derivable and can be trained end-to-end. In addition, the proposed polygon prediction branch is universal and transplantable. We select a basic architecture as the backbone, and the text/non-text classification branch adopts an online hard example mining strategy. Experiments on the curved-text benchmark datasets Total-Text and CTW1500 demonstrate that our approach achieves state-of-the-art accuracy while maintaining high inference efficiency.
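The quadrangle decomposition and quadrangle-wise overlap described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' derivable measure, whose details are not given in the abstract: it swaps in exact Sutherland-Hodgman clipping in plain Python, and it assumes a hypothetical vertex ordering in which the first half of the polygon's vertices runs along one long side of the text and the second half runs back along the other, so that vertex i pairs with vertex n-1-i. The function names are illustrative only.

```python
from typing import List, Tuple

Point = Tuple[float, float]
Poly = List[Point]

def signed_area(poly: Poly) -> float:
    """Shoelace formula; positive when vertices are counter-clockwise."""
    n = len(poly)
    return 0.5 * sum(poly[i][0] * poly[(i + 1) % n][1]
                     - poly[(i + 1) % n][0] * poly[i][1] for i in range(n))

def to_ccw(poly: Poly) -> Poly:
    """Return the polygon with counter-clockwise orientation."""
    return list(poly) if signed_area(poly) >= 0 else list(reversed(poly))

def split_into_quads(poly: Poly) -> List[Poly]:
    """Split a 2k-vertex polygon into k-1 consecutive quadrangles.
    Assumed ordering (hypothetical): vertex i pairs with vertex n-1-i."""
    n = len(poly)
    if n < 4 or n % 2:
        raise ValueError("expected an even number of vertices, at least 4")
    return [[poly[i], poly[i + 1], poly[n - 2 - i], poly[n - 1 - i]]
            for i in range(n // 2 - 1)]

def clip_convex(subject: Poly, clip: Poly) -> Poly:
    """Sutherland-Hodgman clipping of `subject` by a convex CCW `clip`."""
    def inside(p: Point, a: Point, b: Point) -> bool:
        # p is on the left of (or on) the directed edge a -> b
        return (b[0] - a[0]) * (p[1] - a[1]) - (b[1] - a[1]) * (p[0] - a[0]) >= 0

    def intersect(p: Point, q: Point, a: Point, b: Point) -> Point:
        # Intersection of segment p-q with the line through a and b
        dx1, dy1 = q[0] - p[0], q[1] - p[1]
        dx2, dy2 = b[0] - a[0], b[1] - a[1]
        t = ((a[0] - p[0]) * dy2 - (a[1] - p[1]) * dx2) / (dx1 * dy2 - dy1 * dx2)
        return (p[0] + t * dx1, p[1] + t * dy1)

    output = list(subject)
    for i in range(len(clip)):
        a, b = clip[i], clip[(i + 1) % len(clip)]
        current, output = output, []
        if not current:
            break
        prev = current[-1]
        for cur in current:
            if inside(cur, a, b):
                if not inside(prev, a, b):
                    output.append(intersect(prev, cur, a, b))
                output.append(cur)
            elif inside(prev, a, b):
                output.append(intersect(prev, cur, a, b))
            prev = cur
    return output

def convex_intersection_area(a: Poly, b: Poly) -> float:
    """Area of the intersection of two convex polygons."""
    inter = clip_convex(to_ccw(a), to_ccw(b))
    return abs(signed_area(inter)) if len(inter) >= 3 else 0.0

def convex_iou(a: Poly, b: Poly) -> float:
    """Exact IoU of two convex polygons (for example, two quadrangles)."""
    inter = convex_intersection_area(a, b)
    union = abs(signed_area(a)) + abs(signed_area(b)) - inter
    return inter / union if union > 0 else 0.0

def polygon_iou(a: Poly, b: Poly) -> float:
    """Polygon IoU from all pairwise quadrangle overlaps.
    Assumes the quadrangles of each polygon are convex and do not overlap."""
    quads_a, quads_b = split_into_quads(a), split_into_quads(b)
    inter = sum(convex_intersection_area(qa, qb)
                for qa in quads_a for qb in quads_b)
    area_a = sum(abs(signed_area(q)) for q in quads_a)
    area_b = sum(abs(signed_area(q)) for q in quads_b)
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0
```

For instance, calling `polygon_iou` on a 2-by-1 strip given as six vertices and on a copy shifted by one unit along its length sums four pairwise quadrangle overlaps and yields 1/3. Summing over every quadrangle pair is one plausible reading of "densely"; a trainable version would additionally need the overlap to be expressed with differentiable tensor operations, which this sketch does not attempt.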

Availability of data and materials

Data sharing is not applicable to this article as no datasets were generated or analyzed during the current study.

Acknowledgements

This work was supported in part by the National Natural Science Foundation of China under Grants 62036007, 62176195, 62221005, U22A2096 and U21A20514.

Author information

Authors and Affiliations

School of Electronic Engineering, Xidian University, Xi’an, 710071, China
Xiangnan Zhang & Chunna Tian
Chongqing Key Laboratory of Image Cognition, Chongqing University of Posts and Telecommunications, Chongqing, 400065, China
Xinbo Gao

Contributions

Conceptualization, methodology, formal analysis and investigation and writing—original draft preparation were done by XZ; writing—review and editing, funding acquisition, resources and supervision were done by XG and CT.

Corresponding author

Correspondence to Xinbo Gao.

Ethics declarations

Conflict of interest:

We declare that we have no financial and personal relationships with other people or organizations that can inappropriately influence our work, and there is no professional or other personal interest of any nature or kind in any product, service and/or company that could be construed as influencing the position presented in, or the review of, the manuscript entitled.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Zhang, X., Tian, C. & Gao, X. An efficient and universal polygon prediction method based on derivable analytic geometry for arbitrary-shaped text detection. Vis Comput 40, 4273–4285 (2024). https://doi.org/10.1007/s00371-023-03081-9

Accepted: 25 August 2023
Published: 21 September 2023
Issue Date: June 2024
DOI: https://doi.org/10.1007/s00371-023-03081-9
