Abstract
Traditional manual coloring requires artists to hand-draw colors to create visually pleasing color combinations, which is both time-consuming and laborious. Reference-based line art coloring, in which color is transferred from a reference image, remains a challenging task in computer vision: existing reference-based methods often struggle to generate visually appealing coloring results because sketch images lack both texture and sufficient training data. To address this, we propose a new sketch coloring network based on the PatchGAN architecture. First, we propose a new self-attention gate (SAG) that effectively and correctly identifies line semantic information from shallow to deep layers of the CNN. Second, we propose a new Progressive PatchGAN (PPGAN) that helps train the discriminator to better distinguish real anime images. Our experiments show that, compared with existing methods, our approach achieves significant improvements on several benchmarks, with Fréchet Inception Distance (FID) improved by up to 24.195% and Structural Similarity Index Measure (SSIM) improved by up to 14.30% over the best previous values.
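To make the two building blocks named above concrete, the following is a minimal PyTorch sketch of (a) a generic gated self-attention module applied to CNN feature maps and (b) a standard PatchGAN-style discriminator that outputs per-patch real/fake logits. This is an illustrative assumption of how such components are commonly implemented, not the paper's actual SAG or PPGAN; the layer widths, the learnable gating scalar, and the class names are hypothetical.

# Illustrative sketch only; not the authors' SAG/PPGAN implementation.
import torch
import torch.nn as nn

class AttentionGate(nn.Module):
    """Generic self-attention over feature maps, merged back through a learnable gate."""
    def __init__(self, channels):
        super().__init__()
        self.query = nn.Conv2d(channels, channels // 8, kernel_size=1)
        self.key   = nn.Conv2d(channels, channels // 8, kernel_size=1)
        self.value = nn.Conv2d(channels, channels, kernel_size=1)
        self.gate  = nn.Parameter(torch.zeros(1))  # gating scalar, learned from zero

    def forward(self, x):
        b, c, h, w = x.shape
        q = self.query(x).flatten(2).transpose(1, 2)          # (b, hw, c//8)
        k = self.key(x).flatten(2)                             # (b, c//8, hw)
        attn = torch.softmax(q @ k, dim=-1)                    # (b, hw, hw) spatial attention
        v = self.value(x).flatten(2).transpose(1, 2)           # (b, hw, c)
        out = (attn @ v).transpose(1, 2).reshape(b, c, h, w)
        return x + self.gate * out                             # gated residual connection

class PatchDiscriminator(nn.Module):
    """PatchGAN-style discriminator: classifies overlapping patches rather than whole images."""
    def __init__(self, in_channels=3, base=64):
        super().__init__()
        layers, ch = [], in_channels
        for mult in (1, 2, 4, 8):
            stride = 2 if mult < 8 else 1
            layers += [nn.Conv2d(ch, base * mult, 4, stride=stride, padding=1),
                       nn.LeakyReLU(0.2, inplace=True)]
            ch = base * mult
        layers += [nn.Conv2d(ch, 1, 4, stride=1, padding=1)]   # one logit per patch
        self.net = nn.Sequential(*layers)

    def forward(self, x):
        return self.net(x)

In a reference-based colorization setup, a gate of this kind would typically sit on the generator's skip connections so that line-level semantics are preserved across depths, while the patch discriminator scores local realism of the colored output.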
Acknowledgment
This work is supported by NSFC (Grant No. 62366047 and No. 62061042).
Copyright information
© 2024 The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.
About this paper
Cite this paper
Li, H., Wang, N., Fang, J., Jia, Y., Ji, L., Chen, X. (2024). Anime Sketch Coloring Based on Self-attention Gate and Progressive PatchGAN. In: Liu, Q., et al. Pattern Recognition and Computer Vision. PRCV 2023. Lecture Notes in Computer Science, vol 14435. Springer, Singapore. https://doi.org/10.1007/978-981-99-8552-4_19
DOI: https://doi.org/10.1007/978-981-99-8552-4_19
Publisher Name: Springer, Singapore
Print ISBN: 978-981-99-8551-7
Online ISBN: 978-981-99-8552-4
eBook Packages: Computer Science, Computer Science (R0)