Abstract
In this paper, we present Self-Attention AutoEncoder (SA-AE), a novel model that automatically generates a relit image from a source image to match the illumination setting of a guide image, a task known as any-to-any relighting. To reduce the difficulty of learning, the encoder learns an implicit scene representation from which the decoder renders the relit image. On top of this representation, lighting estimation is cast as a classification task, and a dedicated network predicts the illumination setting of the guide image. A complementary lighting-to-feature network recovers the implicit scene features corresponding to a given illumination setting, inverting the lighting estimation network. In addition, a self-attention mechanism is introduced into the autoencoder so that re-rendering focuses on the relighting-related regions of the source image. Extensive experiments on the VIDIT dataset show that the proposed approach ranked first in both MPS and SSIM in the AIM 2020 Any-to-any Relighting Challenge.
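To make the pipeline concrete, the sketch below wires together the four components the abstract names: an encoder producing an implicit scene representation, a lighting estimation classifier, a lighting-to-feature mapping, and a self-attention decoder. It is a minimal PyTorch illustration under stated assumptions, not the authors' released architecture: the channel widths, layer counts, the NUM_LIGHT_CLASSES constant, the SAGAN-style attention block, and the additive injection of lighting features are all hypothetical choices made for readability.

```python
import torch
import torch.nn as nn

NUM_LIGHT_CLASSES = 40  # hypothetical, e.g. 8 light directions x 5 color temperatures

class SelfAttention(nn.Module):
    """SAGAN-style self-attention over spatial feature maps (an assumption here)."""
    def __init__(self, ch):
        super().__init__()
        self.query = nn.Conv2d(ch, ch // 8, 1)
        self.key = nn.Conv2d(ch, ch // 8, 1)
        self.value = nn.Conv2d(ch, ch, 1)
        self.gamma = nn.Parameter(torch.zeros(1))  # learned residual blend weight

    def forward(self, x):
        b, c, h, w = x.shape
        q = self.query(x).flatten(2).transpose(1, 2)   # (b, hw, c//8)
        k = self.key(x).flatten(2)                     # (b, c//8, hw)
        attn = torch.softmax(q @ k, dim=-1)            # (b, hw, hw)
        v = self.value(x).flatten(2)                   # (b, c, hw)
        out = (v @ attn.transpose(1, 2)).view(b, c, h, w)
        return self.gamma * out + x                    # residual connection

class SAAE(nn.Module):
    def __init__(self, ch=64):
        super().__init__()
        # Encoder: image -> implicit scene representation
        self.encoder = nn.Sequential(
            nn.Conv2d(3, ch, 4, 2, 1), nn.ReLU(inplace=True),
            nn.Conv2d(ch, 2 * ch, 4, 2, 1), nn.ReLU(inplace=True),
            SelfAttention(2 * ch),
        )
        # Lighting estimation network: classify the illumination setting
        self.light_estimator = nn.Sequential(
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            nn.Linear(2 * ch, NUM_LIGHT_CLASSES),
        )
        # Lighting-to-feature network: map an illumination class back to
        # lighting features (the inverse direction of the estimator)
        self.light_embed = nn.Embedding(NUM_LIGHT_CLASSES, 2 * ch)
        # Decoder: scene representation + lighting features -> relit image
        # (upsample + conv rather than transposed conv, to avoid checkerboard artifacts)
        self.decoder = nn.Sequential(
            SelfAttention(2 * ch),
            nn.Upsample(scale_factor=2, mode="nearest"),
            nn.Conv2d(2 * ch, ch, 3, 1, 1), nn.ReLU(inplace=True),
            nn.Upsample(scale_factor=2, mode="nearest"),
            nn.Conv2d(ch, 3, 3, 1, 1), nn.Sigmoid(),
        )

    def forward(self, source, guide):
        scene = self.encoder(source)                   # source scene representation
        light_logits = self.light_estimator(self.encoder(guide))
        # argmax is fine at inference; training would typically use the
        # ground-truth lighting label so gradients reach the embedding
        light = self.light_embed(light_logits.argmax(1))[..., None, None]
        relit = self.decoder(scene + light)            # inject the guide's lighting
        return relit, light_logits

model = SAAE()
src, gde = torch.rand(1, 3, 128, 128), torch.rand(1, 3, 128, 128)
relit, logits = model(src, gde)                        # relit: (1, 3, 128, 128)
```

The structural point the sketch captures is that the guide image contributes only a low-dimensional lighting code, while all scene content flows from the source image, which is what allows lighting and content to be recombined arbitrarily.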
Acknowledgements
This work is supported by NSFC under Grant 61531014.