DOI: 10.1007/978-3-031-19790-1_39
Article

Efficient Long-Range Attention Network for Image Super-Resolution

Published: 23 October 2022

Abstract

Recently, transformer-based methods have demonstrated impressive results in various vision tasks, including image super-resolution (SR), by exploiting self-attention (SA) for feature extraction. However, the computation of SA in most existing transformer-based models is very expensive, and some of the employed operations may be redundant for the SR task. This expense limits the range over which SA can be computed and consequently limits SR performance. In this work, we propose an efficient long-range attention network (ELAN) for image SR. Specifically, we first employ shift convolution (shift-conv) to effectively extract local structural information from the image while keeping the same level of complexity as a 1×1 convolution, and then propose a group-wise multi-scale self-attention (GMSA) module, which calculates SA on non-overlapping groups of features using different window sizes to exploit long-range image dependencies. A highly efficient long-range attention block (ELAB) is then built by simply cascading two shift-convs with a GMSA module, and is further accelerated by a shared attention mechanism. Without bells and whistles, our ELAN follows a fairly simple design that sequentially cascades ELABs. Extensive experiments demonstrate that ELAN obtains even better results than transformer-based SR models, but with significantly lower complexity. The source code of ELAN can be found at https://github.com/xindongzhang/ELAN.
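To make the design described in the abstract more concrete, below is a minimal PyTorch-style sketch of an ELAB-like block, assuming a shift-conv implemented as four directional channel-group shifts followed by a 1×1 convolution, and a GMSA that splits channels into groups attended within different window sizes. It is only an illustration of the idea: the module names, group split, window sizes, and the omission of the shared-attention acceleration are our assumptions, not the authors' implementation (which is available at the linked repository).

# Illustrative sketch of an ELAB-style block (shift-conv + group-wise
# multi-scale window self-attention). NOT the official ELAN code; see
# https://github.com/xindongzhang/ELAN for the authors' implementation.
import torch
import torch.nn as nn


class ShiftConv(nn.Module):
    """1x1 convolution preceded by zero-cost spatial shifts of channel groups."""
    def __init__(self, in_ch, out_ch):
        super().__init__()
        self.conv1x1 = nn.Conv2d(in_ch, out_ch, kernel_size=1)

    def forward(self, x):
        b, c, h, w = x.shape
        g = c // 5                                                       # five channel groups
        out = x.clone()
        out[:, 0*g:1*g] = torch.roll(x[:, 0*g:1*g], shifts=1,  dims=2)   # shift down
        out[:, 1*g:2*g] = torch.roll(x[:, 1*g:2*g], shifts=-1, dims=2)   # shift up
        out[:, 2*g:3*g] = torch.roll(x[:, 2*g:3*g], shifts=1,  dims=3)   # shift right
        out[:, 3*g:4*g] = torch.roll(x[:, 3*g:4*g], shifts=-1, dims=3)   # shift left
        # remaining channels stay in place; the 1x1 conv mixes the shifted groups
        return self.conv1x1(out)


def window_attention(x, win):
    """Plain self-attention inside non-overlapping win x win windows."""
    b, c, h, w = x.shape
    x = x.reshape(b, c, h // win, win, w // win, win)
    x = x.permute(0, 2, 4, 3, 5, 1).reshape(-1, win * win, c)            # (b*windows, win*win, c)
    attn = torch.softmax(x @ x.transpose(1, 2) / c ** 0.5, dim=-1)
    x = attn @ x
    x = x.reshape(b, h // win, w // win, win, win, c)
    return x.permute(0, 5, 1, 3, 2, 4).reshape(b, c, h, w)


class GMSA(nn.Module):
    """Group-wise multi-scale self-attention: each channel group uses its own window size."""
    def __init__(self, channels, window_sizes=(4, 8, 16)):
        super().__init__()
        self.window_sizes = window_sizes
        self.proj = nn.Conv2d(channels, channels, kernel_size=1)

    def forward(self, x):
        chunks = torch.chunk(x, len(self.window_sizes), dim=1)
        out = [window_attention(c, w) for c, w in zip(chunks, self.window_sizes)]
        return self.proj(torch.cat(out, dim=1))


class ELAB(nn.Module):
    """Efficient long-range attention block: two shift-convs around a GMSA module."""
    def __init__(self, channels):
        super().__init__()
        self.local = nn.Sequential(ShiftConv(channels, channels),
                                   nn.ReLU(inplace=True),
                                   ShiftConv(channels, channels))
        self.gmsa = GMSA(channels)

    def forward(self, x):
        x = x + self.local(x)        # local structural information
        x = x + self.gmsa(x)         # long-range dependency via windowed SA
        return x


if __name__ == "__main__":
    feat = torch.randn(1, 48, 64, 64)     # H and W divisible by the largest window
    print(ELAB(48)(feat).shape)           # torch.Size([1, 48, 64, 64])

In this sketch, only the largest-window group pays for the widest attention range, which is how the multi-scale grouping keeps the cost well below that of a single global or large-window attention over all channels.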


Published In

Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XVII
Oct 2022
799 pages
ISBN:978-3-031-19789-5
DOI:10.1007/978-3-031-19790-1

Publisher

Springer-Verlag

Berlin, Heidelberg

Publication History

Published: 23 October 2022

Author Tags

  1. Super-resolution
  2. Long-range attention
  3. Transformer

Qualifiers

  • Article

Cited By

  • Lightweight image super-resolution via flexible meta pruning. Proceedings of the 41st International Conference on Machine Learning, pp. 60305–60314 (2024). DOI: 10.5555/3692070.3694565
  • Proceedings of the 41st International Conference on Machine Learning, pp. 58158–58173 (2024). DOI: 10.5555/3692070.3694470
  • Helmet Detection Algorithm Based on Improved YOLOv7. Automatic Control and Computer Sciences 58(6), 642–655 (2024). DOI: 10.3103/S0146411624701116
  • FreqFormer. Proceedings of the Thirty-Third International Joint Conference on Artificial Intelligence, pp. 731–739 (2024). DOI: 10.24963/ijcai.2024/81
  • GRFormer: Grouped Residual Self-Attention for Lightweight Single Image Super-Resolution. Proceedings of the 32nd ACM International Conference on Multimedia, pp. 9378–9386 (2024). DOI: 10.1145/3664647.3681554
  • SSL: A Self-similarity Loss for Improving Generative Image Super-resolution. Proceedings of the 32nd ACM International Conference on Multimedia, pp. 3189–3198 (2024). DOI: 10.1145/3664647.3680874
  • Efficient Single Image Super-Resolution with Entropy Attention and Receptive Field Augmentation. Proceedings of the 32nd ACM International Conference on Multimedia, pp. 1302–1310 (2024). DOI: 10.1145/3664647.3680744
  • Towards real-time practical image compression with lightweight attention. Expert Systems with Applications 252(PA) (2024). DOI: 10.1016/j.eswa.2024.124142
  • SCW-YOLO: An Improved Algorithm for Fall Detection Based on Deep Learning. Advanced Intelligent Computing Technology and Applications, pp. 408–418 (2024). DOI: 10.1007/978-981-97-5612-4_35
  • HiT-SR: Hierarchical Transformer for Efficient Image Super-Resolution. Computer Vision – ECCV 2024, pp. 483–500 (2024). DOI: 10.1007/978-3-031-73661-2_27
