Involution: Inverting the Inherence of Convolution for Visual Recognition

Li, Duo; Hu, Jie; Wang, Changhu; Li, Xiangtai; She, Qi; Zhu, Lei; Zhang, Tong; Chen, Qifeng

arXiv:2103.06255 (cs)

[Submitted on 10 Mar 2021 (v1), last revised 11 Apr 2021 (this version, v2)]

Abstract: Convolution has been the core ingredient of modern neural networks, triggering the surge of deep learning in vision. In this work, we rethink the inherent principles of standard convolution for vision tasks, specifically spatial-agnostic and channel-specific. Instead, we present a novel atomic operation for deep neural networks by inverting the aforementioned design principles of convolution, coined as involution. We additionally demystify the recent popular self-attention operator and subsume it into our involution family as an over-complicated instantiation. The proposed involution operator could be leveraged as fundamental bricks to build the new generation of neural networks for visual recognition, powering different deep learning models on several prevalent benchmarks, including ImageNet classification, COCO detection and segmentation, together with Cityscapes segmentation. Our involution-based models improve the performance of convolutional baselines using ResNet-50 by up to 1.6% top-1 accuracy, 2.5% and 2.4% bounding box AP, and 4.7% mean IoU absolutely while compressing the computational cost to 66%, 65%, 72%, and 57% on the above benchmarks, respectively. Code and pre-trained models for all the tasks are available at this https URL.
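
To make the "inverted" design principles concrete, the following is a minimal sketch of an operator that is spatial-specific (each pixel gets its own kernel) and channel-agnostic (that kernel is shared by all channels). It is an illustration under stated assumptions, not the authors' released implementation: the function name `involution_sketch`, the parameter `w_gen`, and the choice of a single linear map from a pixel's channel vector to its kernel are all hypothetical, since the abstract does not specify how kernels are generated. Pure Python loops are used for readability rather than speed.

```python
def involution_sketch(x, w_gen, k=3):
    """Apply a spatial-specific, channel-agnostic k x k operator.

    x     : nested list of shape (C, H, W), the input feature map.
    w_gen : nested list of shape (k*k, C); a hypothetical linear map that
            turns one pixel's channel vector into that pixel's kernel.
    Returns a nested list of shape (C, H, W). Borders are zero-padded.
    """
    c, h, w = len(x), len(x[0]), len(x[0][0])
    pad = k // 2
    out = [[[0.0] * w for _ in range(h)] for _ in range(c)]
    for i in range(h):
        for j in range(w):
            # Spatial-specific: the kernel is generated from this pixel's
            # own features, so it differs from location to location.
            pixel = [x[ch][i][j] for ch in range(c)]
            kernel = [sum(wt * p for wt, p in zip(row, pixel)) for row in w_gen]
            for u in range(k):
                for v in range(k):
                    ii, jj = i + u - pad, j + v - pad
                    if 0 <= ii < h and 0 <= jj < w:
                        # Channel-agnostic: the same kernel weight is
                        # applied to every channel at this offset.
                        for ch in range(c):
                            out[ch][i][j] += kernel[u * k + v] * x[ch][ii][jj]
    return out
```

By contrast, a standard convolution would use one fixed kernel for every location and a separate set of weights per channel. In this sketch, calling `involution_sketch(x, w_gen)` with `w_gen` of shape `(9, C)` returns a feature map with the same shape as `x`.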

Comments:	Accepted to CVPR 2021. Code and models are available at this https URL
Subjects:	Computer Vision and Pattern Recognition (cs.CV)
Cite as:	arXiv:2103.06255 [cs.CV]
	(or arXiv:2103.06255v2 [cs.CV] for this version)
	https://doi.org/10.48550/arXiv.2103.06255

Submission history

From: Duo Li
[v1] Wed, 10 Mar 2021 18:40:46 UTC (8,322 KB)
[v2] Sun, 11 Apr 2021 12:30:11 UTC (8,320 KB)
