
Castling-ViT: Compressing Self-Attention via Switching Towards Linear-Angular Attention During Vision Transformer Inference

License: Apache 2.0

Haoran You, Yunyang Xiong, Xiaoliang Dai, Bichen Wu, Peizhao Zhang, Haoqi Fan, Peter Vajda, Yingyan Lin

Accepted by CVPR 2023. More info: [ Paper | Slides | YouTube | Poster | GitHub ]


This is a minimal, unofficial code release that reveals the core implementation of our attention block. The final adopted attention block follows the MultiScaleAttention format. To run the standalone demo:

python attention.py
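
For reference, below is a minimal sketch of the linear term of the angular kernel behind linear-angular attention. It is not the released MultiScaleAttention module: the function name linear_angular_attention and the (batch, heads, tokens, dim) input layout are illustrative assumptions, and the high-order residual (handled in the paper by an auxiliary softmax branch that is dropped at inference) is omitted.

import math
import torch
import torch.nn.functional as F

def linear_angular_attention(q, k, v):
    # Illustrative sketch only. The angular similarity
    # sim(q, k) = 1 - theta(q, k) / pi, expanded for unit-norm q and k,
    # gives a linear term 1/2 + (q . k) / pi plus a high-order residual.
    # Only the linear term is computed here, so the cost stays linear
    # in the token count.
    q = F.normalize(q, dim=-1)  # unit-norm queries
    k = F.normalize(k, dim=-1)  # unit-norm keys
    n = q.shape[-2]             # number of tokens

    # Associativity trick: compute k^T v first, then multiply by q.
    kv = torch.einsum('bhnd,bhne->bhde', k, v)
    numerator = 0.5 * v.sum(dim=-2, keepdim=True) \
        + torch.einsum('bhnd,bhde->bhne', q, kv) / math.pi

    # Normalizer: row sums of the same linearized kernel.
    k_sum = k.sum(dim=-2)
    denominator = 0.5 * n + torch.einsum('bhnd,bhd->bhn', q, k_sum) / math.pi
    return numerator / denominator.unsqueeze(-1)

# Example: 2 images, 4 heads, 196 tokens, 64 dims per head.
q = torch.randn(2, 4, 196, 64)
k = torch.randn(2, 4, 196, 64)
v = torch.randn(2, 4, 196, 64)
out = linear_angular_attention(q, k, v)
print(out.shape)  # torch.Size([2, 4, 196, 64])

Because q and k are unit-normalized, every kernel entry lies in [1/2 - 1/pi, 1/2 + 1/pi], so the normalizer stays positive without any extra clamping.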

Here is some general guidance for reproducing the results reported in our paper.

  • For the classification task, we build our codebase on top of MobileVision@Meta.

  • For the segmentation task, we build our codebase on top of Mask2Former; the unsupervised pretrained models are trained with the MAE framework.

  • For the detection task, we build our codebase on top of PicoDet@PaddleDet and its PyTorch version; the supervised pretrained models are trained with the LeViT framework.

To make this easier to use for the research community, I am working on translating some of the highly coupled code into standalone versions. The detection codebase is expected to follow; stay tuned.


Citation

If you find this codebase useful for your research, please cite:

@inproceedings{you2023castling,
  title={Castling-ViT: Compressing Self-Attention via Switching Towards Linear-Angular Attention During Vision Transformer Inference},
  author={You, Haoran and Xiong, Yunyang and Dai, Xiaoliang and Wu, Bichen and Zhang, Peizhao and Fan, Haoqi and Vajda, Peter and Lin, Yingyan},
  booktitle={The IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR 2023)},
  year={2023}
}
