Enhancing Few-Shot Image Classification through Learnable Multi-Scale Embedding and Attention Mechanisms
Implementation of a few-shot image classification model based on Prototypical Networks, tested on the MiniImageNet and FC100 datasets.
For more information, see our paper on [arXiv] and [paperswithcode].
Our model consists of the following components:
- We extract five feature maps from the backbone to capture both global and task-specific features.
- We apply a self-attention mechanism to the feature map from each stage to capture more informative features.
- We incorporate learnable weights at each stage.
- We propose a novel few-shot classification method that significantly improves accuracy on the MiniImageNet and FC100 datasets.
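The way per-stage embeddings and learnable stage weights might interact with prototype-based classification can be illustrated with a small, dependency-free sketch. It is not the repository's code: it assumes that the backbone and attention modules have already produced one embedding vector per stage, that the per-stage squared Euclidean distances to class prototypes are combined with softmax-normalized stage weights, and that the names `classify`, `stage_logits`, and the toy data are hypothetical. In the real model the stage weights would be learned during training; here they are fixed.

```python
import math


def prototype(vectors):
    """Class prototype: element-wise mean of the support embeddings."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def softmax(values):
    """Numerically stable softmax over a list of floats."""
    m = max(values)
    exps = [math.exp(v - m) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def classify(query_stages, support_stages, stage_logits):
    """Score a query against each class using a weighted sum over stages.

    query_stages:   one embedding vector per stage for the query image
    support_stages: per stage, a dict {class_label: [support vectors]}
    stage_logits:   one raw weight per stage (assumed learnable; fixed here)
    """
    weights = softmax(stage_logits)
    scores = {}
    for label in sorted(support_stages[0]):
        weighted = 0.0
        for w, q, support in zip(weights, query_stages, support_stages):
            weighted += w * sq_dist(q, prototype(support[label]))
        scores[label] = -weighted  # smaller distance means higher score
    return max(scores, key=scores.get), scores


# Toy 2-way 2-shot episode with two stages (hypothetical data).
support_stages = [
    {"cat": [[0.0, 0.0], [0.2, 0.0]], "dog": [[1.0, 1.0], [1.0, 0.8]]},
    {"cat": [[0.0], [0.2]], "dog": [[1.0], [0.8]]},
]
query_stages = [[0.1, 0.1], [0.2]]
stage_logits = [0.0, 0.0]  # equal weighting of both stages

pred, scores = classify(query_stages, support_stages, stage_logits)
print(pred, scores)
```

With equal stage weights the query lands closest to the `cat` prototypes in both stages, so `cat` is predicted. Raising one entry of `stage_logits` shifts the decision toward that stage's distances.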
The final model architecture is as follows:
The mapper architecture is as follows:
For a more detailed description of the model, see this PDF.
To train the model for the 5-way 5-shot setting:
python train.py --max-epoch 200 --save-epoch 20 --shot 5 --query 10 --train-way 30 --test-way 5 --save-path ./save/proto-5-change --gpu 0
To train the model for the 5-way 1-shot setting:
python train.py --max-epoch 200 --save-epoch 20 --shot 1 --query 10 --train-way 20 --test-way 5 --save-path ./save/proto-1-change --gpu 0
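Both commands train with more classes per episode (`--train-way`) than are used at test time (`--test-way`). As a rough guide to memory use, the snippet below estimates how many images one training episode contains. It assumes that each episode samples `shot + query` images for each of the `train-way` classes, which is common in Prototypical Network training code but is not confirmed here for `train.py`; the helper `episode_size` is hypothetical.

```python
def episode_size(way, shot, query):
    """Images per episode, assuming (shot + query) samples per class."""
    return way * (shot + query)


# Values taken from the two training commands above.
configs = {
    "5-way 5-shot": {"way": 30, "shot": 5, "query": 10},
    "5-way 1-shot": {"way": 20, "shot": 1, "query": 10},
}

for name, cfg in configs.items():
    print(name, episode_size(**cfg))
# 5-way 5-shot 450
# 5-way 1-shot 220
```

If an episode does not fit in GPU memory, lowering `--train-way` or `--query` is the most direct way to shrink it under this assumption.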
If you use this repository in your work, please cite the following paper:
@article{askari2024enhancing,
title={Enhancing Few-Shot Image Classification through Learnable Multi-Scale Embedding and Attention Mechanisms},
author={Askari, Fatemeh and Fateh, Amirreza and Mohammadi, Mohammad Reza},
journal={arXiv preprint arXiv:2409.07989},
year={2024}
}