This is the official code repository for the paper "VistaFormer: Scalable Vision Transformers for Satellite Image Time Series Segmentation".
To run code from this repository, you will first need to install the project dependencies, either with the requirements.txt file or with the Poetry configuration (for the latter, running `poetry install` from the repository root is the standard Poetry command). To install using the requirements.txt file, execute the following commands:
```bash
pip install -r requirements.txt

# Additionally required to train the Neighbourhood Attention-based VistaFormer model
pip3 install natten==0.14.6 -f https://shi-labs.com/natten/wheels/cu117/torch1.13/index.html
```
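After installing NATTEN, a quick sanity check can confirm that the wheel matches your Torch build. The snippet below assumes the NATTEN 0.14.x module API (`NeighborhoodAttention2D` with channels-last inputs) and is illustrative only:

```python
import torch
from natten import NeighborhoodAttention2D

# Build a single neighbourhood-attention layer (kwargs per the NATTEN 0.14.x docs).
na2d = NeighborhoodAttention2D(dim=64, kernel_size=7, num_heads=4)

x = torch.randn(1, 32, 32, 64)  # NATTEN expects channels-last inputs: (B, H, W, C)
with torch.no_grad():
    out = na2d(x)
print(out.shape)  # expected: torch.Size([1, 32, 32, 64])
```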
To use this repository to run inference with pre-trained weights or to train a model on one of the datasets, please see the documentation in the `datasets` directory for instructions on how to download and prepare these datasets.
Once you have created a dataset, you can train a model with the following commands:

```bash
export MODEL_CONFIG="very-real-model-config-path"
python -m vistaformer.train_and_evaluate.train
```
To evaluate the performance of a pre-trained model on a given dataset, please refer to the `notebooks/inference.ipynb` notebook, which computes the complete set of metrics.
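For reference, the headline metrics reported below, mean intersection-over-union (mIoU) and overall accuracy (oA), can be computed from a confusion matrix. The following is a minimal NumPy sketch for illustration; it is not the notebook's exact implementation:

```python
import numpy as np

def confusion_matrix(y_true: np.ndarray, y_pred: np.ndarray, num_classes: int) -> np.ndarray:
    """Accumulate a (num_classes x num_classes) confusion matrix from flat label arrays."""
    valid = (y_true >= 0) & (y_true < num_classes)  # drop ignore/void labels
    idx = num_classes * y_true[valid].astype(int) + y_pred[valid].astype(int)
    return np.bincount(idx, minlength=num_classes**2).reshape(num_classes, num_classes)

def miou_and_oa(cm: np.ndarray) -> tuple[float, float]:
    """Mean IoU and overall accuracy from a confusion matrix."""
    tp = np.diag(cm).astype(float)
    fp = cm.sum(axis=0) - tp
    fn = cm.sum(axis=1) - tp
    iou = tp / np.maximum(tp + fp + fn, 1.0)  # guard against empty classes
    return float(iou.mean()), float(tp.sum() / max(cm.sum(), 1))
```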
Please note that pre-trained model weights and training logs for each trial reported in the results section of the accompanying paper will be released once an unanonymized name can accompany this repository.
## Results on PASTIS (Optical only) Semantic Segmentation Benchmark
| Model Name | mIoU | oA | #Params (M) | GFLOPs |
|---|---|---|---|---|
| U-TAE | 63.1 | 83.2 | 1.1 | 23.06 |
| TSViT † | 65.4 | 83.4 | 1.6 | 91.88 |
| VistaFormer (Neighbourhood) | 65.3 | 83.7 | 1.1 | 9.82 |
| VistaFormer | 65.5 | 84.0 | 1.3 | 7.7 |
## Results on MTLCC Semantic Segmentation Benchmark
| Model Name | mIoU | oA | #Params (M) | GFLOPs |
|---|---|---|---|---|
| U-TAE | 77.1 | 93.1 | 1.1 | 23.06 |
| TSViT | 84.8 | 95.0 | 1.6 | 91.88 |
| VistaFormer (Neighbourhood) | 88.5 | 96.1 | 1.1 | 9.82 |
| VistaFormer | 87.8 | 95.9 | 1.3 | 7.7 |
Note that the GFLOPs and parameter counts are measured on inputs of shape (B, C, T, H, W) = (4, 10, 60, 32, 32).
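As a rough sketch of how such measurements can be reproduced, a tool such as fvcore can count FLOPs and parameters on an input of that shape. The model constructor below is a placeholder, not this repository's confirmed API:

```python
import torch
from fvcore.nn import FlopCountAnalysis, parameter_count

# Placeholder: substitute the actual VistaFormer constructor from this repo.
model = build_vistaformer_model()  # hypothetical helper, for illustration only

x = torch.randn(4, 10, 60, 32, 32)  # (B, C, T, H, W) as quoted above
flops = FlopCountAnalysis(model, x)
print(f"GFLOPs: {flops.total() / 1e9:.2f}")
print(f"#Params (M): {parameter_count(model)[''] / 1e6:.2f}")
```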
If you find this work or code useful in your research, please consider citing:
```bibtex
@article{macdonald_2024_vistaformer,
  title={VistaFormer: Scalable Vision Transformers for Satellite Image Time Series Segmentation},
  author={MacDonald, Ezra and Jacoby, Derek and Coady, Yvonne},
  journal={arXiv preprint arXiv:2409.08461},
  year={2024}
}
```