ArchesWeather: An efficient AI weather forecasting model at 1.5° resolution
Abstract
One of the guiding principles for designing AI-based weather forecasting systems is to embed physical constraints as inductive priors in the neural network architecture. A popular prior is locality, where the atmospheric data is processed with local neural interactions, like 3D convolutions or 3D local attention windows as in Pangu-Weather. On the other hand, some works have shown great success in weather forecasting without this locality principle, at the cost of a much higher parameter count. In this paper, we show that the 3D local processing in Pangu-Weather is computationally sub-optimal. We design ArchesWeather, a transformer model that combines 2D attention with a column-wise attention-based feature interaction module, and demonstrate that this design improves forecasting skill. ArchesWeather is trained at 1.5° resolution and 24h lead time, with a training budget of a few GPU-days and a lower inference cost than competing methods. An ensemble of four of our models shows better RMSE scores than the IFS HRES and is competitive with the 1.4° 50-members NeuralGCM ensemble for one to three days ahead forecasting. Our code and models are publicly available at https://github.com/gcouairon/ArchesWeather.
- Publication:
-
arXiv e-prints
- Pub Date:
- May 2024
- DOI:
- 10.48550/arXiv.2405.14527
- arXiv:
- arXiv:2405.14527
- Bibcode:
- 2024arXiv240514527C
- Keywords:
-
- Computer Science - Machine Learning;
- Computer Science - Artificial Intelligence
- E-Print:
- Accepted at the Machine Learning for Earth System Modeling Workshop at ICML 2024