Article

Correction of CAMS PM10 Reanalysis Improves AI-Based Dust Event Forecast

1 Earth and Planetary Science Department, Weizmann Institute of Science, P.O. Box 26, Rehovot 7610001, Israel
2 Department of Statistics and Data Science, Hebrew University of Jerusalem, Jerusalem 91905, Israel
3 Computer Vision Department, Mohamed bin Zayed University of Artificial Intelligence, Building 1B, Masdar City, Abu Dhabi P.O. Box 7909, United Arab Emirates
* Author to whom correspondence should be addressed.
Remote Sens. 2025, 17(2), 222; https://doi.org/10.3390/rs17020222
Submission received: 7 November 2024 / Revised: 30 December 2024 / Accepted: 31 December 2024 / Published: 9 January 2025
(This article belongs to the Section Atmospheric Remote Sensing)

Abstract

High dust loading significantly impacts air quality, climate, and public health. Early warning is crucial for mitigating short-term effects, and accurate dust field estimates are needed for forecasting. The Copernicus Atmosphere Monitoring Service (CAMS) offers global reanalysis datasets and forecasts of particulate matter with a diameter of under 10 μm (PM10), which approximate dust, but recent studies highlight discrepancies between CAMS data and ground in-situ measurements. Since CAMS is often used for forecasting, errors in PM10 fields can hinder accurate dust event forecasts, which is particularly challenging for models that use artificial intelligence (AI) due to the scarcity of dust events and limited training data. This study proposes a machine-learning approach to correct CAMS PM10 fields using in-situ data to enhance AI-based dust event forecasting. A correction model that links pixel-wise errors with atmospheric and meteorological variables was trained using gradient-boosting algorithms. This model is then used to predict the CAMS error in previously unobserved pixels across the Eastern Mediterranean, generating CAMS error fields. Our bias-corrected PM10 fields are, on average, 12 μg m−3 more accurate, often reducing the CAMS error by a large fraction. To evaluate the contribution, we train a deep neural network to predict city-scale dust events (0–72 h) over the Balkans using PM10 fields. Comparing the network’s performance when trained on the original and on the bias-corrected CAMS PM10 fields, we show that the correction improves AI-based forecasting performance across all metrics.

1. Introduction

Events of high dust loading are extreme meteorological phenomena that substantially impact climate, air quality, and public health due to the dust particles’ properties [1,2]. Along with short-term risks, exposure to dust events is associated with cardiovascular, respiratory, and other diseases [3,4,5]. The growing awareness of the adverse effects caused by dust raises the demand for accurate early warning systems—a critical component in mitigating dust’s social and economic adverse effects.
Forecasting dust events accurately hinges on continuous estimates of regional dust fields. These fields are crucial inputs for monitoring systems, enabling the automatic or manual identification of conditions conducive to dust event development at different locations and lead times [6,7,8,9]. Regional dust field estimates are also used in epidemiological studies of the dust’s adverse health effects. Hence, their accuracy can crucially affect the studies’ reliability [5,10].
Dust mass is frequently approximated by the concentration of inhalable suspended particulate matter with a diameter of under 10 μm (PM10). The Copernicus Atmosphere Monitoring Service (CAMS) provides global reanalysis datasets of particles’ atmospheric composition and is considered state-of-the-art in estimating surface PM10 hourly concentrations at global scales [11]. It is often used for forecasting air quality and atmospheric phenomena such as dust storms [12,13,14].
While CAMS reanalysis-based PM10 datasets offer valuable insights regarding the global distribution of dust, recent studies have identified significant discrepancies between these estimates and ground-based in-situ measurements, manifesting as both underestimations and overestimations, depending on spatial, temporal, and other conditions [15,16,17]. For example, Ryu and Min [15] and Ali et al. [17] found a season-dependent persistent overestimation of CAMS PM10 over South Korea and most regions of China, respectively, while Sekmoudi et al. [16] found an average underestimation over Morocco during summer. These studies explore conditional aggregations of CAMS PM10 differences from corresponding in-situ measurements without attempting to model the CAMS bias generally and are therefore limited to specific spatiotemporal distribution settings.
Other studies directly model in-situ particulate matter measurements from a coarse climate model or satellite imagery via spatial prediction or downscaling [18,19,20,21,22,23] and can be used to estimate CAMS bias indirectly. In these studies, the improvement in PM10 estimates emerges from high-resolution data fed into the prediction model, including new information that is spatially unresolved by CAMS. Downscaling often suffers from poor generalization in unseen spatial domains, especially in extreme conditions [24,25]. Downscaled PM10 regional fields are therefore unsuitable for drawing out-of-sample conclusions regarding the CAMS bias. In addition to lacking generality, downscaling on global scales over long periods is likely to be computationally expensive [26,27].
This study proposes a machine-learning approach for learning a generalized correction model for the CAMS PM10 bias and demonstrates it over the Eastern Mediterranean. By training the model on meteorological and remotely sensed data, regardless of space-, time-, or implementation-specific context, the correction model skillfully predicts CAMS PM10 errors outside the training domain, allowing conclusions to be extrapolated to unseen spatial and temporal domains. Next, we employ the model to correct untrained CAMS PM10 fields in a different area and study its efficiency in improving AI-based dust event forecasting.
In recent years, there has been growing interest in using AI approaches for dust forecasting, leveraging satellite-based reanalysis datasets such as the Modern-Era Retrospective Analysis for Research and Applications Version 2 (MERRA-2) [28,29] or the North American Regional Reanalysis (NARR) [30]. Building on our previous success in dust event forecasting using CAMS data [12], the correction's potential for improving AI-based early warnings is evaluated by training a deep network to forecast city-scale dust events at short lead times from historical PM10 regional fields. City-scale forecasting demands high spatial resolution, a notable challenge for existing models, as shown below. AI offers strong potential for advancing predictions in this domain, particularly because these models are less restricted by physical equations and coarse resolution limitations. As discussed in Section 3.2, training an AI forecast model using corrected PM10 data improves its performance in all metrics and lead times (0–72 h).
In addition, the correction model was used to characterize the CAMS seasonal and regional bias tendencies over larger regions. Using machine-learning explainability methods, we reveal the predictors of the CAMS bias, primarily temporal factors, normalized difference vegetation index (NDVI), surface winds, and aerosol optical depth (AOD). Section 3.3 discusses the origins of CAMS biases, and their spatiotemporal variation as interpreted from the model in more detail.

2. Materials and Methods

2.1. Data

2.1.1. In-Situ Data

Three-hour averages of PM10 concentration measurements from 657 air quality monitoring ground stations, collected from 1 January 2013 through 31 December 2020, were obtained from the Tropospheric Ozone Assessment Report (TOAR) database [31]. The stations are scattered over an area of about 2000 km by 1000 km, covering the entire region of Anatolia and parts of the Armenian Highlands, the Fertile Crescent, and the Balkan Peninsula. We removed stations that collected less than 10% of the possible measurements over the study period (2336 three-hourly measurements) to avoid small-sample biases. A sensitivity analysis with lower thresholds (e.g., 1500 or 2000 measurements) indicated minimal impact on the model’s performance. The excluded stations are spatially distributed similarly to the included stations and do not disproportionately represent a specific region or time, so we do not expect their exclusion to introduce bias. In cases where two or more stations were assigned the exact same location, only the station that collected the most samples was kept. This filtering left 275 study stations. Figure 1 illustrates the stations’ distribution over the study domain, showing relatively uniform coverage. As further detailed, stations from Region A are used for training the correction model. This area does not overlap with Regions B and C, which are used for validation and for training the AI forecast model. The average in-situ PM10 concentration at the study stations is 48.0 μg m−3, the standard deviation is 53.3 μg m−3, and the median is 36.3 μg m−3. The in-situ PM10 distribution exhibits high positive skewness (4.3) and spatial heterogeneity, with a median of 38.0 μg m−3 in Region A compared to 29.8 μg m−3 in Region B. PM10 concentrations are consistently lower during the spring and summer (34.2 μg m−3 median in months 3–9) and higher during the fall and winter (44.0 μg m−3 median in months 10–2). The annual median decreased by 6.9 μg m−3 between 2013 and 2020.
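The filtering rules above can be made concrete with a short sketch; the column names ("station_id", "lat", "lon") and the tiny example data are assumptions for illustration, not the paper's actual schema.

```python
# Sketch of the station filtering: drop under-sampled stations, then resolve
# exact-location collisions in favour of the best-sampled station.
import pandas as pd

def filter_stations(df: pd.DataFrame, min_samples: int = 2336) -> pd.DataFrame:
    """Keep stations with >= min_samples measurements; among stations that
    share the exact same coordinates, keep only the one with the most samples."""
    counts = df.groupby("station_id").size().rename("n")
    meta = (df.drop_duplicates("station_id")
              .set_index("station_id")[["lat", "lon"]]
              .join(counts))
    meta = meta[meta["n"] >= min_samples]
    # Sort so the best-sampled station at each location survives deduplication.
    meta = meta.sort_values("n", ascending=False)
    meta = meta[~meta.duplicated(subset=["lat", "lon"], keep="first")]
    return df[df["station_id"].isin(meta.index)]
```

In the paper's setting, `min_samples` is 2336 (10% of the 3-hourly timestamps over 2013–2020); the smaller values below are only to keep the example compact.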

2.1.2. CAMS PM10 Reanalysis

The CAMS reanalysis datasets provided by the European Centre for Medium-Range Weather Forecasts (ECMWF) are used here for continuous regional estimates. Specifically, we use 3-hourly PM10 atmospheric composition fields from the EAC4 (ECMWF Atmospheric Composition fourth-generation Reanalysis) global reanalysis database. The fields cover the ground monitoring stations’ area, as well as the broader Mediterranean Basin, at a spatial resolution of 0.75° × 0.75° (approximately 67 km in the study region) and a temporal resolution of 3 h. The CAMS global PM10 reanalysis is produced by re-processing atmospheric data from an a priori physical model, which are then retrospectively assimilated with satellite and in-situ observations [11]. Considering PM10 concentrations in the pixels that contain the study stations (Regions A and B), the average, standard deviation, and median are 22.0 μg m−3, 15.5 μg m−3, and 18.5 μg m−3, respectively, which are significantly lower than the corresponding in-situ measurements. The distribution exhibits positive skewness (3.3), albeit to a lesser extent, while spatial heterogeneity remains pronounced, with medians of 18.6 μg m−3 in Region A and 14.4 μg m−3 in Region B.

2.1.3. Correction Model Inputs

The correction model uses meteorological, atmospheric, and satellite data as inputs. Meteorological variables were retrieved from the European Centre for Medium-Range Weather Forecasts Reanalysis 5th Generation (ERA5) [32]. These include geopotential height, winds, potential vorticity, specific humidity, air temperature, and the dust aerosol (0.9–20 μm) mixing ratio, all taken at five pressure levels (250, 500, 700, 850, and 900 hPa). Other variables were retrieved from the CAMS database. They include zonal and meridional wind at 10 m, total column water vapor (TCWV), dust aerosol optical depth at 550 nm (AOD), and the dust aerosol (0.9–20 μm) mixing ratio at four model levels (20, 30, 40, and 50). To account for surface reflectance and absorbance, we also included the normalized difference vegetation index (NDVI) as an additional input feature, retrieved from the Sentinel-2 database at a bi-weekly temporal resolution and a 1 km spatial resolution. The NDVI data were spatially interpolated to fit the CAMS PM10 resolution using a nearest-neighbor method.
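The nearest-neighbor regridding step can be sketched as follows; the helper below treats latitude/longitude as planar coordinates, a simplification of whatever projection handling the actual pipeline uses, and the coordinates and values are synthetic placeholders.

```python
# Minimal nearest-neighbour regridding sketch: assign each destination grid
# point (e.g., a CAMS pixel centre) the value of its nearest source point
# (e.g., a 1 km NDVI sample).
import numpy as np
from scipy.spatial import cKDTree

def regrid_nearest(src_lat, src_lon, src_vals, dst_lat, dst_lon):
    """Nearest-neighbour lookup in (lat, lon) space; degrees treated as planar."""
    tree = cKDTree(np.column_stack([src_lat, src_lon]))
    _, idx = tree.query(np.column_stack([dst_lat, dst_lon]))
    return np.asarray(src_vals)[idx]
```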
The rationale for including these variables is to learn the key factors contributing to CAMS PM10 estimation errors. Variables such as winds, geopotential height, and potential vorticity represent atmospheric dynamics that drive dust transport, while specific humidity and temperature influence dust lifting, suspension, and deposition processes. Dust aerosol mixing ratio and AOD serve as proxies for aerosol loading, providing insights into discrepancies in CAMS estimates. Surface features like NDVI reflect land-surface conditions, such as reflectance and absorbance, and aid in identifying potential biases. Incorporating the variables across atmospheric levels ensures that both surface and vertical processes are captured.

2.1.4. CAMS Forecasts

As a benchmark for the AI-based dust event forecast, we use the CAMS global atmospheric composition forecast [11] of PM10. Unlike reanalysis datasets, which provide only historical reconstructions, the CAMS forecast model predicts future atmospheric states from the atmospheric conditions at the time the prediction is made. The CAMS global forecast’s atmospheric modeling is based on physical laws and chemical processes that predict the time evolution of various species’ concentrations, including PM10, at lead times of 0 to 5 days. Its spatial and temporal resolutions are 0.4° × 0.4° and one hour, respectively, covering 2015 to the present.

2.1.5. Dust Event Definition

We define dust events on a 3-hourly basis when the PM10 in-situ measurement exceeds 50 μg m−3, following the World Health Organization’s air quality guideline for the lowest risk level [33]. On average, 22% of observations are classified as dust events over all stations and timestamps during the study period; the average PM10 concentration during dust events is 90.6 μg m−3, with a standard deviation of 72.2 μg m−3. Most events occur in the winter months (44%), fewer in the summer (10%), and the rest are roughly equally divided between the spring and fall.
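The event definition amounts to a fixed threshold on the 3-hourly in-situ values, which pins it down unambiguously:

```python
# The 3-hourly dust-event labelling rule stated above.
import numpy as np

DUST_THRESHOLD = 50.0  # μg m-3, WHO lowest-risk guideline level

def label_dust_events(pm10_3h):
    """Return 1 where the 3-hourly in-situ PM10 exceeds the threshold, else 0."""
    return (np.asarray(pm10_3h, dtype=float) > DUST_THRESHOLD).astype(int)
```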

2.2. The Machine-Learning Setup

Figure 2 illustrates the machine-learning setup, which is composed of consecutive stages. First, the correction model is trained and tested using PM10 measurements from ground stations in distinct regions. Next, we use the error predicted by the trained model to produce bias-corrected PM10 fields on a larger scale, over areas the model has not seen before. We use these fields as the input of a deep neural network that forecasts city-scale dust events over a range of lead times. To estimate the impact of the correction on the network’s capabilities, we compare the forecast performance to that obtained when the network is trained on the original CAMS PM10 inputs only.
Figure 1 illustrates the split of the spatial domain. Region A includes 237 monitoring stations from Turkey for model training. The correction model is trained to predict CAMS error over pixels containing stations, denoted as training pixels. Region B includes 38 stations from the Balkan peninsula, which are used to test the correction performance. Both the training and testing runs use data from 2013–2020 (the station representation may be partial in some time samples due to missing data).
Bias-corrected PM10 fields are computed over Region C (15–49°N, 6°W–27°E), encompassing the Sahara Desert, the Mediterranean Basin, and Southern Europe. We chose Region C to include most dust sources whose emissions are transported to Region B; it is therefore the most appropriate input for an AI-based early warning system for Region B. Based on sequences of PM10 fields over Region C, we train a deep neural network to output dust event forecasts for each monitoring station located in Region B with a lead time of 0–72 h, similar to recent studies [12]. We use data from 2013–2018 for the network training and 2019–2020 for testing.

2.3. The PM10 Correction Model

The correction model predicts the deviation of CAMS PM10 from in-situ measurements for each 3-hourly pixel. In this setup, correction for every location and time is performed regardless of the pixel’s global position, avoiding over-fitting particular spatial structures and resulting in a more generalized correction model.
CAMS PM10 ground-level error is computed for all timestamps (T) in pixels containing one or more monitoring stations (see Figure 2). We denote this set of pixels as the training pixels (TrP). The TrP error is defined by:
$$e_{i,t} = \frac{1}{|i|} \sum_{j \in i} \left( PM_{i,t}^{CAMS} - PM_{j,t}^{in\text{-}situ} \right), \quad i \in TrP,\ t \in T, \tag{1}$$
where $PM_{i,t}^{CAMS}$ is the CAMS PM10 value at pixel i and time t, $PM_{j,t}^{in\text{-}situ}$ is a PM10 measurement at time t in station j, whose coordinates lie in the area covered by pixel i, and $|i|$ is the number of stations in pixel i. Section 3.3 addresses the difficulty in comparing in-situ observations to reanalysis estimates in more detail.
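Equation (1) is simple enough to pin down in a few lines; the sketch below computes the pixel error from a CAMS value and the in-situ measurements of the stations inside the pixel (the values used in the test are synthetic, not the paper's data).

```python
# Direct transcription of Equation (1): the CAMS error at pixel i and time t
# is the average of (CAMS value - in-situ value) over the stations in the pixel.
# Negative output means CAMS underestimates the in-situ PM10.
import numpy as np

def cams_pixel_error(pm_cams, pm_insitu_stations):
    """e_{i,t} = (1/|i|) * sum_j (PM_cams(i,t) - PM_insitu(j,t))."""
    pm_insitu_stations = np.asarray(pm_insitu_stations, dtype=float)
    return float(np.mean(pm_cams - pm_insitu_stations))
```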
To understand the causes of the CAMS biases, we investigated them directly by learning an error model, $h(x_{i,t})$, which outputs the CAMS error of a sample $(i,t)$ given its meteorological, atmospheric, and remote sensing features. The variables $x_{i,t}$ are explained in Section 2.1.3. In addition, to introduce the process dynamics, $x_{i,t}$ also contains time-lagged versions of the variables (i.e., measured in pixel i at $t-1, t-2, \ldots$) as additional features. To control for seasonal and human activity effects (e.g., working hours), $x_{i,t}$ also includes time features such as the hour of the day, week of year, month, and year.
We learn h from the training pixels of Region A (Figure 1). Next, we use h to predict out-of-training CAMS errors for testing in Region B and to estimate the contribution to the AI forecast in Region C (see Figure 2):
$$\hat{e}_{i,t} = h(x_{i,t}), \quad i \notin TrP,\ t \in T. \tag{2}$$
We evaluated both deep (Feed-Forward Networks) and shallow architectures (Extreme Gradient Boosting (XGB) [34], LightGBM [35], and Histogram Gradient Boosting [36]) for learning h. The XGB algorithm demonstrated the best performance. Finding that a shallow machine-learning architecture outperforms deep learning may indicate that the concise representation x i , t captures the key underlying pixel patterns well. Shallow models are often better suited to leverage concise input representations, as they have a lower tendency to overfit [37].
The function h proved to be most accurate with time-lagged features recording up to 6 h before the prediction time. Extending the pixel history beyond this timeframe did not enhance prediction performance in any of the tested learning architectures.
We learned h by minimizing the mean squared error loss in a regression setting, using a tree-based booster trained on a CUDA-enabled GPU. The model hyperparameters were tuned using a grid search, with tree depths evaluated from 8 to 20, step sizes from 0.01 to 0.1, training instances subsample ratio (subsample) from 0.2 to 0.8, and the tree columns subsample ratio (colsample_bytree) from 0.1 to 0.9. The final parameters (16-node tree depth, 0.05 step size shrinkage, 0.5 subsample ratio, and 0.6 colsample_bytree) minimized validation error while avoiding overfitting. The complete code is available at https://github.com/saginat/CAMS_bias_correction (accessed on 2 January 2025).
Once training is complete, the bias-corrected PM10 fields are derived as follows:
$$PM_{i,t}^{corrected} = PM_{i,t}^{CAMS} - \hat{e}_{i,t}. \tag{3}$$
The above setup is implemented for generating bias-corrected PM10 fields for Region C for 2013–2020 at CAMS original resolution (0.75° × 0.75°, 3-h).

2.4. AI-Based Dust Event Forecasting

We use a deep Convolutional Neural Network (CNN) as the foundation for our dust event forecast model. In previous work [12], we demonstrated the efficiency of a CNN-based architecture in accurately capturing the development and progression of dust storms from PM10 fields that encompass source regions. Alternative architectures, such as Vision Transformers and LSTMs, were also tested but proved less suitable, achieving lower accuracy. We conclude that the superior performance of CNNs stems from their ability to effectively capture the spatiotemporal dynamics governing dust storms, particularly when working with small training sets and complex input data.
The network’s input is a 65 × 67-pixel PM10 field over Region C, and its output is the dust level at 38 ground stations in Region B. The network is trained to forecast 10 dust levels at lead times varying from 0 to 72 h by minimizing the stations’ average cross-entropy loss function [38].
The network is built of convolutional layers with skip connections, ReLU activations, batch normalization, dropout, pooling, and fully connected layers. The complete code is available online at https://github.com/saginat/CAMS_bias_correction (accessed on 2 January 2025).
To evaluate how the correction affects the network’s accuracy, we trained independently initialized networks of the above architecture with two different inputs: $PM^{CAMS}$ and $PM^{corrected}$. Once training is done, we compare the predictive performances of the two networks over a test set. Hyperparameters and training decisions are held constant for the two training processes. Specifically, the Adam optimizer [39] with a 0.001 initial learning rate was used, along with a ReduceLROnPlateau scheduler, reducing the learning rate by a factor of 2 once learning shows no improvement over 5 epochs on a validation set.
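The scheduling rule can be illustrated without framework code. The toy class below mimics the described ReduceLROnPlateau behaviour (halve the learning rate once the validation loss has not improved for five epochs); it is a sketch of the schedule's logic, not the PyTorch implementation.

```python
# Toy stand-in for a ReduceLROnPlateau-style scheduler: track the best
# validation loss and halve the learning rate after `patience` consecutive
# epochs without improvement.
class PlateauHalver:
    def __init__(self, lr=0.001, patience=5, factor=0.5):
        self.lr, self.patience, self.factor = lr, patience, factor
        self.best = float("inf")
        self.bad_epochs = 0

    def step(self, val_loss):
        """Call once per epoch with the validation loss; returns the current LR."""
        if val_loss < self.best:
            self.best = val_loss
            self.bad_epochs = 0
        else:
            self.bad_epochs += 1
            if self.bad_epochs > self.patience:
                self.lr *= self.factor
                self.bad_epochs = 0
        return self.lr
```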
We evaluated the network’s performance in a binary dust event classification setting. The highest PM10 levels (levels 9 and 10) indicate dust events, following the definition in Section 2.1.5. Due to the imbalanced proportion of dust events, we assessed performance using metrics well suited for unequal class distributions, specifically the F1 score and Balanced Accuracy, as further discussed in Section 3.2.
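The evaluation protocol above can be sketched concretely: map predicted dust levels to binary events and score them with scikit-learn. The level values below are invented for the example, not the paper's results.

```python
# Binarize dust levels (levels 9-10 = event) and compute the two
# imbalance-aware metrics used in the paper.
import numpy as np
from sklearn.metrics import f1_score, balanced_accuracy_score

levels_true = np.array([1, 4, 9, 10, 2, 9])   # synthetic observed levels
levels_pred = np.array([2, 4, 10, 8, 1, 9])   # synthetic network outputs

y_true = (levels_true >= 9).astype(int)
y_pred = (levels_pred >= 9).astype(int)

f1 = f1_score(y_true, y_pred)
bacc = balanced_accuracy_score(y_true, y_pred)
```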

3. Results

3.1. PM10 Correction Model Performance

The correction model decreases the CAMS 3-h error over the Region B test stations from 26.3 to 14.6 μg m−3 on average and from 16.3 to 6.1 μg m−3 in the median. A paired t-test between the errors of the corrected and uncorrected CAMS estimates (absolute deviations from the in-situ values) yielded a p-value of effectively zero (<1 × 10−100), indicating a significant improvement. Figure 3 compares the PM10 measured at the stations (blue), their CAMS estimates (orange), and their associated bias-corrected predictions (green) in Region B. Panel (a) shows the day-of-year PM10 average and interquartile range (IQR) over all the pixels containing ground stations in Region B. Panel (b) shows boxplots of the PM10 distribution in six sample pixels, with the location of the pixels shown in Panel (c). Panel (d) shows yearly boxplots of the PM10 error of CAMS and the bias-corrected estimates over the study period.
Figure 3 highlights a consistent underestimation of CAMS PM10 at the Region B stations. As shown in Panel (a), the CAMS underestimation is particularly pronounced during the winter months, when the in-situ PM10 concentrations tend to be more extreme and volatile. Panel (b) shows that the CAMS bias varies over space, mainly depending on the in-situ PM10 levels. Moreover, the in-situ PM10 exhibits high variance, driven by both spatial (between-station) and temporal (between-timestamp) variations, mainly during the winter. CAMS estimates fail to capture these variance patterns, as evidenced by the significantly narrower IQR ribbons in Panel (a) and the shorter boxplot whiskers in Panel (b), indicating lower variance in its predictions. The findings are consistent with previously reported CAMS underestimations in global [40] and regional [41] analyses, specifically within the Mediterranean region.
Compared to CAMS, our bias-corrected PM10 estimates align more closely with the in-situ PM10 distribution. As shown in Panel (a), except for a short period around January, the bias-corrected averages demonstrate strong agreement with the in-situ averages over the year. The bias-corrected estimates also exhibit variance patterns more similar to those of the in-situ PM10, with wider IQRs that generally fall within the bounds of in-situ IQR. Panel (b) shows that, in certain locations, the distributions of in-situ PM10 and the bias-corrected estimates are nearly identical, with medians, IQRs, and extreme values closely matching. Overall, across Region B, the statistical distance between the bias-corrected PM10 distribution and the in-situ measurements is notably smaller than that between the CAMS PM10 estimates and the in-situ data, with Kullback–Leibler divergences [42] of 0.41 and 0.56, respectively.
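A histogram-based Kullback–Leibler divergence estimate of the kind reported above can be sketched as follows; the shared binning and the smoothing constant are illustrative assumptions, not the paper's exact procedure.

```python
# Estimate KL(p || q) between two PM10 samples from shared histogram bins,
# with a small additive smoothing term to avoid log(0) in empty bins.
import numpy as np

def kl_from_samples(p_samples, q_samples, bins=50, eps=1e-9):
    lo = min(np.min(p_samples), np.min(q_samples))
    hi = max(np.max(p_samples), np.max(q_samples))
    edges = np.linspace(lo, hi, bins + 1)
    p, _ = np.histogram(p_samples, bins=edges)
    q, _ = np.histogram(q_samples, bins=edges)
    p = (p + eps) / (p.sum() + eps * bins)   # smoothed, normalized
    q = (q + eps) / (q.sum() + eps * bins)
    return float(np.sum(p * np.log(p / q)))
```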
Panel (d) demonstrates that the errors in the bias-corrected estimates are consistently smaller than those of CAMS (closer to 0) over the study period. The higher errors before 2016 are attributed to a different station distribution in this period, with a greater proportion of stations located in the northern part of Region B, where more extremes are typically observed. Since 2016, the spatial distribution of stations has remained stable. During this period, the bias-corrected estimates were, on average, 14 μg m−3 more accurate than CAMS.
To study whether the CAMS underestimation of the study region can be addressed solely through time and space adjustments, we examined straightforward spatio-temporal bias correction models. XGB and generalized linear models incorporating only spatial (e.g., geographic coordinates and regions) and temporal (e.g., date and hour) components demonstrated much lower performance, with average errors of 19.5 μg m−3 and 22.9 μg m−3, respectively—substantially higher than the 14.6 μg m−3 achieved by the correction model. We conclude that meteorological, atmospheric, and spectral spatiotemporal features play a critical role in correcting CAMS errors. Section 3.3 evaluates the contribution of these features to predicting the CAMS error, and Section 4 discusses potential mechanisms through which they influence the bias patterns.
Figure 4 presents several examples of error corrections obtained at different timestamps over Region B (not used for training). In each panel, the original CAMS PM10 field (left heatmap) is compared with its bias-corrected counterpart (right heatmap), alongside in-situ measurements represented by color-filled points. The root mean squared error (RMSE) for the stations in each panel is shown in the bottom-left corner. White lines outline the dust event areas, defined in Section 2.1.5.
The correction model predicts finer surfaces and sometimes recovers spatial patterns that are not resolved by CAMS. For example, in Panel (c), while CAMS estimates high PM10 concentrations mainly over North Macedonia, the bias-corrected estimates suggest that the event extends southward to West and Central Greece and Attica. The in-situ measurements from the Athens and Thessaloniki areas hint that the correction provides more accurate estimates of the surface PM10 distribution. In another example, the correction returns lower PM10 concentrations (i.e., it decreases a CAMS overestimation), as can be seen in Panel (b), where the bias-corrected PM10 is lower by 100 μg m−3 in certain areas.

3.2. AI-Based Dust Event Forecasting with Bias-Corrected PM10

The CNN dust forecast consistently performs better when trained on the bias-corrected CAMS fields than on the original fields, for every lead time from 0 to 72 h. Figure 5 presents binary dust event classification metrics for the same network architecture when trained on corrected (blue points) and uncorrected (red points) PM10 fields. As a benchmark, the CAMS global composition PM10 forecasts (green points) were examined, as detailed in Section 2.1.
The contribution of the bias correction to the network’s forecast accuracy is clear, consistent, and particularly significant at shorter lead times. Using the corrected PM10 fields enhances the network’s recall—defined as the event detection rate—by approximately seven percentage points for forecasts between 0 to 24 h, and by three percentage points for 24 to 72 h forecasts. To address the balance between false positives and false negatives, we also compare the F1 score, which combines recall with the model’s precision, the latter being the accuracy of positive event predictions. F1 score emphasizes the model’s ability to correctly predict dust events (true positives) while penalizing both false positives (predicting an event when none occurs) and false negatives (missing an actual event).
Our results show that using the corrected PM10 fields leads to an average increase of five percentage points in the F1 score for lead times up to 48 h. Additionally, Balanced Accuracy—which accounts for the imbalance between dust events and non-events by considering both the true positive rate and the true negative rate—provides a more holistic view of model performance. This metric ensures that improvements in detecting dust events do not come at the cost of excessive false positives. Our findings indicate that Balanced Accuracy was consistently higher when using bias-corrected PM10 fields as inputs, reflecting the model’s enhanced ability to maintain reliable performance across both dust events and non-events.

3.3. CAMS PM10 Bias

Besides improving the AI forecast, the correction model allows for gaining a deeper understanding of the error patterns and their sources. Section 3.1 shows that the correction model performs well in untrained areas, although it was trained at specific locations in another region. Building on this success, we argue that it can be used to draw general conclusions about the CAMS bias’s dependence on various environmental factors.
This section examines the bias structure of the CAMS PM10 fields in the entirety of Region A, i.e., between training stations. The region encompasses extensive unmonitored areas, notably in northern parts of Syria and Iraq. Quantifying CAMS PM10 accuracy in these areas is particularly challenging, and the correction model offers a promising approach to address this issue. A more general discussion about potential mechanisms underlying CAMS error is given in Section 4.
Figure 6 shows the monthly medians of the model’s predicted CAMS error over Region A. Negative values indicate CAMS underestimation (see error definition in Equation (1)). The model suggests that CAMS tends to underestimate PM10 in the region, mainly during the fall and winter. The results correspond to CAMS PM10 underestimation over the Balkans and Turkey previously reported in CAMS annual validation reports [41].
The model reveals a complex spatiotemporal bias pattern. For example, during the winter, CAMS prominently underestimates PM10 in central Anatolia and the mountainous border areas with Iraq and Iran. Surprisingly, CAMS shows significantly lower errors in this border triangle area during the spring than in other locations. It is also possible to distinguish patterns related to sea and land. For example, the spatial variance in CAMS errors is lower over the Black Sea.
Using Shapley additive explanations (SHAP) values [43], we explore the explainability of the correction model to better understand the bias factors. The SHAP value of a specific sample (pixel and timestamp) is its contribution to the sample’s model prediction relative to the global space–time average. Figure 7 presents pixel-wise SHAP averages over 2013–2020, representing maps of relative importance (a) and the features’ spatiotemporal SHAP average, indicating an importance index (b).
The time features (e.g., month of the year) receive high SHAP values, indicating strong seasonal contributions to the CAMS error. Apart from the time features, several spatiotemporal features show high predictability with a complex spatial pattern. NDVI, calculated at red (620–670 nm) and near-infrared (841–876 nm) wavelengths, is the most significant feature contributing to the observed errors. NDVI influences the error primarily in the hot, arid, and semi-arid lands of northern Syria and Iraq, but also, to a lesser extent, in the cold, arid steppe region of Central Anatolia. Since NDVI is dominated by the surface albedo, its dominance in the error model may indicate that surface reflectance and absorbance are essential contributors to the CAMS bias in the PM10 fields.
Like NDVI, AOD also contributes to the error, mainly in northern Syria and Iraq. The model implies that in this region, lower AOD observations explain a significant part of the CAMS PM10 underestimation. Zonal surface winds contribute most substantially over the Mediterranean Sea, possibly reflecting desert dust transport from North African sources or fluxes of sea salt aerosols. The spatial patterns of the importance of SLP, as well as of air temperature and specific humidity at 850 hPa, correlate with the region’s climatology and topography.

4. Discussion

City-scale dust event forecasting is a challenging problem. Figure 5 supports this claim, demonstrating the poor performance of CAMS global composition forecasts, which are commonly used in atmospheric sciences and environmental monitoring. Figure 5 also indicates that AI has a high potential to improve these forecasts, demonstrating that the neural network consistently outperforms these CAMS numerical forecasts, regardless of the PM10 input. Although AI’s performance increases when using bias-corrected PM10 inputs, the impact decreases with the forecast lead time, particularly in terms of recall and the F1 score. One possible explanation is that as the forecast horizon extends further, the network relies more on large-scale dust transport patterns, which are associated with growing uncertainties. Since the correction tends to improve predictions on more local scales, its effectiveness diminishes at longer lead times, where these uncertainties are more pronounced.
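The forecast metrics compared in Figure 5 are standard binary classification scores; a minimal sketch with hypothetical event labels and forecasts:

```python
from sklearn.metrics import f1_score, recall_score, balanced_accuracy_score

# Illustrative labels: 1 = dust event at a station for a given lead time.
y_true = [0, 0, 1, 1, 0, 1, 0, 0, 1, 0]
y_cams = [0, 1, 0, 1, 0, 0, 0, 1, 1, 0]  # hypothetical CAMS-input forecast
y_corr = [0, 0, 1, 1, 0, 0, 0, 1, 1, 0]  # hypothetical bias-corrected forecast

for name, y_pred in [("CAMS input", y_cams), ("bias-corrected input", y_corr)]:
    print(name,
          "F1:", round(f1_score(y_true, y_pred), 2),
          "recall:", round(recall_score(y_true, y_pred), 2),
          "balanced acc:", round(balanced_accuracy_score(y_true, y_pred), 2))
```

In the study these scores are computed per lead time (0–72 h) over the Region B validation stations.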
Besides affecting dust event forecasting, PM10 systematic estimation errors have consequences in other applications. A prominent example is environmental epidemiology, where the health effects of environmental exposure, such as PM10, are studied. In statistical parlance, PM10 estimation errors are errors-in-variables in the epidemiological model, which bias the estimation of the exposure effects [5,10]. Reinvestigating exposure effects by replacing CAMS PM10 with bias-corrected estimates in epidemiological studies is left for future research.
The deviation between in-situ and CAMS PM10 estimates may originate from various sources. First, although we treat the in-situ measurements as ground truth, some measurement errors may exist. Nonsystematic measurement errors (for example, imprecise measuring devices) cancel within a stratum and thus affect only the PM10 variance. Systematic measurement errors (e.g., miscalibrated detectors) can introduce bias in unknown ways. We mitigate these effects by filtering out stations and time samples that appear unreliable.
In addition, in-situ observations may capture local conditions that the reanalysis, which represents area averages, often does not resolve. The statistical literature refers to this source of nonsystematic error as Berkson errors [44]. We mitigate this effect by averaging multiple in-situ measurements within each corresponding pixel to ensure that the ground measurements accurately represent the area.
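The pixel-level averaging step can be sketched as a simple group-by; the station IDs and values below are hypothetical:

```python
import pandas as pd

# Hypothetical station records: several stations fall in the same CAMS pixel.
obs = pd.DataFrame({
    "pixel_id": [7, 7, 7, 9],
    "time":     ["2019-03-01"] * 4,
    "station":  ["s1", "s2", "s3", "s4"],
    "pm10":     [44.0, 52.0, 48.0, 31.0],
})

# Average all in-situ measurements inside each pixel so the ground value
# represents the same area average as the reanalysis pixel.
pixel_obs = obs.groupby(["pixel_id", "time"], as_index=False)["pm10"].mean()
print(pixel_obs)
```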
The CAMS PM10 product is computed from the mass mixing ratios of multiple aerosol tracers, including sea salt, desert dust, nitrate, organic matter, and more [45]. Estimation errors in the individual tracers propagate into the PM10 calculation, which complicates tracing its bias sources. Previous CAMS validations identified seasonal and regional error factors in PM10’s constituent components [40,46]. For example, organic matter estimation error is associated with differences between desert dust and AOD observations over desert regions in North Africa and the Middle East [46]. Such complex relations may explain some of the observed spatiotemporal variance patterns of CAMS PM10 bias (Figure 6).
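Conceptually, this propagation works through a weighted sum of tracer mass mixing ratios; the sketch below uses purely illustrative tracer names and weights, not the operational CAMS coefficients (those are specified in the IFS documentation [45]):

```python
# Illustrative only: weights are NOT the operational CAMS coefficients.
tracers_kg_per_kg = {          # mass mixing ratios at the lowest model level
    "sea_salt":    2.0e-9,
    "desert_dust": 1.5e-8,
    "organic":     4.0e-9,
    "nitrate":     1.0e-9,
}
weights = {"sea_salt": 0.25, "desert_dust": 1.0, "organic": 1.0, "nitrate": 1.0}
air_density = 1.2              # kg m-3, near-surface (assumed)

pm10_kg_m3 = air_density * sum(w * tracers_kg_per_kg[k]
                               for k, w in weights.items())
pm10_ug_m3 = pm10_kg_m3 * 1e9
print(f"PM10 ~ {pm10_ug_m3:.1f} ug m-3")
# An error in any single tracer enters the PM10 total linearly, which is
# why the combined bias is hard to attribute to one source.
```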
Another potential source of discrepancy may lie in the assimilation process. Previous studies suggest that a substantial part of the reanalysis bias could stem from limited in-situ spatial coverage [47,48,49]. We examined whether CAMS errors correlate with the level of in-situ representation in the area, calculated based on the number of ground stations within the nearby pixel region. However, we did not find any statistically significant evidence for this source of bias.
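Such a check can be sketched as a simple correlation test between per-pixel station counts and errors; the data below are synthetic and deliberately independent:

```python
import numpy as np
from scipy.stats import pearsonr

rng = np.random.default_rng(1)

# Hypothetical per-pixel values: mean absolute CAMS error and the number of
# ground stations in the surrounding pixel neighbourhood.
n_stations = rng.integers(0, 10, size=200)
abs_error = rng.gamma(shape=2.0, scale=6.0, size=200)  # independent of coverage

r, p_value = pearsonr(n_stations, abs_error)
print(f"r = {r:.3f}, p = {p_value:.3f}")
# A large p-value would mirror the paper's finding: no statistically
# significant link between in-situ coverage and CAMS error.
```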
The difference between CAMS and the in-situ measurements may also lie in an incorrect estimation of the vertical distribution of the dust. For example, dust plumes may exhibit unique vertical profiles with differences between ground and higher atmospheric levels [50]. Reanalyses like CAMS, which heavily assimilate satellite column observations, may misestimate surface concentrations in such plumes because the three-dimensional structure is difficult to constrain. Evidence for this can be found in the high SHAP values of surface meteorological features, such as the zonal and meridional wind at 10 m or ground-level humidity (Figure 7). This indicates a significant contribution of spatiotemporal surface conditions to the predicted error of the CAMS PM10 concentration. In addition, we find that the contribution of CAMS dust to the CAMS PM10 error changes drastically with altitude. Closest to the surface, at model level 50, the dust aerosol (0.9–20 μm) feature is among the most important, with an importance index of 0.75. In contrast, the importance index falls to 0.4, 0.3, and 0.1 at model levels 40, 30, and 20, respectively.

5. Conclusions

This study proposes a machine-learning approach to correct CAMS PM10 reanalysis using in-situ data, which enhances AI-based dust event forecasting.
Using meteorological, atmospheric, and remotely sensed data, we train a gradient-boosting algorithm to learn a correction model and predict CAMS deviations from ground measurements over Anatolia and parts of the Armenian Highlands. Tested on in-situ measurements from unseen areas in the Balkan Peninsula, our bias-corrected fields are more accurate than CAMS by 12 μg m−3 on average and often reduce CAMS errors by tens of percent, depending on time and space. Key predictive features include temporal factors, NDVI, surface winds, and AOD.
To estimate the correction’s potential for improving AI-based early warnings, we compare the AI forecast performance in two scenarios: (i) AI is trained on CAMS data, and (ii) AI is trained on bias-corrected CAMS data. Replacing CAMS with bias-corrected fields improves the network’s performance in all metrics and lead times (0–72 h). Specifically, at the same precision, it increases the 18 h detection rate (recall) by 11 percentage points. Considering the event rate in the test region, this translates to roughly 180 h of dust events per year that could be predicted in advance.
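The conversion from recall gain to hours, implied by the stated figures, is simple arithmetic:

```python
recall_gain = 0.11    # +11 percentage points at fixed precision
extra_hours = 180.0   # additional correctly flagged event hours per year

# Implied annual dust-event load in the test region:
event_hours_per_year = extra_hours / recall_gain
print(f"~{event_hours_per_year:.0f} dust-event hours per year")
print(f"~{100 * event_hours_per_year / 8760:.1f}% of all hours in a year")
```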

Author Contributions

Conceptualization, R.S.; methodology, R.S., S.N., D.N. and Y.R.; software, S.N. and R.S.; validation, R.S., S.N., D.N., S.K. and Y.R.; formal analysis, R.S. and S.N.; investigation, R.S. and S.N.; resources, Y.R.; data curation, R.S., S.N. and D.N.; writing—original draft preparation, R.S.; writing—review and editing, S.N., D.N., S.K. and Y.R.; visualization, S.N.; supervision, Y.R.; project administration, R.S. and Y.R.; funding acquisition, Y.R. All authors have read and agreed to the published version of the manuscript.

Funding

This research was partially supported by the MBZUAI-WIS Joint Program for Artificial Intelligence Research.

Data Availability Statement

All data supporting the reported results are publicly available and can be found at: https://toar-data.org/ (accessed on 2 January 2025) (TOAR database); https://www.ecmwf.int/en/forecasts/dataset/cams-global-reanalysis (accessed on 2 January 2025) (CAMS reanalysis); https://cds.climate.copernicus.eu/datasets/reanalysis-era5-single-levels (accessed on 2 January 2025) (ERA5 reanalysis); https://lpdaac.usgs.gov/products/mod13a2v006/ (accessed on 2 January 2025) (NDVI data); https://www.ecmwf.int/en/forecasts/dataset/cams-global-atmospheric-composition-forecasts (accessed on 2 January 2025) (CAMS global atmospheric composition forecast).

Conflicts of Interest

The authors declare no conflicts of interest.

References

  1. Ginoux, P.; Prospero, J.M.; Gill, T.E.; Hsu, N.C.; Zhao, M. Global-scale attribution of anthropogenic and natural dust sources and their emission rates based on MODIS Deep Blue aerosol products. Rev. Geophys. 2012, 50, RG3005. [Google Scholar] [CrossRef]
  2. Middleton, N.J. Desert dust hazards: A global review. Aeolian Res. 2017, 24, 53–63. [Google Scholar] [CrossRef]
  3. Goudie, A.S. Desert dust and human health disorders. Environ. Int. 2014, 63, 101–113. [Google Scholar] [CrossRef] [PubMed]
  4. Ward-Caviness, C.K.; Nwanaji-Enwerem, J.C.; Wolf, K.; Wahl, S.; Colicino, E.; Trevisi, L.; Kloog, I.; Just, A.C.; Vokonas, P.; Cyrys, J.; et al. Long-term exposure to air pollution is associated with biological aging. Oncotarget 2016, 7, 74510. [Google Scholar] [CrossRef] [PubMed]
  5. Sarafian, R.; Kloog, I.; Rosenblatt, J.D. Optimal-design domain-adaptation for exposure prediction in two-stage epidemiological studies. J. Expo. Sci. Environ. Epidemiol. 2022, 33, 963–970. [Google Scholar] [CrossRef]
  6. Nissenbaum, D.; Sarafian, R.; Rudich, Y.; Raveh-Rubin, S. Six types of dust events in Eastern Mediterranean identified using unsupervised machine-learning classification. Atmos. Environ. 2023, 309, 119902. [Google Scholar] [CrossRef]
  7. Fazzini, P.; Montuori, M.; Pasini, A.; Cuzzucoli, A.; Crotti, I.; Campana, E.F.; Petracchini, F.; Dobricic, S. Forecasting PM Levels Using Machine Learning Models in the Arctic: A Comparative Study. Remote Sens. 2023, 15, 3348. [Google Scholar] [CrossRef]
  8. Fluck, E.; Raveh-Rubin, S. Dry air intrusions link Rossby wave breaking to large-scale dust storms in Northwest Africa: Four extreme cases. Atmos. Res. 2023, 286, 106663. [Google Scholar] [CrossRef]
  9. Baladima, F.; Thomas, J.L.; Voisin, D.; Dumont, M.; Junquas, C.; Kumar, R.; Lavaysse, C.; Marelle, L.; Parrington, M.; Flemming, J. Modeling an extreme dust deposition event to the French Alpine seasonal snowpack in April 2018: Meteorological context and predictions of dust deposition. J. Geophys. Res. Atmos. 2022, 127, e2021JD035745. [Google Scholar] [CrossRef]
  10. Szpiro, A.A.; Paciorek, C.J. Measurement error in two-stage analyses, with application to air pollution epidemiology. Environmetrics 2013, 24, 501–517. [Google Scholar] [CrossRef] [PubMed]
  11. Inness, A.; Ades, M.; Agustí-Panareda, A.; Barré, J.; Benedictow, A.; Blechschmidt, A.M.; Dominguez, J.J.; Engelen, R.; Eskes, H.; Flemming, J.; et al. The CAMS reanalysis of atmospheric composition. Atmos. Chem. Phys. 2019, 19, 3515–3556. [Google Scholar] [CrossRef]
  12. Sarafian, R.; Nissenbaum, D.; Raveh-Rubin, S.; Agrawal, V.; Rudich, Y. Deep multi-task learning for early warnings of dust events implemented for the Middle East. npj Clim. Atmos. Sci. 2023, 6, 23. [Google Scholar] [CrossRef]
  13. Pappa, A.; Kioutsioukis, I. Forecasting particulate pollution in an urban area: From copernicus to sub-km scale. Atmosphere 2021, 12, 881. [Google Scholar] [CrossRef]
  14. Stortini, M.; Arvani, B.; Deserti, M. Operational forecast and daily assessment of the air quality in Italy: A copernicus-CAMS downstream service. Atmosphere 2020, 11, 447. [Google Scholar] [CrossRef]
  15. Ryu, Y.H.; Min, S.K. Long-term evaluation of atmospheric composition reanalyses from CAMS, TCR-2, and MERRA-2 over South Korea: Insights into applications, implications, and limitations. Atmos. Environ. 2021, 246, 118062. [Google Scholar] [CrossRef]
  16. Sekmoudi, I.; Khomsi, K.; Faieq, S.; Idrissi, L. Assessment of global and regional PM 10 CAMSRA data: Comparison to observed data in Morocco. Environ. Sci. Pollut. Res. 2021, 28, 29984–29997. [Google Scholar] [CrossRef]
  17. Ali, M.A.; Bilal, M.; Wang, Y.; Nichol, J.E.; Mhawish, A.; Qiu, Z.; de Leeuw, G.; Zhang, Y.; Zhan, Y.; Liao, K.; et al. Accuracy assessment of CAMS and MERRA-2 reanalysis PM2.5 and PM10 concentrations over China. Atmos. Environ. 2022, 288, 119297. [Google Scholar] [CrossRef]
  18. Shtein, A.; Kloog, I.; Schwartz, J.; Silibello, C.; Michelozzi, P.; Gariazzo, C.; Viegi, G.; Forastiere, F.; Karnieli, A.; Just, A.C.; et al. Estimating daily PM2.5 and PM10 over Italy using an ensemble model. Environ. Sci. Technol. 2019, 54, 120–128. [Google Scholar] [CrossRef]
  19. Sarafian, R.; Kloog, I.; Just, A.C.; Rosenblatt, J.D. Gaussian markov random fields versus linear mixed models for satellite-based PM2.5 assessment: Evidence from the northeastern USA. Atmos. Environ. 2019, 205, 30–35. [Google Scholar] [CrossRef]
  20. Hough, I.; Sarafian, R.; Shtein, A.; Zhou, B.; Lepeule, J.; Kloog, I. Gaussian Markov random fields improve ensemble predictions of daily 1 km PM2.5 and PM10 across France. Atmos. Environ. 2021, 264, 118693. [Google Scholar] [CrossRef]
  21. Sarafian, R.; Kloog, I.; Sarafian, E.; Hough, I.; Rosenblatt, J.D. A domain adaptation approach for performance estimation of spatial predictions. IEEE Trans. Geosci. Remote Sens. 2020, 59, 5197–5205. [Google Scholar] [CrossRef]
  22. Riccio, A.; Chianese, E. Accurate, reliable, and high-resolution air quality predictions by improving the Copernicus Atmosphere Monitoring Service using a novel statistical post-processing method. Atmos. Chem. Phys. 2024, 24, 1673–1689. [Google Scholar] [CrossRef]
  23. Bertrand, J.M.; Meleux, F.; Ung, A.; Descombes, G.; Colette, A. Improving the European air quality forecast of the Copernicus Atmosphere Monitoring Service using machine learning techniques. Atmos. Chem. Phys. 2023, 23, 5317–5333. [Google Scholar] [CrossRef]
  24. Hernanz, A.; García-Valero, J.A.; Domínguez, M.; Rodríguez-Camino, E. A critical view on the suitability of machine learning techniques to downscale climate change projections: Illustration for temperature with a toy experiment. Atmos. Sci. Lett. 2022, 23, e1087. [Google Scholar] [CrossRef]
  25. Hernanz, A.; Correa, C.; Sánchez-Perrino, J.C.; Prieto-Rico, I.; Rodríguez-Guisado, E.; Domínguez, M.; Rodríguez-Camino, E. On the limitations of deep learning for statistical downscaling of climate change projections: The transferability and the extrapolation issues. Atmos. Sci. Lett. 2023, 25, e1195. [Google Scholar] [CrossRef]
  26. Schär, C.; Fuhrer, O.; Arteaga, A.; Ban, N.; Charpilloz, C.; Di Girolamo, S.; Hentgen, L.; Hoefler, T.; Lapillonne, X.; Leutwyler, D.; et al. Kilometer-scale climate models: Prospects and challenges. Bull. Am. Meteorol. Soc. 2020, 101, E567–E587. [Google Scholar] [CrossRef]
  27. Karger, D.N.; Lange, S.; Hari, C.; Reyer, C.P.; Conrad, O.; Zimmermann, N.E.; Frieler, K. CHELSA-W5E5: Daily 1 km meteorological forcing data for climate impact studies. Earth Syst. Sci. Data Discuss. 2022, 15, 2445–2464. [Google Scholar] [CrossRef]
  28. Asadollah, S.B.H.S.; Sharafati, A.; Motta, D.; Jodar-Abellan, A.; Pardo, M.Á. Satellite-based prediction of surface dust mass concentration in southeastern Iran using an intelligent approach. Stoch. Environ. Res. Risk Assess. 2023, 37, 3731–3745. [Google Scholar] [CrossRef]
  29. Alshammari, R.K.; Alrwais, O.; Aksoy, M.S. Machine Learning Forecast of Dust Storm Frequency in Saudi Arabia Using Multiple Features. Atmosphere 2024, 15, 520. [Google Scholar] [CrossRef]
  30. Aryal, Y. Application of Artificial Intelligence Models for Aeolian Dust Prediction at Different Temporal Scales: A Case with Limited Climatic Data. AI 2022, 3, 707–718. [Google Scholar] [CrossRef]
  31. Schultz, M.G.; Schröder, S.; Lyapina, O.; Cooper, O.R.; Galbally, I.; Petropavlovskikh, I.; Von Schneidemesser, E.; Tanimoto, H.; Elshorbany, Y.; Naja, M.; et al. Tropospheric Ozone Assessment Report: Database and metrics data of global surface ozone observations. Elem. Sci. Anthr. 2017, 5, 58. [Google Scholar] [CrossRef]
  32. Hersbach, H.; Bell, B.; Berrisford, P.; Hirahara, S.; Horányi, A.; Muñoz-Sabater, J.; Nicolas, J.; Peubey, C.; Radu, R.; Schepers, D.; et al. The ERA5 global reanalysis. Q. J. R. Meteorol. Soc. 2020, 146, 1999–2049. [Google Scholar] [CrossRef]
  33. WHO. Ambient (Outdoor) Air Pollution. In Air Quality Guidelines; WHO: Geneva, Switzerland, 2018; pp. 15–25. [Google Scholar]
  34. Chen, T.; Guestrin, C. Xgboost: A scalable tree boosting system. In Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, San Francisco, CA, USA, 13–17 August 2016; pp. 785–794. [Google Scholar]
  35. Ke, G.; Meng, Q.; Finley, T.; Wang, T.; Chen, W.; Ma, W.; Ye, Q.; Liu, T.Y. Lightgbm: A highly efficient gradient boosting decision tree. In Proceedings of the Advances in Neural Information Processing Systems, Long Beach, CA, USA, 4–9 December 2017; Volume 30. [Google Scholar]
  36. Guryanov, A. Histogram-based algorithm for building gradient boosting ensembles of piecewise linear decision trees. In Proceedings of the Analysis of Images, Social Networks and Texts: 8th International Conference, AIST 2019, Kazan, Russia, 17–19 July 2019; Revised Selected Papers 8; Springer: Cham, Switzerland, 2019; pp. 39–50. [Google Scholar]
  37. Goodfellow, I. Deep Learning; MIT Press: Cambridge, MA, USA, 2016. [Google Scholar]
  38. Mao, A.; Mohri, M.; Zhong, Y. Cross-entropy loss functions: Theoretical analysis and applications. In Proceedings of the International Conference on Machine Learning, PMLR, Honolulu, HI, USA, 23–29 July 2023; pp. 23803–23828. [Google Scholar]
  39. Kingma, D.P.; Ba, J. Adam: A method for stochastic optimization. arXiv 2014, arXiv:1412.6980. [Google Scholar]
  40. Arola, A.; Basart, S.; Benedictow, A.; Bennouna, Y.; Blechschmidt, A.M.; Cuevas, E.; Errera, Q.; Eskes, H.; Kapsomenakis, J.; Langerock, B.; et al. Validation Report of the CAMS Near-Real-Time Global Atmospheric Composition Service: Period September–November 2021; ECMWF: Reading, UK, 2022. [Google Scholar] [CrossRef]
  41. Meleux, F. Annual Report on the Evaluation of Validated Reanalyses VRA 2020; INERIS: Verneuil-en-Halatte, France, 2023. [Google Scholar]
  42. Csiszár, I. I-divergence geometry of probability distributions and minimization problems. Ann. Probab. 1975, 3, 146–158. [Google Scholar] [CrossRef]
  43. Lundberg, S.M.; Lee, S.I. A unified approach to interpreting model predictions. In Proceedings of the Advances in Neural Information Processing Systems, Long Beach, CA, USA, 4–9 December 2017; Volume 30. [Google Scholar]
  44. Buonaccorsi, J.P. Measurement Error: Models, Methods, and Applications; Chapman and Hall/CRC: Boca Raton, FL, USA, 2010; pp. 76–78. [Google Scholar]
  45. IFS Documentation Cy48r1, Part VIII: Atmospheric Composition; European Centre for Medium-Range Weather Forecasts: Reading, UK, 2023.
  46. Errera, Q.; Bennouna, Y.; Schulz, M.; Eskes, H.; Basart, S.; Benedictow, A.M.; Blechschmidt, A.M.; Chabrillat, S.; Clark, H.; Cuevas, E.; et al. Validation Report for the CAMS Global Reanalyses of Aerosols and Reactive Trace Gases, Years 2003–2020; ECMWF: Reading, UK, 2021. [Google Scholar]
  47. Chen, G.; Iwasaki, T.; Qin, H.; Sha, W. Evaluation of the warm-season diurnal variability over East Asia in recent reanalyses JRA-55, ERA-Interim, NCEP CFSR, and NASA MERRA. J. Clim. 2014, 27, 5517–5537. [Google Scholar] [CrossRef]
  48. Chen, B.; Liu, Z. Global water vapor variability and trend from the latest 36 year (1979 to 2014) data of ECMWF and NCEP reanalyses, radiosonde, GPS, and microwave satellite. J. Geophys. Res. Atmos. 2016, 121, 11–442. [Google Scholar] [CrossRef]
  49. Alghamdi, A.S. Evaluation of four reanalysis datasets against radiosonde over Southwest Asia. Atmosphere 2020, 11, 402. [Google Scholar] [CrossRef]
  50. Peshev, Z.; Deleva, A.; Vulkova, L.; Dreischuh, T. Large-Scale Saharan Dust Episode in April 2019: Study of Desert Aerosol Loads over Sofia, Bulgaria, Using Remote Sensing, in-situ, and Modeling Resources. Atmosphere 2022, 13, 981. [Google Scholar] [CrossRef]
Figure 1. The study domain. The correction model is trained over the stations (points) in Region A (red box) and tested over the stations of Region B (blue box). The corrected CAMS PM10 fields are predicted over Region C (green box) and serve as inputs to a neural network that forecasts dust events over stations in Region B.
Figure 2. AI-based forecasting and correction setups. (a) Dust event forecasting is based on CAMS PM10 fields. Two scenarios are illustrated: CAMS PM10 fields serve as the AI model input (red box); bias-corrected PM10 fields, generated using a correction model, serve as the AI model input (green box). The input fields are from regions not seen during correction model training. In both AI models, PM10 fields are passed to a CNN architecture that forecasts dust events in a region of interest at specific locations. (b) The correction model is trained to predict the deviation of CAMS PM10 from the average in-situ measurements (points) over the training pixels (black framed) using the training pixels’ spatial features (meteorological, atmospheric, and remote sensing data). Bias-corrected PM10 fields are calculated by subtracting the predicted error from the original CAMS fields (Equation (3)). The correction model is tested using in-situ measurements of unseen regions.
Figure 3. (a): A comparison of the in-situ PM10 measurements (blue), their CAMS PM10 estimates (orange), and their associated bias-correction predictions (green) as defined in Equation (3), in Region B test stations over the study period. Lines indicate the 5-day moving averages of the day of the year. Shaded ribbons indicate the range between the 75th and 25th percentiles (IQR). (b): Boxplots of the PM10 estimates mentioned in (a) in six selected pixels in the test set. (c): Locations of the pixels shown in Panel (b). (d): Boxplots of the error (deviation from the in-situ PM10 measurements) of CAMS (orange) and the bias-corrected (green) for every year in the study. The horizontal line indicates zero error.
Figure 4. Several examples (a–d) compare the original CAMS (left panels) and the bias-corrected PM10 estimates (defined in Equation (3); right panels) over Region B, which is unseen during training, at different timestamps. The heatmap color indicates the PM10 concentration in μg m−3. White lines indicate dust event areas (PM10 > 50 μg m−3) according to the specific estimate. Dots indicate active ground stations, with colors corresponding to their in-situ PM10 measurements. Black lines indicate the region’s coastlines.
Figure 5. A comparison of the dust event forecast performance at 0–72 h lead times for three models: (i) a network that was trained on original CAMS PM10 (red); (ii) a network that was trained on our bias-corrected PM10 (blue); (iii) the CAMS global atmospheric composition forecasts (green). The forecasts’ F1 score (Panel (a)), Recall (Panel (b)), and Balanced Accuracy (Panel (c)) are computed over the validation set (stations of Region B). Lines represent linear trend fits.
Figure 6. Maps of Region A’s monthly medians of estimated CAMS PM10 error in μg m−3 (defined in Equation (1)). Negative values mean CAMS underestimation. Black lines indicate the region’s coastlines.
Figure 7. (a): Pixel-wise average of absolute SHAP values for selected features in Region A over 2013–2020 (computed over timestamps unseen during training only). The SHAP values are the contribution to the error model prediction (Equation (2)) from the average CAMS error. Values are comparable between pixels and features and represent maps of relative importance. (b): Importance index. Computed as the average absolute SHAP values per feature over all pixels and time lags. The first 20 most significant features are presented.
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

Share and Cite

MDPI and ACS Style

Sarafian, R.; Nathan, S.; Nissenbaum, D.; Khan, S.; Rudich, Y. Correction of CAMS PM10 Reanalysis Improves AI-Based Dust Event Forecast. Remote Sens. 2025, 17, 222. https://doi.org/10.3390/rs17020222

