1. Introduction
Data ingestion and data integration are grand challenges of today’s age of digital agriculture. The enormous volume of data produced by all phases of agriculture, such as field observations, satellite imagery, flux towers or soil/plant sensors, has enabled data-driven decision making for improving agricultural productivity [
1,
2,
3]. However, the inherently fragmented nature of observational data due to variable temporal and/or spatial resolution has made the integration of these data products challenging and often impractical [
4]. As a result, previous multidimensional assessments of agricultural productivity and environmental impacts have often fallen short of data-measured potential. To perform multidimensional studies across broads regions, agricultural researchers are increasingly turning to process-based simulation models [
5,
6], such as the The Agricultural Production Systems sIMulator (APSIM) [
7] or the Decision Support System for Agrotechnology Transfer (DSSAT) [
8]. These models integrate state-of-the-art knowledge on a multitude of soil and crop processes to enable analyses of higher dimensionality than what is possible with field experiments. These pioneering models are at the core of many recent forecasting and climate impact assessment efforts around the world [
9,
10,
11]. However, due to the large number of uncertain parameters within these models, their predictive capacity is limited in accuracy, precision, or both for real-world applications [
12,
13].
Although constraining parameters in nonlinear models is a common task across many different disciplines, optimizing and constraining process-based crop model parameters is a particularly challenging task for several reasons [
14]. First, process-based crop models are computationally expensive, and it is often impractical to optimize them using “big data” that spans a large number of sites and/or years [
15]. As a result, crop model calibration exercises are typically limited to a single site or single data constraint, an approach that is in direct contrast to the diverse range of available observations produced by all phases of agriculture [
13]. Second, the observational data used for calibrating crop models often have substantial associated uncertainty due to low sample size. Since most numerical optimization techniques lack mechanisms to account for uncertainty in observational data, their application can potentially lead to wrongly over-confident model predictions. Third, it is unclear how observational data from multiple sites and years can be simultaneously ingested. The most common approach is to independently calibrate crop models at different sites (known as site-level calibration); however, this approach assumes that all sites are completely independent and ignores the potential of across-site information. Consequently, site-level calibration offers limited potential for upscaling model simulations to new sites and across broad regions [
15,
16]. On the other hand, global optimization (known as joint calibration) assumes no site-to-site variability and pools all observations to identify a combination of parameters that minimizes model prediction error at all sites simultaneously. Past studies have shown that both approaches have trouble estimating the “true” uncertainty in model parameters and offer no formal distinction between “within-sample” prediction and “out-of-sample” prediction [
17]. In addition, these approaches provide no clear path to capturing spatiotemporal variability in model parameters. Lastly, process-based models are calibrated with a limited number of observations that often pales in comparison to the potential list of parameters impacting the simulated results. This often results in equifinality, such that calibration does not lead to unique parameter values and different combinations of parameter values can give the same results [
14]. Due to the above-mentioned reasons, most model calibration studies lack proper constraint and accounting of uncertainty in model parameters and, therefore, produce models that perform well only in a subspace of the genetics (G), environment (E), and management (M) (G × E × M) inference space and, consequently, have limited generalizability for broader applications.
Earth Observations (EOs) with their extensive temporal and spatial coverage have provided the scientific community with a unique opportunity to monitor and map plant status [
18]. Radiative Transfer Models (RTM) have enabled retrieval of biophysical and biochemical plant traits from EOs through both passive and active measurements [
19,
20]. These estimates are spatially contiguous and temporally frequent and they offer a substantially larger sample size compared to field measurements [
21]. Consistent and long-term plant traits obtained from EOs could potentially inform process-based models, constraining multiple state variables, such as LAI, plant N concentration, and plant biomass, and/or model parameters, such as thermal times for phenological stages. Thus, EOs could help to overcome the spatiotemporal limitations of field experiment data and could serve as a powerful resource for multi-criteria crop model calibration across broad regions [
21]. However to leverage EOs, novel and flexible methods are needed that allow for streamlining data extraction, ingestion, and integration into crop models with the goal of constraining model parameters and state variables.
As one of the most widely used process-based crop models, APSIM model has been calibrated using a variety of different optimization techniques and observational data types for simulating maize across the U.S. Midwest [
22,
23,
24,
25]. These studies have often achieved high yet inconsistent prediction accuracy in estimating maize grain yields. For instance, ref. [
26] tested the APSIM model with 12 years of experimental data covering maize-maize and maize-soybean rotations in IA. APSIM captured changes in maize yield due to nitrogen fertilizer treatments but still made large prediction errors in a few site-years. The authors attributed such errors to missing crop damage processes (i.e., hail), missing spatial information (i.e., elevation), and an overly simple N-uptake routine, which emphasized N stress at floral initiation. Ref. [
27] also calibrated and tested APSIM at several maize sites in the state of Iowa in the U.S. and found the model to be highly accurate, explaining 87% of the variability in maize yields and achieving a Normalized Root Mean Square Error (nRMSE) value of 7.7%. However, the authors also highlighted the fact that their calibration and validation datasets originated from a similar study area and, consequently, experienced the same weather patterns. Consequently, their calibrated model was tested under limited environmental conditions. Ref. [
24] calibrated APSIM for 6 years of experimental data covering maize-maize, maize-soybean, and winter cover crop rotations in IA to investigate crop water use efficiency. When performing two-year simulations, they found the model was successful in predicting yield for the maize-maize rotations with an RMSE of 723 kg/ha. Ref. [
28] calibrated the APSIM model for 16 site-years in the U.S. Midwest that spanned different rotations and fertilizer rates. They found the calibrated model to perform better when N fertilizers were applied since the model only had to rely on correct water limitation simulation. Moreover, ref. [
29] used default APSIM settings to test the model in predicting maize stover and grain yield at three locations. Against 113 observations, APSIM achieved an RMSE of 1241 kg/ha (nRMSE 14%) for maize grain yield estimates, and the authors also noted improved simulation with increased N fertilizer rates (i.e., lower N stress conditions).
Although the APSIM model has been calibrated and validated extensively using field observations, the unprecedented data coverage provided by EOs has never been leveraged to our knowledge to optimize APSIM and broaden its successful application. In this study, we present a mathematical framework that constrains APSIM model parameters with EOs for maize simulations in the U.S. Midwest using emulation and three Bayesian optimization schemes: global, individual, and hierarchical. The overall objectives of the current study are: (1) to assess the potential of EOs for improving maize yield prediction using the APSIM model; and (2) to compare different optimization techniques with varying degrees of information sharing across sites in their maize yield prediction in the U.S. Midwest.
4. Discussion
Although direct measurements are not available for all optimized parameters, a recent study by [
51], explored the variability of leaf appearance rate across 98 sites for maize in the U.S. Midwest. Their results showed that the average first phase phyllochron for modern maize hybrids in the U.S. Midwest is 57.9 ± 7.5 (
C-day
). This aligns with the results of this study, such that the mean estimates of this parameter was 64, 60 and 56 (
C-day
) for the global, site-level, and HPDA schemes, respectively. In another study with manual calibration, ref. [
27] found that leaf appearance occurs at a rate of 57 (
C-day
), a rate that is significantly faster than APSIM maize’s default rate of 65 (
C-day
) in the U.S. Corn Belt.
Similar to [
17], who did not find any improvement in model accuracy using HPDA over global optimization scheme on “out-of-sample” sites, we did not find any improvement in maize yield prediction using HPDA. However, ref. [
17] suggested that the difference in performance between HPDA and global optimization will increase in favor of HPDA as the number of sites/parameters increases. In contrast to completely generic (global) or completely site-specific (site-level) model parameters, HPDA offers an alternative solution that allows under-sampled sites to borrow strength from sites with more data to achieve better constraint in model predictions [
52,
53]. HPDA has the advantage of having a formal distinction between prediction at known calibration sites and “out-of-sample” sites [
17]. When predictions are made for “within-sample” sites, site-level parameters (
,
Table 1) can be used to make predictions, whereas the global mean (
) estimated through HPDA is used to make predictions on “out-of-sample” sites. Furthermore, when designing HPDA optimization schemes, extra attention needs to be paid in deciding which parameters should be random. By attempting to capture site-to-site variability, the total number of parameters in our hierarchical model increases, which could be limiting in cases with low sample sizes. However, this is less of a concern for EOs given their extensive spatiotemporal coverage and larger sample size compared to field experiments. Similar to [
23] which accounted only for the effect of year in APSIM model parameters, future works on HPDA also may need to explore the impact of including a year effect in addition to the site effect to estimate the full spatiotemporal variability in APSIM maize parameters.
Site-level model calibration showed variable performance in constraining LAI and yield estimates, with performance operating as a function of data density across within-sample sites. This points to the inherently limited nature of this calibration scheme for application to broad regions. Moreover, this limitation in site-level calibration is even more pronounced due to the “perfect model” assumption behind all calibration procedures. The “perfect model” assumption ignores any structural error in process-based crop models, assuming discrepancies between model estimates and data originate solely from non-optimum parameters. Calibration attempts to correct for these discrepancies in the model by adjusting the parameter values so that the model performs well within the domain of the training dataset. However, due to biased parameter estimates which often differ from the “true” parameter values, the final parameterized model often underperforms in out-of-sample sites [
54].
Ref. [
54] suggested it is often impractical to fit multiple data streams simultaneously using complex computer simulation models due to internal constraints, such as mass balance and structural error. They argued that calibration schemes tend to find an optimum set of parameter values that corrects for error in the more data-rich output at the cost of larger errors or overly narrow confidence intervals in data-poor streams. In this study, though an unbalanced dataset of LAI and NDVI was used, APSIM performed reasonably well in simulating both outputs and, overall, no trade-off was found between NDVI and LAI predictions (e.g., Site 79924).
No significant correlation was found between maximum a posteriori of optimized parameters and different soil or weather variables across all training sites suggesting that the estimated site effect is potentially also reflecting a confounding G X E effect and not just an environmental (’E’) effect. However, in order to untangle this confounding G x E interaction, more replications are needed for the cultivars used in the training sites. In a similar study, ref. [
17] calibrated the the Simplified Photosynthesis and Evapotranspiration (SIPNET) model through HPDA across 12 temperate deciduous Ameriflux sites. They identified a missing temperature response in respiration and photosynthesis and associated it with a lack of thermal acclimation and adaptation in the model. The missing temperature response was found due to large site-site variability (’E’ effect) found in parameters related to these processes.
For within-sample model application, the global optimization approach offered the most robust (least variability) parameter estimates and provided the most well-constrained LAI and yield predictions. This shows the potential for global optimization in low sample size prediction problems. Global optimization also performed substantially better than HPDA in “out-of-sample” sites, especially when sites fell outside of the spatiotemporal extent of the training data (
Table 3).
Ref. [
55] states that describing the state of knowledge for complex ecological systems clearly and accurately is only possible when one has accounted for all sources of uncertainty. These sources include uncertainty in model parameters, initial and boundary conditions, and agricultural practices [
12], similar to those propagated in this study. Bayesian optimization techniques offer a systematic approach for accounting for these uncertainties [
56,
57] and are often preferred over numerical methods for calibrating process-based crop models because (1) when new data becomes available, posterior densities from previous studies can be used as an informed prior to rapidly incorporate new information and (2) Bayesian techniques allow for incorporating error in observational data, which, given the noisy nature of EOs, can be essential to account for true uncertainty in model parameters [
21].
The larger sample size and multi-faceted observational data provided by EOs can potentially reduce the risk of equifinality in model calibration by constraining multiple parameters/state variables in models simultaneously. Furthermore, recent advancements in mapping and monitoring crop phenology [
58] through EOs could be leveraged in addition to retrieving biophysical variables, like LAI, to further increase the number of data constraints and adjust crop phenology accordingly. Process-based crop models are often calibrated under no-stress conditions which is impractical given the nature of EOs. However, as a future direction, we will work towards including additional parameters that control water/N stress within the optimization problem. This could be particularly important as [
51] reported that water and nitrogen stress may delay phenology and leaf appearance in maize across the U.S. Midwest.
Our proposed framework is novel among previous model calibration efforts in that it (1) incorporates and leverages site-to-site variability in optimization, (2) employs EOs to constrain model parameters, (3) accounts for observation uncertainty and estimates parameter uncertainty, and (4) follows a systematic calibration approach that could be updated as new data becomes available for new sites. This work demonstrated a proof-of-concept for direct application of EOs in constraining and improving yield prediction with the APSIM model. Future works should consider upscaling this framework from 13 calibration sites to potentially hundreds of sites with sufficient sample sizes on different crops/cultivars to fully explore the inherent spatiotemporal variability of APSIM model parameters. Furthermore, the effect of increase in the total number of data constraints and their spatial resolution on crop yield and optimized parameters needs to be further investigated.