Location via proxy:   [ UP ]  
[Report a bug]   [Manage cookies]                
Next Article in Journal
Landscape Character and Management Zones in Polder Landscapes: A Case Study of the Dongting Lake Area
Previous Article in Journal
Application and Prospect of Artificial Intelligence Technology in Low-Carbon Cities—From the Perspective of Urban Planning Content and Process
 
 
Font Type:
Arial Georgia Verdana
Font Size:
Aa Aa Aa
Line Spacing:
Column Width:
Background:
Article

Comparing Two Geostatistical Simulation Algorithms for Modelling the Spatial Uncertainty of Texture in Forest Soils

by
Gabriele Buttafuoco
National Research Council of Italy, Institute for Agriculture and Forestry Systems in the Mediterranean, 87036 Rende, CS, Italy
Land 2024, 13(11), 1835; https://doi.org/10.3390/land13111835
Submission received: 21 September 2024 / Revised: 19 October 2024 / Accepted: 30 October 2024 / Published: 5 November 2024

Abstract

:
Uncertainty assessment is an essential part of modeling and mapping the spatial variability of key soil properties, such as texture. The study aimed to compare sequential Gaussian simulation (SGS) and turning bands simulation (TBS) for assessing the uncertainty in unknown values of the textural fractions accounting for their compositional nature. The study area was a forest catchment (1.39 km2) with soils classified as Typic Xerumbrepts and Ultic Haploxeralf. Samples were collected at 135 locations (0.20 m depth) according to a design developed using a spatial simulated annealing algorithm. Isometric log-ratio (ilr) was used to transform the three textural fractions into a two-dimensional real vector of coordinates ilr.1 and ilr.2, then 100 realizations were simulated using SGS and TBS. The realizations obtained by SGS and TBS showed a strong similarity in reproducing the distribution of ilr.1 and ilr.2 with minimal differences in average conditional variances of all grid nodes. The variograms of ilr.1 and ilr.2 coordinates were better reproduced by the realizations obtained by TBS. Similar results in reproducing the texture data statistics by both algorithms of simulation were obtained. The maps of expected values and standard deviations of the three soil textural fractions obtained by SGS and TBS showed no notable visual differences or visual artifacts. The realizations obtained by SGS and TBS showed a strong similarity in reproducing the distribution of isometric log-ratio coordinates (ilr.1 and ilr.2). Overall, their variograms and data were better reproduced by the realizations obtained by TBS.

1. Introduction

Forest soils cover more than a third of the Earth’s land area and are much more than just the basement of forests and more than a provider of water and nutrients [1]. Soils as components of natural and man-made ecosystems (agricultural, forest, and urban) are dynamic regulatory systems, comprising thousands or millions of types of organisms, which absorb energy and matter and release heat, gases and chemical by-products that remain in the soil, generating a multitude of soil functions and supporting the delivery of major ecosystem services for human well-being [1,2,3]. Soil ecosystem services depend on soil properties and their interaction through soil functions [2]. Among soil physical properties, texture is one of the most fundamental. It is based on the combination of the amounts of the fractions of different mineral particle sizes (sand, silt and clay) and is inherited from the parent materials through weathering and pedogenetic processes. Soil texture cannot usually be easily or quickly changed by management practices, but it may be altered by erosion, deposition, truncation, landfill, etc. [4]. Texture influences a number of soil properties and functions, such as the formation and stability of soil structure, water infiltration and retention, carbon storage, microbial biomass, capacity to supply and retain nutrients, particularly in highly weathered soils, etc. [4,5,6,7]. Textural particle size fractions are commonly used as inputs to develop pedotransfer functions (PTFs) to predict soil properties that are difficult to obtain or expensive to measure [8,9].
Soil’s properties, as well as its processes and functions, vary in space and time at different scales [10]. The filtering of soil sampling at only a finite number of locations and times produces incomplete pictures of soils and their properties. Consequently, it is important to quantify the underlying spatial variability model, and methods are needed both for the prediction of soil properties’ values at unsampled locations and for quantifying their uncertainty. As is now well known, geostatistics [11] provides a broad methodology for analyzing and modeling spatial data, and its methods have been described in a wide number of texts [12,13,14,15,16]. Moreover, since the early 1980s, the use of geostatistics has become common in soil science, and many methods have been developed for modeling and mapping soil properties [17,18,19,20,21].
However, in any soil property estimation and whatever kriging algorithm is used, there is surely some error involved, resulting in spatial uncertainty that should be assessed [22]. Indeed, for each possible estimated soil property value, there is an unknown error, which is defined as the difference between the true value and the estimated one (or predicted) [23]. Moreover, the measure of uncertainty should not be addressed only at a particular location within an area of interest, but to the joint uncertainty along any string of locations within this area (joint uncertainty) [22].
Kriging aims to calculate the best estimate at unsampled locations, minimizing the estimation variance (or kriging variance) that provides some information on local uncertainty; locations with high estimation variances are likely more uncertain than the ones with low estimation variances. The estimation variance does not depend on the data values but only on the geometry of data locations and on the variogram model, which represents spatial dependence [12].
The uncertainty about a given soil property value at unsampled locations is caused by incomplete information on it, and quantifying such an uncertainty is inherently subjective and not trivial [23,24]. Uncertainty cannot be objectively measured and, therefore, there is no “true uncertainty”. What is only possible is to provide models of uncertainty [24]. However, uncertainty estimates should always be provided to complement the maps produced to warn their potential users about the quality of soil property predictions [25].
An important contribution of geostatistics is to model the uncertainty about any unknown value of a soil property at a given location x (x = coordinate vector) as a random variable Z ( x ) characterized by a specific probability distribution. Moreover, this concept must be extended to the set of all random variables spatially distributed at the locations (n) of interest, which is referred to as a random field or random function [23,24]. The random variables included in the random field must also be characterized by their spatial correlation [23]. The random function has an n-variate multivariate probability distribution that fully defines both its heterogeneity and uncertainty. Stochastic simulation amounts in generating realizations of the random function (values at the n locations), which allow one to visualize the heterogeneity and quantify the uncertainty [26]. The main challenges are defining the n-variate multivariate distribution and generating realizations that properly reflect the uncertainty of the property of interest. Several simulation methods have been proposed, but there are no general guidelines for choosing the right method. Therefore, investigating geostatistical methods for modeling the spatial uncertainty of soil properties is a research issue of interest; in particular, when there is more than one property to be simulated jointly, as in the case presented in this study where the texture consists of three different particle size fractions (sand, silt, and clay). It is precisely with texture that further practical problems arise for both geostatistical spatial prediction and simulation since sand, silt and clay data are subject to a constant sum constraint at every point of a sampling region and constitute a sample of a regionalized composition [27,28]. The statistical analysis of compositional data sets is based on the Aitchison’s [29] idea and its developments [30,31] that if all components in a soil sample are analyzed (texture in this case), they sum up to a constant. Therefore, the relationships that variables have to one another are conditioned by the rest of the elements included in the composition. The constant sum constraint forces at least one covariance (and at least one correlation coefficient between variables) to be negative, and spurious correlations are induced [29]. Other problems can occur and detailed discussions on properties of compositional data can be found in Aitchison [29], Pawlowsky-Glahn et al. [30], and Pawlowsky-Glahn and Buccianti [31]. Most of the data to which geostatistics is applied are compositional, and the main consequence of these properties is that standard statistical and geostatistical techniques for unconstrained random variables cannot be used to analyze such data in the raw form [30]. Most of the data to which geostatistics is applied are compositional, and the vector of observed values z ( x ) at x is a composition if they are strictly positive D components representing parts of a whole [32]. The same concept of composition applies to the spatially distributed vectors of random variables Z i ( x α ) (i = 1, …, N) at any point x α (α = 1, …, n) of a spatial domain in an n-dimensional Euclidian space. Such a composition is called regionalized composition [32,33]. Therefore, a regionalized composition (r-composition) is by definition both a composition and a coregionalization [27]. If Z ( x ) is a regionalized composition, estimating its elements at unsampled locations by whatever kind of (co)kriging, there is no guarantee that separate estimates will sum to a constant value, because kriging and cokriging are not convex operators and nothing forces their results to be bounded by the data or any constraints on them [12,28].
Modeling the spatial uncertainty of texture has rarely been considered. Fuks and Voltz [34] compared sequential Gaussian and indicator simulation algorithms for estimating the spatial uncertainty of particle size classes, and found that on average, sequential Gaussian simulation was more accurate than indicator simulation under the conditions of their specific case study. Buttafuoco et al. [35], as part of a study on the propagation of errors and uncertainty in the calculation of the soil erodibility factor (K) of RUSLE, modeled the joint spatial uncertainty of soil texture and organic matter using geostatistical stochastic simulation that also took into account the compositional nature of texture.
The compositional nature of soil texture has long been recognized, and a few papers have been published, although it is not established in the scientific community. Generally, it is preferred to interpolate two of the three textural particle size fractions, and calculate the third as the difference between 1 or 100 (%) and the sum of the two interpolated particle size fractions.
Among those that have considered the compositional nature of texture, some of them have proposed [36] or used [37] an extension of ordinary kriging that fulfils the constraint of summing particle size fractions to 100 (%) or 1 (-). Others used different log-ratio transformations (additive log-ratio, alr, centered log-ratio, clr, and isometric log-ratio, ilr) coupled with geostatistical approaches [35,38,39,40], whereas yet others compared the performance of compositional kriging with the geostatistical approaches of different log-ratio transformations [37,41,42,43]. Finally, other authors proposed digital soil mapping approaches using proximal or remote ancillary data to improve the prediction of different types of log-ratio transformations [43,44,45].
Quantifying the spatial uncertainty of predictions becomes particularly important when soil properties are used as inputs to models because their uncertainty propagates through environmental modeling to predictions [23,46]. This is certainly the case when texture is used as an input in the pedotransfer functions.
Different geostatistical simulation algorithms are available in both commercial and free software environments for geostatistical computing. They are easy to apply, but as warned previously, this ease of application sometimes makes geostatistical modeling a ‘black box’ practice with no control over the quality of the results [47]. Sequential Gaussian simulation [48] and turning bands simulation [11] have been widely used in different research areas, but both algorithms have advantages and disadvantages, and their performance and accuracy need to be checked. Furthermore, the results of this study might provide some guidance or insights into the choice between SGS and TBS.
In this context, the main objectives of this study were (1) comparing two of the most widely used methods of stochastic simulation (sequential Gaussian simulation and turning bands simulation) to provide insights into the choice of the most appropriate method, and (2) assessing the uncertainty in the unknown values of the three textural particle size fractions (sand, silt, and clay) accounting for their compositional nature.

2. Materials and Methods

2.1. Study Area

The study area is the Bonis catchment (about 1.39 square kilometers), located in the mountainous landscape of the Sila Massif (Figure 1) in southern Italy. The catchment area is mainly covered (68%) by pure stands of Calabrian pine (Pinus nigra subsp. laricio (Poir.) Maire) 50–60 years old, while for the remaining part, there are mixed stands of Calabrian pine with chestnut (Castanea sativa Mill) and alder (Alnus glutinosa (L.) Gaertn.) [49].
From the lithological point of view, Palaeozoic granitoid rocks are predominant and often deeply fractured and weathered. In most of the catchment area, thick layers of saprolite cover the bedrock [50]. The elevation ranges from about 1019 m to 1341 m above sea level, whereas the morphology in the upper reaches of the catchment is characterized by relatively flat to gently inclined crests consisting of fragments of the old planation surfaces (paleosurfaces), shaped during the Late Pliocene–Early Pleistocene [51]. These crests are framed by steep slopes often cut by deep incisions. The slope gradient ranges between 0° and about 50° with an average of 21°.
The soils in the study area are classified mainly as Typic Xerumbrepts and Ultic Haploxeralf [52,53], with sandy loam and sandy clay loam as the prevalent soil textural classes.
The study area has the typical climate of mountainous areas in Calabria, with a long-term average precipitation of about 1080 mm and an average temperature of about 0.1 °C in the coldest month and 18.3 °C in the warmest month.

2.2. Field Survey and Data

Soil samples were collected during summer 2015 using a metallic core cylinder with a diameter of 7.5 cm and height of 20 cm at 135 locations (Figure 1b), with a density of almost one sample per hectare. The coordinates of the sampling locations were found using a Trimble 5700 device in RTK mode. The spatial soil sampling scheme was optimized in the software package Simulated Annealing for Optimization of Sampling (SANOS) [54,55]. SANOS version 0.1 is an annealing algorithm for spatial simulation and allows placing sampling points in space according to some optimization criterion, such as the minimization of mean of the shortest distances (MMSD) criterion and the inclusion of objective weighting factors by using auxiliary information. After excluding non-sampleable areas (e.g., those with rock outcrops) and overlaying the Bonis catchment area with a 5 m square grid, the MMSD criterion was applied to locate 70 sampling points. The remaining 65 sampling points were located using as a criterion the minimization of weighted mean of the shortest distances (MWMSD) using slope value as the weight variable, since preliminary surveys observed greater soil variability in areas with a steeper slope.
In the laboratory, soil samples were analyzed for soil texture by the hydrometer method after a pre-treatment with sodium hexametaphosphate as a dispersant [56], and soil particles (sand, silt, and clay) were classified according to a US Department of Agriculture (USDA) system [57].

2.3. Geostatistical Approach

The measured values (z) of any particle size fraction (sand, silt, or clay), at each data point x (x is the location coordinates vector), were considered as the outcome of a random variable Z x [58,59], and their spatial structure was quantified and modeled by an experimental variogram. It is a function of a distance vector h (module and direction) called lag. The variogram quantifies for defined directions how different its values become as the distance increases. Since the experimental variogram is a set of unconnected points, to be used to predict the variable of interest at unsampled locations, it is necessary to fit a continuous mathematical function (model) to calculate variogram values for any distances and not to result in negative variances for any combination of random variables [59,60]. A variogram is defined by the type of model and its parameters of range and sill. The range is the distance at which (in this case) pairs of particle size fractions are still spatially correlated, while the sill is the variogram value corresponding to the range [59,60]. The variogram together with the set of data are used for estimating the values of the attribute(s) of interest in one of the many variants of kriging or stochastic simulation.
In this study, it was necessary to extend the univariate variogram to the multivariate case using a linear model of coregionalization (LMC) to allow multiple variables to be estimated simultaneously. The LMC consists of a variogram for each variable (direct variogram) and a cross-variogram for each pair of variables [13]. The direct and cross-variograms in the LMC cannot be considered independently and are jointly fitted so as to be definitely negative for variance constraints and to be physically plausible [16,61]. The relationships between the variables are controlled by the weighting of factors, and the LMC assumes that the N variables are a linear combination of L underlying independent factors acting at given spatial scales:
γ i j ( x ) = l = 0 L b i j l Γ l ( h )
Γ l ( h ) is a vector of authorized variogram models standardized to unit sill at the scale l. The l = 0 variogram is by convention a pure nugget effect. It has the form of a discontinuity at the origin (h = 0) of the variogram and is due to measurements errors and variation within the smallest sampling distance [13]. The coefficients b i j l of Equation (1) form L + 1 symmetrical coregionalization matrices corresponding to each spatial scale l. More details can be found in Goovaerts [13], Chilès and Delfiner [12], Webster and Oliver [60] and Wackernagel [15], among many others.
The use of geostatistical stochastic simulation to assess the uncertainty about a given soil property value at unsampled locations has required specifying a stochastic model describing the spatial distribution of the regionalized variable, as well as an algorithm of simulation for building alternative and equiprobable images (also called realizations) of the unknown true spatial distribution of the values of the property of interest z(x) within the spatial domain A. The different realizations are denoted by superscript l:
z l x , x A
The approaches of stochastic simulation were conditional because the realizations honored the data values at their positions. The stochastic conditional simulation algorithms used a random drawing from the conditional distribution at a given point to estimate the unmeasured value of the property of interest. Such constrained randomness modeled the variance at all scales and thus produced the alternative realizations having spatial distributions varying from realization to realization, and nevertheless always matched the input data and reproduced the statistical features and variogram. Multiple realizations of the variables of interest were generated, and the differences in these realizations enabled modeling the spatial uncertainty and quantifying the reliability of the estimates at a specific location [12,22,62].
Among the available algorithms of stochastic conditional simulation, sequential Gaussian simulation (SGS) [62] and turning bands simulation (TBS) [11] were used. Both methods are suitable for simulating Gaussian random functions and, therefore, the variables were transformed into normal distribution shaped using the Gaussian anamorphosis [15]. It allows one to transform each variable Z i x ,   x R 2 into a Gaussian-shaped variable Y i x ,   x R 2 with zero mean and unit variance using Hermite polynomials [15].
Sequential Gaussian simulation is based on the fact that in the multi-Gaussian random function model with known mean, kriging is the same as the local conditional expected value. The distribution of Z x n + 1 values conditional on measured data Z x α = z α ,   α = 1 , , n is Gaussian with mean z* (simple kriging estimate) and variance σ K 2 (K, kriging variance). Therefore, simulating Z x n + 1 conditionally on the data amounts to selecting a random value in the normal distribution with mean z* and variance σ K 2 [12,63].
The algorithm involves the following:
  • Defining a random path through all grid nodes;
  • Computing a simple kriging estimate of Z x n + 1 using the measured data Z x α = z α ,   α = 1 , , n and kriging variance σ K 2 ;
  • Drawing a random value Y with standard normal distribution;
  • Assigning z * + σ K Y to Z x n + 1 (which is the local conditional expectation in the multi-Gaussian model);
  • Adding the new value Z x n + 1 to the conditioning data set;
  • Repeating steps 2 to 5 in the next nodes in the random path until all nodes are simulated.
TBS has been chosen for its accuracy and computational efficiency [63]. Turning bands simulation (TBS) generates simulations in space from simulations on lines. The TBS method consists of adding up a large number of independent simulations defined on lines scanning the plane [11,12]. The lines are drawn in space with random or quasi-regular orientations and simulating a one-dimensional random field along each line. The one-dimensional simulations are projected on the spatial coordinates and averaged to provide the simulated value. TBS is very efficient for generating non-conditional simulations and reproducing the variogram better than other methods, particularly for small simulated fields [64]. Since TBS is an inherently unconditional simulation method, a second step to impose conditioning is required. The conditioning methods rely on two kriging-based models, one of the data, y k c * x , and one of the unconditional simulated values at the data locations, y k u * x , to form a correction model:
y c s l x = y u c l x + y k c * x y k u * x
where y u c l x is the unconditional realization and y c s l x is the conditional realization [65].
The number of realizations (L) to be generated from the random function model depends on the objective and on the structure of the phenomenon [12]. Since there is no reference model to approach and no particular aspect of uncertainty or precision with which the uncertainty assessment is required, the number of realizations (L) was set equal to 100, which is the typical number of realizations, large enough to ensure the stability of the results and small enough to allow the processing of the simulated realizations [12,66].
In the multivariate case with spatially cross-correlated variables, their joint simulation is required because the interest is to construct numerical models that reproduce the joint distribution of two or more coregionalized variables at unsampled locations, conditionally on the measurements at sampling locations. Adopting as a stochastic model of the random function the multivariate Gaussian model [12], the approach consists of using conditional cosimulation for generating realizations not more than a Gaussian random field Y = Y x , x A , but a vector Gaussian random field Y = Y x , x A [12,63]. The vector Gaussian random field has zero mean, and its spatial structure is modeled by a linear model of coregionalization (LMC) [16].
All statistical and geostatistical analyses were performed using the software package Isatis.neo, release 2024.04 [67].

2.4. The Compositional Data Approach

Soil textural particle size fractions (sand, silt, or clay) are data with a restricted space, wherein they can vary only from 0 to 100% (or 0 to 1). Such data are called compositional data, and their sample space is not the real space R3 but is known as the simplex (SD) [29,30]. The D-part simplex, SD, is a D-dimensional real space of strictly positive vectors of D parts summing to a constant κ:
S D = u = u 1 , u 2 , , u D | u j 0   and   j = 1 D u j = κ
where the constant κ can be 1, 100, 106, or any other constant [68].
Statistical methods designed for multivariate normal distributions can be used if the data in the original D-part simplex sample space are projected to multivariate real space [29,69].
Therefore, soil textural particle size fractions are compositional data because each of them (sand, silt, or clay) gives relative information about a part of a whole. In fact, the three observed particle size fractions (sand, silt, or clay) at each location x (x is the vector of location coordinates) is a vector z x , which satisfies the conditions that all its components are positive, and the sum of all components is constant and equal to 100. If geostatistics is applied to such compositional data, spurious spatial correlation might occur [32]. The same concept of composition applies to the spatially distributed vectors of random variables Z i x α ; i = 1 , , N at any point x α (α = 1, …, n) of a spatial domain R2. Such a composition is called regionalized composition [28,32]. Therefore, a regionalized composition (r-composition) is both a composition and a coregionalization [70].
Further practical problems arise with geostatistical spatial prediction because if Z ( x ) is a regionalized composition, estimating its elements at unsampled locations by whatever kind of (co)kriging, there is no guarantee that separate estimates will sum to 100% or 1 because kriging and cokriging are not convex operators, and nothing forces their results to be bounded by the data or any constraints on them [12,28]. Moreover, since compositional data are multivariate by nature, the singularity of the covariance matrix makes both geostatistical estimation and simulation undefined [33,71]. Moreover, stochastic simulation algorithms can generate larger violations of the constraint requested by compositional data because they can add some variability to the (co)kriging predictions [33]. The Gaussian anamorphosis required by the multi-Gaussian random function model is often adopted for geostatistical stochastic simulations and, if appropriately chosen, might guarantee positivity in a similar way to the logarithmic transformation, but it cannot ensure the constant sum constraint [28].
A log-ratio transformation has been proposed to deal with compositional data [29]. The log-ratio transformation allows a composition to be represented as a real vector and makes it possible to apply the geostatistical methods to the log-ratios, and finally to back transform the interpolated or simulated data [72]. There are different log-ratio transformations that have been introduced to represent compositional data [73], and among them, here was used the isometric log-ratio (ilr) transformation [74] because it allows effective and practical treatment of compositional data for the correct application of geostatistics [28,33] without claiming or wanting to deepen the meaning of the ilr transformation. The log-ratio transformations are also termed coordinates or coefficients because of their more natural terminology [75].
In summary, the geostatistical approach to the analysis of compositional data consists of the following steps [33]:
  • To represent the D-part compositional vectors at all soil sampling locations as (D-1)-dimensional real vectors of coordinates by means of the ilr transformation;
  • To apply the standard geostatistical methods to the ilr transformed data;
  • To back transform the simulated scores using the ilr inverse.
In this study case, the soil textural particle size fractions at each measured location are a three-part composition, and its representation is written as a (D-1)-dimensional real vector of two coordinates, ilr.1 and ilr.2.
The compositional data procedures were performed using the software package CoDaPack 2.0 [76].

2.5. The Performance Criteria for Comparing the Geostatistical Simulation Algorithms

The two algorithms have been compared for assessing their quality in cosimulating the ilr coordinates using the same stochastic model. The latter was a Gaussian random field model [12] with the spatial correlation structure given by the variograms in the linear model of coregionalization.
An approach for assessing the quality of a simulation algorithm is to check the agreement between the experimental and model statistics. The key inputs to a geostatistical simulation algorithm include the actual data and the variogram (or variograms in the LMC in this multivariate case) as a measure of spatial correlation. Therefore, the first analysis to evaluate the performance of the two stochastic simulation algorithms was the reproduction quality of the model statistics, i.e., (1) the data distribution of the two ilr log-ratio coordinates (ilr.1 and ilr.2) and (2) the spatial continuity measured by the variograms in the LMC [47].
A local cumulative distribution function (cdf) conditional on the available data can be built from a set of L realizations for each node of the simulation grid. From these conditional cumulative distribution function (ccdf) models, different statistical values can be calculated—the estimated expected value (E-type estimate) or conditional expectation, standard deviation, minimum, maximum, quantiles, probability intervals, and confidence interval of the expected value [13,48].
The distributions of the realizations ilr.1 and ilr.2 coordinates obtained by SGS and TBS were compared with the distributions of original data through quantile–quantile plots. On average, over 100 realizations, it is expected that the distributions of individual realizations versus the distribution of original ilr.1 and ilr.2 coordinates data fit the diagonal line, indicating an accurate reproduction of data distribution.
Regarding the variograms’ reproduction, they were calculated for multiple SGS/TBS realizations and compared to the input variogram model in the same directions. The model variogram should be reproduced within acceptable ergodic fluctuations [64].
Moreover, swath plots were used for providing comparisons between measured data (ilr.1 and ilr.2) and SGS/TBS cosimulated values to identify any bias towards under-estimation or over-estimation, or any smoothing in the results. In this study, the swath plots are graphical representations of measured and SGS/TBS cosimulated realizations of ilr.1 and ilr.2 coordinates derived from a series of sectional zones or bands (swaths), generated in two-dimensional sections that are propagated by a given step size along the vector normal to the sectional data [77]. A swath plot can be constructed along any vector to allow one to analyze how the property of interest changes along that vector. In this study, the swath plots were computed along north and east directions using a slice size in both directions of 200 m.
Finally, since the interest is in assessing the quality of the simulation of the three textural fractions size (sand, silt, and clay), the 100 realizations of the ilr.1 and ilr.2 coordinates were back-transformed into the three grain size fractions (sand, silt, and clay). The arithmetic average of the 100 realizations at each node of the grid of simulation provided the estimated expected value (E-type estimate), whereas the standard deviation provided measures of uncertainty.
As an alternative method to using a histogram to check reproduction, box plots of the means and standard deviations from all the realizations were used. The box plots can show the full range of data, providing information about the tails and insights on the distribution. Particularly, the box plots show the position of the mean (or expected values of the simulated realizations) and enclose the interquartile range (IQR = upper quartile minus lower quartile), which allow one to characterize the spread of the distribution.

3. Results and Discussion

The distributions of the three soil textural fractions’ data were quite symmetric for sand and silt, with very close values for mean and median (Table 1 and Figure 2a), with no skewness for silt and very low (−0.26) skewness for sand concentration.
The asymmetry is more pronounced (skewness = 0.73) only for clay, along with a greater difference between mean and median (Table 1 and Figure 2a).
The sand fraction is dominant on average, and ranges between 39 and 86%, while silt varies between 1 and 40%, and the clay content shows the lowest variation between its minimum and maximum values (22%) with an average value of about 16% (Table 1). The sandy component of the soils suggests their low degree of pedogenetic evolution and, as mentioned before, soils are classified mainly as Typic Xerumbrepts and Ultic Haploxeralf [52,53].
The three particle size fractions were transformed into two isometric log-ratios (ilr.1 and ilr.2) and used for the following approach steps. The isometric log-ratio transformation of the three grain size fractions increased the skewness and kurtosis (Table 1 and Figure 2b) particularly for ilr.2, whereas the means and medians remained quite close for both log-ratios. However, since both simulation methods require a Gaussian framework, the two isometric log-ratios were transformed into a normal distribution shaped using the Gaussian anamorphosis with zero mean and unit variance.
The two Gaussian isometric log-ratios (G ilr.1 and Gilr.2) were used to calculate two variogram maps and no relevant difference in the semivariance of G ilr.1 or Gilr.2 as a function of direction (anisotropy) was detected. Therefore, isotropic direct and cross-variograms were calculated (Figure 3).
All direct and cross-variograms were upper-bounded, and a nested linear model of coregionalization (LMC) was fitted. The LMC included a nugget effect, a spherical model [60] with range equal to about 133 m, and an exponential model [60] with a practical range equal to 600 m (Figure 3). Here is reported the practical range because the exponential model has no finite range. Usually, the distance at which the variogram value equals 95% of the sill variance is used as the practical range. The LMC performed well on the cross-validation, shown with the mean errors (ME) close to 0 and the mean squared deviation ratios (MSDR) close to 1. Particularly, the mean errors varied between −0.018 and −0.011, while the variance of standardized errors varied between 1.03 and 1.09.
The variograms of the LMC indicate the presence of soil processes evolving at two scales of variation. The first, represented by the spherical model, shows a short-range variation that evolves rapidly, as can be seen from the slope of the first part of the fitted model and the second long-range variation to which an exponential model has been fitted. In addition to having a greater range, the exponential model is characterized by an increase in semivariance with the distance between pairs of points slower than the spherical model (Figure 3).
The Gaussian ilr.1 and ilr.2 coordinates data were then simulated using sequential Gaussian simulation and turning bands simulation according to a regular grid with a mesh of 2 m. The 100 simulated realizations were then back-transformed into the raw data by inverting the anamorphosis mathematical model previously calculated.

3.1. Comparison of the Simulation Approaches

3.1.1. Reproduction of the Model Statistics

In order to validate the results, the distributions of the simulated isometric log-ratio coordinates (ilr.1 and ilr.2) were compared to those of the original data through quantile–quantile plots to assess whether the distributions of the ilr.1 and ilr.2 realizations honor the input histogram and show any fluctuation (Figure 4).
On average, over many realizations, one would expect that the distributions of ilr.1 and ilr.2 realizations fit the diagonal lines (identity lines) so as to achieve an accurate reproduction of data distributions. In fact, this is the case for the reproduction of the distribution of ilr.1 for the realizations obtained with both sequential Gaussian simulation (SGS, Figure 4a) and turning bands (TBS, Figure 4c) between 0.5 and 1.3 (0.2 and 0.94 quantiles). Similarly accurate is the reproduction of the distribution of the ilr.1 data, which are less than 0.3 in both simulations (Figure 4a,c). Less accurate is the reproduction of ilr.1 data distribution greater than 1.3 for both simulation algorithms (Figure 4a,c), with a higher dispersion for SGS (Figure 4c). The reproduction of the ilr.2 data distribution of the realizations obtained by both simulation algorithms (Figure 4b,d) appears with a more variable accuracy than that obtained for ilr.1 (Figure 4a,c), and with an apparently greater dispersion along the identity line for TBS (Figure 4d). However, a strong similarity in the results of reproducing the distribution of isometric log-ratio coordinates for the realizations obtained by both simulation algorithms is evident (Figure 4).
Moreover, the mean values of ilr.1 (0.799) and ilr.2 (0.680) coordinates data and those of the cosimulated realizations with SGS (ilr.1 = 0.795; ilr.2 = 0.682) and TBS (ilr.1 = 0.802; ilr.2 = 0.687) are practically equal. Therefore, both algorithms of stochastic simulation reproduced the means of the measured ilr.1 and ilr.2 coordinate data.
At each node of the grid of simulation, the variance (or the standard deviation) of the simulated ilr.1 or ilr.2 values is a measure of uncertainty [78]. Therefore, at each grid of the simulation, uncertainty increases as the spread of the distribution of the cosimulated values increases, and an average conditional variance of all grid nodes can be used as a measure of global uncertainty (U) [78]. For the cosimulated realizations of ilr.1 and ilr.2, the average variance (and thus the global uncertainty) is for both slightly higher in TBS (Uilr.1 = 7.58 × 10−3; Uilr.2 = 5.19 × 10−4) than in SGS (Uilr.1 = 6.14 × 10−3; Uilr.2 = 4.63 × 10−4).

3.1.2. Reproduction of the Variograms

The direct variograms for the two isometric log-ratio coordinates (ilr.1 and ilr.2) computed from the realizations obtained through SGS and TBS are shown in Figure 5 and Figure 6.
In the scope of checking the reproduction of the spatial structure of the two isometric log-ratio coordinates, it is important to point out that the linear model of coregionalization was isotropic, but the tool implemented in Isatis.neo geostatistical software version 2023.04 (https://www.geovariances.com/en/, accessed on 21 August 2024) can compute the variograms from the main directions of anisotropy or from gridded data, and therefore, they were computed along northern and eastern directions. However, this is not a problem, since the linear coregionalization model is isotropic, and therefore valid in all directions. The realizations were simulated using the Gaussian model, and the raw realizations are the results of the back-transformed Gaussian simulations. The experimental variograms for the realizations obtained by SGS and TBS have been compared to the back-transformed Gaussian variogram models of the two isometric log-ratio coordinates ilr1. and ilr.2.
Overall, the direct variogram of ilr.1 coordinates (Figure 5) is better reproduced by the realizations obtained by TBS, and particularly in the northern direction (Figure 5d). For the realizations obtained from SGS, a greater range (maximum distance within which pairs of values are still spatially correlated) is evident compared to the variogram model fitted to the experimental values of ilr.1 coordinates in the linear coregionalization model (Figure 5a,b) coupled with a reduction in sill (variogram value over which pairs of values are spatially correlated). Compared to the ilr.1 coordinates, the ilr.1 realizations obtained from SGS show greater continuity (longer range) and less variability (smaller sill) (Figure 5a,b). This increase in range coupled with a reduction in sill is also observed in the eastern direction of the variogram of the realizations obtained from TBS, but to a lesser extent (Figure 5c).
The direct variogram of ilr.2 coordinates (Figure 6) is clearly well reproduced by the realizations obtained by TBS in both directions (Figure 6c,d), whereas for the realizations obtained from SGS, a marked decrease in range coupled with a lower sill in both directions is evident (Figure 6a,b).

3.1.3. Checking Data Reproduction Through Swath Plots

The swath plots in the north and east directions (Figure 7) show a good reproduction of ilr.1 coordinates’ data along the eastern direction from the realizations obtained by both algorithms of simulation (Figure 7b,f).
The ilr.1 coordinates’ realizations are also quite evenly distributed around the ilr.1 coordinates data with no areas with too high or too low realizations.
Deviations from the ilr.1 coordinates data occur both for SGS and TBS at the west border of the catchment, and only for TBS at the east border. However, deviations at both borders of the catchment occur in the other cases (Figure 7a,c,d,e,g,h) and, in particular, are larger in the western edge for the ilr.2 coordinates data of both simulation algorithms cases (Figure 7d,h). This is probably due to the border effect given and the soil samples’ spatial distribution.
The swath plots in the north direction show a similar reproduction of ilr.1 coordinates data from the realizations obtained by both algorithms of simulation (Figure 7b,e). Hower, the realizations, from west to east, show an alternating pattern of lower and higher values than ilr.1 coordinates data (Figure 7b,e). A similar trend can be observed for ilr.2 coordinates realizations in the northern direction (Figure 7c,g). However, the deviations from ilr.2 data are greater for the SGS (Figure 7c).
Finally, ilr.2 coordinates data are better reproduced by ilr.2 the coordinates realizations obtained by TBS (Figure 7c,g).

3.1.4. Reproduction of the Texture Data Statistics

From all these results, one can already draw some insights into the performances of the two simulation algorithms. However, the interest is in modeling the spatial uncertainty of the three soil fractions, and therefore the 100 realizations of the ilr.1 and ilr.2 coordinates obtained from SGS and TBS were transformed into sand, silt, and clay data. Therefore, further checks on the credibility of the simulated realizations were carried out on the three soil fraction sizes (sand, silt, and clay) data.
A previous check of histograms and statistics on individual ilr.1 and ilr.3 realizations was undertaken using quantile–quantile plots between original data and simulated isometric log-ratio coordinates (ilr.1 and ilr.2). An alternative to quantile–quantile plots could be offered by a boxplot of the means of each of the 100 realizations for the three soil textural fractions (sand, silt, and clay). The same could be done with the standard deviations, and we can produce a boxplot of the standard deviations of the 100 realizations for the three soil textural fractions (sand, silt, and clay).
In Figure 8a, the box plots of the measured soil fraction sizes (sand, silt, and clay) data and expected values calculated from realizations cosimulated by SGS and TBS are reported.
For the concentration of sand (Figure 8a), the spread of the distribution as measured by the interquartile range (IQR = upper quartile − lower quartile) is not closely reproduced, and is reduced in both simulation algorithms. In fact, the IQR of the measured data (12%) was reduced to 7.75% for SGS and 9.09% for TBS. Conversely, the mean and median are overestimated for both simulation algorithms. There is a slight negative skewness for both distributions of the simulated values (−038, SGS and −0.35, TBS) of a slightly greater size than the one of the measured data (−0.26).
The same behavior of IQR occurred for the concentrations of both silt (measured data = 9.00%, SGS = 6.01%, TBS = 6.57%) and clay (measured data = 7.00%, SGS = 2.11%, TBS = 3.02%) (Figure 8a), but more for clay as the IQR was more than halved for both simulation algorithms. Particularly, this occurred for SGS, in which the IQR dropped from 7 to 2.11%.
The variations of three textural size fractions data sets are pretty well reproduced for sand and clay concentrations by both simulation algorithms (Figure 8a), while for silt concentration, the range of variation drops from 39% of measured data to 30.87% for SGS and 29.84% for TBS.
The boxplots of the standard deviations of the realizations obtained by SGS and TBS are shown in Figure 8b. As explained above for the realizations of coordinates ilr.1 and ilr.2, the standard deviation is particularly important in stochastic geostatistical simulation because the spread of the probability distribution at each node of the simulation grid is a measure of uncertainty. As the spread of the probability distribution increases, the uncertainty increases and, therefore, can be quantified by the variance or standard deviation [78].
The means and medians of standard deviation for the realizations of the three soil textural fractions obtained by SGS and TBS are close, with distributions that are quite symmetric (Figure 8b). The means of standard deviation of the three soil textural fractions are always slightly higher for TBS than for SGS. The same applies to the medians of the standard deviations of the SGS and TBS realizations and the interquartile range (Figure 8b). However, the means and medians of standard deviation for the realizations of the three soil textural fractions obtained by SGS and TBS are so close that it is reasonable to exclude differences without any significance testing of them.

3.2. Maps of the Mean Values and Standard Deviations of Soil Fractions Size (Sand, Silt, and Clay) Data

The arithmetic average at each node of the grid of simulation of the 100 realizations of sand, silt, and clay concentrations provides the estimated expected value (E-type estimate), whereas the standard deviation at each node of the grid of simulation of the 100 realizations of sand, silt, and clay, as just explained above, provides measures of uncertainty.
In Figure 9, the maps of the E-type estimates for the three soil textural fractions and for the two simulation algorithms are shown.
To display the maps, the intervals of variation of the E-type estimates were split into ten iso-frequency classes (of equal size) using 0.1 quantiles. Each class of the color scale contains the same number of values, and the bounds of the classes are irregular. This scale is very useful for analyzing variation by allowing the identification of areas with values around the median and the different 0.1 quantiles.
The maps of the estimated expected values for each of the three soil textural fractions obtained by SGS and TBS show a similar distribution with no notable visual differences or visual artifacts (Figure 9). In all the maps there are small areas of high or low concentration values that may depend on local variability, and which are not modeled by the LMC due to variability at scales smaller than the minimum sampling distance, resulting in the presence of a discontinuity at the origin (nugget effect) in one of the variograms of the LMC (Figure 4).
In general, the parts of the Bonis catchment area with the highest sand concentration values (Figure 9a,d) are found in the areas at higher elevations (Figure 1), with one exception in a small portion of it located in Figure 9a,d at the bottom right. Even for silt (Figure 9b,e) and clay (Figure 9c,f), the maps of their estimated expected values obtained by SGS and TBS have similar distributions, and no visual artifacts are evident. The concentrations of silt and clay, as opposed to sand, are generally higher in lower-elevation areas (Figure 1), with exceptions in some areas (Figure 9b,c,e,f). The maps of the expected values for clay (Figure 9c,f) reflect what was observed in the boxplots (Figure 8a) regarding the reduction in the interquartile range (IQR), which is particularly evident for SGS in which the IQR drops from 7 to 2.11%.
The maps of standard deviations for all realizations from the SGS and TBS realizations (Figure 10) represent the uncertainty distribution.
The standard deviation maps of the realizations obtained by SGS and TBS show both a similar spatial distribution and range of variation for each textural fraction, and no relevant difference is made detectable by visual inspection (Figure 10). Indeed, from Figure 8b, the boxplots clearly show the same range of variation in standard deviation values as the realizations simulated by SGS and TBS. Their distributions are also symmetrical, with the averages very close to the medians (Figure 8b). The only difference apparent even from a visual inspection is the smaller interquartile range for the standard deviation of the clay realizations obtained by SGS compared to TBS (Figure 8b). This is also made partly observable in Figure 10c,f by the slightly different spatial distributions of the standard deviation values of the realizations simulated by SGS and TBS.
In trying to establish which of the two stochastic simulation algorithms performed better, it could be said that, based on what was shown and discussed above, turning bands simulation (TBS) performed better overall. In many of the results presented, there is often a near-equivalence between the two algorithms, the major differences being in the reproduction of variograms and data reproduction through swath plots, in which TBS showed better reproducibility.
The differences in performance of SGS and TBS can only be attributed to the simulation algorithms because both used Gaussian random field [12] as stochastic model with the structure of spatial correlation modelled by the same linear model of coregionalization (LMC). The differences between SGS and TBS lie in the specificities of each algorithm, namely, the use of previously simulated data in sequential Gaussian simulation and the use of finite number lines in turning bands simulation [12].
Moreover, the realizations were generated over a bounded domain (the Bonis catchment in this study), and some deviations (or fluctuations) between the statistics of the experimental data and realizations based on a stochastic model are expected, even when the algorithm of simulation is exact [64].
Although there are not many examples in the literature comparing the use of different simulation algorithms in quantifying spatial uncertainty, the better reproducibility of the spatial correlation structure of TBS compared to SGS is in agreement with what other authors have found, and with the specific characteristics of the two simulation algorithms [64,79]. However, what is well established in the literature is that, among the advantages of TBS, it achieves a particularly good reproduction of the variogram and thus of the spatial correlation structure [12,79]. The lower reproducibility of the SGS spatial correlation structure is attributable to the search neighborhood selection, which can lead to the poor conditioning and replication of the variogram [12,79].
A more accurate comparison of the two simulation algorithms would require the use of different data sets with different spatial configurations and changes in the simulation parameters of the algorithms (number of bands, size of the search neighborhood, etc.), which could be the subject of further study and investigation. In this study, it has applied one approach of comparison among many possible ones, seeking to highlight the importance of the spatial uncertainty assessment. The importance of spatial uncertainty assessment as an integral component of spatial prediction models and map production is still strongly recommended, although it does not yet appear to be a standard in the scientific community [23,25,80]. In digital soil mapping, uncertainty assessment is even a requirement included in the specifications of initiatives such as GlobalSoilMap [81].

4. Conclusions

The comparison between SGS and TBS showed a strong similarity of the realizations obtained by SGS and TBS in reproducing the distribution of isometric log-ratio coordinates (ilr.1 and ilr.2). Indeed, the means of the measured ilr.1 and ilr.2 coordinates data were reproduced by both algorithms of stochastic simulation. Even the global uncertainty measured by the average conditional variances of all grid nodes showed minimal differences. Moreover, the variograms of ilr.1 and ilr.2 coordinates are better reproduced by the realizations obtained by TBS. The swath plots show similar reproductions of ilr.1 coordinates data from the realizations obtained by both algorithms of simulation with an alternating pattern of lower and higher values than ilr.1 coordinates data. Generally, ilr.2 coordinates data were better reproduced by ilr.2 coordinates realizations obtained by TBS.
Regarding the reproduction of the texture data statistics, both algorithms of simulation produced similar results. Indeed, the ranges of variation of the three textural size fractions data sets were pretty well reproduced for sand and clay concentrations by both simulation algorithms, whereas the range of variation for silt was reduced. Also, the means and medians of standard deviation for the realizations of the three soil textural fractions obtained by SGS and TBS were very close.
Finally, the maps of the estimated expected values for each of the three soil textural fractions obtained by SGS and TBS showed a similar distribution with no notable visual differences or visual artifacts. What was reported for the maps of the expected values of the three soil textural fractions also applies to the standard deviation maps of the realizations obtained by SGS and TBS. These maps represent the distribution of spatial uncertainty and show both a similar spatial distribution and range of variation for each textural fraction with no relevant difference.
There is ‘no true uncertainty’ about the understanding of the spatial distribution of the three textural fraction sizes (sand, silt, and clay), and this study attempted to somewhat mimic the natural complexity of the soils in the Bonis catchment with regard to the textural fractions through exploring the uncertainty in the non-sampled areas. One could object to the presence of some subjectivity in the description and discussion of the results, but uncertainty cannot ever be measured, and any assessment of uncertainty needs to be based on some sort of statistical model, which requires implicit or explicit assumptions that are necessarily subjective (Caers 2011). It is essential to provide an estimate of the spatial uncertainty by accepting as a working hypothesis that all assessments of uncertainty are nothing more than models, and there is no ‘best model’ [82]. In this context, using different stochastic simulation algorithms and proposing an approach for assessing their performance may represent a contribution and a prompt to others to always integrate map production with the estimation of spatial uncertainty.

Funding

This research was funded by the project ‘ALForLab’ PON03PE_00024_1, co-funded by the National Operational Programme for Research and Competitiveness (PON R&C) 2007–2013, through the European Regional Development Fund (ERDF) and national resource (Revolving Fund—Cohesion Action Plan (CAP) MIUR).

Data Availability Statement

Data used in this study are available upon a reasonable request to the author. The data are not publicly available due to a temporary embargo period until the completion of some research papers.

Acknowledgments

The author thanks the three anonymous reviewers of this paper for providing constructive comments, which have contributed to the improvement of the published version.

Conflicts of Interest

The author declares no conflicts of interest.

References

  1. Binkley, D.; Fisher, R.F. Ecology and Management of Forest Soils, 5th ed.; John Wiley & Sons Ltd.: Glasgow, UK, 2020; ISBN 9781119455653. [Google Scholar]
  2. Adhikari, K.; Hartemink, A.E. Linking Soils to Ecosystem Services—A Global Review. Geoderma 2016, 262, 101–111. [Google Scholar] [CrossRef]
  3. Blum, W.E.H. Functions of Soil for Society and the Environment. Rev. Environ. Sci. Biotechnol. 2005, 4, 75–79. [Google Scholar] [CrossRef]
  4. Osman, K.T. Forest Soils; Springer International Publishing: Cham, Switzerland, 2013; Volume 9783319025, ISBN 978-3-319-02540-7. [Google Scholar]
  5. Dilustro, J.J.; Collins, B.; Duncan, L.; Crawford, C. Moisture and Soil Texture Effects on Soil CO2 Efflux Components in Southeastern Mixed Pine Forests. For. Ecol. Manag. 2005, 204, 87–97. [Google Scholar] [CrossRef]
  6. Telles, E.d.C.C.; de Camargo, P.B.; Martinelli, L.A.; Trumbore, S.E.; da Costa, E.S.; Santos, J.; Higuchi, N.; Oliveira, R.C. Influence of Soil Texture on Carbon Dynamics and Storage Potential in Tropical Forest Soils of Amazonia. Glob. Biogeochem. Cycles 2003, 17, 1040. [Google Scholar] [CrossRef]
  7. Cheng, P.; Liu, F.; Chen, X.; Zhang, Y.; Yao, K. Estimation of the Installation Torque–Capacity Correlation of Helical Pile Considering Spatially Variable Clays. Can. Geotech. J. 2024, 61, 2064–2074. [Google Scholar] [CrossRef]
  8. Minasny, B.; McBratney, A.B.; Bristow, K.L. Comparison of Different Approaches to the Development of Pedotransfer Functions for Water-Retention Curves. Geoderma 1999, 93, 225–253. [Google Scholar] [CrossRef]
  9. Van Looy, K.; Bouma, J.; Herbst, M.; Koestel, J.; Minasny, B.; Mishra, U.; Montzka, C.; Nemes, A.; Pachepsky, Y.A.; Padarian, J.; et al. Pedotransfer Functions in Earth System Science: Challenges and Perspectives. Rev. Geophys. 2017, 55, 1199–1256. [Google Scholar] [CrossRef]
  10. Heuvelink, G.B.M.; Webster, R. Modelling Soil Variation: Past, Present, and Future. Geoderma 2001, 100, 269–301. [Google Scholar] [CrossRef]
  11. Matheron, G. The Intrinsic Random Functions and Their Applications. Adv. Appl. Probab. 1973, 5, 439–468. [Google Scholar] [CrossRef]
  12. Chilès, J.-P.; Delfiner, P. Geostatistics: Modeling Spatial Uncertainty, 2nd ed.; Wiley Series in Probability and Statistics; John Wiley & Sons, Inc.: Hoboken, NJ, USA, 2012; ISBN 9781118136188. [Google Scholar]
  13. Goovaerts, P. Geostatistics for Natural Resources Evaluation; Oxford University Press: New York, NY, USA, 1997; ISBN 0195115384. [Google Scholar]
  14. Isaaks, E.H.; Srivastava, R.M. Applied Geostatistics; Oxford University Press: New York, NY, USA, 1989; Volume 21, ISBN 0195050134. [Google Scholar]
  15. Wackernagel, H. Multivariate Geostatistics: An Introduction with Applications; Springer: Berlin/Heidelberg, Germany, 2003; ISBN 3540441425. [Google Scholar]
  16. Journel, A.G.; Huijbregts, C.J. Mining Geostatistics; Academic Press: London, UK, 1978; ISBN 0123910501. [Google Scholar]
  17. Burgess, T.M.; Webster, R. Optimal Interpolation and Isarithmic Mapping of Soil Properties: II Block Kriging. J. Soil. Sci. 1980, 31, 333–341. [Google Scholar] [CrossRef]
  18. Lark, R.M.; Minasny, B. Classical Soil Geostatistics. In Pedometrics; McBratney, A., Minasny, B., Stockmann, U., Eds.; Springer International Publishing: Cham, Switzerland, 2018; pp. 291–340. ISBN 978-3-319-63437-1. [Google Scholar]
  19. Lucà, F.; Buttafuoco, G.; Terranova, O. 2.03—GIS and Soil A2. In Comprehensive Geographic Information Systems; Huang, B., Ed.; Elsevier: Oxford, UK, 2018; pp. 37–50. ISBN 978-0-12-804793-4. [Google Scholar]
  20. Malone, B.P.; Minasny, B.; McBratney, A.B. Using R for Digital Soil Mapping; Springer International Publishing: Cham, Switzerland, 2017; ISBN 978-3-319-44325-6. [Google Scholar]
  21. McBratney, A.; Minasny, B.; Stockmann, U. (Eds.) Pedometrics; Progress in Soil Science; Springer International Publishing: Cham, Switzerland, 2018; ISBN 978-3-319-63437-1. [Google Scholar]
  22. Journel, A.G.; Alabert, F. Non-Gaussian Data Expansion in the Earth Sciences. Terra Nova 1989, 1, 123–134. [Google Scholar] [CrossRef]
  23. Heuvelink, G.B.M. Uncertainty and Uncertainty Propagation in Soil Mapping and Modelling. In Pedometrics; McBratney, A., Minasny, B., Stockmann, U., Eds.; Springer International Publishing: Cham, Switzerland, 2018; pp. 439–461. ISBN 978-3-319-63437-1. [Google Scholar]
  24. Caers, J. Modeling Uncertainty in the Earth Sciences; John Wiley & Sons Ltd.: Chichester, UK, 2011; ISBN 9781119995920. [Google Scholar]
  25. Padarian, J.; McBratney, A.B. QuadMap: Variable Resolution Maps to Better Represent Spatial Uncertainty. Comput. Geosci. 2023, 181, 105480. [Google Scholar] [CrossRef]
  26. Leuangthong, O.; Khan, K.D.; Deutsch, C.V. Solved Problems in Geostatistics; Wiley: Hoboken, NJ, USA, 2008; ISBN 9780470177921. [Google Scholar]
  27. Pawlowsky-Glahn, V.; Olea, R.A.; Davis, J.C. Estimation of Regionalized Compositions: A Comparison of Three Methods. Math. Geol. 1995, 27, 105–127. [Google Scholar] [CrossRef]
  28. Tolosana-Delgado, R.; Mueller, U.; van den Boogaart, K.G. Geostatistics for Compositional Data: An Overview. Math. Geosci. 2019, 51, 485–526. [Google Scholar] [CrossRef]
  29. Aitchison, J. The Statistical Analysis of Compositional Data; Chapman and Hall: London, UK; New York, NY, USA, 1986; Volume 44, ISBN 978-94-010-8324-9. [Google Scholar]
  30. Pawlowsky-Glahn, V.; Egozcue, J.J.; Tolosana-Delgado, R. Modelling and Analysis of Compositional Data. Statistics in Practice; John Wiley & Sons, Ltd.: Chichester, UK, 2015; ISBN 9781118443064. [Google Scholar]
  31. Pawlowsky-Glahn, V.; Buccianti, A. Compositional Data Analysis: Theory and Applications; John Wiley & Sons: Chichester, UK, 2011; ISBN 978-0-470-71135-4. [Google Scholar]
  32. Pawlowsky-Glahn, V.; Egozcue, J.J. Spatial Analysis of Compositional Data: A Historical Review. J. Geochem. Explor. 2016, 164, 28–32. [Google Scholar] [CrossRef]
  33. Tolosana-Delgado, R.; Mueller, U. Geostatistics for Compositional Data with R; Use R! Springer International Publishing: Cham, Switzerland, 2021; ISBN 978-3-030-82567-6. [Google Scholar]
  34. Fuks, S.D.; Voltz, M. A Comparison of Geostatistical Simulation Approaches for Estimating the Spatial Uncertainty of Soil Texture. In geoENV III—Geostatistics for Environmental Applications, Proceedings of the Third European Conference on Geostatistics for Environmental Applications, Avignon, France, 22–24 November 2000; Monestiez, P., Allard, D., Froidevaux, R., Eds.; Springer: Dordrecht, The Netherlands, 2001; pp. 463–474. [Google Scholar]
  35. Buttafuoco, G.; Conforti, M.; Aucelli, P.P.C.; Robustelli, G.; Scarciglia, F. Assessing Spatial Uncertainty in Mapping Soil Erodibility Factor Using Geostatistical Stochastic Simulation. Environ. Earth Sci. 2012, 66, 1111–1125. [Google Scholar] [CrossRef]
  36. Walvoort, D.J.J.; De Gruijter, J.J. Compositional Kriging: A Spatial Interpolation Method for Compositional Data. Math. Geol. 2001, 33, 951–966. [Google Scholar] [CrossRef]
  37. Sun, X.-L.; Wu, Y.-J.; Wang, H.-L.; Zhao, Y.-G.; Zhang, G.-L. Mapping Soil Particle Size Fractions Using Compositional Kriging, Cokriging and Additive Log-Ratio Cokriging in Two Case Studies. Math. Geosci. 2014, 46, 429–443. [Google Scholar] [CrossRef]
  38. Lark, R.M.; Bishop, T.F.A.A. Cokriging Particle Size Fractions of the Soil. Eur. J. Soil. Sci. 2007, 58, 763–774. [Google Scholar] [CrossRef]
  39. Wang, Z.; Shi, W. Robust Variogram Estimation Combined with Isometric Log-Ratio Transformation for Improved Accuracy of Soil Particle-Size Fraction Mapping. Geoderma 2018, 324, 56–66. [Google Scholar] [CrossRef]
  40. Li, J.; Wan, H.; Shang, S. Comparison of Interpolation Methods for Mapping Layered Soil Particle-Size Fractions and Texture in an Arid Oasis. Catena 2020, 190, 104514. [Google Scholar] [CrossRef]
  41. Odeh, I.O.A.; Todd, A.J.; Triantafilis, J. Spatial Prediction of Soil Particle-Size Fractions as Compositional Data. Soil. Sci. 2003, 168, 501–515. [Google Scholar] [CrossRef]
  42. Wang, Z.; Shi, W. Mapping Soil Particle-Size Fractions: A Comparison of Compositional Kriging and Log-Ratio Kriging. J. Hydrol. 2017, 546, 526–541. [Google Scholar] [CrossRef]
  43. Muzzamal, M.; Huang, J.; Nielson, R.; Sefton, M.; Triantafilis, J. Mapping Soil Particle-Size Fractions Using Additive Log-Ratio (ALR) and Isometric Log-Ratio (ILR) Transformations and Proximally Sensed Ancillary Data. Clays Clay Miner. 2018, 66, 9–27. [Google Scholar] [CrossRef]
  44. Huang, J.; Subasinghe, R.; Triantafilis, J. Mapping Particle-Size Fractions as a Composition Using Additive Log-Ratio Transformation and Ancillary Data. Soil. Sci. Soc. Am. J. 2014, 78, 1967–1976. [Google Scholar] [CrossRef]
  45. Buchanan, S.; Triantafilis, J.; Odeh, I.O.A.; Subansinghe, R. Digital Soil Mapping of Compositional Particle-Size Fractions Using Proximal and Remotely Sensed Ancillary Data. Geophysics 2012, 77, WB201–WB211. [Google Scholar] [CrossRef]
  46. Heuvelink, G.B.M. Error Propagation in Environmental Modelling with GIS, 1st ed.; CRC Press: Chichester, UK, 1998; ISBN 9780748407439. [Google Scholar]
  47. Leuangthong, O.; McLennan, J.A.; Deutsch, C.V. Minimum Acceptance Criteria for Geostatistical Realizations. Nat. Resour. Res. 2004, 13, 131–141. [Google Scholar] [CrossRef]
  48. Deutsch, C.V.; Journel, A.G. GSLIB: Geostatistical Software Library and User’s Guide; Oxford University Press: New York, NY, USA, 1997; ISBN 978-0195100150. [Google Scholar]
  49. Maesano, M.; Santopuoli, G.; Moresi, F.; Matteucci, G.; Lasserre, B.; Scarascia Mugnozza, G. Above Ground Biomass Estimation from UAV High Resolution RGB Images and LiDAR Data in a Pine Forest in Southern Italy. IForest 2022, 15, 451–457. [Google Scholar] [CrossRef]
  50. Le Pera, E.; Sorriso-Valvo, M. Weathering and Morphogenesis in a Mediterranean Climate, Calabria, Italy. Geomorphology 2000, 34, 251–270. [Google Scholar] [CrossRef]
  51. Molin, P.; Fubelli, G.; Dramis, F. Evidence of Tectonic Influence on Drainage Evolution in an Uplifting Area: The Case of Northern Sila (Calabria, Italy). Geogr. Fis. Din. Quat. 2012, 35, 49–60. [Google Scholar] [CrossRef]
  52. ARSSA. Carta Dei Suoli Della Regione Calabria—Scala 1:250,000. In Monografia Divulgativa; Servizio Agropedologia; Agenzia Regionale per Lo Sviluppo e per i Servizi in Agricoltura: Soveria Mannelli, Italy, 2003. [Google Scholar]
  53. Soil Survey Staff. Keys to Soil Taxonomy, 13th ed.; USDA Natural Resources Conservation Service: Washington, DC, USA, 2022. [Google Scholar]
  54. van Groenigen, J.W.; Stein, A. Constrained Optimization of Spatial Sampling Using Continuous Simulated Annealing. J. Environ. Qual. 1998, 27, 1078–1086. [Google Scholar] [CrossRef]
  55. van Groenigen, J.W.; Pieters, G.; Stein, A. Optimizing Spatial Sampling for Multivariate Contamination in Urban Areas. Environmetrics 2000, 11, 227–244. [Google Scholar] [CrossRef]
  56. Bouyoucos, G.J. Hydrometer Method Improved for Making Particle Size Analyses of Soils. Agron. J. 1962, 54, 464–465. [Google Scholar] [CrossRef]
  57. Soil Science Division Staff. Soil Survey Manual. USDA Handbook 18; Ditzler, C., Scheffe, K., Monger, H.C., Eds.; Government Printing Office: Washington, DC, USA, 2017. [Google Scholar]
  58. Matheron, G. Principles of Geostatistics. Econ. Geol. 1963, 58, 1246–1266. [Google Scholar] [CrossRef]
  59. Armstrong, M. Basic Linear Geostatistics; Springer: Berlin/Heidelberg, Germany, 1998; ISBN 978-3-540-61845-4. [Google Scholar]
  60. Webster, R.; Oliver, M.A. Geostatistics for Environmental Scientists; Statistics in Practice; John Wiley & Sons, Ltd.: Chichester, UK, 2007; ISBN 9780470517277. [Google Scholar]
  61. Goulard, M.; Voltz, M. Linear Coregionalization Model: Tools for Estimation and Choice of Cross-Variogram Matrix. Math. Geol. 1992, 24, 269–286. [Google Scholar] [CrossRef]
  62. Deutsch, C.V.; Journel, A.G. GSLIB: Geostatistical Software Library, 2nd ed.; Oxford University Press: New York, NY, USA, 1998; ISBN 0195100158. [Google Scholar]
  63. Lantuéjoul, C. Geostatistical Simulation; Springer: Berlin/Heidelberg, Germany, 2002; ISBN 978-3-642-07582-7. [Google Scholar]
  64. Paravarzar, S.; Emery, X.; Madani, N. Comparing Sequential Gaussian and Turning Bands Algorithms for Cosimulating Grades in Multi-Element Deposits. C. R. Geosci. 2015, 347, 84–93. [Google Scholar] [CrossRef]
  65. Pyrcz, M.J.; Deutsch, C.V. Geostatistical Reservoir Modeling, 2nd ed.; Oxford University Press: New York, NY, USA, 2014; ISBN 9780199731442. [Google Scholar]
  66. Remy, N.; Boucher, A.; Wu, J. Applied Geostatistics with SGeMS: A User’s Guide; Cambridge University Press: Cambridge, UK, 2009; Volume 9780521514, ISBN 9781139150019. [Google Scholar]
  67. Bleines, C.; Deraisme, J.; Geffroy, F.; Jeannée, N.; Perseval, S.; Rambert, F. Isatis.Neo Technical References; Geovariances: Avon, France, 2024. [Google Scholar]
  68. Tolosana-Delgado, R.; Otero, N.; Pawlowsky-Glahn, V. Some Basic Concepts of Compositional Geometry. Math. Geol. 2005, 37, 673–680. [Google Scholar] [CrossRef]
  69. Pawlowsky-Glahn, V.; Egozcue, J.J.; Lovell, D. Tools for Compositional Data with a Total. Stat. Model. 2015, 15, 175–190. [Google Scholar] [CrossRef]
  70. Pawlowsky-Glahn, V.; Olea, R.A. Geostatistical Analysis of Compositional Data; Oxford University Press: New York, NY, USA, 2004; ISBN 0195171667. [Google Scholar]
  71. Pawlowsky-Glahn, V. Cokriging of Regionalized Compositions. Math. Geol. 1989, 21, 513–521. [Google Scholar] [CrossRef]
  72. Tolosana-Delgado, R.; van den Boogaart, K.G.; Pawlowsky-Glahn, V. Geostatistics for Compositions. In Compositional Data Analysis: Theory and Applications; John Wiley & Sons, Ltd.: Chichester, UK, 2011; pp. 73–86. ISBN 9780470711354. [Google Scholar]
  73. Pawlowsky-Glahn, V.; Egozcue, J.J. Compositional Data in Geostatistics: A Log-Ratio Based Framework to Analyze Regionalized Compositions. Math. Geosci. 2020, 52, 1067–1084. [Google Scholar] [CrossRef]
  74. Egozcue, J.J.; Pawlowsky-Glahn, V.; Mateu-Figueras, G.; Barceló-Vidal, C. Isometric Logratio Transformations for Compositional Data Analysis. Math. Geol. 2003, 35, 279–300. [Google Scholar] [CrossRef]
  75. Pawlowsky-Glahn, V.; Egozcue, J.J. Compositional Data and Their Analysis: An Introduction. In Geological Society, London, Special Publications; Buccianti, A., Mateu-Figueras, G., Pawlowsky-Glahn, V., Eds.; Geological Society: London, UK, 2006; Volume 264, pp. 1–10. [Google Scholar]
  76. Comas-Cufi, M.; Thió-Henestrosa, S. CoDaPack 2.0: A Stand-Alone, Multi-Platform Compositional Software. In Proceedings of the CoDaWork’11: 4th International Workshop on Compositional Data Analysis, Sant Feliu de Guixols, Spain, 9–13 May 2011; Egozcue, J.J., Tolosana-Delgado, R., Ortego, M.I., Eds.; Universitat de Girona: Girona, Spain, 2011. [Google Scholar]
  77. Wilde, B.J.; Deutsch, C.V. Programs for Swath Plots. Canada, 2012. [Google Scholar]
  78. Deutsch, C.V. Direct Assessment of Local Accuracy and Precision. In Geostatistics Wollongong ’96; Baafi, E.Y., Schofield, N.A., Eds.; Kluwer Academic Publishers: Dordrecht, The Netherlands, 1997. [Google Scholar]
  79. Vann, J.; Bertoli, O.; Jackson, S. An Overview of Geostatistical Simulation for Quantifying Risk. In Proceedings of the Symposium on Quantifying Risk and Error, Perth Western, Australia, 21–22 March 2002; Geostatistical Association of Australian: Fremantle, Australia, 2002. [Google Scholar]
  80. Schmidinger, J.; Heuvelink, G.B.M. Validation of Uncertainty Predictions in Digital Soil Mapping. Geoderma 2023, 437, 116585. [Google Scholar] [CrossRef]
  81. Arrouays, D.; Grundy, M.G.; Hartemink, A.E.; Hempel, J.W.; Heuvelink, G.B.M.; Hong, S.Y.; Lagacherie, P.; Lelyk, G.; McBratney, A.B.; McKenzie, N.J.; et al. GlobalSoilMap; Elsevier: Amsterdam, The Netherlands, 2014; pp. 93–134. [Google Scholar]
  82. Journel, A.G. Modeling Uncertainty: Some Conceptual Thoughts. In Geostatistics for the Next Century: An International Forum in Honour of Michel David’s Contribution to Geostatistics, Montreal, 1993; Dimitrakopoulos, R., Ed.; Springer: Dordrecht, The Netherlands, 1994; pp. 30–43. ISBN 978-94-011-0824-9. [Google Scholar]
Figure 1. Location (a) and digital elevation model (b) of the study area. Locations of soil sampling points are also reported. In Figure 1b, coordinates are reported using the World Geodetic System (1984) with the Universal Transverse Mercator zone 33 North Projection (WGS84 UTM 33N).
Figure 1. Location (a) and digital elevation model (b) of the study area. Locations of soil sampling points are also reported. In Figure 1b, coordinates are reported using the World Geodetic System (1984) with the Universal Transverse Mercator zone 33 North Projection (WGS84 UTM 33N).
Land 13 01835 g001
Figure 2. Box plots of the (a) soil textural particle size fractions (sand, silt, and clay) and (b) isometric log-ratio (ilr) transformed data (ilr.1 and ilr.2).
Figure 2. Box plots of the (a) soil textural particle size fractions (sand, silt, and clay) and (b) isometric log-ratio (ilr) transformed data (ilr.1 and ilr.2).
Land 13 01835 g002
Figure 3. Direct and cross-variogram models (thick solid red lines) of the linear model of coregionalization (LMC) for Gaussian values of isometric log-ratio coordinates (ilr.1 and ilr.2).
Figure 3. Direct and cross-variogram models (thick solid red lines) of the linear model of coregionalization (LMC) for Gaussian values of isometric log-ratio coordinates (ilr.1 and ilr.2).
Land 13 01835 g003
Figure 4. Quantile–quantile plots between original data and simulated isometric log-ratio coordinates (ilr.1 and ilr.2) for sequential Gaussian cosimulation, SGS (a,b) and turning bands cosimulation, TBS (c,d).
Figure 4. Quantile–quantile plots between original data and simulated isometric log-ratio coordinates (ilr.1 and ilr.2) for sequential Gaussian cosimulation, SGS (a,b) and turning bands cosimulation, TBS (c,d).
Land 13 01835 g004
Figure 5. Direct variograms of Gaussian isometric log-ratio 1 (ilr.1) data simulated with sequential Gaussian simulation, SGS (a,b) and turning bands, TBS (c,d) computed along north (a,c) and east (b,d) directions.
Figure 5. Direct variograms of Gaussian isometric log-ratio 1 (ilr.1) data simulated with sequential Gaussian simulation, SGS (a,b) and turning bands, TBS (c,d) computed along north (a,c) and east (b,d) directions.
Land 13 01835 g005
Figure 6. Direct variograms of Gaussian isometric log-ratio 2 (ilr.2) data simulated with sequential Gaussian simulation, SGS (a,b) and turning bands, TBS (c,d) computed along north (a,c) and east (b,d) directions.
Figure 6. Direct variograms of Gaussian isometric log-ratio 2 (ilr.2) data simulated with sequential Gaussian simulation, SGS (a,b) and turning bands, TBS (c,d) computed along north (a,c) and east (b,d) directions.
Land 13 01835 g006
Figure 7. Swath plots in the northern (a,c,e,g) and eastern directions (b,d,f,h) for the realizations of ilr.1 and ilr.2 obtained by sequential Gaussian simulation, SGS (ad) and turning bands, TBS (eh).
Figure 7. Swath plots in the northern (a,c,e,g) and eastern directions (b,d,f,h) for the realizations of ilr.1 and ilr.2 obtained by sequential Gaussian simulation, SGS (ad) and turning bands, TBS (eh).
Land 13 01835 g007
Figure 8. Boxplots of (a) measured soil fraction sizes (sand, silt, and clay) data and expected values calculated from all SGS and TBS realizations; (b) standard deviations from all SGS and TBS realizations.
Figure 8. Boxplots of (a) measured soil fraction sizes (sand, silt, and clay) data and expected values calculated from all SGS and TBS realizations; (b) standard deviations from all SGS and TBS realizations.
Land 13 01835 g008
Figure 9. Maps of the estimated expected values for sand, silt and clay obtained by sequential Gaussian simulation (SGS) (ac) and turning band simulation (TBS) (df).
Figure 9. Maps of the estimated expected values for sand, silt and clay obtained by sequential Gaussian simulation (SGS) (ac) and turning band simulation (TBS) (df).
Land 13 01835 g009
Figure 10. Maps of the standard deviation of the realizations for sand, silt and clay obtained by sequential Gaussian simulation (SGS) (ac) and turning band simulation (TBS) (df).
Figure 10. Maps of the standard deviation of the realizations for sand, silt and clay obtained by sequential Gaussian simulation (SGS) (ac) and turning band simulation (TBS) (df).
Land 13 01835 g010
Table 1. Summary statistics for the concentrations of sand (2–0.05 mm), silt (0.05–0.002 mm), clay (<0.002 mm), and the isometric log-ratio coordinates (ilr.1 and ilr.2). Number of samples = 135.
Table 1. Summary statistics for the concentrations of sand (2–0.05 mm), silt (0.05–0.002 mm), clay (<0.002 mm), and the isometric log-ratio coordinates (ilr.1 and ilr.2). Number of samples = 135.
StatisticsSand (%)Silt (%)Clay (%)ilr.1 (−)ilr.2 (−)
Minimum39.001.007.00−0.276−0.018
Lower quartile57.0017.0012.000.4620.567
Median63.0022.0015.000.7300.755
Mean62.6321.3416.030.6800.799
Upper quartile69.0026.0019.000.8981.023
Maximum86.0040.0029.001.2713.150
Stand. Dev.9.726.725.040.2820.389
Skewness (-)−0.260.000.73−0.4031.678
Kurtosis (-)2.863.013.112.84211.338
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

Share and Cite

MDPI and ACS Style

Buttafuoco, G. Comparing Two Geostatistical Simulation Algorithms for Modelling the Spatial Uncertainty of Texture in Forest Soils. Land 2024, 13, 1835. https://doi.org/10.3390/land13111835

AMA Style

Buttafuoco G. Comparing Two Geostatistical Simulation Algorithms for Modelling the Spatial Uncertainty of Texture in Forest Soils. Land. 2024; 13(11):1835. https://doi.org/10.3390/land13111835

Chicago/Turabian Style

Buttafuoco, Gabriele. 2024. "Comparing Two Geostatistical Simulation Algorithms for Modelling the Spatial Uncertainty of Texture in Forest Soils" Land 13, no. 11: 1835. https://doi.org/10.3390/land13111835

APA Style

Buttafuoco, G. (2024). Comparing Two Geostatistical Simulation Algorithms for Modelling the Spatial Uncertainty of Texture in Forest Soils. Land, 13(11), 1835. https://doi.org/10.3390/land13111835

Note that from the first issue of 2016, this journal uses article numbers instead of page numbers. See further details here.

Article Metrics

Back to TopTop