Mapping The Relative Abundance of Soil Microbiome Biodiversity From eDNA and Remote Sensing - ScienceDirect
Highlights
• Combining image spectroscopy and eDNA for mapping soil microbiome
biodiversity.
• PLSR predicts the relative abundance of soil microbial families from remote
sensing.
Abstract
Although an enormous number of plant and animal species have been directly observed and
recorded in online databases, the spatial variation in the composition of the microbiome
remains largely unknown. In this study, for the first time, we demonstrate mapping
of the relative abundance of the soil microbiome for three terrestrial ecosystems across North
https://www.sciencedirect.com/science/article/pii/S266601722200027X 1/30
10/21/22, 11:34 AM Mapping the relative abundance of soil microbiome biodiversity from eDNA and remote sensing - ScienceDirect
America (savanna, boreal and tundra) using airborne image spectroscopy and environmental
DNA (eDNA) data. We identified field plots of publicly available eDNA data co-occurring with
AVIRIS-NG hyperspectral imagery. An eDNA processing pipeline was developed to generate a
consistent profile of the relative abundance for thousands of microbiome operational
taxonomic units (OTU) and 225 microbiome families. Using Linear Discriminant Analysis
(LDA) scores for the eDNA data, we identified 81 families with the greatest explanatory power
based on the community composition between the three ecosystems. A case study example
demonstrates our conceptual approach by selecting a dominant and functionally important
bacterial family for each ecosystem, with each family representing a specific biomarker. A
partial least squares regression (PLSR) was applied to spatially predict the relative abundance
of each bacteria family (in the boreal, tundra, and savanna ecosystems) from the hyperspectral
imagery. For the boreal, Pseudomonadaceae is shown to be a dominant family taxon, as it is a
saprophytic family specialized in decomposing the dense organic matter of boreal forest soils.
Members of an understudied family of Acidobacteria, so far indicated as AKIW659, are
abundant in acidic Arctic soils and peat bogs. Finally, the Micromonosporaceae are dominant
and functionally important in drier regions with grass-tree dominated woodlands, being a
member of Actinobacteria with a high relative abundance in soils with high carbon content
and nitrate leaching. We demonstrate for the first time the spatial prediction of the relative
abundance of these bacterial taxa based on remote sensing, showing patterns of soil
microbiome biodiversity and ecosystem function within and across the three ecosystems.
Keywords
Image spectroscopy; Environmental DNA; Biodiversity; Species abundance
1. Introduction
Traditional biodiversity monitoring involves the field observation of species by specialized
biologists, aided by skilled volunteers, whose expertise is restricted to specific biotic
groupings. Botanists may identify and record the presence of plant species and ornithologists
the bird biota, but ‘unpopular’ biotic groups such as fungi, bacteria and insects are under-
observed or escape identification altogether. Traditional methods for identifying soil
microbial communities from in situ samples have been based on direct human observation of
organisms cultured under laboratory conditions. Though traditional techniques have been
steadily refined, they remain expensive, time-consuming, and inconsistent across time and space,
and they capture only a fraction of true biodiversity (Chariton et al., 2010). In recent years,
environmental DNA (eDNA) metabarcoding has enabled the rapid collection of
comprehensive data on species occurrence, revealing a sample's microbial community composition and
functional potential (Taberlet et al., 2018).
Though eDNA remains expensive, the cost per unit of data is falling rapidly, while
protocols (for field sampling of eDNA as well as laboratory analysis) are being standardized, and
massive online reference databases make it possible to identify taxa from DNA sequences
(Chariton et al., 2010; Taberlet et al., 2018). Another key innovation is the rapid growth in the
number of meta-databases of sequenced DNA samples, though samples are thin on the
ground especially when upscaled to landscape or global levels. For the microbiome, the spatial
distribution of taxa remains essentially unknown, as more intensive sampling to ‘fill the gaps’
at landscape to global levels would be prohibitively expensive in terms of fieldwork and
laboratory costs.
Concurrent with these advances in microbiology, imaging spectroscopy with a finer spatial
resolution pixel (2–5 m) can remotely sense the chemical, compositional, and functional
characteristics of above-ground plant communities using spectral reflectance and vegetation
indices (Lausch et al., 2016; Feilhauer et al., 2016; Rahman et al., 2003). Interestingly, even
ecosystem functions such as phenology and photosynthetic activity have been linked to
vegetation indices using coarser resolution hyperspectral satellites (250–500 m) such as
MODIS and Hyperion (Zhang et al., 2003; Guerschman et al., 2009). The combination of very
high-resolution image spectroscopy and eDNA approaches links disparate ecosystem
components by capturing dominant ecosystem functions across multiple domains of life. The
use of very high-resolution image spectroscopy allows contiguous mapping, by ‘filling the
gaps’ between in situ eDNA field samples.
As biodiversity change has many dimensions and occurs at a range of spatial and temporal
scales, relying on observational in situ samples is proving inadequate for the 196 parties to the
UN Convention on Biological Diversity (CBD) to meet their monitoring commitments (Marques et al.,
2014). The Essential Biodiversity Variables developed by the Group on Earth Observations
Biodiversity Observation Network (GEO BON) provide a framework for measuring
biodiversity at multiple levels (Pereira et al., 2013). Although some essential biodiversity
variables can already be monitored using broad-band optical satellite imagery such as from
Landsat and Sentinel-2, these are primarily measures of gross biodiversity metrics, e.g. land
cover and phenology (Potapov et al., 2017; Skidmore et al., 2015), and lack the high spectral
resolution needed to model ecosystem functions from eDNA in situ samples. To address this
shortfall requires fundamentally new approaches for rapidly quantifying and understanding
the complexity and changes in global biodiversity, including the relative abundance of species
and community composition. Exciting current and planned innovations in next-generation
hyperspectral satellites (viz., the Environmental Mapping and Analysis Program EnMAP
(Storch, 2022) from DLR-Germany, PRecursore IperSpettrale della Missione Applicativa
PRISMA (PRISMA, 2022) from ASI-Italy, Copernicus Hyperspectral Imaging Mission for the
Environment CHIME (ESA, 2022) from ESA-Europe, and Surface Biology and Geology SBG
(Nastal and Pavlick, 2022) from NASA-USA) further our ability to upscale biodiversity
products to a global level and to understand how biodiversity responds to stress
and environmental change (Lausch et al., 2016).
Here we demonstrate for the first time the potential to spatially predict microbial community
composition from remote sensing. We further highlight examples of how ecosystem
functions, such as soil respiration and plant growth reduction in waterlogged anaerobic
conditions, can be synchronously measured by remote sensing as well as deduced from the
high relative abundance of specific soil microbiome families. This builds on the work of
spatial ecology and image spectroscopy, highlighting how functional ecology has been linked
to image spectroscopy (Asner and Martin, 2016) as well as Plant Functional Types (Ustin and
Gamon, 2010) as indicators of soil properties and the soil microbiome (Aponte et al., 2013).
Our research, using the image spectroscopy response to inform on environmental
niche/habitat, reveals for the first time how biodiversity, including soil properties as well as the
soil microbiome, can be derived from image spectroscopy. In other words, through analysis of
co-occurring eDNA and image spectroscopy data, we aim to show how ecosystem function can
be interpreted from the literature, thereby filling the Linnaean gap (i.e., the gap caused by
most species not being described by traditional sampling) and the Wallacean gap (i.e., the
unknown geographical distribution and extent of most species) (Bini et al., 2006).
2. Methods
The publicly available eDNA data have bacterial 16S gene profiles, allowing inter-comparison
of bacteria family taxonomic data across the three ecosystems (savanna, boreal and tundra).
These profiles have been constructed using the well-established 515F-806R primer set
targeting the V4 region of the 16S SSU rRNA (Caporaso et al., 2011). The DNA profiles were
downloaded from the public databases as operational taxonomic unit (OTU) tables annotated
using the Greengenes reference database (v13.8). Data in these databases have been collected
and processed according to standardized protocols
(dx.doi.org/10.17504/protocols.io.nuudeww) and comply with the high data standards of these
databases (https://earthmicrobiome.org/protocols-and-standards/16s/,
https://www.ebi.ac.uk/ena/browser/about/data-standards), with the aim of reducing variation
between individual studies (Yilmaz et al., 2011). Meta-analyses of microbial community
samples collated in these databases have resulted in over 100 high-quality publications
exploring microbial diversity at local to global scales by utilizing the information stored about
the DNA extraction, amplification, sequencing and bioinformatic (QIIME 1) protocols
(Caporaso et al., 2010; Thompson et al., 2017; Nuccio et al., 2016; Gilbert et al., 2014). See
Supplementary Table S2 for source information of each dataset used in this study, including
the sampling and sequencing procedures. We proceeded with only those taxa assigned to
Eubacteria (i.e., true bacteria), removing any taxa assigned to Eukaryotic organelles (i.e.,
mitochondria and chloroplast) and Archaea (i.e., organisms with molecular characteristics
separating them from bacteria). Replicate samples from the same plot were averaged, and
taxonomic units with 9 or fewer counts in a sample were removed on a sample-by-sample
basis to avoid false positives and low-frequency noise (De Barba et al., 2014;
Siegenthaler et al., 2019). The resulting table was rarefied to 10,000 reads per sample to
normalize for variation in sequencing depth (mean ± SD read count per sample:
90,693 ± 49,086) (Sanders, 1968; Weiss et al., 2017). Taxa were then grouped into families, and
the relative read abundance per family was calculated by dividing the raw reads of each
family by the total number of reads in a sample. Relative abundances show the relative
contribution of each taxon to the overall composition of the amplified product and are less
influenced than raw read counts by stochastic processes introduced by the
amplification and sequencing process (Philippot et al., 2009; Taberlet et al., 2012). The use of relative
abundances is a well-established approach for studying microbial community composition
(Orwin et al., 2018; Lozupone and Knight, 2007; Roesch et al., 2007; Jesmok et al., 2016) as well
as analysing diversity patterns in eDNA datasets (Philippot et al., 2009; Albaina et al., 2016;
Evans et al., 2016). We conducted Linear Discriminant Analysis (LDA) to select bacterial
families that form representative biomarkers for the three ecosystems. Each ecosystem (i.e.,
savanna, boreal and tundra) represented a class in the LDA. The
variables in this LDA were the eDNA families, and the values were the relative abundance of each
family. Differentially abundant families between the three ecosystems
were identified using LDA Effect Size (LEfSe) analysis with thresholds of α ≤ 0.05
(factorial Kruskal-Wallis test) and a minimum logarithmic LDA score ≥ 3 (Segata et
al., 2011).
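The read-level preprocessing described above (low-count removal, rarefaction to a fixed depth, grouping into families, and conversion to relative abundance) can be sketched as follows. This is a minimal illustration under stated assumptions, not the pipeline used in the study: the function names, the toy OTU identifiers and counts, the OTU-to-family lookup, and the fixed random seed are all hypothetical, and rarefaction is implemented here as simple random subsampling without replacement.

```python
import random
from collections import Counter, defaultdict

MIN_COUNT = 10    # counts of 9 or fewer are dropped, as stated in the text
DEPTH = 10_000    # rarefaction depth, as stated in the text


def drop_low_counts(sample, min_count=MIN_COUNT):
    """Remove taxa with fewer than min_count reads in this sample."""
    return {otu: n for otu, n in sample.items() if n >= min_count}


def rarefy(sample, depth=DEPTH, seed=0):
    """Subsample reads without replacement down to a fixed depth."""
    pool = [otu for otu, n in sample.items() for _ in range(n)]
    if len(pool) < depth:
        raise ValueError("sample has fewer reads than the rarefaction depth")
    return dict(Counter(random.Random(seed).sample(pool, depth)))


def family_relative_abundance(sample, otu_to_family):
    """Sum reads per family, then divide by the total reads in the sample."""
    totals = defaultdict(int)
    for otu, n in sample.items():
        totals[otu_to_family[otu]] += n
    total_reads = sum(totals.values())
    return {family: n / total_reads for family, n in totals.items()}


# Hypothetical toy sample: identifiers and counts are invented for illustration.
sample = {"otu1": 9000, "otu2": 6000, "otu3": 5, "otu4": 3000}
otu_to_family = {
    "otu1": "Pseudomonadaceae",
    "otu2": "Micromonosporaceae",
    "otu3": "AKIW659",
    "otu4": "Pseudomonadaceae",
}

filtered = drop_low_counts(sample)
rarefied = rarefy(filtered)
abundance = family_relative_abundance(rarefied, otu_to_family)
```

Because the filter is applied per sample, a taxon dropped from one sample can still be retained in another sample where it exceeds the threshold.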
2.3. Modeling the bacterial families from remote sensing using PLSR
Based on the Linear Discriminant Analysis (LDA) thresholds, a total of 81 discriminant family
taxonomic units were extracted for each sample. Then, using a Partial Least Squares
regression (PLSR), the relative abundance of each family was predicted from the hyperspectral
imagery (with the AVIRIS-NG image bands being the predictor variables in the PLSR analysis). PLSR
reduces the predictor variables (i.e., image bands) to a small set of uncorrelated components, and
regression is then performed on this small set of components. As spectral measurements have
predictors (wavebands) which are highly collinear and heavily outnumber observations, PLSR
has become the preferred method of analysis for image spectroscopy and is frequently used
for modeling multiple response variables (Axelsson et al., 2013; Ramoelo et al., 2013; Bi et al.,
2019). The optimal number of PLSR components (with ‘components’ being a term often used
in remote sensing, versus ‘latent variables’ which is used more in biology) was chosen based
on the lowest corrected Akaike Information Criterion (AICc) value and the lowest
RMSE, where AICc is the AIC corrected for small sample sizes (Segata et al., 2011). PLSR
models were run in R (v. 4.1.1) using the ‘pls’ package (v. 2.8-0) with NIPALS (Nonlinear Iterative Partial
Least Squares) (Mevik and Wehrens, 2007) as well as in the Python (3.8.2) environment. Models
were validated using leave-one-out (LOO) cross-validation (Philippot et al., 2009;
Albaina et al., 2016; Evans et al., 2016) (see also Supplementary Fig. S3). Examples of
discriminant bacterial families (Log-LDA score >3) that were highly correlated (R² > 0.5) with
the image spectra in the PLSR models are summarized in Table 1.
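To make the PLSR and leave-one-out steps concrete, the sketch below implements a single-response PLS regression with NIPALS-style deflation and a LOO estimate of RMSE using only the Python standard library. It is an illustration of the general technique, not the code used in the study (which relied on the R ‘pls’ package and a Python environment); the function names, the three-band toy spectra and the synthetic response are assumptions made for the example.

```python
from math import sqrt


def pls1_fit(X, y, n_components):
    """Fit a single-response PLS model by sequential (NIPALS-style) deflation."""
    n, p = len(X), len(X[0])
    x_mean = [sum(col) / n for col in zip(*X)]
    y_mean = sum(y) / n
    E = [[v - m for v, m in zip(row, x_mean)] for row in X]  # centred spectra
    f = [v - y_mean for v in y]                              # centred response
    weights, loadings, coefs = [], [], []
    for _ in range(n_components):
        w = [sum(E[i][j] * f[i] for i in range(n)) for j in range(p)]
        norm = sqrt(sum(v * v for v in w))
        if norm < 1e-12:  # no covariance left between spectra and response
            break
        w = [v / norm for v in w]
        t = [sum(e * wj for e, wj in zip(row, w)) for row in E]  # scores
        tt = sum(v * v for v in t)
        load = [sum(E[i][j] * t[i] for i in range(n)) / tt for j in range(p)]
        q = sum(fi * ti for fi, ti in zip(f, t)) / tt
        E = [[E[i][j] - t[i] * load[j] for j in range(p)] for i in range(n)]
        f = [fi - q * ti for fi, ti in zip(f, t)]
        weights.append(w)
        loadings.append(load)
        coefs.append(q)
    return x_mean, y_mean, weights, loadings, coefs


def pls1_predict(model, x):
    """Predict the response for one spectrum by repeating the deflation."""
    x_mean, y_mean, weights, loadings, coefs = model
    e = [v - m for v, m in zip(x, x_mean)]
    y_hat = y_mean
    for w, load, q in zip(weights, loadings, coefs):
        t = sum(ev * wv for ev, wv in zip(e, w))
        y_hat += q * t
        e = [ev - t * lv for ev, lv in zip(e, load)]
    return y_hat


def loo_rmse(X, y, n_components):
    """Leave-one-out RMSE: refit without sample i, then predict sample i."""
    squared_errors = []
    for i in range(len(X)):
        model = pls1_fit(X[:i] + X[i + 1:], y[:i] + y[i + 1:], n_components)
        squared_errors.append((pls1_predict(model, X[i]) - y[i]) ** 2)
    return sqrt(sum(squared_errors) / len(squared_errors))


# Hypothetical toy data: six "plots", three "bands", noise-free linear response.
X = [[0.10, 0.30, 0.20], [0.20, 0.10, 0.40], [0.40, 0.20, 0.10],
     [0.30, 0.50, 0.30], [0.50, 0.40, 0.60], [0.60, 0.10, 0.20]]
y = [0.5 * a + 0.2 * b - 0.1 * c + 0.05 for a, b, c in X]
rmse_by_components = {k: loo_rmse(X, y, k) for k in (1, 2, 3)}
```

With real imagery the number of bands far exceeds the number of field plots, which is the situation in which restricting the model to a few components matters most.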
Table 1. Examples of bacterial families selected from the 81 family taxonomic units as extracted
for each sample. Families were selected that met the following thresholds: they showed a
significant (P < 0.05; Kruskal-Wallis test) and discriminant association with an ecosystem
(Log-LDA score >3) and they were highly correlated (R² > 0.5) with the image spectra in the PLSR
models. The number of PLSR components was selected based on the lowest AICc value. Mean
relative read abundance (MRA% in Table 1) is shown per family for the full dataset and for
individual ecosystems.
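As an illustration of the Kruskal-Wallis significance threshold named in the caption, the sketch below computes the H statistic (with tie correction) for one family across three ecosystems. It covers only this one screening step and is not a reimplementation of LEfSe. The function name and the abundance values are hypothetical. For three groups the chi-square approximation has 2 degrees of freedom, for which the upper-tail probability is exactly exp(−H/2).

```python
from math import exp


def kruskal_wallis_3groups(groups):
    """Return (H, p) for exactly three groups with at least two distinct values.

    p uses the chi-square approximation with 2 degrees of freedom,
    for which the survival function is exp(-H / 2).
    """
    if len(groups) != 3:
        raise ValueError("this sketch only handles three groups")
    pooled = sorted(v for g in groups for v in g)
    n_total = len(pooled)
    rank_of, tie_term, i = {}, 0, 0
    while i < n_total:
        j = i
        while j < n_total and pooled[j] == pooled[i]:
            j += 1
        rank_of[pooled[i]] = (i + 1 + j) / 2  # mean of ranks i+1 .. j
        tie_term += (j - i) ** 3 - (j - i)
        i = j
    h = 12 / (n_total * (n_total + 1)) * sum(
        sum(rank_of[v] for v in g) ** 2 / len(g) for g in groups
    ) - 3 * (n_total + 1)
    h /= 1 - tie_term / (n_total ** 3 - n_total)  # tie correction
    return h, exp(-h / 2)


# Hypothetical relative abundances (%) of one family in three ecosystems.
tundra = [7.0, 8.0, 9.0]
boreal = [4.0, 5.0, 6.0]
savanna = [1.0, 2.0, 3.0]
h_stat, p_value = kruskal_wallis_3groups([tundra, boreal, savanna])
is_significant = p_value < 0.05
```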
To demonstrate the potential to map and further understand the distribution of microbial
families, we selected three example microbial families (from the 81 pre-selected families) as a
representative biomarker for each of the three ecosystems (Table 2). These three families were
selected based on a high R-squared value from the PLSR and known ecological information
explaining the presence of each family taxon across a landscape based on its ecological
function. When selecting the three family taxa, we considered whether the typical habitat of
the family taxon could be deduced from known hyperspectral remote sensing wavelengths.
For example, it is known that 940–960 nm are wavelengths important for discriminating
tundra vegetation (Bratsch et al., 2017), while the 1305–1355 nm range is associated with the
waterlogged conditions in the Arctic tundra which exposes a greater component of the
soil/ground/water reflectance signal (Knipling, 1970). From the literature, we also considered
the habitat typical of the family taxon and links to ecological function (e.g., amount of green
biomass and non-photosynthetic components such as lignin and cellulose linked to
photosynthesis, the role of canopy nitrogen and phosphorus in nutrient cycling, and role of
biochemical compounds such as leaf proteins and polyphenols). The first selected
microbial family, AKIW659, is abundant in waterlogged, anaerobic, and highly acidic soils, and
proportionally more abundant in tundra wetlands. The second family is Pseudomonadaceae,
which is saprophytic and specialized in decomposing the thick organic matter of boreal forest
floors. Finally, the Micromonosporaceae family is dominant and functionally important in
savanna soils, and forms in herbage with a higher carbon content and leaching of nitrates
(see Table 2 for more details).
Family Description
We ran the PLSR (NIPALS option) with leave-one-out (LOO) cross-validation of the
predictions (Mevik and Wehrens, 2007). For the three selected microbial families, the optimal
number of components (i.e., with the lowest AICc and RMSEP) was 5 for AKIW659 and
Micromonosporaceae, and 3 for Pseudomonadaceae (Table 3). Comparing the measured values with the
values predicted by the leave-one-out cross-validation shows that the PLSR
models are reliable, with an accuracy comparable to that of similar studies (Ramoelo et al., 2013;
Askari et al., 2015; Fernandez-Habas et al., 2021) (Table 3 and Fig. S3).
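The component selection by lowest AICc can be illustrated with the common least-squares form of the criterion, AICc = n·ln(RSS/n) + 2k + 2k(k + 1)/(n − k − 1). The source does not state which form of AICc was used or how parameters were counted, so both the formula and the choice k = components + 1 are assumptions here, as are the function names and the residual sums of squares.

```python
from math import log


def aicc(rss, n, n_components):
    """Small-sample corrected AIC for a least-squares fit (assumed form)."""
    k = n_components + 1  # assumed parameter count: components plus intercept
    if n - k - 1 <= 0:
        raise ValueError("too few samples for the AICc correction")
    return n * log(rss / n) + 2 * k + 2 * k * (k + 1) / (n - k - 1)


def best_n_components(rss_by_components, n):
    """Return the component count with the lowest AICc."""
    return min(rss_by_components,
               key=lambda a: aicc(rss_by_components[a], n, a))


# Hypothetical residual sums of squares for models with 1 to 3 components.
rss_by_components = {1: 8.0, 2: 4.0, 3: 3.9}
best = best_n_components(rss_by_components, n=20)
```

In this toy case the third component barely lowers the residual error, so the penalty terms outweigh the gain and the two-component model is preferred.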
Table 3. Predictions of microbiome family relative abundances using PLSR fitted with AVIRIS-NG
hyperspectral imagery. RMSE and R² values are based on bias-corrected cross-validation
estimates (NIPALS option). Total model variance explained is reported for the optimal number of
components (i.e., with the lowest AICc and RMSEP). See also Supplementary Fig. S1.
We utilized the AVIRIS-NG hyperspectral imagery to predict the relative abundance of the 3
microbial families (AKIW659, Pseudomonadaceae, Micromonosporaceae) across the 3
ecosystems (viz. tundra, boreal and savanna). The PLSR model directly predicted the relative
abundance of each family derived from the eDNA profile data across each AVIRIS-NG image
(using the PLSR ‘predict’ function in Python 3.8.2), with the relative abundance values of the
three selected microbial families being written into three separate image matrices. Finally, we
produced RGB colour composite maps by assigning the relative abundance image of each
family to the red (AKIW659), green (Micromonosporaceae) and
blue (Pseudomonadaceae) channels, respectively (Schmidtlein et al., 2012) (Fig. 2). Using the visible bands
of the AVIRIS-NG imagery, aerial imagery was simulated as a true (also known as ‘natural’)
colour composite (Siok and Ewiak, 2020) (Fig. 2).
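The colour composite step can be sketched as stacking three predicted relative abundance rasters into red, green and blue channels. The channel assignment follows the text; the min-max stretch to 0–255, the function names and the 2 × 2 toy rasters are assumptions for illustration only.

```python
def scale_to_byte(band):
    """Min-max stretch a 2-D list of abundances to integers in 0..255."""
    flat = [v for row in band for v in row]
    lo, hi = min(flat), max(flat)
    span = hi - lo
    if span == 0:  # constant raster: no contrast to display
        return [[0 for _ in row] for row in band]
    return [[round(255 * (v - lo) / span) for v in row] for row in band]


def rgb_composite(red, green, blue):
    """Combine three rasters into rows of (R, G, B) pixel tuples."""
    r, g, b = (scale_to_byte(band) for band in (red, green, blue))
    return [list(zip(r_row, g_row, b_row))
            for r_row, g_row, b_row in zip(r, g, b)]


# Hypothetical 2 x 2 predicted relative abundance rasters.
akiw659 = [[0.0, 0.2], [0.4, 0.1]]             # red channel
micromonosporaceae = [[0.3, 0.3], [0.3, 0.3]]  # green channel
pseudomonadaceae = [[0.5, 0.1], [0.0, 0.2]]    # blue channel

composite = rgb_composite(akiw659, micromonosporaceae, pseudomonadaceae)
```

Stretching each channel independently, as done here, emphasizes the spatial pattern within each family but means colours are not directly comparable between channels; a shared scale would be the alternative choice.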
3. Results
DNA profiles contained on average (±SD) 90,693 ± 49,086 reads per sample after removal of
reads from Eukaryotic organelles, Archaea, and low-frequency noise. Rarefaction to 10,000
reads per sample resulted in a total of 779 OTUs (mean ± SD: 267 ± 59 OTUs per
sample). Seven OTUs were removed as a result of the rarefaction. A total of 81 bacterial families
were detected in the dataset, with a mean (±SD) 52 ± 10 families per profile.
The PLSR provides insight into the microbial community, as well as indicating habitat
requirements by highlighting significant wavelength bands associated with specific microbial
families. From Supplementary Fig. S1, the plotted PLSR scores for the first 2 components
(latent factors 1 and 2) show good separation of the family taxon across the three ecosystems.
For example, AKIW659 has a high relative abundance with PLS1 in the tundra (green dots),
and a low relative abundance in the boreal and savanna, highlighting the importance of the
microbiome family as a biomarker for certain ecosystems.
Examining the PLSR coefficients in combination with the Variable Importance in Projection
(VIP) scores indicates the wavelengths that have a significant impact on predicting the relative
abundance of each family. The wavelengths having an impact on the relative abundance of the
family AKIW659 (Fig. 1a) are at 760 nm, 940–960 nm (double peak), 1120 nm, 1305–1355 nm
(double peak), and 2015 nm. The relative abundance of
the Pseudomonadaceae family is associated with wavelengths at 750 nm, 900 nm and 1120 nm,
with a negative association at 1970–2020 nm
(Fig. 1b). A common family present in savanna/Mediterranean landscapes is the
Micromonosporaceae, which occurred with high relative abundance in our eDNA profile data.
A high relative abundance of Micromonosporaceae is associated with wavelengths at
680 nm, 1480–1490 nm and 1690–1710 nm, and a low relative abundance with wavelengths at 1425 nm and 2100 nm
(Fig. 1c).
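For readers unfamiliar with VIP scores, the sketch below uses the commonly cited definition for a single-response PLS model: VIP_j = sqrt(p · Σ_a SS_a (w_ja / ‖w_a‖)² / Σ_a SS_a), where SS_a = q_a² · t_aᵀt_a is the response variance explained by component a. The source does not give the formula it used, so this definition is an assumption, and the weights, scores and response loadings below are invented values standing in for the outputs of a fitted PLSR model.

```python
from math import sqrt


def vip_scores(weights, scores, y_loadings):
    """VIP per band for a single-response PLS model.

    weights[a][j]  : weight of band j in component a
    scores[a][i]   : score of sample i on component a
    y_loadings[a]  : response loading of component a
    """
    n_bands = len(weights[0])
    explained = [q * q * sum(t * t for t in ts)
                 for q, ts in zip(y_loadings, scores)]
    total = sum(explained)
    vip = []
    for j in range(n_bands):
        acc = 0.0
        for w, ss in zip(weights, explained):
            acc += ss * w[j] ** 2 / sum(v * v for v in w)
        vip.append(sqrt(n_bands * acc / total))
    return vip


# Hypothetical model with two components, three bands and four samples.
weights = [[0.8, 0.6, 0.0], [0.0, 0.6, 0.8]]
scores = [[1.0, -1.0, 0.5, -0.5], [0.2, -0.2, 0.1, -0.1]]
y_loadings = [2.0, 1.0]

vip = vip_scores(weights, scores, y_loadings)
informative = [j for j, v in enumerate(vip) if v > 1.0]
```

By construction the squared VIP scores average to 1 across bands, which is why VIP > 1 is the usual cut-off for an informative band.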
Fig. 1. The VIP scores and regression coefficients for families a) AKIW659 b)
Pseudomonadaceae c) Micromonosporaceae generated by the PLSR analysis for the three
ecosystems (boreal, savanna and tundra). VIP scores below the red line (VIP<1) indicate parts
of the spectrum that are non-informative for predicting the relative abundance of the
respective family. The vertical blue lines indicate the wavelengths that are informative with
Fig. 2. a–f: True colour composite image segments of tundra (2a), boreal (2b), savanna (2c).
Relative abundance of the three families (viz. AKIW659, Pseudomonadaceae,
Micromonosporaceae) in the tundra (2d), boreal (2e) and savanna (2f). Red dots indicate
sample locations that are overlapping with the shown image segments – note that some
samples are located on adjacent imagery (which are not shown). (For interpretation of the
references to colour in this figure legend, the reader is referred to the Web version of this
article.)
The relative abundance of the 3 families in the tundra, boreal and savanna hyperspectral
images was modelled from the reflectance values of each image and compared with a true
colour image (Fig. 2). To interpret the relative abundance imagery (Schmidtlein et al., 2012), the
colour of each pixel indicates the relative abundance of each family projected as a mix of the 3
colour guns, with the dominant colour indicating the family with the higher relative abundance
in the landscape. A cursory inspection of the relative abundance images
for the tundra, boreal and savanna imagery (Fig. 2) highlights an overall reddish colour for the
tundra (where AKIW659 dominates), while the boreal is dominated by blue
(Pseudomonadaceae) and the savanna by green (Micromonosporaceae). These relative
abundances of the families are confirmed by summary statistics (Table 1). In Fig. 2, larger
areas of similar colour are hotspots of higher relative abundance for that specific mix of
microbial families. Black pixels are water pixels, masked using negative NDVI values.
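The water mask can be sketched with the standard NDVI definition, (NIR − Red) / (NIR + Red), setting pixels with a negative value to black. Which AVIRIS-NG bands were used for NIR and red is not stated in the text, so the reflectance values, the function names and the 1 × 2 toy image are hypothetical.

```python
def ndvi(nir, red):
    """Normalized Difference Vegetation Index for one pixel."""
    total = nir + red
    return 0.0 if total == 0 else (nir - red) / total


def mask_water(rgb, nir_band, red_band):
    """Set pixels with negative NDVI to black, leaving others unchanged."""
    return [
        [(0, 0, 0) if ndvi(n, r) < 0 else pixel
         for pixel, n, r in zip(pixel_row, nir_row, red_row)]
        for pixel_row, nir_row, red_row in zip(rgb, nir_band, red_band)
    ]


# Hypothetical 1 x 2 image: the first pixel is water, the second vegetation.
rgb = [[(255, 0, 0), (0, 255, 0)]]
nir_band = [[0.05, 0.40]]
red_band = [[0.10, 0.08]]

masked = mask_water(rgb, nir_band, red_band)
```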
4. Discussion
The family AKIW659 occurs with high relative abundance in the tundra, the
Pseudomonadaceae family is abundant in the boreal, and for Mediterranean climates the
Micromonosporaceae family has a high relative abundance (Table 1). The relative
abundance of AKIW659 is associated with an important (VIP >1.0) negative regression coefficient
at 760 nm, highlighting the contrast between the lack of leaf structure of lichens/mosses
(Kuusinen et al., 2020) (i.e., poorly developed NIR shoulder) in the tundra with the well-
developed NIR shoulder structure typical of grasses, shrubs and trees in tundra, boreal and
savanna vegetation (Schmidt and Skidmore, 2001; Hope et al., 1993) (Fig. 1a). The wavelength
range 660–680 nm has a high regression coefficient (Fig. 1a) and has been shown to be
statistically significant for differentiating vegetation communities in this region west of
Barrow, Alaska (Davidson et al., 2016). This part of the visible spectrum, coinciding with a peak
contribution of the regression coefficient and VIP (0.7), differentiates grasses (including
hummock grass) and non-ericaceous shrubs from mosses (Tieszen and Johnson, 1968), with a
higher chlorophyll a pigment concentration being associated with green vegetation (Liu et al.,
2017). We do note that even though there is a high correlation between relative abundance of
AKIW659 and this wavelength, the VIP of 0.7 indicates the wavelength does not contribute a
large amount of information to the model. The double peak regression coefficient at 940–
960 nm has been noted as a wavelength range important for discriminating tundra vegetation
(Bratsch et al., 2017). This concurs with findings that structural carbon elements (including
lignin, starch, and cellulose) are strongly associated with the SWIR at 950–1000 nm (starch)
and at 1120 nm (lignin and cellulose) (Curran, 1989), and as noted in Table 2 the AKIW659
family has a specialized role in carbon cycling by oxidizing carbohydrates. The strong negative
coefficient feature at 1305–1355 nm is associated with the waterlogged conditions of the Arctic
tundra, which also has a lower leaf area index (LAI) contrasting with the boreal forests to the
south, exposing a greater component of the soil/ground/water reflectance signal, interspersed
with sparser shrub and grass vegetation (Knipling, 1970). The atmospheric absorption bands
for atmospheric cloud and water vapour around 1250–1400 nm are generally not of interest to
terrestrial remote sensing, being included in bands on satellites for automatic cirrus cloud
detection at 1375 nm (Richter et al., 2011).
The Pseudomonadaceae family occurred with high relative abundance in all boreal samples
(Table 1). In the boreal forest, tree spectra dominate an increasing part of hyperspectral image
reflectance. The 1120 nm wavelength is known to be associated with lignin (Curran, 1989)
which is an important structural carbohydrate component of the fibre content of leaves and
litter in boreal forests (Richardson, 2004), and the PLSR analysis indicates this wavelength
range is important for explaining the variability of the relative abundance of the
Pseudomonadaceae family (Fig. 1b). The association of the red edge wavelength at 680 nm
(Gitelson et al., 1996) with the relative abundance of Pseudomonadaceae (Fig. 1b)
can be explained by the chlorophyll content of the tree species of the Canadian and Alaskan
Shield reflecting relatively more near-infrared energy compared to the wetland areas of the
tundra. The 750–925 nm range is part of the spectrum associated with light scattering due to
leaf and canopy structure (Ali et al., 2016) coupled with low absorption by water, lignin and
cellulose, resulting in high plant reflectance and transmittance (Rautiainen et al., 2018). The
wavelength range 1970–2020 nm associates with higher protein/nitrogen as well as
cellulose/starch content leading to lower reflectance (Curran, 1989). In boreal forests, there is a
relative increase in the lignin and cellulose content resulting from slower litter
decomposition at the cooler temperatures of more northerly latitudes (Richardson, 2004). Low
soil macronutrient levels, especially of nitrogen and phosphorus, further reduce the rate of litter
decomposition (Archibold, 1995). Boreal forest soils are generally podzolic and acidic, with a
well-developed leaf litter layer containing a relatively high content of cellulose and lignin
(Handley, 1954) providing a suitable habitat (Table 3) for the observed high relative abundance
of Pseudomonadaceae.
For Mediterranean climates, the Micromonosporaceae have been isolated from soil and plant
materials (Genilloud Trujillo et al., 2015), and in our data have a high relative abundance
(Table 1), as this family is able to withstand and colonize drier habitats evidenced by a number
of stress related genes being detected (Carro et al., 2019). Micromonosporaceae can withstand
high temperatures (Khan et al., 2020). This family has been isolated in semiarid grasslands and
is thought to have an important role in the degradation of cellulose (Yeager et al., 2017). The
640–660 nm wavelength associated with Micromonosporaceae (Fig. 1c) relates to increased
canopy nitrogen content and chlorophyll a+b absorption (Curran, 1989) resulting from
Micromonosporaceae taxa having a function in plant growth promotion (including nitrogen
fixation) (Hirsch and Valdes, 2010) through plant/rhizosphere nitrogen fixing nodules (Trujillo
The promising results demonstrate the potential to interpret the relative abundance and
ecological function of the three selected families for the savanna, boreal and tundra habitats,
from the colour composite images (Fig. 2). For example, of the three selected families, the
Pseudomonadaceae (bright blue pixels) dominate in the boreal (Fig. 2e). In Fig. 2e, the abundance
of Pseudomonadaceae is relatively higher (bluer colour) on the north facing slopes associated
with the mesic black spruce (coniferous forest) (Fig. 2b) compared with the south facing slopes
dominated by drier hardwood forests (LTER, 2021); see the NEON field site network Caribou-Poker
Creeks Research Watershed data for reference (https://www.neonscience.org/field-sites/bona).
The south facing slopes have Micromonosporaceae mixing with the
Pseudomonadaceae (i.e., greener tinge in Fig. 2e), which is associated with increasing mixed
coniferous-broadleaf forest, broadleaf forest and open woodland/tussock vegetation (Haugen
et al., 1982). The warmer microclimate on the southern facing slopes also makes this habitat
more suitable for Micromonosporaceae compared to the colder northern facing slopes (Table
2). The broadleaf (birch-aspen dominated) forest is a persistent seral hardwood community
maintained by repeated disturbance approximately every 75 years, with closed canopy and a
shrub/herbaceous understory (Boucher and Reid, 2010), and is associated with a higher
relative abundance of Micromonosporaceae. Regular disturbance results in spruce exclusion
as this species is fire intolerant. Because conifer litter decays more slowly than hardwood litter
(Fassnacht and Gower, 1999; McClaugherty et al., 1985), the relative abundance of
Pseudomonadaceae will be higher under conifers. Further, we note the mixing of the
Micromonosporaceae (green-blue areas) in the boreal-taiga vegetation as well as a trace of
AKIW659 (red) in the wetter bog areas (Fig. 2b).
Forest vegetation is the habitat driving the relative abundance of the microbiome. Land cover
maps, such as that for the Caribou-Poker catchment (Haugen et al., 1982), show distinct
vegetation types across which the microbial families subtly merge, and these microbial
families associate with specific habitats across vegetation types (e.g., AKIW659 is in the
phylum Acidobacteria and characterized as an aerobic heterotroph typically occurring with
sedges, grasses and various mosses). This subtle grading of microbial families within
specific habitat types allows the relative abundance of the microbiome taxa to be visualized
across a vegetation or habitat type. Our novel approach provides detailed spatial information
about the microbiome biodiversity across vegetation habitats, albeit requiring eDNA and
hyperspectral imagery and relatively complicated processing. As discussed above for the three
example microbial families that are representative biomarkers for each of the three
ecosystems, we deduce that the functional ecology of each microbiome family mirrors the
canopy functional traits through the soil resources, as proposed by the Spectranomics
approach (Asner and Martin, 2016). Canopy plant species have evolved plant functional types
and traits that echo soil structure and nutrient concentration (Massmann et al., 2022), which
in turn are further adapted by the abiotic conditions of a particular location including climate,
geology, and topography (Higgins et al., 2014). Image spectroscopy can associate ecosystem
processes with plants, for example through biogeochemical cycles, thereby mirroring soil
conditions (Asner and Martin, 2016) and specifically the soil microbiome (Aponte et al., 2013).
In the approach presented here, spectral properties mirror the biochemical, structural, and
functional properties of vegetation. That is, the vegetation or Plant Functional Types (PFTs) are
indicators of soil properties as well as the soil microbiome. PFTs are an
important concept linking species with broad vegetation types, based on ecosystem structure and
ecosystem function (phenology and physiology), as well as mirroring plant responses to soil
and ecosystem conditions (Ustin and Gamon, 2010). Climate, as well as the soil biophysical
environment, controls the ecological niche and a biome's vegetation structure and function (Ustin
and Gamon, 2010).
The relationships between above ground vegetation and soil processes are known to be both
species and ecosystem specific (Ma et al., 2020; Hess and Austin, 2014), with patterns and
processes even demonstrated at global scales (Ordonez et al., 2009). Understanding soil-plant
relationships requires knowledge of species interactions and soil processes at the level of
ecosystems. While spectral indicators of the biota may be context-dependent, many general
relations have been developed (Roy et al., 2006), which allow deductive interpretation of
rhizosphere functions, as demonstrated in this study for three ecosystems across North
America, and further increase understanding of soil microbial processes at a global scale
(Niemann et al., 2015). Soil microbiome biodiversity as measured by eDNA reveals a complex
community composition. Microbiome families respond to habitat, with the habitat
comprising the physical and biotic factors that support the species’ survival and reproduction
(Kiesewetter and Afkhami, 2021; Wang et al., 2022; Keet et al., 2019). By considering the
co-occurring eDNA and image spectroscopy data, our results demonstrate that ecosystem
functions, and hence related processes such as the presence and relative abundance of
microbiome families, can be deductively interpreted, even in ecosystems with vegetation cover.
Given the eDNA profile data contain thousands of taxonomic groups, which in this study were
reduced to a list of 81 discriminative families (see Table 1 for some examples of these 81
families), it is straightforward to repeat our novel method for different families as well as at
different taxonomic ranks. For example, the three families (viz. AKIW659, Pseudomonadaceae,
Micromonosporaceae) chosen to demonstrate our method (Fig. 1) could be replaced in the
colour composite images with any combination of three families from the 81
discriminative families, or indeed other taxonomic ranks such as species, order, or phylum. A
key insight of this paper is that it is possible, with the highly multivariate (“deep”) eDNA
profile data, to predict the relative distribution of any combination of the taxonomic groups
present in the study area from remote sensing. We show that it is also methodologically
straightforward to discover the (multivariate) relations between the microbiome taxa (in an
eDNA profile database), the image spectroscopy, and environmental data, as demonstrated
above for the Pseudomonadaceae family.
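The substitution of families in the colour composite described above can be illustrated with a short sketch. It is only an illustration under stated assumptions, not the processing used in this study: the three abundance maps are random placeholder arrays, the function names `stretch` and `rgb_composite` are hypothetical, and the 2–98 percentile stretch is an assumed display choice. The channel assignment (AKIW659 red, Micromonosporaceae green, Pseudomonadaceae blue) follows the colours described for Fig. 2.

```python
import numpy as np


def stretch(band, low=2, high=98):
    """Linearly rescale a 2-D array to [0, 1] between two percentiles."""
    lo, hi = np.nanpercentile(band, [low, high])
    if hi <= lo:  # constant band: avoid division by zero
        return np.zeros_like(band, dtype=float)
    return np.clip((band - lo) / (hi - lo), 0.0, 1.0)


def rgb_composite(red, green, blue):
    """Stack three relative-abundance maps into a (rows, cols, 3) RGB image."""
    return np.dstack([stretch(red), stretch(green), stretch(blue)])


# Placeholder (hypothetical) predicted relative-abundance maps, one per family.
rng = np.random.default_rng(1)
shape = (4, 5)
akiw659 = rng.random(shape) * 0.02             # displayed as red
micromonosporaceae = rng.random(shape) * 0.05  # displayed as green
pseudomonadaceae = rng.random(shape) * 0.10    # displayed as blue

composite = rgb_composite(akiw659, micromonosporaceae, pseudomonadaceae)
print(composite.shape)
```

Replacing any of the three input arrays with the predicted map of another family (or another taxonomic rank) yields a new composite without changing the rest of the code.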
The relatively limited number of samples may lead to overfitted statistical models
(Shen et al., 2020). We acknowledge that ideally more samples would be available in
this study, but the scarcity of publicly available image spectroscopy data that coincide
with eDNA profile data across natural areas in North America constrained the study. Although
research indicates that as few as 20–30 samples may be adequate for significant and
meaningful PLSR analyses (Bastien et al., 2005; Tenenhaus et al., 2005; Kock and Hadaya,
2018), other researchers propose more samples (>100) (Burnett et al., 2021). Several image
spectroscopy studies using hard-to-collect biodiversity variables have successfully interpolated
environmental variables using n < 25 as well as 26 < n < 50 (Axelsson et al., 2013;
Ramoelo et al., 2013; Bi et al., 2019; Askari et al., 2015; Cho et al., 2007). Additionally,
reducing the PLSR predictors to between 3 and 5 regression components further diminishes
model overfitting (Shen et al., 2020). We checked overfitting of the PLSR
model using standard leave-one-out (LOO) cross-validation for the predicted versus measured
scores, demonstrating that results are repeatable and robust (Ramoelo et al., 2013; Askari et al.,
2015; Fernandez-Habas et al., 2021) (Supplementary Fig. S1), with
RMSEP between 0.01 and 0.04 and R² between 0.54 and 0.85 for the measured versus predicted
values of the three families (Table 3). We anticipate other scientists will further test and
develop the concept described in this paper for robustness and repeatability when mapping
the microbiome from remote sensing, as more eDNA data become available that are
coincident with hyperspectral images.
As well as complementing in situ data collection, the remote sensing method
demonstrated here may be further upscaled with respect to the number of taxa for which the
relative abundance is estimated, potentially providing further insight into
ecosystem health. The so-called ‘Linnean gap’ (i.e., gaps caused by most species not being
described with traditional sampling) may easily be filled by increasing the number of marker
genes for additional species, as part of the eDNA analysis process (Bini et al., 2006). If multiple
images obtained at different dates are available, then it will be possible to interpolate
microbial species relative abundance using time-series imagery (Zhang et al., 2003).
5. Conclusion
The requirements for monitoring anthropogenic biodiversity change, as detailed by the UN
Sustainable Development Goals as well as the UN Convention on Biological Diversity Aichi Targets,
necessitate the application of novel technologies to provide spatially contiguous information
rapidly and accurately to update biodiversity information for companies, government, and
NGOs. The post-2020 Biodiversity Targets of the UN Convention on Biological Diversity are not
merely aspirational; the implementation of policy is currently constrained by the lack of data
and indicators (Geijzendorffer et al., 2016). Several imaging spectroscopy satellites with spatial
sampling of 20–30 m have been or will be launched between 2019 and 2026, ushering in real
opportunities to combine this newly emerging space technology with eDNA in situ taxonomic
data. Here we demonstrate that ecosystem function can be the bridge between these two
rapidly developing technological approaches, combining the massive information content of
in situ eDNA data with the high spatial accuracy and global overview provided by imaging
spectroscopy. As reference collections of eDNA barcode taxonomic databases are coupled with
next generation satellite imaging spectrometers, a more comprehensive and complete picture
of biodiversity delineating species across all kingdoms of life will become possible by applying
and building upon the techniques presented in this paper. Combining eDNA and imaging
spectroscopy holds the promise of accurately, rapidly, and cost-effectively monitoring
biodiversity over the terrestrial surface of the Earth.
Acknowledgment
This project has received funding and support from the European Research Council (ERC)
under the European Union's Horizon 2020 research and innovation programme (grant
agreement n° 834709).
References
Albaina et al., 2016 A. Albaina, M. Aguirre, D. Abad, M. Santos, A. Estonba
18S rRNA V9 metabarcoding for diet characterization: a critical evaluation with two
sympatric zooplanktivorous fish species
Ecol. Evol., 6 (2016), pp. 1809-1824, 10.1002/ece3.1986
Ali et al., 2016 A.M. Ali, R. Darvishzadeh, A.K. Skidmore, I. van Duren
Effects of canopy structural variables on retrieval of leaf dry matter content and
specific leaf area from remotely sensed data
IEEE J. Sel. Top. Appl. Earth Obs. Rem. Sens., 9 (2016), pp. 898-909,
10.1109/jstars.2015.2450762
Bini et al., 2006 L.M. Bini, J.A.F. Diniz, T. Rangel, R.P. Bastos, M.P. Pinto
Challenging Wallacean and Linnean shortfalls: knowledge gradients and conservation
planning in a biodiversity hotspot
Divers. Distrib., 12 (2006), pp. 475-482, 10.1111/j.1366-9516.2006.00286.x
Chariton et al., 2010 A.A. Chariton, L.N. Court, D.M. Hartley, C.M. J, C.M. Hardy
Ecological assessment of estuarine sediments by pyrosequencing eukaryotic ribosomal
DNA
Front. Ecol. Environ., 8 (2010), pp. 233-238
Cho et al., 2007 M.A. Cho, A. Skidmore, F. Corsi, S.E. van Wieren, I. Sobhan
Estimation of green grass/herb biomass from airborne hyperspectral imagery using
spectral indices and partial least squares regression
Int. J. Appl. Earth Obs. Geoinf., 9 (2007), pp. 414-424, 10.1016/j.jag.2007.02.001
https://directory.eoportal.org/web/eoportal/satellite-missions/c-missions/chime-
copernicus (2022)
Haugen et al., 1982 R.K. Haugen, C.W. Slaughter, K.E. Howe, S.L. Dingman
Hydrology and Climatology of the Caribou-Poker Creeks Research Watershed
(1982), p. 41
Washington, D.C.
Keet et al., 2019 J.H. Keet, A.G. Ellis, C. Hui, J.J. Le Roux
Strong spatial and temporal turnover of soil bacterial communities in South Africa's
hyperdiverse fynbos biome
Soil Biol. Biochem., 136 (2019), 10.1016/j.soilbio.2019.107541
Khan et al., 2020 M.J. Khan, S.D. Jurburg, J.Z. He, G. Brodie, D. Gupta
Impact of microwave disinfestation treatments on the bacterial communities of no-till
agricultural soils
Eur. J. Soil Sci., 71 (2020), pp. 1006-1017, 10.1111/ejss.12867
Koyama et al., 2014 A. Koyama, M.D. Wallenstein, R.T. Simpson, J.C. Moore
Soil bacterial community composition altered by increased nutrient availability in
Arctic tundra soils
Front. Microbiol., 5 (2014), p. 16, 10.3389/fmicb.2014.00516
Li et al., 2016 N. Li, N.P. He, G.R. Yu, Q.F. Wang, J. Sun
Leaf non-structural carbohydrates regulated by plant functional groups and climate:
evidences from a tropical to cold-temperate forest transect
Ecol. Indicat., 62 (2016), pp. 22-31, 10.1016/j.ecolind.2015.11.017
Liu et al., 2014 L.Y. Liu, W.J. Huang, R.L. Pu, J.H. Wang
Detection of internal leaf structure deterioration using a new spectral ratio index in
the near-infrared shoulder region
J. Integr. Agric., 13 (2014), pp. 760-769, 10.1016/s2095-3119(13)60385-8
Massmann et al., 2022 A. Massmann, M.A. Cavaleri, S.F. Oberbauer, P.C. Olivas, S. Porder
Foliar stoichiometry is marginally sensitive to soil phosphorus across a lowland
tropical rainforest
Ecosystems, 25 (2022), pp. 61-74, 10.1007/s10021-021-00640-w
McClaugherty et al., 1985 C.A. McClaugherty, J. Pastor, J.D. Aber, J.M. Melillo
Forest litter decomposition in relation to soil-nitrogen dynamics and litter quality
Ecology, 66 (1985), pp. 266-275, 10.2307/1941327
Orwin et al., 2018 K.H. Orwin, I.A. Dickie, R. Holdaway, J.R. Wood
A comparison of the ability of PLFA and 16S rRNA gene metabarcoding to resolve soil
community change and predict ecosystem functions
Soil Biol. Biochem., 117 (2018), pp. 27-35, 10.1016/j.soilbio.2017.10.036
Prescott et al., 2004 C.E. Prescott, L. Vesterdal, C.M. Preston, S.W. Simard
Influence of initial chemistry on decomposition of foliar litter in contrasting forest
types in British Columbia
Can. J. For. Res., 34 (2004), pp. 1714-1729, 10.1139/x04-040
Rahman et al., 2003 A.F. Rahman, J.A. Gamon, D.A. Sims, M. Schmidts
Optimum pixel size for hyperspectral studies of ecosystem function in southern
California chaparral and grassland
Remote Sens. Environ., 84 (2003), pp. 192-207, 10.1016/s0034-4257(02)00107-4