
Journal Pre-Proof: Earth-Science Reviews


Journal Pre-proof

Monitoring inland water quality using remote sensing: potential
and limitations of spectral indices, bio-optical simulations,
machine learning, and cloud computing

Vasit Sagan, Kyle T. Peterson, Maitiniyazi Maimaitijiang,
Paheding Sidike, John Sloan, Benjamin A. Greeling, Samar
Maalouf, Craig Adams

PII: S0012-8252(20)30233-6
DOI: https://doi.org/10.1016/j.earscirev.2020.103187
Reference: EARTH 103187

To appear in: Earth-Science Reviews

Received date: 17 August 2019
Revised date: 18 April 2020
Accepted date: 20 April 2020

Please cite this article as: V. Sagan, K.T. Peterson, M. Maimaitijiang, et al., Monitoring
inland water quality using remote sensing: potential and limitations of spectral indices,
bio-optical simulations, machine learning, and cloud computing, Earth-Science Reviews
(2020), https://doi.org/10.1016/j.earscirev.2020.103187

This is a PDF file of an article that has undergone enhancements after acceptance, such
as the addition of a cover page and metadata, and formatting for readability, but it is
not yet the definitive version of record. This version will undergo additional copyediting,
typesetting and review before it is published in its final form, but we are providing this
version to give early visibility of the article. Please note that, during the production
process, errors may be discovered which could affect the content, and all legal disclaimers
that apply to the journal pertain.

© 2020 Published by Elsevier.


Monitoring inland water quality using remote sensing: potential and
limitations of spectral indices, bio-optical simulations, machine learning, and
cloud computing
Vasit Sagan 1,2,*, Kyle T. Peterson 2, Maitiniyazi Maimaitijiang 1,2, Paheding Sidike 2,6,
John Sloan 3, Benjamin A. Greeling 4, Samar Maalouf 5, Craig Adams 5
1 Geospatial Institute, Saint Louis University, 3694 West Pine Mall, St. Louis, MO 63108, USA
2 Department of Earth & Atmospheric Sciences, Saint Louis University, St. Louis, MO 63108
3 National Great Rivers Research and Education Center, East Alton, Illinois, USA
4 The U.S. Army Corps of Engineers, St. Louis District, 1222 Spruce Street, St. Louis, MO 63103
5 Department of Civil Engineering, Saint Louis University, St. Louis, MO 63108
6 Department of Electrical and Computer Engineering, Purdue University Northwest, Hammond,
IN, United States
* Corresponding author: vasit.sagan@slu.edu

Abstract

Given recent advances in remote sensing analytics, cloud computing, and machine learning,
it is imperative to evaluate the capabilities of remote sensing for water quality monitoring in the
context of water resources management and decision-making. The objectives of this review were
to analyze recent advances in water quality remote sensing, determine the limitations of current
systems and estimation methods, and suggest future improvements. To that end, we collected over
200 sets of water quality data including blue-green algae phycocyanin (BGA-PC), chlorophyll-a
(Chl-a), dissolved oxygen (DO), specific conductivity (SC), fluorescent dissolved organic matter
(fDOM), turbidity, and pollution-sediments from 2016 to 2018. The water quality data,
generated from laboratory analysis of grab samples and in-situ real-time monitoring sensors
distributed in eight lakes and rivers in the Midwestern United States, were paired with
synchronous proximal spectra, tripod-mounted hyperspectral imagery, and satellite data. The
results showed that both proximal and satellite-based sensors have great potential to provide
accurate estimates of optically active parameters, whereas non-optically active parameters can
only be estimated indirectly and remain a challenge. Data-driven empirical approaches, in
particular deep learning, outperformed the competing methods, showing promise for the
operational use of remote sensing in water quality monitoring and decision-making. As the first
review of deep neural networks for water quality estimation, the paper concludes that
anomaly detection utilizing multi-sensor data fusion and virtual constellations in cloud computing
is the most promising means of predicting impending water pollution outbreaks such as algal
blooms.

1. Introduction
1.1 Inland water quality and importance of satellite-based monitoring
Water quality is broadly defined as the biological, chemical, and physical characteristics of
water to be maintained to meet the needs of various water usages including drinking, irrigation,
and recreation (IOCCG 2018a). It is often measured by a number of parameters, e.g., turbidity and
concentrations of chlorophyll-a, pollution-sediment, dissolved oxygen, nutrients, and harmful
algae. While water temperature, metals, submerged habitat, salinity, and many other
contaminants are also considered indicators of water quality, especially for coastal ecosystems
(Mullerkarger 1992), the scope of this paper is limited to inland water bodies with indicators that
are measurable from remote sensing. Pollution-sediment is often interchangeably described by
either suspended-sediment concentration (SSC) or total suspended solids (TSS) in the literature.
TSS is the total organic and inorganic suspended solids in a water column, which includes
chlorophyll-a. Both SSC and TSS measure the amount of suspended solid-phase material in
water; they differ in the analytical method used. Both are determined from the dry weight of the
sediment in a known volume of a water-sediment mixture, but all the sediment in that mixture is
used to produce SSC while a subsample of the original is used to measure TSS (Gray et al. 2000).
Another related parameter, total dissolved solids (TDS), is a measure of the amount of dissolved
inorganic salts and small amounts of organic matter in water. The primary sources of TDS are
usually agricultural and residential runoff and contaminated substances from sewage treatment
plants.


Access to clean water has become an increasingly critical issue worldwide due to a growing
population and increasing pollutant loads into freshwater ecosystems from climate and land use
changes. For example, water pollution and eutrophication are estimated to cause $6 to $16
billion in economic loss (OECD 2012) and over 3 million deaths (WHO 2006) per year globally.

Water quality is affected by nutrient and sediment loads. Excess nitrogen and/or phosphorus
can create conditions for algal blooms and aquatic plant growth, leading to eutrophication and
fish kills (Chapman 1996). Sediment contributes to water quality deterioration by altering the
hydrologic system, including channel degradation that can threaten levees and bridge piers,
reduced performance of intake structures that can threaten water supplies, and erosion of sand
bars and islands. In addition, changes in land use, urbanization, and climate affect nutrient
cycling (Jordan et al. 2014) by altering the flow volume and rate in streams. This in turn can lead
to bank erosion in local waterways (Herrick and Whitford 1995). Nutrient-rich runoff from
fertilizers and animal waste can cause algal blooms and eutrophication in surface waters (Visser
et al. 2005), changing water chemistry and habitats, and increasing nutrients and contaminants
that contribute to the formation of algal blooms and to problems such as Gulf hypoxia (Pereira et
al. 2018). Farmers use fertilizers rich in phosphorus and nitrogen to grow crops, but excess
nutrients can be transported to streams. Phosphorus and nitrogen are considered limiting
nutrients as they are closely related to algae and aquatic plant growth. These nutrients in streams
are mainly sourced from agricultural fertilizers, urban runoff, industrial wastes, and discharges
from sewage treatment plants. When available in excess quantity in surface waters, they can lead
to eutrophication (with an associated decrease of dissolved oxygen) and harmful algal growth,
which affect human and animal health. Therefore, policy makers, the public, and researchers
have made efforts to reduce nutrients in surface waters
(https://www.epa.gov/ms-htf/history-hypoxia-task-force).
Increasing nutrient and chemical loads from agricultural and urban runoff, together with
climate change characterized by warmer temperatures and rainfall anomalies, have caused a
widespread global increase in algal blooms (including those that are toxic to humans)
(Carmichael and Boyer 2016; Davis et al. 2009; Paerl and Otten 2013; Rastogi et al. 2015).
Some cyanobacteria, including Microcystis aeruginosa, are harmful algal bloom (HAB)
organisms, which can limit recreational use of lakes and coastal waters, reduce dissolved oxygen
in waterbodies, and cause taste and odor issues. Importantly, Microcystis aeruginosa is known to
produce powerful hepatotoxins (i.e., microcystins), which are associated with liver cancer and
tumors in humans and wildlife (Carmichael and Boyer 2016; Kudela et al. 2015; Rastogi et al.
2015). The timing of fertilizer application and harvest as well as the concentrations and chemical
make-up of nutrients dictate the overall magnitude and extent, species composition, and duration
of algal blooms. Given that cyanobacteria differ significantly in cell size, secondary morphology,
and toxin content per cell among and within strains, populations, species, and genera
(Carmichael and Boyer 2016), it is challenging to implement standards for bloom monitoring and
assessment. Since the establishment of the Mississippi River/Gulf of Mexico Watershed Nutrient
Task Force (HTF) in 1997, significant progress has been made to reduce nutrients by
implementing specific reduction plans nationally (U.S. EPA 2017). However, there is no routine
assessment of cyanobacteria in many parts of the world, partly due to the limited capabilities and
resources available to agencies to monitor HABs at watershed scales.


This highlights the critical need to develop a capacity to monitor water quality at regional
and global scales and to predict when and where outbreaks may occur, thus enabling policy
makers and water resources managers to take proactive measures to mitigate the adverse effects
of water pollution. However, water quality modeling and discrete observation facilities alone do
not provide sufficient and effective information for decision-making. Satellite-based monitoring
techniques, on the other hand, are cost-effective and provide water quality conditions over
extensive spatial and temporal scales. When calibrated with in-situ measurements, satellite-based
models provide time-stamped, geotagged data from which trends and anomalies in the
concentration of key water quality indicators can be inferred, improving confidence in predicting
impending water quality outbreaks.
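As a rough illustration of how anomalies in a calibrated time series might be flagged, the following Python sketch compares each new estimate with a trailing baseline. The function name `flag_anomalies`, the window length, the threshold, and the Chl-a values are all hypothetical and are not taken from the datasets of this review; an operational system would likely need more robust statistics and seasonal baselines.

```python
from statistics import mean, stdev

def flag_anomalies(series, window=5, threshold=3.0):
    """Return indices whose value departs from the trailing-window mean
    by more than `threshold` sample standard deviations."""
    flagged = []
    for i in range(window, len(series)):
        baseline = series[i - window:i]
        mu = mean(baseline)
        sigma = stdev(baseline)
        if sigma == 0:
            continue  # a flat baseline gives no scale to compare against
        if abs(series[i] - mu) / sigma > threshold:
            flagged.append(i)
    return flagged

# Hypothetical satellite-derived Chl-a estimates (ug/L) at one location.
chl_a = [4.1, 3.9, 4.3, 4.0, 4.2, 4.1, 12.5, 4.4]
print(flag_anomalies(chl_a))  # index 6 is the only value far from its baseline
```

Note that once an extreme value enters the trailing window it inflates the baseline spread, so later observations are judged less strictly; this is one reason the sketch should not be read as a recommended detector.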
1.2 Development of remote sensing for water quality
Water absorbs radiation in the red and near-infrared regions of the electromagnetic spectrum,
and no reflectance signal is detectable from a clear body of water at wavelengths of 750 nm or
longer (Zhang et al. 2014). However, algal bloom reflectance is detectable because of
light-harvesting pigments such as chlorophyll-a and phycocyanin (Vincent et al. 2004). A growing
body of research demonstrates that optical and thermal sensors can extract information on water
quality with high spectral and spatial resolution (Zhang et al. 2002). Watershed-scale models
developed from Landsat, MODIS, and MERIS sensors are established for estimating
chlorophyll-a and suspended sediment levels in large rivers (Lim and Choi 2015; Olmanson et al.
2013), reservoirs (Hansen et al. 2015; Song et al. 2012c), and lakes (Becker et al. 2009; Gomez
et al. 2011; Matthews et al. 2012; Olmanson et al. 2008; Olmanson et al. 2011; Song et al. 2010;
Wang et al. 2012; Wang et al. 2013), and have been demonstrated to be applicable at global
scales (Ho et al. 2019). However, not all water quality variables can be directly measured by
current satellites, including nutrient (i.e., total nitrogen and phosphorus) concentrations,
dissolved oxygen levels, and microorganisms/pathogens, because some of these variables are not
optically active or because hyperspectral data at fine spatial resolutions are lacking. Given that
nutrients are significant factors in the initiation, propagation, and maintenance of HABs
(Schindler 1977), the ability to remotely estimate nutrients and a suite of other water quality
parameters associated with toxins is imperative. Airborne hyperspectral studies demonstrated
that ratio indices, often constructed using the scattering peak at the red edge (700 nm) and the
reflectance troughs caused by chlorophyll absorption at 670 nm and pigment absorption at 592 or
620 nm, can be used to predict total nitrogen, total phosphorus, and suspended sediment
(Olmanson et al. 2013; Song et al. 2012c). Although these developments demonstrated the
potential of improving total nitrogen and phosphorus estimation using spectral bands in the
green, red, and near-infrared (Olmanson et al. 2013; Song et al. 2012c), neural networks (Keiner
and Brown 1999), or empirical orthogonal functions (Craig et al. 2012; Qi et al. 2014),
significant challenges remain for monitoring nutrients and microorganisms/pathogens in
freshwater fluvial systems, which are especially important given their role in contributing to
hypoxia in lentic and coastal systems. Additionally, these models are not generalizable because
they rely on optical proxies that vary regionally, and therefore no consistent relationship with
the non-optical water quality parameter of interest can be found over time and space.
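A ratio index of the kind described above can be sketched in a few lines of Python. The wavelengths follow the text (red-edge peak near 700 nm, chlorophyll absorption near 670 nm); the reflectance values and the helper names `nearest_band` and `band_ratio` are hypothetical, and any coefficients relating such a ratio to a water quality parameter would have to be fitted regionally.

```python
def nearest_band(spectrum, target_nm):
    """Return the reflectance at the sampled wavelength closest to target_nm."""
    wavelength = min(spectrum, key=lambda w: abs(w - target_nm))
    return spectrum[wavelength]

def band_ratio(spectrum, numerator_nm, denominator_nm):
    """Ratio of reflectance at two wavelengths, e.g. R(700) / R(670)."""
    denominator = nearest_band(spectrum, denominator_nm)
    if denominator == 0:
        raise ValueError("denominator band has zero reflectance")
    return nearest_band(spectrum, numerator_nm) / denominator

# Hypothetical reflectance spectrum: wavelength (nm) -> reflectance (unitless).
spectrum = {560: 0.045, 620: 0.030, 670: 0.020, 700: 0.036, 750: 0.010}
red_edge_ratio = band_ratio(spectrum, 700, 670)
print(round(red_edge_ratio, 3))
```

A ratio is used rather than a single band because it partially cancels brightness effects common to both bands, although it does not remove the regional dependence noted in the text.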
In general, there are two main approaches to water quality modeling from remote sensing
data. The first is empirical modeling, which is purely statistical, and the second is
semi-analytical modeling, which is based on the physics of light interactions with the water
surface. Successful remote sensing of cyanobacterial blooms has been reported using absorption
features of phycocyanin at 620 nm, a characteristic pigment associated with freshwater
cyanobacteria (Kutser 2009; Ogashawara et al. 2013). Recent studies demonstrated that
approaches based on spectral shape rather than absorption features may be more effective due to
their insensitivity to atmospheric correction (Wynne et al. 2008). These algorithms, including the
Cyanobacterial Index (CI) (Wynne et al. 2008), Maximum Peak Height (MPH) (Matthews et al.
2012), Adaptive Reflectance Peak Height (ARPH) (Ryan et al. 2014), and the Scattering Line
Height (SLH) index and Aphanizomenon-Microcystis Index (AMI) (Kudela et al. 2015), employ
changes in spectral shape caused by the strong scattering by cyanobacteria. These algorithms
demonstrate reduced sensitivity to noisy data and/or atmospheric effects. Despite advances in
algorithm development, both phycocyanin absorption methods and spectral shape methods are
limited by the relative unavailability of sensors with both fine spectral and spatial resolution.
Hyperspectral sensors including AVIRIS, HyMap, and the Hyperspectral Infrared Imager
(HyspIRI), and upcoming satellite missions including NASA’s Surface Biology and Geology
mission, Italy’s PRISMA, Japan’s HISUI, and Germany’s EnMAP, can address these issues with
both increased spatial and spectral resolution. On the other hand, unmanned aerial system (UAS)
advances have led to innovative, regional monitoring of inland surface water, including algal
pigments and suspended sediment. Utilizing visible and near-infrared imagery at very high
spatial resolution and capturing spectral bands matching those captured by satellite sensors,
recent studies demonstrate the advantage of versatile unmanned and airborne systems to monitor
water quality and expand models to watershed scales (Vogt and Vogt 2016).
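The spectral-shape idea can be illustrated with a generic line-height calculation: the reflectance at a center band is compared with a straight baseline drawn between two neighboring bands. The Python sketch below is a minimal example of that general construction, not a reimplementation of any of the cited indices. The band centers (665, 681, and 709 nm), the reflectance values, and the function name `spectral_shape` are assumptions chosen for illustration; each published index defines its own bands and sign convention.

```python
def spectral_shape(r_left, r_center, r_right, wl_left, wl_center, wl_right):
    """Height of the center-band reflectance above (positive) or below (negative)
    the straight line joining the two neighboring bands."""
    fraction = (wl_center - wl_left) / (wl_right - wl_left)
    baseline = r_left + (r_right - r_left) * fraction
    return r_center - baseline

# Hypothetical reflectances at three assumed band centers (nm).
shape = spectral_shape(0.020, 0.018, 0.034, 665.0, 681.0, 709.0)
print(shape < 0)  # a trough at the center band relative to its neighbors
```

Because the baseline is built from the same spectrum, an additive offset that is roughly linear across the three bands largely cancels, which is consistent with the reduced sensitivity to atmospheric effects described in the text.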
Suspended sediments and turbidity also indicate water quality (Ritchie et al. 1976b).
Reducing erosion is therefore necessary to reduce high suspended sediment loads. Deriving
sediment levels from Landsat imagery has been researched extensively over the past few
decades. Indeed, suspended sediments directly affect the response of water to incident solar
radiation (Islam et al. 2001; Long and Pavelsky 2013; Mertes et al. 1993; Ritchie and Cooper
1991), and remotely sensed reflectance data in the visible and near-infrared wavelengths
obtained over large rivers and lakes correlate with suspended solids (Park and Latrubesse 2014;
Ritchie et al. 1976a). Reflectance-sediment relationships are quantified with statistical regression
models that are a function of particle size, distribution, color, and mineralogy. They are typically
considered site-specific, and very few studies demonstrate that these models are regionally
transferable (Long and Pavelsky 2013; Pereira et al. 2018; Ritchie and Cooper 1991).
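A reflectance-sediment regression of the simplest kind can be sketched as an ordinary least-squares line. The Python example below uses synthetic, noise-free pairs generated purely for illustration; the helper `fit_line` and all numbers are hypothetical, and published models often use log-transformed or multi-band forms instead.

```python
def fit_line(x, y):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Synthetic calibration pairs: red-band reflectance vs. SSC (mg/L),
# constructed here as SSC = 500 * R + 2 with no noise.
reflectance = [0.02, 0.04, 0.06, 0.08, 0.10]
ssc = [12.0, 22.0, 32.0, 42.0, 52.0]

slope, intercept = fit_line(reflectance, ssc)
predicted_ssc = slope * 0.05 + intercept  # estimate for a new pixel
print(round(slope, 3), round(intercept, 3), round(predicted_ssc, 3))
```

Coefficients obtained this way describe one site and one sediment type; in line with the text, they should not be assumed to transfer to another river or lake without recalibration.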
1.3 Machine learning
Machine learning is a set of statistical methods that can automatically learn from data and
develop a detection, estimation, or classification model that minimizes the difference between
its predictions and the training data without being explicitly programmed (Murphy 2012).
Machine learning, also known as statistical learning, involves providing data to a computer,
which can be “trained” with known or predefined features or objects that allow detection,
classification, or pattern recognition in a semi-automated or automated manner. Machine
learning for remotely sensed water quality estimation has become popular in recent years thanks
to advances in algorithm development, computing power, sensor systems, and data availability
(Peterson et al. 2018). The most powerful form of machine learning is so-called deep learning, a
hierarchically structured neural network (NN) that can extract useful information and make
predictions from raw data. Deep learning has shown unparalleled advantages for water quality
mapping (Peterson et al. 2020). Deep learning architectures, including recurrent and
convolutional neural networks, appear particularly attractive due to their great success in many
recent studies, outperforming many other methods in a variety of remote sensing applications
(Hartling et al. 2019; Sidike et al. 2019; Maimaitijiang et al. 2020). However, machine learning,
especially deep learning (DL), in remote sensing is still in its infancy, and there are many
unsolved questions that bring new challenges in remote sensing data analysis (Zhang et al. 2016;
Ball et al. 2017; Alom et al. 2019; Ma et al. 2019; Sidike et al. 2019). Some typical challenges
can be summarized as follows: 1) The performance of DL approaches relies heavily on a large
number of training samples, which are difficult to obtain in real-world scenarios. 2) Remote
sensing data are often multimodal, which requires the development of novel machine learning
models to extract joint features from heterogeneous spectral, spatial, and temporal information.
This also challenges the transferability of the trained algorithms to other modalities over space
and time. 3) The depth of DL is critical to learning the intricate structure of input variables
(Szegedy et al. 2015). However, a deeper network leads to significant computational cost as well
as overfitting, due to the corresponding increase in trainable parameters and sometimes
inadequate training samples. Appropriately balancing the depth of the network and
computational efficiency is still an open topic. 4) Unexplainable solutions of machine learning
models for modeling physical phenomena can lead to inaccurate outputs or ill-posed problems.
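The overfitting risk raised in challenges 1 and 3 is usually diagnosed by comparing errors on training data with errors on held-out data. The Python sketch below illustrates that check with a 1-nearest-neighbor predictor, used here only as a simple stand-in for a high-capacity model; it is not one of the deep networks discussed in this review, and the data, noise level, and function names are all synthetic or hypothetical.

```python
import math
import random

def rmse(predicted, observed):
    """Root-mean-square error between two equal-length sequences."""
    return math.sqrt(sum((p - o) ** 2 for p, o in zip(predicted, observed)) / len(observed))

def one_nn_predict(train_x, train_y, x):
    """Predict with the target of the closest training sample (memorizes the training set)."""
    nearest = min(range(len(train_x)), key=lambda i: abs(train_x[i] - x))
    return train_y[nearest]

# Synthetic data: a linear signal plus Gaussian noise.
random.seed(0)
x = [i / 10 for i in range(40)]
y = [2.0 * xi + random.gauss(0.0, 0.5) for xi in x]

# Alternate samples between the training and the held-out set.
train_x, train_y = x[0::2], y[0::2]
test_x, test_y = x[1::2], y[1::2]

train_rmse = rmse([one_nn_predict(train_x, train_y, xi) for xi in train_x], train_y)
test_rmse = rmse([one_nn_predict(train_x, train_y, xi) for xi in test_x], test_y)
print(f"train RMSE = {train_rmse:.3f}, held-out RMSE = {test_rmse:.3f}")
```

The training error is zero by construction because the model reproduces every training target, while the held-out error is not; a gap of this kind is the symptom that regularization, more samples, or a simpler model may be needed.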

2. Optical properties of water
Conventional water quality monitoring programs have historically relied on protocols
requiring in-depth field sampling and laboratory analysis. These protocols have many limitations
related to cost, accessibility, and spatial and temporal data coverage. Continuous data
acquisition and analysis of accessible and remote waters is both challenging and cost-prohibitive.
Although satellite imagery cannot and will not be able to measure every physical,
chemical, and biological characteristic of water, the ability to measure several water quality
parameters using direct and indirect means is being developed by researchers in the field.

Figure 1. Diagram of light-water interactions.

2.1 Light and water interactions


Natural water consists of water molecules together with many suspended and dissolved,
organic and inorganic, living and non-living materials. When incident solar radiation or other
light hits the surface of a water body, part is reflected off the surface while a larger portion
penetrates the water column, initiating interactions with both suspended and dissolved matter.
Optically active constituents (OACs) are the fraction of suspended and dissolved matter that
interacts with electromagnetic radiation through absorptive, refractive, and scattering
mechanisms. The measured absorbance, refraction, and light scattering are specific to each
component and are referred to as the inherent optical properties (IOPs) (IOCCG 2006). As OACs
are a function of parameters such as salinity and temperature, they are classified based on their
spectral water-leaving radiance as pure water, Chl-a (and other phytoplankton pigments),
non-algal particles, and colored dissolved organic matter (CDOM) (IOCCG 2018b). Figure 1
shows a diagram representing the light-water interactions common for inland waters.

2.2 Inherent optical properties (IOPs)


Optical properties of water are classified into two distinct categories: inherent and
apparent. IOPs are a function of the medium and its constituents and are not affected by the
ambient light field within the medium. Of primary interest in spectral-based remote sensing are
two IOPs, the absorption coefficient and the volume scattering function. Additionally, the index
of refraction and the beam attenuation coefficient (the sum of the absorption coefficient
and the scattering coefficient) are relevant. Apparent optical properties (AOPs), on the other
hand, depend on the IOPs of the medium and on the geometric distribution of the ambient light
field. The measured radiance emerging from the water related to reflective, absorptive, and
scattering effects is of particular interest to remote sensing. The spectrum and radiance
distribution of this water-leaving radiance are affected by the IOPs of various dissolved and
suspended materials within the water body (Moore et al. 2009).
Transmittance (T) is a measure of how much light passes through a substance.
Specifically, T is the ratio of the intensity of the light passing through the medium to the
intensity of the incident light. Absorbance (A) is a measure of the capacity of a substance to
absorb light due to physicochemical interactions within the medium. Attenuation, in contrast,
is the total loss of light due to absorbance, reflection, scattering, and other processes.
Absorbance is defined as the base-10 log of the inverse of T. The Beer-Lambert law relates the
molar concentration (C) of solutes to A as a function of path length through the medium (L) and
the molar absorptivity of the solutes (ɛ), or A = C·ɛ·L. The molar absorptivity (ɛ) is a constant
for a substance at a given temperature and wavelength (Crouch et al. 2007). The reflectance (R)
of light from a surface of water is a function of the refractive index of the medium which, in turn,
is a function of salinity, temperature, and wavelength.
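The two relations stated above, A = log10(1/T) and A = C·ɛ·L, can be combined to recover a concentration from a measured transmittance. The Python sketch below does this for a single solute; the function names, the molar absorptivity of 2.0e4 L mol^-1 cm^-1, and the 1 cm path length are hypothetical values chosen only to make the arithmetic easy to follow.

```python
import math

def absorbance_from_transmittance(transmittance):
    """Absorbance A = log10(1 / T) for a transmittance T in (0, 1]."""
    if not 0.0 < transmittance <= 1.0:
        raise ValueError("transmittance must be in (0, 1]")
    return -math.log10(transmittance)

def concentration_from_absorbance(absorbance, molar_absorptivity, path_length_cm):
    """Invert the Beer-Lambert law A = C * epsilon * L for the molar concentration C."""
    return absorbance / (molar_absorptivity * path_length_cm)

# Hypothetical measurement: 10% of the incident light passes through a 1 cm cell.
A = absorbance_from_transmittance(0.10)
C = concentration_from_absorbance(A, molar_absorptivity=2.0e4, path_length_cm=1.0)
print(A, C)  # absorbance (unitless) and concentration (mol/L)
```

The inversion assumes a single absorbing solute and negligible scattering, conditions that natural waters rarely satisfy, which is part of why field retrievals rely on the additional optical properties described in this section.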


Light scattering, or dispersion of light in water, occurs when dissolved or particulate
(colloidal) matter changes the direction of a light ray and scatters it in different directions.
Scattering can be assessed in several ways. Traditionally, turbidity is the most common measure
of light scattering within a water sample. Bench-top turbidimeters measure the amount of light
scattered orthogonally to the incident light beam passed through the sample and are
calibrated with turbidity standards. Units for turbidity are typically the Nephelometric Turbidity
Unit (NTU) for laboratory turbidimeters and the Formazin Nephelometric Unit (FNU) for
submersible turbidimeters, which commonly utilize infrared (IR) incident energy.
More relevant to measuring light scattering via remote sensing is the IOP known as the
volume scattering function (VSF). The VSF is a measure of the intensity of light scattering as a
function of the angle between the incident beam and the scattered light. For remote sensing, the
light backscattered towards the source is of key relevance, as the light source (e.g., the sun) and
the detector (on a satellite or drone) are both above the water. Forward-scattered light, on the
other hand, is defined as the light scattered away from the light source. Scattering is dependent
on the size, shape, and color of the particles (or molecules) involved. Gordon et al. (1975) and
Morel and Prieur (1977) found that water reflectance is inversely proportional to the absorption
coefficient and proportional to the backscattering coefficient. This serves as a key factor in
spectral-based remote sensing methodologies (IOCCG 2006).
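The link between the VSF, the backscattering coefficient, and reflectance can be sketched numerically. In the Python example below, the backscattering coefficient is obtained by integrating the VSF over the backward hemisphere, b_b = 2π ∫ β(θ) sin θ dθ for θ from π/2 to π, and reflectance is then approximated as R ≈ f·b_b/a. The proportionality factor f = 0.33 is a commonly quoted value treated here as an assumption, and the isotropic VSF and absorption coefficient are hypothetical inputs; real VSFs are strongly forward-peaked.

```python
import math

def backscattering_coefficient(vsf, steps=1000):
    """Integrate a volume scattering function beta(theta) [1/(m sr)] over the
    backward hemisphere (pi/2 to pi) with the midpoint rule; returns b_b in 1/m."""
    lower, upper = math.pi / 2.0, math.pi
    width = (upper - lower) / steps
    total = 0.0
    for i in range(steps):
        theta = lower + (i + 0.5) * width
        total += vsf(theta) * math.sin(theta) * width
    return 2.0 * math.pi * total

def reflectance_estimate(b_b, a, f=0.33):
    """Approximate reflectance as proportional to b_b and inversely proportional to a.
    The factor f is an assumed constant, not a value given in this review."""
    return f * b_b / a

# Hypothetical isotropic VSF: the same scattering in every direction.
b_b = backscattering_coefficient(lambda theta: 0.01)
print(round(b_b, 5), round(reflectance_estimate(b_b, a=0.5), 5))
```

For an isotropic VSF the integral of sin θ over the backward hemisphere is exactly 1, so b_b reduces to 2π times the constant VSF, which provides a convenient check on the numerical integration.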
Finally, the fluorescence of water constituents is of great importance in water quality
assessment. The essence of the fluorescence mechanism is that chemical bonds within inorganic
and organic compounds in water absorb light at one wavelength and emit the absorbed energy at
a different wavelength. This property is utilized in an analysis method called the fluorescence
excitation emission matrix (FEEM), in which a water sample is excited at one (or a range of)
wavelength(s) and the emission is measured at another (or a range of) wavelength(s). Specific
excitation/emission wavelength pairs are characteristic of specific compounds or compound
classes within the water matrix (Crouch et al. 2007). The IOPs of a solution are the sum of the
IOPs of all the solution’s components. Therefore, the absorption and scattering properties of
surface waters equal the sums of the constituents’ IOPs.
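The additivity of IOPs can be made concrete by summing per-constituent absorption coefficients wavelength by wavelength. In the Python sketch below, every coefficient is a hypothetical value in units of 1/m chosen for illustration, and the helper name `sum_iops` is not taken from any library.

```python
def sum_iops(*components):
    """Sum per-wavelength coefficients (1/m) across constituents that share
    the same wavelength grid."""
    wavelengths = components[0].keys()
    for component in components:
        if component.keys() != wavelengths:
            raise ValueError("all constituents must share one wavelength grid")
    return {wl: sum(component[wl] for component in components) for wl in wavelengths}

# Hypothetical absorption coefficients: wavelength (nm) -> a (1/m).
a_water = {440: 0.006, 560: 0.062, 670: 0.439}
a_phyto = {440: 0.050, 560: 0.010, 670: 0.030}
a_nap = {440: 0.030, 560: 0.012, 670: 0.005}
a_cdom = {440: 0.200, 560: 0.035, 670: 0.007}

a_total = sum_iops(a_water, a_phyto, a_nap, a_cdom)
print({wl: round(a, 3) for wl, a in a_total.items()})
```

The same function applies unchanged to scattering or backscattering coefficients, since the additivity stated in the text holds for each IOP separately.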
2.3 Optically active constituents

OACs are classified as pure water, Chl-a (and other phytoplankton pigments), non-algal
particles, and CDOM (IOCCG 2018b). Since the spectral absorption and scattering coefficients of
pure water are well documented, pure water is typically used as an "optical standard" to calibrate
field instruments. Thus, calibrated measurements in the field represent the IOPs of particles and
dissolved material, independent of the water itself.
Chlorophyll-a belongs to the phytoplankton class of OACs and is closely associated with algae
and related constituents present in the water column. Chlorophyll plays a major role in absorbing
solar energy in a water column for photosynthesis during the algal light cycle.
Chlorophyll pigments (such as Chl-a) absorb energy from sunlight at specific wavelengths in the
blue and red regions and reflect light at other wavelengths in the green region. This property
serves as the basis of many laboratory and remote sensing techniques to measure Chl-a in surface
water. In remote sensing, the light reflected from the water surface is compared with
the absorbance spectrum of Chl-a. FEEM is another common technique to measure Chl-a in the
laboratory or field (Wang et al. 2017a).
The non-algal particle class of OACs includes suspended particulate matter (0.2 - 0.7 mm)
such as non-pigmented components of phytoplankton, organic debris, living microorganisms
(e.g., zooplankton and bacteria), and inorganic particles resulting from riverbed erosion, runoff,
and particle resuspension. Turbidity and suspended sediment also constitute non-algal particles
and typically lead to increased reflectance across the visible and NIR spectrum. Thus, this
category of constituents negatively affects the transparency of the water column and is often a
primary carrier of both nutrients and contaminants (Dörnhöfer and Oppelt 2016).
Colored dissolved organic matter (CDOM) is the quantifiable parameter most typically
used in remote sensing as an indicator of dissolved organic carbon (DOC) and total organic
carbon (TOC). The source of DOC can be autochthonous, meaning it comes from aquatic plants
or algae decomposed within the surface water, or allochthonous, indicating that it originates
from outside the system, e.g., soils or terrestrial plants (Kritzber et al. 2004). Natural organic
matter (NOM) in the water column includes a wide range of compounds including carbohydrates,
fatty acids, amino acids, and other hydrocarbons (Thurman 1985). One of the most relevant
fractions of NOM is humic substances, which are comprised of humic acids and fulvic acids.
These humics are often responsible for much of the color of natural waters due to the many
unsaturated and aromatic bonds in these high molecular weight molecules. Humics are highly
reactive to electrophilic oxidation in water treatment and, thus, are precursors for many regulated
disinfection byproducts (DBPs). CDOM absorbs strongly in the UV and visible spectrum and is
typically measured as the absorption at a wavelength of 440 nm (Dörnhöfer et al. 2016). The
absorbance at 254 nm as measured in a spectrophotometer is more commonly used as a DBP
precursor surrogate in the water treatment industry. Fluorescent dissolved organic matter (fDOM)
is the fraction of CDOM that fluoresces due to its complex unsaturated molecular structure, and
it serves as a surrogate measure of CDOM. fDOM is a more specific and sensitive measurement
than absorbance for spectral measurement of organic matter and is used in several commercial
probe systems as an estimate of the organic matter present in a water column (Huguet et al. 2009).
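The strong UV and blue absorption of CDOM is often approximated in the bio-optical literature by an exponential decay with wavelength, a(λ) = a(λ0)·exp(−S·(λ − λ0)). That model is not stated in this review and is introduced here only as an assumption; the reference wavelength of 440 nm follows the text, while the spectral slope S = 0.017 per nm, the reference absorption of 1.2 per m, and the function name are hypothetical.

```python
import math

def cdom_absorption(wavelength_nm, a_ref, ref_nm=440.0, slope_per_nm=0.017):
    """Assumed exponential CDOM model: a(lambda) = a_ref * exp(-S * (lambda - ref))."""
    return a_ref * math.exp(-slope_per_nm * (wavelength_nm - ref_nm))

# Hypothetical CDOM absorption of 1.2 1/m measured at 440 nm.
for wavelength in (254.0, 440.0, 560.0, 670.0):
    print(wavelength, round(cdom_absorption(wavelength, a_ref=1.2), 3))
```

Under this assumption the absorption rises steeply towards the UV and becomes small in the red; extrapolating a single slope from 440 nm down to 254 nm is a coarse simplification, since the slope generally varies with spectral range and CDOM source.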

2.4 Non-optically active water quality constituents
Non-optically active water quality constituents are those that do not absorb light and,
hence, cannot be directly measured by spectral methods. Indirect methods, however, can be used
to estimate several of these important water quality parameters. The two key
nutrients that drive both green algae and blue-green algae (cyanobacterial) blooms are nitrogen
(N) and phosphorus (P) (EPA). These two nutrients do not impact the visible spectrum of the
water body directly but do affect its color by promoting algae growth. While the
correlation between these nutrients and algae growth (for example, as represented by Chl-a
measurements) may be strong, less is known about correlations at finer temporal and spatial
sampling scales. A further complication is that the correlation between N or P and algae
growth is based on the fact that one of these nutrients is often the growth-limiting factor for an
algae bloom. Thus, in one water body the correlation with Chl-a might be with N while in a
different water body the correlation might be with P, depending on which is growth limiting in
that particular water body.
Dissolved oxygen is another important water quality parameter that is not directly
measurable using spectral methods. The main sources of DO in water are atmospheric oxygen
transfer into the water column and in-situ oxygen generation by photosynthesis from algae
and other organisms (Crittenden et al. 2012). In the laboratory, dissolved oxygen is most often
measured using a membrane-based electrochemical probe or a luminescence-based method. For
remote sensing, indirect methods can potentially be used for dissolved oxygen. Specifically,
dissolved oxygen is linked in complex ways to Chl-a and algae (e.g., generation during
photosynthesis and consumption during respiration) and to temperature (through solubility), thus
providing indirect diurnal spectral linkages between these parameters.
Specific conductance (SC), or conductivity, is a measure of a water’s ability to conduct
electrical current, which is proportional to the ionic strength of the solution, that is, the salt
content. SC is generally reported in units of microsiemens per centimeter (µS/cm). For laboratory
and field measurements, a conductivity probe is used to measure conductivity. In some
instruments, the measured conductivity is converted to other units such as total dissolved solids
or salinity. Correlations to spectral measurements are more challenging due to the complex and
highly varied relationships between SC and optically active water quality constituents.
3 Objectives of this review
• Analyze current trends and advances in water quality remote sensing
• Identify and evaluate a range of popular estimation methods across data sources and
  datasets
• Identify limitations of current systems and estimation methods, and suggest future
  improvements
4. Materials
4.1 Global description of the datasets
The dataset compiled for this study was based on two main data sources. The first is the
proximal dataset, a combination of in-situ measured water quality data with corresponding
PSR-3500 spectroradiometer measurements (Spectral Evolution, Lawrence, MA 01840, USA) and
tripod-based hyperspectral imagery. The water quality data contained in the proximal dataset
were the result of two independent data collection campaigns comprising the mesocosm
(simulated stream system) and Saint Louis University (SLU) data. The second is the satellite
dataset, comprised of in-situ water quality data collected by United States Army Corps of
Engineers (USACE) and Great Rivers Ecological Observation Network (GREON) monitoring buoys
with corresponding satellite imagery from a Landsat-8 and Sentinel-2 virtual constellation. These
datasets are used herein to evaluate a range of water constituent estimation methods and
investigate the potential impact of spatial, spectral, and atmospheric effects. An overview of the
data used in this study is displayed in Table 1. Both databases contain data gathered across the
states of Illinois and Missouri in the United States Midwest for a range of different waterbodies
including reservoirs, rivers, and simulated stream environments (e.g., the mesocosm described in
more detail below). Figure 2 shows the geographic distribution of field sampling locations
included in these datasets.

Figure 2. Map of the field collected water quality data and sampling sites. In total, there were 43
different sampling locations across seven independent waterbodies where in-situ data were
collected.

4.1.1 SLU water quality data


This database was compiled by researchers at Saint Louis University (SLU) in a previous
study (Peterson et al. 2019). It consists of in situ spectroradiometer (PSR-3500 spectroradiometer,
Spectral Evolution, Lawrence, MA 01840, USA) and water quality data from three independent
Midwestern waterbodies including two reservoirs in Illinois, Carlyle Lake (an impoundment of
the Kaskaskia River) and Lake Decatur (an impoundment of the Sangamon River) representing
inland reservoirs heavily impacted by the surrounding agriculture, and sections of the Meramec
River in Missouri, which is largely unimpacted by human activities (Peterson et al., 2018).
Sampling sites for each watershed were selected to maximize the spatial distribution within the
waterbody for the most representative sample of water conditions across the study areas. Surface
water samples were analyzed in situ using a YSI EXO2 sonde or YSI ProPlus Multiparameter
handheld meter (Yellow Springs Instruments, Yellow Springs, OH) with sensors for blue-green
algae phycocyanin (BGA-PC), Chl-a, TDS (calculated from specific conductance), and/or
turbidity (as FNU). In the Meramec watershed, turbidity was measured with a Hach
Turbidimeter (Hach Company, Loveland, CO). A Van Dorn sampler was used to collect discrete
surface water samples that were returned to the lab for determination of TSS and Chl-a. The
water samples were collected in acid-washed polyethylene bottles and stored at 4ºC until analysis.
Sample TSS content was determined via the EPA method 160.2 (EPA, 1971). Chl-a and BGA-
PC were measured in vivo with a dual channel YSI Total Algae PC Smart Sensor integrated into
an EXO2 sonde (YSI 2017). These data were collected between June 2016 and November 2017. They
represent water quality under a range of seasons and atmospheric conditions and yield a
dataset with 86 to 105 samples depending on the parameter.

4.1.2 Mesocosm water quality data


Relying on naturally occurring conditions in surface waters to provide a wide range of
water quality conditions is unpredictable and may require significant time and resources for
collecting field samples. Therefore, we used the mesocosm facility at the National Great Rivers
Research and Education Center (NGRREC) to manipulate nutrient and sediment concentrations
in actual Mississippi River water over a period of several days while simultaneously collecting
spectral measurements using the Bayspec OCI-F hyperspectral sensor along with concurrent
handheld spectroradiometer measurements from the Spectral Evolution PSR-3500, both of which are
explained further in Section 4.2. The mesocosms are large cement raceways measuring 15 m x 3.6 m
with a depth of 1.8 m (97 m³ volume) that can be supplied with up to 15,000 m³ of Mississippi
River water per day. Differing amounts of nutrients and sediments were added to the Mississippi
River water to represent a range of hydrological conditions typically found in Midwestern rivers
and impoundments during an annual cycle (e.g., fair weather flow to storm flow). The mesocosms
can operate in two modes, either independently or simultaneously: 1) continuous flow-through
with constant addition of fresh Mississippi River water, or 2) recirculating without addition of
fresh river water. While collecting spectral data and water chemistry measurements on nutrients,
the mesocosm was placed in recirculating mode to keep the nutrient concentrations constant.
For measurements on low suspended sediment concentrations, the mesocosm was also kept in
recirculating mode. The flow-through mode was added for higher sediment concentrations to
provide additional kinetic energy to the water, thus keeping sediments in suspension while the
spectral and water chemistry measurements were taken. A dataset was created by collecting
spectral imagery with the aforementioned sensors while simultaneously measuring the
concentrations of chlorophyll-a (Chl-a), BGA-PC, and fluorescent dissolved organic matter
(fDOM) with a YSI EXO2 multi-parameter sonde, and NO2+NO3 nitrogen with a SUNA V2
optical sensor.

4.1.3 GREON water quality data


The Great Rivers Ecological Observatory Network (GREON) is a fleet of seven water
quality monitoring buoys created by the National Great Rivers Research and Education Center
(NGRREC) using technology developed by YSI, Inc. (Yellow Springs, OH). Each GREON
station collects high-frequency (2-h interval) water quality data using two sondes: 1) a YSI
EXO2 multiparameter sonde for water temperature, dissolved oxygen (DO), specific
conductance (SC), turbidity, chlorophyll-a (Chl-a), BGA-PC, and fluorescent dissolved organic
matter (fDOM), and 2) a SUNA V2 optical nitrate (NO3-N) sensor (Sea-Bird Electronics,
Bellevue, Washington). Both sondes are identical to those used in the mesocosm experiment
described in Section 4.1.2. Data from the GREON sondes is recorded with a Campbell Scientific
CR-1000 data logger and is automatically uploaded every two hours by cellular modem to the
Great Lakes to Gulf Virtual Observatory (www.greatlakestogulf.org) where it is freely available
for viewing and downloading. Although the GREON buoys record water quality data
continuously, they are removed from the water in late November to prevent damage from ice and
re-deployed in mid to late spring after the threat of ice has passed.
Two GREON monitoring stations provided data for this study. One GREON monitoring
station was located on Lake Decatur in central Illinois (39.8907, -88.8601) and a second was
on Carlyle Lake in southwestern Illinois (38.6182, -89.3443). Both reservoirs are located in
watersheds where the primary land use is agricultural, and they were both included on the 2004
Illinois EPA 303(d) list of impaired waters due to excessive sediments and nutrients (IEPA 2007).
Excessive phosphorus has been an ongoing problem for both reservoirs, and concentrations
frequently exceed 0.05 mg/L, which the State of Illinois has designated as the upper limit for
protecting aquatic life and aesthetic quality (IEPA 2007; USACE 2012). The Lake Decatur
watershed covers 2396 km² with 82% of the land dedicated to agriculture, primarily for corn and
soybean production (Keefer and Bauer 2005). Due to the prevalence of tile-drained, row crop
agriculture and soil tillage in the watershed, excessive siltation and sedimentation have been a
problem in the reservoir. In addition to excessive phosphorus, nitrate concentrations occasionally
exceed the drinking water maximum contaminant level of 10 mg/L. The Carlyle Lake watershed
drains 7030 km² of predominantly agricultural land (73%) and therefore is impacted by sediment
and nutrient runoff from tilled fields and tile drainage. Nitrogen concentrations (total N and
nitrate N) rarely exceed the drinking water limit of 10 mg/L, but total P frequently exceeds the
Illinois General Use Water Quality Standard of 0.05 mg/L. Shaughnessy (2019) reported that P
concentrations in Carlyle Lake have increased over time even though the amount entering the
lake has remained relatively constant, suggesting that the sediment layer in the lake is serving as
a source of soluble P. Algal blooms are occasionally observed and sometimes lead to complaints
from lake users (USACE, 2012).
4.1.4 USACE water quality data
Collected and maintained by the USACE, this water quality database contains data for
reservoirs across the Midwestern United States including Carlyle Lake, IL, Lake Shelbyville, IL,
Mark Twain Lake, MO, Lake Wappapello, MO, and Rend Lake, IL. Each site contains several
sampling locations distributed across the waterbody to gauge the overall water conditions at the
time of sampling. Measurements for a range of variables including temperature, dissolved
oxygen, specific conductivity, TSS, turbidity (as FNU), and total dissolved solids (TDS), among
others, are regularly collected using in-situ devices. Several devices were used to collect in-situ
water quality data including the YSI EXO2, YSI Pro DSS, Hydrolab DS5 (OTT Hydromet,
Loveland, CO), Hydrolab DS5X, Hydrolab HL4, and Hydrolab MS5 sonde devices. Grab
samples for a range of parameters, including Chl-a, NO3-N, and PO4-P, among others, are also
taken across the waterbody and sent to a contracted NELAP-accredited laboratory compliant with
the Department of Defense Quality Systems Manual. Due to the need for corresponding
satellite imagery, the total number of samples used for this study from the USACE database was
severely limited.

Table 1. Summary statistics of the water quality data.

| Parameter | Statistic | SLU (proximal) | Mesocosm (proximal) | GREON (satellite) | USACE (satellite) |
| --- | --- | --- | --- | --- | --- |
| BGA-PC (µg/L) | Samples (n) | 96 | 22 | 84 | ----- |
| | Min-Max | 0.1 - 6.0 | 0.45 - 1.01 | 0.47 - 9.34 | ----- |
| | Mean ± SD | 2.3 ± 1.6 | 0.66 ± 0.11 | 3.41 ± 2.06 | ----- |
| Chl-a (mg/L) | Samples (n) | 105 | 22 | 84 | 12 |
| | Min-Max | 0.5 - 86.4 | 15.76 - 25.22 | 0.66 - 74.38 | 4.3 - 79.5 |
| | Mean ± SD | 20.8 ± 16.9 | 20.07 ± 2.47 | 26.42 ± 16.48 | 19.49 ± 15.88 |
| DO (mg/L) | Samples (n) | ----- | 22 | 84 | ----- |
| | Min-Max | ----- | 6.89 - 10.20 | 3.95 - 19.68 | ----- |
| | Mean ± SD | ----- | 8.47 ± 1.16 | 9.34 ± 3.01 | ----- |
| fDOM (QSU) | Samples (n) | ----- | 22 | 83 | ----- |
| | Min-Max | ----- | 70.86 - 75.83 | 0.27 - 64.33 | ----- |
| | Mean ± SD | ----- | 73.43 ± 1.12 | 40.58 ± 13.93 | ----- |
| NO3-N (mg/L) | Samples (n) | 86 | 22 | ----- | ----- |
| | Min-Max | 0 - 9.63 | 9.94 - 10.5 | ----- | ----- |
| | Mean ± SD | 1.42 ± 2.26 | 10.39 ± 0.11 | ----- | ----- |
| PO4-P (mg/L) | Samples (n) | 86 | ----- | ----- | ----- |
| | Min-Max | 0.00 - 0.31 | ----- | ----- | ----- |
| | Mean ± SD | 0.14 ± 0.09 | ----- | ----- | ----- |
| TDS (mg/L) | Samples (n) | 86 | ----- | ----- | ----- |
| | Min-Max | 8.0 - 384.0 | ----- | ----- | ----- |
| | Mean ± SD | 226.8 ± 76.4 | ----- | ----- | ----- |
| Turbidity (FNU) | Samples (n) | 96 | 22 | 84 | 6 |
| | Min-Max | 1.0 - 183.0 | 3.03 - 15.87 | 1.98 - 131.13 | 3.78 - 29.60 |
| | Mean ± SD | 24.9 ± 28.4 | 12.70 ± 2.66 | 20.01 ± 16.82 | 17.69 ± 5.66 |
| TSS (mg/L) | Samples (n) | 93 | ----- | ----- | 12 |
| | Min-Max | 1.0 - 183.0 | ----- | ----- | 10.7 - 36.8 |
| | Mean ± SD | 32.1 ± 30.2 | ----- | ----- | 15.21 ± 3.51 |

4.2 Spectral Data


4.2.1 Spectroradiometer
Field spectroscopy data were collected using the Spectral Evolution PSR-3500 (Spectral
Evolution, Inc., Lawrence, MA) handheld spectroradiometer, which has a spectral range of 350-2500
nm with a spectral resolution of 3.5 nm in the 350-1000 nm range, 10 nm in the 1000-1900
nm range, and 7 nm in the 1900-2500 nm range. Spectral reflectance measurements were
captured between 10:00 and 14:00 with clear skies from a 1 m height above the water surface at
nadir. Three separate water surface reflectance samples were collected at each location and
averaged to reduce intra-sample noise. Initial processing of raw spectral data was conducted by
interpolating native spectral reflectance values to 1 nm intervals covering the full 350-2500 nm
spectrum. The 1 nm interval spectra were employed in this study to gain maximum spectral
information at narrow wavelengths. The data were then normalized and filtered to remove excess
noise using a Savitzky–Golay filter (Savitzky and Golay 1964), and spectra within the 350-399
nm and 901-2500 nm wavelength ranges were excluded from this study based on their
signal-to-noise ratio.
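
The preprocessing chain described above (interpolation to 1 nm, smoothing, and removal of noisy ranges) can be sketched in a few lines of Python. This is a minimal illustration and not the authors' code: the function names are hypothetical, linear interpolation is assumed for the 1 nm resampling, and the 5-point quadratic Savitzky–Golay kernel is an assumed choice, since the window length and polynomial order are not stated in the text. The normalization step is omitted because its form is not specified.

```python
# Illustrative sketch only; names, interpolation scheme, and filter window are assumptions.

SG_KERNEL = (-3.0, 12.0, 17.0, 12.0, -3.0)  # 5-point quadratic Savitzky-Golay weights (sum = 35)


def interpolate_to_1nm(wavelengths, reflectance, start=350, stop=2500):
    """Linearly interpolate a spectrum onto an integer-nanometre grid.

    `wavelengths` must be strictly increasing and span [start, stop].
    """
    grid = list(range(start, stop + 1))
    values = []
    j = 0
    for target in grid:
        # Move to the native interval [wavelengths[j], wavelengths[j + 1]] holding the target.
        while j < len(wavelengths) - 2 and wavelengths[j + 1] < target:
            j += 1
        x0, x1 = wavelengths[j], wavelengths[j + 1]
        y0, y1 = reflectance[j], reflectance[j + 1]
        values.append(y0 + (target - x0) / (x1 - x0) * (y1 - y0))
    return grid, values


def savgol_5pt(values):
    """Smooth with a 5-point quadratic Savitzky-Golay kernel.

    The two points at each end of the series are left unchanged.
    """
    smoothed = list(values)
    for i in range(2, len(values) - 2):
        window = values[i - 2:i + 3]
        smoothed[i] = sum(w * v for w, v in zip(SG_KERNEL, window)) / 35.0
    return smoothed


def keep_range(grid, values, lo=400, hi=900):
    """Keep only wavelengths inside [lo, hi] nm, dropping 350-399 nm and 901-2500 nm."""
    kept = [(wl, v) for wl, v in zip(grid, values) if lo <= wl <= hi]
    return [wl for wl, _ in kept], [v for _, v in kept]
```

In this sketch a spectrum would be passed through `interpolate_to_1nm`, then `savgol_5pt`, then `keep_range`, mirroring the order of the steps in the text.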

4.2.2 Hyperspectral imagery
Between August and October of 2018, proximal hyperspectral imagery was gathered for
the NGRREC mesocosm experiments using the Bayspec OCI hyperspectral imager (BaySpec,
Inc., San Jose, CA) from a tripod platform to collect data over a wide range of water quality
conditions. The Bayspec OCI is a push-broom hyperspectral camera covering the 400-1000 nm
wavelength range with a 16 mm lens and 21° field of view. Images were gathered between 10:00
and 14:00 with clear skies using the tripod scanning device at an off-nadir angle. Using the
Bayspec OCI proprietary software, the data were mosaicked and spectral reflectance (Rrs) was
calculated. Spectra were extracted from each mosaicked image using an average of pixel values
within a selected region. These regions were manually determined to reduce further error related
to surface waves or sun glint. Spectra beyond 900 nm were removed and the data were resampled
to 1 nm intervals and filtered using the Savitzky–Golay filter to match the handheld
spectroradiometer data. In total, 22 high quality hyperspectral samples were collected and added
to the proximal database. Plotted spectra including data from both the hyperspectral and
spectroradiometer sensors are displayed in Figure 3.



Figure 3. Combined spectroradiometer and hyperspectral data after pre-processing representing
the proximal database. Rrs refers to spectral reflectance.

The proximal dataset was also resampled to the Landsat-8 spectral response (Figure 4) to assess
the impact of scale and spectral resolution on water quality parameter estimation. Resampling of
high-resolution proximal spectra to satellite spectra was conducted using the spectral response
functions of Landsat-8 following the approaches of Maimaitiyiming et al. (2016) and Ghulam et al.
(2008). In-situ proximal data resampled to satellite spectra provide a means to investigate the
prediction power of satellite observations without any atmospheric disturbances. The comparison
of spectral data from proximal, satellite, and proximal data resampled to Landsat-8 and
Sentinel-2 enables a better understanding of the impact of scale without any atmospheric
disturbance and of whether or not the results among different scales are comparable. A direct
comparison between resampled and satellite-based spectra was not possible since the proximal
data collection was not coordinated with Landsat-8 or Sentinel-2 satellite overpasses.
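
Resampling a narrow-band spectrum with a spectral response function is commonly done as a response-weighted average over the band. The sketch below illustrates that idea under stated assumptions: the function name is hypothetical, and the boxcar response used in the example is a placeholder, not the published Landsat-8 OLI response function, which would be substituted in practice.

```python
# Illustrative sketch only; the response values below are placeholders, not OLI data.

def band_equivalent_reflectance(wavelengths, reflectance, response):
    """Response-weighted mean reflectance of a 1 nm spectrum for one broad band.

    `response` maps wavelength (nm) to relative spectral response; wavelengths
    missing from the mapping are treated as zero response.
    """
    weighted_sum = 0.0
    weight_total = 0.0
    for wl, r in zip(wavelengths, reflectance):
        w = response.get(wl, 0.0)
        weighted_sum += w * r
        weight_total += w
    if weight_total == 0.0:
        raise ValueError("response function does not overlap the spectrum")
    return weighted_sum / weight_total


# Placeholder boxcar response for a red band (assumed limits, uniform weight).
toy_red_response = {wl: 1.0 for wl in range(640, 671)}

wavelengths = list(range(400, 901))
reflectance = [0.001 * wl for wl in wavelengths]  # synthetic, linearly increasing spectrum
red = band_equivalent_reflectance(wavelengths, reflectance, toy_red_response)
```

With a uniform response the result reduces to the plain mean of the spectrum over the band; a measured response function simply reweights the same sum.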

Figure 4. Spectra representing the proximal data resampled to Landsat-8 spectral resolution. Rrs
refers to spectral reflectance.

4.2.3 Satellite Data


Using the Google Earth Engine (GEE) cloud computing platform, Landsat-8 and
Sentinel-2 datasets were obtained and processed for the period between May 2013 and
December 2018, corresponding to the in-situ data from the GREON and USACE datasets. The
processing for this study followed the methodology laid out by Claverie et al. (2018) to create a
synchronized Landsat-8-Sentinel-2 virtual constellation (Figure 5). The
process begins with Level-1C (L1C) data hosted on the GEE cloud, which are then atmospherically
corrected using the Case 2 Regional CoastColour processor (C2RCC). The C2RCC was
developed specifically for optically complex inland waters and uses a large database of radiative
transfer simulations that are inverted by a series of neural networks to calculate atmospheric
correction for satellite sensors including Landsat-8 and Sentinel-2 (Brockmann et al. 2016). The
data are then masked for values indicating clouds, adjacent shadow, and cloud shadow using the
CFmask procedure adopted from Zhu et al. (2015). This method, however, does not filter out ice
pixels, which may be a source of error in scenes during winter months. Cloud filtering and
masking were performed using the provided pixel quality assurance (QA) bands to select only
pixels containing cloud-free water. Sentinel-2 data are then resampled to the nominal 30 m spatial
resolution and the two products are geographically registered. A bidirectional reflectance
distribution function (BRDF) normalization is then applied to the Sentinel-2 and Landsat-8
reflectance products, which are normalized for per-pixel view and per-granule illumination angles.
This applies to all optical bands except the Sentinel-2 red-edge bands and the atmospheric bands
(coastal aerosol, cirrus, water vapor), for which no BRDF information is available and no
correction can be made with this technique. View angle is set to nadir and solar zenith is set to a
value that is temporally constant at each tile’s center and varies with latitude. The normalization
uses the c-factor technique and the global coefficients provided by Roy et al. (2016). A bandpass
adjustment is then calculated to account for small spectral differences within overlapping band
regions (Claverie et al. 2018). To filter out passing ships and/or monitoring units deployed within
the study area, a standard deviation filter was applied to the imagery (Peterson et al. 2018). The
satellite data acquired were further masked to extract only the areas surrounding the sampling
locations. A 90 x 90 m (3 x 3 pixel) sampling area was created for each site and the mean
reflectance values were calculated and used as input spectral data for the modeling process.
These sampling areas were manually demarcated to diminish issues related to adjacency effects.
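
The final extraction step (a 3 x 3 pixel window around each site, screened by a standard deviation filter and then averaged) might look like the following sketch. The exact form of the filter is not given in the text, so the rule used here, dropping pixels more than k standard deviations from the window mean, is an assumption, as are the function name and the convention that masked pixels are stored as None.

```python
# Illustrative sketch only; the k-sigma rule and the None-for-masked convention are assumptions.
from statistics import mean, pstdev


def site_reflectance(band, row, col, k=2.0):
    """Mean reflectance of the 3 x 3 window centred on (row, col).

    `band` is a 2-D list of reflectance values with None for masked pixels.
    Pixels further than k standard deviations from the window mean are
    discarded before averaging. Returns None if every pixel is masked.
    """
    window = [
        band[r][c]
        for r in range(row - 1, row + 2)
        for c in range(col - 1, col + 2)
        if band[r][c] is not None
    ]
    if not window:
        return None
    mu = mean(window)
    sigma = pstdev(window)
    kept = [v for v in window if sigma == 0 or abs(v - mu) <= k * sigma]
    return mean(kept)


# One bright pixel (for example a passing ship) in an otherwise uniform window.
band = [
    [0.05, 0.05, 0.05],
    [0.05, 0.05, 0.05],
    [0.05, 0.05, 0.50],
]
value = site_reflectance(band, 1, 1)
```

In this example the bright pixel lies well outside two standard deviations of the window mean and is excluded, so the site value reflects only the surrounding water pixels.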

Figure 5. Satellite data preprocessing workflow as defined by the NASA Harmonized Landsat-8
Sentinel-2 (HLS) science algorithm. S-30 MSI and L-30 OLI represent the final harmonized data
products at 30 m spatial resolution. Figure reused with permission from Peterson et al. (2020).

The resulting data product contains fully corrected surface reflectance values for the
seven overlapping bands (coastal, blue, green, red, NIR, SWIR1, and SWIR2), and the two sources
are virtually identical in terms of spectral response and spatial resolution. This creates the
potential for synergistic use of the two satellite data sources and greatly improves the temporal
resolution/revisit time (Claverie et al. 2018). A detailed description of the HLS processing
methodology can be found in Claverie et al. (2017). The resulting satellite dataset contained 233
scenes from Landsat-8 and 75 from Sentinel-2, for a total of 308 images. When compared to
the GREON and USACE water quality data, 96 scenes aligned with ground samples for the
satellite database. The plotted results of the HLS spectral reflectance are shown in Figure 6.


Figure 6. Spectra resulting from the Landsat-8-Sentinel-2 virtual constellation pre-processing
routine representing the satellite database. Rrs refers to spectral reflectance.

5. Methods
With remote sensing, the interaction between electromagnetic radiation, water, and the
bio-physical compounds in a freshwater system can be used to estimate a range of water quality
parameters (van der Meer and De Jong 2001). Thus, optically active parameters such as
phytoplankton, suspended sediments, and dissolved organic matter can be estimated by
characterizing specific absorption and scattering features (Glasgow et al. 2004). A variety of
algorithms have been developed for retrieving constituents in inland waters based on these
optical interactions of water constituents and electromagnetic radiation in the visible and near
infrared regions that are not heavily impacted by water absorption. These algorithms are
typically divided into two broad categories: empirical and analytical/bio-optical methods.

5.1 Spectral methods


5.1.1 Band ratio
Spectral band ratio methods use a simple semi-empirical approach to estimating water
quality parameters by creating a mathematical ratio between the reflectance of two or more
spectral bands. The basic relationship determined by the ratio approach can be described as:

$$\hat{p} = \alpha \left( \frac{R_1}{R_2} \right)^{\beta} + \gamma \qquad (1)$$

where $\hat{p}$ is the quantity of the parameter being estimated, $R_i$ is the reflectance in band
$i$, and $\alpha$, $\beta$, and $\gamma$ are the coefficients derived from regression. Ratio
algorithms can consist of several different band combinations to exploit the spectral signal
gathered by the sensor and are used as independent variables in regression analysis. Ratio
algorithms have been widely researched for the estimation of Chl-a utilizing the ratio of
reflectance at 700 nm and 670 nm, which exploits the backscattering from phytoplankton near
700 nm and the Chl-a absorption maximum at 670 nm (Roesler et al. 1989). Several studies such as
Gitelson et al. (1993), Gons (1999), and Moses et al. (2009) display significant correlations
between Chl-a concentration and the ratio of 700 nm and 670 nm over a wide range of Chl-a
concentrations (0.1 – 350 mg/L) across a variety of inland waterbodies. Although the simple
two-band ratio for Chl-a has displayed promising results in several cases, relatively low
accuracies are more commonly derived when applied to inland waters due to the influence of
other constituents such as CDOM (IOCCG 2006). Along with Chl-a, other optically active
parameters have been successfully estimated using band ratios, such as CDOM (Kutser et al.
2005), turbidity (Gitelson et al. 1993; Kallio et al. 2001; Lacaux et al. 2007; Shafique et al.
2003), and TSS (Gitelson et al. 1993; Kallio et al. 2001). Although band ratio estimation methods
are easily implemented, they are more suited for case 1 waters (waters dominated by
phytoplankton) where the spectral signal is less impacted by turbidity and suspended matter.
Results from these methods also tend to vary widely by location and water
type. Table 2 contains selected band ratio algorithms for water quality parameter estimation.
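
As a concrete reading of Equation (1), the short sketch below evaluates the band ratio model for given coefficients and also computes the normalized difference form used by the NDCI entry in Table 2. The function names and the coefficient values in the example are hypothetical placeholders; in practice the coefficients would come from regression against in-situ measurements.

```python
# Illustrative sketch only; coefficient values are placeholders, not fitted results.

def band_ratio_estimate(r1, r2, alpha, beta, gamma):
    """Evaluate Equation (1): p = alpha * (R1 / R2) ** beta + gamma."""
    return alpha * (r1 / r2) ** beta + gamma


def normalized_difference(r_a, r_b):
    """Normalized difference of two reflectances, (Ra - Rb) / (Ra + Rb)."""
    return (r_a - r_b) / (r_a + r_b)


# Hypothetical reflectances near 700 nm and 670 nm with made-up coefficients.
estimate = band_ratio_estimate(0.02, 0.01, alpha=10.0, beta=1.0, gamma=5.0)

# NDCI uses reflectance at 708 nm and 665 nm (Mishra and Mishra 2012).
ndci = normalized_difference(0.03, 0.01)
```

The normalized difference form is bounded between -1 and 1, which is one reason such indices are often preferred over a raw ratio when reflectance in the denominator band is small.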

Table 2. Selected band ratio methods applied to retrieve water quality parameters. In this table
some equations are denoted by specific wavelengths and others by broad band regions, as
presented in the original publications.

| Author | Method | Dataset | Parameter | Band/Equation |
| --- | --- | --- | --- | --- |
| Shafique et al. (2003) | Semi-empirical band ratio | Hyperspectral | Chl-a, TP, Turbidity | $\text{Chl-a} = a \cdot (705/675) - \varepsilon$; $TP = a \cdot \log(554/675\ \text{nm}) - \varepsilon$; $\text{Turbidity} = a \cdot (710 - 740\ \text{nm}) + \varepsilon$ |
| Hu et al. (2012) | Color Index Algorithm (CIA) | MODIS | Chl-a | $CIA = R(555) - \left[ R(443) + \frac{555 - 443}{670 - 443} \cdot \left( R(670) - R(443) \right) \right]$ |
| Lacaux et al. (2007) | Normalized Difference Turbidity Index (NDTI) | SPOT-5 | Turbidity | $NDTI = \frac{R_{rs}(\text{Red}) - R_{rs}(\text{Green})}{R_{rs}(\text{Red}) + R_{rs}(\text{Green})}$ |
| Zheng et al. (2016) | Semi-empirical band ratio | Landsat-8 | K490 | $K_{490} = a \cdot \ln\left( R_{rs}(\text{NIR}) / R_{rs}(\text{Green}) \right) + \varepsilon$ |
| Mishra and Mishra (2012) | Normalized Difference Chlorophyll Index (NDCI) | MERIS | Chl-a | $NDCI = \frac{R_{rs}(708) - R_{rs}(665)}{R_{rs}(708) + R_{rs}(665)}$ |
| Hu (2009) | Floating Algal Index (FAI) | MODIS | Floating algae | $FAI = R(\text{NIR}) - R'(\text{NIR})$, $R'(\text{NIR}) = R(\text{Red}) + \left( R(\text{SWIR}) - R(\text{Red}) \right) \cdot \frac{\lambda_{NIR} - \lambda_{Red}}{\lambda_{SWIR} - \lambda_{Red}}$ |
| Pulliainen et al. (2001) | Semi-empirical band ratio | AISA | Chl-a | $\text{Chl-a} = a_0 + a_1 R(702)/R(665)$ |
| Gitelson et al. (1993) | Normalized difference band ratio | Spectroradiometer | TSS, Turbidity | $TSS = \frac{R_{560\text{–}567} - R_{515\text{–}523}}{R_{560\text{–}567} + R_{515\text{–}523}}$; $\text{Turbidity} = \frac{R_{560\text{–}567} - R_{515\text{–}523}}{R_{560\text{–}567} + R_{515\text{–}523}}$ |
| Kallio et al. (2001) | Semi-empirical band ratio | AISA | Chl-a, TSS, Turbidity, SD | $\text{Chl-a} = L_{699\text{–}705} / L_{670\text{–}677}$; $TSS = R_{705\text{–}714}$; $\text{Turbidity} = R_{705\text{–}714}$; $SD = \frac{L_{488\text{–}496} - L_{747\text{–}755}}{L_{618\text{–}625} - L_{747\text{–}755}}$ |

5.1.2 Spectral shape


Spectral shape algorithms, like band ratios, are based on the underlying spectral
relationship between measured radiance/reflectance and the water constituents. This approach
differentiates itself from the band ratio approach by utilizing the distinct absorption and
reflectance properties to isolate the spectral signature (usually slope or peak
reflectance/absorption differences) of the parameter in question. For example, one of the most
widely used spectral shape methods is the Fluorescence Line Height (FLH), which is a measure
of the amount of radiance leaving the water surface in the Chl-a fluorescence emission band. A
baseline is determined from the linear fit between the two bands at 667 nm and 748 nm,
which bracket the signal at 678 nm. The signal at 678 nm reaching above this baseline is
quantified as the FLH (Neville and Gower 1977). This methodology has proven useful in
numerous studies for the estimation of optically active parameters such as Chl-a (Dierberg and
Carriker 1994; Kallio et al. 2001; Matthews and Odermatt 2015; Neville and Gower 1977;
Wynne et al. 2008), CDOM (Helms et al. 2008; Twardowski et al. 2004), and SD (Kallio et al.
2001), generally outperforming simple band ratios that are prone to fluctuations in CDOM and
suspended matter common in inland waters. Although these methods tend to outperform simple
band ratios, they often require high spectral resolution data, which limits their applicability to
satellite data. Table 3 contains a list of key spectral shape algorithms used for water quality
estimation.
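
The line height calculation described above reduces to subtracting a linearly interpolated baseline from the signal at the central band. The sketch below illustrates this under stated assumptions: the function name is hypothetical, the default wavelengths are the 667, 678, and 748 nm bands mentioned in the text, and the radiance values in the example are synthetic.

```python
# Illustrative sketch only; radiance values in the example are synthetic.

def line_height(l_left, l_center, l_right,
                wl_left=667.0, wl_center=678.0, wl_right=748.0):
    """Height of the central-band signal above a linear baseline.

    The baseline is the straight line through (wl_left, l_left) and
    (wl_right, l_right), evaluated at wl_center.
    """
    fraction = (wl_center - wl_left) / (wl_right - wl_left)
    baseline = l_left + fraction * (l_right - l_left)
    return l_center - baseline


# A flat baseline with a fluorescence peak of 0.5 units at 678 nm.
flh = line_height(1.0, 1.5, 1.0)
```

The same function covers other line-height style indices in Table 3 by changing the three wavelengths, since they share the baseline-subtraction structure.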

Table 3. Selected spectral shape methods applied to retrieve intermediate algal products.

| Author | Method | Dataset | Parameter | Band/Equation |
| --- | --- | --- | --- | --- |
| Neville and Gower (1977) | Fluorescent Line Height (FLH) | Airborne spectrometer | Chl-a | $FLH = L_F - \left[ L_R + (L_L - L_R) \frac{\lambda_R - \lambda_F}{\lambda_R - \lambda_L} \right]$ |
| Dierberg and Carriker (1994) | Reflectance Line Height (RLH) | Airborne multispectral measurement system | Chl-a | $RLH = L_2 - L_1 - \left[ (L_3 - L_1) \cdot \frac{\lambda_2 - \lambda_1}{\lambda_3 - \lambda_1} \right]$ |
| Matthews and Odermatt (2015) | Maximum Peak Height (MPH) | MERIS | Chl-a | $MPH = \rho_{BR,max} - \rho_{BR,664} - \left( (\rho_{BR,885} - \rho_{BR,664}) \cdot \frac{\lambda_{max} - 664}{885 - 664} \right)$ |
| Kallio et al. (2001) | Spectral shape band ratio | AISA airborne imaging spectrometer | Chl-a, CDOM, SDT | $\text{Chl-a} = L_{702}/L_{674}$; $CDOM = (L_{571} - L_{607})/L_{607}$; $SDT = (L_{492} - L_{751})/(L_{622} - L_{751})$ |
| Wynne et al. (2008) | Cyanobacteria Index (CI) | MERIS | Cyanobacteria (Microcystis) | $CI = nL_w(681) - nL_w(665) - \left\{ nL_w(709) - nL_w(665) \right\} \cdot \frac{681 - 665}{709 - 665}$ |
| Gower et al. (2005) | Maximum Chlorophyll Index (MCI) | MERIS | Chlorophyll | $MCI = L(709) - L(681) - 0.389 \left( L(753) - L(681) \right)$ |
| Kudela et al. (2015) | Scattering Line Height (SLH); Aphanizomenon-Microcystis Index (AMI) | MASTER, HICO, AVIRIS | Chlorophyll (Microcystis) | $SLH = R_{rs}(714) - \left[ R_{rs}(654) + \frac{R_{rs}(754) - R_{rs}(654)}{754 - 654} \cdot (714 - 654) \right]$; $AMI = [640 - 510]/[652 - 625]$ |

5.2 Empirical estimation


The goal of empirical methods is to identify relationships between water quality
parameters and spectral reflectance values (individual bands or combinations of bands) by means
of regression-related efforts (Chang et al. 2015). Empirical approaches are fully or partly
data-driven and require sufficient in-situ water quality measurements to establish an
empirical relationship (e.g., linear or non-linear regression) between the water-leaving radiance
measured by the sensor (Dörnhöfer and Oppelt 2016) and water quality measurements. Most
empirical approaches utilize a multivariate regression modeling paradigm as a consequence of
the optical complexity of inland waters. Table 4 includes a list of selected empirical methods for
constituent estimation.
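
The simplest instance of such an empirical relationship is an ordinary least squares line between one spectral predictor and one measured parameter. The sketch below shows that single-predictor case only, as a stand-in for the multivariate models discussed in the text; the function names are hypothetical and the data in the example are synthetic.

```python
# Illustrative sketch only; a single-predictor stand-in for multivariate regression.

def fit_linear(x, y):
    """Ordinary least squares fit of y = slope * x + intercept."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def r_squared(x, y, slope, intercept):
    """Coefficient of determination of the fitted line on (x, y)."""
    mean_y = sum(y) / len(y)
    ss_res = sum((yi - (slope * xi + intercept)) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return 1.0 - ss_res / ss_tot


# Synthetic example: a band ratio as predictor and a water quality parameter as response.
ratio = [1.0, 2.0, 3.0, 4.0]
measured = [3.0, 5.0, 7.0, 9.0]
slope, intercept = fit_linear(ratio, measured)
fit_quality = r_squared(ratio, measured, slope, intercept)
```

With real data the fit would be evaluated on samples held out from calibration, since coefficients derived at one site often transfer poorly to other water types.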
Table 4. Selected empirical methods applied to retrieve water quality parameters. Equations in
this table are shown as defined in their original publications.

| Author | Method | Dataset | Parameter | Band/Equation |
| --- | --- | --- | --- | --- |
| Baban (1993) | Multiple linear regression (MLR) | Landsat-5 | SDD, SS, Salinity | B1, B2, B3, B4, B5 |
| Bonansea et al. (2015) | Linear mixed model (LMM) | Landsat-5, 7 | Chl-a, Secchi disk transparency (SDT) | Chl-a: ρ2, ρ3, WST and AR (spatial correlation structure); SDT: ρ4, ρ4/ρ1 and AR (spatial correlation structure) |
| Kloiber et al. (2002) | Log-transformed multiple linear regression | Landsat-5 | SDT | B1/B3, B1 |
| Rundquist et al. (1996) | First derivative, linear regression | Spectroradiometer | Chl-a | D(690) |
| Dekker and Peters (1993) | MLR | Landsat-5 | Chl-a, SDT | B1, B2, B3, B4 |
| Hicks et al. (2013) | Linear regression | Landsat-7 | TSS, Turbidity, SDT | $TSS = -52.817 + 1{,}449.4 \times B4$; $TURB = -63.717 + 1{,}587.8 \times B4$; $\ln(SDT) = -2.0298 + 2.7517 \times \ln(B1{:}B3) - 0.6022 \times \ln(B1)$ |
| Giardino et al. (2013) | Neural network (NN) | MERIS | Chl-a, SPM, CDOM | NA |
| Lobo et al. (2015) | Non-linear regression | Landsat-5, 7, 8 | TSS | $TSS = R(\text{Red}) \cdot 0.99 + 0.8$ |
| Majozi et al. (2014) | Non-linear regression | MERIS | Attenuation coefficient at 490 nm (K490) | $K_{490} = a \left[ R(560) / R(\lambda) \right]^{B} + y$ |
| Olmanson et al. (2013) | Step-wise regression | Hyperspectral | Chl-a, TSS, nonvolatile suspended solids (NVSS), turbidity | $\text{Chl-a} = 4.12 \cdot (705/592) - 0.31$; $TSS = 0.006 \cdot (705) + 0.51$; $NVSS = 705,\ 705/670$; $\text{Turbidity} = -0.004 \cdot (705) + 4.88$ |
| Alparslan et al. (2007) | Multiple linear regression | Landsat-7 | Chl-a, TSS, SDT | Chl-a = (B1, B2, B3, B4); TSS = (B1, B2, B3, B5); SDT = (B1, B2, B3, B6) |
| Maier and Keller (2018) | Support vector machine regression (SVR) | Hyperspectral | Chl-a, BGA, CDOM, turbidity | NA |
| Peterson et al. (2019) | Feature fusion and ensemble regression | Spectroradiometer | BGA, Chl-a, TSS, turbidity, TDS | NA |
| Wang et al. (2017b) | Particle swarm optimization and SVR | Hyperspectral | Water Quality Index (WQI) | NA |
| Peterson et al. (2018) | Feature fusion and extreme learning machine regression (ELR) | Landsat-5, 7, 8 | SSC | NA |
| Gitelson et al. (2009) | Non-linear regression band ratio | MERIS | Chl-a | $\text{Chl-a} = \left[ R_{rs}(670)^{-1} - R_{rs}(710)^{-1} \right] \times R_{rs}(750)$ |
| Flink et al. (2001) | PCA + MLR | MERIS | Chl-a | PC1, PC2, PC3, PC4 |
| Tyler et al. (2006) | Linear mixture modelling | Landsat-5, 7 | Chl-a | TM1, TM2, TM3 or ETM1, ETM2, ETM3 |
| Chen et al. (2008) | Genetic Algorithm | Landsat-7 | Chl-a | B1, B4, B5, B7 |
| Song et al. (2012b) | Genetic Algorithms and Partial Least Square (GA-PLS) | Spectroradiometer | Chl-a, TSM | NA |
| Panda et al. (2004) | Radial Basis Function Neural Network Models (RBFN) | Landsat-5 | Chl-a, SM | NA |
| Wu et al. (2010b) | Log regression | Landsat-5 | TP | $\ln(TP) = -21.45(B3/B2) - 14.42(B1/B3) + 42.99(B1) + 27.1$ |
| Wang et al. (2010) | SVR | SPOT5 | COD, NH3-N, DO | NA |
| Chen (2003) | Genetic programming | SPOT | Chl-a | $\text{Chl-a} = \frac{R_{rs}(B4)}{\ln\left( R_{rs}(B2) + R_{rs}(B3) \right)} + \ln(61.6)$ |

Linear approaches such as multiple linear regression (MLR) (Alparslan et al. 2007; Baban 1993;
Dekker and Peters 1993; Flink et al. 2001; Hicks et al. 2013; Rundquist et al. 1996), partial least
squares (PLS) (Song et al. 2012b), linear mixture modeling (Bonansea et al. 2015; Tyler et al.
2006), and genetic algorithms (GA) (Chen 2003; Chen et al. 2008), to name a few, gained
popularity in earlier studies. Although linear models showed several promising results, the
presence of nonlinear relationships among water quality constituents makes linear regression
less reliable for identifying precise relationships, which led scientists to employ more
advanced approaches (Wang et al. 2010). In the literature, nonlinear methods such as
support vector regression (SVR) (Maier and Keller 2018; Peterson et al. 2019; Wang et al. 2010;
Wang et al. 2017b) and neural network (NN)-based approaches (Giardino et al. 2013; Panda et al.
2004; Peterson et al. 2018; Peterson et al. 2020) have gained popularity due to their robust ability
to capture the complex statistical trends typical of water quality remote sensing data. Although
very few studies employ deep learning methods such as deep recurrent networks and
convolutional neural networks, these offer great potential for water quality parameter estimation
due to their highly non-linear nature.


One limitation of empirical methods is geographic transferability. An empirical model
developed for a specific waterbody may not be fully transferable to other waterbodies because it
is based on the unique characteristics of the original study area. However, because these
techniques integrate the unique characteristics of the waterbody when establishing a model, they
typically generate more accurate results than the more generalized spectral and/or bio-optical
models (Matthews 2011). A further advantage of the empirical approach is that it requires no
prior understanding of the often-complex interactions between water and light, which makes it
desirable for ease of implementation.
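The site-specific nature of an empirical model can be illustrated with a minimal sketch of a log-linear band-ratio regression, the general form of several entries in the table above. The function names and the calibration samples below are hypothetical and synthetic; they are not taken from any of the cited studies.

```python
import math

def fit_log_linear(ratios, values):
    """Ordinary least squares fit of ln(value) = a + b * ln(ratio)."""
    xs = [math.log(r) for r in ratios]
    ys = [math.log(v) for v in values]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b = sxy / sxx
    a = mean_y - b * mean_x
    return a, b

def predict(a, b, ratio):
    """Back-transform the fitted log-linear model to the original units."""
    return math.exp(a + b * math.log(ratio))

# Hypothetical calibration samples: band ratio and a water quality value.
# The values are synthetic and noise-free (value = 0.5 * ratio ** 1.5).
band_ratio = [0.8, 1.0, 1.3, 1.7, 2.2]
measured = [0.5 * r ** 1.5 for r in band_ratio]

a, b = fit_log_linear(band_ratio, measured)
estimate = predict(a, b, 2.0)
```

The fitted coefficients `a` and `b` depend entirely on the calibration samples, which is why a model of this kind may not transfer to a waterbody with different optical characteristics.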


5.3 Bio-optical estimation
Due to the high complexity of inland and coastal waters, researchers developed bio-optical
modeling techniques, which have recently gained popularity. Bio-optical modeling
techniques, which are considered analytical, are primarily based on radiative transfer within
the water column. Using the radiative transfer equation, the optically active constituents of the
water column can be derived from the water-leaving radiance. However, these models require
detailed spectral information on the optically active water constituents within the target region.
Because they are based on fundamental light-water interactions, these methods have been
proposed to overcome the issues of regional transferability inherent in empirical methods.
Bio-optical modeling has two general methods for solving the radiative transfer equation.
The forward approach derives Rrs from the water’s IOPs and an approximation of the radiative
transfer equation, which can be done through models such as HydroLight (Mobley 1998).
Radiative transfer equations can then be used to calculate the top-of-atmosphere reflectance
using popular methods such as MODTRAN (Kneizys et al. 1988). The inverse approach
estimates the IOPs and the concentrations of the optically active constituents from either spectral
reflectance or radiance and can be solved using a variety of regression models and optimization
techniques such as multiple linear regression and neural networks. A significant weakness of
these algorithms lies in the selection and specification of the appropriate parameters required
to model and solve the radiative transfer equation (IOCCG 2006).
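The difference between the forward and inverse directions can be sketched with a commonly used quadratic approximation that relates subsurface remote sensing reflectance to the ratio u = bb / (a + bb), where a is the total absorption coefficient and bb the total backscattering coefficient. This is only an illustration under stated assumptions: the constants G0 and G1 are assumed values of the kind often quoted for this approximation, the air-water interface is ignored, and the snippet is not the HydroLight or MODTRAN computation.

```python
import math

# Assumed model constants for rrs = G0 * u + G1 * u**2 (illustrative values).
G0 = 0.0949
G1 = 0.0794

def forward_rrs(a, bb):
    """Forward direction: IOPs (a, bb) -> subsurface reflectance."""
    u = bb / (a + bb)
    return G0 * u + G1 * u * u

def invert_u(rrs):
    """Inverse direction, step 1: solve the quadratic for u (positive root)."""
    return (-G0 + math.sqrt(G0 * G0 + 4.0 * G1 * rrs)) / (2.0 * G1)

def absorption_from_rrs(rrs, bb):
    """Inverse direction, step 2: recover a from u when bb is known."""
    u = invert_u(rrs)
    return bb * (1.0 - u) / u

# Round trip with hypothetical IOPs (units of 1/m).
rrs = forward_rrs(0.3, 0.02)
recovered_a = absorption_from_rrs(rrs, 0.02)
```

The round trip only succeeds here because bb is supplied; in practice both a and bb are unknown, which is why inverse methods need additional spectral assumptions or optimization.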
Table 5 contains a list of selected methods used in recent studies. Earlier studies such as
Hoogenboom et al. (1998) and Hoge (1996) used matrix inversion for the retrieval of several
parameters including Chl-a, CDOM, and suspended matter. Other approaches, such as the
modeling approach used by Arst and Kutser (1994) and Kutser et al. (2001), estimated
concentrations of Chl-a, CDOM, and total suspended matter (TSM) using simulated spectra
that were generated through forward bio-optical modeling and found to be comparable to real
in-situ hyperspectral data. The quasi-analytical algorithm (QAA) inversion, originally proposed
by Lee et al. (2002), calculates the total absorption coefficient without relying on spectral
models of the pigment and gelbstoff absorption coefficients. These absorption coefficients are
instead spectrally decomposed from the total absorption coefficient in a separate calculation,
which makes the algorithm easy to implement. Proposed by Wang et al. (2005), the
over-constrained linear matrix inversion is similar to other inversion methods but provides an
uncertainty for the inverted parameters of each individual spectrum. Following these
developments, and given the need for computational efficiency in satellite image processing, the
linear matrix inversion method was later adapted into a look-up-table approach (SAMO-LUT) and
applied to MERIS data, interpreting the spectra on a pixel-by-pixel level (Yang et al. 2011). Due
to the high complexity of the developed bio-optical models, tools such as the Water Colour
Simulator (WASI) (Albert and Gege 2006) and the Bio-Optical Model-Based tool for Estimating
water quality and bottom properties from remote sensing images (BOMBER) (Giardino et al.
2012) were developed specifically for use with inland water. More recently, advanced techniques
such as 2SeaColor (Salama and Verhoef 2015), which uses a two-stream analytical solution to the
radiative transfer equation, have gained popularity due to their robust calculations and
applicability to a range of water conditions. Other advanced approaches utilizing machine
learning have also gained popularity, such as NN-based models (Doerffer and Schiller 2007),
which seek non-linear optimizations and solutions to the radiative transfer equation to estimate
water constituents.

Table 5. Selected bio-optical methods applied to retrieve water quality parameters. In this table K
represents the diffuse attenuation coefficient at a given wavelength (i.e. K490).

Author | Technique | Sensor/platform | Parameter
Albert and Gege (2006) | Shallow water inversion (WASI) | Airborne hyperspectral and multispectral | Phytoplankton, suspended matter, absorption of gelbstoff, bottom depth, bottom type
Giardino et al. (2012) | BOMBER inversion | Airborne hyperspectral | SPM, Chl-a, CDOM, SAV, Water Depth
Yang et al. (2011) | Semi-analytical model-optimizing and look-up-table (SAMO-LUT) | MERIS | Chl-a, tripton, CDOM
Horion et al. (2010) | Regionally adjusted inversion | MODIS | Chl-a, K490
Matthews et al. (2010) | Case-2 waters inversion | MERIS | Chl-a, TSS, OSS, CDOM, SD, ISS
Santini et al. (2010) | 2-step least-squares inversion | Airborne hyperspectral | Chl-a, TSM, CDOM
Doerffer and Schiller (2007) | Neural Network Inversion | Sentinel-2, Sentinel-3 | Chl-a, TSM
Ruiz-Verdú et al. (2008) | Case-2 waters inversion | MERIS | Chl-a, TSM
Smyth et al. (2006) | PML algorithm | MODIS | CDOM, phytoplankton
Van Der Woerd and Pasterkamp (2008) | HYDROPT inversion | MERIS, MODIS | Chl, SPM, CDOM
Kutser et al. (2001) | Area minimization inversion | Airborne hyperspectral | Chl-a, TSM, CDOM
Kuchinke et al. (2009) | Spectral optimization | SeaWiFS | Chl-a, CDOM
Lee et al. (2002) | Quasi-analytical algorithm (QAA) | Airborne hyperspectral | Chl-a, CDOM
Mishra et al. (2013) | Quasi-analytical inversion | Spectroradiometer | Chl-a, PC
Salama and Verhoef (2015) | 2SeaColor | Sentinel-3 | K490
Franz and Werdell (2010) | Generalized Inherent Optical Property algorithm (GIOP) | Airborne hyperspectral | Chl-a
Li et al. (2017) | Shallow water Bio-Optical Properties algorithm | Spectroradiometer | CDOM
Ma et al. (2006) | Optimal parameter suitability model | Spectroradiometer | CDOM
Dekker et al. (2001) | One-dimensional network model | Landsat-5 and SPOT-HRV | TSM
Miksa et al. (2004) | Analytical Modular inversion and processing system | Airborne hyperspectral | Chl-a, TSS, CDOM
Loisel and Morel (2001) | Spectral Inversion | Spectroradiometer | K440, K490
Hoge (1996) | Linear Matrix Inversion Algorithm | SeaWiFS | K410, K490, CDOM
Wang et al. (2005) | Over Constrained Linear Matrix Inversion (OCLMI) with Statistical Selection | Spectroradiometer | K440
Garver and Siegel (1997) | GSM Semi-Analytical Bio-Optical Model | Spectroradiometer | K443, Chl-a, CDOM
Hoogenboom et al. (1998) | Matrix Inversion Method (MIM) | Airborne hyperspectral | Chl-a, tripton (DW)
Phinn et al. (2005) | Analytical Model of Underwater Light Climate | Landsat-7 | Secchi depth, KD, tripton, CDOM
Li et al. (2018) | Shallow Water Bio-optical Properties (SBOP) | Landsat-8 | CDOM

5.4 Selected Models

To evaluate the range of water quality estimation methods included in this work, we
selected several methods from each approach and applied them to the proximal and, when
applicable, satellite datasets. By applying a range of methods to the selected datasets, the goal
is to identify the specific strengths and weaknesses of each modeling approach. Several
spectral band ratios (NDCI, FAI, CIA, and NDTI) and spectral shape algorithms (FLH, RLH,
MCI, and MPH) were tested to represent the spectral approach. The spectral methods, which
generally involve only a few bands, were applied to both the proximal and satellite datasets.
Methods applied to the satellite dataset used the spectral channel closest to that of the original
algorithm when possible. Several methods were not applied to the satellite dataset due to
spectral limitations; the spectral shape algorithms, for example, require narrow bands within the
red and red-edge regions that the satellite data do not provide.
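Band ratio indices of the normalized-difference type (the "ND" in names such as NDCI and NDTI) share one arithmetic form, sketched below. The band pairing and the reflectance values are assumptions chosen for illustration; they are not the index definitions or data used in this study.

```python
def normalized_difference(band_a, band_b):
    """Return (band_a - band_b) / (band_a + band_b), or None if undefined."""
    total = band_a + band_b
    if total == 0:
        return None  # avoid division by zero for dark or masked pixels
    return (band_a - band_b) / total

# Hypothetical reflectance values for a red-edge band and a red band.
red_edge, red = 0.030, 0.010
index_value = normalized_difference(red_edge, red)
```

Because the result is bounded between -1 and 1 and uses only two bands, an index of this form is simple to compute on satellite data, but it carries far less spectral information than a full-spectrum regression.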

Representing the empirical approach, several of the more popular multivariate regression
techniques were tested, including PLSR, SVR, deep neural network (DNN), and long short-term
memory (LSTM). PLSR, representing a linear approach, was determined to be most effective with
three principal components, which were used in modeling for both the proximal and satellite
datasets. The SVR model used a linear kernel and a Bayesian optimization function (Pelikan et
al. 1999) for hyperparameter selection. Deep learning regression methods were also evaluated
to gauge the contribution of deep learning to water quality remote sensing, as these methods
have seen increasing popularity throughout the field of remote sensing. Here we implemented a
feedforward DNN containing five hidden layers with the Tansig activation function and a
learning rate of 0.01, trained with Bayesian regularized backpropagation. To further explore
novel deep learning methods, an LSTM was also applied to the datasets (Hochreiter, 1997). The
LSTM used in this study consisted of three hidden layers with a learning rate of 0.001 and the
adaptive moment estimation (ADAM) solver. To our knowledge, this is the first implementation
of LSTM for water quality estimation from spectral data. Empirical regression methods were
given the full spectrum from 400 nm to 900 nm as input for the proximal dataset and all
available bands for the satellite dataset. All empirical models were implemented in MATLAB
2018b. In-depth investigations of model hyperparameter tuning and feature engineering
were not explicitly addressed in this work, as the main objective was to test and evaluate the
general effectiveness of empirical approaches.
In this study, we investigated a range of popular bio-optical spectral inversion methods on
both the proximal and satellite datasets when possible. The quasi-analytical algorithm (QAA)
inversion is designed to derive the absorption and backscattering coefficients by inverting the
spectral Rrs. This method was originally designed to use Rrs at 412 nm, 443 nm, 490 nm, 510 nm,
555 nm, and 670 nm; therefore, it was only applied to the proximal dataset. Proposed by Wang et
al. (2005), the over-constrained linear matrix inversion (OCLMI) is similar to other inversion
methods but provides an uncertainty for the inverted parameter of each individual spectrum.
Like the QAA, this technique was only applied to the proximal dataset due to spectral
limitations of the satellite data. The OCLMI requires Rrs at 412 nm, 443 nm, 490 nm, and
555 nm as input, fewer bands than the QAA method. The
2SeaColor method was also selected for testing since recent studies (Salama and Verhoef 2015;
Song et al. 2012b) have shown the two-stream model to be especially beneficial in highly turbid
inland waters. Input wavelengths for the 2SeaColor method are Rrs at 440 nm, 490 nm, 550 nm,
and 560 nm; therefore, it was excluded from the satellite dataset. Based on the bio-optical model
proposed by Albert and Mobley (2003), WASI was specifically designed to deal with
many of the common issues in inland remote sensing, such as shallow water depth, sun
glint, and surface waves. Here we tested the WASI application on both the proximal and satellite
datasets, using the inverse model to calculate Chl-a within shallow water conditions from the
input Rrs. For the Landsat-8-Sentinel-2 virtual constellation, bands B2 (482 nm), B3 (561 nm), B4
(655 nm), and B5 (864 nm) were used as the input wavelengths. Lastly, the NN inversion
method defined by Doerffer and Schiller (2007) was evaluated and applied to both the proximal
and satellite datasets. This method has been specifically designed for satellite data
(MERIS, MODIS, SeaWiFS, Landsat-8, and Sentinel-2) and is supported through the ESA’s
Sentinel toolbox SNAP.


5.5 Model evaluation and comparison
Repeating the model fit over multiple iterations and averaging the results is common
practice in machine learning and statistical modeling. All models evaluated in this study (except
the bio-optical models) were run for 100 iterations, and average scores were reported to avoid
statistical bias. The spectral and empirical methods used a random partition of 70% training and
30% testing data to assess model accuracy, and the data were shuffled at each iteration. Although
the proportion of samples in the training and testing data remained 70%/30%, the samples in each
group differed between iterations to ensure that the models were exposed to a representative
average of the dataset. Due to the relatively small sample size of the datasets, issues related to
spatial transferability were not explored and may be present. Since the bio-optical models do not
require a specific dataset for training, they were applied to all individual samples. Deep
learning approaches seek a minimum of a cost function (error curve) over repeated training
iterations, also called epochs. In each epoch the training dataset is processed, the cost function is
calculated, and the weights and biases of the model are adjusted through backpropagation.
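The repeated 70%/30% evaluation described above can be sketched as follows. This is a minimal illustration, not the MATLAB implementation used in the study: a one-variable linear regression stands in for the actual models, the data are synthetic, and the function names and the fixed random seed are assumptions.

```python
import random

def fit_line(xs, ys):
    """Least squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / sxx
    return slope, mean_y - slope * mean_x

def rmse(observed, predicted):
    """Root mean square error between two equal-length sequences."""
    n = len(observed)
    return (sum((o - p) ** 2 for o, p in zip(observed, predicted)) / n) ** 0.5

def repeated_holdout(xs, ys, iterations=100, train_fraction=0.7, seed=0):
    """Average test RMSE over repeated, reshuffled train/test partitions."""
    rng = random.Random(seed)
    indices = list(range(len(xs)))
    n_train = int(round(train_fraction * len(xs)))
    scores = []
    for _ in range(iterations):
        rng.shuffle(indices)  # new partition each iteration
        train, test = indices[:n_train], indices[n_train:]
        slope, intercept = fit_line([xs[i] for i in train], [ys[i] for i in train])
        predicted = [slope * xs[i] + intercept for i in test]
        scores.append(rmse([ys[i] for i in test], predicted))
    return sum(scores) / len(scores)

# Synthetic, noise-free data: y = 2x + 1.
xs = [0.1 * i for i in range(20)]
ys = [2.0 * x + 1.0 for x in xs]
mean_rmse = repeated_holdout(xs, ys)
```

Reporting the mean over many partitions reduces the dependence of the score on any single lucky or unlucky split, which matters most when the dataset is small.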
To quantitatively evaluate the results of the models, the coefficient of determination (R2)
was calculated as:

$$R^2 = 1 - \frac{SS_{res}}{SS_{tot}},$$

where $SS_{res}$ is the residual sum of squares and $SS_{tot}$ is the total sum of squares (Draper
and Smith 1981). Mean absolute deviation (MAD) was calculated as:

$$MAD = \frac{\sum_{i=1}^{n} |y_i - \bar{y}_i|}{n},$$

representing the mean absolute deviation between the observed ($y_i$) and predicted ($\bar{y}_i$)
values. Mean square error (MSE) is the sum of the squared errors divided by the number of
observations, which is calculated as:

$$MSE = \frac{\sum_{i=1}^{n} (y_i - \bar{y}_i)^2}{n},$$

and root mean square error (RMSE) is simply the square root of MSE (Draper and Smith
1981). Mean absolute percentage error (MAPE) is the average of absolute errors divided by
actual observation values (Draper and Smith 1981), calculated as:

$$MAPE = \frac{100}{n} \sum_{i=1}^{n} \left| \frac{y_i - \bar{y}_i}{y_i} \right|.$$

The correlation coefficient (R), which is simply the square root of R2, is also used in this context
when making correlational comparisons between variables. These metrics were calculated based
on the independent testing dataset.
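The evaluation metrics can be written out directly from the definitions above. The sketch below is a plain-Python illustration with hypothetical observed and predicted values; MAD is implemented as the formula gives it, that is, as the average absolute difference.

```python
import math

def r_squared(observed, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot

def mad(observed, predicted):
    """Average absolute difference between observed and predicted values."""
    return sum(abs(o - p) for o, p in zip(observed, predicted)) / len(observed)

def mse(observed, predicted):
    """Mean square error."""
    return sum((o - p) ** 2 for o, p in zip(observed, predicted)) / len(observed)

def rmse(observed, predicted):
    """Root mean square error."""
    return math.sqrt(mse(observed, predicted))

def mape(observed, predicted):
    """Mean absolute percentage error; observed values must be non-zero."""
    n = len(observed)
    return 100.0 / n * sum(abs((o - p) / o) for o, p in zip(observed, predicted))

# Hypothetical observed and predicted values.
obs = [2.0, 4.0, 6.0, 8.0]
pred = [2.5, 3.5, 6.5, 7.5]
scores = {
    "R2": r_squared(obs, pred),
    "MAD": mad(obs, pred),
    "MSE": mse(obs, pred),
    "RMSE": rmse(obs, pred),
    "MAPE": mape(obs, pred),
}
```

With this definition, R2 becomes negative whenever a model predicts worse than simply using the mean of the observations, which is how negative R2 values can arise for poorly predicted parameters. MAPE grows very large when observed values are close to zero.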

6. Results
6.1 Model Comparison and Evaluation
Several competing methods and approaches from the literature were tested and evaluated
using mean R2, MAD, MSE, RMSE, and MAPE. All models were run for 100 iterations with
randomly partitioned data (70% training and 30% testing) to gauge the robustness of
each method. The models applied to the proximal dataset and their
corresponding validation metrics are displayed in Table 6 and Figure 7.
Overall, empirical methods yielded superior performance in the estimation of all water
quality parameters compared to other methods, with the highest R2 and lowest error values
(MAD, MSE, RMSE, and MAPE) when applied to the proximal dataset (Table 6 and Figure 7).
In particular, the deep learning-based empirical method DNN exhibited the best performance,
followed by LSTM, for the estimation of BGA-PC, Chl-a, TSS, turbidity, and TDS compared to the
PLSR- and SVR-based empirical methods. This is likely because deep learning often outperforms
shallow or traditional machine learning models such as PLSR and SVR, especially when dealing
with complex, nonlinear, and redundant datasets (LeCun et al. 2015; Zhang et al. 2018). The best
estimation for NO3-N was achieved by SVR, demonstrating the complex fitting properties of the
SVR model, even for non-linear data, through different kernel functions (Liu et al. 2017). PLSR
presented the poorest performance of the empirical methods, which may be caused
by its inability to characterize highly non-linear and complex relationships between variables
(Wang et al. 2018). Although empirical methods provide the best estimation results, further
research should be conducted to evaluate the transferability of the empirically based prediction
models, since limited transferability may restrict practical usage.
As shown in Table 6, due to methodological limitations, bio-optical inversion methods
were only applied to the estimation of Chl-a. Spectral inversion methods performed worse than
empirical methods but slightly better than band ratio and spectral shape methods, with R2
ranging from 0.51 to 0.84. Among the spectral inversion methods, the NN Inversion model
provided the best results with R2 of 0.84 and RMSE of 4.70, while the QAA method yielded the
lowest accuracy with R2 of 0.51 and RMSE of 7.42.
Spectral band ratios and spectral shape methods yielded relatively poor performance, with
lower R2 and higher error metrics (Table 6). This may be due to the simplicity of the models and
the limited number of bands involved compared to the empirical and bio-optical methods. Among
the band ratio and spectral shape models, the band ratio-based model NDCI provided the highest
accuracy for BGA-PC estimation with R2 of 0.68, and the spectral shape-based model RLH
presented the best performance for Chl-a with R2 of 0.72. The band ratio-based model NDTI
yielded the highest accuracies for TSS and turbidity estimation with R2 of 0.56 and 0.48,
respectively.
The estimation of BGA-PC, Chl-a, TSS, and turbidity achieved relatively higher
accuracies than the other parameters regardless of method (Table 6). However, the
estimation performance for NO3-N and TDS was very poor, with low R2 values and high error
metrics. In addition, spectral band ratio methods failed to predict TDS, which can be primarily
attributed to its non-optical nature. None of the methods succeeded in the estimation of PO4-P,
again likely due to the non-optical nature of PO4-P. Although accuracies were very low in all
cases, NO3-N and TDS did display positive correlations in the empirical SVR and DNN models,
with R2 between ~0.25 and 0.35.

Table 6. Results of the water quality modeling analysis when applied to the proximal dataset.
The proximal dataset contains in-situ water quality measurements with corresponding
spectrometer and/or hyperspectral data. Mean R2 and mean RMSE of each model are included to
evaluate overall model performance. The best performances for each parameter are indicated in
bold.

Model Metric Dataset BGA-PC Chl-a TSS fDOM Turbidity DO NO3-N PO4-P TDS
NDCI R2 Proximal 0.68 0.69 0.28 --- 0.29 --- 0.20 -0.11 0.12
NDCI MAD Proximal 0.42 3.56 9.95 --- 21.22 --- 0.21 0.22 111.63
NDCI MSE Proximal 0.61 38.15 228.97 --- 1208.91 --- 0.11 0.07 20127.83
NDCI RMSE Proximal 0.78 6.18 15.13 --- 34.77 --- 0.33 0.27 141.87
NDCI MAPE Proximal 11.97 16.46 36.51 --- 61.57 --- 236.38 317.54 57.38
FAI R2 Proximal 0.34 0.69 0.41 --- 0.27 --- 0.07 -0.07 0.08
FAI MAD Proximal 0.64 3.56 8.39 --- 21.44 --- 0.23 0.21 142.98
FAI MSE Proximal 1.31 38.15 165.48 --- 1254.01 --- 0.12 0.07 27910.86
FAI RMSE Proximal 1.14 6.18 12.86 --- 35.41 --- 0.35 0.26 167.07
FAI MAPE Proximal 21.91 16.46 33.95 --- 61.69 --- 280.48 316.92 73.08
CIA R2 Proximal 0.36 0.49 0.38 --- 0.18 --- 0.02 -0.12 0.06
CIA MAD Proximal 0.52 4.02 8.01 --- 20.94 --- 0.24 0.21 144.31
CIA MSE Proximal 0.71 55.08 128.16 --- 1494.97 --- 0.13 0.06 27860.88
CIA RMSE Proximal 0.84 7.42 11.32 --- 38.66 --- 0.36 0.25 166.92
CIA MAPE Proximal 15.43 16.98 30.03 --- 59.86 --- 390.93 318.99 74.80
NDTI R2 Proximal 0.27 0.39 0.56 --- 0.48 --- 0.06 -0.08 0.05
NDTI MAD Proximal 1.01 4.99 7.63 --- 17.76 --- 0.24 0.21 140.31
NDTI MSE Proximal 2.60 83.63 114.93 --- 941.42 --- 0.13 0.07 26696.88
NDTI RMSE Proximal 1.61 9.14 10.72 --- 30.68 --- 0.36 0.26 163.39
NDTI MAPE Proximal 33.28 19.18 29.52 --- 49.81 --- 301.79 316.98 72.89
FLH R2 Proximal 0.38 0.67 0.45 --- 0.31 --- 0.12 -0.11 0.02
FLH MAD Proximal 0.48 3.71 8.46 --- 19.44 --- 0.22 0.22 150.34
FLH MSE Proximal 0.64 40.30 134.36 --- 1121.06 --- 0.11 0.07 28603.80
FLH RMSE Proximal 0.80 6.35 11.59 --- 33.48 --- 0.33 0.27 169.13
FLH MAPE Proximal 14.35 16.86 32.15 --- 57.45 --- 241.30 317.52 78.10
RLH R2 Proximal 0.44 0.72 0.40 --- 0.33 --- 0.15 -0.01 0.09
RLH MAD Proximal 0.50 3.79 8.39 --- 18.86 --- 0.21 0.22 66.86
RLH MSE Proximal 0.66 44.07 165.48 --- 1089.72 --- 0.11 0.07 8588.27
RLH RMSE Proximal 0.81 6.64 12.86 --- 33.01 --- 0.33 0.26 92.67
RLH MAPE Proximal 14.97 16.34 33.95 --- 53.53 --- 240.77 316.04 32.17
MCI R2 Proximal 0.42 0.57 0.46 --- 0.26 --- 0.12 0.00 0.05
MCI MAD Proximal 0.47 3.64 8.40 --- 21.08 --- 0.36 0.23 76.86
MCI MSE Proximal 0.63 41.93 132.14 --- 1195.52 --- 0.27 0.07 10158.14
MCI RMSE Proximal 0.78 6.48 11.50 --- 34.58 --- 0.52 0.27 100.79
MCI MAPE Proximal 14.02 15.87 32.07 --- 60.77 --- 953.93 315.76 37.66
MPH R2 Proximal 0.39 0.66 0.35 --- 0.46 --- 0.13 -0.03 0.08
MPH MAD Proximal 0.51 3.52 9.01 --- 21.13 --- 0.35 0.23 74.33
MPH MSE Proximal 0.65 38.50 181.34 --- 1281.77 --- 0.27 0.07 10156.58
MPH RMSE Proximal 0.79 6.20 13.47 --- 35.80 --- 0.52 0.27 99.44
MPH MAPE Proximal 14.31 16.16 35.48 --- 57.92 --- 938.39 317.59 37.05
PLSR R2 Proximal 0.68 0.76 0.73 --- 0.41 --- 0.26 0.02 0.17
PLSR MAD Proximal 0.30 3.52 3.40 --- 21.13 --- 0.20 0.21 48.75
PLSR MSE Proximal 0.26 38.87 35.41 --- 1281.77 --- 0.10 0.06 4466.92
PLSR RMSE Proximal 0.51 6.23 5.95 --- 35.80 --- 0.32 0.25 66.84
PLSR MAPE Proximal 10.20 15.79 11.71 --- 57.92 --- 230.19 312.69 28.30
SVR R2 Proximal 0.77 0.78 0.81 --- 0.55 --- 0.29 0.00 0.33
SVR MAD Proximal 0.25 3.33 2.36 --- 16.40 --- 0.20 0.21 35.15
SVR MSE Proximal 0.20 36.80 21.43 --- 875.26 --- 0.10 0.06 2621.52
SVR RMSE Proximal 0.45 6.07 4.63 --- 29.58 --- 0.31 0.25 51.20
SVR MAPE Proximal 7.78 15.05 7.54 --- 44.18 --- 230.91 315.57 20.40
DNN R2 Proximal 0.82 0.85 0.84 --- 0.65 --- 0.29 0.00 0.36
DNN MAD Proximal 0.23 0.93 2.04 --- 14.11 --- 0.20 0.21 31.93
DNN MSE Proximal 0.19 5.35 13.93 --- 694.97 --- 0.10 0.06 2370.04
DNN RMSE Proximal 0.43 2.31 3.73 --- 26.36 --- 0.31 0.25 48.68
DNN MAPE Proximal 7.64 5.56 6.89 --- 34.96 --- 230.58 299.75 18.98
LSTM R2 Proximal 0.74 0.76 0.77 --- 0.51 --- 0.22 0.00 0.30
LSTM MAD Proximal 0.29 3.45 2.72 --- 17.76 --- 0.20 0.21 36.75
LSTM MSE Proximal 0.25 38.11 26.52 --- 941.42 --- 0.10 0.06 2786.48
LSTM RMSE Proximal 0.50 6.17 5.15 --- 30.68 --- 0.32 0.25 52.79
LSTM MAPE Proximal 9.94 15.44 9.24 --- 49.81 --- 232.29 317.25 21.32
OCLMI R2 Proximal --- 0.67 --- --- --- --- --- --- ---
OCLMI MAD Proximal --- 3.95 --- --- --- --- --- --- ---
OCLMI MSE Proximal --- 45.79 --- --- --- --- --- --- ---
OCLMI RMSE Proximal --- 6.77 --- --- --- --- --- --- ---
OCLMI MAPE Proximal --- 17.17 --- --- --- --- --- --- ---
QAA R2 Proximal --- 0.51 --- --- --- --- --- --- ---
QAA MAD Proximal --- 4.02 --- --- --- --- --- --- ---
QAA MSE Proximal --- 55.08 --- --- --- --- --- --- ---
QAA RMSE Proximal --- 7.42 --- --- --- --- --- --- ---
QAA MAPE Proximal --- 16.98 --- --- --- --- --- --- ---
2SeaColor R2 Proximal --- 0.71 --- --- --- --- --- --- ---
2SeaColor MAD Proximal --- 3.52 --- --- --- --- --- --- ---
2SeaColor MSE Proximal --- 38.87 --- --- --- --- --- --- ---
2SeaColor RMSE Proximal --- 6.23 --- --- --- --- --- --- ---
2SeaColor MAPE Proximal --- 15.79 --- --- --- --- --- --- ---
WASI R2 Proximal --- 0.73 --- --- --- --- --- --- ---
WASI MAD Proximal --- 3.79 --- --- --- --- --- --- ---
WASI MSE Proximal --- 44.07 --- --- --- --- --- --- ---
WASI RMSE Proximal --- 6.64 --- --- --- --- --- --- ---
WASI MAPE Proximal --- 16.34 --- --- --- --- --- --- ---
NN Inversion R2 Proximal --- 0.84 --- --- --- --- --- --- ---
NN Inversion MAD Proximal --- 2.70 --- --- --- --- --- --- ---
NN Inversion MSE Proximal --- 22.11 --- --- --- --- --- --- ---
NN Inversion RMSE Proximal --- 4.70 --- --- --- --- --- --- ---
NN Inversion MAPE Proximal --- 12.70 --- --- --- --- --- --- ---

Figure 7. Plotted modeling results when applied to the proximal dataset. The bio-optical methods
(i.e. OCLMI, QAA, 2SeaColor, WASI, and NN Inversion) were only applied to estimate Chl-a
due to methodological limitations. Empirical and bio-optical methods generated the best overall
results, with DNN models exceeding all others tested except in the case of NO3-N, where SVR
performed best.

The results of applying the different methods to the satellite dataset are shown
in Table 7 and Figure 8. Due to the coarse spectral and spatial resolution of the satellite dataset,
as well as methodological limitations, only spectral band ratio methods, empirical methods, and
two bio-optical inversion methods were employed. Overall, empirical methods, especially DNN,
outperformed other methods in estimating BGA-PC, TSS, turbidity, and DO, with higher R2 and
lower MAD, MSE, RMSE, and MAPE. The SVR-based empirical method presented the highest
accuracy for fDOM estimation with R2 of 0.59 and RMSE of 5.81. Due to the small sample size
of the satellite dataset, there may be some degree of overfitting in the results of the
empirical methods. Bio-optical inversion methods also generated reasonable estimation accuracy
for Chl-a, with NN Inversion yielding the best performance (R2 of 0.79 and RMSE of 2.02).
Spectral band ratio methods exhibited the poorest performance, with lower R2
and higher MAD, MSE, RMSE, and MAPE values than the aforementioned methods.

Table 7. Results of the water quality modeling analysis when applied to the satellite dataset. The
satellite dataset contains in-situ water quality measurements with corresponding Landsat-8
and/or Sentinel-2 data. Mean R2 and mean RMSE of each model when applied to the testing
dataset are included to evaluate overall model performance. The best performances for each
parameter are indicated in bold.

Model Metric Dataset BGA-PC Chl-a TSS fDOM Turbidity DO NO3-N PO4-P TDS
NDCI R2 Satellite 0.07 0.11 0.40 0.18 0.08 0.12 --- --- ---
NDCI MAD Satellite 1.53 8.54 8.39 12.74 14.80 3.18 --- --- ---
NDCI MSE Satellite 4.32 110.63 106.95 264.98 358.21 13.48 --- --- ---
NDCI RMSE Satellite 2.08 10.52 10.34 16.28 18.93 3.67 --- --- ---
NDCI MAPE Satellite 106.17 74.46 31.88 25.41 78.90 53.67 --- --- ---
FAI R2 Satellite 0.29 0.37 0.37 0.12 0.04 0.04 --- --- ---
FAI MAD Satellite 0.93 3.94 8.92 9.12 15.14 3.51 --- --- ---
FAI MSE Satellite 1.54 26.80 119.88 172.33 366.90 16.94 --- --- ---
FAI RMSE Satellite 1.24 5.18 10.95 13.13 19.15 4.12 --- --- ---
FAI MAPE Satellite 79.31 36.80 34.60 18.66 81.22 58.74 --- --- ---
CIA R2 Satellite 0.16 0.18 0.35 0.15 0.04 0.13 --- --- ---
CIA MAD Satellite 1.07 7.87 9.15 8.13 15.21 3.18 --- --- ---
CIA MSE Satellite 1.97 95.17 126.15 133.07 368.08 13.48 --- --- ---
CIA RMSE Satellite 1.40 9.76 11.23 11.54 19.19 3.67 --- --- ---
CIA MAPE Satellite 82.48 59.13 35.81 16.75 81.99 53.67 --- --- ---
NDTI R2 Satellite 0.19 0.07 0.58 0.07 0.38 -0.15 --- --- ---
NDTI MAD Satellite 1.12 8.54 7.59 10.45 5.33 4.48 --- --- ---
NDTI MSE Satellite 2.14 110.63 80.88 198.96 57.12 29.49 --- --- ---
NDTI RMSE Satellite 1.46 10.52 8.99 14.11 7.56 5.43 --- --- ---
NDTI MAPE Satellite 84.32 74.46 29.95 21.30 28.46 75.66 --- --- ---
FLH R2 Satellite --- --- --- --- --- --- --- --- ---
FLH MAD Satellite --- --- --- --- --- --- --- --- ---
FLH MSE Satellite --- --- --- --- --- --- --- --- ---
FLH RMSE Satellite --- --- --- --- --- --- --- --- ---
FLH MAPE Satellite --- --- --- --- --- --- --- --- ---
RLH R2 Satellite --- --- --- --- --- --- --- --- ---
RLH MAD Satellite --- --- --- --- --- --- --- --- ---
RLH MSE Satellite --- --- --- --- --- --- --- --- ---
RLH RMSE Satellite --- --- --- --- --- --- --- --- ---
RLH MAPE Satellite --- --- --- --- --- --- --- --- ---
MCI R2 Satellite --- --- --- --- --- --- --- --- ---
MCI MAD Satellite --- --- --- --- --- --- --- --- ---
MCI MSE Satellite --- --- --- --- --- --- --- --- ---
MCI RMSE Satellite --- --- --- --- --- --- --- --- ---
MCI MAPE Satellite --- --- --- --- --- --- --- --- ---
MPH R2 Satellite --- --- --- --- --- --- --- --- ---
MPH MAD Satellite --- --- --- --- --- --- --- --- ---
MPH MSE Satellite --- --- --- --- --- --- --- --- ---
MPH RMSE Satellite --- --- --- --- --- --- --- --- ---
MPH MAPE Satellite --- --- --- --- --- --- --- --- ---
PLSR R2 Satellite 0.48 0.47 0.64 0.46 0.39 0.09 --- --- ---
PLSR MAD Satellite 0.80 3.94 6.85 5.06 5.26 3.18 --- --- ---
PLSR MSE Satellite 1.29 26.80 66.32 52.55 54.86 13.48 --- --- ---
PLSR RMSE Satellite 1.14 5.18 8.14 7.25 7.41 3.67 --- --- ---
PLSR MAPE Satellite 74.30 36.80 28.12 10.55 28.28 53.67 --- --- ---
SVR R2 Satellite 0.53 0.48 0.76 0.59 0.44 0.21 --- --- ---
SVR MAD Satellite 0.79 3.54 4.87 4.13 5.06 3.17 --- --- ---
SVR MSE Satellite 1.07 21.21 41.59 33.81 53.66 13.47 --- --- ---
SVR RMSE Satellite 1.04 4.61 6.51 5.81 7.33 3.67 --- --- ---
SVR MAPE Satellite 76.81 29.40 18.68 8.49 26.20 53.59 --- --- ---
DNN R2 Satellite 0.68 0.75 0.83 0.58 0.51 0.26 --- --- ---
DNN MAD Satellite 0.65 2.25 4.82 4.20 4.79 2.55 --- --- ---
DNN MSE Satellite 0.72 7.95 41.54 35.02 45.96 9.43 --- --- ---
DNN RMSE Satellite 0.85 2.82 6.45 5.92 6.78 3.07 --- --- ---
DNN MAPE Satellite 73.75 15.46 18.65 8.65 25.45 43.95 --- --- ---
LSTM R2 Satellite 0.64 0.70 0.81 0.49 0.42 0.21 --- --- ---
LSTM MAD Satellite 0.69 2.28 5.28 5.20 5.06 3.14 --- --- ---
LSTM MSE Satellite 0.80 7.98 46.36 46.28 53.66 13.19 --- --- ---
LSTM RMSE Satellite 0.89 2.89 6.81 6.80 7.33 3.63 --- --- ---
LSTM MAPE Satellite 73.02 15.86 21.61 10.70 26.20 53.00 --- --- ---
OCLMI R2 Satellite --- --- --- --- --- --- --- --- ---
OCLMI MAD Satellite --- --- --- --- --- --- --- --- ---
OCLMI MSE Satellite --- --- --- --- --- --- --- --- ---
OCLMI RMSE Satellite --- --- --- --- --- --- --- --- ---
OCLMI MAPE Satellite --- --- --- --- --- --- --- --- ---
QAA R2 Satellite --- --- --- --- --- --- --- --- ---
QAA MAD Satellite --- --- --- --- --- --- --- --- ---
QAA MSE Satellite --- --- --- --- --- --- --- --- ---
QAA RMSE Satellite --- --- --- --- --- --- --- --- ---
QAA MAPE Satellite --- --- --- --- --- --- --- --- ---
2SeaColor R2 Satellite --- --- --- --- --- --- --- --- ---
2SeaColor MAD Satellite --- --- --- --- --- --- --- --- ---
2SeaColor MSE Satellite --- --- --- --- --- --- --- --- ---
2SeaColor RMSE Satellite --- --- --- --- --- --- --- --- ---
2SeaColor MAPE Satellite --- --- --- --- --- --- --- --- ---
WASI R2 Satellite --- 0.54 --- --- --- --- --- --- ---
WASI MAD Satellite --- 3.59 --- --- --- --- --- --- ---
WASI MSE Satellite --- 23.04 --- --- --- --- --- --- ---
WASI RMSE Satellite --- 4.80 --- --- --- --- --- --- ---
WASI MAPE Satellite --- 24.61 --- --- --- --- --- --- ---
NN Inversion R2 Satellite --- 0.79 --- --- --- --- --- --- ---
NN Inversion MAD Satellite --- 1.52 --- --- --- --- --- --- ---
NN Inversion MSE Satellite --- 4.06 --- --- --- --- --- --- ---
NN Inversion RMSE Satellite --- 2.02 --- --- --- --- --- --- ---
NN Inversion MAPE Satellite --- 11.66 --- --- --- --- --- --- ---

Figure 8. Plotted modeling results when applied to the satellite dataset. The spectral shape
methods (i.e. FLH, RLH, MCI, and MPH) along with OCLMI, QAA, and 2SeaColor were not
applied to the satellite dataset due to unavailable spectral bands. The WASI and NN Inversion
methods were only applied to Chl-a due to methodological limitations. Empirical and bio-optical
methods outperformed spectral ratios in all cases. DNN generated the best models for BGA-PC,
TSS, turbidity, and DO, SVR for fDOM, and NN Inversion for Chl-a.
In addition to evaluating different models using the proximal and satellite datasets, the
proximal dataset was also resampled to Landsat-8 bands, with corresponding in-situ water quality
measurements, for model assessment. The simulated Landsat-8 estimation results are
displayed in Table 8 and Figure 9. As with the proximal and satellite datasets, DNN-based
methods outperformed the other methods in estimating BGA-PC, Chl-a,
TSS, turbidity, NO3-N and PO4-P, with higher R2 and lower MAD, MSE, RMSE, and MAPE
values. DNN also provided the highest accuracy for TDS estimation, with an R2 of 0.34 and an RMSE of
48.84. Bio-optical inversion methods exhibited slightly lower accuracies than DNN but performed
better than spectral band ratio methods. Consistent with the proximal dataset-based
estimation, all models showed relatively stronger estimation capability for BGA-PC, Chl-a,
TSS and turbidity (Table 8), while the estimation performance for NO3-N and TDS was very
poor, with low R2 and high error metrics. In particular, all models yielded negative or near-zero R2
values and high MAD, MSE, RMSE, and MAPE for PO4-P estimation, indicating poor model
performance, which is likely due primarily to the non-optical nature of PO4-P.
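
The five accuracy metrics reported in the tables can be reproduced with a few lines of code. The following is a minimal sketch, assuming that R2 is the coefficient of determination (1 minus the ratio of residual to total sum of squares, which is the definition that permits the negative values reported for PO4-P) and that MAD denotes the mean absolute deviation of the residuals. The function name and the sample values are hypothetical and are not taken from the study.

```python
import math

def regression_metrics(observed, predicted):
    """Return R2, MAD, MSE, RMSE and MAPE for paired observations and predictions."""
    n = len(observed)
    residuals = [o - p for o, p in zip(observed, predicted)]
    mean_obs = sum(observed) / n
    ss_res = sum(r * r for r in residuals)                 # residual sum of squares
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)    # total sum of squares
    mse = ss_res / n
    return {
        "R2": 1.0 - ss_res / ss_tot,                       # negative when worse than the mean
        "MAD": sum(abs(r) for r in residuals) / n,         # assumed: mean absolute deviation
        "MSE": mse,
        "RMSE": math.sqrt(mse),
        "MAPE": 100.0 * sum(abs(r / o) for r, o in zip(residuals, observed)) / n,
    }

# Hypothetical concentrations and a deliberately poor set of predictions.
observed = [0.10, 0.20, 0.30, 0.40]
poor_model = [0.40, 0.10, 0.45, 0.15]
metrics = regression_metrics(observed, poor_model)
print(metrics)
```

With these toy values the residual sum of squares (0.185) exceeds the total sum of squares (0.05), so R2 is about -2.7: the model predicts worse than simply using the mean of the observations, which is how a negative R2 should be read.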

Table 8. Results of the water quality modeling analysis when applied to the resampled dataset.
This dataset represents the proximal data resampled to Landsat-8 with corresponding in-situ
water quality measurements. Mean R2 and mean RMSE of each model when applied to the
testing dataset are included to evaluate overall model performance. The best performances for
each parameter are indicated in bold.

Dataset BGA-PC Chl-a TSS fDOM Turbidity DO NO3-N PO4-P TDS
R2 0.65 0.64 0.23 --- 0.25 --- 0.18 -0.11 0.13
MAD 0.38 3.56 10.17 --- 21.94 --- 0.21 0.22 112.56
NDCI MSE Resampled 0.59 38.15 247.22 --- 1284.46 --- 0.11 0.07 20335.11
RMSE 0.77 6.18 15.72 --- 35.84 --- 0.33 0.27 142.60
MAPE 10.77 16.46 37.52 --- 65.57 --- 237.69 289.55 57.83
R2 0.31 0.43 0.32 --- 0.25 --- 0.06 -0.04 0.07
MAD 0.69 3.87 9.01 --- 21.78 --- 0.23 0.21 142.31
FAI MSE Resampled 1.41 54.04 188.20 --- 1270.82 --- 0.13 0.06 27630.43
RMSE 1.19 7.35 13.72 --- 35.65 --- 0.35 0.25 166.22
MAPE 23.28 16.23 35.51 --- 64.56 --- 304.75 276.51 72.78
R2 0.38 0.49 0.36 --- 0.12 --- 0.05 -0.10 0.06
MAD 0.51 4.02 9.13 --- 22.15 --- 0.25 0.21 140.05
CIA MSE Resampled 0.69 55.08 164.39 --- 1577.05 --- 0.14 0.06 25968.48
RMSE 0.83 7.42 12.82 --- 39.71 --- 0.37 0.25 161.15
MAPE 15.27 16.98 34.59 --- 64.19 --- 391.64 281.09 73.70
R2 0.35 0.41 0.57 --- 0.49 --- 0.03 0.00 0.04
MAD 0.81 4.99 7.63 --- 17.58 --- 0.24 0.23 132.98
NDTI MSE Resampled 1.65 83.63 114.93 --- 940.71 --- 0.13 0.07 24133.88
RMSE 1.28 9.14 10.72 --- 30.67 --- 0.36 0.27 155.35
MAPE 25.71 19.18 29.52 --- 48.37 --- 302.11 264.67 69.72
R2 --- --- --- --- --- --- --- --- ---


MAD --- --- --- --- --- --- --- --- ---
FLH MSE Resampled --- --- --- --- --- --- --- --- ---
RMSE --- --- --- --- --- --- --- --- ---
MAPE --- --- --- --- --- --- --- --- ---
R2 --- --- --- --- --- --- --- --- ---
MAD --- --- --- --- --- --- --- --- ---
RLH MSE Resampled --- --- --- --- --- --- --- --- ---
RMSE --- --- --- --- --- --- --- --- ---
MAPE --- --- --- --- --- --- --- --- ---
R2 --- --- --- --- --- --- --- --- ---
MAD --- --- --- --- --- --- --- --- ---

MCI MSE Resampled --- --- --- --- --- --- --- --- ---
RMSE --- --- --- --- --- --- --- --- ---

MAPE --- --- --- --- --- --- --- --- ---
R2 --- --- --- --- --- --- --- --- ---

MAD --- --- --- --- --- --- --- --- ---
MPH MSE Resampled --- --- --- --- --- --- --- --- ---
RMSE --- --- --- --- --- --- --- --- ---
MAPE --- --- --- --- --- --- --- --- ---
R2 0.62 0.74 0.73 --- 0.38 --- 0.16 -0.02 0.17
MAD 0.45 3.52 3.40 --- 21.91 --- 0.20 0.21 49.41
PLSR MSE Resampled 0.62 38.87 35.41 --- 1354.97 --- 0.11 0.06 4673.91
RMSE 0.79 6.23 5.95 --- 36.81 --- 0.33 0.25 68.37
MAPE 14.18 15.79 11.71 --- 58.79 --- 230.72 277.84 28.61
R2 0.73 0.71 0.81 --- 0.52 --- 0.21 0.01 0.23
MAD 0.35 3.33 2.36 --- 16.40 --- 0.21 0.23 39.15
SVR MSE Resampled 0.34 36.80 21.43 --- 875.26 --- 0.10 0.08 3466.64
RMSE 0.59 6.07 4.63 --- 29.58 --- 0.32 0.29 58.88
MAPE 10.46 15.05 7.54 --- 44.18 --- 234.19 261.54 22.05
R2 0.81 0.80 0.84 --- 0.62 --- 0.25 0.00 0.34
MAD 0.31 1.06 2.04 --- 14.83 --- 0.20 0.20 32.66
DNN MSE Resampled 0.29 7.55 13.93 --- 764.86 --- 0.10 0.06 2384.97
RMSE 0.54 2.75 3.73 --- 27.66 --- 0.31 0.25 48.84
MAPE 9.32 6.22 6.89 --- 35.74 --- 232.55 264.80 19.31
R2 0.71 0.73 0.82 --- 0.57 --- 0.23 0.00 0.25
MAD 0.38 3.45 2.12 --- 16.12 --- 0.21 0.22 39.41
LSTM MSE Resampled 0.40 38.11 22.33 --- 834.89 --- 0.11 0.07 3587.76
RMSE 0.63 6.17 4.73 --- 28.89 --- 0.32 0.26 59.90
MAPE 12.54 15.44 6.52 --- 48.01 --- 237.29 263.57 22.54
R2 --- --- --- --- --- --- --- --- ---
MAD --- --- --- --- --- --- --- --- ---
OCLMI MSE Resampled --- --- --- --- --- --- --- --- ---
RMSE --- --- --- --- --- --- --- --- ---
MAPE --- --- --- --- --- --- --- --- ---
R2 --- --- --- --- --- --- --- --- ---
MAD --- --- --- --- --- --- --- --- ---
QAA MSE Resampled --- --- --- --- --- --- --- --- ---
RMSE --- --- --- --- --- --- --- --- ---
MAPE --- --- --- --- --- --- --- --- ---
R2 --- --- --- --- --- --- --- --- ---
MAD --- --- --- --- --- --- --- --- ---
2SeaColor MSE Resampled --- --- --- --- --- --- --- --- ---
RMSE --- --- --- --- --- --- --- --- ---
MAPE --- --- --- --- --- --- --- --- ---

R2 --- 0.75 --- --- --- --- --- --- ---
MAD --- 3.79 --- --- --- --- --- --- ---

WASI MSE Resampled --- 44.07 --- --- --- --- --- --- ---
RMSE --- 6.64 --- --- --- --- --- --- ---
MAPE --- 16.34 --- --- --- --- --- --- ---
R2 --- 0.77 --- --- --- --- --- --- ---
MAD --- 3.09 --- --- --- --- --- --- ---
NN Inversion MSE Resampled --- 30.70 --- --- --- --- --- --- ---
RMSE --- 5.54 --- --- --- --- --- --- ---
MAPE --- 13.75 --- --- --- --- --- --- ---

Figure 9. Plotted modeling results when applied to the resampled dataset. The spectral shape
methods (i.e. FLH, RLH, MCI, and MPH) along with OCLMI, QAA, and 2SeaColor were not
applied to the resampled dataset due to unavailable spectral bands. The WASI and NN Inversion
methods were only applied to Chl-a due to methodological limitations. Empirical models
generated the best results in all cases. DNN outperformed all other models for BGA-PC, Chl-a,
TSS, turbidity, NO3-N, and PO4-P while SVR resulted in the best model for TDS.

6.2 Influence of spectral domain used for estimating water quality variables
The influence of individual spectral bands was determined by calculating the selectivity
ratio of each band when applied to the empirical methods. The selectivity ratio is the ratio
between the explained and residual variance of the individual spectral variables on the projected
target data (Rajalahti et al. 2009). The selectivity ratio is calculated as:

$$SR_i = \frac{V_{exp,i}}{V_{res,i}}, \quad i = 1, 2, 3, \ldots \qquad (3)$$

where $V_{exp,i}$ is the variance explained by spectral band $i$ and $V_{res,i}$ is the residual variance
for that band in the model of the parameter being estimated. Figure 10 shows the results of the
selectivity ratio analysis when applied to the proximal dataset. The values generated from the
selectivity ratio analysis are relative and indicate feature importance in the model.
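
As an illustration of Eq. (3), the following minimal sketch computes selectivity ratios by target projection, the approach described by Rajalahti et al. (2009). It assumes that a regression vector `b` is already available from a fitted model (for example, PLSR coefficients). The function name, the three-band toy spectra, and the coefficients are hypothetical and are not taken from this study.

```python
def selectivity_ratios(spectra, b):
    """Selectivity ratio per band via target projection.

    spectra: list of samples, each a list of band reflectances.
    b: regression vector from a fitted model (assumed to be given).
    """
    n, m = len(spectra), len(b)
    # Column-center the spectra so variation is measured about the band means.
    means = [sum(row[i] for row in spectra) / n for i in range(m)]
    X = [[row[i] - means[i] for i in range(m)] for row in spectra]

    # Target-projected scores: t = X b / ||b||
    norm_b = sum(v * v for v in b) ** 0.5
    t = [sum(x * v for x, v in zip(row, b)) / norm_b for row in X]
    tt = sum(v * v for v in t)

    ratios = []
    for i in range(m):
        p_i = sum(t[k] * X[k][i] for k in range(n)) / tt             # loading of band i
        v_exp = sum((t[k] * p_i) ** 2 for k in range(n))             # explained sum of squares
        v_res = sum((X[k][i] - t[k] * p_i) ** 2 for k in range(n))   # residual sum of squares
        ratios.append(v_exp / v_res)
    return ratios

# Hypothetical data: band 0 tracks the target closely, band 2 is mostly noise.
toy_spectra = [
    [1.0, 0.5, 0.3],
    [2.0, 0.7, 0.1],
    [3.0, 0.4, 0.4],
    [4.0, 0.9, 0.2],
    [5.0, 0.6, 0.5],
]
toy_b = [1.0, 0.2, 0.0]   # hypothetical regression coefficients
ratios = selectivity_ratios(toy_spectra, toy_b)
print(ratios)
```

Because the explained and residual terms share the same divisor, sums of squares are used in place of variances; the ratio is unchanged. Bands whose variation is mostly captured by the target-projected component receive a high ratio, which is why the values are read as relative feature importance.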

Figure 10. Plotted results of the selectivity ratio analysis when applied to the proximal dataset.
The selectivity ratio shown in orange is a relative measure, which is scaled between 0 and 5 by
multiplying the original values by 300. Black solid line represents mean reflectance values.

Results for BGA-PC indicate that regions near 470 nm, 725 nm, and 860 nm had the
greatest impact on predictive modeling. This appeared to align with the strong absorption feature
at ~730 nm. Regions deemed significant for Chl-a primarily aligned with well-known spectral
properties. The Chl-a pigment absorbs light strongly at 442 nm (primary) and 670 nm
(secondary), has a reflectance peak around 700 nm due to the minimum sum of absorption of
phytoplankton, particulate and dissolved matter and water (Gitelson et al. 1993), and has a
fluorescence emission band at 681 nm (Gower and Borstad 1981). The TSS results indicated that
the narrow region near the reflectance peak at 675 nm, 800 nm, and 875 nm had the greatest
impact on modeling. This aligns well with other research identifying the red and NIR regions as
key predictors of suspended sediment (Gitelson et al. 1993; Petus et al. 2010). Turbidity
displayed key regions across the spectrum, many overlapping with BGA-PC, Chl-a, and TSS, but
had the highest values for the region near 460 nm. NO3-N results showed several unique regions,
mainly at the absorption features near 610 nm and 725 nm. The absorption feature at 640 nm
generally corresponded to TSS and PO4-P, while the trough at 670 nm aligned loosely with Chl-a,
BGA-PC, and TSS. While results for NO3-N indicated distinct regions, results for PO4-P
displayed low significance for several regions, including 485 nm, 650 nm, and 825 nm. This was
reflected in the poor modeling results generated for the parameter and helps explain the difficulty
of predicting non-optical variables. Overall, the variables with narrow band regions deemed
important yielded the highest overall modeling accuracies (i.e. TSS, Chl-a, and BGA-PC).

7. Discussion

7.1 Difference in performances among methods

As seen in the results of the water quality estimation (Tables 6, 7, and 8), the spectral
methods, consisting of selected band ratio and spectral shape algorithms, generated the lowest
overall correlations. Figure 11 shows the best model results for each respective water quality
parameter, grouped by methodological approach. The best models in all cases were generated
by the empirical approach. In the case of Chl-a, the NN Inversion (bio-optical) model was nearly
identical to the DNN-generated model.



Figure 11. The best model results for each respective water quality parameter, across all datasets,
and grouped by methodological approach. The bio-optical models were only applied to Chl-a due
to methodological limitations. In nearly all cases empirical methods generated the best results
followed closely by bio-optical models in the case of Chl-a.

There have been several studies indicating that spectral indices are prone to interference
from organic matter, phytoplankton, turbidity, and suspended sediment concentrations based on
the status of the given water body (IOCCG 2006; Matthews 2011). Spectral band ratios and
shape algorithms are popular approaches as they are generalizable and can be easily applied
across wide geographic ranges. Although easy to implement, they generally lack accuracy for
inland applications as they assume constant water and atmospheric conditions. This can cause
significant error in estimations, especially when applied across a time series. For these reasons,
spectral band ratios and spectral shape methods are better suited to gauge and identify spatial
distributions of water quality parameters rather than make exact estimations.
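
The sensitivity of band ratios to changing conditions can be shown with a small numerical example. The sketch below uses a generic normalized-difference index; NDCI is commonly written with reflectance near 708 nm and 665 nm, and that band pairing is assumed here. The reflectance values and the additive offset, which stands in for an uncorrected atmospheric or glint contribution, are hypothetical.

```python
def normalized_difference(band_a, band_b):
    """Return (band_a - band_b) / (band_a + band_b) for two reflectance values."""
    return (band_a - band_b) / (band_a + band_b)

# Hypothetical water-leaving reflectances near 665 nm and 708 nm.
r665, r708 = 0.020, 0.030
ndci_clean = normalized_difference(r708, r665)

# The same water with a constant 0.01 added to both bands
# (a stand-in for residual atmospheric or surface-glint signal).
offset = 0.010
ndci_offset = normalized_difference(r708 + offset, r665 + offset)

print(ndci_clean, ndci_offset)
```

The index falls from 0.20 to about 0.14 even though the water itself is unchanged, because the offset cancels in the numerator but not in the denominator. A fixed index-to-concentration relationship would therefore return different estimates for the same water on different dates.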

The bio-optical models tested in this paper generally proved to be more robust than
spectral methods and slightly less accurate than the empirical models. The NN Inversion method
proved to be the most robust bio-optical model tested. This can likely be attributed to the large
range of IOPs on which the NN-based model is built and the robust atmospheric correction
routine used for satellite data. Apart from the WASI method when applied to the proximal and
resampled datasets, all other models generated low overall accuracies. There are several concerns
related to bio-optical inversion methods, primarily due to their sensitivity to errors in atmospheric
correction procedures and the additive nature of the IOPs (Lee et al. 2002). For example, QAA
inversion and similar approaches were found to be highly affected by bottom reflectance
(Doerffer and Schiller 2007), making the method less accurate for applications in shallow
inland waters. The uncertainty in determining IOPs must factor in both the uncertainty in pigment
absorption and the values of the other absorbing and scattering components in water. This is also
known as the masking effect (Brockmann et al. 2016). Ideally, these methods require an in-depth
understanding of the given waterbody and local or regional IOPs, which are often key parameters
in the inversion equation. A great deal of in-situ data on IOPs is required to parameterize
bio-optical models for regional generalization (Matthews 2011). In this study, many, if not
all, of the modeling parameters used default inland water values, as no supplemental data were
gathered during field measurements. This can cause significant errors in modeling even though the
models are based on sound theory. Issues such as these make bio-optical approaches complex,
limit their overall effectiveness, and represent significant obstacles to implementation.
In this study, empirical methods outperformed spectral methods and bio-optical
inversions, proving to be robust for estimating a wide range of variables. The deep learning
methods (DNN and LSTM) showed promising results by generating high overall model
accuracies. Deep learning methods have proven robust in studies throughout the field
of remote sensing and can simulate the complex non-linear influence of various parameters on
the water-leaving reflectance. Other studies, such as Brewin et al. (2015), showed similar results,
indicating that empirical methods tend to outperform a range of bio-optical and the more simplistic
spectral ratio/shape methods when applied to regional datasets. Empirical methods also have the
benefit of being easy to implement and straightforward for data processing. However, since the
models are derived from data that cannot fully represent all natural variations, the performance
of such algorithms is inherently subject to the water conditions and variable ranges from which
the data were obtained for algorithm development (IOCCG 2006). The dataset used for training
empirical models is of key importance and must be carefully considered based on the application.
7.2 Prospects of indirect estimation of non-optically active parameters
In several recent studies, there have been attempts at estimating concentrations of
non-optically active parameters in freshwater. Generally, these attempts have used indirect means to
estimate spatial and temporal distributions of nutrients; however, finding significant correlations
of optically active variables and their spectral signal with nutrients remains a major
challenge (Wu et al. 2010a). Several studies indicated a relatively strong correlation of
Chl-a, CDOM, and suspended sediment with N and P concentrations (Chang et al. 2012;
Domagalski et al. 2007; Heiskary and Wilson 2005). Therefore, by knowing the spectral
characteristics of several key water quality parameters, empirical correlations can be found
between these variables and nutrients. He et al. (2008) established empirical models for Guanting
Reservoir in China and analyzed eight water quality variables of common concern, namely
algae content, turbidity, and concentrations of chemical oxygen demand (COD), total nitrogen
(TN), ammonia nitrogen (NH3-N), NO3-N, total phosphorus (TP), and dissolved phosphorus
(DP), by using Landsat-5 TM data. Results demonstrated a statistically significant correlation
(10–30% mean relative error) between all estimated parameters and reflectance regression
algorithms. Similarly, Song et al. (2012a) utilized airborne hyperspectral imagery to empirically
estimate total phosphorus (TP) through the associated variables Chl-a, TSM, and Secchi disk
transparency (SD) for three reservoirs in Indiana, US. The results indicated that TP has a close
association with diagnostic spectral variables, with R2 ranging from 0.55 to 0.72 depending on the
reservoir.

Although correlations among optically active and non-optically active variables have
been reported in several other studies (He et al. 2008; Wu et al. 2010b), this was not
demonstrated in this study’s data. Figure 12 shows the correlations among DO, NO3-N, PO4-P,
TDS, and optically active constituents.

Figure 12. Scatter plots displaying the relationship between non-optically active and optically
active water constituents. The dashed red line indicates the linear best fit relationship between
the two variables. All available data from both the proximal and satellite datasets are represented
in the plots. As displayed in the plots, trends between optical and non-optical variables were weak
except for DO vs. Chl-a and NO3-N vs. BGA-PC.

As seen in Figure 12, correlations among optically active and non-optically active variables
are generally very low; the strongest correlations were seen between DO and
fDOM (r = -0.56), DO and Chl-a (r = 0.53), NO3-N and BGA-PC (r = -0.52), and PO4-P and
BGA-PC (r = 0.39). Within an empirical approach, machine learning and deep learning methods
such as those discussed in this paper serve as an ideal solution by enabling higher-level and
non-linear statistical relationships to be uncovered (Peterson et al. 2019) and by linking key variables to
non-optically active parameters. Future empirical research should further explore and
carefully validate the use of deep learning methods and CNNs for water quality estimation.
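
For reference, the r values quoted above are Pearson correlation coefficients, which measure only the linear association between two variables. The following minimal sketch shows the calculation; the function name and the paired Chl-a and DO values are hypothetical and do not come from this study's data.

```python
import math

def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (spread_x * spread_y)

# Hypothetical paired samples of an optically active and a non-optically active variable.
chl_a = [2.0, 4.0, 6.0, 8.0, 10.0]
dissolved_oxygen = [7.0, 7.5, 7.2, 8.4, 8.1]
r = pearson_r(chl_a, dissolved_oxygen)
print(round(r, 2))
```

A low r does not rule out a non-linear relationship, which is the motivation given in the text for exploring machine learning and deep learning models.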

7.3 Impact of spectral and spatial resolution


The results from this study showed that the proximal dataset generated higher accuracy
models than the satellite and resampled datasets in nearly all cases (Figure 13). The resampled
dataset only showed minor decreases in performance when compared to the proximal dataset.
This indicates that the loss of spectral resolution was not nearly as important as the radiometric
resolution or spectral sensitivity. Although the satellite and the resampled datasets had the same
spectral resolution, the resampled data originated from proximal hyperspectral imagery with high
radiometric resolution. The lack of narrow bands and the low signal-to-noise ratio (SNR) of
moderate resolution satellites (e.g. Landsat-8, Sentinel-2) further prevent the use of the more
sophisticated algorithms such as spectral shape algorithms and bio-optical inversion. A common problem
attributed to this low SNR for moderate resolution algorithms is inconsistency between studies
regarding which spectral features or bands are useful for estimating the target parameter. This
creates difficulty in understanding the bio-physical dynamics underlying the empirical models
and makes large scale models inaccurate. The loss of accuracy seen in the satellite dataset can
also be attributed to atmospheric effects on the spectral signal received by the sensor. As noted in
other studies, atmospheric correction over inland waters is still unresolved and can generate
significant errors.

Figure 13. Plotted results of the best model for each respective water quality parameter by
dataset. The y-axis is the reported R2 value of the resulting model when applied to the testing
dataset. The variables fDOM, DO, NO3-N, PO4-P, and TDS were not contained in all datasets
due to limitations in field measurements. As expected, the proximal and resampled datasets
outperformed the satellite dataset, likely due to atmospheric interference.

It is worth noting that, due to its shorter revisit time, higher spatial resolution in the visible
and near-infrared bands (10 m), and three red-edge bands with 20 m spatial resolution, one may
expect Sentinel-2 to have advantages over Landsat-8 for water quality monitoring. However, our
focus in this paper was to provide a critical review of various models and methods utilizing the most
up-to-date technologies (e.g., the Landsat-8 and Sentinel-2 virtual constellation). A direct
comparison of these satellite data for water quality estimation was excluded from this paper.
Other observed differences among these results can also be attributed to sensor spectral
resolution. The proximal dataset, consisting of high-resolution spectral data (1 nm – 3 nm
intervals), represents the target with continuous spectral information across the visible and NIR
regions. Detailed spectral information such as this captures nearly all absorption and scattering
trends within the given spectrum. On the other hand, the satellite dataset, which is resampled to
Landsat-8 spectral resolution, contains discrete spectral data. These data only represent
and capture target information in specific broad band regions and may overlook key spectral
regions for many water quality parameters. Originally designed for terrestrial applications,
Landsat-8 and Sentinel-2 both lack bands in key regions for water quality monitoring such as the
Chl-a absorption regions. This creates a further hindrance for accurate parameter estimation.
Future satellite sensor developments should carefully assess band placement, especially within
the red, red edge, and NIR regions, which are useful in both terrestrial and aquatic remote sensing.
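
The loss of narrow spectral features under broad-band sampling can be illustrated with a simple resampling sketch. It assumes flat (boxcar) band responses using nominal Landsat-8 OLI band limits, whereas real relative spectral response functions are not flat, and it uses a synthetic spectrum with a narrow reflectance peak at 700 nm. The band limits, function name, and spectrum are illustrative assumptions, not the resampling procedure used in this study.

```python
import math

# Nominal Landsat-8 OLI band limits in nm, treated as flat (boxcar) responses.
# Real relative spectral response functions are not flat; this is an approximation.
OLI_BANDS_NM = {
    "coastal": (435, 451),
    "blue": (452, 512),
    "green": (533, 590),
    "red": (636, 673),
    "nir": (851, 879),
}

def resample_to_bands(wavelengths_nm, reflectance, bands=OLI_BANDS_NM):
    """Average a narrow-band spectrum over each broad band's wavelength range.

    Assumes every band range contains at least one sampled wavelength.
    """
    resampled = {}
    for name, (low, high) in bands.items():
        values = [r for w, r in zip(wavelengths_nm, reflectance) if low <= w <= high]
        resampled[name] = sum(values) / len(values)
    return resampled

# Synthetic spectrum: flat 0.02 background plus a narrow peak centred at 700 nm.
wavelengths_nm = list(range(400, 901))
reflectance = [0.02 + 0.03 * math.exp(-((w - 700) / 10.0) ** 2) for w in wavelengths_nm]

broadband = resample_to_bands(wavelengths_nm, reflectance)
peak = max(reflectance)   # 0.05 at 700 nm in the 1 nm spectrum
print(peak, broadband)
```

In the 1 nm spectrum the peak reaches 0.05, but every resampled band stays at roughly the 0.02 background because the feature falls between the red and NIR band limits assumed here.
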
Although moderate resolution satellites lack spectral resolution, they are often chosen for
inland waters because obtaining valid data generally requires 3×3 pixels that
fit within the water body boundaries. Spatial resolution is of specific concern for inland waters, as a
large number of lakes, streams, and rivers cannot be successfully monitored with a sensor spatial
resolution coarser than 100 m. According to Verpoorter et al. (2014), a sensor spatial resolution of
< 15 m (e.g. Sentinel-2) could map all global lakes with an area of at least 0.002 km2. Although a ~15 m
resolution would be able to map most lakes and reservoirs, only about 25% of streams and rivers
could be resolved. The ability to resolve rivers from space requires much higher resolution than
for lakes. Globally, most rivers are less than 10 m wide, which would, in theory, require a
spatial resolution of ~3 m (IOCCG 2018b). Currently, there are no freely available multispectral
satellite sensors that offer an ideal combination of spectral and spatial resolution.
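
The 3×3 pixel requirement translates directly into a minimum resolvable feature width. The short sketch below works through that arithmetic; the function names are hypothetical, the 30 m pixel size for Landsat-8 is assumed, and the calculation ignores pixel-grid alignment and mixed shoreline pixels, so real requirements are somewhat stricter.

```python
def min_feature_width_m(pixel_size_m, window_pixels=3):
    """Smallest water-body width (m) that can contain window_pixels pixels across."""
    return pixel_size_m * window_pixels

def max_pixel_size_m(feature_width_m, window_pixels=3):
    """Coarsest pixel size (m) that still fits window_pixels pixels across the feature."""
    return feature_width_m / window_pixels

landsat8_min_width = min_feature_width_m(30)    # assumed 30 m pixels -> 90 m
sentinel2_min_width = min_feature_width_m(10)   # 10 m visible/NIR pixels -> 30 m
river_pixel = max_pixel_size_m(10)              # 10 m wide river -> about 3.3 m
print(landsat8_min_width, sentinel2_min_width, round(river_pixel, 1))
```

The last value is consistent with the ~3 m resolution cited above for rivers narrower than 10 m.
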
7.4 Anomaly detection from time-series analysis
For an algal bloom to occur, several conditions have to be met, including increasing
concentrations of nutrients, slow-moving or still water, and higher temperatures. Estimating
the exact concentration of the water quality variables of interest may not provide
precise information about the onset of an event. Relative changes in water quality products (e.g.,
Chl-a, SSC/TSS) in space and time can be observed in satellite data and may provide critical
information on an impending bloom outbreak, which can be used to trigger alerts and further on-site
investigation. Such anomalies can be defined as deviations of observations from the long-term
average over a specific waterbody, for which a z-score outlier rejection test can be implemented on
satellite-derived time-series products (Daszykowski et al. 2007). Peterson et al. (2020)
incorporated Empirical Data Analytics (EDA) (Angelov et al. 2016), a fully data-driven
(unsupervised) method, to identify spatio-temporal anomalies in water quality data. This was
particularly encouraging because the approach does not require user-defined thresholds to
identify anomalies, as appropriate thresholds can differ by region or even by specific
waterbody. In recent years, machine learning has demonstrated great potential for anomaly
detection that deciphers both spatial and temporal variability in data (Kiran et al. 2018; Zhang et
al. 2016) with a high true-detection rate and few false negatives. Future research should
focus on developing novel anomaly detection methods based on data-driven artificial intelligence
that can harness time-series satellite big data with the power of the latest deep learning approaches.
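
The z-score test mentioned above can be sketched in a few lines. This is a minimal illustration of the thresholded approach, not of the EDA method, which avoids user-defined thresholds. The function name, the threshold of 3.0, and the Chl-a series are hypothetical.

```python
import statistics

def zscore_anomalies(series, threshold=3.0):
    """Return indices whose z-score against the series mean exceeds the threshold."""
    mean = statistics.fmean(series)
    std = statistics.pstdev(series)      # population standard deviation
    if std == 0:
        return []                        # a constant series has no anomalies
    return [i for i, value in enumerate(series) if abs((value - mean) / std) > threshold]

# Hypothetical satellite-derived Chl-a time series for one waterbody,
# with a single bloom-like spike at index 9.
chl_a_series = [5.1, 4.8, 5.3, 5.0, 4.9, 5.2, 5.0, 4.7, 5.1, 35.0, 5.2, 4.9]
flagged = zscore_anomalies(chl_a_series)
print(flagged)
```

Because the spike itself inflates the mean and standard deviation, short records can mask real events, and a suitable threshold varies between waterbodies. These are the limitations that motivate the threshold-free, data-driven methods discussed in the text.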

7.5 Future sensors
The low revisit frequency of moderate resolution satellites (i.e. Landsat-8 and
Sentinel-2) enables effective monitoring of specific events rather than frequent change detection
(Claverie et al. 2018). In general, these instruments are also best suited for deriving site-, time-, and
parameter-specific empirical algorithms (Matthews 2011) since they lack narrow bands in key
spectral regions. Given these limitations, future sensors such as NASA’s HyspIRI will provide
enhanced monitoring capabilities for inland water applications with a 30 m spatial resolution and
214 contiguous spectral bands in the visible and shortwave infrared regions. Given its relatively fine
spatial and spectral resolution, HyspIRI will be very useful for water monitoring and will cover key
wavelength regions for all OACs. Future sensors should carefully consider the tradeoffs between
temporal, spatial, spectral, and radiometric resolution to ensure inland water quality monitoring is
feasible in conjunction with terrestrial monitoring. An “ideal” inland water quality sensor would consist of
an imaging spectrometer with 3 to 10 nm spectral resolution, bands ranging from 250 to 1000 nm,
at least three SWIR bands for atmospheric correction, and spatial resolution between 5 m and 15
m. Revisit frequency should be as high as possible, and sun glint should be accounted for by timing and
pointing the satellite sensor away from the sun (IOCCG 2018b).

7.6 Remaining challenges and prospects


7.6.1 Non-optically active parameter estimation
Not all water quality variables can be directly measured by current satellites, including
nutrient concentrations and microorganisms/pathogens, due to lack of hyperspectral data at fine
spatial resolutions. However, nutrient levels often correlate with other parameters, so future
research should focus on: (1) Evaluating spectral signatures (reflectance shape and amplitude) of
water samples with known nutrient concentrations along with known amounts of suspended
sediments and Chl-a, which are both remotely measurable. This may provide an indirect
estimation of nutrients, as high levels of nutrients are often associated with flood waters (i.e.,
high suspended sediment levels) while low levels of dissolved nutrients can be associated with
high Chl-a levels (nutrient uptake in biomass). (2) Synchronous collection of hyperspectral,
multispectral, and thermal data from ground, UAS, and satellite which will better characterize
water quality at different scales and could potentially be scaled up to Landsat/Sentinel-2. (3)
Integrating land-cover/land-use, vegetation phenology, and agricultural activities with hydrologic
models to predict nutrient loads.

7.6.2 Atmospheric correction and land adjacency effect


As seen in this study (Figure 13), atmospheric interference and the lack of
consistent and accurate correction routines can have a significant impact on constituent
estimation. Although sensor-specific atmospheric correction routines for inland water have
existed for several decades, there is no standardized water surface reflectance product.
Atmospheric correction over water is especially important because 70-90% of the TOA
reflectance from water is from atmospheric effects and sun glint from the water surface (Wang et
al. 2010). Brighter land pixels surrounding, or adjacent to, a dark water target
produce stray light, increasing at-sensor radiance by up to 50% in the NIR bands (Odermatt et al.
2008). Left uncorrected, this can lead to errors in the estimation of water quality parameters reliant on the
NIR region. Stray light reaching the sensor from land may result in errors in measured near-shore
water-leaving reflectance. Over small inland water bodies, it contributes to the over-correction of
atmospheric effects for the visible bands. Care should be taken to reduce and potentially
eliminate adjacency effects by: (1) combining spectral shape and absorption depth algorithms
that are less sensitive to atmospheric effects; (2) spectral unmixing of land and water
endmembers using radiative transfer simulations and quantifying the impact of scale on ground,
airborne and satellite observations; and (3) using HydroLight/MODTRAN radiative transfer
simulations of water/land leaving radiances and satellite signal.
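
The scale of this sensitivity can be illustrated with a simplified error budget. The sketch assumes a purely additive model in which the TOA reflectance is the sum of an atmospheric (plus glint) term and a water-leaving term, and it ignores atmospheric transmittance. The symbols $\rho_{TOA}$, $\rho_{atm}$, $\rho_w$, and $\varepsilon$ are introduced here for illustration only.

$$\rho_{TOA} = \rho_{atm} + \rho_w, \qquad \frac{\Delta\rho_w}{\rho_w} = \varepsilon \, \frac{\rho_{atm}}{\rho_w}$$

Here $\varepsilon$ is the relative error in the estimated atmospheric term. If the atmosphere and glint contribute 70% of the TOA signal, then $\rho_{atm}/\rho_w = 0.7/0.3 \approx 2.3$; at 90% the ratio is $0.9/0.1 = 9$. Under these assumptions, a 5% error in the atmospheric term becomes an error of roughly 12% to 45% in the retrieved water-leaving reflectance.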

However, any radiative transfer simulations or empirical atmospheric correction
algorithms are based on certain assumptions and simplifications of real-world scenarios.
Future research should focus on methods for directly estimating water quality variables from TOA
reflectance. In the direct estimation method, representative datasets of water quality and TOA
reflectance are generated by combining machine learning (especially deep learning) and radiative
transfer simulations for various cases of atmospheric conditions and water quality scenarios,
accounting for different solar and sensor observation geometries. These models will then be
applied to satellite data to estimate water quality at regional and global scales. A similar concept
was used by Wang et al. (2017c) for direct estimation of land surface albedo using TOA
reflectance data for various cases of atmospheric and surface conditions.
7.6.3 Revisit time of terrestrial satellites
Although Landsat and Sentinel-2 have 16-day and 5-day global revisit times, respectively (2 to 5
days when combined), cloud cover severely limits the coverage of satellite-derived
water quality products. In principle, virtual constellations of Landsat and Sentinel-2 data resolve
spectral confusions at finer spatial detail and with an improved temporal resolution that will
allow detection despite persistent cloud cover. This idea of a virtual constellation using
Landsat-8 and Sentinel-2 for inland water quality monitoring was proven successful by Peterson et
al. (2020) in estimating a range of water quality parameters with relatively high accuracies. To
resolve issues related to temporal frequency, the idea of a geostationary satellite has been
proposed by several working groups, including the IOCCG and the Integrated Global Observation
Strategy (IGOS). Although this would solve the temporal resolution issues, the proposed sensors
would likely have too coarse a spatial resolution for inland applications (~100-300 m), as they are
primarily focused on coastal observation. Other satellite platforms such as MERIS and MODIS,
which are designed or commonly used for ocean color applications, have sufficient revisit times
(1-3 days) but lack the required spatial resolution.

7.6.4 Difference between lentic and fluvial aquatic systems


Previous water quality studies using remote sensing data have been most successful in
coastal areas. There have been limited studies on lentic (lakes and reservoirs) and fluvial systems,
which are especially important given their role in contributing to hypoxia. There are several
distinctions between lentic and fluvial systems, including water flow, depth, and constituent
transport, which affect at-sensor reflectance. Due to these differences, it is challenging to
generalize models for both lentic and fluvial environments. Thanks to the availability of decades
of satellite data and corresponding water quality data from USGS and other institutions, e.g., the
Great Rivers Ecological Observation Network (GREON), it is possible to generalize algorithms
for both lentic and fluvial systems by accounting for seasonal and annual variations in flow
velocity and volume. Efforts are currently underway to generalize models for various aquatic
environments using high-resolution spatial and temporal data from satellites including Landsat-8
and Sentinel-2 (Claverie et al. 2018; Pereira et al. 2018; Peterson et al. 2020).
na

7.6.5 Cloud computing


In this paper, image processing, including atmospheric correction and extraction of
reflectance data, was completed on Google Earth Engine, and a custom deep learning
architecture relating spectral data to water quality variables was developed and implemented
locally. Due to the small sample size, we were able to run deep learning on a regular
workstation. However, deep learning applications with satellite imagery generally require
enormous computing power and time to train, test, and implement algorithms. Especially for
regional and global water quality monitoring in operational scenarios, it is impossible to run the
algorithms on consumer-grade workstations. Cloud computing infrastructure such as Google
Earth Engine (GEE) and Amazon Web Services (AWS) should be utilized for
deep learning-based water quality monitoring over large areas.
Over the last several years, great progress has been made in cloud computing technologies
such as Google Earth Engine (GEE) and Amazon Web Services (AWS). This has been especially
useful for remote sensing applications, where computational cost is often high and
represents a significant barrier for researchers. Currently, both the GEE and AWS cloud platforms
offer access to NASA, USGS, and ESA satellite data and provide high-end computing resources
at little to no cost. These improvements and newly developed capabilities have led to the
increased use and relevance of cloud computing in remote sensing, with many recent studies
utilizing this resource. Although this technology represents a great improvement in
computational efficiency, there are several limiting factors. One of the major drawbacks of cloud
computing for remote sensing is the limited access to widely used programming languages and
software. For example, GEE primarily supports its native JavaScript-based language and
specific Python packages. Those wishing to use advanced machine learning algorithms or other
common programming languages such as MATLAB or R must currently do so outside of GEE. Users
are also limited by atmospheric correction routines that require either separate software or
additional local processing, which in turn requires downloading large datasets. Limitations
such as these constrain the potential of these vast computing resources for mainstream use by
researchers, although the platforms still provide powerful tools for common processing tasks.

8. Conclusion
The objective of this paper was to review recent advances in remote sensing of water
quality by comparing methods for quantifying a range of popular water quality parameters that
were deemed promising in the recent literature. We used multi-scale remote sensing data from
proximal and satellite-based sensors to estimate major indicators of water quality, including
optically active parameters (BGA-PC, Chl-a, fDOM, turbidity, and TSS) as well as non-optically
active variables (DO, N, P, and SC). Various monitoring techniques, summarized as (a) spectral
reflectance-based ratio and shape methods, (b) empirical methods including PLSR, SVR, ELM,
and deep neural network (DNN), and (c) bio-optical estimation methods, were compared against
field data collected in multiple streams and lakes in the Midwestern US. The main conclusions of
this paper are as follows.
(1) Machine learning methods provided the best overall accuracy for almost all of the
case studies. Our results indicated that empirically based machine learning and deep
learning methods outperformed frequently used band ratio, spectral shape, and
bio-optical inversion techniques. This is because machine learning methods, especially
deep learning algorithms, use sophisticated optimization to minimize
prediction errors when sufficient training data are provided during the
modeling stage.
(2) Satellite virtual constellation and data fusion is a very promising approach to fill
data gaps and therefore improve estimation accuracy. Moderate resolution satellites, required
for inland water applications, are often limited by revisit times that are not frequent
enough to capture quickly changing water dynamics and predict major issues such as
algal blooms. In this research, we utilized cloud-based multisensor data fusion of
Landsat-8 and Sentinel-2 to create a satellite virtual constellation with identical
spectral and spatial resolutions. This virtual constellation greatly improved the
temporal revisit time and enabled accurate monitoring and estimation of water
quality from moderate resolution satellites that was not previously possible.
(3) Satellite remote sensing can provide accuracy comparable to in-situ monitoring with
deep learning algorithms. When resampled to satellite spectra, in-situ proximal data
provide a means to investigate the prediction power of satellite observations without
any atmospheric disturbances. The comparison of spectral data from proximal,
satellite, and proximal data resampled to Landsat-8 and Sentinel-2 data demonstrated
that the results among these data are closely comparable for suspended sediments and
Chl-a, indicating that the impact of scale is less significant for monitoring these water
quality indicators, for which remote sensing can be used as a regional-scale
observation tool. Prediction strengths for BGA-PC and turbidity were lower for
satellite data compared to proximal and resampled proximal data, which might be
associated with errors from atmospheric correction. Therefore, precise atmospheric
correction of satellite data is critical, especially for estimating BGA-PC and
turbidity.
(4) Estimation of non-optically active indicators of water quality (DO, N, P) remains a
significant challenge for remote sensing regardless of the scale. Although the results
of indirectly estimating nutrients and DO using other predictable variables (e.g.,
specific conductivity, Chl-a) are encouraging, prediction accuracy varied by
water body and was inconsistent among the case studies.
(5) Anomaly detection is a promising approach to provide useful information for
managers. Translating exact values of water quality indicators into decision-making
with respect to predicting when and where a pollution outbreak will occur is a complex
problem that varies among water bodies. However, the trends and
deviations of these indicators from the long-term averages can be derived using
satellite remote sensing. Comparing the trends with historic means and outbreaks can
help predict impending water quality issues.
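
The deviation-from-long-term-average idea in conclusion (5) can be sketched as a simple standardized anomaly score. This is only an illustration under stated assumptions: the z-score rule, the threshold of 2, the function names, and the Chl-a values are all hypothetical and are not the anomaly detection method or data used in the case studies.

```python
from statistics import mean, stdev


def anomaly_scores(history, recent):
    """Standardized deviation (z-score) of each recent value from the long-term mean.

    Assumes `history` has at least two values and is not constant.
    """
    baseline = mean(history)
    spread = stdev(history)
    return [(value - baseline) / spread for value in recent]


def flag_anomalies(history, recent, threshold=2.0):
    """Indices of recent observations whose absolute z-score exceeds the threshold."""
    scores = anomaly_scores(history, recent)
    return [i for i, z in enumerate(scores) if abs(z) > threshold]


# Hypothetical satellite-derived Chl-a estimates (ug/L); not data from this study.
long_term = [10.0, 12.0, 11.0, 9.0, 13.0, 10.0, 12.0, 11.0]
latest = [11.5, 12.5, 19.0]
print(flag_anomalies(long_term, latest))
```

In practice the baseline would likely be computed per season or per month so that normal seasonal cycles are not flagged as anomalies.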

Future research should focus on (1) better detection of anomalies in water quality indicators
and their association with the onset of an outbreak, (2) indirect estimation of non-optically active
variables using very high-resolution spectral data, and most importantly (3) improving the
confidence in atmospheric correction of satellite data.

Acknowledgements
This work was supported in part by the National Science Foundation (IIA-1355406 and
IIA-1430427) and in part by the National Aeronautics and Space Administration
(NNX15AK03H).

References
Albert, A., & Gege, P. (2006). Inversion of irradiance and remote sensing reflectance in shallow
water between 400 and 800 nm for calculations of water and bottom properties. Applied
Optics, 45, 2331-2343
Albert, A., & Mobley, C.D. (2003). An analytical model for subsurface irradiance and remote
sensing reflectance in deep and shallow case-2 waters. Optics Express, 11, 2873-2890
Alparslan, E., Aydöner, C., Tufekci, V., & Tüfekci, H. (2007). Water quality assessment at
Ömerli Dam using remote sensing techniques. Environmental monitoring and assessment,
135, 391
Angelov, P.P., Gu, X., Principe, J., & Kangin, D. (2016). Empirical data analysis - a new tool
for data analytics. IEEE International Conference on Systems, Man, and Cybernetics, 53-9
Alom, M.Z., Taha, T.M., Yakopcic, C., Westberg, S., Sidike, P., Nasrin, M.S., Hasan, M., Van
Essen, B.C., Awwal, A.A.S., & Asari, V.K. (2019). A state-of-the-art survey on deep
learning theory and architectures. Electronics, 8(3), 292,
https://doi.org/10.3390/electronics8030292
Arst, H., & Kutser, T. (1994). Data processing and interpretation of sea radiance factor
measurements. Polar research, 13, 3-12
Baban, S.M. (1993). Detecting water quality parameters in the Norfolk Broads, UK, using
Landsat imagery. International Journal of Remote Sensing, 14, 1247-1267
Ball, J. E., Anderson, D. T., & Chan, C. S. (2017). Comprehensive survey of deep learning in
remote sensing: theories, tools, and challenges for the community. Journal of Applied
Remote Sensing, 11(4), 042609
Becker, R.H., Sultan, M.I., Boyer, G.L., Twiss, M.R., & Konopko, E. (2009). Mapping
cyanobacterial blooms in the Great Lakes using MODIS. Journal of Great Lakes
Research, 35, 447-453
Bonansea, M., Rodriguez, M.C., Pinotti, L., & Ferrero, S. (2015). Using multi-temporal Landsat
imagery and linear mixed models for assessing water quality parameters in Río Tercero
reservoir (Argentina). Remote Sensing of Environment, 158, 28-41
Brewin, R.J., Sathyendranath, S., Müller, D., Brockmann, C., Deschamps, P.Y., Devred, E.,
Doerffer, R., Fomferra, N., Franz, B., Grant, M., & Groom, S. (2015). The Ocean Colour
Climate Change Initiative: III. A round-robin comparison on in-water bio-optical
algorithms. Remote Sensing of Environment, 162, 271-294
Brockmann, C., Doerffer, R., Peters, M., Kerstin, S., Embacher, S., & Ruescas, A. (2016).
Evolution of the C2RCC neural network for Sentinel 2 and 3 for the retrieval of ocean
colour products in normal and extreme optically complex waters. In Living Planet
Symposium, 740
Carmichael, W.W., & Boyer, G.L. (2016). Health impacts from cyanobacteria harmful algae
blooms: Implications for the North American Great Lakes. Harmful Algae, 54, 194-212
Chang, N.B., Imen, S., & Vannah, B. (2015). Remote Sensing for Monitoring Surface Water
Quality Status and Ecosystem State in Relation to the Nutrient Cycle: A 40-Year
Perspective. Critical Reviews in Environmental Science and Technology, 45, 101-166
Chang, N.B., Xuan, Z.M., & Wimberly, B. (2012). Remote sensing spatiotemporal assessment of
nitrogen concentrations in Tampa Bay, Florida due to a drought. Journal of Terrestrial,
Atmospheric and Oceanic Sciences, 23, 1–12
Chapman, D. (1996). Water Quality Assessments - A Guide to Use of Biota, Sediments and
Water in Environmental Monitoring. E&FN Spon, an imprint of Chapman & Hall,
Chen, L. (2003). A study of applying genetic programming to reservoir trophic state evaluation
using remote sensor data. International Journal of Remote Sensing, 24, 2265-2275
Chen, L., Tan, C.H., Kao, S.J., & Wang, T.S. (2008). Improvement of remote monitoring on
water quality in a subtropical reservoir by incorporating grammatical evolution with
parallel genetic algorithms into satellite imagery. Water Research, 42, 296-306
Claverie, M., Ju, J., Masek, J.G., Dungan, J.L., Vermote, E.F., Roger, J.C., Skakun, S.V., &
Justice, C. (2018). The Harmonized Landsat and Sentinel-2 surface reflectance data set.
Remote Sensing of Environment, 219, 145-161
Claverie, M., Masek, J.G., Ju, J., & Dungan, J.L. (2017). Harmonized landsat-8 sentinel-2 (HLS)
product user’s guide. National Aeronautics and Space Administration (NASA):
Washington, DC, USA
Craig, S.E., Jones, C.T., Li, W.K.W., Lazin, G., Horne, E., Caverhill, C., & Cullen, J.J. (2012).
Deriving optical metrics of coastal phytoplankton biomass from ocean colour. Remote
Sensing of Environment, 119, 72-83
Crittenden, J.C., Hand, D.W., Howe, K.J., Rhodes Trussell, R., & Tchobanoglous, G. (2012).
Water Treatment Principles and Design, third ed. .
Crouch, S.R., Holler, F.J., & Skoog, D.A. (2007). Principles of Instrumental Analysis, sixth ed. .
Daszykowski, M., Kaczmarek, K., Vander Heyden, Y., & Walczak, B. (2007). Robust statistics
in data analysis—a review: basic concepts. Chemometrics and intelligent laboratory
systems, 85, 203-219
Davis, T.W., Berry, D.L., Boyer, G.L., & Gobler, C.J. (2009). The effects of temperature and
nutrients on the growth and dynamics of toxic and non-toxic strains of Microcystis during
cyanobacteria blooms. Harmful Algae, 8, 715-725
Dekker, A.G., & Peters, S.W.M. (1993). The use of the Thematic Mapper for the analysis of
eutrophic lakes: a case study in the Netherlands. International Journal of Remote Sensing,
14, 799-821
Dekker, A.G., Vos, R.J., & Peters, S.W.M. (2001). Comparison of remote sensing data, model
results and in situ data for total suspended matter (TSM) in the southern Frisian lakes.
Science of the Total Environment, 268, 197-214
Dierberg, F.E., & Carriker, N.E. (1994). Field testing two instruments for remotely sensing water
quality in the Tennessee Valley. Environmental science & technology, 28, 16-25
Doerffer, R., & Schiller, H. (2007). The MERIS Case 2 water algorithm. International Journal of
Remote Sensing, 28, 517-535
Domagalski, J., Lin, C., Luo, C., Kang, J., Wang, S.M., Brown, L.R., & Munn, M.D. (2007).
Eutrophication study at the Panjakou-Daheiting reservoir system, Northern Hebei
Province, People’s Republic of China: Chlorophyll-a model and sources of phosphorous
and nitrogen. Agricultural Water Management, 94, 43–53
Dörnhöfer, K., Gege, P., Pflug, B., & Oppelt, N. (2016). Mapping indicators of lake ecology at
Lake Starnberg, Germany—First results of Sentinel-2A. In L. Ouwehand (Ed.), Living
Planet Symposium. Noordwijk, The Netherlands: ESA Communications:
Dörnhöfer, K., & Oppelt, N. (2016). Remote sensing for lake research and monitoring–Recent
advances. Ecological Indicators, 64, 105-122
Draper, N., & Smith, H. (1981). Applied regression analysis. New York: Wiley Interscience
EPA (2017). Mississippi River/Gulf of Mexico Watershed Nutrient Task Force 2017 Report to
Congress. https://www.openchannels.org/sites/default/files/literature/https-
www.epa_.govsitesproductionfiles2017-
11documentshypoxia_task_force_report_to_congress_2017_final.pdf
Flink, P., Lindell, L.T., & Östlund, C. (2001). Statistical analysis of hyperspectral data from two
Swedish lakes. Science of the Total Environment, 68, 155-169
Franz, B.A., & Werdell, P.J. (2010). A generalized framework for modeling of inherent optical
properties in ocean remote sensing applications. Proceedings of Ocean Optics,
Anchorage, Alaska, 27, 1-13
Garver, S.A., & Siegel, D.A. (1997). Inherent optical property inversion of ocean color spectra
and its biogeochemical interpretation: 1. Time series from the Sargasso Sea. Journal of
Geophysical Research: Oceans, 102, 18607-18625
Ghulam, A., Li, Z.L., Qin, Q.M., Yimit, H., & Wang, J.H. (2008). Estimating crop water stress
with ETM plus NIR and SWIR data. Agricultural and Forest Meteorology, 148, 1679-
1695
Giardino, C., Bresciani, M., Stroppiana, D., Oggioni, A., & Morabito (2013). Optical remote
sensing of lakes: an overview on Lake Maggiore. Journal of Limnology, 73
Giardino, C., Candiani, G., Bresciani, M., Lee, Z., Gagliano, S., & Pepe, M. (2012). BOMBER:
a tool for estimating water quality and bottom properties from remote sensing images.
Computers & Geosciences, 45, 313-318
Gitelson, A., Garbuzov, G., Szilagyi, F., Mittenzwey, K.H., Karnieli, A., & Kaiser, A. (1993).
Quantitative remote sensing methods for real-time monitoring of inland waters quality.
International Journal of Remote Sensing, 14, 1269-1295
Gitelson, A.A., Gurlin, D., Moses, W.J., & Barrow, T. (2009). A bio-optical algorithm for the
remote estimation of the chlorophyll-a concentration in case 2 waters. Environmental
research letters, 4
Glasgow, H.B., Burkholder, J.M., Reed, R.E., Lewitus, A.J., & Kleinman, J.E. (2004). Real-time
remote monitoring of water quality: a review of current applications, and advancements
in sensor, telemetry, and computing technologies. Journal of Experimental Marine
Biology and Ecology, 300, 409-448
Gomez, J.A.D., Alonso, C.A., & Garcia, A.A. (2011). Remote sensing as a tool for monitoring
water quality parameters for Mediterranean Lakes of European Union water framework
directive (WFD) and as a system of surveillance of cyanobacterial harmful algae blooms
(SCyanoHABs). Environmental Monitoring and Assessment, 181, 317-334
Gons, H.J. (1999). Optical teledetection of chlorophyll a in turbid inland waters. Environmental
Science & Technology, 33, 1127-1132
Gordon, H.R., Brown, O.B., & Jacobs, M.M. (1975). Computed relationship between the
inherent and apparent optical properties of a flat homogeneous ocean. Applied optics, 14,
417-427
Gower, J., King, S., Borstad, L., & Brown (2005). Detection of intense plankton blooms using
the 709 nm band of the MERIS imaging spectrometer. International Journal of Remote
Sensing, 26
Gower, J.F.R., & Borstad, G. (1981). Use of the in vivo fluorescence line at 685 nm for remote
sensing surveys of surface chlorophyll a.
Gray, J.R., Glysson, G.D., Turcios, L.M., & Schwarz, G.E. (2000). Comparability of suspended-
sediment concentration and total suspended solids data. In, Water-Resources
Investigations Report 00-4191. Reston, VA: U.S. Geological Survey
Hansen, C.H., Williams, G.P., Adjei, Z., Barlow, A., Nelson, E.J., & Miller, A.W. (2015).
Reservoir water quality monitoring using remote sensing with seasonal models: case
study of five central-Utah reservoirs. Lake and Reservoir Management, 31, 225-240

Hartling, S., Sagan, V., Sidike, P., Maimaitijiang, M., & Carron, J. (2019). Urban Tree Species
Classification Using a WorldView-2/3 and LiDAR Data Fusion Approach and Deep
Learning. Sensors, 19(6), 1284.
He, W., Chen, S., Liu, X., & Chen, J. (2008). Water quality monitoring in a slightly-polluted
inland water body through remote sensing—case study of the Guanting Reservoir in
Beijing, China. Frontiers of Environmental Science & Engineering in China, 2, 163-171
Heiskary, S., & Wilson, B. (2005). Minnesota lake water quality: Developing nutrient criteria
(3rd ed.). St. Paul MN: Minnesota Pollution Control Agency
Helms, J.R., Stubbins, A., Ritchie, J.D., Minor, E.C., Kieber, D.J., & Mopper, K. (2008).
Absorption spectral slopes and slope ratios as indicators of molecular weight, source, and
photobleaching of chromophoric dissolved organic matter. Limnology and
Oceanography, 53, 955-969
Herrick, J.E., & Whitford, W.G. (1995). Assessing the quality of rangeland soils: challenges and
opportunities. Journal of Soil and Water Conservation, 50, 237–242
Hicks, B.J., Stichbury, G.A., Brabyn, L.K., Allan, M.G., & Ashraf, S. (2013). Hindcasting water
clarity from Landsat satellite images of unmonitored shallow lakes in the Waikato region,
New Zealand. Environmental monitoring and assessment, 185, 7245-7261
Ho, J.C., Michalak, A.M., & Pahlevan, N. (2019). Widespread global increase in intense lake
phytoplankton blooms since the 1980s. Nature, 574, 667–670,
doi:10.1038/s41586-019-1648-7
Hoge, F.E., & Lyon, P.E. (1996). Satellite retrieval of inherent optical properties by linear matrix
inversion of oceanic radiance models: an analysis of model and radiance measurement
errors. Journal of Geophysical Research: Oceans, 101, 16631-16648
Hoogenboom, H.J., Dekker, A.G., & De Haan, J.F. (1998). Retrieval of chlorophyll and
suspended matter from imaging spectrometry data by matrix inversion. Canadian Journal
of Remote Sensing, 24, 144-152
Horion, S., Bergamino, N., Stenuite, S., Descy, J.P., Plisnier, P.D., Loiselle, S.A., & Cornet, Y.
(2010). Optimized extraction of daily bio-optical time series derived from MODIS/Aqua
imagery for Lake Tanganyika, Africa. Remote Sensing of Environment, 114, 781-791
Hu, C. (2009). A novel ocean color index to detect floating algae in the global oceans. Remote
Sensing of Environment, 113, 2118-2129
Hu, C., Lee, Z., & Franz, B. (2012). Chlorophyll a algorithms for oligotrophic oceans: A novel
approach based on three-band reflectance difference. Journal of Geophysical Research:
Oceans, 117
Huguet, A., Vacher, L., Relexans, S., Saubusse, S., Froidefond, J.M., & Parlanti, E. (2009).
Properties of fluorescent dissolved organic matter in the Gironde Estuary. Organic
Geochemistry, 40
IEPA, I.E.P.A. (2007). Sangamon river / Lake Decatur Watershed TMDL Report. Springfield
(IL): IEPA/BOW/07-017. In
IOCCG (2006). Remote sensing of inherent optical properties: fundamentals, tests of algorithms,
and applications. In Z. Lee (Ed.), Reports of the International Ocean-Colour
Coordinating Group, No. 5. Dartmouth, Canada: IOCCG
IOCCG (2018a). Earth Observations in Support of Global Water Quality Monitoring. In S. Greb,
A. Dekker, & C. Binding (Eds.), IOCCG Report Series. Dartmouth, Canada: International
Ocean Colour Coordinating Group
IOCCG (2018b). Earth Observations in Support of Global Water Quality Monitoring. In A.D. In
C. Binding, S. Greb (Ed.) (Ed.), Reports and Monographs of the International Ocean-
Colour Coordinating Group No. 17. Dartmouth, Canada: IOCCG
Islam, M.R., Yamaguchi, Y., & Ogawa, K. (2001). Suspended sediment in the Ganges and
Brahmaputra Rivers in Bangladesh: observation from TM and AVHRR data.
Hydrological Processes, 15, 493-509
Jordan, Y.C., Ghulam, A., & Chu, M.L. (2014). Assessing the Impacts of Future Urban
Development Patterns and Climate Changes on Total Suspended Sediment Loading in
Surface Waters Using Geoinformatics. Journal of Environmental Informatics, 24, 65-79
Kallio, K., Kutser, T., Hannonen, T., Koponen, S., Pulliainen, J., Vepsäläinen, J., & Pyhälahti, T.
(2001). Retrieval of water quality from airborne imaging spectrometry of various lake
types in different seasons. Science of the Total Environment, 268, 59-77
Keefer, L., & Bauer, E. (2005). Watershed monitoring for the Lake Decatur watershed: 2000–
2003. Contract Report 2005-09.
Keiner, L.E., & Brown, C.W. (1999). Estimating oceanic chlorophyll concentrations with neural
networks. International Journal of Remote Sensing, 20, 189-194
Kiran, B.R., Thomas, D.M., & Parakkal, R. (2018). An overview of deep learning based methods
for unsupervised and semi-supervised anomaly detection in videos. Journal of Imaging, 4,
36.
Kloiber, S.M., Brezonik, P.L., & Bauer, M.E. (2002). Application of Landsat imagery to
regional-scale assessments of lake clarity. Water Research, 36, 4330-4340
Kneizys, F.X., Shettle, E.P., Abreu, L.W., Chetwynd, J.H., & Anderson, G.P. (1988). Users
guide to LOWTRAN 7 (No. AFGL-TR-88-0177). AIR FORCE GEOPHYSICS LAB
HANSCOM AFB MA
Kritzberg, E.S., Cole, J.J., Pace, M.L., Granéli, W., & Bade, D.L. (2004). Autochthonous versus
allochthonous carbon sources of bacteria: Results from whole-lake 13C addition
experiments. Limnology and Oceanography, 49(2), 588–596,
doi:10.4319/lo.2004.49.2.0588
Kuchinke, C.P., Gordon, H.R., & Franz, B.A. (2009). Spectral optimization for constituent
retrieval in Case 2 waters I: Implementation and performance. Remote Sensing of
Environment, 113, 571-587
Kudela, R.M., Palacios, S.L., Austerberry, D.C., Accorsi, E.K., Guild, L.S., & Torres-Perez, J.
(2015). Application of hyperspectral remote sensing to cyanobacterial blooms in inland
waters. Remote Sensing of Environment, 167, 196-205
Kutser, T. (2009). Passive optical remote sensing of cyanobacteria and other intense
phytoplankton blooms in coastal and inland waters. International Journal of Remote
Sensing, 30, 4401-4425
Kutser, T., Herlevi, A., Kallio, K., & Arst, H. (2001). A hyperspectral model for interpretation of
passive optical remote sensing data from turbid lakes. Science of the Total Environment,
268, 47-58
Kutser, T., Pierson, D.C., Kallio, K.Y., Reinart, A., & Sobek, S. (2005). Mapping lake CDOM
by satellite remote sensing. Remote Sensing of Environment, 94, 535-540
Lacaux, J.P., Tourre, Y.M., Vignolles, C., Ndione, J.A., & Lafaye, M. (2007). Classification of
ponds from high-spatial resolution remote sensing: Application to Rift Valley Fever
epidemics in Senegal. Remote Sensing of Environment, 106, 66-74
LeCun, Y., Bengio, Y., & Hinton, G. (2015). Deep learning. nature, 521, 436
Lee, Z., Carder, K.L., & Arnone, R.A. (2002). Deriving inherent optical properties from water
color: a multiband quasi-analytical algorithm for optically deep waters. Applied optics, 41,
5755-5772
Li, J., Yu, Q., Tian, Y.Q., & Becker, B.L. (2017). Remote sensing estimation of colored
dissolved organic matter (CDOM) in optically shallow waters. ISPRS Journal of
Photogrammetry and Remote Sensing, 128, 98-110
Li, J., Yu, Q., Tian, Y.Q., Becker, B.L., Siqueira, P., & Torbick, N. (2018). Spatio-temporal
variations of CDOM in shallow inland waters from a semi-analytical inversion of
Landsat-8. Remote Sensing of Environment, 218, 189-200
Lim, J., & Choi, M. (2015). Assessment of water quality based on Landsat 8 operational land
imager associated with human activities in Korea. Environmental Monitoring and
Assessment, 187
Liu, H.Z., Shi, T.Z., Chen, Y.Y., Wang, J.J., Fei, T., & Wu, G.F. (2017). Improving Spectral
Estimation of Soil Organic Carbon Content through Semi-Supervised Regression. Remote
Sensing, 9
Lobo, F.L., Costa, M.P., & Novo, E.M. (2015). Time-series analysis of Landsat-MSS/TM/OLI
images over Amazonian waters impacted by gold mining activities. Remote Sensing of
Environment, 157, 170-184
Loisel, H., & Morel, A. (2001). Non-isotropy of the upward radiance field in typical coastal
(Case 2) waters. International Journal of Remote Sensing, 22, 275-295
Long, C.M., & Pavelsky, T.M. (2013). Remote sensing of suspended sediment concentration and
hydrologic connectivity in a complex wetland environment. Remote Sensing of
Environment, 129, 197-209
Ma, L., Liu, Y., Zhang, X., Ye, Y., Yin, G., & Johnson, B.A. (2019). Deep learning in remote
sensing applications: A meta-analysis and review. ISPRS Journal of Photogrammetry and
Remote Sensing, 152, 166-177
Ma, R.H., Tang, J.W., & Dai, J.F. (2006). Bio-optical
model with optimal parameter suitable for Taihu Lake in water colour remote sensing.
International Journal of Remote Sensing, 27, 4305-4328
Maier, P.M., & Keller (2018). Machine learning regression on hyperspectral data to estimate
multiple water parameters. arXiv preprint arXiv, 1805.01361
Maimaitijiang, M., Sagan, V., Sidike, P., Hartling, S., Esposito, F., & Fritschi, F. B. (2020).
Soybean yield prediction from UAV using multimodal data fusion and deep learning.
Remote Sensing of Environment, 237, 111599.
Maimaitiyiming, M., Miller, A.J., & Ghulam, A. (2016). Discriminating Spectral Signatures
Among and Within Two Closely Related Grapevine Species. Photogrammetric
Engineering and Remote Sensing, 82, 51-62
Majozi, N.P., Salama, M.S., Bernard, S., Harper, D.M., & Habte, M.G. (2014). Remote sensing
of euphotic depth in shallow tropical inland waters of Lake Naivasha using MERIS data.
Remote sensing of environment, 148
Matthews, M.W. (2011). A current review of empirical procedures of remote sensing in inland
and near-coastal transitional waters. International Journal of Remote Sensing, 32, 6855-
6899
Matthews, M.W., Bernard, S., & Robertson, L. (2012). An algorithm for detecting trophic status
(chlorophyll-a), cyanobacterial-dominance, surface scums and floating vegetation in
inland and coastal waters. Remote Sensing of Environment, 124, 637-652
Matthews, M.W., Bernard, S., & Winter, K. (2010). Remote sensing of cyanobacteria-dominant
algal blooms and water quality parameters in Zeekoevlei, a small hypertrophic lake, using
MERIS. Remote Sensing of Environment, 114, 2070-2087
Matthews, M.W., & Odermatt, D. (2015). Improved algorithm for routine monitoring of
cyanobacteria and eutrophication in inland and near-coastal waters. Remote Sensing of
Environment, 156
Mertes, L.A.K., Smith, M.O., & Adams, J.B. (1993). Estimating Suspended Sediment
Concentrations in Surface Waters of the Amazon River Wetlands from Landsat Images.
Remote Sensing of Environment, 43, 281-301
Miksa, S., Gege, P., & Heege, T. (2004). Investigations on the capability of CHRIS-Proba for
monitoring of water constituents in Lake Constance compared to MERIS. In,
Proceedings of the 2nd CHRIS-PROBA Workshop. Frascati, Italy
Mishra, S., & Mishra, D.R. (2012). Normalized difference chlorophyll index: A novel model for
remote estimation of chlorophyll-a concentration in turbid productive waters. Remote
Sensing of Environment, 117, 394-406
Mishra, S., Mishra, D.R., Lee, Z., & Tucker, C.S. (2013). Quantifying cyanobacterial
phycocyanin concentration in turbid productive waters: A quasi-analytical approach.
Remote Sensing of Environment, 133, 141-151
Mobley, C.D. (1998). Hydrolight 4.0 users guide.
Moore, C., Barnard, A., Fietzek, P., Lewis, M.R., Sosik, H.M., White, S., & Zielinski, O. (2009).
Optical tools for ocean monitoring and research. Ocean Science, 5, 661-684
Morel, A., & Prieur, L. (1977). Analysis of variations in ocean color. Limnology and
Oceanography, 22, 709-722
Moses, W.J., Gitelson, A.A., Berdnikov, S., & Povazhnyy, V. (2009). Estimation of chlorophyll-
a concentration in case II waters using MODIS and MERIS data—successes and
challenges. Environmental Research Letters, 4, 045005
Mullerkarger, F.E. (1992). Remote-Sensing of Marine Pollution - a Challenge for the 1990s. Mar
Pollut Bull, 25, 54-60
Murphy, K.P. (2012). Machine learning : a probabilistic perspective. Cambridge, Mass.: MIT
Press
Neville, R.A., & Gower, J.F.R. (1977). Passive remote sensing of phytoplankton via chlorophyll
α fluorescence. Journal of Geophysical Research, 82, 3487-3493
Nunes, J.P., Seixas, J., Keizer, J.J., & Ferreira, A.J.D. (2009). Sensitivity of runoff and soil
erosion to climate change in two Mediterranean watersheds. Part I: model
parameterization and evaluation. Hydrological Processes, 23, 1202-1211
Odermatt, D., Kiselev, V., Heege, T., Kneubühler, M., & Itten, K.I. (2008). Adjacency effect
considerations and air/water constituent retrieval for Lake Constance. In, Proceedings of
the 2nd MERIS/(A) ATSR user workshop. Frascati, Italy (Vol. 1)
OECD (2012). OECD Environmental Outlook to 2050.
Ogashawara, I., Mishra, D.R., Mishra, S., Curtarelli, M.P., & Stech, J.L. (2013). A Performance
Review of Reflectance Based Algorithms for Predicting Phycocyanin Concentrations in
Inland Waters. Remote Sensing, 5, 4774-4798
Olmanson, L.G., Bauer, M.E., & Brezonik, P.L. (2008). A 20-year Landsat water clarity census
of Minnesota's 10,000 lakes. Remote Sensing of Environment, 112, 4086-4097
Olmanson, L.G., Brezonik, P.L., & Bauer, M.E. (2011). Evaluation of medium to low resolution
satellite imagery for regional lake water quality assessments. Water Resources Research,
47
Olmanson, L.G., Brezonik, P.L., & Bauer, M.E. (2013). Airborne hyperspectral remote sensing
to assess spatial distribution of water quality characteristics in large rivers: The
Mississippi River and its tributaries in Minnesota. Remote Sensing of Environment, 130,
254-265
Paerl, H.W., & Otten, T.G. (2013). Harmful Cyanobacterial Blooms: Causes, Consequences, and
Controls. Microbial Ecology, 65, 995-1010
Pan, Z.T., Arritt, R.W., Takle, E.S., Gutowski, W.J., Anderson, C.J., & Segal, M. (2004). Altered
hydrologic feedback in a warming climate introduces a ''warming hole''. Geophysical
Research Letters, 31
Panda, S.S., Garg, V., & Chaubey, I. (2004). Artificial neural networks application in lake water
quality estimation using satellite imagery. Journal of Environmental Informatics, 4, 65-74
Park, E., & Latrubesse, E.M. (2014). Modeling suspended sediment distribution patterns of the
Amazon River using MODIS data. Remote Sensing of Environment, 147, 232-242
Pelikan, M., Goldberg, D.E., & Paz, E. (1999). The Bayesian optimization algorithm,
Proceedings of the 1st Annual Conference on Genetic and Evolutionary Computation.
Morgan Kaufmann Inc., 525-532
Pereira, L.S.F., Andes, L.C., Cox, A.L., & Ghulam, A. (2018). Measuring Suspended-Sediment
Concentration and Turbidity in the Middle Mississippi and Lower Missouri Rivers Using
Landsat Data. Journal of the American Water Resources Association, 54, 440-450
Peterson, K.T., Sagan, V., Sidike, P., Cox, A., & Martinez, M. (2018). Suspended Sediment
Concentration Estimation from Landsat Imagery along the Lower Missouri and Middle
Mississippi Rivers Using an Extreme Learning Machine. Remote Sensing, 10, 1503
Peterson, K.T., Sagan, V., Sidike, P., Hasenmueller, E.A., Sloan, J.J., & Knouft, J.H. (2019).
Machine learning based ensemble prediction of water quality variables using feature-level
and decision-level fusion with proximal remote sensing. Photogrammetric
Engineering and Remote Sensing, 85(4): 269–280
Peterson, K.T., Sagan, V., & Sloan, J. (2020). Deep learning-based water quality estimation and
anomaly detection using Landsat-8/Sentinel-2 virtual constellation and cloud computing.
GIScience & Remote Sensing, DOI: 10.1080/15481603.2020.1738061
Petus, C., Chust, G., Gohin, F., Doxaran, D., Froidefond, J.M., & Sagarminaga, Y. (2010).
Estimating turbidity and total suspended matter in the Adour River plume (South Bay of
Biscay) using MODIS 250-m imagery. Continental Shelf Research, 30, 379-392
Phinn, S.R., Dekker, A.G., Brando, V.E., & Roelfsema, C.M. (2005). Mapping water quality and
substrate cover in optically complex coastal and reef waters: An integrated approach.
Marine Pollution Bulletin, 51, 459-469
Pulliainen, J., Kallio, K., Eloheimo, K., Koponen, S., Servomaa, H., Hannonen, T., Tauriainen,
S., & Hallikainen, M. (2001). A semi-operative approach to lake water quality retrieval
from remote sensing data. Science of the Total Environment, 1-3, 79-93
Qi, L., Hu, C.M., Duan, H.T., Barnes, B.B., & Ma, R.H. (2014). An EOF-Based Algorithm to
Estimate Chlorophyll a Concentrations in Taihu Lake from MODIS Land-Band
Measurements: Implications for Near Real-Time Applications and Forecasting Models.
Remote Sensing, 6, 10694-10715
Rajalahti, T., Arneberg, R., Berven, F.S., Myhr, K.M., Ulvik, R.J., & Kvalheim, O.M. (2009).
Biomarker discovery in mass spectral profiles by means of selectivity ratio plot.
Chemometrics and Intelligent Laboratory Systems, 95, 35-48
Rastogi, R.P., Madamwar, D., & Incharoensakdi, A. (2015). Bloom Dynamics of Cyanobacteria
and Their Toxins: Environmental Health Impacts and Mitigation Strategies. Frontiers in
Microbiology, 6
Ritchie, J.C., & Cooper, C.M. (1991). An Algorithm for Estimating Surface Suspended Sediment
Concentrations with Landsat Mss Digital Data. Water Resources Bulletin, 27, 373-379
Ritchie, J.C., Schiebe, F.R., & McHenry, R. (1976a). Remote sensing of suspended sediment in
surface waters. Photogrammetric Engineering and Remote Sensing, 69, 695-714
Ritchie, J.C., Schiebe, F.R., & McHenry, R. (1976b). Remote sensing of suspended sediment in
surface waters. Photogrammetric Engineering and Remote Sensing, 69, 695-714
Roesler, C.S., Perry, M.J. and Carder, K.L. (1989). Modeling in situ phytoplankton absorption
from total absorption spectra in productive inland marine waters. Limnology and
Oceanography, 34, 1510-1523.
Ruiz-Verdú, A., Koponen, S., Heege, T., Doerffer, R., Brockmann, C., Kallio, K., Pyhälahti, T.,
Peña, R., Polvorionos, A., Heblinski, J., & Ylöstalo, P. (2008). Development of MERIS
lake water algorithms: Validation results from Europe. In, 2nd MERIS/(A) ATSR User
Workshop. Frascati, Italy
Rundquist, D.C., Han, L., Schalles, J.F., & Peake, J.S. (1996). Remote measurement of algal
chlorophyll in surface waters: the case for the first derivative of reflectance near 690 nm.
Photogrammetric Engineering and Remote Sensing, 62, 195-200
Ryan, J.P., Davis, C.O., Tufillaro, N.B., Kudela, R.M., & Gao, B.C. (2014). Application of the
Hyperspectral Imager for the Coastal Ocean to Phytoplankton Ecology Studies in
Monterey Bay, CA, USA. Remote Sensing, 6, 1007-1025
Salama, M.S., & Verhoef, W. (2015). Two-stream remote sensing model for water quality
mapping: 2SeaColor. Remote Sensing of Environment, 157, 111-122
Santini, F., Alberotanza, L., Cavalli, R.M., & Pignatti, S. (2010). A two-step optimization
procedure for assessing water constituent concentrations by hyperspectral remote sensing
techniques: An application to the highly turbid Venice lagoon waters. Remote Sensing of
Environment, 114, 887-898
Savitzky, A., & Golay, M.J. (1964). Smoothing and differentiation of data by simplified least
squares procedures. Analytical Chemistry, 36, 1627-1639
Schindler, D.W. (1977). Evolution of phosphorus limitation in lakes. Science, 195, 260-262
Shafique, N.A., Fulk, F., Autrey, B.C., & Flotemersch, J. (2003). Hyperspectral remote sensing
of water quality parameters for large rivers in the Ohio River basin. In, First Interagency
Conference on Research in the Watershed (pp. 216-221). Benson, AZ
Shaughnessy, A.R., Sloan, J.J., Corcoran, M.J., and Hasenmueller, E.A. (2019). Sediments in
Agricultural Reservoirs Act as Sinks and Sources for Nutrients Over Various Timescales.
Water Resources Research, 55(7), 5985-6000
Sidike, P., Sagan, V., Maimaitijiang, M., Maimaitiyiming, M., Shakoor, N., Burken, J., ... &
Fritschi, F. B. (2019). dPEN: deep Progressively Expanded Network for mapping
heterogeneous agricultural landscape using WorldView-3 satellite imagery. Remote
Sensing of Environment, 221, 756-772
Smyth, T.J., Moore, G.F., Hirata, T., & Aiken, J. (2006). Semianalytical model for the derivation
of ocean color inherent optical properties: description, implementation, and performance
assessment. Applied Optics, 45, 8116-8131
Song, K., Li, L., Li, S., Tedesco, L., Hall, B., & Li, L. (2012a). Hyperspectral remote sensing of
total phosphorus (TP) in three central Indiana water supply reservoirs. Water, Air, & Soil
Pollution, 223, 1481-1502
Song, K., Lu, D., Li, L., Li, S., Wang, Z., & Du, J. (2012b). Remote sensing of chlorophyll-a
concentration for drinking water source using genetic algorithms (GA): Partial least
square (PLS) modeling. Ecological Informatics, 10, 25-36
Song, K.S., Li, L., Li, S., Tedesco, L., Hall, B., & Li, L.H. (2012c). Hyperspectral Remote
Sensing of Total Phosphorus (TP) in Three Central Indiana Water Supply Reservoirs.
Water Air and Soil Pollution, 223, 1481-1502
Song, Y., Song, X.D., Jiang, H., Guo, Z.B., & Guo, Q.H. (2010). Quantitative Remote Sensing
Retrieval for Algae in Inland Waters. Spectroscopy and Spectral Analysis, 30, 1075-1079
Szegedy, C., Liu, W., Jia, W., Sermanet, P., Reed, S., Anguelov, A., Erhan, D., Vanhoucke, V., &
Rabinovich, A. (2015). Going deeper with convolutions. Proceedings of the IEEE
Conference on Computer Vision and Pattern Recognition, 7-12 June 2015, Boston, MA,
USA. DOI: 10.1109/CVPR.2015.7298594
Thurman, E.M. (1985). Organic Geochemistry of Natural Waters.
Twardowski, M.S., Boss, E., Sullivan, J.M., & Donaghay, P.L. (2004). Modeling the spectral
shape of absorption by chromophoric dissolved organic matter. Marine Chemistry, 89,
69-88
Tyler, A.N., Svab, E., Preston, T., Présing, M., & Kovács, W.A. (2006). Remote sensing of the
water quality of shallow lakes: A mixture modelling approach to quantifying
phytoplankton in water characterized by high-suspended sediment. International Journal
of Remote Sensing, 27, 1521-1537
USACE, U.S. Army Corps of Engineers (2012). 2012 Carlyle Lake Water Quality Report.
van der Meer, F., & De Jong, S.M. (2001). Imaging spectrometry: basic principles and
prospective applications. Dordrecht: Springer
Van Der Woerd, H.J., & Pasterkamp, R. (2008). HYDROPT: A fast and flexible method to
retrieve chlorophyll-a from multispectral satellite observations of optically complex
coastal waters. Remote Sensing of Environment, 112, 1795-1807
Vandekerckhove, L., Poesen, J., & Govers, G. (2003). Medium-term gully headcut retreat rates
in Southeast Spain determined from aerial photographs and ground measurements.
Catena, 50, 329-352
Verpoorter, C., Kutser, T., Seekell, D.A., & Tranvik, L.J. (2014). A global inventory of lakes
based on high‐resolution satellite imagery. Geophysical Research Letters, 41, 6396-6402
Vincent, R.K., Qin, X.M., McKay, R.M.L., Miner, J., Czajkowski, K., Savino, J., & Bridgeman,
T. (2004). Phycocyanin detection from LANDSAT TM data for mapping cyanobacterial
blooms in Lake Erie. Remote Sensing of Environment, 89, 381-392
Visser, P.M., Ibelings, B.W., Mur, L.R., & Walsby, A.E. (2005). The ecophysiology of the
harmful cyanobacterium Microcystis: features explaining its success and measures for its
control. In J. Huisman, H.C.P. Matthijs, & P.M. Visser (Eds.), Harmful cyanobacteria
(pp. 109–142). Berlin: Springer-Verlag
Vogt, M.C., & Vogt, M.E. (2016). Near-Remote Sensing of Water Turbidity Using Small
Unmanned Aircraft Systems. Environmental Practice, 18, 18-31
Wang, D., Liang, S., Zhou, Y., He, T. & Yu, Y. (2017c). A new method for retrieving daily land
surface albedo from VIIRS data. IEEE Transactions on Geoscience and Remote Sensing,
55(3), 1765-1775, doi: 10.1109/tgrs.2016.2632624.
Wang, J., Ding, J., Abulimiti, A., & Cai, L. (2018). Quantitative estimation of soil salinity by
means of different modeling methods and visible-near infrared (VIS–NIR) spectroscopy,
Ebinur Lake Wetland, Northwest China. PeerJ, 6, e4703
Wang, M.H., Nim, C.J., Son, S., & Shi, W. (2012). Characterization of turbidity in Florida's
Lake Okeechobee and Caloosahatchee and St. Lucie Estuaries using MODIS-Aqua
measurements. Water Research, 46, 5410-5422
Wang, M.H., Son, S.H., Zhang, Y.L., & Shi, W. (2013). Remote Sensing of Water Optical
Property for China's Inland Lake Taihu Using the SWIR Atmospheric Correction With
1640 and 2130 nm Bands. IEEE Journal of Selected Topics in Applied Earth Observations
and Remote Sensing, 6, 2505-2516
Wang, P., Boss, E.S., & Roesler, C. (2005). Uncertainties of inherent optical properties obtained
from semianalytical inversions of ocean color. Applied Optics, 44, 4074-4085
Wang, X., Ma, L., & Wang, X. (2010). Apply semi-supervised support vector regression for
remote sensing water quality retrieving. In IEEE (Ed.), 2010 IEEE International
Geoscience and Remote Sensing Symposium (pp. 2757-2760)
Wang, X., Zhang, F., & Ding, J. (2017a). Evaluation of water quality based on a machine
learning algorithm and water quality index for the Ebinur Lake Watershed, China.
Scientific Reports, 7, 12858
Wang, X., Zhang, F., Kung, H., Ghulam, A., Trumbo, A., Yang, J., Ren, Y., Jing, Y. (2017b).
Evaluation and estimation of surface water quality in an arid region based on
EEM-PARAFAC and 3D fluorescence spectral index: A case study of the Ebinur Lake
Watershed, China. Catena, 155: 62-74.
WHO (2006). Meeting the MDG drinking water and sanitation target: the urban and rural
challenge of the decade. World Health Organization and UNICEF.
Wu, C., Wu, J., Qi, J., Zhang, L., Huang, H., Lou, L., & Chen, Y. (2010a). Empirical estimation
of total phosphorous concentration in the mainstream of the Qiantang River in China
using Landsat TM data. International Journal of Remote Sensing, 31, 2309–2324
Wu, C., Wu, J., Qi, J., Zhang, L., Huang, H., Lou, L., & Chen, Y. (2010b). Empirical estimation
of total phosphorus concentration in the mainstream of the Qiantang River in China using
Landsat TM data. International Journal of Remote Sensing, 31
Wynne, T.T., Stumpf, R.P., Tomlinson, M.C., Warner, R.A., Tester, P.A., Dyble, J., &
Fahnenstiel, G.L. (2008). Relating spectral shape to cyanobacterial blooms in the
Laurentian Great Lakes. International Journal of Remote Sensing, 29, 3665-3672
Yang, W., Matsushita, B., Chen, J., & Fukushima, T. (2011). Estimating constituent
concentrations in case II waters from MERIS satellite data by semi-analytical model
optimisation and look-up tables. Remote Sensing of Environment, 115
YSI, I. (2017). EXO User Manual. (Revision G, April 2017 ed.).
Zhang, L., Zhang, L., & Du, B. (2016). Deep learning for remote sensing data: A technical
tutorial on the state of the art. IEEE Geoscience and Remote Sensing Magazine, 4(2), 22-40
Zhang, N., Rao, R.S.P., Salvato, F., Havelund, J.F., Møller, I.M., Thelen, J.J., & Xu, D. (2018).
MU-LOC: A Machine-Learning Method for Predicting Mitochondrially Localized
Proteins in Plants. Frontiers in Plant Science, 9:634. doi: 10.3389/fpls.2018.00634
Zhang, Y.C., Ma, R.H., Duan, H.T., Loiselle, S.A., Xu, J.D., & Ma, M.X. (2014). A Novel
Algorithm to Estimate Algal Bloom Coverage to Subpixel Resolution in Lake Taihu. IEEE
Journal of Selected Topics in Applied Earth Observations and Remote Sensing, 7, 3060-3068
Zhang, Y.Z., Pulliainen, J., Koponen, S., & Hallikainen, M. (2002). Water quality studies of
combined optical, thermal infrared, and microwave remote sensing. Microwave and
Optical Technology Letters, 34, 281-285
Zheng, Z., Ren, J., Li, Y., Huang, C., Liu, G., Du, C., & Lyu, H. (2016). Remote sensing of
diffuse attenuation coefficient patterns from Landsat 8 OLI imagery of turbid inland
waters: a case study of Dongting Lake. Science of the Total Environment, 573, 39-54
Zhu, Z., Wang, S., & Woodcock, C.E. (2015). Improvement and expansion of the Fmask
algorithm: Cloud, cloud shadow, and snow detection for Landsats 4–7, 8, and Sentinel 2
images. Remote Sensing of Environment, 159, 269-277

All authors have read and approved this revised version of the article, and due care has been
taken to ensure the integrity of the work. This paper is original work and is currently not being
considered for publication in any other peer-reviewed journal. No part of this paper has been
published or submitted elsewhere. No conflict of interest exists in the submission of this
manuscript.
