Zhou Yuhan Thesis
Yuhan Zhou
A thesis submitted
04 2023
Thesis committee:
Fish richness and diversity are important indicators of a healthy stream ecosystem and
are influenced by a complex web of ecological factors, including regional climate,
watershed characteristics, riparian zone quality, and water quality. Investigating how these
factors interconnect and affect fish communities is crucial for developing effective
management strategies to safeguard freshwater ecosystems. In this study, we used partial least
climate factors affect fish richness and diversity by altering water temperature, pH,
conductivity, total nitrogen, and total phosphorus (TP) in 277 watersheds in the Great Lake
how watershed land use, slope, and soil interacted to drive changes in TP concentration with
multiple linear regression. Results suggested that moderate watershed development (average
5% developed percentage in the study site) can enhance fish diversity by increasing pH,
diminished fish diversity by increasing nutrient concentrations. We also found that land
cover-soil interaction was not significant. Future ecosystem management in the study area
should therefore emphasize a dual focus on watershed management approaches and riparian
Chapter 1: Introduction
Fish diversity and richness are vital indicators of biodiversity and ecosystem health, because
communities with greater diversity and richness can adapt to a wider variety of conditions such as
stream disturbance, disease, and climate change (Hiddink et al., 2018; Messemer et al., 2011).
Fish diversity and richness also influence various ecosystem services including aquatic
habitat, recreational opportunities, and fisheries (Keeler et al., 2012; Prudencio and Null,
2018). Today, freshwater fish populations are increasingly jeopardized by human activities,
primarily due to factors such as river fragmentation, habitat loss, eutrophication and pollution,
and the introduction of non-native species (Su et al., 2021). These threats manifest at
different spatial scales, ranging from global climate change, watershed-scale land use change,
Such threats induce alterations in fish diversity and richness predominantly by modifying the
freshwater environment, including temperature, nutrients, sediment, and toxins (Keeler et al.,
2012). To the best of our knowledge, little research has been done to investigate this complex
Threats to freshwater fish diversity and richness can be causally and structurally linked. For
example, the combined effect of watershed development and climate change alters flow
regimes and generates greater sediment and nutrient loads during heavy rainfall events,
thereby driving changes in habitat suitability and fish community composition (Ferreira et al.,
2019; Pörtner and Peck, 2010; Turunen et al., 2021). Another possible pathway is that
and water temperature (Wilkinson, 1999). Two major gaps remain in quantifying these
pathways and their effects on fish richness and diversity. First, there is limited
understanding of the relative importance of the threats to fish richness and diversity in a
specific ecological context. For example, it remains unknown whether watershed land use
change or riparian zone alteration is more influential, since both alter runoff patterns,
sedimentation, pollution, and habitat suitability; and the relative importance may vary
depending on the characteristics of the freshwater system (Meador and Goldstein, 2003).
For instance, disturbance of the riparian zone is typically considered harmful to fish diversity
as it undermines a diverse range of habitats, elevates soil erosion, and increases water
temperature (Reid et al., 2019). While the intermediate disturbance hypothesis suggests local
riparian zone has high biomass due to the abundance of terrestrial and aquatic food
availability (Albertson et al. 2017). Therefore, a causal model is needed to quantify the
strength and effect of different pathways on how watershed characteristics, riparian quality,
as well as water physical, chemical, and biological indicators are intertwined in their effect on
fish richness and diversity. Wang and colleagues (2021) have recently applied a structural
equation model to water quality because of its advantage in constructing latent variables
using measured variables and exploring the causal pathways; however, this model has been
high TP concentrations in streams can reduce fish diversity by promoting algae growth,
thereby reducing oxygen levels and creating unfavorable conditions for fish (Dala-Corte et al.,
2016). Extensive literature on the drivers of total phosphorus (TP) concentration in surface
water has identified multiple drivers of phosphorus pollution, such as climate change,
land use change, and catchment hydrology (Lintern et al., 2018). Past research selected
conclusions regarding the effect of each driver were inconsistent. Some researchers proposed
that flat areas may discharge more nutrients when compared to steep areas (Yu et al., 2016),
while others suggested that a decreased deviation in slope is associated with a decrease in
pollutant concentration (Wang et al., 1997; Wissmar et al., 1990). These inconsistent
conclusions may arise because interaction effects between variables have often been
overlooked, especially how land cover interacts with other geophysical factors in influencing
Slope and soil are key factors that interact with land cover in affecting water quality (Yu et al.,
2016). Slope is an important factor affecting nutrient concentration in the water body because
it is related to surface runoff volume and velocities. Slope was also identified as a key
parameter in predicting rates of water flow across surfaces (Richards et al., 1996). With
increasing slope, greater rates of water flow increase soil erosion and the rate at which
particulates pick up pollutants, which can further deteriorate water quality
(Yu et al., 2016). The chemical characteristics of soil (e.g., phosphorus, nitrogen, and salt
content in soil) directly affect the types and concentration of pollutants in water (Dillon et al.,
1975). Soil erodibility and sorption capacity also influence constituent mobilization in
catchments (Lintern et al., 2018). The mobilization of sediments is positively correlated with the
susceptibility of the geological deposit and the soil within the catchment to erosion and
weathering. Moreover, soil hydrological properties affect water quality, with a lower hydraulic
conductivity leading to a longer residence time of subsurface flow and facilitating pollutants to
result, research has found a positive correlation between pollutants in water and soil drainage
and fish community, with riparian quality and water quality as mediating factors. Stream
topography, soil, and their interaction effects. Our specific research goals are as follows: (1)
to explore the spatial distribution of important land cover, climate, riparian quality, water
quality, and fish diversity variables and their associations; (2) to investigate the mechanisms
by which watershed development and climate impact fish richness and diversity; and (3) to
assess how land cover-slope interaction and land cover-soil interaction affect TP
Chapter 2: Materials and Methods
2.1 Study Site
We obtained our sampling stations from the EPA’s National Rivers and Streams Assessment
(NRSA) 2018-2019, part of the National Aquatic Resource Surveys. We defined
the study site as all states surrounding the Great Lakes (i.e., Illinois, Indiana, Michigan, Minnesota,
New York, Ohio, Pennsylvania, and Wisconsin), resulting in a sample size of 277 stations.
Using ArcGIS (version 10.8), we delineated subwatershed boundaries from the location and
elevation of these stations (Fig. 1). The major land use types of our study area are
forest and cropland. All samples were collected in summer with an average daily temperature
2.2 Data and Variables
Fish community data were obtained from NRSA 2018-2019. We calculated richness
and the Shannon diversity index from the number of individuals of each fish species to represent fish
community structure (J Zhao et al., 2014; Eq. 1). Here, H is the Shannon diversity index, P(i) is
the proportion of individuals belonging to species i, and S is the number of species:
$$H = -\sum_{i=1}^{S} \left[ P(i) \, \ln P(i) \right] \qquad (1)$$
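The richness and Shannon diversity calculation can be illustrated with a minimal Python sketch. It assumes the natural logarithm and a simple mapping from species to individual counts; the function name, species names, and counts are hypothetical and are not taken from the NRSA data.

```python
import math

def richness_and_shannon(counts):
    """Return (richness, Shannon H) for a mapping of species -> individual count.

    Assumes the natural logarithm; species with zero individuals are ignored.
    """
    present = [n for n in counts.values() if n > 0]
    total = sum(present)
    if total == 0:
        return 0, 0.0
    # P(i) = n_i / total; H = -sum(P(i) * ln P(i))
    h = -sum((n / total) * math.log(n / total) for n in present)
    return len(present), h

# Hypothetical catch at one station (names and counts are made up).
station = {"creek chub": 10, "white sucker": 10, "johnny darter": 10, "brook trout": 10}
richness, shannon = richness_and_shannon(station)
print(richness, round(shannon, 3))
```

With four equally abundant species, H equals ln(4), which is the maximum possible value for a richness of four.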
Water quality data included TP, total nitrogen (TN), pH, conductivity (COND), and water
temperature (WATER_TEM). Phosphorus and nitrogen are the main nutrients associated
with the intensification of cropland in the catchment area, which could have a significant
negative impact on aquatic ecosystem function (Owens et al., 2005; Bierschenk et al., 2019).
pH, COND, and WATER_TEM are water quality parameters highly relevant to fish diversity,
as changes in these indicators may influence the transmission of acoustic, visual, and
chemical signals, thereby affecting species interactions and reproductive success (Barrella et
Climate data were obtained from the Parameter-elevation Relationships on Independent Slopes
Model (PRISM) dataset, which provides gridded climate datasets for the United States (Daly et al.,
2008). Among climate variables, we calculated the daily precipitation amount and daily
average temperature at the subwatershed scale to match the TP concentration collected from
the outlet of the subwatershed (see Table 1). We also calculated the monthly maximum 1-day
precipitation (Rx1day), warmest monthly temperature, and coldest monthly temperature to
represent the effects of climate extremes on water quality and fish. Rx1day was calculated using
Eq. 2, where Rx1dayj is the maximum 1-day precipitation value for period j, and RRij is the
daily precipitation amount on day i in period j:
$$Rx1day_j = \max(RR_{ij}) \qquad (2)$$
Land cover percentages were obtained from the National Land Cover Database (NLCD;
2019), a 30m Landsat-based land cover database. These images rely on the imperviousness
data layer for urban classes and on a decision-tree classification for all other classes (Dewitz
et al., 2021). Among the various land cover types, we selected the percentage of developed,
NLCD divides the developed area into four categories: (1) developed open space (impervious
surfaces <20%); (2) developed low intensity (impervious surfaces 20%-49%); (3) developed
medium intensity (impervious surfaces 50%-79%), and (4) developed high intensity
(impervious surfaces ≥80%). It divides forest land into three categories: (1) deciduous forest;
(2) evergreen forest; and (3) mixed forest. To simplify the model, we merged the developed
low, medium, and high intensity into developed (“DEV”) land and combined all types of
forest into a single “FOREST” category. The developed open space remained a separate
category because the impervious percentage was lower than 20% and not likely to cause
Soil types were obtained from the OpenLandMap Soil Texture Class (USDA System).
Because the water quality indicators we used were primarily pollutants transported with
surface runoff, we used the 0 cm soil depth layer as the soil data source. We then reclassified
the 12 types of soil into three different types (clay, silt, and sand) according to the soil
textural triangle. Silt and sand had a strong correlation, while sand and clay had a weak
correlation. Therefore, we chose sand and clay as variables to represent soil texture in our
study and excluded silt. The slope variable was calculated based on the 30m-resolution NASA
Some of the riparian zone variables were obtained from NRSA, including the riparian quality index, riparian
vegetation quality index, and riparian disturbance index. These indices represent the quality and
disturbance of the riparian buffer on a scale from 0 to 1. Other riparian zone variables, including the
proportions of deciduous forest (R_DECI), evergreen forest (R_EVER), and mixed forest
(R_MIXED) in the riparian zone, were calculated from NLCD 2019. Specifically, we calculated
the fraction of different forests at the riparian buffer scale defined as 500m surrounding the
water quality stations. Here, we separated forest categories because different forest species
and defoliation along the riparian buffer could have varying impacts on fish diversity (Allan
et al., 2004).
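The land cover aggregation described in this section (merging NLCD classes at the watershed scale, and keeping forest types separate in the riparian buffer) can be sketched as follows. This is only an illustration under stated assumptions: the pixel list is made up, the label DEV_OPEN is a hypothetical name for developed open space, and the actual workflow used GIS software rather than this code. The numeric class codes follow the standard NLCD legend.

```python
from collections import Counter

# Standard NLCD legend codes: 21-24 developed classes, 41-43 forest classes.
# DEV_OPEN is a hypothetical label; DEV and FOREST follow the merged categories in the text.
WATERSHED_CLASSES = {21: "DEV_OPEN", 22: "DEV", 23: "DEV", 24: "DEV",
                     41: "FOREST", 42: "FOREST", 43: "FOREST"}
# Forest types stay separate in the riparian buffer.
RIPARIAN_CLASSES = {41: "R_DECI", 42: "R_EVER", 43: "R_MIXED"}

def class_fractions(pixel_codes, mapping):
    """Fraction of all pixels that fall in each mapped category.

    Codes absent from the mapping count toward the total area only.
    """
    total = len(pixel_codes)
    if total == 0:
        return {}
    counts = Counter(mapping[c] for c in pixel_codes if c in mapping)
    return {name: n / total for name, n in counts.items()}

# Hypothetical 10-pixel extract (82 = cultivated crops, 11 = open water).
pixels = [41, 41, 42, 43, 22, 23, 21, 82, 82, 11]
print(class_fractions(pixels, WATERSHED_CLASSES))
print(class_fractions(pixels, RIPARIAN_CLASSES))
```

The same function serves both scales; only the pixel extract (watershed polygon versus 500m buffer) and the class mapping change.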
Table 1. Variables used.
quality station
Variable     Description                           Unit
TP           Total phosphorus                      µg/L
TN           Total nitrogen                        mg/L
PH           Potential of hydrogen                 /
COND         Conductivity                          µS/cm
WATER_TEM    Water temperature                     °C
Climate variables
TMEAN        Daily average temperature             °C
PPT          Daily total precipitation             mm
W_MTEM       Warmest monthly temperature           °C
C_MTEM       Coldest monthly temperature           °C
Rx1day       Monthly maximum 1-day precipitation   mm
surface <20%)
surface >20%)
SLOPE Average slope %
Soil variables
PLS path modeling is a sophisticated technique for estimating intricate causal relationships
within path models that incorporate latent variables (LV; Wold, 1966, 1980). In a PLS model,
the inner model consists of latent variables and their connecting paths, while the outer model
comprises measured variables (MV; Henseler et al., 2016; see Eq. 3). Within the inner model,
path coefficients (pc) quantify the relationships between LVs, whereas in the outer model, the
associations between LVs and MVs are represented by weights (W; Hair et al., 2014). Each
path coefficient represents the impact of independent variables (i.e., exogenous or ‘start of
path’) on dependent latent variables (i.e., endogenous or ‘end of path’). The estimated score
of an endogenous latent variable (e.g., LVp,c in Fig. 2; see Eq. 4) is calculated as the
weighted sum of all its connected exogenous latent variables, where the Ws are represented
by path coefficients:
$$LV_j = \sum_{i=1}^{n} \left( W_{ji} \cdot LV_i \right) \qquad (3)$$
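The weighted-sum calculation of an endogenous latent variable score can be illustrated with a short Python sketch. The latent variable names, scores, and path coefficients below are hypothetical placeholders chosen for illustration; they are not estimates from this study, and the sketch covers only the inner-model step, not the full iterative PLS algorithm.

```python
def endogenous_score(predictor_scores, path_coefficients):
    """Weighted sum of connected latent-variable scores for one observation.

    Both arguments are dicts keyed by latent-variable name.
    """
    if predictor_scores.keys() != path_coefficients.keys():
        raise ValueError("scores and path coefficients must cover the same latent variables")
    return sum(path_coefficients[k] * predictor_scores[k] for k in predictor_scores)

# Hypothetical standardized scores and path coefficients for one site.
scores = {"watershed_development": 0.5, "warm_climate": -1.0}
coefficients = {"watershed_development": 0.4, "warm_climate": 0.3}
print(round(endogenous_score(scores, coefficients), 3))  # 0.4*0.5 + 0.3*(-1.0)
```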
The PLS algorithm can produce either formative or reflective models. Reflective models
feature paths that originate from LV and terminate at MV, while the arrow directions are
effects of latent variables, while they are viewed as causes in formative models (Boccuzzo
and Fordellone, 2015). In our research, PLS was used as a formative model because we
We used the coefficient of determination (R2) for LVs and the Goodness of Fit (GoF) index
of the whole model to evaluate model performance. The GoF index serves as a pseudo
goodness-of-fit measure that accounts for the quality of both the measurement and
structural models. In a PLS model, R2 represents the percentage of
variance in the dependent variable that is explained by the predicting constructs (Oliveira et
al., 2019). Generally, the model performance can be classified into three levels: R2 <= 0.2 -
2.3.2 Multiple Linear Regression Models and Post-Hoc Analysis of the Interaction Effect
We applied multiple linear regression to explore the relationship between land cover, soil,
slope, daily climate, and TP. The TP value was log-transformed to approximate a normal
(Ponce-Palafox et al., 2019): land cover-slope interaction and land cover-soil interaction. To
control the number of independent variables and address the multicollinearity issues, we
performed least absolute shrinkage and selection operator (LASSO) regression for each
model to select the important main effects variables. This is a regression analysis method that
performs both variable selection and regularization, which can enhance the prediction
accuracy and interpretability of the regression models. LASSO model selection resulted in
the following formula for the linear regression model, where Y is the dependent variable (TP),
and bi represents the constant and the coefficient of different independent variables. T, P, O,
SLOPE, CLAY and SAND, respectively. Table 1 describes the variables in detail. The
Y = b0 + b1T + b2P + b3O + b4D + b5F + b6C + b7S + b8CL + b9SA + b10(D*S) + b11(C*S) +
performance, we used R2 and the Akaike information criterion (AIC) to compare Eq. 5 and the
model without interaction variables. Because the number of independent variables in these
two models differs, it is not appropriate to use only R2 to compare model performance. AIC
evaluates a model using both the number of independent
variables and the maximum likelihood estimate of the model. The best-fit model according to
AIC is the one that explains the most variance with the fewest independent variables. When
models are nested and the sample sizes of models are consistent, the model with the smallest
effects in the regression models. Perspective plots are particularly useful for exploring
complex interactions, finding optimal solutions, and identifying trends or patterns in data.
The perspective plot can be drawn as a contour graph that shows the relationship between several
independent variables and the dependent variable. From the contour graph of two
independent variables, we can then identify their joint effect and how the effect of one
Chapter 3: Results
3.1 Spatial Distribution of Key Variables
The study site and season were characterized by a mild climate and favorable ecological
conditions. The site experienced relatively cool temperatures, with 22℃ as the average
highest monthly temperature. Precipitation varied among sampling sites
and days, ranging from 0 to 46.20 mm of daily precipitation. The maximum Rx1day was
120.82 mm, with an average value of 32.82 mm, indicating relatively wet conditions during the
sampling season. The average forest land cover percentage was 37.8%, with notably high
forest coverage in the Southeastern Lake Ontario, Susquehanna, and Allegheny subregions (HUC4) and
areas around Lake Michigan (see Fig. 3). The major human disturbance was agricultural
development, with a 26.7% average crop coverage. The average developed/urban area was
only 5%, which would not likely cause stream degradation. Riparian quality was generally
good in the study area, with an average QR of 0.59 and an average QRVeg of 0.47. The riparian
forest predominantly consisted of deciduous trees (25.6% average coverage). We also found
that the spatial clustering of riparian quality was weak and not correlated with watershed
forest coverage. For example, sites in proximity to Lake Michigan and Lake Huron had high
Overall, the study area exhibited satisfactory conditions for both water quality and fish
diversity. Fish diversity was higher near Lake Superior and Lake Michigan, including
Northeastern Lake Huron, Northeastern Lake Michigan, and Northeastern Lake Michigan
subregions (HUC4), where forest dominated the watershed landscape. However, some
watersheds in east Lake Ontario and Lake Erie had low fish diversity, such as the
Southeastern Lake Ontario and the Allegheny subregion (see Fig. 3). The study area
exhibited low nutrient concentrations, with an average TP of 0.08 mg/l and an average TN of
1.98 mg/l. The highest TP concentrations were observed in the Great Miami subregions and
Mississippi Headwaters. The conductivity variation was relatively large, with elevated values
in the Southeastern Lake Huron, Southeastern Lake Michigan, and Wabash subregions (see
Fig. 3).
Table 2. Summarized statistics for variables used.
Climate variables
Slope variables
Soil variables
Fig. 3. Spatial distribution of the key variables (a) fish diversity, (b) warmest monthly
temperature, (c) forest land cover, (d) riparian quality index, (e) total phosphorus, and (f)
3.2 Variable Correlation
The correlation heatmap in Fig. 4 indicates that the variable with the highest correlation to
fish diversity was pH (r=0.44), followed by warmest monthly temperature (r=0.32) and water
temperature (r=0.27). At the riparian zone scale, fish diversity was, surprisingly, positively
correlated with the riparian disturbance score (r=0.26). This might be because riparian
disturbance leads to higher pH and conductivity, both of which were associated with higher fish diversity in
the study region. Also, more riparian disturbance was found in warm areas (as indicated by
the correlation of 0.43 between riparian disturbance and the warmest monthly temperature)
with high fish diversity. Among all watershed characteristics, the percentages of forest, crop,
and sand in the soil were the variables most correlated with fish diversity (r=0.22, -0.22, and
0.22, respectively). Fish richness was highly correlated with fish diversity, with the effect of
Nutrient concentration was associated with both watershed land cover and riparian quality,
although the association with watershed land cover was stronger. Forest was significantly negatively
correlated with TN and TP concentrations, while crop was significantly positively
correlated with them. Riparian forest coverage was negatively correlated with TN and TP concentrations,
with stronger correlations for deciduous and mixed forests than for evergreen forests. In general,
riparian forest coverage was more strongly related to TN and TP than the riparian quality and
vegetation quality scores. Higher forest cover in the upstream watershed and riparian zone was also associated
with lower pH and conductivity. We also found that watersheds with high cropland had lower
riparian quality and higher riparian disturbance. The warmest monthly temperature had a
significant association with many variables: warm areas had more disturbed riparian
zones, sandier soil, and higher pH, conductivity, and nutrient concentrations.
In the PLS model, we linked all the measured variables to six latent variables: watershed
development, warm climate, riparian quality, water physiochemistry, eutrophication, and fish
diversity. Watershed development was assessed using land use (developed, developed open,
forest, crop), slope, soil, and their interactions. According to the PLS results, forest, crop, and
their interaction with soil were the most influential factors on the latent variable of watershed
development. The warm climate latent variable was quantified using daily temperature,
warmest and coldest monthly temperatures, daily precipitation, and Rx1day. The most
significant factor for this latent variable was the warmest monthly temperature. Additionally,
extreme precipitation as indicated by Rx1day weighed more heavily than daily precipitation.
Watershed development and warm climate were exogenous variables that accounted for the
The endogenous latent variables were riparian quality, water physiochemistry, eutrophication, and
fish diversity. Riparian quality was assessed using different types of riparian forest
percentages (i.e., riparian evergreen forest, mixed forest, and deciduous forest percentages),
riparian quality index, riparian vegetation index, and riparian disturbance index. The
percentages of mixed and deciduous forests in the riparian zone predominantly influenced
this latent variable. Water physiochemistry was measured using three critical water quality
indicators affecting fish diversity: water temperature, pH, and conductivity, with conductivity
and pH weighing more heavily than water temperature. Eutrophication was represented by
total nitrogen and total phosphorus concentrations, which had nearly equal weightings. Lastly,
fish diversity was evaluated using the fish diversity index and richness index.
fish diversity had R2 values of 0.44, 0.46, 0.31, and 0.25, respectively. As a result,
watershed landscape, climate factors, and riparian quality demonstrated moderate predictive
power for water quality (Sanchez, 2013), while fish community characteristics were less
effectively captured by the measured variables. The relationships between latent variables can
The above equations indicate that higher fish diversity is associated with higher pH (within
the range of 5.43 to 8.67, with an average of 7.96), higher conductivity (within the range of
12.2 to 2773.60, with an average of 445.71), and warmer water temperatures (within the
fish diversity and richness. Warm and humid climate conditions lead to increased fish
diversity and richness. Furthermore, both eutrophication and water physiochemistry are
and a lower percentage of forest result in higher pH, conductivity, and water temperature.
The effect of urban development on nutrients, pH, conductivity, and water temperature is also
positive, but not as pronounced as that of cropland. Riparian quality can reduce eutrophication levels
and decrease water temperature, conductivity, and pH. Warm and humid climate conditions
After substituting the endogenous latent variables (eutrophication, water environment, and
riparian quality) on the right-hand side of Equations 6-9 with watershed development and
warm climate, we derived a model for fish diversity, eutrophication, and water environment
increasing TP, TN, pH, conductivity, and water temperature. The effect of warm climate on
pH, conductivity and water temperature was more pronounced than that on TP and TN.
Watershed development increased fish diversity because the disturbance in our study
site was only moderate (5% average percentage of development, see Table 2). In general, we
concluded that moderate development led to a low degree of eutrophication, thus not
Fig. 5. The outcomes of PLS model. (a) path coefficients between LVs within the inner model, (b)
weights of all factors for each LV.
decrease in the AIC value, indicating that interaction variables work efficiently with
main effects had an R2 of 0.368 and an AIC of 734.15, while the R2 and AIC of the regression
model with interaction effects were 0.420 and 714.14, respectively (see Table 3).
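The AIC comparison reported above can be reproduced arithmetically with a small Python sketch. The two AIC values are those given in the text; the helper functions are illustrative assumptions. In particular, `gaussian_aic` uses the common least-squares form n·ln(RSS/n) + 2k, which omits an additive constant, so it will not necessarily match the absolute AIC values produced by the statistical software used in this study.

```python
import math

def gaussian_aic(rss, n_obs, n_params):
    """AIC of a least-squares fit with Gaussian errors, up to an additive constant."""
    return n_obs * math.log(rss / n_obs) + 2 * n_params

def relative_likelihood(aic_best, aic_other):
    """exp((AIC_best - AIC_other) / 2): relative support for the higher-AIC model."""
    return math.exp((aic_best - aic_other) / 2)

# AIC values reported in the text for the two regression models.
aic_main_only = 734.15
aic_with_interactions = 714.14

delta = aic_with_interactions - aic_main_only
print(f"delta AIC = {delta:.2f}")
print(f"relative likelihood of main-effects model = "
      f"{relative_likelihood(aic_with_interactions, aic_main_only):.1e}")
```

A difference of about 20 AIC units is far beyond the commonly used rule-of-thumb threshold of roughly 10, so the main-effects-only model has very little relative support.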
The effect of crop on TP depended on soil type, while the effect of developed area on TP
varied with slope. Specifically, the main effect of crop was negative but not significant.
However, the CROP-CLAY interaction was positive, indicating that the influence of crop on TP became
more positive as the clay content of the soil increased. The CROP-CLAY
interaction graph (see Fig. 6 (a)) indicates that when the proportion of clay in the soil was less
than 20%, TP increased slowly with increasing cropland percentage. However, when the
percentage of clay in the soil was larger than 40%, TP increased much more rapidly as the
percentage of cropland increased. Developed area had a significantly negative main effect on
TP, but the effect became positive as the slope steepened, as indicated by the
significantly positive DEV-SLOPE interaction (see Table 3). In the DEV-SLOPE
interaction graph (see Fig. 6 (b)), when the average slope was less than 5%, an increase in the
percentage of developed land had almost no impact on TP. However, when the average slope was
greater than 15%, the percentage of developed land had a strong positive association with TP.
Overall, the individual influences of land cover, slope, and soil on TP were potentially
nonlinear because of their interactions with each other. In addition, daily precipitation showed
Table 3. Multiple linear regression results with interaction variables.
Predictors Estimates CI p
Observations 277
AIC 714.135
Fig. 6. Contour graph of interaction variables. (a) CROP-CLAY interaction graphs, and (b)
DEV-SLOPE interaction.
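The way an interaction term changes the sign of a main effect can be illustrated with a minimal Python sketch. In a linear model containing a DEV main effect and a DEV-SLOPE product term, the marginal effect of developed area on log-transformed TP is the main-effect coefficient plus the interaction coefficient times slope. The coefficient values below are hypothetical placeholders, not the estimates in Table 3, and the sketch assumes DEV enters no other interaction term.

```python
def marginal_effect(main_coef, interaction_coef, moderator):
    """Marginal effect of the focal variable at a given moderator value."""
    return main_coef + interaction_coef * moderator

def sign_change_point(main_coef, interaction_coef):
    """Moderator value at which the marginal effect crosses zero."""
    if interaction_coef == 0:
        raise ValueError("no interaction, so the effect never changes sign")
    return -main_coef / interaction_coef

# Hypothetical coefficients: negative DEV main effect, positive DEV*SLOPE term.
b_dev, b_dev_slope = -0.05, 0.01

for slope in (2, 5, 15):
    effect = marginal_effect(b_dev, b_dev_slope, slope)
    print(f"slope {slope:>2}%: marginal effect of DEV = {effect:+.3f}")
print(f"effect changes sign at slope = {sign_change_point(b_dev, b_dev_slope):.1f}%")
```

With these placeholder values, the effect of developed area is negative on gentle slopes, near zero around 5%, and clearly positive at 15%, which mirrors the qualitative pattern described for the DEV-SLOPE interaction.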
Chapter 4: Discussion
4.1 The Mechanisms of Fish Richness and Diversity Pathways
and fish diversity by identifying two distinct pathways– through changing eutrophication
levels and through altering other physicochemical indicators such as pH, conductivity, and
biodiversity has yielded inconsistent results. For example, many studies found that row-crop
agriculture had deleterious effects on fish communities, primarily because higher loads of
fine sediment reduced hatching success (Gido et al. 2010; Sternecker and Geist, 2010).
However, other studies have observed relatively high fish community conditions in heavily
agricultural areas (comprising over 50% of the basin), which may be attributable to the
influence of factors such as nutrient and sediment regulation (Meador and Goldstein, 2003).
Our findings indicated that agricultural land use augments nutrient levels, subsequently
temperature, and pH, thereby promoting fish diversity. The combined effects of the two
pathways showed an overall positive effect of watershed development on fish diversity. This
result supported the previous finding that freshwater fish communities were only sensitive to
some higher-level threshold of agricultural development (2%-37%, Chen and Older, 2020).
The average agricultural land cover of the study site was 26.72%, which might not have
caused fish diversity degradation. That said, in our study site, the positive effect of
agriculture on fish diversity through changing water physiochemistry outweighed its negative
effect through increasing eutrophication, given the relatively low in-stream TP concentration
(average 0.08 mg/l). However, it is plausible that with the progression of urbanization, this
relationship could be inverted. When watershed development reaches a certain threshold, the
diversity. Another possible reason is that the site climate was cool and humid, where the
The observed positive correlation between watershed development and fish diversity, as well
as the association between riparian disturbance and fish diversity, may be attributed to the
relatively low extent of development. This evidence lends support to the well-established
"intermediate disturbance hypothesis" (Huston, 2014), which posits that sites with
biodiversity. Specifically, it was found that food availability was highest in streams of open
completely forested stream (Albertson et al. 2017). However, this “intermediate disturbance
hypothesis” was ambiguous due to the imprecise definition of "intermediate." In this study,
disturbance" for watersheds surrounding the Great Lakes. Also, in addition to food
availability, we identified another possible mechanism by which intermediately disturbed sites
had high biodiversity: through slightly increasing water temperature, pH, and conductivity.
Existing research has yielded inconsistent conclusions regarding the relative importance of
riparian quality and catchment land use for water quality and fish
diversity; our research supported the significance of catchment land use. Some
researchers found that measures of water physiochemistry and riparian condition might be better
indicators of fish diversity than basin-wide land use because broader-scale land use
might not adequately capture local activities affecting fish communities (Meador and Goldstein,
2003; Lammert and Allan, 1999). However, other studies highlighted the effects of terrestrial land
use and urbanization on fish community composition, especially the effects of
erosion-prone land use types such as root crops and maize (Bierschenk et al. 2019; Sternecker
and Geist, 2010). Our research indicated that riparian quality mainly influenced fish diversity by
changing water physiochemistry, but this effect was not as strong as the effect of watershed
development. Moreover, our findings indicate that an intermediate level of riparian zone
disturbance enhances fish diversity, which contradicts numerous studies that have asserted
that riparian zones promote fish diversity by reducing nutrient and sediment input into
streams (Bierschenk et al. 2019). Riparian quality appeared less critical than watershed land
cover and did not exhibit a marked effect on eutrophication, possibly due to the relatively
favorable riparian quality in the study area, characterized by an average Riparian Quality
Index (QR) of 0.59 and an average deciduous tree cover of 25.6%. Consequently, urban
The warm climate in the study area exhibited a positive impact on fish diversity, which may
be attributed to the mild average summer temperature of 18.25℃ during the sampling season.
The daily temperature throughout this period did not surpass the optimal growth temperature
for the majority of species (Jobling, 1981; Tsuchida, 1995). Other research also showed fish
migration and diversity were positively related to the water temperature because warmer
temperature increased biomass by raising metabolism (Brodersen et al. 2011; Duffy et al.
2016). However, if water temperature keeps rising under climate change, fish diversity could
decrease once the temperature exceeds the preferred range of fish and begins to promote
eutrophication. In addition, extreme precipitation (Rx1day) also had a positive effect on fish
diversity in our study. This association may stem from increased streamflow resulting from
high precipitation, which in turn generates new habitats for fish and supplies additional food
resources (Cheung, 2018; Hollowed et al. 2013). Compared to Rx1day, daily precipitation
was a negligible factor for fish diversity. Overall, we identified the pathways through which
warm weather and high precipitation influenced fish diversity, both directly and indirectly
through changes in eutrophication and water physicochemistry. Any future projection of climate change
effects on fish community should jointly consider the direct effects of climate change and the
Influenced TP
Both developed area and slope negatively affected TP when their values were small, but the
effects became positive at larger values of developed area and slope because of the
significant positive interaction between them. Generally, steep slopes tend to increase surface
runoff volume (Wang et al. 2002; Sueker et al. 2001) and velocity (Lintern et al. 2018; Sliva
et al. 2001), which raises the chance that sediments and pollutants are mobilized by
mass failure or landslides (Bednarik et al. 2010). The observed negative effect of slope in our
study may be attributed to the prevalence of cropland and scarcity of forest in flat areas.
Previous studies also found that, in forested areas, slope had significantly negative
correlations with water quality variables because flat areas were occupied by cropland and
urban land (Yu et al. 2016). Therefore, land cover types and land use activities may outweigh
Regarding the main effect of developed area, our results were also inconsistent with most
existing research, which reports a positive influence of urban area on TP. In the process of
urbanization, large vegetative surfaces have been converted into built-up areas, leading to an
increase of impervious surfaces (Kim et al. 2017; Li et al. 2018) and therefore increased
surface runoff (Li et al. 2019) and deteriorated surface water quality (Owens et al. 2002). The
observed inconsistency may be due to the fact that when the percentage of developed area is
minimal, the impact of extensive cropland on water quality surpasses that of developed land,
The positive interaction of DEV-SLOPE could be explained by the fact that steeper slopes
reduced infiltration and increased surface runoff velocities. Based on the
contour plot of DEV-SLOPE (Fig. 4), we suggest that land use planning consider reducing the
recommend that regulations governing various land uses for TP pollution management be
Among the three types of soil, silt and clay both have high sorption capacity, resulting in their
positive association with TP (Lintern et al. 2018). Other research also found a positive
correlation between stream TP concentration and the silt and clay content of catchment soils
in Finland (Varanka et al. 2015). Unlike silt and clay, sand has a higher
porosity and holds less water, which leads to lower nutrient absorption. Consequently,
sandy soil improves water quality by reducing nutrient content (Andry et al. 2009), which
agreed with our regression model results. Clay soils may have exacerbated the impact
of cropland on TP because the high sorption capacity of clay increases the
uptake of fertilizer phosphorus into the soil, which is subsequently discharged into adjacent
rivers (Johnson et al. 1997). Therefore, we suggest that agricultural cultivation avoid
clay-soil areas in land use planning, especially areas with a clay content above 40%.
Moreover, although sand had a positive influence on water quality, a high percentage of sand
could lead to low water-holding capacity, excessive drainage of irrigation water, and poor
fertilizer use efficiency when planting (Andry et al. 2009). Therefore, the most suitable soil type for
planting should comprise a balanced composition (e.g., sandy loam), which accommodates
The major limitation of this study is that the quantification of the fish community did not
include species-related information. Given the relatively large study area, the same diversity
index could represent distinct species distributions across different regions, potentially
compromising the internal validity of the PLS model. Environmental factors influence fish
species in various ways. For example, research has demonstrated that climate change,
characterized by rising and variable temperatures, impacts diverse fish communities more
significantly than species-poor communities (Duffy et al. 2016). Also, different fish species
occurred at different pH conditions, with the lowest pH ranging from 3.1 to 7.0 (Leuven and
Oyen, 1987). Moreover, species richness is inherently different in regions with varying
temperature conditions (Dala-Corte et al. 2016). Therefore, without accounting for species, our fish
diversity and richness indices may not be comparable between watersheds. Future research
could consider partitioning data according to species distributions and constructing the
structural equation models separately. Another related issue was that the model did not
consider spatial clustering effects, which could be partially attributed to the spatial pattern of
mixed effects in the PLS model to deal with this issue, such as treating different regions as
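
To illustrate the first limitation discussed above, the following minimal Python sketch computes a Shannon diversity index for two hypothetical sites that share no species. The choice of the Shannon index, the species names, and the counts are assumptions made for illustration only; they are not the index definition or the data used in this thesis.

```python
import math


def shannon_index(abundances):
    """Shannon diversity H' = -sum(p_i * ln(p_i)) over species with nonzero counts."""
    total = sum(abundances.values())
    return -sum(
        (n / total) * math.log(n / total)
        for n in abundances.values()
        if n > 0
    )


# Hypothetical catches at two sites with no species in common.
site_a = {"brook trout": 50, "creek chub": 30, "white sucker": 20}
site_b = {"bluegill": 50, "green sunfish": 30, "common carp": 20}

h_a = shannon_index(site_a)
h_b = shannon_index(site_b)
richness_a, richness_b = len(site_a), len(site_b)
shared_species = set(site_a) & set(site_b)

print(f"H'(A) = {h_a:.3f}, H'(B) = {h_b:.3f}")
print(f"richness: A = {richness_a}, B = {richness_b}")
print(f"shared species: {len(shared_species)}")
```

Both sites return the same richness and the same index value even though their species lists are disjoint, which is why index values alone may not be comparable across regions with different species pools.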
Another limitation was the omission, owing to limited data availability, of some important
variables affecting water quality and fish, which could leave pathways in the PLS
model incomplete. Stream discharge, for instance, can alter fish diversity by affecting habitat quality and
restricting fish movement. Groundwater and baseflow were related to water temperature,
dissolved oxygen levels, and nutrient availability (Franssen et al. 2016; Poff et al. 2010; Rolls
et al. 2018). However, these hydrological regime-related variables were not available for
inclusion in the PLS model. Certain chemical indicators, such as dissolved oxygen and metals,
were also absent from the NRSA dataset. Because the study sites had little urban and industrial
land use, we expected concentrations of metals toxic to fish to be very low and
to cause little bias in the results. Dissolved oxygen was highly correlated with water
temperature, which was already included in the model. Another omitted variable was the
location of dams relative to the sampling sites. Dams could cause habitat fragmentation and
prevent fish from accessing important spawning and feeding areas (Liermann et al. 2012).
However, the initial quantification of upstream dams was only very weakly associated with fish
diversity, so we removed it from the PLS model. As a next step, we could
quantify the number of dams within different location buffers relative to the sampling sites
configuration, and land management (Liu et al. 2012; Brabec et al. 2009; Yu et al. 2016).
Here we argue that these variables were not very important for TP, considering the generally low
Our models were also subject to some multicollinearity and generalizability issues. Despite
the use of LASSO, the final regression models still exhibited moderate correlations among certain
independent variables (e.g., CROP and FOREST, DEV_OPEN and DEV). A similar situation
arose when we added interaction variables to the regression models, which could lead to some
biased coefficients in the MLR results. Moreover, the generalizability of this research could be
limited by the specific context of the study site. Our study sites were characterized by low
development intensity and favorable habitat quality. The data collection took place during the
summer months in cooler regions, with an average temperature of 19.90 ℃. Consequently,
our findings are biased towards favorable environmental conditions and may not be readily
Chapter 5: Conclusion
In this study, we identified pathways through which watershed development, climate, and
riparian quality impacted fish richness and diversity by altering water quality indicators. The
agricultural land use and decreased forest land use, which influenced fish richness and
diversity by raising pH, water temperature, conductivity, and nutrient levels. Moderate
watershed development, as indicated by the average developed area of 5% in the study site,
resulted in increased fish diversity. Although watershed development negatively affected fish
richness and diversity by increasing nutrient concentrations and eutrophication, this pathway
was outweighed by the positive effects of increasing pH, temperature, and conductivity.
Furthermore, watershed development had a more substantial impact on fish communities than
riparian quality. The warmest monthly temperature and Rx1day both positively influenced
This study also revealed how stream TP concentration was affected by the interaction
between watershed land cover, slope, and soil. The CROP-CLAY and DEV-SLOPE
interactions were both significant factors affecting TP. Specifically, a soil clay content higher
than 40% significantly increased the positive effect of cropland on TP, and a slope greater
than 15% significantly increased the positive effect of developed areas on TP.
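
As a hedged illustration of how a positive interaction can reverse the sign of a main effect, the sketch below evaluates the conditional effect of developed area on TP at several slopes. It assumes a linear model of the form TP = b0 + b1*DEV + b2*SLOPE + b3*DEV*SLOPE on untransformed variables. The coefficient values are hypothetical and are not the fitted estimates from this study.

```python
def marginal_effect(b_main, b_inter, moderator):
    """Conditional effect of x in y = b0 + b_main*x + b_mod*z + b_inter*x*z.

    This is dy/dx = b_main + b_inter * z, evaluated at moderator value z.
    """
    return b_main + b_inter * moderator


def crossover(b_main, b_inter):
    """Moderator value at which the conditional effect changes sign."""
    return -b_main / b_inter


# Hypothetical coefficients (NOT the fitted values from this study):
# a negative main effect of DEV and a positive DEV x SLOPE interaction.
B_DEV = -0.20
B_DEV_SLOPE = 0.02

for slope in (5, 15, 25):
    effect = marginal_effect(B_DEV, B_DEV_SLOPE, slope)
    print(f"slope = {slope:>2}%: effect of DEV on TP = {effect:+.2f}")

print(f"sign change near slope = {crossover(B_DEV, B_DEV_SLOPE):.1f}%")
```

With a negative b1 and a positive b3, the conditional effect b1 + b3*SLOPE is negative on gentle slopes and positive on steep ones, changing sign at SLOPE = -b1/b3. The same reasoning would apply to the CROP-CLAY interaction, with clay content as the moderator.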
the Great Lakes should focus on both watershed management and riparian zone protection to
mitigate the negative impact of nutrients on fish diversity. (2) Land use regulation and
nonpoint source programs for stream nutrient management should be tailored to regions with
Albertson, L. K., Ouellet, V., & Daniels, M. D. (2018). Impacts of stream riparian buffer land
use on water temperature and food availability for fish. Journal of Freshwater Ecology, 33(1),
Anbumozhi, Venkatachalam, Jay Radhakrishnan, and Eiji Yamaji. "Impact of riparian buffer
Andry, H., et al. "Water retention, hydraulic conductivity of hydrophilic polymers in sandy
soil as affected by temperature and water quality." Journal of Hydrology 373.1-2 (2009):
Mikuláš railway case study." Physics and Chemistry of the Earth, Parts A/B/C 35.3-5 (2010):
Bierschenk, A. M., Mueller, M., Pander, J., & Geist, J. (2019). Impact of catchment land use
on fish community composition in the headwater areas of Elbe, Danube and Main. Science of
Boccuzzo, Giovanna, and Mario Fordellone. "Comments about the use of PLS path modeling
Brabec, Elizabeth A. "Imperviousness and land-use policy: Toward an effective approach to
Brodersen, J., Nicolle, A., Nilsson, P. A., Skov, C., Brönmark, C., & Hansson, L. A. (2011).
Interplay between temperature, fish partial migration and trophic dynamics. Oikos, 120(12),
Bry, X., Trottier, C., Mortier, F., & Cornu, G. (2020). Component-based regularization of a
Chen, K., & Olden, J. D. (2020). Threshold responses of riverine fish communities to land
use conversion across regions of the world. Global Change Biology, 26(9), 4952-4965.
Cheung, W. W. L. (2018). The future of fishes and fisheries in the changing oceans. Journal
Dala‐Corte, R. B., Giam, X., Olden, J. D., Becker, F. G., Guimarães, T. D. F., & Melo, A. S.
(2016). Revealing the pathways by which agricultural land‐use affects stream fish
Dillon, P. J., and W. B. Kirchner. "The effects of geology and land use on the export of
Duffy, J. E., Lefcheck, J. S., Stuart-Smith, R. D., Navarrete, S. A., & Edgar, G. J. (2016).
Biodiversity enhances reef fish biomass and resistance to climate change. Proceedings of the
Fernandes, A. C. P., Sanches Fernandes, L. F., Cortes, R. M. V., & Leal Pacheco, F. A. (2019).
The role of landscape configuration, season, and distance from contaminant sources on the
Hair Jr, Joe F., et al. "Partial least squares structural equation modeling (PLS-SEM) An
emerging tool in business research." European business review 26.2 (2014): 106-121.
Franssen, N. R., Gido, K. B., Guy, C. S., Tripe, J. A., Shrank, S. J., Strakosh, T. R., ... &
Gido, K. B., Dodds, W. K., & Eberle, M. E. (2010). Retrospective analysis of fish community
change during a half-century of landuse and streamflow changes. Journal of the North
Henseler, Jörg, Geoffrey Hubona, and Pauline Ash Ray. "Using PLS path modeling in new
technology research: updated guidelines." Industrial management & data systems (2016).
Hiddink, J. G., MacKenzie, B. R., Rijnsdorp, A., Dulvy, N. K., Nielsen, E. E., Bekkevold,
D., ... & Ojaveer, H. (2008). Importance of fish biodiversity for the management of fisheries
Huston, M. A. (2014). Disturbance, productivity, and species diversity: empiricism vs. logic
Hollowed, A. B., Barange, M., Beamish, R. J., Brander, K., Cochrane, K., Drinkwater, K., ...
& Yamanaka, Y. (2013). Projected impacts of climate change on marine fish and
Jobling, M. (1981). Temperature tolerance and the final preferendum—rapid methods for the
Kaushal, Sujay S., et al. "Interaction between urbanization and climate variability amplifies
watershed nitrate export in Maryland." Environmental science & technology 42.16 (2008):
Keeler, B. L., Polasky, S., Brauman, K. A., Johnson, K. A., Finlay, J. C., O’Neill, A., ... &
Dalzell, B. (2012). Linking water quality and well-being for improved assessment and
Goonetilleke, Ashantha, et al. "Understanding the role of land use in urban stormwater
Kim, Hyun Woo, et al. "Exploring the impact of green space health on runoff reduction using
Lammert, M., & Allan, J. D. (1999). Assessing biotic integrity of streams: effects of scale in
measuring the influence of land use/cover and habitat structure on fish and
the distribution of fish species in shallow and lentic soft waters of The Netherlands: an
Li, Chunlin, et al. "Effects of urbanization on direct runoff characteristics in urban functional
Li, Fazhi, et al. "Green infrastructure practices simulation of the impacts of land use on
surface runoff: Case study in Ecorse River watershed, Michigan." Journal of environmental
Liermann, C. R., Nilsson, C., Robertson, J., & Ng, R. Y. (2012). Implications of dam
Lintern, A., et al. "Key factors influencing differences in stream water quality across
Liu, An, Ashantha Goonetilleke, and Prasanna Egodawatta. "Inadequacy of land use and
impervious area fraction for determining urban stormwater quality." Water Resources
Meador, M. R., & Goldstein, R. M. (2003). Assessing water quality at large geographic scales:
relations among land use, water physicochemistry, riparian condition, and fish community
Messmer, V., Jones, G. P., Munday, P. L., Holbrook, S. J., Schmitt, R. J., & Brooks, A. J.
Oliveira, Caroline Favaro, et al. "The modeling of pasture conservation and of its impact on
stream water quality using Partial Least Squares-Path Modeling." Science of The Total
Owens, Philip N., and Desmond E. Walling. "The phosphorus content of fluvial sediment in
rural and industrialized river basins." Water research 36.3 (2002): 685-701.
Poff, N. L., & Zimmerman, J. K. (2010). Ecological responses to altered flow regimes: a
literature review to inform the science and management of environmental flows. Freshwater
effects on water quality, growth and survival of shrimp Penaeus vannamei postlarvae raised
Prudencio, L., & Null, S. E. (2018). Stormwater management and ecosystem services: a
Reid, A. J., Carlson, A. K., Creed, I. F., Eliason, E. J., Gell, P. A., Johnson, P. T., ... & Cooke,
flow in an endangered Panamanian frog, Atelopus varius." Diversity and Distributions 15.5
(2009): 796-806.
Rolls, R. J., Heino, J., Ryder, D. S., Chessman, B. C., Growns, I. O., Thompson, R. M., &
Sanchez, G. (2013). PLS path modeling with R. Berkeley: Trowchez Editions, 383(2013),
Sliva, Lucie, and D. Dudley Williams. "Buffer zone versus whole catchment approaches to
studying land use impact on river water quality." Water research 35.14 (2001): 3462-3472.
Sternecker, K., & Geist, J. (2010). The effects of stream substratum composition on the
Su, G., Logez, M., Xu, J., Tao, S., Villéger, S., & Brosse, S. (2021). Human impacts on global
Sueker, Julie K., et al. "Effect of basin physical characteristics on solute fluxes in nine
Tsuchida, S. (1995). The relationship between upper temperature tolerance and final
Turunen, J., Elbrecht, V., Steinke, D., & Aroviita, J. (2021). Riparian forests can mitigate
Varanka, Sanna, Jan Hjort, and Miska Luoto. "Geomorphological factors predict water
quality in boreal rivers." Earth Surface Processes and Landforms 40.15 (2015): 1989-1999.
Wang, Guangxing, et al. "Spatial and temporal prediction and uncertainty of soil loss using
the revised universal soil loss equation: a case study of the rainfall–runoff erosivity R
Wang, Y., Liu, X., Wang, T., Zhang, X., Feng, Y., Yang, G., & Zhen, W. (2021). Relating
Wold, Herman. "Estimation of principal components and related models by iterative least
squares." Multivariate analysis (1966): 391-420.
Wold, Herman. "Model construction and evaluation when theoretical knowledge is scarce:
Yu, Songyan, et al. "Effect of land use types on stream water quality under seasonal variation
and topographic characteristics in the Wei River basin, China." Ecological Indicators 60
(2016): 202-212.
Zhao, Jing, et al. "A comparison between two GAM models in quantifying relationships of
environmental variables with fish richness and diversity indices." Aquatic ecology 48 (2014):