SAD
Introduction
The species abundance distribution (SAD) characterizes the distribution of
abundances of all species within a
sample or ecological community. The observation that most species are
relatively rare with only a few being common is often described as one of the
few ecological laws (McGill et al. 2007). The SAD is an important concept in
ecology and macroecology, being interesting in its own right as well as
providing the theoretical foundation for exploration of other ecological
patterns, such as the distance–decay relationship and the species–area
relationship (SAR; Preston 1948; McGill et al. 2007; Whittaker &
Fernández-Palacios 2007). Around 30 different SAD models have been proposed, with the
most commonly used being the log-normal (Preston 1948) and the logseries
(Fisher, Corbet & Williams 1943). While SADs have been researched for over
seventy years (e.g. Fisher, Corbet & Williams 1943), there has been a
resurgence of interest over the last decade (e.g. McGill et al. 2007). However,
much of this recent work has been on theoretical aspects of the pattern, and
less focus has been placed on the utility of SADs as a tool to provide useful
information in applied ecology, conservation and management. For instance,
a number of general SAD reviews have been published (e.g. May 1975;
Magurran 2004; McGill et al. 2007; McGill 2011), but few discuss the applied
uses of SADs (for an exception, see Dornelas, Soykan & Ugland 2011). Global
environmental change and disturbance to biotic communities resulting from
habitat loss, pollution and invasive species, amongst other drivers, represent
substantial pressures on biodiversity (Sala et al. 2000), in the light of which
applied ecologists and biodiversity managers require accurate, easy to use
and intuitive methods for measuring the impacts of the aforementioned drivers
of biodiversity decline (Mouillot et al. 2013). It is also necessary that these
methods allow workers to compare communities which differ in species
richness, an issue which plagues many traditional diversity metrics. We
believe the SAD represents such a toolkit. Thus, the distinctive purpose of
this review is to synthesize the information and recent advances relating to the
use of the SAD in applied ecology. To achieve this aim, we have split the
review into two broad sections regarding the utility of SADs in applied
ecology: (1) a discussion on how incorporating the SAD into the study of
ecological communities can generate useful information for the management
of biodiversity and (2) a review of the application, and potential application, of
SADs in biodiversity conservation and management. In the first section, we
review the recent literature on combining SADs with an assemblage
deconstruction approach in order to illustrate how the analysis of complete
assemblages (i.e. all sampled species) can obscure important abundance
patterns in species of conservation and management concern. We use recent
case studies of multimodal SADs to highlight the types of beneficial
information that can be derived from this approach. The second section
focuses more on the actual usage of SADs in conservation, including
application of SADs in measuring ecosystem health, and for conservation
planning frameworks.
ON PLOTTING SPECIES ABUNDANCE DISTRIBUTIONS
Before progressing, it is necessary to review briefly the two main methods for
plotting SADs, as both are discussed throughout the paper. First, the SAD can
be visualized as a histogram of the number of individuals on the x-axis and the
number of species represented by a particular abundance on the y-axis. The
numbers of individuals are generally binned into octaves using a variety of
different methods (see Gray, Bjørgesæter & Ugland 2006; Matthews &
Whittaker 2014). The use of binning in SAD studies has been criticized as it
results in the loss of information (Gray, Bjørgesæter & Ugland 2006). Thus, a
number of studies use a different plotting method, termed rank abundance
plots (e.g. Foster & Dunstan 2010; see Matthews & Whittaker 2014). Rank
abundance plots/diagrams (RADs) are plots of abundance (untransformed or
log-transformed) against rank order, where rank one corresponds to the
species with the highest abundance and so on. Generally, species with the
same abundance are assigned increasing ranks; for example, three species in
a sample, each represented by five individuals, might be given the ranks of 10, 11 and
12. However, it may in fact be preferable to assign such species the same
rank, in this case 10, 10 and 10. RADs are useful in that they can sometimes
reveal differences in model fits not apparent when using histograms (Fattorini
2005). Fattorini (2005) has recently shown that the geometric series and
broken stick SAD models (see Matthews & Whittaker 2014) can be fitted using
linear regression and RAD plots. His analyses demonstrate that when the
abundance data are log-transformed, a linear relationship indicates that the
SAD follows a geometric series, while the broken stick model is supported in
cases where a linear relationship emerges following log-transformation of the
rank axis.
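
The two plotting methods and the linear-regression check described above can be illustrated with a minimal Python sketch. It is not taken from any of the cited studies: the function names are hypothetical, the octave convention (octave k holds abundances from 2^k to 2^(k+1) - 1) is only one of the several binning methods mentioned above, and the example abundances are invented. The regression step follows the verbal description attributed to Fattorini (2005), using ordinary least squares and comparing the two fits by R-squared.

```python
import math
from collections import Counter


def octave_bins(abundances):
    """Count species per octave; octave k holds abundances 2**k .. 2**(k + 1) - 1.

    This is one assumed binning convention among several in the literature.
    """
    counts = Counter(n.bit_length() - 1 for n in abundances if n > 0)
    return [counts.get(k, 0) for k in range(max(counts) + 1)]


def rank_abundance(abundances, tied=False):
    """Return (rank, abundance) pairs, with rank 1 for the most abundant species.

    tied=False gives increasing ranks to equally abundant species;
    tied=True gives them all the rank of the first species in the tie.
    """
    ordered = sorted(abundances, reverse=True)
    pairs = []
    for position, n in enumerate(ordered, start=1):
        if tied and pairs and pairs[-1][1] == n:
            rank = pairs[-1][0]
        else:
            rank = position
        pairs.append((rank, n))
    return pairs


def r_squared(x, y):
    """R-squared of an ordinary least-squares straight line of y on x."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    return sxy ** 2 / (sxx * syy)


def linear_rad_fits(abundances):
    """Compare the two linearizations of a rank abundance plot.

    geometric_series: log(abundance) against rank.
    broken_stick: abundance against log(rank).
    """
    pairs = rank_abundance(abundances)
    ranks = [r for r, _ in pairs]
    counts = [n for _, n in pairs]
    return {
        "geometric_series": r_squared(ranks, [math.log(n) for n in counts]),
        "broken_stick": r_squared([math.log(r) for r in ranks], counts),
    }


# Invented sample in which each species is half as abundant as the previous one.
geometric_sample = [1024, 512, 256, 128, 64, 32, 16, 8, 4, 2]
print(octave_bins(geometric_sample))
print(rank_abundance([5, 5, 5, 9], tied=True))
print(linear_rad_fits(geometric_sample))
```

For the invented sample, the log-abundance fit is essentially perfect, which is the pattern expected under a geometric series; real data would rarely be this clean, and R-squared alone is a crude basis for model selection.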
Typically, empirical ecological and macroecological analyses are based on lists of
all species encountered during a sampling exercise, for example all birds seen or
heard in a patch of forest. However, it has become increasingly apparent that the
aggregation in a sample of species with differences in key ecological properties,
such as body size, dispersal ability or habitat affinity and specialization, can act to
obscure patterns of interest for particular subsets of species, a theory we have
termed the amalgamation hypothesis in previous work (Matthews, Borges &
Whittaker 2014). Thus, a number of recent studies have focused on splitting
samples into various subsets prior to analysis and then exploring patterns of
interest in each subset separately (e.g. Magurran & Henderson 2003; Ulrich &
Zalewski 2006; Bommarco et al. 2010). For example, in a recent paper, we have
shown that the amalgamation of specialist and generalist bird species in forest
fragment data sets can act to mask the loss of specialist species, generally those
species of most conservation concern, in response to habitat insularization (i.e.
generalists depress the slope of the island SAR; Matthews, Cottee-Jones &
Whittaker 2014; see also Bommarco et al. 2010). To take another example, this
time in the context of SADs, Labra, Abades & Marquet (2005) used a deconstruction
approach to look at invasive and native species and found that for US
birds, on average, invasive species obtain higher maximum abundances than
native species.
Table 1. History of multimodal SAD research (study: main findings)

[Study name not recovered]: One of the first studies to recognize the possibility of
multimodal SADs, stating that increasing a sample to include a broader range of
taxa (e.g. expanding from a focus on warblers to all birds) might result in the
combination of several logseries distributions, each with its own set of
parameters. More recently, Sizling et al. (2009) have discussed how
incorporating multiple areas and a variety of taxa within a sample can result in
a SAD that is a proportional sum of different partial SADs.

Ugland & Gray (1982): A problem with using the log-normal distribution to model
SADs is that it assumes equal density probabilities across the species
(symmetry), but in reality, there is a general pattern of asymmetry as
communities are comprised of three distinct abundance groups, roughly translated
as rare, intermediately abundant and common. Within each group, the SAD is
symmetric, which leads to a mixture of three log-normal distributions when
focusing on the whole assemblage.

Gray, Bjørgesæter & Ugland (2005): Found strong evidence of bimodal log-normal
distributions for a mixture of marine and terrestrial data sets.

Dornelas & Connolly (2008): Fitted a combination of one- to four-mode Poisson
log-normal (PLN) distributions, in addition to the logseries, for a large sample
of coral communities, Australia, and compared the various distributions using
maximum-likelihood methods. At the scale of the entire sample, the three-mode
PLN had the greatest support, but as the four-mode model was within an AIC of 2
of the three-mode model, this can also be concluded to have strong support.

[Study name not recovered]: The SADs of arthropod communities on Terceira Island,
Azores, could be deconstructed into three distinct abundance groups (abundant,
intermediate and rare).

[Study name not recovered]: Found that environmental heterogeneity underpinned
multimodal SADs of weed seed bank communities.

[Study name not recovered]: Tested a number of data sets representing various taxa
and found strong evidence of multimodality. This indicates that multimodality
may be more common than previously thought: it has simply been overlooked.

[Study name not recovered]: Using the methodology of Dornelas & Connolly (2008), it
was found that a two-mode PLN provided a better fit than the unimodal PLN and
logseries for a number of Azorean arthropod assemblages at a variety of spatial
scales.

While the possibility has long been recognized, it has become increasingly
apparent that many SADs may in fact be multimodal, that is, characterized by
multiple distinct modes (e.g. Pielou 1969; Dornelas & Connolly 2008;
Vergnon, van Nes & Scheffer 2012; Matthews, Borges & Whittaker 2014). A
number of explanations have been put forward to explain multimodal SADs,
including emergent neutrality theory (Vergnon, van Nes & Scheffer 2012; but
see Barabás et al. 2013) and the possibility that they simply represent
statistical artefacts (Gaston 1994; McGill et al. 2007). A more detailed history
of multimodal SAD research is presented in Table 1. However, perhaps the most
widely accepted explanation is the aforementioned amalgamation hypothesis (e.g.
Alonso, Ostling & Etienne 2008). In this regard, analysis of multimodal SADs often
reveals interesting information relating to community structure because the
different modes in the distribution have often been found to represent clusters of
different types of species (Ugland & Gray 1982; Borges et al. 2008; Matthews,
Borges & Whittaker 2014). That is, species are more similar, according to some
trait, to others within their own cluster than to species in other clusters. For instance, one
assemblage division which has proved enlightening in SAD studies has been into
core and satellite species subsets. Conceptually, core species represent the
constituent members of any ecological community and are predicted to be
structured according to traditional niche-based mechanisms (Ulrich & Zalewski
2006). Satellite species are those species only occasionally found in any community
(i.e. mostly immigrants from outside the local species pool) and are predicted
to be governed by stochastic processes, largely random dispersal. Using a
core–satellite division, Magurran & Henderson (2003), who focused on an estuarine fish
community in the UK over a period of 21 sampling years, discovered that those
species which are relatively abundant throughout the record and possess
specialized estuarine habitat requirements (the ‘core’ species) are characterized
by a log-normal distribution. In contrast, the species with low abundances,
infrequent records in the data set and different habitat requirements (the ‘satellite’
species) follow simple Poisson processes and are characterized by a logseries
distribution (Magurran & Henderson 2003). The two different sets of species leave
different signals within the SAD and when combined result in a distribution with
more rare species than predicted by the standard log-normal model (for further
examples, see Gray, Bjørgesæter & Ugland 2005; Ulrich & Zalewski 2006;
Unterseher et al. 2011). It is important to remember, however, that any division of
an assemblage into core and satellite species is a simplification of reality.
Empirical assemblages represent a continuum of species types, and workers must
be careful not to introduce artefacts into analyses when splitting species into
binary categories.
More recently, Matthews, Borges & Whittaker (2014) used a null model
approach to show that the SADs of Azorean forest arthropods were frequently
significantly bimodal, with the rarer mode of species predominantly
comprising satellite species and the common mode largely comprising core
species. Many of the satellite species in the rarer modes were (a)
species introduced to the Azores and (b) also classified as tourist species
(sensu Borges et al. 2008). Tourist species were defined as species present in
higher abundances in more anthropogenic land use types surrounding the
native forest patches (e.g. agricultural pastures or exotic forest plantations)
and for which it was assumed that native forest was not primary habitat
(Matthews, Borges & Whittaker 2014). Thus, although occurring rarely in the
samples, they were not species of conservation concern. The core–satellite dichotomy used in
conjunction with a multimodal SAD model in the above study was useful as it
allowed focus on the abundance of the different types of species and showed
that a significant proportion of the ‘rare’ species in the forest were in fact
species which managers would not want to conserve (e.g. introduced species;
see also Matthews, Cottee-Jones & Whittaker 2014). Dornelas et al. (2009)
have also shown that fitting multimodal SAD models to weed communities in
agroecosystems can generate useful information about how weed species
respond to environmental heterogeneity (discussed below).
In sum, it has become increasingly apparent that the amalgamation of multiple
groups of species within a sample can mask patterns that may be of
considerable interest to managers. A natural next step then is to deconstruct
full assemblages/samples into different subsets and to explore patterns in the
subsets separately. This does not have to add too much complexity to data
collection and analysis; simple divisions (e.g. core and satellite species,
specialists and generalists) can be informative.
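
As a rough illustration of how little extra work such a deconstruction requires, the following Python sketch splits a sample by a category label and builds an octave histogram for the full assemblage and for each subset. It is a sketch under stated assumptions rather than the procedure of any cited study: the species names, counts and core/satellite labels are invented, the function names are hypothetical, and the octave convention (octave k holds abundances from 2^k to 2^(k+1) - 1) is one assumed choice.

```python
from collections import Counter, defaultdict


def octave_histogram(abundances):
    """Species per octave; octave k covers abundances 2**k .. 2**(k + 1) - 1."""
    counts = Counter(n.bit_length() - 1 for n in abundances if n > 0)
    return [counts.get(k, 0) for k in range(max(counts) + 1)]


def deconstruct(records):
    """Group abundances by category from (species, abundance, category) rows."""
    subsets = defaultdict(list)
    for _species, abundance, category in records:
        subsets[category].append(abundance)
    return dict(subsets)


# Hypothetical sample: names, counts and labels are invented for illustration.
sample = [
    ("sp_a", 120, "core"), ("sp_b", 64, "core"),
    ("sp_c", 35, "core"), ("sp_d", 18, "core"),
    ("sp_e", 2, "satellite"), ("sp_f", 1, "satellite"),
    ("sp_g", 1, "satellite"), ("sp_h", 1, "satellite"),
]

full = octave_histogram([n for _, n, _ in sample])
by_group = {g: octave_histogram(a) for g, a in deconstruct(sample).items()}

print("full assemblage:", full)
for group, histogram in by_group.items():
    print(group, histogram)
```

In this invented case the full histogram has two separated peaks, and the split shows that the low-abundance peak consists entirely of satellite species, which is the kind of pattern the amalgamation hypothesis predicts.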
Section 2: The species abundance distribution in biodiversity management
and conservation
Although not as widespread in the management and conservation literature as
other macroecological patterns (e.g. the SAR), SADs are a potentially useful
tool for conservation scientists and managers as they can be used to ‘define’
rarity in a particular community. Conservation practices generally focus on
rare species within a delineated protected area boundary (Gaston 1994), and
a SAD can provide evidence of the level of rarity of particular species of
interest relative to other species (McGill 2011) and thus inform assessments of
extinction risk and associated conservation action. In addition, changes in the empirical SAD can
act as an early warning for the effects of disturbance on biodiversity, as the
shape of the SAD can change markedly before any local extinctions occur
(Hågvar 1994; Mouillot et al. 2013). Disturbance to ecological communities
arising from drivers such as habitat loss, invasive species and pollution is a
pervasive feature of modern times (Sala et al. 2000), and there is thus an
exigent need for the development of tools that allow for relatively quick
assessments of ecosystem health and/or the success of management
prescriptions aimed at ameliorating the effects of disturbance (Mouillot et al.
2013). We argue here that the SAD represents an example of such a tool, and
we use this section to review a number of particularly promising areas of
application.
A summary of methods
Histograms are not the only method for plotting the SAD, and Fig. 3 provides a
graphical summary of the various SAD methods available for examining the
impact of disturbance and management actions on ecological communities.
For example, Lambshead, Platt & Shaw (1983) pioneered the K dominance plot
(Fig. 3a). This method plots cumulative abundance percentages against
species rank and has been argued to be a useful way of determining the effect
of disturbance on the SAD, but has been little explored. In K dominance plots,
if a curve lies completely below another curve (as community A does in Fig.
3a), then this community can be defined as being more diverse. A similar
method derived in the context of marine benthic communities uses K
dominance plots to compare the distribution of individuals amongst species,
with that of biomass (Fig. 3b; e.g. Warwick 1986). It is based on the theoretical
consideration that the standard individual-based SAD should behave
differently from the biomass-based SAD when affected by disturbance. Unpolluted
systems are characterized by the biomass curve appearing above the
abundance curve, as in Fig. 3b, and polluted systems by the reverse. Methods based on departures
from a log-normal SAD shape (using histograms) and comparisons of model
parameters (Fig. 3c,d) have been discussed above. The empirical cumulative
distribution function (ECDF) can be plotted instead of the commonly used
probability density function and can be useful for highlighting differences in
SADs between communities (Fig. 3e). For instance, three hypothetical SADs
(here, we have formulated the data ourselves for effect) are plotted in ECDF
form in Fig. 3e, each relating to a river invertebrate community under different
pollution scenarios: heavily and mildly polluted, and unpolluted. In (e), the
unpolluted curve lies below the two polluted curves at the left hand side of the plot,
indicating that there are higher proportions of low abundance species in the
two polluted communities. Finally, the slopes of SAD models plotted using
rank abundance diagrams (RAD) offer an alternative to histograms for
comparing SADs between communities (see Fig. 3f).
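
The K dominance and ECDF methods summarized above are simple to compute. The following Python sketch is illustrative only: the function names are hypothetical, the two example communities are invented, and padding the shorter curve at 100% when communities differ in richness is an assumption made here so that curves can be compared rank by rank.

```python
def k_dominance(abundances):
    """Cumulative abundance percentages by species rank (rank 1 = most abundant)."""
    ordered = sorted(abundances, reverse=True)
    total = sum(ordered)
    curve, running = [], 0
    for n in ordered:
        running += n
        curve.append(100.0 * running / total)
    return curve


def lies_below(curve_a, curve_b):
    """True if curve_a never exceeds curve_b and is strictly lower somewhere.

    The shorter curve is padded at 100% (an assumption of this sketch), since
    a community's cumulative percentage stays at 100 beyond its last species.
    """
    length = max(len(curve_a), len(curve_b))
    a = curve_a + [100.0] * (length - len(curve_a))
    b = curve_b + [100.0] * (length - len(curve_b))
    pairs = list(zip(a, b))
    return all(x <= y for x, y in pairs) and any(x < y for x, y in pairs)


def ecdf(abundances):
    """Proportion of species with abundance <= x, for each observed x."""
    ordered = sorted(abundances)
    richness = len(ordered)
    return [
        (x, sum(1 for n in ordered if n <= x) / richness)
        for x in sorted(set(ordered))
    ]


# Invented communities: one perfectly even, one dominated by a single species.
even = [10] * 10
dominated = [91, 1, 1, 1, 1, 1, 1, 1, 1, 1]

print(lies_below(k_dominance(even), k_dominance(dominated)))
print(ecdf(dominated))
```

Here the even community's curve lies completely below that of the dominated community, so by the criterion given above it would be defined as the more diverse of the two.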
OTHER APPLICATIONS
Reviewing the literature reveals that SADs have been used in numerous other
subfields within applied ecology, albeit sometimes sporadically. However,
these uses have not been synthesized within a single review, and thus, the full
potential of SAD models in applied ecology may be underappreciated. As
such, we now review these different applications of SADs under five broad
headings.
Conservation planning
The use of abundance data, and in particular of SADs, has significant potential
in the field of conservation planning (e.g. Pearce & Ferrier 2001; Dunstan et
al. 2012). For instance, a study by Dunstan et al. (2012) explored the use of
SAD information in the selection of biodiversity hotspots for the benthic fish
and invertebrates of the continental slope and shelf, south-west Australia.
Using a novel form of rank abundance diagram (see Foster & Dunstan 2010),
these authors found that incorporating SAD information greatly improved the
identification of biodiversity hotspots as it provided novel information on
which areas had a high proportion of rare species. The shape of RAD curves
can be used to determine the proportion of relatively rare species and thus is
useful for comparing sites and selecting necessary sites for conservation. In a
further study, an analysis based on the logseries SAD predicted both the
number of tree species in Amazonia and patterns of dominance and rarity (ter
Steege et al. 2013). These striking results were of particular conservation
importance as it was predicted that as many as half the individual trees in the
Amazon belong to a group of 227 ‘hyper-dominant’ species, meaning the
remaining 50% of individuals are distributed across over 10 000 species. This
indicates that the majority of the 10 000 species are very rare, a fact obscured
when only focusing on the species richness of the region. Such findings are
interesting in themselves, but it is the general message of these papers that is
most relevant in the context of this review, namely that most biodiversity
conservation and land acquisition decisions are largely based on the use of
species richness as an index of biodiversity (Pearce & Ferrier 2001; Dunstan et
al. 2012), which may not reveal the whole picture. It is only when one
considers the SAD of a system that any idea of the distribution of rarity and
community structure can be inferred, and thus, ultimately conservation
planning could be greatly improved through broader appreciation of the
potential information content of SADs.
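
To make the logseries-based reasoning above more concrete, the following Python sketch estimates Fisher's alpha from the standard logseries relationship S = alpha * ln(1 + N/alpha), and counts how many of the most abundant species account for half of all individuals. It is not the method of ter Steege et al. (2013): the function names are hypothetical, the bisection solver is a choice made for this sketch, and the values N = 2396 and S = 52 are simply those of the simulated community described in the Fig. 3 caption.

```python
import math


def fishers_alpha(n_individuals, n_species, tol=1e-10):
    """Solve S = alpha * ln(1 + N / alpha) for alpha by bisection.

    The left-hand side rises from 0 towards N as alpha grows,
    so a unique root exists whenever 0 < S < N.
    """
    if not 0 < n_species < n_individuals:
        raise ValueError("requires 0 < S < N")

    def excess(alpha):
        return alpha * math.log1p(n_individuals / alpha) - n_species

    lo, hi = 1e-12, 1.0
    while excess(hi) < 0:          # expand the bracket until it contains the root
        hi *= 2.0
    while hi - lo > tol * hi:      # shrink the bracket to relative tolerance
        mid = 0.5 * (lo + hi)
        if excess(mid) < 0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def species_holding_half(abundances):
    """Number of top-ranked species that together hold at least half the individuals."""
    ordered = sorted(abundances, reverse=True)
    half = sum(ordered) / 2
    running = 0
    for count, n in enumerate(ordered, start=1):
        running += n
        if running >= half:
            return count
    return len(ordered)


alpha = fishers_alpha(2396, 52)    # N and S from the Fig. 3d simulated community
print(alpha)
print(species_holding_half([40, 30, 20, 10]))
```

A small count from the second function relative to total richness signals the kind of strong dominance, and correspondingly widespread rarity, that species richness alone does not reveal.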
An integral part of conservation biogeography over the last 30 years has been
predicting extinctions related to habitat loss and fragmentation (Whittaker &
Fernández-Palacios 2007; Ladle & Whittaker 2011). This endeavour has largely
centred on the backwards use of the power law SAR model, a methodology
which has been criticized for being theoretically incorrect and producing
erroneous extinction estimates (see Whittaker & Fernández-Palacios 2007;
He & Hubbell 2011). However, recent work has shown that incorporating the
SAD into extinction predictions can greatly improve the accuracy of various
metrics (Kitzes & Harte 2014). For instance, the ‘extinction–area relationship’
and ‘probabilistic species–area relationship’ of Kitzes & Harte (2014) are
based on the logseries SAD and upper-truncated geometric spatial abundance
distribution and have been shown to be more flexible and theoretically
appropriate than the power law SAR.
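
For readers unfamiliar with the ‘backwards’ use of the power law SAR referred to above, the following Python sketch shows the calculation in its simplest form: if S = c * A^z, the fraction of species predicted to be lost when area shrinks is 1 - (A_after / A_before)^z. This is the criticized approach, shown only to make the term concrete; it is not the method of Kitzes & Harte (2014). The function name is hypothetical and the default z = 0.25 is an illustrative value assumed here, not one taken from the text.

```python
def backward_sar_fraction_lost(area_before, area_after, z=0.25):
    """Fraction of species predicted lost under S = c * A**z when area shrinks.

    The constant c cancels, so only the area ratio and the exponent z matter.
    z = 0.25 is an assumed, illustrative exponent.
    """
    if not 0 < area_after <= area_before:
        raise ValueError("requires 0 < area_after <= area_before")
    return 1.0 - (area_after / area_before) ** z


# Illustrative case: 90% of the original habitat area is lost.
print(backward_sar_fraction_lost(100.0, 10.0))
```

Because the calculation uses only area and a single exponent, it carries no information about how individuals are distributed among species, which is the gap that SAD-based approaches aim to fill.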
Finally, focusing on changes to the SAD through time is likely to provide interesting
additional insights for biodiversity conservation. For instance, if we return to the
area of conservation planning, it can be seen that there has been an increasing
focus on temporal turnover and long-term persistence in reserve selection in the
last two decades (e.g. Rodrigues, Gregory & Gaston 2000; van Teeffelen, Cabeza
& Moilanen 2006). Specifically, it has been postulated that reserves and reserve
networks are likely to be more successful in their aim to conserve biodiversity in
the long term if the reserve sites are selected through incorporation of abundance
data of any species of conservation interest (Rodrigues, Gregory & Gaston 2000).
Sites should be located where these species are locally abundant, and thus, the
probability of long-term persistence is higher. SADs can help in this endeavour as
they give a broader perspective of relative abundances of a set of species in the
network or at particular sites. For instance, theoretical work has shown that
varying the assumptions on the processes affecting variations in population size
through time leads to different SADs (see discussion in Sæther, Engen & Grøtan
2013).
Fig. 3. Species abundance distribution-based methods for determining the effect
of disturbance in applied ecology using simulated (a–e) and empirical (f) data. (a)
A K dominance plot (i.e. a plot of cumulative abundance percentages against
species rank; Lambshead, Platt & Shaw 1983). If a curve lies completely below
another curve [as community A does in (a)], then this community can be defined
as being more diverse. The vertical dashed lines in (a) indicate the species
richness of each community. (b) Abundance/biomass plots (i.e. a K dominance plot
that compares the distribution of individuals amongst species with that of
biomass; see Warwick 1986). Unpolluted systems are characterized by the
biomass curve appearing above the abundance curve, as in (b), and vice versa. (c)
The fit of the Poisson log-normal (PLN; here, the zero-truncated form) to an empirical SAD
(here, binned) is often used to assess the impact of disturbance on species
assemblages. (d) Variation in the logseries alpha along a hypothetical disturbance
gradient in which an increasing proportion of the individuals from a community
(simulated to follow a log-normal SAD; N = 2396, S = 52) are randomly lost
through sampling from the community (left to right on the x-axis represents
increasing hypothetical disturbance and loss of individuals). Sampling was
repeated 100 times (dots = median alpha values; shading = 95% confidence
intervals). (e) Three hypothetical SADs plotted using the empirical cumulative
distribution function (ECDF). The data were created for this figure in order to
illustrate the method, but might for instance relate to river invertebrate
communities subjected to different degrees of pollution: heavily and mildly
polluted, and unpolluted. In (e), the unpolluted curve lies below the two polluted
curves at the left hand side of the plot, indicating that there is a higher
proportion of low abundance species in the polluted communities. (f) The SADs of
the pooled polluted (triangles) and unpolluted (circles) Norwegian marine
invertebrate samples (see Fig. 2 caption) are plotted in rank abundance form. The
fits of the truncated log-normal model to the unpolluted samples, and the
logseries model to the polluted data, are displayed. In the polluted sample, it is
clear that there are more observed rare species than predicted even by the logseries.
The plots were constructed using the ggplot2 R package (Wickham 2009).