3 Frequency Analysis
3.1 General
Water resource systems must be planned for future events for which no exact time of occurrence can be forecast. Hence, the hydrologist must give a statement of the probability that stream flows (or other hydrologic factors) will equal or exceed (or be less than) a specified value. These probabilities are important to the economic and social evaluation of a project. In most cases, absolute control of floods or droughts is impossible. Planning to control a flood of a specific probability recognizes that the project will be overtaxed occasionally and damages will be incurred. However, repairing those damages should be less costly in the long run than building initially to protect against the worst possible event. The planning goal is not to eliminate all floods but to reduce the frequency of flooding, and hence the resulting damages. If the socio-economic analysis is to be correct, the probability of flooding must be estimated accurately. For major projects whose failure would seriously threaten human life, a more extreme event, the probable maximum flood, has become the standard for designing the spillway.
This chapter deals with techniques for defining probability from a given set of data and with special methods employed for determining the design flood for major hydraulic structures.
Frequency analysis is the hydrologic term used to describe the probability of occurrence of a particular
hydrologic event (e.g. rainfall, flood, drought, etc.). Therefore, basic knowledge about probability (e.g.
distribution functions) and statistics (e.g. measure of location, measure of spread, measure of skewness, etc.) is
essential. Frequency analysis usually requires recorded hydrological data.
Hydrological data are recorded either as a continuous record (e.g. water level or stage, rainfall, etc.) or in
discrete series form (e.g. mean daily/monthly/annual flows or rainfall, annual series, partial series, etc.).
For planning and designing of water resources development projects, the important parameters are river
discharges and related questions on the frequency & duration of normal flows (e.g. for hydropower
production or for water availability) and extreme flows (floods and droughts).
The flow duration curve (FDC) only applies for the period for which it was derived. If this is a long period, say more than 10 to 20 years, the FDC may be regarded as a probability curve or flow-frequency curve, which may be used to estimate the percentage of time that a specified discharge will be equalled or exceeded in the future. An example is demonstrated in Table 4.1 below.
The shape of the flow-duration curve gives a good indication of a catchment's characteristic response to its average rainfall history. An initially steeply sloped curve results from a very variable discharge, usually from small catchments with little storage, where the stream flow reflects the rainfall pattern directly. Flow duration curves with a very flat slope indicate little variation in the flow regime, the result of the damping effect of large storages.
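As an illustration of how a flow duration curve is assembled from a discharge record, the short Python sketch below ranks the flows and assigns each one the percentage of time it is equalled or exceeded. It is only a minimal sketch: the function name flow_duration_curve and the ten-value record are illustrative, and a Weibull-type plotting position m/(n + 1) is assumed for the exceedance percentage.

```python
import numpy as np

def flow_duration_curve(flows):
    """Return (exceedance_percent, sorted_flows) for a flow-duration curve.

    flows: 1-D sequence of observed discharges (e.g. mean daily flows, m^3/s).
    """
    q = np.sort(np.asarray(flows, dtype=float))[::-1]    # descending order
    n = q.size
    # Percentage of time each discharge is equalled or exceeded
    # (Weibull-type plotting position m/(n + 1), expressed in percent).
    exceedance = 100.0 * np.arange(1, n + 1) / (n + 1)
    return exceedance, q

# Illustrative record of mean daily flows (m^3/s):
daily_flows = [12.0, 8.5, 30.2, 4.1, 6.3, 22.7, 15.4, 9.9, 5.0, 3.2]
pct, q = flow_duration_curve(daily_flows)
# Discharge equalled or exceeded about 50% of the time:
q50 = np.interp(50.0, pct, q)
```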
Adequacy refers primarily to the length of record, but sparsity of data-collecting stations is often a problem. The observed record is merely a sample of the total population of floods that have occurred and may occur again. If the sample is too small, the probabilities derived cannot be expected to be reliable. Available stream flow records are too short to provide an answer to the question: how long must a record be to define flood probabilities within acceptable tolerances?
Accuracy refers primarily to the problem of homogeneity. Most flow records are satisfactory in terms of intrinsic accuracy, and if they are not, there is little that can be done with them. If the reported flows are unreliable, they are not a satisfactory basis for frequency analysis. Even if reported flows are accurate, they may be unsuitable for probability analysis if changes in the catchment have caused a change in the hydrologic characteristics, i.e., if the record is not internally homogeneous. Dams, levees, diversions, urbanization, and other land-use changes may introduce inconsistencies. Such records should be adjusted to current conditions or to natural conditions before use. There are two data series of floods:
(i) The annual series, and
(ii) The partial duration series.
The annual series is the data series formed by the single maximum daily/monthly/annual discharge in each year of record, so that the number of data values equals the record length in years. For statistical purposes, it is necessary to ensure that the selected peak discharges are independent of one another. This data series is necessary if the analysis is concerned with probabilities less than 0.5. However, when interest is limited to relatively rare events, the analysis can also be carried out on a partial duration series so as to include more frequent events.
The partial duration series constitutes the data series with those values that exceed some arbitrary level.
All the peaks above a selected level of discharge (a threshold) are included in the series and hence the
series is often called the Peaks Over Threshold (POT) series. There are generally more data values for
analysis in this series than in the annual series, but there is more chance of the peaks being related and
the assumption of true independence is less valid.
The probability p that an event will be equalled or exceeded in any year and its return period Tr are related by

p = 1/Tr    (3.2)
To plot a series of peak flows as a cumulative frequency curve it is necessary to decide on a probability or return period to associate with each peak. There are various formulas for defining this value, as shown in Table 3.2.
The probability of occurrence of the event r times in n successive years can be obtained from:

Pr,n = nCr P^r q^(n−r) = [n! / ((n − r)! r!)] P^r q^(n−r)    (3.3)

where P = probability of occurrence of the event in any one year (P = 1/T) and q = 1 − P.
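The binomial relation of Eq. (3.3) is easy to evaluate directly. The sketch below, with the illustrative function name prob_r_in_n, computes the probability that an event of return period T occurs exactly r times in n successive years, taking P = 1/T from Eq. (3.2).

```python
from math import comb

def prob_r_in_n(r, n, T):
    """Probability that the T-year event occurs exactly r times in n years (Eq. 3.3)."""
    P = 1.0 / T            # annual exceedance probability, Eq. (3.2)
    q = 1.0 - P
    return comb(n, r) * P**r * q**(n - r)

# Example: probability that a 50-year flood occurs exactly twice in 30 years
p2 = prob_r_in_n(2, 30, 50)    # roughly 0.10
```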
Consider, for example, a list of flood magnitudes of a river arranged in descending order as shown in Table 3.3. The length of record is 50 years.
The last column shows the return period T of the various flood magnitudes Q. A plot of Q vs T yields the probability distribution. For small return periods (i.e. for interpolation) or where limited extrapolation is required, a simple best-fitting curve through the plotted points can be used as the probability distribution. A logarithmic scale for T is often advantageous. However, when larger extrapolations of T are involved, theoretical probability distributions (e.g. Gumbel extreme-value, Log-Pearson Type III, and log-normal distributions) have to be used. In frequency analysis of floods the usual problem is to predict extreme flood events. Towards this, specific extreme-value distributions are assumed and the required statistical parameters are calculated from the available data. Using these, the flood magnitude for a specific return period is estimated.
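The ranking used to build a table such as Table 3.3 can be sketched as follows. The example assumes the Weibull plotting position T = (N + 1)/m (one of the formulas of Table 3.2, and the one used later for the Gumbel plotting positions); the ten-value peak series is illustrative only.

```python
import numpy as np

def weibull_return_periods(annual_peaks):
    """Rank annual peaks (largest first) and assign Weibull return periods T = (N + 1)/m."""
    q = np.sort(np.asarray(annual_peaks, dtype=float))[::-1]   # rank m = 1 is the largest flood
    N = q.size
    m = np.arange(1, N + 1)
    T = (N + 1) / m          # return period in years
    P = 1.0 / T              # exceedance probability
    return q, T, P

# Illustrative annual peak series (m^3/s):
peaks = [410, 560, 320, 480, 700, 390, 450, 610, 350, 520]
q_ranked, T, P = weibull_return_periods(peaks)
# q_ranked[0] is the largest observed flood, with T = (N + 1)/1 = 11 years here.
```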
Chow has shown that most frequency-distribution functions applicable in hydrologic studies can be expressed by the following equation, known as the general equation of hydrologic frequency analysis:

xT = x̄ + K σ    (3.4)

where xT = value of the variate X of a random hydrologic series with a return period T, x̄ = mean of the variate, σ = standard deviation of the variate, and K = frequency factor, which depends upon the return period T and the assumed frequency distribution.
Gumbel makes use of a reduced variate y as a function of q, which allows the distribution to be plotted as a linear relation between y and X (the maximum flow in this case). Gumbel defined a flood as the largest of the 365 daily flows, so that the annual series of flood flows constitutes a series of largest values of flow. According to his theory of extreme events, the probability of occurrence of an event equal to or larger than a value x0 is
P = 1 − e^(−e^(−y))    (3.5)

where y is a dimensionless reduced variate given by

y = α (x − a),  with α = 1.2825/σx and a = x̄ − 0.45005 σx    (3.6)

so that

y = 1.2825 (x − x̄)/σx + 0.577    (3.7)

where x̄ and σx are the mean and the standard deviation of the variate X.
In practice it is the value of X for a given P that is required; as such, Eq. (3.5) is transposed to give the reduced variate in terms of the probability:
yp = −ln[−ln(1 − P)]    (3.8)

Noting that P = 1/T (Eq. 3.2), the reduced variate for a return period T is

yT = −ln[ln(T/(T − 1))]    (3.9)

or, in terms of common logarithms,

yT = −[0.834 + 2.303 log log(T/(T − 1))]    (3.10)
Now rearranging Eq. (3.7) the value of the variate X with a return period T is
xT = x̄ + (yT − 0.577) σx / 1.2825    (3.11)

which can be written as

xT = x̄ + K σx,  where K = (yT − 0.577)/1.2825    (3.12)
Note that Eq. (3.12) is of the same form as the general equation of hydrologic frequency analysis, Eq. (3.4). Further, Eqs. (3.11) and (3.12) constitute the basic Gumbel equations and are applicable to an infinite sample size (i.e. N → ∞).
Since practical annual data series of extreme events such as floods, maximum rainfall depths, etc., all have finite lengths of record, Eq. (3.12) is modified to account for finite N as given below for practical use.

xT = x̄ + K σn−1    (3.13)

where σn−1 = standard deviation of the sample of size N, and the frequency factor is

K = (yT − ȳn)/Sn    (3.14)

in which ȳn = reduced mean and Sn = reduced standard deviation, both functions of the sample size N (Tables 3.4 and 3.5), and

yT = −ln[ln(T/(T − 1))]    (3.15)
These equations are used under the following procedure to estimate the flood magnitude corresponding to a given return period based on the annual flood series.
1. Assemble the discharge data and note the sample size N. Here the annual flood value is the variate X. Find x̄ and σn−1 for the given data.
2. Using Tables 3.4 and 3.5, determine ȳn and Sn appropriate to the given N.
3. Find yT for the given T by Eq. (3.15).
4. Find K by Eq. (3.14).
5. Determine the required xT by Eq. (3.13).
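Steps 1 to 5 can be sketched in Python as below. The function name gumbel_flood_estimate is illustrative; ȳn and Sn should be taken from Tables 3.4 and 3.5 for the actual sample size, with the large-sample values 0.577 and 1.2825 used only as a fallback, and the 30-year record is synthetic, generated purely for demonstration.

```python
import numpy as np

def gumbel_flood_estimate(annual_peaks, T, y_n=0.577, S_n=1.2825):
    """T-year flood by Gumbel's method, Eqs. (3.13)-(3.15).

    y_n, S_n : reduced mean and reduced standard deviation for the sample
    size N (Tables 3.4 and 3.5); defaults are the large-sample values.
    """
    x = np.asarray(annual_peaks, dtype=float)
    x_mean = x.mean()
    sigma = x.std(ddof=1)                      # sigma_(n-1), step 1
    y_T = -np.log(np.log(T / (T - 1.0)))       # Eq. (3.15), step 3
    K = (y_T - y_n) / S_n                      # Eq. (3.14), step 4
    return x_mean + K * sigma                  # Eq. (3.13), step 5

# Example: 100-year flood from a synthetic 30-year record, with the commonly
# tabulated values y_n = 0.5362 and S_n = 1.1124 for N = 30 (step 2).
peaks = np.random.default_rng(1).gumbel(400.0, 90.0, size=30)
x100 = gumbel_flood_estimate(peaks, T=100, y_n=0.5362, S_n=1.1124)
```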
To verify whether the given data follow the assumed Gumbel's distribution, the following procedure may be adopted. The values of xT for some return periods T < N are calculated by using Gumbel's formula and plotted as xT vs T on a convenient paper such as semi-log, log-log or Gumbel probability paper. The use of Gumbel probability paper results in a straight line for the xT vs T plot. Gumbel's distribution has the property of giving T = 2.33 years for the average of the annual series when N is very large. Thus the value of a flood with T = 2.33 years is called the mean annual flood. In graphical plots this gives a mandatory point through which the line showing the variation of xT with T must pass. For the given data, values of the return period (plotting positions) for the various recorded values x of the variate are obtained by the relation T = (N + 1)/m, where m is the order number of the event when the data are arranged in descending order of magnitude, and plotted on the graph described above. A good fit of the observed data with the theoretical variation line indicates the applicability of Gumbel's distribution to the given data series. By extrapolation of the straight line xT vs T, values of xT for T > N can be determined.
The Gumbel (or extreme-value) probability paper is paper with an abscissa specially marked for various convenient values of the return period T (or the corresponding reduced variate yT on an arithmetic scale). The ordinate of a Gumbel paper represents xT (flood discharge, maximum rainfall depth, etc.), which may have either an arithmetic or a logarithmic scale.
Table 3.4: Reduced mean ȳn in Gumbel's extreme-value distribution; N = sample size
Table 3.5: Reduced standard deviation Sn in Gumbel's extreme-value distribution; N = sample size
Since the estimate of xT is based on a sample of finite size N, confidence limits are usually attached to it. The confidence limits of xT for a confidence probability c may be taken as

x1/2 = xT ± f(c) Se    (3.16a)

where f(c) = a function of the confidence probability c, Se = probable error = b σn−1/√N, b = √(1 + 1.3K + 1.1K²), and K = frequency factor given by Eq. (3.14).
It is seen that for a given sample and T, the 80% confidence limits are about twice as large as the 50% limits and the 95% limits are about thrice as large as the 50% limits.
In addition to the analysis of maximum extreme events, there is also a need to analyze minimum extreme events, e.g. the occurrence of droughts. The Gumbel probability distribution, like the Gaussian distribution, does not have a lower limit, meaning that negative values of events may occur.
As rainfall and river flows do have a lower limit of zero, neither the Gumbel nor the Gaussian distribution is
an appropriate tool to analyze minimum values. Because the logarithmic function has a lower limit of
zero, it is often useful to first transform the series to its logarithmic value before applying the theory.
Appropriate tools for analyzing minimum flows or rainfall amounts are the Log-Normal, Log-Gumbel, or
Log-Pearson distributions.
In the Log-Pearson Type III method the variate is first transformed to z = log x, and the T-year value zT is computed from the mean z̄, the standard deviation σz and the frequency factor Kz of the transformed series. The coefficient of skew of the z series is

Cs = N Σ(z − z̄)³ / [(N − 1)(N − 2)(σz)³]    (3.20)

The variation of Kz = f(Cs, T) is given in Table 3.6. After finding zT by Eq. (3.19), the corresponding value of xT is obtained as xT = antilog(zT). The coefficient of skew is sometimes adjusted to account for the size of the sample by using the following relation proposed by Hazen (1930):

Ĉs = Cs (1 + 8.5/N)    (3.21)

where Ĉs = adjusted coefficient of skew. However, the standard procedure for use of the Log-Pearson Type III distribution adopted by the U.S. Water Resources Council does not include this adjustment for skew.
When the skew is zero, i.e. Cs = 0, the Log-Pearson Type III distribution reduces to the Log-normal distribution. The Log-normal distribution plots as a straight line on logarithmic probability paper.
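A sketch of the Log-Pearson Type III steps is given below. It assumes the standard form zT = z̄ + Kz σz for Eq. (3.19), consistent with the text above; Kz must still be read manually from Table 3.6 for the computed Cs and the desired T, since the table itself is not reproduced here. Function names are illustrative.

```python
import numpy as np

def log_stats(annual_peaks):
    """Mean, standard deviation and coefficient of skew (Eq. 3.20) of z = log10(x)."""
    z = np.log10(np.asarray(annual_peaks, dtype=float))
    N = z.size
    z_mean = z.mean()
    sigma_z = z.std(ddof=1)
    C_s = N * np.sum((z - z_mean) ** 3) / ((N - 1) * (N - 2) * sigma_z ** 3)
    return z_mean, sigma_z, C_s

def log_pearson3_quantile(z_mean, sigma_z, K_z):
    """x_T from z_T = z_mean + K_z * sigma_z, transformed back from logarithms."""
    return 10.0 ** (z_mean + K_z * sigma_z)

# Usage: compute the log statistics, read K_z from Table 3.6 for (C_s, T),
# then evaluate the T-year quantile.
# z_mean, sigma_z, C_s = log_stats(peaks)
# x100 = log_pearson3_quantile(z_mean, sigma_z, K_z=2.30)   # K_z taken from the table
```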
The flood-frequency analysis described above is a direct means of estimating the desired flood based
upon the available flood-flow data of the catchment. The results of the frequency analysis depend upon
the length of data. The minimum number of years of record required to obtain satisfactory estimates
depends upon the variability of data and hence on the physical and climatological characteristics of the
basin. Generally a minimum of 30 years of data is considered essential. Smaller lengths of record are also used when unavoidable. However, frequency analysis should not be adopted if the length of record is less than 10 years.
Flood-frequency studies are most reliable in climates that are uniform from year to year. In such cases a relatively short record gives a reliable picture of the frequency distribution. With increasing lengths of flood records, frequency analysis affords a viable alternative method of flood-flow estimation in most cases.
A final remark of caution should be made regarding frequency analysis. None of the frequency-distribution functions has a real physical background. The only information having physical meaning is the measurements themselves. Extrapolation beyond the period of observation is dangerous, and it requires a good engineer to judge the value of extrapolated events of high return periods. A good impression of the relativity of frequency analysis can be acquired through comparison of the results obtained from different statistical methods; generally they differ considerably.
distribution. In drier areas a skewed distribution such as the Log-Pearson or Log-Normal may give a better fit.
For purposes of statistical analysis, low flows are defined as annual minimum flows averaged over consecutive-day periods of varying length. The most commonly used averaging period is d = 7 days, but analyses are often carried out for d = 1, 3, 15, 30, 60, 90 and 180 days as well. Low-flow quantile values are cited as "dQp", where p is the annual non-exceedence probability (in percent) for the flow averaged over d days. The 7-day average flow that has an annual non-exceedence probability of 0.10 (a recurrence interval of 10 yr), called "7Q10", is commonly used as a low-flow design value. The "7Q10" value is interpreted as follows:
In any year there is a 10% probability that the lowest 7-consecutive-day average flow will be less than the 7Q10 value.
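The sketch below shows one way to form the annual series of d-day minimum average flows and to read off a rough dQp value from Weibull non-exceedance plotting positions. It is only an empirical approximation, assuming the record is supplied as one array of daily flows per year; in practice a distribution (e.g. Log-Pearson Type III on the annual minima) would normally be fitted instead.

```python
import numpy as np

def annual_d_day_minima(daily_flows_by_year, d=7):
    """Annual minimum d-day average flow for each year of record."""
    minima = []
    for year_flows in daily_flows_by_year:            # one 1-D array of daily flows per year
        q = np.asarray(year_flows, dtype=float)
        dday_avg = np.convolve(q, np.ones(d) / d, mode="valid")   # moving d-day mean
        minima.append(dday_avg.min())
    return np.asarray(minima)

def empirical_dQp(minima, p=0.10):
    """Rough dQp from Weibull non-exceedance plotting positions m/(N + 1)."""
    q = np.sort(minima)                               # ascending
    N = q.size
    F = np.arange(1, N + 1) / (N + 1)                 # non-exceedance probability
    return np.interp(p, F, q)                         # clamps if p falls outside the data range

# 7Q10 estimate (needs a record long enough that p = 0.10 lies inside the data range):
# q7_10 = empirical_dQp(annual_d_day_minima(record, d=7), p=0.10)
```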
Droughts
Droughts are extended severe dry periods. To qualify as a drought, a dry period must have a duration of at least a few months and be a significant departure from normal. Droughts must be expected as part of the natural climate, even in the absence of any long-term climate change. However, "permanent" droughts due to natural climate shifts do occur, and appear to have been responsible for large-scale migrations and declines of civilizations throughout human history. The possibility of regional droughts associated with climatic shifts due to warming cannot be excluded. As shown in Figure 3.3, droughts begin with a deficit in precipitation that is unusually extreme and prolonged relative to the usual climatic conditions (meteorological drought). This is often, but not always, accompanied by unusually high temperatures, high winds, low humidity, and high solar radiation that result in increased evapotranspiration.
These conditions commonly produce extended periods of unusually low soil moisture, which affect agriculture, natural plant growth, and the moisture of the forest floor (agricultural drought). As the precipitation deficit continues, stream discharge, lake, wetland, and reservoir levels, and water-table levels decline to unusually low values (hydrological drought). When precipitation returns to more normal values, drought recovery follows the same sequence: meteorological, agricultural, and hydrological. Meteorological drought is usually characterized as a precipitation deficit.
qu = a + bqg (3.22)
where:
qu is the flow at the un-gauged site, qg is the concurrent flow at the gauged site,
and a and b are estimated via regression analysis. Then estimate the dQp at the un-gauged site, dQpu, as:
dQpu = a + b·dQpg (3.23)
where:
dQpg is the dQp value established by frequency analysis at the gauged site. In order to minimize errors when using this procedure, each pair of flows used to establish equation (3.22) should be from a separate hydrograph recession, the r² value for the relation of equation (3.22) should be at least 0.70, and the two basins should be similar in size, geology, topography, and climate.
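A sketch of this transfer procedure (Eqs. 3.22 and 3.23) is given below, using an ordinary least-squares fit for a and b. The function names and the r² check are illustrative of the criteria listed above.

```python
import numpy as np

def fit_baseflow_relation(qu_obs, qg_obs):
    """Fit q_u = a + b * q_g (Eq. 3.22) from concurrent base-flow measurements."""
    qg = np.asarray(qg_obs, dtype=float)
    qu = np.asarray(qu_obs, dtype=float)
    b, a = np.polyfit(qg, qu, 1)                  # slope b, intercept a
    r2 = np.corrcoef(qg, qu)[0, 1] ** 2
    return a, b, r2

def transfer_dQp(a, b, dQp_gauged):
    """Estimate dQp at the un-gauged site (Eq. 3.23)."""
    return a + b * dQp_gauged

# a, b, r2 = fit_baseflow_relation(qu_measurements, qg_measurements)
# if r2 >= 0.70:                                  # minimum r2 recommended above
#     dQp_u = transfer_dQp(a, b, dQp_g)           # dQp_g from frequency analysis at the gauge
```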
Drought type:
As noted, one may be interested in one or more of the basic types of drought, each reflected in time
series of particular types of data: meteorological (precipitation); agricultural (soil moisture); or
hydrological (stream flow, reservoir levels, or ground water levels).
Averaging Period:
As with time-series analysis generally, drought analysis requires selection of an averaging period (dt). Since droughts by definition have significant duration, one would usually select dt = 1 month, 3 months, or 1 yr, with the choice depending on the available data and the purposes of the analysis. For a given record length, the selection of dt involves a trade-off in the uncertainty of the analysis.
Drought definition:
Figure 3.4 shows a time series of a selected quantity, X (e.g., precipitation, stream flow, ground water level), averaged over an appropriate dt. The quantitative definition of drought is determined by the truncation level, X0, selected by the analyst: values of X < X0 are defined as droughts. Typical values for X0 might be:
Dracup et al. (1980) suggested choosing X0 equal to the long-term mean of X because it standardizes the analysis and gives more significance to extreme events, which are usually of most interest.
Once X0 is determined, each period for which X < X0 constitutes a "drought" and each "drought" is characterized by the following measures:
Duration, D = length of the period for which X < X0;
Severity, S = cumulative deviation below X0;
Intensity (or magnitude), I = S/D.
Note that if X is stream flow [L³T⁻¹], then the dimensions of S are [L³T⁻¹] × [T] = [L³] and the dimensions of I are [L³T⁻¹].
Figure 3.4: Quantitative definition of droughts. X is a drought measure, X0 is the truncation level. D1, D2, D3 are the durations of droughts 1, 2 and 3. The areas S1, S2, S3 are the severities of droughts 1, 2 and 3.
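The truncation-level definition above translates directly into a run-length computation. The sketch below (illustrative function name drought_events) scans a series averaged over dt and returns duration D, severity S and intensity I = S/D for each run with X < X0.

```python
import numpy as np

def drought_events(X, X0):
    """Return a list of (duration D, severity S, intensity I) for each run with X < X0.

    X  : 1-D series of the drought measure averaged over dt (e.g. monthly flow).
    X0 : truncation level chosen by the analyst.
    D is counted in dt-periods; S is the cumulative deficit below X0 over the run.
    """
    events = []
    D, S = 0, 0.0
    for x in np.asarray(X, dtype=float):
        if x < X0:                      # inside a drought
            D += 1
            S += X0 - x
        elif D > 0:                     # a drought has just ended
            events.append((D, S, S / D))
            D, S = 0, 0.0
    if D > 0:                           # series ends during a drought
        events.append((D, S, S / D))
    return events

# Example: monthly flows with the truncation level set at the record mean.
# events = drought_events(monthly_flows, X0=np.mean(monthly_flows))
```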
The estimation of hydrological design values (such as the design flood discharge and the river stage during the design flood) involves a natural or inbuilt uncertainty and, as such, a hydrological risk of failure. As an example, consider a weir with an expected life of 50 years, designed for a flood magnitude of return period T = 100 years. This weir may fail if a flood of magnitude greater than the design flood occurs within the life period (50 years) of the weir.
The probability of occurrence of an event (x ≥ xT) at least once over a period of n successive years is called the risk, R. Thus the risk is given by R = 1 − (probability of non-occurrence of the event x ≥ xT in n years):

R = 1 − (1 − P)^n    (3.24)

Since P = 1/T,

R = 1 − (1 − 1/T)^n    (3.25)
It can be seen that the return period for which a structure should be designed depends upon the
acceptable level of risk. In practice, the acceptable risk is governed by economic and policy
considerations.
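The weir example can be checked with Eqs. (3.24)-(3.25), and the relation can be inverted to find the return period needed for a chosen acceptable risk. The sketch below uses illustrative function names.

```python
def hydrologic_risk(T, n):
    """Risk of at least one exceedance of the T-year event in n years (Eqs. 3.24-3.25)."""
    return 1.0 - (1.0 - 1.0 / T) ** n

def required_return_period(acceptable_risk, n):
    """Return period for which the risk over n years equals the acceptable level."""
    return 1.0 / (1.0 - (1.0 - acceptable_risk) ** (1.0 / n))

# The weir example: design T = 100 years, expected life n = 50 years
R = hydrologic_risk(100, 50)                  # about 0.40, i.e. a 40% risk of exceedance
T_needed = required_return_period(0.10, 50)   # about 475 years for a 10% acceptable risk
```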
Safety Factor: In addition to the hydrologic uncertainty, as mentioned above, a water resource
development project will have many other uncertainties. These may arise out of structural,
constructional, operational and environmental causes as well as from non-technological considerations
such as economic, sociological and political causes. As such, any water resource development project
will have a safety factor for a given hydrological parameter M as defined below.
(SF)M = Cam / Chm    (3.26)

where Cam = actual value of the parameter M adopted in the design of the project and Chm = value of the parameter M obtained from hydrological considerations only.
The parameter M includes such items as flood discharge magnitude, maximum river stage, reservoir capacity and freeboard. The difference (Cam − Chm) is known as the safety margin.
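As a small worked illustration of Eq. (3.26) with assumed numbers: if the hydrological estimate of the design flood is 1200 m³/s and the value actually adopted in design is 1500 m³/s, the safety factor and safety margin follow directly.

```python
def safety_factor(C_am, C_hm):
    """Safety factor (SF)_M = C_am / C_hm for a hydrological parameter M (Eq. 3.26)."""
    return C_am / C_hm

def safety_margin(C_am, C_hm):
    """Safety margin = C_am - C_hm."""
    return C_am - C_hm

# Illustrative values: hydrological estimate 1200 m^3/s, adopted design value 1500 m^3/s
sf = safety_factor(1500.0, 1200.0)    # 1.25
sm = safety_margin(1500.0, 1200.0)    # 300 m^3/s
```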