Illustrating Frequentist and Bayesian Statistics in Oceanography1

George Casella
Cornell University
ABSTRACT
Both frequentist and Bayesian methodologies provide means for a statistical solution to a problem. However, it is usually the case that, for a given situation, one methodology is more appropriate. Using a number of oceanographic examples we explore the components of a statistical solution and illustrate the most appropriate methodology. We argue that the statistical consideration of utmost importance is the type of inference and conclusion to be made. In some examples it is more appropriate to make this inference as a Bayesian, and in some it is more appropriate to make this inference as a frequentist.

"Still, it is an error to argue in front of your data. You find yourself insensibly twisting them round to
fit your theories."

Sherlock Holmes
The Adventure of Visteria Lodge
1. INTRODUCTION

An alternate title for this paper might well be "Conditional and Unconditional Inference in Oceanographic Studies," as a fundamental difference between frequentist and Bayesian statistics is their resulting inference. A frequentist inference is unconditional, applying to a series of repeated experiments (almost always an imagined series). In contrast, a Bayesian inference is conditional, applying to the data at hand, and not directly addressing the concept of repeatability.

This paper is an introduction to these methods, and illustrates their uses with some oceanographic data sets.1

1 This paper was presented at the 'Aha Huliko'a Winter Workshop on "Probability Concepts in Physical Oceanography," January 12-15, 1993, Honolulu, Hawaii, and is Technical Report BU-1187-M, Biometrics Unit, Cornell University. This research was supported by National Science Foundation Grant No. DMS9100839 and National Security Agency Grant No. 90F-073.

The primary message is that each statistical view has a lot to offer, and, depending on the problem, one methodology is probably more appropriate. We illustrate this through the examples.

A second goal of this paper is to try to explain to the oceanographic community how a statistician approaches a problem. The purpose of this endeavor is to provide a structured approach to dealing with problems involving data, from their inception to their conclusion. In doing so, perhaps the task of dealing with ever-increasing data bases can be made a little easier.

The remainder of the paper is arranged as follows. In Section 2 we give a general outline of how to approach a problem statistically, illustrating this with an example in Section 3. Section 4 discusses the underlying differences between the frequentist and Bayesian approaches to statistics, and Sections 5 and 6 contain more examples illustrating these methodologies. Section 7 contains a concluding discussion.

2. COMPONENTS OF A STATISTICAL SOLUTION


In the best of all possible worlds, a problem is planned, statistically, from beginning to end. Chronologically, the steps of a solution can be listed as in Table 1.

Table 1: Components of a Statistical Solution (Chronological Order)

1. Model the Process
2. Design the Experiment
3. Collect the Data
4. Estimate and Verify the Model
5. Infer and Conclude
6. Implement the Solution

Although the steps are performed in chronological order, they are best planned in reverse order. That is, when approaching any problem, the first consideration is "How will the knowledge we gain be implemented?"

For example, if a study is proposed to examine wave magnitude and direction in the North Atlantic, the first consideration should be the use of the resulting knowledge. Will it be used to plan routes for oil tankers? Will it be used to increase our basic knowledge of ocean dynamics? By answering this question first, the remainder of the steps of a statistical solution will fall into place, and the problem can be attacked in a very efficient fashion.

Although this mechanism for solution is not usually taught in the classroom, it seems to be the one most preferred by statisticians. By concentrating on the final result, the entire study becomes focused.

With respect to frequentism or Bayesianism, the components of the statistical solution remain essentially unchanged. Of course, there are some differences in the approaches, with the major difference being in the modeling and inference stages. However, the overall attack is similar. This is illustrated in the next section.


3. AN EXAMPLE CONCERNING ICEBERGS

Defant (1961, page 278) presented the following data on the frequency of icebergs off Newfoundland.

Table 2: Frequency of icebergs off Newfoundland south of 48°N (a) and south of the Grand Banks (b), by month (January through December), for the period 1900-1926. [The monthly counts in this table are not fully recoverable from this copy.]

For our example, we will look at the question of whether the yearly distribution of icebergs is the same in each location. A glance at Figure 1 will show that such a hypothesis is very likely, but for illustration we will step through both a Bayesian and a frequentist approach to the problem. We take the goal of our study to be the description of the distribution of icebergs off Newfoundland.

In both the Bayesian and frequentist approaches to this problem we assume that the data are distributed according to a multinomial distribution, and we wish to test the null hypothesis

    H0: The distributions in locations (a) and (b) are the same.

To test this as a frequentist we use a chi-squared test of association (see Snedecor and Cochran, 1989). The chi-squared test results in a p-value of .977, which is very strong evidence in favor of the null hypothesis.


To perform a Bayesian analysis a prior distribution must be specified, that is, a distribution that we subjectively believe describes the pattern of icebergs. We then use this distribution, in conjunction with the observed data, to assess the plausibility of the hypothesis. Since we really have no prior knowledge about the icebergs, we use a strategy that attempts to model this ignorance, and calculate the probability of every data table with the given marginal totals, using a hypergeometric distribution. This leads us to use Fisher's exact test (Fisher, 1970) and assess the probability of the null hypothesis as .994. Again, this is very strong evidence in favor of this hypothesis. (Strictly speaking, Fisher's exact test is not a Bayesian procedure but a conditional procedure, as it is calculated conditionally on the observed data. However, the important feature is that it yields a conditional inference.)
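
As a computational aside, the sketch below carries out both analyses in Python on a 2 x 12 month-by-location table. The counts are hypothetical placeholders (the values of Table 2 are not fully recoverable here), the chi-squared test uses scipy's chi2_contingency, and the conditional (Fisher-type) test is approximated by Monte Carlo permutation with both margins held fixed rather than by exact hypergeometric enumeration.

    # Frequentist and conditional tests on a 2 x 12 table of monthly iceberg counts.
    # The counts below are illustrative stand-ins, not Defant's actual values.
    import numpy as np
    from scipy.stats import chi2_contingency

    rng = np.random.default_rng(0)

    # Rows: locations (a) and (b); columns: Jan ... Dec (hypothetical counts).
    counts = np.array([
        [3, 10, 36, 83, 130, 68, 25, 13, 6, 4, 3, 2],
        [1,  4, 20, 51,  70, 40, 15,  8, 3, 2, 1, 1],
    ])

    # Frequentist: chi-squared test of association.
    chi2_obs, p_value, dof, expected = chi2_contingency(counts)
    print(f"chi-squared = {chi2_obs:.2f} on {dof} df, p-value = {p_value:.3f}")

    # Conditional analysis: hold both sets of marginal totals fixed and simulate
    # the permutation distribution of the chi-squared statistic.
    months = np.repeat(np.arange(12), counts.sum(axis=0))  # one month label per iceberg
    locs = np.repeat([0, 1], counts.sum(axis=1))           # one location label per iceberg
    sims = np.empty(5000)
    for i in range(sims.size):
        table = np.zeros((2, 12), dtype=int)
        np.add.at(table, (locs, rng.permutation(months)), 1)  # margins stay fixed
        sims[i] = chi2_contingency(table)[0]
    print("conditional p-value (Monte Carlo):", (sims >= chi2_obs).mean())
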
We now can clearly see the distinction between Bayesian and frequentist inferences. The frequentist bases inference on a frequency interpretation. A formal conclusion would be of the form, "the statistical procedure used (here the chi-squared test) would result in an erroneous inference less than 5% of the time in repeated experiments." In contrast, the Bayesian inference is conditional on the observed data, and would formally conclude "based on the stated prior distribution and observed data, the probability is .994 that H0 is true." We now look at these differences a bit more closely.

4. WHERE DOES THE RANDOMNESS COME FROM?


The most important part of any statistical investigation is the resulting inference. In fact, it may even be said that the main reason for doing a statistical investigation is to produce a meaningful inference, since the inference applies to a wider population than is actually studied and measured. (For example, after measuring the activities of a number of waves in a certain area, we are then interested in making a statement (an inference) about all waves in that area.) To make this inference we need an underlying model of the phenomena, one that accounts for the randomness of the observations and allows an inference. Bayesians and frequentists have different approaches to this.

4.1 Frequency Randomness


The frequentist assumes repeatability of the experiment, that the experiment actually performed is one of an infinitely long sequence of identical experiments. If we denote this sequence of experiments E_1, E_2, ..., E_k, E_k+1, ..., then we make our inference to the entire sequence, even though only one experiment (say E_k) is actually performed. The rest of this imagined sequence builds the randomness into our model. We know that the results of each experiment (if performed) would be slightly different, and our inference will take these potential differences into account.

Thus, the frequentist inference is an unconditional one that applies to the entire sequence, and does not single out the experiment actually performed. It is important to realize that the inference is about the performance of the procedure over the entire sequence of experiments, such as, "The statistical procedure used will be correct in 95% of all experiments performed." The actual outcome of the observed experiment will not change this inference.


4.2 Bayesian Randomness
In a Bayesian analysis the data are assumed to be fixed, and inference is made conditional on their observed values. Thus, no randomness comes from the data. The randomness in a Bayesian inference comes from the subjective prior distribution. This randomness, together with the information in the data, is combined into the posterior distribution. The posterior distribution is then used for inference. Of course, different subjective prior distributions may result in different inferences.

More precisely, suppose there are data, X, which vary according to a probability distribution f(x|θ), a distribution indexed by an unknown parameter θ. (For example, f(·|θ) may be a Gaussian distribution with unknown mean θ.) We then assume that the parameter θ varies according to a prior distribution π(θ). This probability distribution reflects our knowledge about the parameter θ before observing the new data x. (In keeping with convention, an upper case X denotes an unseen random variable while a lower case x denotes an observed value. Thus the equation "X = x" means that we have observed the value x of the random variable X.) Using the laws of probability (sometimes called Bayes' rule) we calculate the posterior distribution of θ given X = x, g(θ|x), as

    g(θ|x) = f(x|θ)π(θ) / ∫ f(x|θ)π(θ) dθ,

where the integral is over all values of θ. (For more detail on such calculations, see Casella and Berger, 1990.) Our inference is then based on g(θ|x), which considers only the experiment actually performed, not any repeated sequence. For example, one might infer "Based on the specified π(θ) and observed x, we conclude that θ ≤ 0 with probability 95%." This inference would follow if it were the case that ∫_{θ ≤ 0} g(θ|x) dθ = .95.
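
As a concrete illustration of this calculation, the sketch below works out the Gaussian case mentioned above: a Gaussian likelihood with known standard deviation and a Gaussian prior on θ, for which the posterior is available in closed form. The observations and prior settings are invented for illustration only.

    # Conjugate Gaussian update: X_i ~ Gaussian(theta, sigma^2) with sigma known,
    # and a Gaussian(mu0, tau0^2) prior on theta. All numbers are illustrative.
    import numpy as np
    from scipy.stats import norm

    x = np.array([-0.41, 0.12, -0.73, -0.25, 0.05])  # hypothetical observed data
    sigma = 0.5                                      # assumed known data std. dev.
    mu0, tau0 = 0.0, 1.0                             # prior mean and std. dev. for theta

    # Posterior for theta is Gaussian with the usual precision-weighted mean.
    n = x.size
    post_var = 1.0 / (n / sigma**2 + 1.0 / tau0**2)
    post_mean = post_var * (x.sum() / sigma**2 + mu0 / tau0**2)

    # Conditional inference: probability that theta <= 0 given the observed data.
    prob = norm.cdf(0.0, loc=post_mean, scale=np.sqrt(post_var))
    print(f"posterior mean = {post_mean:.3f}, Pr(theta <= 0 | x) = {prob:.3f}")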


4.3 The Appropriate Inference
As mentioned before, the purpose of this paper is not to make value judgments as to which of Bayesianism or frequentism is better. Rather, the purpose is to illustrate situations where one method is more appropriate. It then follows that the more appropriate methodology, and inference, is the one to use. From the previous two subsections, we see that the frequentist inference is more appropriate if repeatability is important, while the Bayesian inference is more appropriate if the inference is to be made conditional on the observed data. Returning to the iceberg data, it seems that the Bayesian inference is more appropriate, as we are faced with a data set that is unrepeatable, and we are interested in an inference conditional on that data set. (Interestingly, it was argued during discussions at the workshop that one could consider the observed 26-year period as one of a sequence of 26-year periods, in which case the frequentist inference may be more appropriate.) If it may be argued that either interpretation is valid, and hence either inference is appropriate, there is no problem. As long as the methodology is chosen to appropriately answer the question of interest, phrased in the manner of interest, the statistics have served their purpose.
5. AN EXAMPLE CONCERNING BREAKING WAVES
Hwang, Hsu and Wu (1990) report on an experiment concerning the average height of breaking waves, H_B, measured as a function of RMS surface displacement, η. The data are presented in Figure 2. They conclude that H_B < H_S, the significant wave height, where H_S = 4η, and state, "In a random wave field, waves that break due to local instabilities are not necessarily the highest waves." Statistically, we can think of this as testing the hypotheses

    H0: H_B ≤ 4η   vs.   H1: H_B > 4η.

It seems here that frequency considerations are important, in that conclusions should apply to repetitions of the experiment. This concern seems implicit in the above quoted conclusion of Hwang et al. Thus, a frequentist analysis is more appropriate. Using a standard linear regression model with Gaussian errors, we obtain a p-value of .999 for the hypothesis H0: H_B ≤ 4η, showing that there is overwhelming evidence to support this hypothesis. (In fact, the hypothesis H0: H_B ≤ 3η yields a p-value of .911, demonstrating extremely good support for this even stronger claim.)
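
A sketch of this frequentist calculation is given below, under the simplified model H_B = bη + error with no intercept (an assumption made only to keep the example short; the fit shown in Figure 2 includes an intercept). The data pairs are hypothetical stand-ins, so the printed p-value will only roughly echo the .999 reported above.

    # One-sided test of H0: b <= 4 versus H1: b > 4 in the no-intercept model
    # H_B = b * eta + error with Gaussian errors. The data are hypothetical.
    import numpy as np
    from scipy.stats import t

    rng = np.random.default_rng(1)
    eta = np.array([0.89, 1.14, 1.34, 1.50, 1.76, 2.02, 2.41, 2.57, 2.92, 3.02])  # RMS displacement (cm)
    HB = 0.10 + 2.89 * eta + rng.normal(0, 0.08, eta.size)                        # stand-in wave heights

    # Least squares slope for the no-intercept model and its standard error.
    b_hat = (eta * HB).sum() / (eta**2).sum()
    resid = HB - b_hat * eta
    se_b = np.sqrt((resid**2).sum() / (eta.size - 1) / (eta**2).sum())

    # The p-value is computed at the boundary b = 4 of the null hypothesis.
    t_obs = (b_hat - 4.0) / se_b
    p_value = 1.0 - t.cdf(t_obs, df=eta.size - 1)
    print(f"b_hat = {b_hat:.3f}, p-value for H0: b <= 4 is {p_value:.3f}")
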
Of course, a Bayesian analysis could also be performed, but the inference would not apply to a sequence of experiments. The conclusions would be conditional on the observed data. To do the Bayesian analysis we again use a standard linear regression model with Gaussian errors, but we also assume that H_B = bη, where b is a parameter with a specified prior distribution. We specify the prior to also be Gaussian, and we take the prior mean to be equal to the hypothesized value. (Thus, for testing H0: H_B ≤ 4η we specify a Gaussian prior with mean 4.) This strategy of centering the prior at the hypothesized value gives equal prior weight above and below the value, and may be considered an impartial prior specification.

Combining our prior specification with the observed data, we calculate Pr(b ≤ 4 | data) = .999 and Pr(b ≤ 3 | data) = .623. That is, for the specified priors and conditional on the observed data, b is less than 4 with probability .999 and less than 3 with probability .623.

Quantitatively, these conclusions are similar to those of the frequentist, and show overwhelming support for the null hypotheses. The only difference is in the scope of the inference.


Bayesian conclusions are, of course, dependent on the prior specification, and sometimes there might be concern about oversensitivity to this specification. Such a concern is easily addressed, however, by calculating posterior probabilities over a range of prior specifications. This is illustrated in Figure 3, where we display the posterior probabilities over a wide range of prior standard deviations. (The standard deviation of the data is .082, and the graph shows the prior standard deviation up to twice this value.) The figure shows that, for this range of prior standard deviations, the conclusions from the Bayesian analysis are relatively stable in their support of H0.
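
The corresponding Bayesian calculation, together with the prior-sensitivity sweep just described, can be sketched as follows, again under the simplified no-intercept model and the same hypothetical data, with the error standard deviation treated as known. The printed probabilities will not reproduce the paper's .999 and .623 exactly, since the data are stand-ins.

    # Posterior probabilities Pr(b <= c | data) for Gaussian priors on the slope b,
    # and a sweep over the prior standard deviation. Data and settings are stand-ins.
    import numpy as np
    from scipy.stats import norm

    rng = np.random.default_rng(1)
    eta = np.array([0.89, 1.14, 1.34, 1.50, 1.76, 2.02, 2.41, 2.57, 2.92, 3.02])
    HB = 0.10 + 2.89 * eta + rng.normal(0, 0.08, eta.size)
    sigma = 0.082                               # error std. dev., treated as known

    def posterior_prob(prior_mean, prior_sd, threshold):
        """Pr(b <= threshold | data) under a Gaussian(prior_mean, prior_sd^2) prior."""
        precision = (eta**2).sum() / sigma**2 + 1.0 / prior_sd**2
        mean = ((eta * HB).sum() / sigma**2 + prior_mean / prior_sd**2) / precision
        return norm.cdf(threshold, loc=mean, scale=np.sqrt(1.0 / precision))

    # Priors centered at the hypothesized values, as in the text.
    print("Pr(b <= 4 | data):", round(posterior_prob(4.0, 0.082, 4.0), 3))
    print("Pr(b <= 3 | data):", round(posterior_prob(3.0, 0.082, 3.0), 3))

    # Sensitivity: sweep the prior standard deviation up to twice the data value.
    for sd in np.linspace(0.02, 0.164, 8):
        print(f"prior sd {sd:.3f}: Pr(b <= 4 | data) = {posterior_prob(4.0, sd, 4.0):.3f}")
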
6. AN EXAMPLE CONCERNING BUBBLE POPULATIONS
The distribution of bubble populations is also investigated by Hwang et al. (1990). They collected data on bubble populations as a function of depth and wind velocity, as presented in Figure 4. For a given depth Z (cm) and wind velocity u (m/s), the logarithm of the bubble population, N(Z) (log cm³), is modeled as

    N(Z) = a_u + b_u Z + ε,   u = 10, 11, ..., 15,

where ε represents random error, and is assumed to have a Gaussian distribution with mean 0 and variance σ².

A question of interest is whether the distribution of bubbles is the same at each depth. After some thought, it seems that the appropriate inference here is the frequentist inference. Concern about the repeatability of the inference leads to this conclusion, as we would like to be able to describe the bubble populations at a given depth and wind velocity when such conditions are again realized.

6.1 A Standard Frequentist Inference
A standard approach to this problem is to decide if the slopes are the same at each wind velocity, so we would test the null hypothesis H0: b_10 = b_11 = ... = b_15. Doing so leads to a p-value of .063, which suggests rejection of H0. Thus a standard frequentist analysis would lead us to fit separate regression lines for each wind velocity, and for each wind velocity we would use a separate regression equation to predict the bubble population. See Table 3 and Figure 5.
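
One way to carry out this test is an F-test comparing the full model (a separate intercept and slope for each wind velocity) with the reduced model (separate intercepts but a common slope). The sketch below does this on simulated stand-in data built from the Table 3 coefficients; the depth grid and error level are assumptions.

    # F-test of H0: b_10 = b_11 = ... = b_15 using simulated stand-in data.
    import numpy as np
    from scipy.stats import f

    rng = np.random.default_rng(2)
    winds = np.repeat(np.arange(10, 16), 8)            # six wind speeds, 8 depths each
    depth = np.tile(np.linspace(4.0, 10.0, 8), 6)      # depths (cm), hypothetical grid
    slope = {10: -.084, 11: -.040, 12: -.080, 13: -.050, 14: -.031, 15: -.0009}
    inter = {10: .666, 11: .924, 12: 1.594, 13: 1.669, 14: 1.698, 15: 1.635}
    logN = np.array([inter[u] + slope[u] * z for u, z in zip(winds, depth)])
    logN = logN + rng.normal(0, 0.05, winds.size)

    def rss(X, y):
        """Residual sum of squares from an ordinary least squares fit."""
        beta, *_ = np.linalg.lstsq(X, y, rcond=None)
        return ((y - X @ beta) ** 2).sum()

    dummies = (winds[:, None] == np.arange(10, 16)).astype(float)  # one column per wind speed
    X_full = np.hstack([dummies, dummies * depth[:, None]])        # separate intercepts and slopes
    X_reduced = np.hstack([dummies, depth[:, None]])               # separate intercepts, common slope

    rss_full, rss_red = rss(X_full, logN), rss(X_reduced, logN)
    df_num, df_den = 5, winds.size - X_full.shape[1]               # 5 constraints; n - 12 parameters
    F = ((rss_red - rss_full) / df_num) / (rss_full / df_den)
    print(f"F = {F:.2f} on ({df_num}, {df_den}) df, p-value = {1 - f.cdf(F, df_num, df_den):.3f}")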

6.2 An Empirical Bayes Analysis


The bubble population data are ideal for an empirical Bayes analysis, a mixture of frequentist and Bayesian analyses that combines the best features of each. Here we will only briefly explain the methodology; for a more detailed introduction see the articles by Casella (1985, 1992).


Table 3: Coefficients for the standard regression analysis (frequentist) and empirical Bayes analyses of the bubble populations.

Wind Velocity   Intercept   Slope     Std. Dev.   Empirical Bayes Slope
10                 .666     -.084       .011          -.076
11                 .924     -.040       .013          -.042
12                1.594     -.080       .008          -.073
13                1.669     -.050       .017          -.050
14                1.698     -.031       .029          -.035
15                1.635     -.0009      .027          -.011

To perform an empirical Bayes analysis we start with the frequentist model and inference structure. We append a Bayes model to the slopes,

    b_u ~ Gaussian(b, τ²),   u = 10, 11, ..., 15,

that is, the slopes come from a common Gaussian population with unknown mean b and variance τ².

The "empirical" part of empirical Bayesian is to now estimate these
unknown parameters b and T 2 from the data. (A standard Bayesian analysis
would specify values for these parameters.)

Using these estimated values

allows the data to assess the tenability of the submodel, that the bu' s
come from a common population.

The empirical Bayes slope estimates are a

convex combination of the common overall slope (-.048) and the individual
least squares slopes, given by
empirical Bayes= ( 221 )( _ 048 )
slope

+ (. 779 )(least

squares).
slope

'l'he weighting factor .221 (and 779 = 1- .221) are data based estimates.
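
A sketch of this shrinkage step, using the least squares slopes and standard deviations from Table 3, is given below. The simple method-of-moments estimate of τ² used here is an assumption, not necessarily the estimator used in the paper, so the resulting weight will only roughly match the .221 quoted above.

    # Shrink the least squares slopes toward their common mean.
    # The tau^2 estimate below is a crude method-of-moments choice (an assumption).
    import numpy as np

    ls_slope = np.array([-.084, -.040, -.080, -.050, -.031, -.0009])  # Table 3 slopes
    ls_sd = np.array([.011, .013, .008, .017, .029, .027])            # Table 3 std. devs.

    b_common = ls_slope.mean()                         # estimate of the common mean b
    s2 = (ls_sd ** 2).mean()                           # average sampling variance of a slope
    tau2 = max(ls_slope.var(ddof=1) - s2, 0.0)         # between-slope variance estimate
    weight = s2 / (s2 + tau2)                          # weight placed on the common slope

    eb_slope = weight * b_common + (1.0 - weight) * ls_slope   # convex combination, as in the text
    print("common slope:", round(b_common, 3), " weight on it:", round(weight, 3))
    print("empirical Bayes slopes:", np.round(eb_slope, 3))
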
The empirical Bayes slope estimates are valid under the model of frequentist repeatability. In fact, they are superior to the frequentist estimates using a criterion of expected mean squared error. Thus, on the average, the empirical Bayes estimates will be closer to the true values than the standard frequentist estimates. They combine the best features of Bayesian modeling and frequentist inference.


Figure 5 also shows the empirical Bayes regression lines. Although they are not very different from the standard frequentist lines, they do display a movement toward the common slope value. The empirical Bayes analysis has uncovered a small amount of common structure, and has used this in improving each of the estimates.

7. CONCLUSIONS
The statistical methodology to be used, whether Bayesian or frequentist, should be selected according to the type of inference that is desired (and is appropriate). The frequentist methodology is appropriate for inference over a series of repeated experiments, while the Bayesian methodology is appropriate for inference specific to the experiment that was done. This article has given examples and provided discussion of situations where each methodology is appropriate.


There is no brick wall between Bayesianism and frequentism. The methodologies are not at odds with one another; they are complementary to one another. When approaching a statistical problem, "opportunism" is best. With that in mind, the appropriate analysis and inference can be chosen from all available statistical methodologies.


Both Bayesianism and frequentism are built on a set of assumptions, some more palatable than others. For a user of frequentist methods, perhaps the assumption most difficult to believe is that the process (including parameter values) remains constant over the imagined series of experiments. For a user of Bayesian methods, perhaps the assumption most difficult to believe is that the prior distribution is correct. These assumptions, however, can sometimes be checked and maybe even relaxed. Moreover, their reasonableness in any particular situation may also form a basis for choosing an appropriate methodology. (See Berger, 1985, who discusses robust Bayesian analysis, which addresses these concerns.) Lastly, there is an enormous amount of research being done in statistics, and some of it is aimed at relaxing these assumptions. Such research has already given us techniques like empirical Bayes analysis, a synthesis of both Bayesian and frequentist methodologies which can often provide superior solutions.

REFERENCES

Berger, J. O. (1985): Statistical Decision Theory and Bayesian Analysis, Second Edition. New York: Springer-Verlag.

Casella, G. (1985): An Introduction to Empirical Bayes Data Analysis. The American Statistician, 39, 83-87.

Casella, G. (1992): Illustrating Empirical Bayes Methods. Chemometrics and Intelligent Laboratory Systems, 16, 107-125.

Casella, G. and Berger, R. L. (1990): Statistical Inference. Pacific Grove: Wadsworth and Brooks/Cole.

Defant, A. (1961): Physical Oceanography, Volume I. New York: Pergamon Press.

Fisher, R. A. (1970): Statistical Methods for Research Workers, Fourteenth Edition. New York: Hafner (reissued by Oxford University Press, 1990).

Hwang, P. A., Hsu, Y.-H. L., and Wu, J. (1990): Air Bubbles Produced by Breaking Wind Waves: A Laboratory Study. Journal of Physical Oceanography, 20, 19-28.

Snedecor, G. W. and Cochran, W. G. (1989): Statistical Methods, Eighth Edition. Ames: Iowa State University Press.

Figure 1: Relative frequencies of icebergs off (a) Newfoundland (black squares) and (b) Grand Banks (white squares). [Plot of relative frequency versus month; figure not reproduced here.]

Figure 2: Averaged height of breaking waves, H_B, as a function of RMS water-surface displacement, η. The line shown is the least squares line, with equation H_B = .102 + 2.89η (r² = .994). [Plot of wave height versus RMS water-surface displacement (cm); figure not reproduced here.]

Figure 3: Posterior probabilities for the null hypotheses H0: b ≤ 4 (solid lines) and H0: b ≤ 3 (dashed lines), as a function of the prior standard deviation. [Plot of posterior probability versus prior standard deviation, 0.02 to 0.16; figure not reproduced here.]

Figure 4: Data from Hwang et al. (1990) on bubble populations. The six groups are each at a different wind velocity, from 10 to 15 m/s in steps of 1. The groups are in order from 10 m/s (lowest) to 15 m/s (highest), and are denoted by black squares (10 m/s), white squares (11 m/s), black diamonds (12 m/s), white diamonds (13 m/s), black triangles (14 m/s) and white triangles (15 m/s). The data are connected merely to aid viewing. [Plot of bubble population versus depth (cm); figure not reproduced here.]

Figure 5: Standard frequentist (solid lines) and empirical Bayes (dashed lines) fits to the bubble data, coded as in Figure 4. The empirical Bayes lines (whose slopes are pulled toward -.048) are under the least squares lines for 11, 14 and 15 m/s, above the least squares lines for 10 and 12 m/s, and virtually identical for 13 m/s. [Plot of fitted lines, bubble population versus depth (cm); figure not reproduced here.]
