STAT2120: Categorical Data Analysis
Chapter 1: Introduction
Spring 2012
Department of Mathematics
Hong Kong Baptist University
1 / 51
1.1 Categorical Response Data
The subject of this course is how to analyze a data set in which we have one response variable (dependent variable, or Y variable) and one or more explanatory variables (independent variables, or X variables), where the response variable is categorical and the explanatory variables may be categorical or continuous.
What is a categorical variable?
A categorical variable has a measurement scale consisting of a set of categories. Categorical variables are common in many areas, e.g., the social sciences, health sciences, behavioral sciences, zoology, education, marketing, engineering sciences, and industrial quality control.
2 / 51
There are two kinds of categorical variables: ordinal and nominal.
Categorical variables having ordered scales are called ordinal
variables. Many categorical scales have a natural ordering.
Examples include:
(a) patient condition (excellent, good, fair, poor),
(b) government spending (too high, about right, too low),
(c) frequency of feeling anxiety (never, occasionally, often).
Categorical variables having unordered scales are called
nominal variables. Examples include:
(a) transportation to work (car, bicycle, bus, subway, walk),
(b) favorite type of music (classical, country, folk, jazz, rock),
(c) religious affiliation (Catholic, Jewish, Muslim, Buddhist).
3 / 51
For nominal variables, the order of listing the categories is
irrelevant. The statistical analysis should not depend on that
ordering. That is, methods designed for nominal variables
should give the same results no matter how the categories are
listed.
Methods designed for ordinal variables utilize the category
ordering. Methods designed for ordinal variables cannot be
used with nominal variables, since nominal variables do not
have ordered categories.
Methods designed for nominal variables can be used with nominal or ordinal variables. However, when used with ordinal variables, they do not use the ordering information, so we may suffer a serious loss of power.
4 / 51
1.2 Probability Distributions for Categorical Data
Recall that in STAT2110, the response variable is assumed to
follow a normal distribution.
We introduce four important discrete distributions used for
modeling categorical data:
(1) Binomial distribution
(2) Multinomial distribution
(3) Poisson distribution
(4) Negative Binomial distribution
5 / 51
(1) Binomial distribution
The binomial distribution is based on the idea of a Bernoulli trial (two possible outcomes: success and failure). Let the probability of a success in a Bernoulli trial be π. Then, for a series of independent Bernoulli trials with the same success probability π, the total number of successes out of n trials defines a binomial distribution.
Let X ~ Binomial(n, π). The probability mass function (pmf) of X is
P(X = x) = C(n, x) π^x (1 - π)^(n-x), for x = 0, 1, ..., n,
where C(n, x) = n!/[x!(n - x)!] is the binomial coefficient.
The mean and variance of X are
E(X) = nπ, Var(X) = nπ(1 - π).
6 / 51
Example 1
(a) Let X ~ Binomial(3, 0.3). We have
P(X ≤ 1) = P(X = 0) + P(X = 1)
= C(3, 0) (0.7)^3 + C(3, 1) (0.3)(0.7)^2
= 0.784.
(b) Let X ~ Binomial(10, 0.3). We have
P(X ≤ 2) = P(X = 0) + P(X = 1) + P(X = 2)
= C(10, 0) (0.7)^10 + C(10, 1) (0.3)(0.7)^9 + C(10, 2) (0.3)^2 (0.7)^8
= 0.3828.
(c) Let X ~ Binomial(100, 0.3). We have
P(X ≤ 25) = Σ_{i=0}^{25} C(100, i) (0.3)^i (1 - 0.3)^(100-i) = ?
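These probabilities are easy to verify numerically. A minimal sketch in Python using scipy.stats.binom (not part of the original slides):

from scipy.stats import binom

# (a) P(X <= 1) for X ~ Binomial(3, 0.3)
print(binom.cdf(1, 3, 0.3))      # 0.784

# (b) P(X <= 2) for X ~ Binomial(10, 0.3)
print(binom.cdf(2, 10, 0.3))     # about 0.3828

# (c) P(X <= 25) for X ~ Binomial(100, 0.3)
print(binom.cdf(25, 100, 0.3))   # about 0.1631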
7 / 51
Probability Mass Function of Binomial(n, 0.3)
[Figure: probability mass functions of Binomial(3, 0.3), Binomial(10, 0.3), Binomial(30, 0.3), and Binomial(100, 0.3).]
8 / 51
The binomial distribution is symmetric when π = 0.5;
The binomial distribution is right-skewed when π < 0.5;
The binomial distribution is left-skewed when π > 0.5.
Asymptotic result for Binomial(n, π): when n is large, the binomial distribution can be approximated by a normal distribution with μ = nπ and σ² = nπ(1 - π). That is,
Binomial(n, π) ≈ N(nπ, nπ(1 - π)) when n is large.
Rule of Thumb: We apply the normal approximation when both nπ ≥ 5 and n(1 - π) ≥ 5. Examples include:
(a) When π = 0.5, we require only n ≥ 10.
(b) When π = 0.1 or π = 0.9, we require n ≥ 50.
9 / 51
Example 1 - ctd
When n = 100 and π = 0.3, we have nπ = 30 ≥ 5 and n(1 - π) = 70 ≥ 5. The normal approximation suggests that X approximately follows
N(nπ, nπ(1 - π)) = N(30, 21).
Using the continuity correction, this leads to
P(X ≤ 25) = P(X ≤ 25.5)
≈ Φ((25.5 - 30)/√21)
= Φ(-0.9819805)
= 0.1630547.
Remark: The true probability P(X ≤ 25) is 0.1631301.
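The same comparison can be made in Python (a sketch; the values match the slide up to rounding):

from scipy.stats import binom, norm

# Normal approximation with continuity correction vs. the exact binomial value
approx = norm.cdf((25.5 - 30) / 21 ** 0.5)
exact = binom.cdf(25, 100, 0.3)
print(approx)   # about 0.16305
print(exact)    # about 0.16313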
10 / 51
(2) Multinomial distribution
Some trials may have more than two possible outcomes.
Examples are:
(a) The possible outcome of Dr Tong's teaching evaluation can be Excellent, Very Good, Good, Fair, or Poor.
(b) The possible genotype of an individual can be AA, Aa, or aa, where A represents the dominant allele and a represents the recessive allele.
When the trials are independent with the same category probabilities for each trial, the counts in the various categories define a multinomial distribution.
11 / 51
Specifically, let c denote the number of all possible categories, and denote their probabilities by {π_1, π_2, ..., π_c} with Σ_j π_j = 1. For n independent observations, assume that X_1 observations fall in category 1, X_2 fall in category 2, ..., and X_c fall in category c. We have n = Σ_j X_j.
For any non-negative counts n_1, ..., n_c with Σ_j n_j = n, the pmf of (X_1, ..., X_c) is
P(X_1 = n_1, ..., X_c = n_c) = [n! / (n_1! n_2! ··· n_c!)] π_1^(n_1) π_2^(n_2) ··· π_c^(n_c).
The binomial distribution is a special case of the multinomial distribution with c = 2. The marginal distribution of the count in any particular category is also binomial. Thus, for any category j, we have E(X_j) = nπ_j and Var(X_j) = nπ_j(1 - π_j).
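As a quick numerical illustration, here is a sketch in Python using scipy.stats.multinomial; the genotype probabilities and counts below are invented for illustration, not taken from the slides:

from scipy.stats import binom, multinomial

# Hypothetical genotype probabilities for the categories (AA, Aa, aa)
probs = [0.25, 0.50, 0.25]
n = 10

# Joint probability P(X_AA = 3, X_Aa = 5, X_aa = 2) under Multinomial(10, probs)
print(multinomial.pmf([3, 5, 2], n=n, p=probs))

# The marginal count in one category is Binomial(n, pi_j)
print(binom.pmf(3, n, probs[0]))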
12 / 51
(3) Poisson distribution
Many discrete response variables have counts as possible
outcomes. Examples include:
(a) the number of parties attended in the past month,
(b) the number of car accidents in a certain city per week,
(c) the number of phone calls arriving at a center per minute.
Note that the above counts usually have no maximum value.
Therefore, it is not appropriate to use the binomial or
multinomial distribution to model these data.
The simplest distribution to model these data is the Poisson distribution. Let X ~ Poisson(λ), where λ > 0 is the parameter. Then X can take any non-negative integer value, with pmf
P(X = x) = e^(-λ) λ^x / x!, x = 0, 1, 2, ....
13 / 51
The Poisson distribution is unimodal and right-skewed for any parameter λ. The mean and variance of X are
E(X) = Var(X) = λ.
When n is large and π is small, the binomial distribution Binomial(n, π) can be approximated by a Poisson distribution with parameter λ = nπ.
(Revisit Example 1) If we apply the Poisson approximation to Binomial(100, 0.3), we have λ = 100 × 0.3 = 30 and
P(X ≤ 25) ≈ Σ_{x=0}^{25} e^(-30) 30^x / x! = 0.2083574.
Given that the true value is 0.1631301, this does not provide a good approximation. [Reason: π = 0.3 is not small enough.]
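A quick numerical check of this comparison in Python (a sketch; values match the slide up to rounding):

from scipy.stats import binom, poisson

# Poisson approximation with lambda = n * pi = 30 vs. the exact binomial value
print(poisson.cdf(25, 30))       # about 0.2084
print(binom.cdf(25, 100, 0.3))   # about 0.1631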
14 / 51
(4) Negative Binomial distribution
The Poisson distribution is the simplest distribution for modeling count data. However, the Poisson has only one parameter λ and does not allow the variance to be adjusted independently of the mean.
For count data, the variance is not guaranteed to equal the mean.
Example: Consider the following sample data:
5, 16, 50, 71, 123.
The sample variance is 2226.5, which is much larger than the sample mean of 53. This suggests that the Poisson distribution may not be appropriate for these data.
15 / 51
Definition
(1) When the observed variance is greater than the expected
variance, we say overdispersion has occurred; (2) When the
observed variance is less than the expected variance, we say
underdispersion has occurred.
Overdispersion or underdispersion is not an issue for the normal distribution, because the normal has a separate parameter σ², distinct from the mean, to measure variability.
In practice, overdispersion is more commonly observed than
underdispersion.
16 / 51
When overdispersion is observed, an alternative to Poisson is
the Negative Binomial (NB) distribution.
NB is defined as the distribution of the total number of failures in a sequence of independent Bernoulli trials before the kth success. Let X ~ NB(k, π). The pmf of X is
P(X = x) = C(x + k - 1, k - 1) π^k (1 - π)^x, x = 0, 1, 2, ....
For NB, we have
E(X) = k(1 - π)/π, Var(X) = k(1 - π)/π².
17 / 51
For convenience, we reparameterize the NB distribution as follows. Let μ = k(1 - π)/π. Then
π = 1/(1 + μ/k) = 1/(1 + Dμ),
where D = 1/k is referred to as the dispersion parameter.
With μ and D, we denote the parameterized NB by X ~ NB(μ, D), with pmf
P(X = x) = C(x + 1/D - 1, 1/D - 1) [1/(1 + Dμ)]^(1/D) [Dμ/(1 + Dμ)]^x.
18 / 51
For the parameterized NB, we have
E(X) = μ, Var(X) = μ + Dμ².
This mean-variance relationship provides enough flexibility for modeling real data.
When D → 0, the above NB distribution converges to the Poisson distribution with parameter μ. In this sense, NB can be treated as a generalized Poisson distribution.
In fact, NB can be derived as a Poisson-Gamma mixture, i.e., a Poisson distribution in which the mean itself is random and follows a Gamma distribution.
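A small sketch of this parameterization in Python: scipy.stats.nbinom is parameterized by the number of successes k and the success probability π, so we convert from (μ, D); the values μ = 4 and D = 0.5 are invented for illustration.

from scipy.stats import nbinom

mu, D = 4.0, 0.5            # illustrative mean and dispersion parameter
k = 1.0 / D                 # scipy's first shape parameter (number of successes)
pi = 1.0 / (1.0 + D * mu)   # scipy's second shape parameter (success probability)

mean, var = nbinom.stats(k, pi, moments="mv")
print(mean)   # equals mu = 4.0
print(var)    # equals mu + D * mu^2 = 12.0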
19 / 51
1.3 Statistical Inference for Population Parameters
1.3.1 Maximum Likelihood Estimation
In practice, the parameter values for the binomial, multinomial, Poisson, and NB distributions are unknown; they are often estimated from the sample data.
In this section, we consider estimating the parameters using maximum likelihood.
The likelihood function is the probability (or density) of the
observed data, expressed as a function of the parameter value.
The maximum likelihood estimate (MLE) is the parameter
value at which the likelihood function is maximized.
20 / 51
Let X_1, ..., X_n be i.i.d. random variables from a population with pmf or pdf f(x | θ), where θ = (θ_1, ..., θ_k) ∈ R^k. The likelihood function is given by
ℓ(θ | x) = ℓ(θ | x_1, ..., x_n) = Π_{i=1}^n f(x_i | θ).
The log-likelihood function is L(θ | x) = ln ℓ(θ | x).
The MLE of θ, denoted by θ̂_MLE, is the value of θ that maximizes L(θ | x) or ℓ(θ | x). If the likelihood function is differentiable in all θ_i, the MLE can be obtained by solving the following likelihood equations,
∂L(θ | x)/∂θ_i = 0, i = 1, ..., k.
21 / 51
Example 2
(1) Let X_1, ..., X_n ~ iid Bernoulli(π). The MLE of π is
π̂_MLE = (1/n) Σ_{i=1}^n X_i = X/n,
where X is the total number of successes in the n trials.
(2) Let X_1, ..., X_n ~ iid Binomial(m, π), with m known. The MLE of π is
π̂_MLE = (1/(mn)) Σ_{i=1}^n X_i.
(3) Let X_1, ..., X_n ~ iid Poisson(λ). The MLE of λ is
λ̂_MLE = (1/n) Σ_{i=1}^n X_i = X̄.
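As a sanity check on case (3), one can verify numerically that the Poisson log-likelihood is maximized at the sample mean. A sketch in Python with invented data:

import numpy as np
from scipy.optimize import minimize_scalar
from scipy.stats import poisson

x = np.array([2, 0, 3, 1, 4, 2, 1])   # invented Poisson counts

def neg_loglik(lam):
    # Negative log-likelihood of an i.i.d. Poisson(lam) sample
    return -np.sum(poisson.logpmf(x, lam))

res = minimize_scalar(neg_loglik, bounds=(1e-6, 20), method="bounded")
print(res.x)       # numerical MLE
print(x.mean())    # closed-form MLE: the sample mean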
22 / 51
Invariance Property of MLEs
Theorem (Invariance Property of MLEs)
If θ̂_MLE is the MLE of θ, then for any function τ(θ), the MLE of τ(θ) is τ(θ̂_MLE).
Example: Let X_1, ..., X_n ~ iid Poisson(λ). Note that the MLE of λ is λ̂_MLE = X̄. By the invariance property of MLEs, we have
(a) The MLE of λ² is X̄²;
(b) The MLE of √λ is √X̄;
(c) The MLE of ln(λ) is ln(X̄).
Example: Let X_1, ..., X_n ~ iid Bernoulli(π). The MLE of √(π(1 - π)) is √(X̄(1 - X̄)).
23 / 51
Asymptotic Normality of MLEs
Theorem (Asymptotic Normality of MLEs)
Let X_1, ..., X_n be i.i.d. random variables from f(x | θ), and let θ̂_n be the MLE of θ. Then, under some regularity conditions, as n → ∞,
√n (θ̂_n - θ) → N(0, v(θ)) in distribution,
where v(θ) is the Cramér-Rao lower bound on the variance of any unbiased estimator of θ.
Remark: The property that the distribution converges to the normal distribution as the sample size goes to infinity is called asymptotic normality. The asymptotic normality of MLEs is an important property and will be used repeatedly in this course.
24 / 51
1.3.2 Significance Tests about Population Parameters
A hypothesis is a statement about a population. The two complementary hypotheses in a hypothesis testing problem are called the null hypothesis and the alternative hypothesis. We denote them by H_0 and H_1, respectively.
Let θ denote a population parameter of interest. The general format of the null and alternative hypotheses about θ is
H_0: θ ∈ Θ_0 versus H_1: θ ∈ Θ_0^c,
where Θ = Θ_0 ∪ Θ_0^c is the entire parameter space.
25 / 51
A hypothesis test is a rule that decides, based on a sample from the population, which of the two complementary hypotheses is true. Specifically, we are interested in:
(a) for which sample values the decision is made to accept H_0;
(b) for which sample values H_0 is rejected and H_1 is accepted.

              | Accept H_0        | Reject H_0
H_0 is true   | correct decision  | type I error
H_1 is true   | type II error     | correct decision

Typically, a hypothesis test consists of the following three components:
(a) a test statistic T(X) = T(X_1, ..., X_n),
(b) a significance level α,
(c) a critical region that suggests rejection of H_0 based on the observed test statistic value.
26 / 51
Statistical Test about a Binomial Proportion
Recall that the MLE of the proportion π is the sample proportion π̂_MLE = X/n, where X is the total number of successes in n trials.
For ease of notation, we use p to represent the sample proportion. That is,
p = π̂_MLE = X/n.
Remark: the Greek letter π corresponds to the Roman letter p.
The sample proportion p has mean and variance
E(p) = π, Var(p) = π(1 - π)/n.
27 / 51
The sample proportion p is an unbiased estimator of π. In addition, the variance of p decreases toward zero as n increases. This shows that p is a consistent estimator of π.
By the asymptotic normality of MLEs (or by the Central Limit Theorem), the sampling distribution of p is approximately normal for large n. This suggests applying large-sample inferential methods for π.
Consider testing H_0: π = π_0 versus H_1: π ≠ π_0. Let the test statistic be
z = (p - π_0) / √(π_0(1 - π_0)/n).
Under H_0, the large-sample distribution of the z test statistic is the standard normal distribution.
28 / 51
Large-Sample Decision: For a given sample, we reject H_0 if
|z_obs| > z_(α/2),
where α is the significance level and z_α is the upper α point of the standard normal distribution.
For the two-sided test, let P-value = P(|Z| > |z_obs|), where Z ~ N(0, 1). The decision rule based on the P-value is:
Reject H_0 if P-value < α.
Remark: The smaller the P-value, the stronger the evidence against H_0.
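For concreteness, a minimal sketch of this large-sample test in Python; the numbers x = 60, n = 100, π_0 = 0.5, α = 0.05 are invented for illustration:

import numpy as np
from scipy.stats import norm

x, n, pi0, alpha = 60, 100, 0.5, 0.05   # invented example data
p = x / n

# z statistic with the null standard error sqrt(pi0 * (1 - pi0) / n)
z_obs = (p - pi0) / np.sqrt(pi0 * (1 - pi0) / n)

p_value = 2 * norm.sf(abs(z_obs))       # two-sided P-value
print(z_obs, p_value, p_value < alpha)  # reject H0 when P-value < alpha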
29 / 51
1.3.3 Confidence Intervals for Population Parameters
Point Estimation uses a single value to estimate the parameter θ. It represents our best guess for the true value of the parameter, but it provides little confidence that the guess is correct.
Interval Estimation is an alternative approach to estimating the parameter θ. It provides confidence in the following way: find a random interval that contains the true parameter with a pre-specified probability.
30 / 51
For a random sample X = (X_1, ..., X_n), an interval estimator of θ with coverage probability 1 - α is a random interval
[L(X), U(X)],
where P(θ ∈ [L(X), U(X)]) ≥ 1 - α for all θ.
[L(X), U(X)] is called a (1 - α) confidence interval (CI) of θ. The quantity 1 - α is referred to as the confidence level or confidence coefficient.
Note that L(X) and U(X) are random variables but θ is fixed (though unknown). If both L(X) and U(X) are finite, then the interval is called a two-sided CI; otherwise it is a one-sided CI.
31 / 51
Example 3
Let X_1, ..., X_n be i.i.d. from N(μ, σ²) with σ² known. Let
L(X) = X̄ - c and U(X) = X̄ + c.
Find the value of c so that the CI has confidence level 1 - α.
Solution: Note that X̄ ~ N(μ, σ²/n). We have
P(μ ∈ [L(X), U(X)]) = P(X̄ - c ≤ μ ≤ X̄ + c)
= P(-√n c/σ ≤ (X̄ - μ)/√(σ²/n) ≤ √n c/σ)
= 1 - 2Φ(-√n c/σ),
where Φ(·) is the cdf of N(0, 1). To make the CI have level 1 - α, we set 1 - 2Φ(-√n c/σ) = 1 - α, which gives c = z_(α/2) σ/√n. The CI is therefore
[X̄ - z_(α/2) σ/√n, X̄ + z_(α/2) σ/√n].
32 / 51
Confidence Interval for a Binomial Proportion
Let SE = √(p(1 - p)/n) denote the estimated standard error of p. It can be shown that (p - π)/SE is asymptotically standard normal.
A large-sample (1 - α) probability statement for (p - π)/SE is
-z_(α/2) ≤ (p - π)/SE ≤ z_(α/2).
This leads, equivalently, to the (1 - α) CI for π:
p ± z_(α/2) √(p(1 - p)/n).
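A sketch of the Wald interval computation in Python (x = 60 successes out of n = 100 is an invented example):

import numpy as np
from scipy.stats import norm

x, n, alpha = 60, 100, 0.05     # invented example data
p = x / n
se = np.sqrt(p * (1 - p) / n)   # estimated standard error of p
z = norm.ppf(1 - alpha / 2)     # z_{alpha/2}, about 1.96 for alpha = 0.05

print(p - z * se, p + z * se)   # 95% Wald CI for pi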
33 / 51
1.4 More on Statistical Inference for Discrete Data
In this section, we introduce three frequently used methods for conducting inference (significance tests and confidence intervals):
(1) Wald Test
(2) Likelihood Ratio Test
(3) Rao's Score Test
The above methods are general and apply to any parameter in a statistical model. Consider testing
H_0: θ = θ_0 versus H_1: θ ≠ θ_0.
We will apply the methods to test the binomial proportion π for illustration.
34 / 51
(1) Wald Test
Let θ̂ be the MLE of θ. Let SE be the standard error of θ̂, evaluated by substituting θ̂ for the unknown θ in the expression of sd(θ̂).
The Wald test statistic is defined as
z = (θ̂ - θ_0)/SE.
Under H_0, the test statistic z has approximately a standard normal distribution. Equivalently, z² has approximately a χ²_1 distribution.
Example: For testing H_0: π = π_0 in the binomial distribution, the Wald test statistic is
z = (p - π_0) / √(p(1 - p)/n).
35 / 51
(2) Likelihood Ratio Test
The likelihood ratio test statistic is defined as
-2 ln(Λ) = -2(L_0 - L_1),
where Λ = ℓ_0/ℓ_1 is the ratio of ℓ_0 (the maximized likelihood under the null space) to ℓ_1 (the maximized likelihood under the entire space), and L_0 = ln ℓ_0 and L_1 = ln ℓ_1 are the corresponding log-likelihood values.
Note that L_1 ≥ L_0, because L_1 is the maximum over the entire parameter space while L_0 is the maximum over only the null space. Therefore, the test statistic -2 ln(Λ) is always non-negative.
When H_0 is not true, the discrepancy between L_1 and L_0 can be large. This suggests rejecting H_0 for large values of -2 ln(Λ).
36 / 51
The reason for taking the natural log transformation and multiplying by -2 is that it yields an approximate chi-square distribution.
Specifically, under H_0, the test statistic -2 ln(Λ) follows approximately a chi-square distribution with ν degrees of freedom, where ν is equal to the difference between the numbers of free parameters in the entire and null spaces.
Based on the asymptotic distribution, the likelihood ratio test rejects H_0 if
-2 ln(Λ) > χ²_ν(α),
where χ²_ν(α) is the upper α point of the χ²_ν distribution.
37 / 51
Example 4
For testing H_0: π = π_0 in the binomial distribution, the likelihood ratio test statistic is
-2 ln(Λ) = -2 ln [ π_0^x (1 - π_0)^(n-x) / ( p^x (1 - p)^(n-x) ) ].
In addition, noting that there is 1 free parameter in the entire space and 0 free parameters in the null space, we have ν = 1. This implies that -2 ln(Λ) follows approximately χ²_1 under H_0.
Finally, we reject H_0 if
-2 ln(Λ) > χ²_1(α).
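A sketch of this likelihood ratio test in Python (the binomial coefficient cancels in the ratio, so logpmf can be used directly; x = 60, n = 100, π_0 = 0.5 are invented values):

from scipy.stats import binom, chi2

x, n, pi0, alpha = 60, 100, 0.5, 0.05   # invented example data
p = x / n

# -2 ln(Lambda) = 2 * [log-likelihood at the MLE p minus log-likelihood at pi0]
lr_stat = 2 * (binom.logpmf(x, n, p) - binom.logpmf(x, n, pi0))

p_value = chi2.sf(lr_stat, df=1)
print(lr_stat, p_value, lr_stat > chi2.ppf(1 - alpha, df=1))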
38 / 51
(3) Rao's Score Test
Rao's score test is motivated by the score function, which is defined as
s(θ) = ∂L(θ | x)/∂θ.
For any θ value, it can be shown that E(s(θ)) = 0.
Under H_0, we expect s(θ_0) to be near zero. When H_0 is not true, s(θ_0) tends to be away from zero. This suggests rejecting H_0 for large values of |s(θ_0)|, up to a certain scaling.
The formal definition of Rao's score test statistic is more complicated, so we skip the details. However, it is worth mentioning that Rao's score test statistic is similar to the Wald test statistic, except that it evaluates the standard error under the assumption that H_0 is true.
39 / 51
Example 5
For testing H_0: π = π_0 in the binomial distribution, recall that the Wald test statistic is
z = (p - π_0)/SE = (p - π_0) / √(p(1 - p)/n),
where the standard error SE is evaluated at the ML estimate.
For Rao's score test, we evaluate SE under the assumption that H_0 is true. This leads to SE = √(π_0(1 - π_0)/n), so that the score test statistic is
z = (p - π_0) / √(π_0(1 - π_0)/n).
Under H_0, the score test statistic z has approximately a standard normal distribution.
40 / 51
Comparison
The Wald, likelihood ratio, and Rao's score tests are the three major ways to construct significance tests for parameters in statistical models.
For normal data, the three tests provide identical results. For other data, the three tests are asymptotically equivalent when n is large.
Some relationships among the three tests are:
(a) The Wald test requires calculation of the MLE under Θ;
(b) The likelihood ratio test requires calculation of the MLEs under both Θ_0 and Θ;
(c) Rao's score test requires calculation of the MLE under Θ_0.
41 / 51
Test-based Confidence Intervals
For each test, we have a corresponding confidence interval. This is based on inverting the results of the significance test:
A (1 - α) CI contains all the values that would not be rejected by the test at the α significance level.
Consider estimating π in the binomial distribution. For a given sample proportion p and sample size n, the upper and lower bounds of a (1 - α) CI for π, based on the Wald test, are the values π_0 that satisfy
|p - π_0| / √(p(1 - p)/n) = z_(α/2).
This is equivalent to saying that the CI of π is
p ± z_(α/2) √(p(1 - p)/n).
42 / 51
If we use the score test, then the upper and lower bounds of a (1 - α) CI for π are the values π_0 that satisfy
|p - π_0| / √(π_0(1 - π_0)/n) = z_(α/2).
To obtain the CI, we can square both sides to give a quadratic equation in π_0 and then solve it for π_0.
Similarly, we may use the likelihood ratio test to construct confidence intervals. This confirms, again, that we have a corresponding confidence interval for each test.
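A sketch of solving that score-test quadratic numerically in Python; this yields what is commonly called the Wilson (score) interval, with x = 60 and n = 100 as invented values:

import numpy as np
from scipy.stats import norm

x, n, alpha = 60, 100, 0.05
p = x / n
z = norm.ppf(1 - alpha / 2)

# Squaring |p - pi0| / sqrt(pi0 * (1 - pi0) / n) = z_{alpha/2} gives
# (1 + z^2/n) * pi0^2 - (2p + z^2/n) * pi0 + p^2 = 0
a = 1 + z**2 / n
b = -(2 * p + z**2 / n)
c = p**2
lower, upper = sorted(np.roots([a, b, c]))
print(lower, upper)   # (1 - alpha) score CI for pi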
43 / 51
1.5 Small-Sample Inference for Discrete Data
Recall that the above three tests are all asymptotic tests that require a large sample size n. In addition, the three tests are asymptotically equivalent as n → ∞.
For small sample sizes, however, the three tests can be very different, and their normal or χ² approximations may have large errors.
In view of this, for small sample sizes it is safer to use the discrete distribution directly (rather than a normal or χ² approximation) to calculate the exact P-value.
44 / 51
P-value
Definition
The P-value is the probability of obtaining a test statistic value that is at least as extreme as the one that was actually observed, under the assumption that H_0 is true.
For discrete distributions, a value that is at least as extreme as the observed value x is defined to be a value that has a probability less than or equal to P(X = x), in the direction of H_1.
For continuous distributions, a value that is at least as extreme as the observed value x is defined to be a value that has a density less than or equal to f(x), in the direction of H_1.
45 / 51
Example 6
Let X = x be the number of successes out of 7 trials. Given π = 0.4, the pmf of X is

x    P(X = x) = C(7, x) (0.4)^x (0.6)^(7-x)
0    0.028
1    0.131
2    0.261
3    0.290
4    0.194
5    0.077
6    0.017
7    0.002
46 / 51
Example 6 - ctd
Consider the one-sided test H_0: π = 0.4 versus H_1: π < 0.4. If the observed x value is 1, the exact P-value is
P-value = P(X = 1) + P(X = 0) = 0.131 + 0.028 = 0.159.
Consider the one-sided test H_0: π = 0.4 versus H_1: π > 0.4. If the observed x value is 5, the exact P-value is
P-value = P(X = 5) + P(X = 6) + P(X = 7) = 0.077 + 0.017 + 0.002 = 0.096.
47 / 51
Example 6 - ctd
Now consider the two-sided test H_0: π = 0.4 versus H_1: π ≠ 0.4. If the observed x value is 6, what is the exact P-value?
P-value = P(X = 0) + P(X = 1) + P(X = 6) + P(X = 7) = 0.178.
The above answer is NOT correct. Why? Only values whose probability is less than or equal to P(X = 6) = 0.017 are at least as extreme as the observed value; P(X = 0) = 0.028 and P(X = 1) = 0.131 are larger and so are excluded. By definition, the exact P-value should be
P-value = P(X = 6) + P(X = 7) = 0.019.
What if the observed x value is 0? The exact P-value is
P-value = P(X = 0) + P(X = 6) + P(X = 7) = 0.047.
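These exact P-values can be reproduced with a short Python sketch that implements the "probability at least as extreme" definition for the two-sided test:

from scipy.stats import binom

n, pi0 = 7, 0.4
pmf = {x: binom.pmf(x, n, pi0) for x in range(n + 1)}

def exact_two_sided_pvalue(x_obs):
    # Sum the probabilities of all outcomes no more likely than the observed one
    return sum(p for p in pmf.values() if p <= pmf[x_obs])

print(exact_two_sided_pvalue(6))   # about 0.019
print(exact_two_sided_pvalue(0))   # about 0.047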
48 / 51
Exact Test
Such a test, which uses the exact distribution rather than an asymptotic distribution to calculate the P-value, is called an exact test.
There are two problems with using exact tests:
(a) It is often very difficult to obtain the exact distribution in many complicated situations.
(b) If the distribution is discrete, the exact test is always conservative. As a consequence, the power is also lower than it could be.
49 / 51
Conservative Test
In Example 6, we reject H_0 at the 0.05 significance level if and only if the observed x value is 0, 6, or 7, because these values give P-values less than 0.05, whereas any other value gives a P-value higher than 0.05.
With the above decision rule, we commit a type I error if the true π is really 0.4 but the observed x value is 0, 6, or 7. This leads to the actual probability of committing a type I error being
P(X = 0, 6, or 7) = 0.047.
If the actual probability of committing a type I error at the α significance level is strictly less than α, then the test is called a conservative test.
50 / 51
Mid P-value
The mid P-value is defined as
mid P-value = 0.5 × P(the observed value) + P(more extreme values).
To diminish the conservativeness of exact tests, we may use the mid P-value instead of the exact P-value to make the decision. Using mid P-values leads to less conservative tests.
However, the mid P-value has a major disadvantage: the probability of committing a type I error may exceed the significance level α, leading to a liberal test.
Therefore, using mid P-values for conservative tests is only another option; it is commonly used but not mandatory.
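Continuing the Example 6 sketch, a two-sided mid P-value can be computed as follows (a hypothetical helper using the "less likely than observed" convention, not taken from the slides):

from scipy.stats import binom

n, pi0 = 7, 0.4
pmf = {x: binom.pmf(x, n, pi0) for x in range(n + 1)}

def mid_pvalue_two_sided(x_obs):
    # Half the probability of the observed value plus the total probability
    # of strictly less likely (more extreme) values
    more_extreme = sum(p for p in pmf.values() if p < pmf[x_obs])
    return 0.5 * pmf[x_obs] + more_extreme

print(mid_pvalue_two_sided(6))   # smaller than the exact P-value of 0.019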
51 / 51