
Unit 3 Hypothesis Testing

Hypothesis is usually considered as the principal instrument in research. The main goal in many
research studies is to check whether the data collected support certain statements or predictions.
A statistical hypothesis is an assertion or conjecture concerning one or more populations. A
test of hypothesis is the process of testing the significance of a claim about the parameters of a
population on the basis of a sample drawn from it. Thus, it is also termed a "Test of
Significance".
In short, hypothesis testing enables us to make probability statements about population
parameters. The hypothesis may not be proved absolutely, but in practice it is accepted if it has
withstood critical testing.
Points to be considered while formulating Hypothesis

• Hypothesis should be clear and precise.
• Hypothesis should be capable of being tested.
• Hypothesis should state the relationship between variables.
• Hypothesis should be limited in scope and must be specific.
• Hypothesis should be stated, as far as possible, in simple terms so that it is easily
understandable by all concerned.
• Hypothesis should be amenable to testing within a reasonable time.
• Hypothesis should have an empirical reference.
Types of Hypothesis:
There are two types of hypothesis, i.e., Research Hypothesis and Statistical Hypothesis
1. Research Hypothesis: A research hypothesis is a tentative solution for the problem
being investigated. It is the supposition that motivates the researcher to accomplish
future course of action. In research, the researcher determines whether or not their
supposition can be supported through scientific investigation.
2. Statistical Hypothesis: Statistical hypothesis is a statement about the population which
we want to verify on the basis of sample taken from population. Statistical hypothesis
is stated in such a way that they may be evaluated by appropriate statistical techniques.
Types of Statistical Hypotheses
There are two types of statistical hypotheses:
1. Null Hypothesis (H0) – A statistical hypothesis that states that there is no difference
between a parameter and a specific value, or that there is no difference between two
parameters.
2. Alternative Hypothesis (H1 or Ha) – A statistical hypothesis that states the existence
of a difference between a parameter and a specific value, or states that there is a
difference between two parameters. The alternative hypothesis is framed as the
negation of the null hypothesis.
Suppose we want to test the hypothesis that the population mean (µ) is equal to the
hypothesised mean (µH0) = 100. Then we would say that the null hypothesis is that the
population mean is equal to the hypothesised mean 100 and symbolically we can
express as:
H0: µ = µ H0 = 100
If our sample results do not support this null hypothesis, we should conclude that
something else is true. What we conclude on rejecting the null hypothesis is known as the
alternative hypothesis. In other words, the set of alternatives to the null hypothesis is
referred to as the alternative hypothesis. If we accept H0, then we are rejecting H1 and
if we reject H0, then we are accepting H1. For H0: µ = µH0 = 100, we may consider three
possible alternative hypotheses as follows:
Alternative hypothesis To be read as follows
H1: µ ≠ µHo (The alternative hypothesis is that the population mean
is not equal to 100 i.e., it may be more or less than 100)

H1: µ > µHo (The alternative hypothesis is that the population mean
is greater than 100)
H1: µ < µHo (The alternative hypothesis is that the population mean
is less than 100)
The null hypothesis and the alternative hypothesis are chosen before the sample is
drawn (the researcher must avoid the error of deriving hypotheses from the data that
he/she collects and then testing the hypotheses from the same data). In the choice of
null hypothesis, the following considerations are usually kept in view:
1. Alternative hypothesis is usually the one which one wishes to prove and the null
hypothesis is the one which one wishes to disprove. Thus, a null hypothesis
represents the hypothesis we are trying to reject, and alternative hypothesis
represents all other possibilities.
2. The null hypothesis should always be a specific hypothesis, i.e., it should not state
that a parameter is approximately equal to a certain value.
3. In testing hypothesis, there are two possible outcomes:
• Reject H0 and accept H1 because of sufficient evidence in the sample in
favour of H1;
• Do not reject H0 because of insufficient evidence to support H1.
BASIC CONCEPTS CONCERNING TESTING OF HYPOTHESES
1. The level of significance: This is a very important concept in the context of hypothesis
testing. It is always some percentage (usually 5%) which should be chosen with great
care, thought and reason. In case we take the significance level at 5 per cent, then this
implies that H0 will be rejected when the sampling result (i.e., observed evidence) has
a less than 0.05 probability of occurring if H0 is true. In other words, the 5 per cent level
of significance means that researcher is willing to take as much as a 5 per cent risk of
rejecting the null hypothesis when it (H0) happens to be true. Thus, the significance
level is the maximum value of the probability of rejecting H0 when it is true and is
usually determined in advance before testing the hypothesis.
2. Decision rule or Test of Hypothesis: A decision rule is a procedure that the researcher
uses to decide whether to accept or reject the null hypothesis. It is a statement that tells
under what circumstances to reject the null hypothesis, based on specific values of the
test statistic (e.g., reject H0 if the calculated value exceeds the table value at the chosen
level of significance).
3. Types of Error: In the context of testing of hypotheses, there are basically two types
of errors we can make.
a. Type 1 error: To reject the null hypothesis when it is true is to make what is
known as a type I error. The level at which a result is declared significant is
known as the type I error rate, often denoted by α.
b. Type II error: If we do not reject the null hypothesis when in fact there is a
difference between the groups, we make what is known as a type II error. The
type II error rate is often denoted as β.

In a tabular form the said two errors can be presented as follows:


Particulars                    Decision
                 Accept H0                    Reject H0
H0 (True)        Correct decision             Type I error (α error)
H0 (False)       Type II error (β error)      Correct decision
4. One- tailed and Two-tailed Tests: A test of statistical hypothesis, where the region of
rejection is on only one side of the sampling distribution, is called a one tailed test. For
example, suppose the null hypothesis states that the mean is less than or equal to 10.
The alternative hypothesis would be that the mean is greater than 10. The region of
rejection would consist of a range of numbers located on the right side of sampling
distribution i.e., a set of numbers greater than 10.
A test of statistical hypothesis, where the region of rejection is on both sides of the
sampling distribution, is called a two-tailed test. For example, suppose the null
hypothesis states that the mean is equal to 10. The alternative hypothesis would be that
the mean is less than 10 or greater than 10. The region of rejection would consist of a
range of numbers located on both sides of sampling distribution; i.e., the region of
rejection would consist partly of numbers that were less than 10 and partly of numbers
that were greater than 10.
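These rejection regions can be illustrated numerically. The sketch below (an illustration, not part of the original text) uses Python's standard-library `statistics.NormalDist` to compute the α = 0.05 cut-off points for the two kinds of test:

```python
from statistics import NormalDist

alpha = 0.05
std_normal = NormalDist()  # standard normal distribution, mean 0, sd 1

# One-tailed (right) test: the whole alpha sits in the upper tail
one_tailed_cut = std_normal.inv_cdf(1 - alpha)      # about 1.645

# Two-tailed test: alpha is split between the two tails
two_tailed_cut = std_normal.inv_cdf(1 - alpha / 2)  # about 1.96
```

The one-tailed boundary (1.645) is closer to the centre than the two-tailed one (1.96) because all of α sits on a single side.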
Procedure of Hypothesis Testing
Procedure for hypothesis testing refers to all those steps that we undertake for making a choice
between the two actions i.e., rejection and acceptance of a null hypothesis. The various steps
involved in hypothesis testing are stated below:
1. Making a formal statement: The step consists in making a formal statement of the
null hypothesis (H0) and also of the alternative hypothesis (Ha or H1). This means that
hypotheses should be clearly stated, considering the nature of the research problem.
2. Selecting a significance level: The hypotheses are tested on a pre-determined level of
significance and as such the same should be specified. Generally, in practice, either 5%
level or 1% level is adopted for the purpose.
3. Deciding the distribution to use: After deciding the level of significance, the next step
in hypothesis testing is to determine the appropriate sampling distribution. The choice
generally remains between normal distribution and the t-distribution.
4. Selecting a random sample and computing an appropriate value: Another step is
to select a random sample(s) and compute an appropriate value from the sample data
concerning the test statistic utilizing the relevant distribution. In other words, draw a
sample to furnish empirical data.
5. Calculation of the probability: One has then to calculate the probability that the
sample result would diverge as widely as it has from expectations, if the null hypothesis
were in fact true.
6. Comparing the probability and Decision making: Yet another step consists in
comparing the probability thus calculated with the specified value for α, the
significance level. If the calculated probability is equal to or smaller than the α value in
case of one-tailed test (and α /2 in case of two-tailed test), then reject the null hypothesis
(i.e., accept the alternative hypothesis), but if the calculated probability is greater, then
accept the null hypothesis.
The general procedure for hypothesis testing stated above can also be depicted in the form
of a flow chart.
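The six steps above can also be sketched as a small function. This is a minimal illustration for a one-sample z-test (the function name and interface are this sketch's own, not from the text), reporting the step-4 test statistic, the step-5 probability, and the step-6 decision:

```python
from statistics import NormalDist

def one_sample_z_test(x_bar, mu0, sigma, n, alpha=0.05, tail="two"):
    """Steps 3-6 of the procedure for a one-sample z test.

    x_bar: sample mean; mu0: hypothesised mean (H0: mu = mu0);
    sigma: known population standard deviation; n: sample size.
    Returns (z, p_value, reject_H0).
    """
    z = (x_bar - mu0) / (sigma / n ** 0.5)   # step 4: compute the test statistic
    nd = NormalDist()
    if tail == "right":     # H1: mu > mu0
        p = 1 - nd.cdf(z)
    elif tail == "left":    # H1: mu < mu0
        p = nd.cdf(z)
    else:                   # H1: mu != mu0 (two-tailed)
        p = 2 * (1 - nd.cdf(abs(z)))
    return z, p, p <= alpha  # steps 5-6: probability and decision

# Example: sample mean 90, hypothesised mean 82, sigma 20, n = 81
z, p, reject = one_sample_z_test(90, 82, 20, 81, tail="right")
```

For the two-tailed case the sketch doubles the tail probability and compares it with α, which is equivalent to comparing the single-tail probability with α/2 as in step 6.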
Tests of Hypotheses
Hypothesis testing determines the validity of the assumption (technically described as the null
hypothesis) with a view to choosing between two conflicting hypotheses about the value of a
population parameter. Hypothesis testing helps us decide, on the basis of sample data, whether
a hypothesis about the population is likely to be true or false. Statisticians have developed
several tests of hypotheses (also known as the tests of significance) for the purpose of testing
of hypotheses which can be classified as:
a) Parametric tests or standard tests of hypotheses; and
b) Non-parametric tests or distribution-free test of hypotheses.
Parametric tests usually assume certain properties of the parent population from which we draw
samples. Assumptions like observations come from a normal population, sample size is large,
assumptions about the population parameters like mean, variance, etc., must hold good before
parametric tests can be used. But there are situations when the researcher cannot or does not
want to make such assumptions. In such situations we use statistical methods for testing
hypotheses which are called non-parametric tests because such tests do not depend on any
assumption about the parameters of the parent population. Besides, most non-parametric tests
assume only nominal or ordinal data, whereas parametric tests require measurement equivalent
to at least an interval scale. As a result, non-parametric tests need more observations than
parametric tests to achieve the same size of Type I and Type II errors.
IMPORTANT PARAMETRIC TESTS
The important parametric tests are: (1) z-test; (2) t-test; and (3) F-test. All these tests are based
on the assumption of normality i.e., the source of data is considered to be normally distributed.
1. z- test: It is based on the normal probability distribution and is used for judging the
significance of several statistical measures, particularly the mean. This is a most
frequently used test in research studies. This test is used even when binomial
distribution or t-distribution is applicable on the presumption that such a distribution
tends to approximate normal distribution as ‘n’ becomes larger. z-test is generally used
for comparing the mean of a sample to some hypothesised mean for the population in
case of large sample, or when population variance is known. The z-test is also used for
judging the significance of the difference between the means of two independent samples
in case of large samples, or when population variance is known. It is also used for comparing
a sample proportion to a theoretical value of the population proportion, or for judging the
difference in proportions of two independent samples when n happens to be large. Besides,
this test may be used for judging the significance of the median, mode, coefficient of correlation
and several other measures.
2. t-test: It is based on the t-distribution and is considered an appropriate test for judging the
significance of a sample mean, or for judging the significance of the difference between the
means of two samples in case of small sample(s) when the population variance is not known (in
which case we use the variance of the sample as an estimate of the population variance). In case
two samples are related, we use the paired t-test (also known as the difference test) for judging
the significance of the mean of the differences between the two related samples. It can also be
used for judging the significance of the coefficients of simple and partial correlations.
3. F-test: It is based on the F-distribution and is used to compare the variances of two
independent samples. This test is also used in the context of analysis of variance (ANOVA)
for judging the significance of more than two sample means at one and the same time. It is
also used for judging the significance of multiple correlation coefficients.
Non parametric Tests
Non-parametric tests are used when the data are not normally distributed; therefore, the key is
to figure out whether you have normally distributed data. The only non-parametric test you are
likely to come across in elementary statistics is the chi-square test. However, there are several
others. For example, the Kruskal-Wallis test is the non-parametric alternative to the one-way
ANOVA, and the Mann-Whitney U test is the non-parametric alternative to the two-sample
t-test.
Z Test Example :
Problem 1:
A teacher claims that the mean score of students in his class is greater than 82 with a standard deviation of 20.
If a sample of 81 students was selected with a mean score of 90 then check if there is enough evidence to support
this claim at a 0.05 significance level.
As the sample size is 81 and population standard deviation is known, this is an example of a right-tailed one-sample z
test.

H0: μ = 82

H1: μ > 82

The test statistic is z = (x̄ − μ)/(σ/√n) = (90 − 82)/(20/√81) = 8/(20/9) = 3.6.

From the z table, the critical value at α = 0.05 is 1.645.

As 3.6 > 1.645, the null hypothesis is rejected and it is concluded that there is enough evidence to support the
teacher's claim.

Answer: Reject the null hypothesis
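The arithmetic of Problem 1 can be verified with a few lines of Python (a sketch; the critical value 1.645 is the one quoted above):

```python
import math

# Problem 1: H0: mu = 82, H1: mu > 82 (right-tailed), alpha = 0.05
x_bar, mu0, sigma, n = 90, 82, 20, 81
z = (x_bar - mu0) / (sigma / math.sqrt(n))  # = 8 / (20/9) = 3.6
reject = z > 1.645                          # compare with the right-tailed critical value
```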

Problem 2 :
An online medicine shop claims that the mean delivery time for medicines is less than 120 minutes with a
standard deviation of 30 minutes. Is there enough evidence to support this claim at a 0.05 significance level if 49
orders were examined with a mean of 100 minutes?
As the sample size is 49 and the population standard deviation is known, this is an example of a left-tailed one-sample z
test.

H0: μ = 120

H1: μ < 120

The test statistic is z = (100 − 120)/(30/√49) = −20/(30/7) ≈ −4.67.

As −4.67 < −1.645, the null hypothesis is rejected and it is concluded that there is enough evidence to support the
medicine shop's claim.

Answer: Reject the null hypothesis
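Problem 2 can be checked the same way (a sketch; −1.645 is the left-tailed critical value at α = 0.05):

```python
import math

# Problem 2: H0: mu = 120, H1: mu < 120 (left-tailed), alpha = 0.05
x_bar, mu0, sigma, n = 100, 120, 30, 49
z = (x_bar - mu0) / (sigma / math.sqrt(n))  # = -20 / (30/7), about -4.67
reject = z < -1.645                         # compare with the left-tailed critical value
```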


Problem 3:
A company wants to improve the quality of products by reducing defects and monitoring the efficiency of
assembly lines. In assembly line A, there were 18 defects reported out of 200 samples while in line B, 25 defects
out of 600 samples were noted. Is there a difference in the procedures at a 0.05 alpha level?
This is an example of a two-tailed two-proportion z test, with H0: p1 = p2 and H1: p1 ≠ p2,
where p1 and p2 are the defect proportions of lines A and B.

The sample proportions are p̂1 = 18/200 = 0.09 and p̂2 = 25/600 ≈ 0.0417, with pooled
proportion p̂ = (18 + 25)/(200 + 600) ≈ 0.0538. The test statistic is
z = (p̂1 − p̂2)/√(p̂(1 − p̂)(1/n1 + 1/n2)) ≈ 2.62.

As this is a two-tailed test, the alpha level needs to be divided by 2 to get 0.025.

Using this, the critical value from the z table is 1.96.

As 2.62 > 1.96, the null hypothesis is rejected and it is concluded that there is a significant difference between the
two lines.
Answer: Reject the null hypothesis
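The two-proportion statistic for Problem 3 can be checked as well (a sketch using the pooled-proportion standard error, the usual form of this test):

```python
import math

# Problem 3: two-tailed two-proportion z test, alpha = 0.05
x1, n1 = 18, 200   # defects in line A
x2, n2 = 25, 600   # defects in line B
p1, p2 = x1 / n1, x2 / n2
p_pool = (x1 + x2) / (n1 + n2)   # pooled proportion = 43/800
se = math.sqrt(p_pool * (1 - p_pool) * (1 / n1 + 1 / n2))
z = (p1 - p2) / se               # about 2.62
reject = abs(z) > 1.96           # two-tailed critical value at alpha = 0.05
```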

Limitations of the Test of Hypotheses


• Tests do not explain the reasons why the difference exists, say, between the means of
the two samples. They simply indicate whether the difference is due to fluctuations of sampling
or because of other reasons, but they do not tell us which other reason(s) is/are
causing the difference.
• Results of significance tests are based on probabilities and as such cannot be expressed with full
certainty.
• Statistical inferences based on significance tests cannot be said to be entirely correct
evidence concerning the truth of the hypotheses.

Conclusion:
A hypothesis is an educated guess about something in the world around us. Hypotheses are theoretical
guesses based on limited knowledge; they need to be tested. Thus, hypothesis testing is a decision-
making process for evaluating claims about a population. We use various statistical analyses to test
hypotheses and answer research questions. In formal hypothesis testing, we test the null hypothesis
and usually want to reject the null, because rejection of the null indirectly supports the alternative
hypothesis, the one we deduce from theory as a tentative explanation. Thus, a hypothesis test
compares two mutually exclusive statements about a population to determine which statement is best supported
by the sample data.

Chi Square- Test


The 2 test (pronounced as chi-square test) is an important and popular test of hypothesis which
fall is categorized in non-parametric test. This test was first introduced by Karl Pearson in the year
1900.

It is used to find out whether there is any significant difference between observed frequencies and
expected frequencies pertaining to any particular phenomenon. Here the frequencies are shown in the
different cells (categories) of a so-called contingency table. It is noteworthy that the observations
are taken in categorical form or rank order, not as continuous or normally distributed measurements.
The test is applied to assess how likely the observed frequencies would be if the null
hypothesis were true.
This test is also useful in ascertaining the independence of two random variables based on
observations of these variables.
This is a non-parametric test which is extensively used for the following reasons:
1. It is a distribution-free method, which does not rely on assumptions that the data are
drawn from a given parametric family of probability distributions.
2. It is easier to compute and simple enough to understand as compared to parametric tests.
3. It can be used in situations where parametric tests are not appropriate or where the
measurements prohibit the use of parametric tests.
It is defined as:

χ² = ∑ (O − E)² / E

where O refers to the observed frequencies and E refers to the expected frequencies.
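As a sketch, the formula translates directly into code (the helper name is this illustration's own):

```python
def chi_square_stat(observed, expected):
    """chi-square statistic: sum over all cells of (O - E)^2 / E."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

# Tiny example: O = [50, 30, 20] against E = [40, 40, 20]
stat = chi_square_stat([50, 30, 20], [40, 40, 20])  # 100/40 + 100/40 + 0 = 5.0
```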
Uses of Chi-Square Test

The chi-square test has a large number of applications where parametric tests cannot be applied.
Its uses can be summarized as under, along with examples:

(a) A test of independence.

This test is helpful in detecting the association between two or more attributes. Suppose we
have N observations classified according to two attributes. By applying this test to the given
observations (data) we try to find out whether the attributes have some association or are
independent. The association may be positive, negative or absent. For example, we can find
out whether there is any association between regularity in class and the division in which
students pass; similarly, we can find out whether quinine is effective in controlling fever
or not. In order to test whether or not the attributes are associated, we take the null hypothesis
that there is no association between the attributes under study; in other words, the two attributes
are independent.

After computing the value of chi-square, we compare the calculated value with its corresponding
critical value for the given degrees of freedom at a certain level of significance. If the calculated
value of χ² is less than the critical or table value, the null hypothesis is accepted and it is
concluded that the two attributes have no association, i.e., they are independent. On the
other hand, if the calculated value is greater than the table value, the results of the
experiment do not support the hypothesis; the hypothesis is rejected, and it is concluded that
the attributes are associated.

Illustration 1: From the data given in the following table, find out whether there is any
relationship between gender and the preference of colour.

Colour Male Female Total


Red 25 45 70
Blue 45 25 70
Green 50 10 60
Total 120 80 200

(Given: for ν = 2, the critical value of χ² at the 0.05 level is 5.991)


Solution: Let us take the following hypothesis:

Null Hypothesis 𝐻0: There is no relationship between gender and preference of colour.

Alternative Hypothesis 𝐻𝑎: There is relationship between gender and preference of colour.

We have to first calculate the expected value for the observed frequencies. These are shown
below along with the observed frequencies:

Colour Gender O E O−E (O−E)² (O−E)²/E


Red M 25 42 -17 289 6.88
F 45 28 17 289 10.32
Blue M 45 42 3 9 0.21
F 25 28 -3 9 0.32
Green M 50 36 14 196 5.44
F 10 24 -14 196 8.16
2= 31.33

The degrees of freedom are (r−1)(c−1) = (3−1)(2−1) = 2.

The critical value of χ² for 2 degrees of freedom at the 5% level of significance is 5.991.

Since the calculated χ² = 31.33 exceeds the critical value of χ², the null hypothesis is rejected.
Hence, the conclusion is that there is a definite relationship between gender and preference of
colour.
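A computational check of Illustration 1 (a sketch; note that summing the cell contributions at full precision gives about 31.35, while the worked table's 31.33 arises from rounding each cell to two decimals first):

```python
# Illustration 1: test of independence between gender and colour preference
observed = {"Red": (25, 45), "Blue": (45, 25), "Green": (50, 10)}  # (male, female)
col_totals = [sum(r[i] for r in observed.values()) for i in (0, 1)]  # [120, 80]
N = sum(col_totals)  # 200 observations in all

chi2 = 0.0
for male, female in observed.values():
    row_total = male + female
    for o, col_total in zip((male, female), col_totals):
        e = row_total * col_total / N        # E = (row total x column total) / N
        chi2 += (o - e) ** 2 / e

reject = chi2 > 5.991  # critical value for nu = 2 at the 0.05 level
```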

(B) A test of goodness of fit

It is the most important utility of the chi-square test. This method is mainly used for testing
goodness of fit. It attempts to establish whether an observed frequency distribution differs from an
estimated frequency distribution. When an ideal frequency curve, whether normal or of some other
type, is fitted to the data, we are interested in finding out how well this curve fits the
observed facts.
The following steps are followed for the above said purpose:

i. A null and an alternative hypothesis pertaining to the enquiry are established.

ii. A level of significance is chosen for rejection of the null hypothesis.

iii. A random sample of observations is drawn from the relevant statistical population.

iv. On the basis of the given actual observations, expected or theoretical frequencies are derived through
probability. This generally takes the form of assuming that a particular probability distribution is
applicable to the statistical population under consideration.

v. The observed frequencies are compared with the expected or theoretical frequencies.

vi. If the calculated value of χ² is less than the table value at a certain level of significance (generally the
5% level) and for certain degrees of freedom, the fit is considered to be good, i.e., the divergence between
the actual and expected frequencies is attributed to fluctuations of simple sampling. On the other hand, if
the calculated value of χ² is greater than the table value, the fit is considered to be poor, i.e., the divergence
cannot be attributed to the fluctuations of simple sampling; rather, it is due to the inadequacy of the theory
to fit the observed facts.

Illustration 2:

In an anti malaria campaign in a certain area, quinine was administered to 812 persons out of a
total population of 3248. The number of fever cases is shown below:

Treatment Fever (A) No fever (a) Total


Quinine (B) 140(AB) 30 (aB) 170 (B)
No Quinine (b) 60(Ab) 20 (ab) 80 (b)
Total 200(A) 50 (a) 250 (N)

Discuss the usefulness of quinine in checking malaria.

(Given: for ν = 1, the critical value of χ² at the 0.05 level is 3.84)


Solution: Let us take the following hypotheses:

Null Hypothesis 𝐻𝑂: Quinine is not effective in checking malaria.

Alternative Hypothesis 𝐻𝑎: Quinine is effective in checking malaria.

Applying 2 test:
(𝐴)𝑋(𝐵)
Expectated frequency of say AB = = 200𝑋 170 = 136
𝑁 250

Or 𝐸1, i.e., expected frequency corresponding to first row and first column is 60.The

table of expected frequencies shall be :

Treatment Fever No Fever Total


Quinine 136 34 170
No quinine 64 16 80
Total 200 50 250 (N)

Computation of Chi Square value

O     E     (O − E)²    (O − E)²/E
140   136   16          0.118
60    64    16          0.250
30    34    16          0.471
20    16    16          1.000
                        ∑(O − E)²/E = 1.839

χ² = ∑ (O − E)² / E = 1.839

Degrees of freedom ν = (r−1)(c−1) = (2−1)(2−1) = 1

Table value: for ν = 1, the critical value of χ² at the 0.05 level is 3.84

The calculated value of 2 i.e. 1.839 is less than the table value i.e. 3.84, the null hypothesis is
accepted. Hence quinine is not useful in checking malaria.
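Illustration 2 can be checked the same way (a sketch; at full precision χ² is about 1.84, and the table's 1.839 comes from rounding each cell contribution first):

```python
# Illustration 2: 2x2 chi-square test (quinine vs fever)
observed = [[140, 30],    # quinine:    fever, no fever
            [60, 20]]     # no quinine: fever, no fever
row_totals = [sum(row) for row in observed]        # [170, 80]
col_totals = [sum(col) for col in zip(*observed)]  # [200, 50]
N = 250

chi2 = 0.0
for i in range(2):
    for j in range(2):
        e = row_totals[i] * col_totals[j] / N      # expected frequency of the cell
        chi2 += (observed[i][j] - e) ** 2 / e

reject = chi2 > 3.84  # critical value for nu = 1 at the 0.05 level
```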
(C) A test of homogeneity

The 2 test of homogeneity is an extension of the 2 test of independence. Such tests indicate
whether two or more independent samples are drawn from the same population or from different
populations. Instead of one sample as we use in the independence problem, we shall now have
two or more samples. Supposes a test is given to students in two different higher secondary
schools. The sample size in both the cases is the same. The question we have to ask: is there any
difference between the two higher secondary schools? In order to find the answer, we have to set
up the null hypothesis that the two samples came from the same population. The word
‘homogeneous’ is used frequently in Statistics to indicate ‘the same’ or ‘equal’. Accordingly, we
can say that we want to test in our example whether the two samples are homogeneous. Thus, the
test is called a test of homogeneity.

Illustration 3: Two hundred bolts were selected at random from the output of each of five
machines. The numbers of defective bolts found were 5, 9, 13, 7 and 6. Is there a significant
difference among the machines? Use the 5% level of significance.

(Given: for ν = 4, the critical value of χ² at the 0.05 level is 9.488)

Solution: Let us take the following hypothesis:

𝐻𝑂: There is no significant difference among the machines.

𝐻𝑎: There is significant difference among the machines.

As there are five machines, the total number of defective bolts should be equally distributed among
these machines. That is how we can get expected frequencies as under:

Here the expected number of defective bolts for each machine (E) =
(Sum of defective bolts) / (No. of machines producing these defective bolts) = 40/5 = 8.
Computation of Chi Square test

Machine O E O−E (O−E)² (O−E)²/E


1 5 8 -3 9 1.125
2 9 8 1 1 0.125
3 13 8 5 25 3.125
4 7 8 -1 1 0.125
5 6 8 -2 4 0.5
∑(O − E)²/E = 5.00

Decision: The critical value of χ² at the 0.05 level of significance for 4 degrees of freedom is
9.488. As the calculated value of χ² = 5 is less than the critical value, H0 is accepted. In
other words, the difference among the five machines in respect of defective bolts is not
significant.
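A quick check of Illustration 3 (a sketch; under H0 the 40 defects are spread evenly over the machines):

```python
# Illustration 3: test of homogeneity across five machines
defects = [5, 9, 13, 7, 6]
expected = sum(defects) / len(defects)  # 40 / 5 = 8 defects per machine under H0
chi2 = sum((o - expected) ** 2 / expected for o in defects)  # 5.0
reject = chi2 > 9.488  # critical value for nu = 4 at the 0.05 level
```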
