
Chi Square Test

A brief overview
Contents
 Recap
 Chi Square test
 ANOVA

2 CM Young Professionals Fellowship 10/2/2019


Programme
RECAP
 Parametric tests: Tests that assume the data come from a population with a known or assumed distribution (typically normal) and draw inferences about its parameters, such as the mean, standard deviation, standard error, correlation coefficient, or proportion.
 They make assumptions about the population parameters.
 The assumptions need to hold for the results to be correct.
 Non-parametric tests: No population parameter is estimated, and the data need not follow any specific distribution. These tests make fewer assumptions, not none.
 E.g. classifying items as good, better, and best (ordinal data).

RECAP
 Hypothesis
 Null Hypothesis
 Alternate Hypothesis
 Degrees of freedom (df): The number of values that are free to vary once the restrictions on the data are taken into account. If we have n observed frequencies subject to k independent restrictions, then df = n - k.
 For a contingency table with r rows and c columns, df = (r-1)(c-1).
 Contingency Table: A type of table in a matrix format that
displays the (multivariate) frequency distribution of the
variables. ... They provide a basic picture of the interrelation
between two variables and can help find interactions between
them.

What is chi square test?
 It is used to investigate whether distributions of
categorical variables differ from one another. It compares
the tallies or counts of categorical responses between
two or more independent groups.
 It can only be used on actual counts (frequencies), not on percentages, proportions, or means.
 We measure the difference between what is observed
and what is expected.
 Of the many chi square tests available, Pearson’s chi
square is the most popular.

Conditions for chi square test
 Observations must be recorded and collected on a
random basis.
 All items of the sample are independent.
 No group should contain very few items, say fewer than 10.
 The total number of items should be large, i.e. at least 50.

Application/Utility
The test is used in
1) Goodness of fit of distributions
2) Test of independence of attributes
3) Test of homogeneity

Test of Goodness of Fit of Distributions
 It is used to find out whether the observed values of a given phenomenon differ significantly from the expected values.
 The term goodness of fit refers to comparing the observed sample distribution with the expected probability distribution.
 It determines how well a theoretical distribution (such as normal, binomial, or Poisson) fits the empirical distribution.

How to conduct the test?
 1. The null hypothesis assumes that there is no significant difference between the observed and the expected values.
 2. The alternative hypothesis assumes that there is a significant difference between the observed and the expected values.
 3. Compute the value of the chi-square goodness-of-fit statistic using the following formula:

$$\chi^2 = \sum \frac{(O - E)^2}{E}$$

where O = observed value and E = expected value, and the sum runs over all categories.
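The chi-square statistic, the sum of (O - E)^2 / E over all categories, can be sketched in a few lines of Python. The function name and the coin-toss counts below are hypothetical and serve only as an illustration.

```python
def chi_square_statistic(observed, expected):
    """Return the sum of (O - E)^2 / E over all categories."""
    if len(observed) != len(expected):
        raise ValueError("observed and expected must have the same length")
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


# Hypothetical data: 100 coin tosses with 40 heads and 60 tails,
# compared with a fair coin (50 expected in each category).
stat = chi_square_statistic([40, 60], [50, 50])
print(stat)  # 4.0
```

The statistic is then compared with the tabulated chi-square value for the chosen significance level and degrees of freedom.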

Chi Square Test of Independence
 Applied when you have two categorical variables from a single population. It is used to determine whether there is a significant association between the two variables.
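For reference, this is how the statistic is usually computed for a contingency table with r rows and c columns. The symbols R_i (total of row i), C_j (total of column j), and N (grand total) are notation introduced here, not taken from the slides; O and E are the observed and expected frequencies as before.

```latex
E_{ij} = \frac{R_i \, C_j}{N}, \qquad
\chi^2 = \sum_{i=1}^{r} \sum_{j=1}^{c} \frac{(O_{ij} - E_{ij})^2}{E_{ij}}, \qquad
df = (r-1)(c-1)
```

Each expected frequency is what the cell would contain if the two variables were independent, so large values of the statistic count as evidence against independence.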

Chi Square Test of Homogeneity
 The test is applied to a single categorical variable from two or more different populations. It is used to determine whether frequency counts are distributed identically across the different populations.

Limitations

 The chi-square test does not give us much information about the strength of the relationship or its substantive significance in the population.
 The chi-square test is sensitive to sample size. For fixed cell proportions, the size of the calculated chi-square is directly proportional to the size of the sample, independent of the strength of the relationship between the variables.
 The chi-square test is also sensitive to small expected frequencies in one or more of the cells in the table.
 As sample size increases, absolute differences become a smaller and smaller proportion of the expected value. This means that a reasonably strong association may not come up as significant if the sample size is small. Conversely, in large samples we may find statistical significance when the effects are small and uninteresting, i.e. the findings are statistically significant but not substantively significant.
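The sensitivity to sample size can be illustrated with a small sketch. The 2 x 2 table below is hypothetical; multiplying every cell by 10 keeps the proportions (and so the strength of the association) unchanged, yet the statistic grows tenfold. The value 3.841 is the standard tabulated 5% critical value for df = 1.

```python
def chi_square_independence(table):
    """Chi-square statistic for a contingency table given as a list of rows."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / grand_total
            stat += (observed - expected) ** 2 / expected
    return stat


CRITICAL_0_05_DF1 = 3.841  # tabulated chi-square value, alpha = 0.05, df = 1

small = [[12, 8], [8, 12]]                           # hypothetical, N = 40
large = [[10 * cell for cell in row] for row in small]  # same proportions, N = 400

small_stat = chi_square_independence(small)
large_stat = chi_square_independence(large)
print(round(small_stat, 2), small_stat > CRITICAL_0_05_DF1)  # 1.6 False
print(round(large_stat, 2), large_stat > CRITICAL_0_05_DF1)  # 16.0 True
```

The same pattern of association is not significant with 40 observations and is significant with 400.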
Practice Problems
 1. As a personnel director, you want to test the perception of fairness of three methods of performance evaluation. Of 180 employees,
63 rated Method 1 as fair
45 rated Method 2 as fair
72 rated Method 3 as fair
Is there any difference in perception?
H0: p1 = p2 = p3
H1: At least one proportion is different

Solution
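Observed frequency   Expected frequency   O-E    (O-E)^2   (O-E)^2/E
63                   60                     3      9       0.15
45                   60                   -15    225       3.75
72                   60                    12    144       2.40
Total                                                      6.30

Under H0 each method is expected to be rated fair by 180/3 = 60 employees. Take alpha = 0.05. With df = 3 - 1 = 2, the chi-square value is 0.15 + 3.75 + 2.40 = 6.30, which exceeds the tabulated value of 5.991. Reject the null hypothesis: at least one proportion is different.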
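The arithmetic of this solution can be checked with a short script. The critical value 5.991 is the standard tabulated chi-square value for alpha = 0.05 and df = 2. The p-value line relies on the closed form exp(-x/2), which holds only for df = 2 and is added here for illustration.

```python
import math

observed = [63, 45, 72]
total = sum(observed)                              # 180 employees
expected = [total / len(observed)] * len(observed)  # 60 per method under H0

chi_sq = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
df = len(observed) - 1

CRITICAL_0_05_DF2 = 5.991          # tabulated value, alpha = 0.05, df = 2
p_value = math.exp(-chi_sq / 2)    # closed form, valid only when df = 2

print(round(chi_sq, 2), df)         # 6.3 2
print(chi_sq > CRITICAL_0_05_DF2)   # True, so H0 is rejected
print(round(p_value, 3))            # 0.043
```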

Probability Table

Practice Problems
 In many families this is the distribution of blood groups
among children.
 Blood Group A: 26
 Blood Group B: 31
 Blood Group AB: 39
 Blood Group O: 24
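The slide does not state the hypothesis to be tested. The sketch below assumes the intended question is whether the four blood groups are equally frequent (a uniform expected distribution); that assumption is not from the source. The value 7.815 is the standard tabulated chi-square value for alpha = 0.05 and df = 3.

```python
observed = {"A": 26, "B": 31, "AB": 39, "O": 24}

total = sum(observed.values())     # 120 children
expected = total / len(observed)   # 30 per group, ASSUMING equal frequencies

chi_sq = sum((o - expected) ** 2 / expected for o in observed.values())
df = len(observed) - 1

CRITICAL_0_05_DF3 = 7.815          # tabulated value, alpha = 0.05, df = 3

print(round(chi_sq, 2), df)        # 4.47 3
print(chi_sq > CRITICAL_0_05_DF3)  # False, so H0 is not rejected
```

Under this assumed hypothesis the data are consistent with equal frequencies at the 5% level.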

Practice Problems
 A researcher is interested in voting preferences on gun control issues. A questionnaire was sent to a random sample of 90 people, and information on party membership was also collected.

Party          Favour   Neutral   Oppose   Row total
A              10       10        30       50
B              15       15        10       40
Column total   25       25        40       90
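A minimal sketch of the test of independence for this table, with expected counts computed as (row total x column total) / grand total. The function names are illustrative, and 5.991 is the standard tabulated chi-square value for alpha = 0.05 and df = 2.

```python
def expected_counts(table):
    """Expected cell counts under independence."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    return [[r * c / grand_total for c in col_totals] for r in row_totals]


def chi_square_independence(table):
    """Sum of (O - E)^2 / E over every cell of the table."""
    expected = expected_counts(table)
    return sum(
        (o - e) ** 2 / e
        for obs_row, exp_row in zip(table, expected)
        for o, e in zip(obs_row, exp_row)
    )


# Rows: Party A, Party B. Columns: Favour, Neutral, Oppose.
votes = [[10, 10, 30], [15, 15, 10]]

chi_sq = chi_square_independence(votes)
df = (len(votes) - 1) * (len(votes[0]) - 1)
CRITICAL_0_05_DF2 = 5.991  # tabulated value, alpha = 0.05, df = 2

print(round(chi_sq, 3), df)        # 11.025 2
print(chi_sq > CRITICAL_0_05_DF2)  # True
```

Since 11.025 exceeds 5.991, the hypothesis that preference is independent of party membership is rejected at the 5% level.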

Other examples
 A die is thrown 132 times. Is it unbiased?
 In a study of the television viewing habits of children, a
developmental psychologist selects a random sample of
300 first graders - 100 boys and 200 girls. Each child is
asked which of the following TV programs they like best:
The Lone Ranger, Sesame Street, or The Simpsons.
               Lone Ranger   Sesame Street   The Simpsons   Row total
Boys           50            30              20             100
Girls          50            80              70             200
Column total   100           110             90             300

Examples
 Conducting a drug trial on a group of animals.
 Ha: Animals receiving drug would show increase in heart
rate.

              Increase in heart rate   No increase in heart rate   Total
Treated       36                       14                          50
Not treated   30                       25                          55
Total         66                       39                          105

Examples
 Incidence of malaria in three regions

               Asia   Africa   South America   Row total
Malaria A      31     14       45              90
Malaria B      2      5        53              60
Malaria C      53     45       2               100
Column total   86     64       100             250

Examples
 Find out whether the educational level of persons and their sources of news are independent. A survey of 220 persons was conducted.

Sources of news        Education
                       Up to plus two   Graduate   Postgraduate
TV                     30               30         20
Vernacular newspaper   35               15         10
English daily          15               35         30

Examples
 500 voters were contacted and required to give their
educational level and political party preferred in a secret
ballot. Are educational level and party affiliation
independent?

Educational level   Party preferred
                    Party A   Party B   Party C
Up to plus 2        40        80        20
Graduate            70        60        30
PG                  90        60        50

Examples
 A restaurant wants to know whether the drink preferred at breakfast depends on the age of the customer. 300 customers were observed on a particular day.

Age        Drink preferred
           Coffee/tea   Cool drink   Milk, Horlicks
0-30       35           40           15
30-45      70           40           20
Above 45   10           20           50

Examples
 A survey was conducted among the voters for assessing
their attitude towards Lokpal.

Class          Attitude
               Welcome   Neutral   Pessimistic
Poor           38        32        74
Middle Class   30        44        92
Rich           48        22        64

Examples
 An investment company wanted to know whether the pattern of savings depends on the nature of employment. Are the pattern of investment and the nature of employment independent?

Nature of employment   Instrument chosen
                       Provident Fund   Bank deposit   Mutual Fund
Govt. Service          50               37             13
Private life           30               52             18
Others                 20               65             15

Examples
 A manager of a company wants to know whether his
company’s brand is equally popular in various regions. He
took a random sample of 200 consumers in each of the
four regions.

Action                     East   West   North   South
Purchasing the brand       110    100    90      80
Not purchasing the brand   90     100    110     120
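This is a test of homogeneity: the same question is put to a separate sample from each region. The computation is the same as for a test of independence, and only the sampling design differs. A minimal sketch follows, with illustrative variable names and the standard tabulated chi-square value of 7.815 for alpha = 0.05 and df = 3.

```python
regions = ["East", "West", "North", "South"]
purchasing = [110, 100, 90, 80]
not_purchasing = [90, 100, 110, 120]
table = [purchasing, not_purchasing]

row_totals = [sum(row) for row in table]        # [380, 420]
col_totals = [sum(col) for col in zip(*table)]  # 200 consumers per region
grand_total = sum(row_totals)                   # 800

chi_sq = 0.0
for i, row in enumerate(table):
    for j, observed in enumerate(row):
        expected = row_totals[i] * col_totals[j] / grand_total
        chi_sq += (observed - expected) ** 2 / expected

df = (len(table) - 1) * (len(regions) - 1)
CRITICAL_0_05_DF3 = 7.815  # tabulated value, alpha = 0.05, df = 3

print(round(chi_sq, 3), df)        # 10.025 3
print(chi_sq > CRITICAL_0_05_DF3)  # True
```

Since 10.025 exceeds 7.815, the hypothesis that the brand is equally popular in all four regions is rejected at the 5% level.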

Examples
 A survey was conducted to know if people preferred
weekend shopping or shopping on week days. 200
persons from each age group of under 35, 36-50, and over
50 gave their responses.
 Does mileage determine the model you buy? The
question was put to men and women.
 The number of road accidents was recorded on 35 days.
(day and frequency given) Is it uniformly distributed?
 Male and female entrepreneurs gave four reasons for becoming an entrepreneur: challenge, generating employment, high risk reward, and maintaining individuals.

Analysis of Variance

Where to use it?


What is ANOVA?
 Analysis of variance (ANOVA) tests the hypothesis that the
means of two or more populations are equal.
 Assess the importance of one or more factors by comparing
the response variable means at the different factor levels.
 The null hypothesis states that all population means (factor
level means) are equal while the alternative hypothesis states
that at least one is different.
 For ANOVA, we must have a continuous response variable
and at least one categorical factor with two or more levels. We
require data from approximately normally distributed
populations with equal variances between factor levels.
 ANOVA was developed by Ronald A. Fisher in the 1920s.

One way ANOVA
 Smiles and leniency
 The effect of different types of smiles on the leniency
shown to a person was investigated.
 Four different types of smiles (neutral, false, felt, and
miserable) were shown.
 Here, "Type of Smile" is the independent variable. In
describing an ANOVA design, the term factor is a synonym
of independent variable.
 Therefore, "Type of Smile" is the factor in this
experiment. Since four types of smiles were compared,
the factor "Type of Smile" has four levels.

Between groups and within groups
 In the Smiles and Leniency study, the four levels of the factor "Type
of Smile" were represented by four separate groups of subjects.
 When different subjects are used for the levels of a factor, the factor
is called a between-subjects factor or a between-subjects variable. The
term "between subjects" reflects the fact that comparisons are
between different groups of subjects.
 In a drug treatment study, every subject was tested with each of
four dosage levels (0, 0.15, 0.30, 0.60 mg) of a drug. There was only
one group of subjects, and comparisons were not between different
groups of subjects but between conditions within the same subjects.
When the same subjects are used for the levels of a factor, the
factor is called a within-subjects factor or a within-subjects variable.
Within-subjects variables are sometimes referred to as repeated-
measures variables since there are repeated measurements of the
same subjects.

Two way ANOVA
 If an experiment has two factors, then the ANOVA is
called a two-way ANOVA.
 Suppose an experiment on the effects of age and gender
on reading speed were conducted using three age groups
(8 years, 10 years, and 12 years) and the two genders
(male and female). The factors would be age and gender.
Age would have three levels and gender would have two
levels.

Assumptions for using ANOVA
 The samples are drawn from normally distributed populations
 The samples are random
 The variances of population are equal
 The other characteristics of the population (except those
under study) are effectively controlled.

Examples of ANOVA
 Mobility scores between three groups of patients.
Control group did not receive any therapy. Treatment 1
received physical therapy. Treatment 2 received physical
therapy and counselling.
Control   Treatment 1   Treatment 2
35        38            47
38        43            53
42        45            42
34        52            45
28        40            46
39        46            37
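A minimal sketch of a one-way ANOVA on these mobility scores, splitting the total variation into between-group and within-group sums of squares. The function name is illustrative, and 3.68 is the standard tabulated F value for alpha = 0.05 with (2, 15) degrees of freedom.

```python
def one_way_anova(groups):
    """Return (ss_between, ss_within, df_between, df_within, f_stat)."""
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    means = [sum(g) / len(g) for g in groups]
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)
    df_between = len(groups) - 1
    df_within = n_total - len(groups)
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return ss_between, ss_within, df_between, df_within, f_stat


control = [35, 38, 42, 34, 28, 39]
treatment_1 = [38, 43, 45, 52, 40, 46]
treatment_2 = [47, 53, 42, 45, 46, 37]

ss_between, ss_within, df_between, df_within, f_stat = one_way_anova(
    [control, treatment_1, treatment_2]
)
CRITICAL_F_0_05_2_15 = 3.68  # tabulated F value, alpha = 0.05, df = (2, 15)

print(round(ss_between, 2), round(ss_within, 2))  # 292.0 382.0
print(df_between, df_within)                      # 2 15
print(round(f_stat, 2))                           # 5.73
print(f_stat > CRITICAL_F_0_05_2_15)              # True
```

Since F exceeds the tabulated value, the hypothesis of equal mean mobility scores in the three groups is rejected at the 5% level.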

Examples
 A chain of restaurants in a city wants to compare 3 of its restaurants regarding the service time per customer. One of the owners visited the three restaurants during peak hours and noted the service time for 5 customers in each restaurant, as given below. Are the average service times in the three restaurants significantly different?

Restaurant 1   Restaurant 2   Restaurant 3
3              3              2
4              4              3.5
5.5            5.5            5
3.5            2.5            6.5
4              3              6

Examples
 Do brand and month of sale affect the sales of cars? Given below are the data for 5 brands of cars. The number of cars sold is in thousands.

Brands    September 2013   October 2013   November 2013
Alto      24               23             23
Swift     17               19             16
Dezire    17               17             15
Wagon R   15               14             13
Bolero    9                11             8
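A minimal sketch of a two-way ANOVA for this table, assuming one observation per brand and month (no replication, so the interaction cannot be separated from error). The function name is illustrative. The tabulated 5% F values used are 3.84 for (4, 8) and 4.46 for (2, 8) degrees of freedom.

```python
def two_way_anova(table):
    """Two-way ANOVA without replication; rows and columns are the two factors."""
    n_rows, n_cols = len(table), len(table[0])
    grand_total = sum(sum(row) for row in table)
    correction = grand_total ** 2 / (n_rows * n_cols)
    ss_total = sum(x ** 2 for row in table for x in row) - correction
    ss_rows = sum(sum(row) ** 2 for row in table) / n_cols - correction
    ss_cols = sum(sum(col) ** 2 for col in zip(*table)) / n_rows - correction
    ss_error = ss_total - ss_rows - ss_cols
    df_rows, df_cols = n_rows - 1, n_cols - 1
    ms_error = ss_error / (df_rows * df_cols)
    f_rows = (ss_rows / df_rows) / ms_error
    f_cols = (ss_cols / df_cols) / ms_error
    return ss_rows, ss_cols, ss_error, f_rows, f_cols


# Rows: Alto, Swift, Dezire, Wagon R, Bolero. Columns: Sep, Oct, Nov 2013.
sales = [
    [24, 23, 23],
    [17, 19, 16],
    [17, 17, 15],
    [15, 14, 13],
    [9, 11, 8],
]

ss_brand, ss_month, ss_error, f_brand, f_month = two_way_anova(sales)

print(round(ss_brand, 2), round(ss_month, 2), round(ss_error, 2))  # 312.27 8.93 5.73
print(round(f_brand, 2), f_brand > 3.84)  # 108.93 True
print(round(f_month, 2), f_month > 4.46)  # 6.23 True
```

Under these assumptions both the brand and the month of sale have a significant effect on sales at the 5% level.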

Examples
 Price of bank stocks in stock market under different
market conditions (rise in index, fall in index, flat market)
and announcement of RBI on SLR
 Number of units produced by different units in different shifts
 Effectiveness of different brands of detergents under
different types of water in removing dirt
 The salaries of MBAs coming out of three different MBA
schools
 Efficiency of workers in 3 plants using the same processes

Examples
 An HR consultant wanted to know whether job satisfaction depends on the individual or the profession. He collected job satisfaction scores (on a 0-10 scale) of 5 professors, 5 lawyers, 5 chartered accountants and 5 doctors. Is there any difference in job satisfaction among the four professions?
 A modified incentive scheme is announced by an IT company having branches in Kormangala, Electronic city and ITPL. The company wants to study the effect of the incentive scheme on professionals in the three branches and on three job types: financial services, dealing with overseas customers, and BPO.

Examples
 Human Resource Department wants to know if
occupational stress varies according to age and gender.
 The prices of 2BHK flats with an area of nearly 1000 sq. ft. in two developing areas of Chennai are given. Do their prices differ significantly?
 A departmental store advertised a discount offer on a particular day in 3 newspapers and recorded the increase in sales in five of its branches.

Thank you

Email id: mangala.gowri@mp.gov.in
