UNIT IV Cat
Source : https://egyankosh.ac.in/bitstream/123456789/10421/1/Unit-9.pdf
Application of Techniques :
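A statistical test first calculates a test statistic, a number that describes how much the
relationship between variables in your data differs from the null hypothesis of no relationship.
It then calculates a p-value (probability value). The p-value estimates how likely it is that you
would see the difference described by the test statistic if the null hypothesis of no relationship
were true.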
If the value of the test statistic is more extreme than the critical value expected under the null
hypothesis (equivalently, if the p-value is below your chosen significance level, commonly 0.05),
then you can infer a statistically significant relationship between the predictor and outcome
variables.
If the value of the test statistic is less extreme than that critical value, then you cannot infer
a statistically significant relationship between the predictor and outcome variables.
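As a rough illustration of what a p-value measures, the sketch below runs an exact permutation
test on a difference in means. The permutation test is chosen here only because it needs no
library beyond the Python standard library; it is not a method named in these notes, and the
scores and the function name permutation_p_value are hypothetical.

```python
from itertools import combinations
from statistics import mean


def permutation_p_value(group_a, group_b):
    """Exact two-sided permutation p-value for a difference in group means.

    Every way of splitting the pooled data into groups of the original sizes
    is treated as equally likely under the null hypothesis of no relationship.
    """
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    n_b = len(pooled) - n_a
    total = sum(pooled)
    observed = abs(mean(group_a) - mean(group_b))

    extreme = 0  # splits at least as extreme as the observed one
    splits = 0   # all possible splits
    for indices in combinations(range(len(pooled)), n_a):
        sum_a = sum(pooled[i] for i in indices)
        diff = abs(sum_a / n_a - (total - sum_a) / n_b)
        if diff >= observed - 1e-12:  # small tolerance for float rounding
            extreme += 1
        splits += 1
    return extreme / splits


# Hypothetical exam scores for two independent groups of students.
traditional = [62, 65, 70, 68]
new_method = [75, 78, 80, 74]

p = permutation_p_value(traditional, new_method)
print(f"p-value: {p:.4f}")
```

Only 2 of the 70 possible splits separate the groups as strongly as the observed data, so the
p-value is 2/70, which is below the usual 0.05 threshold.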
For a statistical test to be valid, your sample size needs to be large enough to approximate the
true distribution of the population being studied.
If your data do not meet the assumptions of normality or homogeneity of variance, you may be
able to perform a nonparametric statistical test, which allows you to make comparisons
without any assumptions about the data distribution.
If your data do not meet the assumption of independence of observations, you may be able to
use a test that accounts for structure in your data (repeated-measures tests or tests that include
blocking variables).
Types of variables
The types of variables you have usually determine what type of statistical test you can use.
Quantitative variables represent amounts of things (e.g. the number of trees in a forest). Types
of quantitative variables include:
Continuous (aka ratio variables): represent measures and can usually be divided into
units smaller than one (e.g. 0.75 grams).
Discrete (aka integer variables): represent counts and usually can’t be divided into units
smaller than one (e.g. 1 tree).
Categorical variables represent groupings of things (e.g. the different tree species in a forest).
Types of categorical variables include:
Choose the test that fits the types of predictor and outcome variables you have collected (if you
are doing an experiment, these are the independent and dependent variables). Consult the tables
below to see which test best matches your variables.
Parametric tests are statistical tests that make certain assumptions about the
distribution of the data. These assumptions include normality and homogeneity of variances.
Parametric tests are often used when the data is continuous and normally distributed. Here are
some commonly used parametric tests in social science along with examples:
1. Independent Samples t-test:
- Example: Compare the average scores of two groups of students, one taught with traditional
methods and the other with a new teaching method.
2. Paired Samples t-test:
- Purpose: Compare means of two related groups (matched pairs or repeated measures).
- Example: Assess the effectiveness of a new therapy by comparing the scores of individuals
before and after the therapy.
4. ANCOVA (Analysis of Covariance):
- Purpose: Similar to ANOVA but with the addition of one or more covariates to control for
potential confounding variables.
5. Linear Regression:
- Purpose: Examine the relationship between two continuous variables, with one as the
predictor and the other as the outcome.
- Example: Explore the relationship between the amount of time spent studying and exam
scores.
6. Multiple Regression:
- Purpose: Extend linear regression to examine the relationship between one dependent
variable and multiple independent variables.
- Example: Investigate the factors influencing job performance by considering variables like
education, experience, and motivation.
7. MANOVA (Multivariate Analysis of Variance):
- Example: Examine whether there are significant differences in multiple outcome variables
(e.g., test scores, creativity scores) across different teaching methods.
8. Multivariate Regression:
- Example: Predict both academic achievement and emotional well-being based on various
predictor variables like study habits, social support, etc.
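The paired comparison listed above (scores of the same individuals before and after a therapy)
can be sketched as follows. This is a minimal illustration that computes only the t statistic and
its degrees of freedom; the scores and the function name paired_t_statistic are hypothetical, and
turning the statistic into a p-value would need a t-distribution table or a statistics package.

```python
from math import sqrt
from statistics import mean, stdev


def paired_t_statistic(before, after):
    """Return (t, degrees_of_freedom) for paired observations.

    t = mean(d) / (sd(d) / sqrt(n)), where d holds the after-minus-before
    differences for each individual.
    """
    if len(before) != len(after):
        raise ValueError("paired samples must have the same length")
    diffs = [a - b for a, b in zip(after, before)]
    n = len(diffs)
    standard_error = stdev(diffs) / sqrt(n)  # stdev uses the n - 1 denominator
    return mean(diffs) / standard_error, n - 1


# Hypothetical anxiety scores for five individuals.
before_therapy = [20, 18, 25, 22, 19]
after_therapy = [15, 16, 20, 18, 17]

t_stat, df = paired_t_statistic(before_therapy, after_therapy)
print(f"t = {t_stat:.3f}, df = {df}")
```

A negative t here means scores were lower after the therapy than before.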
Remember to check the assumptions of each test before applying them to your data. Violation
of assumptions may lead to inaccurate results. Additionally, it's crucial to interpret results in
the context of your study and research questions.
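For the simple linear regression item above (time spent studying and exam scores), a minimal
least-squares sketch looks like this. The data and the function name fit_line are hypothetical,
and the sketch fits the line only; it does not check the assumptions or test significance.

```python
from statistics import mean


def fit_line(x, y):
    """Ordinary least-squares fit of y = intercept + slope * x."""
    x_bar, y_bar = mean(x), mean(y)
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    slope = s_xy / s_xx
    intercept = y_bar - slope * x_bar
    return slope, intercept


# Hypothetical data: hours studied and exam score for five students.
hours = [1, 2, 3, 4, 5]
scores = [52, 58, 61, 67, 72]

slope, intercept = fit_line(hours, scores)
print(f"score = {intercept:.1f} + {slope:.1f} * hours")
```

With these numbers the fitted slope is 4.9, read as roughly 4.9 extra points per additional hour
of study in this made-up sample.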
Non-parametric tests don’t make as many assumptions about the data, and are useful when one
or more of the common statistical assumptions are violated. However, the inferences they make
aren’t as strong as with parametric tests.
Nonparametric tests are used when the assumptions of parametric tests are violated or when
dealing with ordinal or non-normally distributed data. Here are some commonly used
nonparametric tests in social science along with examples:
1. Mann-Whitney U Test:
- Example: Assess whether there is a significant difference in the rankings of job satisfaction
between two different departments in a company.
2. Wilcoxon Signed-Rank Test:
- Example: Examine whether there is a significant difference in the pre- and post-treatment
anxiety levels of individuals in a therapeutic intervention.
3. Kruskal-Wallis Test:
- Purpose: Nonparametric alternative to one-way ANOVA, used for comparing more than
two independent groups.
- Example: Investigate whether there are differences in the levels of perceived stress among
individuals in different occupational sectors.
4. Friedman Test:
- Example: Analyze whether there are differences in the performance of students across three
different teaching methods over multiple testing sessions.
5. Chi-Square Test of Independence:
- Example: Investigate whether there is a relationship between gender and preferred learning
style among a group of students.
6. Spearman's Rank Correlation:
- Purpose: Assess the strength and direction of a monotonic relationship between two
variables.
- Example: Examine whether there is a significant correlation between the amount of time
spent on extracurricular activities and academic achievement.
7. Kendall's Tau:
- Purpose: Another nonparametric measure of correlation, similar to Spearman's correlation.
- Example: Investigate the association between the ranks of job satisfaction and years of work
experience among employees.
8. Mood's Median Test:
- Purpose: Test the equality of medians between two or more independent groups.
- Example: Compare the median income levels across different regions to determine if there
are significant differences.
When using nonparametric tests, it's important to note that they might be less powerful than
their parametric counterparts in certain situations. Additionally, the interpretation may differ,
as nonparametric tests often focus on ranks and medians rather than means. Always consider
the specific characteristics of your data and research questions when choosing the appropriate
test.
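To show what "focusing on ranks" means in practice, the sketch below computes the Mann-Whitney U
statistic for two independent groups by ranking the pooled data. The ratings and the function
names are hypothetical, and the sketch stops at the statistic itself; a p-value would come from a
U table or a statistics package.

```python
def average_ranks(values):
    """Rank values from 1 to n; tied values share the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the whole run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared_rank = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = shared_rank
        i = j + 1
    return ranks


def mann_whitney_u(group_a, group_b):
    """Return the smaller of the two U statistics for two independent groups."""
    n_a, n_b = len(group_a), len(group_b)
    ranks = average_ranks(list(group_a) + list(group_b))
    rank_sum_a = sum(ranks[:n_a])
    u_a = rank_sum_a - n_a * (n_a + 1) / 2
    u_b = n_a * n_b - u_a
    return min(u_a, u_b)


# Hypothetical job-satisfaction ratings (1 = low, 5 = high) in two departments.
department_a = [3, 4, 2, 5, 4]
department_b = [2, 1, 3, 2, 1]

u = mann_whitney_u(department_a, department_b)
print(f"U = {u}")
```

Only the order of the ratings enters the calculation, which is why the test suits ordinal data.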
Scaling Techniques:
MEASUREMENT AND SCALING:
a) Measurement: Measurement is the process of observing and recording the observations that
are collected as part of research. The recording may take the form of assigning numbers or other
symbols to characteristics of objects according to certain prescribed rules. The respondent’s
characteristics are feelings, attitudes, opinions, etc. For example, you may assign ‘1’ for Male
and ‘2’ for Female respondents. In response to a question on whether he/she is using the ATM
provided by a particular bank branch, the respondent may say ‘yes’ or ‘no’. You may wish to
assign the number ‘1’ for the response yes and ‘2’ for the response no. We assign numbers to
these characteristics for two reasons. First, the numbers facilitate further statistical analysis
of the data obtained. Second, numbers facilitate the communication of measurement rules and
results. The most important aspect of measurement is the specification of rules for assigning
numbers to characteristics. The rules for assigning numbers should be standardised and applied
uniformly. They must not change over time or across objects.
b) Scaling: Scaling is the assignment of objects to numbers or semantics according to a rule.
In scaling, the objects are text statements, usually statements of attitude, opinion, or feeling.
For example, consider a scale locating customers of a bank according to the characteristic
“agreement to the satisfactory quality of service provided by the branch”. Each customer
interviewed may respond with a semantic like ‘strongly agree’, or ‘somewhat agree’, or
‘somewhat disagree’, or ‘strongly disagree’. We may even assign each of the responses a
number. For example, we may assign ‘strongly agree’ as ‘1’, ‘somewhat agree’ as ‘2’, ‘somewhat
disagree’ as ‘3’, and ‘strongly disagree’ as ‘4’. Therefore, each of the respondents may be
assigned 1, 2, 3 or 4.
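The idea of a fixed, uniformly applied rule for assigning numbers can be sketched as a small
codebook. The names AGREEMENT_CODES and code_responses are hypothetical, and the codes follow the
bank-service example above.

```python
# Codebook: one fixed rule mapping each response to a number.
AGREEMENT_CODES = {
    "strongly agree": 1,
    "somewhat agree": 2,
    "somewhat disagree": 3,
    "strongly disagree": 4,
}


def code_responses(responses, codebook):
    """Apply the same coding rule to every response.

    Responses are normalised (trimmed, lower-cased) first; a response that is
    not in the codebook raises KeyError instead of being coded silently.
    """
    return [codebook[response.strip().lower()] for response in responses]


# Hypothetical answers from four customers.
answers = ["Strongly agree", "somewhat disagree", "Somewhat agree ", "strongly agree"]
coded = code_responses(answers, AGREEMENT_CODES)
print(coded)
```

Keeping the rule in one place is a simple way to make sure it does not change over time or from
one respondent to the next.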
Typically, there are four levels of measurement scales or methods of assigning numbers: (a)
Nominal scale, (b) Ordinal scale, (c) Interval scale, and (d) Ratio scale.
a) Nominal Scale is the crudest among all measurement scales but it is also the simplest scale.
In this scale the different scores on a measurement simply indicate different categories. The
nominal scale does not express any values or relationships between variables. For example,
labelling men as ‘1’ and women as ‘2’, which is the most common way of labelling gender for
data recording purposes, does not mean women are ‘twice something or other’ than men. Nor does
it suggest that men are somehow ‘better’ than women.
Another example of a nominal scale is to classify the respondent’s income into three groups: the
highest income as group 1, the middle income as group 2, and the low income as group 3. The
nominal scale is often referred to as a categorical scale. The assigned numbers have no
arithmetic properties and act only as labels. The only statistical operation that can be performed
on nominal scales is a frequency count. The only average we can determine is the mode. In
designing and developing a questionnaire, it is important that the response categories include
all possible responses. In order to have an exhaustive set of responses, you might have to
include a category such as ‘others’, ‘uncertain’, ‘don’t know’, or ‘can’t remember’ so that the
respondents will not distort their information by forcing their responses into one of the
categories provided. Also, you should make sure that the categories provided are mutually
exclusive so that they do not overlap or get duplicated in any way.
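A frequency count and the mode are easy to compute for nominal data. The sketch below uses
hypothetical answers to the ATM question, including a ‘don't know’ category so that the set of
categories is exhaustive.

```python
from collections import Counter

# Hypothetical answers to "Do you use the ATM provided by this branch?"
responses = ["yes", "no", "yes", "don't know", "yes", "no"]

# Frequency count: the one statistical operation a nominal scale supports.
counts = Counter(responses)

# The mode is the category with the highest frequency.
modal_category, modal_frequency = counts.most_common(1)[0]

for category, frequency in counts.items():
    print(f"{category}: {frequency}")
print(f"mode: {modal_category} ({modal_frequency} responses)")
```

Averaging the numeric labels of such categories would be meaningless, because the numbers are
labels only.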
b) Ordinal Scale involves the ranking of items along the continuum of the characteristic being
scaled. In this scale, the items are classified according to whether they have more or less of a
characteristic. For example, you may wish to ask TV viewers to rank the TV channels according
to their preference, and the responses may look like this:

TV Channel        Viewers' preference
Doordarshan-1     1
Star Plus         2
NDTV News         3
Aaj Tak TV        4

The main characteristic of the ordinal scale is that the categories have a logical or ordered
relationship. This type of scale permits the measurement of degrees of difference (that is,
‘more’ or ‘less’) but not the specific amount of difference (that is, how much ‘more’ or ‘less’).
This scale is very common in marketing, satisfaction and attitudinal research. Another example
is that a fast food home delivery shop may wish to ask its customers:

How would you rate the service of our staff?
(1) Excellent  (2) Very Good  (3) Good  (4) Poor  (5) Worst

Suppose respondent X gave the response ‘Excellent’ and respondent Y gave the response ‘Good’.
We may say that respondent X rated the service more highly than respondent Y did. But we don’t
know how much more highly, and we cannot even say that both respondents have the same
understanding of what constitutes ‘good service’. In marketing research, ordinal scales are used
to measure relative attitudes, opinions, and preferences. Here we rank the attitudes, opinions
and preferences from best to worst or from worst to best. However, the amount of difference
between the ranks cannot be determined. Using ordinal scale data, we can compute statistics such
as the median and the mode, but not the mean.
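The median and mode of ordinal data can be computed as in the sketch below. The ratings are
hypothetical, and median_low is used (an assumption of this sketch, not something the notes
prescribe) so that the result is always one of the actual rating categories rather than a value
halfway between two of them.

```python
from statistics import median_low, mode

# Rating codes from the staff-service question above.
RATING_LABELS = {1: "Excellent", 2: "Very Good", 3: "Good", 4: "Poor", 5: "Worst"}

# Hypothetical ratings given by seven customers.
ratings = [1, 3, 2, 2, 4, 1, 2]

median_code = median_low(ratings)  # middle rating once the codes are sorted
modal_code = mode(ratings)         # most frequently chosen rating

print(f"median: {RATING_LABELS[median_code]}")
print(f"mode: {RATING_LABELS[modal_code]}")
```

A mean of these codes is deliberately not computed, since the distances between adjacent ratings
are unknown.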
c) Interval Scale is a scale in which the numbers are used to rank attributes such that
numerically equal distances on the scale represent equal distances in the characteristic being
measured. An interval scale contains all the information of an ordinal scale, but it also allows
one to compare the difference/distance between attributes. For example, the difference
between ‘1’ and ‘2’ is equal to the difference between ‘3’ and ‘4’. Further, the difference
between ‘2’ and ‘4’ is twice the difference between ‘1’ and ‘2’. However, in an interval scale,
the zero point is arbitrary and is not a true zero. This, of course, has implications for the type
of data manipulation and analysis we can carry out on data collected in this form. It is possible
to add or subtract a constant to all of the scale values without affecting the form of the scale,
but one cannot meaningfully form ratios of the values. Measuring temperature is an example of an
interval scale. We cannot say 40°C is twice as hot as 20°C. The reason for this is that 0°C does
not mean that there is no temperature; it is only a relative point on the Centigrade scale.
Because it lacks an absolute zero point, the interval scale does not support ratio statements of
this kind. Interval scales may be in either numeric or semantic format. The following are
two more examples of interval scales, one in numeric format and another in semantic format.
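To return to the temperature example, a short numeric check shows why ratios are not meaningful
on an interval scale while ratios of differences are. The sketch converts the same three
temperatures from Celsius to Fahrenheit; the 10°C reading is an extra hypothetical value added
for the comparison of differences.

```python
def celsius_to_fahrenheit(celsius):
    """Convert a Celsius reading to Fahrenheit (both are interval scales)."""
    return celsius * 9 / 5 + 32


temps_c = [10, 20, 40]
temps_f = [celsius_to_fahrenheit(t) for t in temps_c]

# Ratio of two readings: depends on where the arbitrary zero sits.
ratio_c = temps_c[2] / temps_c[1]
ratio_f = temps_f[2] / temps_f[1]

# Ratio of two differences: the same on both scales.
diff_ratio_c = (temps_c[2] - temps_c[1]) / (temps_c[1] - temps_c[0])
diff_ratio_f = (temps_f[2] - temps_f[1]) / (temps_f[1] - temps_f[0])

print(f"40/20 in Celsius: {ratio_c:.2f}, same readings in Fahrenheit: {ratio_f:.2f}")
print(f"difference ratio in Celsius: {diff_ratio_c:.2f}, in Fahrenheit: {diff_ratio_f:.2f}")
```

The ratio of readings is 2 in Celsius but about 1.53 in Fahrenheit, so "twice as hot" depends on
the unit chosen. The ratio of differences is 2 on both scales.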