
GURU GOBIND SINGH INDRAPRASTHA UNIVERSITY

MAHARAJA SURAJMAL INSTITUTE


RESEARCH METHODOLOGY LAB
SUBJECT CODE – BBA 213

SUBMITTED BY – Prabal Sharma SUBMITTED TO – Dr. Dimpy Sachar

ROLL NO – 13121201721
CLASS – BBA (G) SECTION – C (3rd SEM)
SERIAL No.  TOPIC                                              Remarks

1. Introduction To SPSS

2. Managing Data In SPSS

3. Coding And Recoding In SPSS

4. Selecting, Sorting And Analyzing Data In SPSS

5. Missing Values, Recoding The Variable

6. Descriptive Statistics

7. Crosstabs And Chi-Square Test

8. Correlation

9. Regression

10. One-Sample T-Test

11. Independent Sample T-Test

12. Paired Sample T-Test

13. ANOVA Testing For Means

PRACTICAL – 1
Introduction to SPSS

OVERVIEW:-

SPSS is the abbreviation of Statistical Package for the Social Sciences, and it is used by
researchers to perform statistical analysis. As the name suggests, the SPSS Statistics
software is dedicated to statistical operations.


SPSS software is used to perform quantitative analysis and serves as a complete statistical
package based on a point-and-click interface. The software has been widely used by
researchers to perform quantitative analysis since its development in the late 1960s by Norman H.
Nie, in collaboration with C. Hadlai Hull and Dale Bent.

SPSS software can read and write data from other statistical packages, databases, and
spreadsheets. When entering data into the software, one has to click on "Variable View." The
Variable View enables the user to define each variable by data type and consists of the following
columns: Name, Type, Width, Decimals, Label, Values, Missing, Columns, Align, and
Measure. These columns enable the user to characterize the data.

SPSS is most often used in social science fields such as psychology, where statistical
techniques are applied on a large scale. In the field of psychology, techniques such as cross
tabulation, the t-test, the chi-square test, etc., are available in the "Analyze" menu of the software.
FEATURES:-

Now that we have a basic idea of how SPSS works, let's take a look at what it can do. Following a
typical project workflow, SPSS is great for

 Opening data files, either in SPSS’ own file format or many others;
 Editing data, such as computing sums and means over columns or rows of data. SPSS has
outstanding options for more complex operations as well;
 Creating tables and charts containing frequency counts or summary statistics over (groups of)
cases and variables;
 Running inferential statistics such as ANOVA, regression and factor analysis;
 Saving data and output in a wide variety of file formats.

Role of Computerized Data Analysis:-

So SPSS can open all sorts of data and display them - and their metadata - in two
sheets in its Data Editor window. How, then, do you analyze your data in SPSS? One
option is to use SPSS’ elaborate menu options.
Make a data file with the following variables and their information.

Now let us see how we can work efficiently through SPSS

Step 1:

Make a Google Form named "RM file".

Step 2: Create data fields for Customer ID, Gender, Income and Customer Satisfaction.
Step 3: Record responses from various respondents.

Step 4: Convert the responses into an Excel file.

Step 5: Import the data into SPSS from Excel (.xls format).
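For reference, the same import can also be done with SPSS syntax instead of the menus. This is only a minimal sketch, assuming the responses were exported to an Excel workbook with variable names in the first row; the file path and sheet name are hypothetical examples.

* Read the survey responses exported from Google Forms (path and sheet name are assumed examples).
GET DATA
  /TYPE=XLS
  /FILE='C:\data\RM file.xls'
  /SHEET=NAME 'Sheet1'
  /READNAMES=ON.
EXECUTE.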


Displaying the data in Data View.

Displaying the data in Variable View.


PRACTICAL- 2
MANAGING DATA IN SPSS

Overview:

Calculate the frequency distribution and the measures of central tendency (mean, median,
mode) and dispersion for the following data.

Steps:

Step 1: Create a dataset using the following variables.


Step 2: Go to Analyze > Descriptive Statistics > Frequencies

Step 3: Select the variables and statistics to be used and prepare the output.
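The same Frequencies run can also be expressed as syntax (the Paste button in the dialog produces something similar). A minimal sketch, assuming the variables were named salary, premium, expenditure and gender in Variable View; the exact names are assumptions.

* Frequency tables with mean, median and mode for the four variables (names are assumed).
FREQUENCIES VARIABLES=salary premium expenditure gender
  /STATISTICS=MEAN MEDIAN MODE
  /HISTOGRAM.
* Re-run with /PIECHART in place of /HISTOGRAM to obtain the pie charts shown in the output.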
OUTPUT:

Frequency

Statistics

                      salary of the   insurance premium      monthly expenditure   gender of the
                      employee        paid by the employee   of the employee       employee
N          Valid      100             100                    100                   100
           Missing    0               0                      0                     0
Mean                  1.86            12640.00               17470.00              1.52
Median                2.00            13000.00               17000.00              2.00
Mode                  1               12000                  16000                 2

Histogram frequency

Statistics

                      salary of the   insurance premium      monthly expenditure
                      employee        paid by the employee   of the employee
N          Valid      100             100                    100
           Missing    0               0                      0
Mean                  1.86            12640.00               17470.00
Median                2.00            13000.00               17000.00
Pie-chart frequency table

salary of the employee

                        Frequency   Percent   Valid Percent   Cumulative Percent
Valid  40000-50000      40          40.0      40.0            40.0
       51000-60000      34          34.0      34.0            74.0
       61000 and above  26          26.0      26.0            100.0
       Total            100         100.0     100.0

insurance premium paid by the employee

                        Frequency   Percent   Valid Percent   Cumulative Percent
Valid  10000            14          14.0      14.0            14.0
       11000            13          13.0      13.0            27.0
       12000            21          21.0      21.0            48.0
       13000            16          16.0      16.0            64.0
       14000            19          19.0      19.0            83.0
       15000            17          17.0      17.0            100.0
       Total            100         100.0     100.0

monthly expenditure of the employee

                        Frequency   Percent   Valid Percent   Cumulative Percent
Valid  15000            11          11.0      11.0            11.0
       16000            22          22.0      22.0            33.0
       17000            18          18.0      18.0            51.0
       18000            20          20.0      20.0            71.0
       19000            16          16.0      16.0            87.0
       20000            13          13.0      13.0            100.0
       Total            100         100.0     100.0
PRACTICAL - 3

Coding And Recoding In SPSS


Steps:
Step 1: Create a data set with the following variables.

Step 2: Add values wherever necessary, as shown below:
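As a reference for this practical, coding and recoding can also be written as syntax. A minimal sketch, assuming a nominal variable gender coded 1/2 and a scale variable income that is regrouped; all names, codes and cut-offs are assumed examples, since the actual data set appears only as a screenshot.

* Attach value labels to the numeric codes of gender (codes are assumed).
VALUE LABELS gender 1 'Male' 2 'Female'.
* Recode income into coded groups stored in a new variable (cut-offs are assumed).
RECODE income (LO THRU 20000=1) (20001 THRU 40000=2) (40001 THRU HI=3) INTO income_group.
VALUE LABELS income_group 1 'Low' 2 'Medium' 3 'High'.
EXECUTE.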


PRACTICAL - 4

Selecting, Sorting and Analyzing data in SPSS

Steps:
Step 1: Create a data set with the following variables.

Step 2: Add values wherever necessary, as shown below:
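For reference, selecting and sorting can also be done through syntax rather than the Data menu. A minimal sketch, assuming variables named income and gender with 2 coded as female; the names and codes are assumptions.

* Sort cases by income in descending order.
SORT CASES BY income (D).
* Temporarily select only female respondents for the next procedure (code 2 is assumed).
TEMPORARY.
SELECT IF (gender = 2).
FREQUENCIES VARIABLES=income
  /STATISTICS=MEAN.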


PRACTICAL- 5

FINDING OUT THE MISSING VALUES AND RECODING THE SAME


VARIABLE AFTER FILLING THE MISSING VALUES ACCORDING
TO GROUP: SPLITTING FILE

MISSING VALUES

STEP 1: Go to the Analyze tab and select Descriptive Statistics > Frequencies from the drop-down menu.

STEP 2: A dialogue box named Frequencies appears; choose "4-year resale value" in the variable
column and then go to 'Statistics'.

STEP 3: Choose Mean from the different options in the Frequencies: Statistics dialogue box
and press Continue.
Step 4: In the output viewer, the 4-year resale value appears with its mean.

RECODING INTO SAME VARIABLE: OLD AND NEW VALUES

STEP 5: Go to the Transform tab and choose Recode into Same Variables from the
drop-down menu.

STEP 6: A dialogue box named "Recode into Same Variables" will appear. Select
'4-year resale value' as the numeric variable.
STEP 7: Select the Old and New Values option.
STEP 8: Select System-missing in the old value box and enter 32.4 in the new value box, as
it is the mean. Then press Continue.

STEP 9: Then, in the Data Editor, all the missing values of the data are replaced by a
common number, i.e. 32.4, as it is the mean of all the values combined.
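Steps 5-9 correspond to a single RECODE command. A minimal sketch, assuming the '4-year resale value' column is named resale in Variable View (the name is an assumption); 32.4 is the mean used in the steps above.

* Replace system-missing values of resale with the overall mean of 32.4.
RECODE resale (SYSMIS=32.4).
EXECUTE.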

SPLITTING FILE

STEP 1: Go to the Data tab and select Split File from the drop-down menu.
STEP 2: The Split File dialogue box appears, in which 'manufacturer' should be selected in
the "Groups Based on" column; then press "OK".

STEP 3: In the output viewer, the data appears split into groups based on manufacturer, each with
its mean value.
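The same split can be requested in syntax. A minimal sketch, assuming the grouping variable is named manufacturer (an assumed name).

* Split the file so that later output is produced separately for each manufacturer.
SORT CASES BY manufacturer.
SPLIT FILE SEPARATE BY manufacturer.
* Run SPLIT FILE OFF when the split is no longer needed.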
RECODING INTO SAME VARIABLE: OLD AND NEW VALUES
(SELECTING THE PARTICULAR BRAND NAME AFTER SPLITTING
THE FILE)

For example: BMW's missing values are replaced by its mean value.


STEP 1: Go to the Transform tab and choose Recode into Same Variables from the
drop-down menu.

STEP 2: A dialogue box named "Recode into Same Variables" will appear. Select
'4-year resale value' as the numeric variable.

STEP 3: Select the "Old and New Values" option.

STEP 4: Select "System-missing" in the old value box and '32.4' in the new value box, as it is
the mean, then press Continue.
STEP 5: Then, in the Data Editor, all the missing values of BMW are replaced by a common
number, i.e. 32.4, as it is the mean of all the values of BMW combined.

STEP 6: Press "OK".

STEP 7: In the Data Editor, all the missing values of BMW are replaced by "32.4".
MEAN VALUE OF BMW, ONE MISSING VALUE IS REMOVED

STEP 1: Once the missing value of BMW is replaced by 12.96, go to the Transform tab and
choose Recode into Same Variables from the drop-down menu.

STEP 2: A dialogue box named "Recode into Same Variables" will appear.

STEP 3: Choose the 'If' option from the dialogue box; another dialogue box named
"Recode into Same Variables: If Cases" appears. Select the option "Include if case satisfies
condition", type manufacturer = "BMW" in the blank area and press "Continue".
PRACTICAL – 6

DESCRIPTIVE STATISTICS

STEPS:
Step 1: Create a data set with the following variables.

Step 2: Add values wherever necessary, as shown below:


Step 3: Go to Analyze > Descriptive Statistics > Descriptives

Step 4: Select height and weight as variables


Step 5: Select mean, minimum, maximum, variance as descriptive statistics in descriptives
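The equivalent Descriptives syntax is sketched below, using the variable names height and weight from Step 4.

* Mean, minimum, maximum and variance for height and weight.
DESCRIPTIVES VARIABLES=height weight
  /STATISTICS=MEAN MINIMUM MAXIMUM VARIANCE.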

OUTPUT:
PRACTICAL – 7

CROSSTAB AND CHI-SQUARE TEST

Overview:

Chi-square is one of the most popular non-parametric tests. It is used in two cases, which are
as follows:
a) To test the association between two nominal variables in research.
b) To test the difference between expected and observed frequencies.

The chi-square test compares the actual observed frequencies with the calculated
expected frequencies of the different combinations of the nominal variables. The difference between
observed and expected frequencies gives an indication of a possible association between the categorical
variables.
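For reference, the chi-square statistic computed from this comparison is

chi-square = sum over all cells of (O - E)^2 / E,

where O is the observed frequency and E is the expected frequency in each cell of the contingency table.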
The null hypothesis of the chi-square test is that there exists no significant association between the two
nominal variables.

Assumptions of the chi-square test:

The chi-square test does not rely on assumptions such as continuously normally distributed
data; still, the test has a few assumptions:
a) Both variables should be nominal in nature and must have at least two different
categories.
b) Each observation contributes to only one cell of the contingency table.
c) In a two-by-two contingency table, the expected frequency in each cell should be greater
than 5. In larger contingency tables, the rule is that all expected counts should be greater
than 1 and not more than 20% of the expected counts should be less than 5.

STEPS:

Step 1: Go to Analyze > Descriptive Statistics > Crosstabs


Step 2: Select gender as row variable and marital status as column variable and find the
output.
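The equivalent Crosstabs syntax is sketched below; the variable names gender and marital_status are assumed, since only the labels appear in the output.

* Crosstab of gender by marital status with observed and expected counts,
* the chi-square test and the Phi / Cramer's V measures.
CROSSTABS
  /TABLES=gender BY marital_status
  /STATISTICS=CHISQ PHI
  /CELLS=COUNT EXPECTED.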
OUTPUT:

Case Processing Summary

                                                  Cases
                                   Valid               Missing             Total
                                   N        Percent    N       Percent     N        Percent
Gender Of Student *                47484    100.0%     0       0.0%        47484    100.0%
Married or Unmarried

Gender Of Student * Married or Unmarried Crosstabulation

                                            Married or Unmarried
                                            Married    Unmarried    Total
Gender Of    Male      Count                8503       16747        25250
Student                Expected Count       7704.1     17545.9      25250.0
             Female    Count                5985       16249        22234
                       Expected Count       6783.9     15450.1      22234.0
Total                  Count                14488      32996        47484
                       Expected Count       14488.0    32996.0      47484.0
Chi-Square Tests

                                 Value      df   Asymptotic Significance   Exact Sig.   Exact Sig.
                                                 (2-sided)                 (2-sided)    (1-sided)
Pearson Chi-Square               254.605a   1    .000
Continuity Correction b          254.286    1    .000
Likelihood Ratio                 255.679    1    .000
Fisher's Exact Test                                                        .000         .000
Linear-by-Linear Association     254.599    1    .000
N of Valid Cases                 47484

a. 0 cells (.0%) have expected count less than 5. The minimum expected count is 6783.89.
b. Computed only for a 2x2 table

Symmetric Measures

                                   Value    Approximate Significance
Nominal by Nominal   Phi           .073     .000
                     Cramer's V    .073     .000
N of Valid Cases                   47484
PRACTICAL – 8
CORRELATION

Correlation depicts the relationship between two variables under a common study. It is
generally categorized into two types, namely bivariate correlation and partial correlation.

Correlation analysis studies the strength of the linear relationship between different types
of variables. It measures the extent to which one variable moves with another variable. The
correlation can be measured by Karl Pearson's coefficient of correlation.
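For reference, Karl Pearson's coefficient of correlation between two variables X and Y is

r = cov(X, Y) / (sd(X) * sd(Y)) = sum((x - mean(x)) * (y - mean(y))) / sqrt(sum((x - mean(x))^2) * sum((y - mean(y))^2)),

and it always lies between -1 and +1.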

Correlation is a measure of degree, which means that it can be negative, positive or perfect.
A positive correlation is one in which an increase in one variable brings a change in the same
direction in the other variable, whereas a negative correlation implies that a change in one variable
results in an opposite change in the other variable. If the value of the correlation coefficient is
0, the variables are said to be uncorrelated, which means there is no linear relationship between
the two variables.

STEP 1: ENTER THE VALUES IN DATA AND VARIABLE VIEW

STEP 2: GO TO ANALYZE, THEN CORRELATE.


STEP 3: IN CORRELATE CLICK ON ‘BIVARIATE’ AND GIVE THE DEPENDENT
AND INDEPENDENT VARIABLE.
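The equivalent Correlations syntax is sketched below. The name PetrolPrice follows the output tables; Demand is an assumed short name for the 'Demand In Litres' variable.

* Bivariate Pearson correlation between petrol price and demand, with two-tailed significance.
CORRELATIONS
  /VARIABLES=PetrolPrice Demand
  /PRINT=TWOTAIL NOSIG
  /MISSING=PAIRWISE.
* Repeat with DieselPrice in place of PetrolPrice for the second table.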
Correlations

                                           PetrolPrice    Demand In Litres
PetrolPrice         Pearson Correlation    1              -.097
                    Sig. (2-tailed)                       .611
                    N                      30             30
Demand In Litres    Pearson Correlation    -.097          1
                    Sig. (2-tailed)        .611
                    N                      30             30

Correlations

                                           DieselPrice    Demand In Litres
DieselPrice         Pearson Correlation    1              .044
                    Sig. (2-tailed)                       .819
                    N                      30             30
Demand In Litres    Pearson Correlation    .044           1
                    Sig. (2-tailed)        .819
                    N                      30             30
PRACTICAL-9

REGRESSION
OVERVIEW:
Regression analysis studies the dependence of one variable on another and estimates the
expected values of the dependent variable with the help of known values of the independent
variables.

Assumptions

Linear regression is an analysis that assesses whether one or more predictor variables explain
the dependent variable. It has 5 key assumptions:

i) There should be a linear relationship. This relationship can be checked with a scatter plot.

ii) The data has to be multivariate normal. This can be checked with a histogram.

iii) There should be no or little multicollinearity.

iv) There should be no autocorrelation, which can be checked with the Durbin-Watson statistic.

v) The data should be homoscedastic, which can be checked with a plot of the residuals.

STEPS:
Step 1: Input data in SPSS program
Step 2: Go to Analyze > Regression > Linear
Step 3: Select dependent and independent variables with statistics of estimates, Durbin-
Watson and model fit
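A minimal syntax sketch of Steps 2 and 3 is shown below; the dependent and independent variable names (demand and price) are assumed examples.

* Linear regression with estimates, model fit and the Durbin-Watson statistic.
REGRESSION
  /STATISTICS COEFF R ANOVA
  /DEPENDENT demand
  /METHOD=ENTER price
  /RESIDUALS DURBIN.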
OUTPUT
PRACTICAL – 10

ONE SAMPLE T-TEST

A t-test is commonly used when the sample size is small, i.e. less than or equal to 30. The t
statistic is used when:

1. The variable is normally distributed.

2. The mean is known.
3. The population variance is estimated from the sample.

The null hypothesis of the one-sample t-test is that there is no significant difference
between the sample mean and the population mean.

Question: A healthcare provider claims that, on average, its customers have lost 5 kg of
weight in a month after joining its weight-loss program. In order to test the validity of the claim, an
independent researcher collects data on weight loss from 30 customers a month after joining
the program. The researcher has decided to apply the one-sample t-test in order to test the
validity of the claim.

Variable View
Data view

Step-1: Go to Analyze and choose the option of One Sample T-test in the compare means
tab.
Step-2: Then add weight loss in the Test variables tab and then click on OK to proceed
further.
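A minimal syntax sketch of the test is shown below; weightloss is an assumed variable name. The claim would be tested with a test value of 5 (note that the output below was produced with the default test value of 0).

* One-sample t-test of mean weight loss against the claimed value of 5 kg.
T-TEST
  /TESTVAL=5
  /VARIABLES=weightloss.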

Output:

One-Sample Statistics
N Mean Std. Deviation Std. Error Mean
weight loss in a month 30 3.47 1.224 .224

One-Sample Test

                          Test Value = 0
                                                                               95% Confidence Interval
                                                                               of the Difference
                          t        df   Sig. (2-tailed)   Mean Difference     Lower       Upper
weight loss in a month    15.509   29   .000              3.467               3.01        3.92
PRACTICAL -11

Independent Sample T-Test

Overview:

When we want to test the difference between two independent sample means, we
use the independent sample t-test. The independent samples may belong to the
same population or to different populations.

The null hypothesis of the independent sample t-test is that there is no significant
difference between the sample means of the two independent groups.
QUESTION:

A researcher is interested in analyzing the difference in the average performance of employees
of an enterprise across different demographic profiles. The researcher divides the employees on
the basis of gender and their age group. Calculate the statistics by applying the independent
sample t-test.

Steps:
Step 1: Go to Analyze > Compare Means > Independent-Samples T Test
Step 2: Input variable as performance score and gender.
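A minimal syntax sketch of the test is shown below; the variable names and the group codes 1 and 2 for gender are assumed examples.

* Independent-samples t-test comparing performance score between the two gender groups.
T-TEST GROUPS=gender(1 2)
  /VARIABLES=performance_score.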
OUTPUT:

Paired Samples Statistics

                                               Mean     N    Std. Deviation   Std. Error Mean
Pair 1   performance score before training     51.50    30   12.811           2.339
         performance score after training      68.80    30   12.416           2.267

Paired Samples Correlations

                                               N    Correlation   Sig.
Pair 1   performance score before training     30   .713          .000
         & performance score after training

Paired Samples Test

                                               Paired Differences
                                                                                           95% Confidence Interval
                                                                                           of the Difference
                                               Mean      Std. Deviation  Std. Error Mean   Lower       Upper
Pair 1   performance score before training     -17.300   9.563           1.746             -20.871     -13.729
         - performance score after training
PRACTICAL – 12

PAIRED SAMPLE T-TEST


Overview:
A paired sample t-test is also known as a repeated measures t-test because the responses are
collected from the same respondents at different points in time. A paired sample t-test should
be used when we want to test the impact of an event or experiment on the variable
under study; in this case, the data is collected from the same respondents before and after the event.

The null hypothesis of the paired sample t-test is that the means of the pre sample and post
sample are equal.

Question:

The HR manager of a business enterprise wants to analyze the impact of a training program
conducted for 30 employees. The purpose of conducting the training program was to improve
the performance of the employees. The performance scores of the employees are noted before and
after the training program. Apply the paired sample t-test to analyze the impact of the training
program.

STEPS:

Step 1: Go to Analyze > Compare Means > Paired-Samples T Test


Step 2: Input variable 1 and 2.
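A minimal syntax sketch is shown below; score_before and score_after are assumed names for the two performance score variables.

* Paired-samples t-test on performance before and after training.
T-TEST PAIRS=score_before WITH score_after (PAIRED).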

OUTPUT

Paired Samples Statistics

                                               Mean     N    Std. Deviation   Std. Error Mean
Pair 1   Performance score before training     51.50    30   12.811           2.339
         Performance score after training      68.80    30   12.416           2.267

Paired Samples Correlations

                                               N    Correlation   Sig.
Pair 1   Performance score before training     30   .713          .000
         & Performance score after training
PRACTICAL – 13
ANOVA testing for mean

Overview: ANOVA stands for analysis of variance. It has an advantage over the t-test
when the researcher wants to compare the means of a large number of populations, that is, three or
more. It is a parametric test that is used to study the differences among more than two groups
in the data set. It helps in explaining the amount of variation in the data
set. It explains three types of variance:

1. Total variance
2. Between group variance
3. Within group variance

It is based on the logic that if the between-group variance is significantly greater than the within-group
variance, it indicates that the means of the different samples are significantly different.
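In other words, the test statistic is the ratio

F = between-group variance / within-group variance (mean square between divided by mean square within),

and the larger this ratio is, the stronger the evidence that the group means differ.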

Question: A researcher wants to compare the sales of three companies located in three
different regions. The data of monthly sales of these companies is collected from different
branches. Conduct the analysis of variance as per the data collected.

VARIABLE VIEW: Enter data in data and variable view, then describe the measures you
want to analyze the data with.
DATA VIEW:

Step 1: Go to Analyze > Compare Means > One-Way ANOVA


Step 2: Input variable 1 and 2

Step 3: Go to Post Hoc and check Tukey


Step 4: Go to Options and check Descriptives, Homogeneity of variance test & Means plot,
then continue
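A minimal syntax sketch of Steps 1-4 is shown below; sales and company are assumed names for the monthly sales variable and the grouping variable.

* One-way ANOVA of monthly sales by company with Tukey post hoc test,
* descriptives, homogeneity-of-variance test and means plot.
ONEWAY sales BY company
  /STATISTICS DESCRIPTIVES HOMOGENEITY
  /PLOT MEANS
  /POSTHOC=TUKEY ALPHA(0.05).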

OUTPUT:
