Abhinn - Spss Lab File

1. The document describes SPSS, a statistical software package used for data analysis. 2. SPSS can read and write various file types and has menu and syntax interfaces for performing analyses. It places constraints on file structure and data types to simplify programming. 3. SPSS is widely used in social science, market research, education, government, and other fields to analyze large datasets.

Uploaded by

vikrambedi

STATISTICAL PACKAGE FOR THE SOCIAL SCIENCES
LAB FILE

SUBMITTED TO:
Dr. R. Srivastava
Assistant Professor
Department of Applied Mathematics
DTU

SUBMITTED BY:
ABHINN BAJAJ
DTU/2K13/MC/002
3rd SEMESTER
GROUP R1

INTRODUCTION
SPSS Statistics is a software package used for statistical analysis. Long
produced by SPSS Inc., it was acquired by IBM in 2009. The current
versions (2014) are officially named IBM SPSS Statistics.
SPSS is a widely used program for statistical analysis in social science. It
is also used by market researchers, health researchers, survey companies,
government, education researchers, marketing organizations, data miners,
and others.
The many features of SPSS Statistics are accessible via pull-down menus
or can be programmed with a proprietary 4GL command syntax language.
Command syntax programming has the benefits of reproducibility,
simplifying repetitive tasks, and handling complex data manipulations and
analyses. Additionally, some complex applications can only be programmed
in syntax and are not accessible through the menu structure. The pull-down
menu interface also generates command syntax; this can be displayed in the
output, although the default settings have to be changed to make the
syntax visible to the user. The generated syntax can also be pasted into a
syntax file using the "Paste" button present in each dialog.
SPSS Statistics places constraints on internal file structure, data types,
data processing, and matching files, which together considerably simplify
programming. SPSS datasets have a two-dimensional table structure,
where the rows typically represent cases (such as individuals or households)
and the columns represent measurements (such as age, sex, or household
income). Only two data types are defined: numeric and text (or "string").
All data processing occurs sequentially case-by-case through the file. Files
can be matched one-to-one and one-to-many, but not many-to-many.
The graphical user interface has two views which can be toggled by clicking
on one of the two tabs in the bottom left of the SPSS Statistics window.
The 'Data View' shows a spreadsheet view of the cases (rows) and
variables (columns). Unlike spreadsheets, the data cells can only contain
numbers or text, and formulas cannot be stored in these cells. The 'Variable
View' displays the metadata dictionary where each row represents a
variable and shows the variable name, variable label, value label(s), print
width, measurement type, and a variety of other characteristics. Cells in
both views can be manually edited, defining the file structure and allowing
data entry without using command syntax. This may be sufficient for small
datasets. Larger datasets such as statistical surveys are more often
created in data entry software, or entered during computer-assisted
personal interviewing, by scanning and using optical character recognition
and optical mark recognition software, or by direct capture from online
questionnaires. These datasets are then read into SPSS.

SPSS Statistics can read and write data from ASCII text files (including
hierarchical files), other statistics packages, spreadsheets and databases.

APPLICATION OF SPSS IN VARIOUS FIELDS

SPSS is used in many fields, with one basic application: holding and
analyzing large amounts of data.
It is used in fields such as:
1. Stock exchanges
2. Cosmology
3. Census data analysis
4. Marketing
5. Survey companies
6. Other purposes, including educational research

INDEX

S.No.    TITLE    DATE    SIGN

OBJECTIVE 1:
Importing a data set into the SPSS Data Editor

INPUTS:
FILES: Book1.xlsx, state.txt

PROCEDURE FOLLOWED:
THROUGH AN EXCEL FILE

COMMANDS:
File-->Open-->Data
OUTPUT
FILE: Book1.sav

DATA VIEW

VARIABLE VIEW
THROUGH A TEXT FILE

COMMANDS:
File-->Read Text Data

OUTPUT

DATA VIEW
VARIABLE VIEW

CONCLUSION

Data files in Excel, text, and similar formats can be imported into the SPSS Data
Editor window.

PRECAUTIONS

Variables in a text file should be separated by consistent spacing or delimiters.

File extensions must be specified correctly.
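
The spacing precaution can be illustrated outside SPSS. The following is a minimal Python sketch, not the SPSS import itself: it assumes a whitespace-delimited text file with a header line, and the sample contents (column names and values) are invented stand-ins for a file such as state.txt.

```python
import io

# Hypothetical stand-in for the contents of a text file such as state.txt;
# the column names and values are invented for illustration.
SAMPLE_TEXT = """state population area
Goa 1458545 3702
Sikkim 610577 7096
"""

def read_delimited(text):
    """Split whitespace-delimited text into a list of row dictionaries."""
    lines = [line for line in io.StringIO(text) if line.strip()]
    header = lines[0].split()
    rows = []
    for row_number, line in enumerate(lines[1:], start=1):
        fields = line.split()
        # Inconsistent spacing between variables shows up as a wrong field count.
        if len(fields) != len(header):
            raise ValueError(
                f"data row {row_number}: expected {len(header)} fields, got {len(fields)}"
            )
        rows.append(dict(zip(header, fields)))
    return rows

rows = read_delimited(SAMPLE_TEXT)
```

Each dictionary in rows corresponds to one case, and each key to one variable; a row with a missing or extra field is rejected instead of being silently misaligned.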
OBJECTIVE 2:
Create an SPSS file that includes each student's name, gender, and mid-semester
maths marks.

INPUTS
The user defines the variables and their attributes and enters the data.

PROCEDURE

COMMANDS FOLLOWED

File-->New-->Data
OUTPUT

INSERTING VARIABLES IN VARIABLE VIEW

DEFINING LABELS FOR A VARIABLE


CONCLUSION

A file can be created in SPSS by defining variables and their attributes; data can
then be entered and stored accordingly.

PRECAUTIONS

The attributes of a variable must be defined carefully, keeping in view the
requirements of the variable.

File extensions must be specified correctly.
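
The idea of defining variables with attributes and value labels before entering data can be sketched in Python. This is only an illustration of the concept, not how SPSS stores its dictionary; the variable names, labels and the coding 1 = Male, 2 = Female are assumptions made for the example.

```python
# Hypothetical variable definitions, one entry per row of the Variable View.
VARIABLES = {
    "name": {"type": str, "label": "Student name", "values": {}},
    "gender": {"type": int, "label": "Gender", "values": {1: "Male", 2: "Female"}},
    "maths": {"type": int, "label": "Mid-sem maths marks", "values": {}},
}

def check_case(case):
    """Return the problems found in one case (row); an empty list means it is valid."""
    problems = []
    for var, spec in VARIABLES.items():
        value = case.get(var)
        if not isinstance(value, spec["type"]):
            problems.append(f"{var}: expected {spec['type'].__name__}")
        elif spec["values"] and value not in spec["values"]:
            problems.append(f"{var}: {value!r} has no value label")
    return problems

def labelled(case):
    """Replace coded values by their value labels where a label is defined."""
    return {var: spec["values"].get(case[var], case[var]) for var, spec in VARIABLES.items()}

case = {"name": "Student A", "gender": 2, "maths": 17}
```

Here check_case(case) returns an empty list, while a case with gender coded 3 is reported because no value label was defined for that code.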
OBJECTIVE 3:
USING 3 FILES, FILE1, FILE2 AND FILE3, DEMONSTRATE

1. Merging of cases

2. Merging of variables

a) Both files provide cases

b) Non-active data set is keyed table

c) Active data set is keyed table

INPUTS
Three files, File1, File2 and File3, are created in SPSS; the user defines the
variables and their attributes and enters the data.

PROCEDURE

MERGING OF CASES

1. Open FILE1.sav

File-->Open-->Data
2. Data--> Merge Files--> Add Cases

3. Rename the unpaired variables so that their names match, then pair them to merge the files


ID IS RENAMED TO S.No

OUTPUT

MERGING OF VARIABLE VALUE

1. Data--> Merge Files--> Add Variables


MAKE SURE THAT THE FILES TO BE MERGED ARE SORTED IN ASCENDING
ORDER OF THE KEY VARIABLE
OUTPUT:1

BOTH FILES PROVIDE CASES

OUTPUT:2

NON-ACTIVE DATA SET IS KEYED TABLE


OUTPUT:3

ACTIVE DATA SET IS KEYED TABLE


CONCLUSION

We conclude that in SPSS we can either append the cases of one file to another
or add the variables of one file to those of another.

PRECAUTIONS

While adding variables to a file, make sure both files are sorted in ascending
order of the key variable.

While merging cases, the unpaired variables should be renamed properly and then
paired accordingly.
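
The two kinds of merge can be sketched in Python as a rough analogy. This is not SPSS syntax: the file contents, the "marks" variable and the helper names are invented, and add_variables models only the case where a second file acts as a keyed lookup table for the active data.

```python
# Hypothetical contents of three small files; all values are invented.
file1 = [{"ID": 1, "name": "A"}, {"ID": 2, "name": "B"}]
file2 = [{"S.No": 3, "name": "C"}]
file3 = [{"S.No": 1, "marks": 40}, {"S.No": 2, "marks": 35}]

def rename(rows, old, new):
    """Rename one variable so that unpaired variables can be paired."""
    return [{(new if key == old else key): value for key, value in row.items()} for row in rows]

def add_cases(first, second):
    """Merging of cases: rows of the second file are appended to the first."""
    return first + second

def add_variables(active, table, key):
    """Merging of variables: look up each active case in a keyed table."""
    lookup = {row[key]: row for row in table}
    new_vars = [name for name in table[0] if name != key]
    merged = []
    for row in active:
        match = lookup.get(row[key], {})
        # Cases with no match in the table get a missing value (None).
        merged.append({**row, **{name: match.get(name) for name in new_vars}})
    return merged

file1 = rename(file1, "ID", "S.No")
all_cases = add_cases(file1, file2)
with_marks = add_variables(all_cases, file3, "S.No")
```

The dictionary lookup used here does not need sorted input; the sorting precaution applies to the SPSS procedure described in this objective.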
OBJECTIVE 4:
Demonstrate the following in an SPSS file:
Filtering of Data
Splitting of File According to Variable(s)

INPUTS :

Files : "car_sales.sav"

PROCEDURE

Filtering Of Data

Open "car_sales.sav" from the samples provided along with the software.
Data > Select Cases
Choose the radio button saying "If condition is satisfied".
Click on the "if" button.
Specify the condition in the dialog box that opens.
Click on "Continue".
Click on "OK".
Splitting of File According to Variable(s)

File : "car_sales_splitting.sav"

CONCLUSION :

Subsets of very large data files can be examined easily by filtering the data on a
condition and by splitting the file according to one or more variables.
PRECAUTIONS :

File extensions must be specified correctly.


The conditions used to filter the data and the variables used to split the file should
be specified carefully.
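
Filtering and splitting can be pictured with a short Python sketch. The records below are invented and do not reproduce car_sales.sav; the variable names "make", "type" and "price" and both helper functions are assumptions made for illustration.

```python
from collections import defaultdict

# Hypothetical car records; names and values are invented.
cars = [
    {"make": "A", "type": "car", "price": 21.5},
    {"make": "B", "type": "truck", "price": 30.0},
    {"make": "C", "type": "car", "price": 35.2},
]

def select_cases(rows, condition):
    """Filtering: keep only the cases for which the condition is satisfied."""
    return [row for row in rows if condition(row)]

def split_file(rows, *variables):
    """Splitting: group the cases by the values of one or more variables."""
    groups = defaultdict(list)
    for row in rows:
        groups[tuple(row[v] for v in variables)].append(row)
    return dict(sorted(groups.items()))

cheap = select_cases(cars, lambda row: row["price"] < 32)
by_type = split_file(cars, "type")
```

After splitting, any analysis can be repeated once per group, which is the effect a split file has on later procedures.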
OBJECTIVE 5:

Write a program to find the total marks obtained by 5 students in a set of
multiple-choice questions, given that the correct answers to "Q1", "Q2" and "Q3"
are "d", "b" and "a" respectively.

INPUT

OBJECTIVE5.sav

PROCEDURE

1. Transform > Recode into Different Variables
2. Select variable "Q1" and send it over to the right side.
3. Type "A1" in the "Name" field and click on the "Change" button.
4. Click on the button saying "Old and New Values".
5. Type the correct answer ("d" in case of Q1) in the "Value" field under "Old Value"
and give it the value "1" in the "Value" field under "New Value". Then, click on
"Add".
6. Choose the radio button saying "All other values" under "Old Value" and give them
the value "0" under "New Value". Then, click on "Add".
7. Click on "Continue".
8. Repeat Steps 2 to 7 for "Q2" and "Q3", naming the new variables "A2" and "A3".
9. Click on "OK".
10. Transform > Compute Variable
11. Give the name "Result" to the new variable.
12. Select "All" under "Function Group" and then double click on "Sum" under
"Functions and Special Variables".
13. Specify the variables whose sum is to be calculated, i.e. "A1", "A2" and "A3".
14. Click on "OK".
RECODING INTO DIFFERENT VARIABLES

DEFINING OLD AND NEW VALUES FOR THE RECODING

VARIABLES A1, A2,A3 WITH THEIR VALUES


THE VARIABLE RESULT HOLDS THE TOTAL NUMBER OF CORRECT ANSWERS

OUTPUT :
CONCLUSION :

New variables were created from the answers to the given questions.
The result is computed as the number of questions each student answered
correctly.

PRECAUTIONS :

The value of each new variable should be assigned carefully according to the correct
answer.
The numeric expression for the target variable should be specified carefully.
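
The recode-then-sum logic of this objective can be written as a short Python sketch. The answer key comes from the objective; the five students' responses are invented for illustration, since the contents of OBJECTIVE5.sav are not reproduced here.

```python
# Answer key as stated in the objective.
ANSWER_KEY = {"Q1": "d", "Q2": "b", "Q3": "a"}

# Hypothetical responses of 5 students; the values are invented.
responses = [
    {"Q1": "d", "Q2": "b", "Q3": "a"},
    {"Q1": "d", "Q2": "c", "Q3": "a"},
    {"Q1": "a", "Q2": "b", "Q3": "c"},
    {"Q1": "b", "Q2": "a", "Q3": "d"},
    {"Q1": "d", "Q2": "b", "Q3": "c"},
]

def recode(response):
    """Recode Q1..Q3 into A1..A3: 1 for the correct answer, 0 for all other values."""
    return {"A" + q[1:]: int(response[q] == correct) for q, correct in ANSWER_KEY.items()}

def result(response):
    """Compute Result as the sum of A1, A2 and A3."""
    return sum(recode(response).values())

results = [result(response) for response in responses]
```

With these invented responses, results holds one total per student, each between 0 and 3.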
OBJECTIVE 6:

Write a program to find and replace missing values in data.

INPUT:
OBJECTIVE6.sav

THE FIELD AVGMARKS HAS SOME MISSING VALUES

PROCEDURE:

1. Transform --> Replace Missing Values


2. Select the variable that contains missing values, move it to the list of new
variables, rename the new variable as convenient, and press the "Change" button.
3. Choose the method "Series mean".

THE MISSING VALUES ARE REPLACED BY THE MEAN OF THE SERIES.

4. Similarly, choose "Mean of nearby points" as the method to replace missing values.

THE FIRST AND LAST VALUES ARE STILL EMPTY BECAUSE THERE ARE NO NEARBY
VALUES ABOVE AND BELOW THEM RESPECTIVELY, AND THE 14TH VALUE IS STILL
EMPTY BECAUSE THE NUMBER OF NEARBY POINTS WAS SET TO 2 AND THE MEAN
CANNOT BE CALCULATED.
5. Similarly, choose "Median of nearby points" as the method to replace missing
values.

THE FIRST AND LAST VALUES ARE STILL EMPTY BECAUSE THERE ARE NO NEARBY
VALUES ABOVE AND BELOW THEM RESPECTIVELY, AND THE 14TH VALUE IS STILL
EMPTY BECAUSE THE NUMBER OF NEARBY POINTS WAS SET TO 2 AND THE MEDIAN
CANNOT BE CALCULATED.
CONCLUSION :

Missing values in a field are replaced by applying methods such as the mean of the
complete series, the mean of nearby points, or the median of nearby points.

PRECAUTIONS :

The method of replacing missing values must be chosen carefully.


Replaced values should be written to a new variable rather than to the source
variable itself, since the source variable may be needed later.
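
The series-mean and nearby-points methods can be sketched in Python. This is a simplified reading of the behaviour described above, not the exact SPSS algorithm: it assumes a missing value is replaced only when the full span of valid values exists on each side, and the sample series is invented.

```python
from statistics import mean

def series_mean(values):
    """Replace every missing value (None) by the mean of all valid values."""
    fill = mean(v for v in values if v is not None)
    return [fill if v is None else v for v in values]

def mean_of_nearby(values, span=2):
    """Replace a missing value by the mean of `span` valid values on each side.

    Assumption: if fewer than `span` valid values exist above or below,
    the value is left missing.
    """
    result = list(values)
    for i, value in enumerate(values):
        if value is not None:
            continue
        before = [v for v in values[:i] if v is not None][-span:]
        after = [v for v in values[i + 1:] if v is not None][:span]
        if len(before) == span and len(after) == span:
            result[i] = mean(before + after)
    return result

# Hypothetical marks with missing first, middle and last values.
marks = [None, 10, 20, None, 30, 40, None]
filled_series = series_mean(marks)
filled_nearby = mean_of_nearby(marks, span=2)
```

In this sketch the first and last values stay missing under the nearby-points method because they have no neighbours on one side, while the series mean fills every gap. Both functions return new lists and leave the source series unchanged.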
OBJECTIVE 7:

Pictorial representation of data.

INPUTS:
Objective7.sav

PROCEDURE:

1. Graphs --> Chart Builder


2. Select and drag the type of graph from the gallery to the Chart Preview.
3. Select and drag the variables to the appropriate places in the Chart Preview.

4. Click on the button saying "Element Properties..." and edit any properties, if
required, in the dialog box that opens.

5. Press "Close" when done editing.

6. Press "OK"; the graph opens in the Output window.

7. Repeat the steps for the Histogram, Scatter Plot, Box Plot, Bar Graph and Pie Chart.

HISTOGRAM
BAR GRAPH

PIE CHART

BOX PLOT
SCATTER PLOT

CONCLUSION :

We conclude that any given set of data can be represented easily with
the help of graphs.

PRECAUTIONS :

Variables should be chosen carefully when plotting graphs.


Graph labels should be chosen appropriately.
OBJECTIVE 8:

Descriptive Statistics.

INPUT:
Objective8.sav

PROCEDURE

1. Analyze > Descriptive statistics > Frequencies...


2. Send the variable(s) to be used, over to the list called "Variable(s)" on the right side.

3. Click on the button saying "Statistics..." and choose the required options to be displayed
and click on the button saying "Continue". Click on the button saying "Charts..." and
select "Histograms" and click on the button saying "Continue".
4. Click on the button saying "OK" to proceed to the output.

OUTPUT:
CONCLUSION :

1. Frequency tables give a clear statistical summary of the data.


2. Frequency curves make the skewness and kurtosis of a distribution easy to
interpret.

PRECAUTIONS :

The symmetry of the curve should be judged carefully.


Note that quartiles divide the distribution into four equal parts.
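
The statistics requested in the Frequencies dialog can be approximated with Python's statistics module. This is a sketch on invented data: the skewness below is the simple population-moment version, and both it and the quartile interpolation may differ slightly from the formulas SPSS uses.

```python
from statistics import mean, median, mode, pstdev, quantiles

# Hypothetical marks; invented for illustration.
data = [2, 4, 4, 4, 5, 5, 7, 9]

def skewness(values):
    """Population-moment skewness: the third central moment over sigma cubed."""
    m = mean(values)
    sigma = pstdev(values)
    third_moment = sum((v - m) ** 3 for v in values) / len(values)
    return third_moment / sigma ** 3

summary = {
    "mean": mean(data),
    "median": median(data),
    "mode": mode(data),
    "std_dev": pstdev(data),
    # Three cut points that divide the distribution into four equal parts.
    "quartiles": quantiles(data, n=4),
    "skewness": skewness(data),
}
```

A positive skewness, as obtained for this invented series, corresponds to a frequency curve with a longer tail on the right.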
OBJECTIVE 9:
Correlation and regression

INPUT:
Objective9.sav

PROCEDURE:

A. CORRELATION

1. Analyze > Correlate > Bivariate


2. Select the variables M3 and DM.

3. Select "Two-tailed" and click "OK" and the output is displayed in the Output Window.

4. Repeat the above for "One-tailed".

B. REGRESSION

1. Analyze > Regression > Curve Estimation


2. Select "Roll_no" for the independent variable and choose "M3" for the dependent variable.

3. Under the category "Models", select "Linear", "Quadratic" & "Exponential". Click on "OK" to
view the output.
CONCLUSION :
From the above data, we conclude that M3 and DM are only weakly correlated,
as Pearson's correlation coefficient is very small. Since the sig. values in both the
one-tailed and two-tailed tests are greater than 0.05, we fail to reject the null
hypothesis H0 that M3 and DM are not correlated; the data give no significant
evidence for the alternate hypothesis Ha that M3 and DM are correlated.

PRECAUTIONS :

1. Variables should be chosen carefully.

2. File extensions must be specified correctly.
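
Pearson's correlation coefficient and the linear model from Curve Estimation can be computed by hand, as in the following Python sketch. The data are invented and perfectly linear so the result is easy to check; the significance value is not computed here because it requires the t distribution.

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson's correlation coefficient between two equally long series."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

def linear_fit(x, y):
    """Least-squares slope and intercept of y on x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    slope = sxy / sxx
    return slope, my - slope * mx

# Hypothetical data: y is exactly twice x.
x = [1, 2, 3, 4, 5]
y = [2, 4, 6, 8, 10]
r = pearson_r(x, y)
slope, intercept = linear_fit(x, y)
```

A coefficient near 0, as reported for M3 and DM, would indicate that a straight line explains very little of the variation.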

OBJECTIVE 10:
Distribution curves

INPUTS:
FILES:breakfast.sav

PROCEDURE FOLLOWED:

COMMANDS:
1. File-->Open-->Data

2. Graphs -> Legacy Dialogs -> Histogram


ROW WISE: FINDING FREQUENCY BY GENDER TO COFFEE CAKE(CC)

OUTPUT
COLUMN WISE: FINDING FREQUENCY BY GENDER TO JELLY DONUT(JD)

OUTPUT
BOTH ROW WISE AND COLUMN WISE: FINDING
FREQUENCY BY GENDER TO COFFEE CAKE(CC) AT
ROW AND JELLY DONUT(JD) AT COLUMN

OUTPUT
CONCLUSION

Summarising the data in terms of frequencies makes it easier to analyse through
curves.

PRECAUTIONS

Choice of dependent and independent variables should be made aptly.

Data variables for frequency curves should be decided beforehand for proper
results.
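
The frequencies behind gender-wise plots like these can be tabulated with a short Python sketch. The cases below are invented and do not reproduce breakfast.sav; the variable names "gender" and "cc" and the coded values are assumptions.

```python
from collections import Counter

# Hypothetical cases: gender and a coded response for coffee cake (cc).
cases = [
    {"gender": "F", "cc": 1},
    {"gender": "F", "cc": 2},
    {"gender": "M", "cc": 1},
    {"gender": "M", "cc": 1},
]

def frequencies(rows, variable):
    """Frequency of each value of one variable."""
    return Counter(row[variable] for row in rows)

def crosstab(rows, row_var, col_var):
    """Frequency of each (row value, column value) combination."""
    return Counter((row[row_var], row[col_var]) for row in rows)

by_gender = frequencies(cases, "gender")
table = crosstab(cases, "gender", "cc")
```

Each entry of table is the height of one bar in the corresponding gender panel; combinations that never occur simply count as 0.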
OBJECTIVE 11:
Chi-square test

INPUTS:
FILES:breakfast.sav

COMMANDS:
A. INDEPENDENT

1. File --> Open --> Data
2. Analyze --> Nonparametric Tests --> Legacy Dialogs --> Chi-square

OUTPUT
B. DEPENDENT

COMMANDS:
1. Analyze --> Descriptive Statistics --> Crosstabs
OUTPUT

CONCLUSION

Chi-square test

The chi-square test is a statistical test commonly used to compare observed data with
the data we would expect to obtain according to a specific hypothesis.

PRECAUTIONS

Variables should be chosen properly.

The test may give misleading results if applied without a clear question, so what
is to be observed should be decided beforehand.
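
The comparison of observed with expected counts can be made concrete with a Python sketch of the chi-square statistic for a contingency table. The observed counts are invented, and the sketch stops at the statistic and degrees of freedom; the significance value would need the chi-square distribution.

```python
def chi_square(table):
    """Chi-square statistic and degrees of freedom for a table of observed counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count if the row and column variables were independent.
            expected = row_totals[i] * col_totals[j] / total
            statistic += (observed - expected) ** 2 / expected
    dof = (len(table) - 1) * (len(table[0]) - 1)
    return statistic, dof

# Hypothetical 2 x 2 crosstab (rows: gender, columns: preference).
observed = [[10, 20], [20, 10]]
statistic, dof = chi_square(observed)
```

The larger the statistic for a given number of degrees of freedom, the stronger the evidence that the two variables are not independent.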

OBJECTIVE 12:
Perform T test

INPUTS:
FILES:breakfast.sav

PROCEDURE FOLLOWED:

A. One Sample

COMMANDS:

1. File --> Open --> Data

2. Analyze -> Compare Means -> One-Sample T Test


OUTPUT

1. T test ( Test Value = 15)

2. T test ( Test Value = 16)


B. Paired

COMMANDS:

1. Analyze -> Compare Means -> Paired-Samples T Test


OUTPUT:

CONCLUSION

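We observe that as the test value is raised from 15 to 16, the mean difference
(sample mean minus test value) changes accordingly; in our output its magnitude
increases.

The closer the test value is to the sample mean, the smaller the mean difference,
so a better estimate of the mean gives a better result.

PRECAUTIONS:
Variables should be chosen properly.
The test may give misleading results if applied without a clear question, so what
is to be observed should be decided beforehand.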
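
The effect of the test value on the mean difference can be checked with a Python sketch of the one-sample t statistic. The sample is invented (its mean happens to be 15), so the numbers do not reproduce the output of this objective.

```python
from math import sqrt
from statistics import mean, stdev

def one_sample_t(sample, test_value):
    """Return (t statistic, degrees of freedom, mean difference)."""
    n = len(sample)
    difference = mean(sample) - test_value
    standard_error = stdev(sample) / sqrt(n)
    return difference / standard_error, n - 1, difference

# Hypothetical sample; invented for illustration.
sample = [14, 15, 16, 15, 15]
t15, dof, diff15 = one_sample_t(sample, 15)
t16, _, diff16 = one_sample_t(sample, 16)
```

For this sample the mean difference is 0 at a test value of 15 and -1 at 16, which shows that the mean difference moves one-for-one with the test value while the standard error stays the same.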

OBJECTIVE 13:
ANOVA Test

INPUTS:
breakfast.sav

PROCEDURE FOLLOWED:

A. One Way

COMMANDS:

1. File --> Open --> Data

2. Analyze -> Compare Means -> One-Way ANOVA


OUTPUT

B. Two Way

COMMANDS:

1. Analyze -> General Linear Model -> Univariate


OUTPUT
CONCLUSION:

The ANOVA test is used to compare the means of three or more groups to determine
whether they differ significantly from one another. Another important function is to
estimate the differences between specific groups. The most common method to detect
differences among groups in one-way ANOVA is the F-test, which is based on the
assumption that the populations for all samples share a common, but unknown,
standard deviation. We recognise, in practice, that samples often have different
standard deviations.

PRECAUTIONS

Variables should be chosen properly.

The test may give misleading results if applied without a clear question, so what
is to be observed should be decided beforehand.
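
The F-test described in the conclusion can be sketched in Python for the one-way case. The three groups are invented, and the sketch returns only the F ratio and its degrees of freedom; the significance value would need the F distribution.

```python
from statistics import mean

def one_way_anova(groups):
    """Return (F ratio, between-groups df, within-groups df)."""
    k = len(groups)
    n = sum(len(group) for group in groups)
    grand_mean = sum(sum(group) for group in groups) / n
    # Variation of the group means around the grand mean.
    ss_between = sum(len(group) * (mean(group) - grand_mean) ** 2 for group in groups)
    # Variation of the observations around their own group mean.
    ss_within = sum((x - mean(group)) ** 2 for group in groups for x in group)
    df_between, df_within = k - 1, n - k
    f_ratio = (ss_between / df_between) / (ss_within / df_within)
    return f_ratio, df_between, df_within

# Hypothetical scores for three groups.
groups = [[1, 2, 3], [2, 3, 4], [5, 6, 7]]
f_ratio, df_between, df_within = one_way_anova(groups)
```

A large F ratio means the group means differ by more than the spread within the groups would explain, under the common-standard-deviation assumption stated above.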
