SPSS 15 for Windows: Tutorial
Contents
1. Introduction
2. Getting started
3. Default SPSS windows
4. Notation
5. Data entry
6. Saving your work
7. Reading in a SAVED data set
8. Listing the data
9. Exploring the data
10. Saving results
11. Reading data from a separate file
12. Transformations and calculations on data
13. Creating subgroups of data
14. Data summary
15. Statistical analysis – regression
16. Analysis of variance
17. Saving command syntax
18. Using the HELP menu
1. Introduction
SPSS is a package originally developed for social scientists using large mainframe
computers. Since then it has been refined and redeveloped for different types of architecture,
including Windows. It is an extensive package with facilities for data entry, data
manipulation and statistical analysis in a graphical environment. It has modules for survey
analysis, graphical display and time series. The package has been considerably improved to
include logistic regression, repeated measures analysis and much more that is beyond the
scope of this tutorial. The current version in the PC labs is 15; version 17 is currently
available to individuals via the ITS web site (see: Obtaining and Buying Software).
2. Getting Started
This tutorial assumes that you have a basic knowledge of Windows. It is based on the
default installation of the software at Reading, where your ITS-supported N drive is
available as the N:\ drive after logging in. This is normally accessed via
My Computer. For this tutorial we suggest you create a folder such as SPSS_Tut on your
N:\ drive. We find that in the PC labs this is best done on the N:\ drive, not under My
Documents. Any files that you are provided with should be stored in this folder.
Access SPSS from the Start menu and select Statistics and then SPSS for Windows.
Ignore, by selecting OK, any error messages that may appear on the screen until you see
Figure 1. Select the option Don’t show this dialog in the future and this dialogue-box will
not appear again. You can then remove the screen by selecting the Cancel button.
Figure 1
Wait until Figure 2 then appears on the screen. [For information - to exit SPSS, click on the
File option on the menu bar and choose Exit and confirm your intention.]
4. Notation
In the sections that follow, the actions you must perform are shown in bold, as too are the
default values chosen by SPSS. Italic is used to denote items whose names/values you may
choose. It is preferred that variable names start with a letter and are kept as short and simple
as possible.
For information:
A sequence of actions using the drop-down menus will be denoted by →. For example,
Data→Insert Variable... would mean "Click-left on the Data option on the drop-down
menu and select the option Insert Variable... (by clicking-left again)".
5. Data Entry
If SPSS has been set up correctly the first empty column in the first row will be outlined in
the Untitled - SPSS Data Editor window. Enter the following three columns of data into the
spreadsheet.
11 110 22
15 129 250
104 90 195
168 102 177
60 145 297
125 86 186
111 109 188
By default the names given to these three columns are var00001, var00002 and var00003.
Each row of this data represents a farm number, the number of trees on that farm and the
total yield of apples (in unspecified units) from each of the farms. It would, therefore, be
more useful to have suitable names for each column.
Move to the Variable View window (click-left on the tab at the bottom of the window) to see
Figure 3 and replace the default names by those suggested above in the Name column.
Move back to the Data View window to see Figure 4. Do not worry about the contents of the
other columns shown in Variable View; they will be discussed as the need arises.
Figure 3
Figure 4
Exercise 2
• Choose File→Exit from the main menu. Using any method you know (i.e. look at the
contents of the folder N:\SPSS_Tut), check that the file apples.sav exists. If it does not,
you may need to repeat the tutorial from the beginning.
• Invoke the package again (as described in Section 2) and wait for the software to be
reloaded.
• Select File→Open→Data… from the main menu bar. This will show the Open File
dialogue box similar to Figure 6. Check you are looking at the correct folder and
filename N:\SPSS_Tut\apples.sav and select Open.
• Check that the apples.sav - SPSS Data Viewer is as shown in Figure 4.
Figure 6
button to select the variable as required to be listed. Repeat for farm and trees.
Click on the OK button. The maximised Output1 - SPSS Viewer window will become
visible. Scroll up and down through it to check that it looks like Figure 8. Note that the
data is listed in the order in which you selected the variables e.g. alphabetic. This window
could then be printed – do not attempt this now.
Correct any mistakes if necessary, and re-save the data file using File→Save. This will
automatically overwrite the file N:\SPSS_Tut\apples.sav.
Figure 7
Figure 8
Figure 10
Exercise 5
2. Scatterplots:
From the main menu select Graphs→Scatter ... This will produce a further dialogue-box as
shown in Figure 11. Select the Simple plot and click on the Define button to produce
Figure 12. Select apples as the Y-axis and trees as the X-axis using the method in
Exercise 3 and then click on OK. The Output1 - SPSS Viewer window now contains this
simple scatterplot (as shown in Figure 13).
Figure 11 Figure 12
Figure 13
Exercise 6
If necessary, move into the Output1 - SPSS Viewer window and click in it. Select
File→Save As which will give the dialogue-box similar to that shown in Figure 14.
Confirm that your folder N:\SPSS_Tut is selected and then supply a suitable File name
such as apples. [Note that the default file extension is given as .spo.] Click on Save and
the complete contents of the output window will be saved and the default window name
changed to N:\SPSS_Tut\apples.spo. It is possible to save the graphs separately in a
special format that can be imported into other packages such as Word 2007; however, for
most reports this quality is sufficient.
Exit from SPSS [File→Exit] but you do not need to save the contents of any other window.
Figure 14 Figure 15
Exercise 7
To print the output file N:\SPSS_Tut\apples.spo, once saved, you can read it back into
SPSS from the File→Open→Output… menu and choose the correct file as shown in Figure
15. Try it now before going on to the next set of data.
Exit from SPSS [File→Exit] so that you have a clean session for the start of the next
exercise.
From now on the resulting output from exercises and basic dialogue-boxes
will not be shown unless further options are selected and additional screens
appear. Ask the tutor if you are not sure but be adventurous at this stage.
Make mistakes now on a small data set - do not wait until you have entered
your data containing 126 variables and 3452 observations!
11. Reading data from a separate file
A data set that you wish to analyse may have been entered using a text editor, a database or
another spreadsheet. This example assumes that the data are in an EXCEL file.
SPSS has extensive facilities to input from text files, in which data are typed in columns
using a word processor or WordPad, but here we will concentrate on EXCEL files.
The data are in N:\SPSS_Tut\school.xls. Find this file, open it in EXCEL and check that
the data are as below.
Do not type this into the spreadsheet. Follow the next exercise to input the data.
Exercise 8
Begin by starting SPSS again, and access the File→Open→ Data... menu selecting the file
N:\SPSS_Tut\school.xls. You will need to change the option of File Type to Excel (*.xls)
and choose the file before selecting Open.
The next dialog box asks if you want to read variable names from the first line (the default)
and as we have these in the file simply choose OK.
The Data View window should contain the 19 observations of the data set – check that these
are correct.
The Variable View window should contain the information shown in Figure 16. Check that
school is a String type and all the others are Numeric.
This is probably the most common method of importing data into a statistics package. One
advantage of using EXCEL to input the data is that it can easily be read into other packages;
you may find that one particular analysis is better implemented in another package.
You must, however, be careful to ensure that your data appear correctly in the statistics
package. The package will normally scan the first few lines of your data to see whether a
column is numeric or character – as seen here in the Variable View.
Figure 16
12. Transformations and calculations on data
Using the data set just input, the percentage of “yes” answers for each question, e.g.
(number of correct answers / total number of responses) * 100, might be required.
Exercise 9 demonstrates this facility as an example of manipulation of data values. Many
other calculations and transformations are possible but can be explored in your own time.
Exercise 9
Select Transform→Compute from the main menu to get the Compute Variable box
shown in Figure 17. Complete the dialogue-box in the following way:
• Enter the name of the new variable percent in the Target Variable box
• Type in the expression (correct/number)*100 in the Numeric Expression box.
• Click on OK and return to the Untitled - SPSS Data Editor window and check that one
or two observations, of the new variable percent, have been calculated correctly.
• Calculate the number of "incorrect" (or not "yes") answers to show you understand how
to use this facility.
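As a cross-check outside SPSS, the calculation in Exercise 9 can be sketched in a few lines of Python. This is only an illustrative sketch: the function names and the example counts are hypothetical and are not taken from school.xls.

```python
def percent_yes(correct, number):
    """Return the percentage of "yes" answers: (correct / number) * 100."""
    if number == 0:
        raise ValueError("number of responses must be non-zero")
    return (correct / number) * 100


def incorrect(correct, number):
    """Return the number of answers that were not "yes"."""
    return number - correct


# Hypothetical counts for a single observation (not real tutorial data).
print(percent_yes(15, 20))  # 75.0
print(incorrect(15, 20))    # 5
```

Comparing one or two hand-calculated values like these with the new column in the Data Editor is a quick way to confirm that the Numeric Expression was typed correctly.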
Figure 17
13. Creating subgroups of data
Exercise 10
Calculate the overall means of the variables number, correct and percent (use Exercise 4 as
a guide). Then split the file into the four different schools as follows:
♦ From the main menu select Data→Split File to get Figure 18.
♦ Select the option Organise output by groups.
♦ Select the variable school and move it into the Groups Based on: box. Your screen
should now look like Figure 18.
♦ Click on OK as usual. Notice that the data in the Untitled - SPSS Data Editor window
has been sorted by school and that it lists the CROSSLEY school first.
♦ Repeat the calculations with the same variables and see that four separate mean values
(one for each school) have been calculated for each variable.
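What Split File does for a mean can be pictured as "group the rows by school, then average within each group". The Python sketch below illustrates that idea only; it is not how SPSS is implemented. The rows are hypothetical, and the school name SCHOOL_B is invented (only CROSSLEY appears in this tutorial).

```python
from collections import defaultdict


def group_means(rows, group_key, value_key):
    """Return {group: mean of value_key} for a list of dict rows."""
    totals = defaultdict(float)
    counts = defaultdict(int)
    for row in rows:
        totals[row[group_key]] += row[value_key]
        counts[row[group_key]] += 1
    # Groups are returned in sorted order, mirroring the sort by school.
    return {group: totals[group] / counts[group] for group in sorted(totals)}


# Hypothetical rows, not the contents of school.xls.
rows = [
    {"school": "SCHOOL_B", "number": 10, "correct": 4},
    {"school": "CROSSLEY", "number": 20, "correct": 15},
    {"school": "CROSSLEY", "number": 30, "correct": 21},
]
print(group_means(rows, "school", "correct"))
```

With the real data set, the same function applied to number, correct and percent should give one mean per school for each variable, which can be compared with the SPSS output.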
Figure 18
Exercise 11
Be aware that the data file is considered to be split into the groups for any following analysis
so complete the next exercise to show how to remove the grouping indicator.
Very simply, follow the previous exercise but this time select the option
Notice that the box entitled Groups Based on: has become dimmed. Click on OK. You
need to be aware of the status of the data set that you are analysing so that you always work
with the data intended.
Exit from SPSS [File→Exit] but again you do not need to save the contents of any window.
14. Data summary
If survey data are being analysed, almost the first thing that is required is to display the data
in tabular form. This has the added bonus of being a way of checking categorical data. The
following exercises demonstrate methods of tabulating data.
A file, N:\SPSS_Tut\surv.xls, has been provided for you containing data in the following
form:
Before proceeding further save the data window (Exercise 1) for future use, as
N:\SPSS_Tut\mysurv.sav.
[If you are having problems at this stage and are struggling with time, you have been
supplied with a previously prepared SPSS file called N:\SPSS_Tut\surv.sav. Follow
Exercise 2 to bring it into the Untitled - SPSS Data Editor window and continue with the
next exercise.]
Figure 19
Exercise 13
This exercise demonstrates simple one-way frequency tables. Select Analyze→Descriptive
Statistics→Frequencies... . Select variable sex to demonstrate Figure 20 then click on OK.
Examine the output. Repeat with other variables as you wish, removing the previous
selection by highlighting the variable name in the Variable(s): box and click on the
button. Try selecting the Statistics and Charts buttons and try the options for more
descriptive statistics and bar charts.
There is a wide variety of options which are useful in the initial exploration of your data.
Try several options and make sure you understand what is plotted, etc.
Exercise 14
Crosstabulations can be produced using a different menu selection. This time choose
Analyze→Descriptive Statistics→Crosstabs... to produce Figure 21. Select a variable to
appear in the row dimension and another to appear in the column dimension e.g. smoke and
drink. Click on OK and examine the output again. Can you interpret the results OK? Try
adding the Expected values and the Row percentages to the Cells and calculate the Chi-
square (look at Statistics).
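To help interpret the Expected values, Row percentages and Chi-square in the output, the Python sketch below computes them for a small table. It assumes the usual definitions (expected count = row total * column total / grand total, and the Pearson chi-square without continuity correction). The counts are hypothetical and do not come from surv.xls.

```python
def expected_counts(table):
    """Expected counts under independence: row total * column total / grand total."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    return [[r * c / grand_total for c in col_totals] for r in row_totals]


def row_percentages(table):
    """Each cell as a percentage of its row total."""
    return [[100 * cell / sum(row) for cell in row] for row in table]


def pearson_chi_square(table):
    """Pearson chi-square: sum over cells of (observed - expected)^2 / expected."""
    expected = expected_counts(table)
    return sum(
        (obs - exp) ** 2 / exp
        for obs_row, exp_row in zip(table, expected)
        for obs, exp in zip(obs_row, exp_row)
    )


# Hypothetical 2 x 2 table of counts (rows: one variable, columns: the other).
observed = [[10, 20], [30, 40]]
print(expected_counts(observed))
print(row_percentages(observed))
print(pearson_chi_square(observed))
```

Large differences between observed and expected counts produce a large chi-square value, which is the pattern to look for when reading the Crosstabs output.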
Figure 20 Figure 21
15. Statistical analysis – regression
SPSS contains many facilities to perform statistical analyses. All should be used initially
under the guidance of a statistician as there are many ways of producing the wrong analysis.
The following exercise uses CURVE ESTIMATION to perform more complex regressions.
Exercise 16
Repeat the plot between another pair of variables Z and X. By eye, it does not look as if a
linear regression would be the best fit between these variables Z and X but you can try it
anyway. This time you will use a different method.
Select Analyze→Regression→Curve Estimation... and choose Z and X as appropriate
(see Figure 30). Using all the default selections, fit the curve. Notice that a plot is
automatically prepared with both the observed data and the fitted linear regression being
plotted together. It does not appear to be a good fit, so repeat this part of the exercise
selecting the Quadratic and then finally the Cubic options in the Models selection box.
This is also known as “polynomial regression”.
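The idea behind comparing the Linear, Quadratic and Cubic models can be sketched in Python as a least-squares polynomial fit followed by a comparison of residual sums of squares. This is a minimal sketch solved through the normal equations, not the algorithm SPSS uses; the function names and the data are hypothetical.

```python
def fit_polynomial(xs, ys, degree):
    """Least-squares coefficients [c0, c1, ...] for y = c0 + c1*x + c2*x^2 + ..."""
    n = degree + 1
    # Normal equations A c = b with A[i][j] = sum(x^(i+j)), b[i] = sum(y * x^i).
    a = [[sum(x ** (i + j) for x in xs) for j in range(n)] for i in range(n)]
    b = [sum(y * x ** i for x, y in zip(xs, ys)) for i in range(n)]
    # Gaussian elimination with partial pivoting.
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(a[r][col]))
        a[col], a[pivot] = a[pivot], a[col]
        b[col], b[pivot] = b[pivot], b[col]
        for row in range(col + 1, n):
            factor = a[row][col] / a[col][col]
            for k in range(col, n):
                a[row][k] -= factor * a[col][k]
            b[row] -= factor * b[col]
    # Back substitution.
    coeffs = [0.0] * n
    for i in range(n - 1, -1, -1):
        known = sum(a[i][j] * coeffs[j] for j in range(i + 1, n))
        coeffs[i] = (b[i] - known) / a[i][i]
    return coeffs


def residual_sum_of_squares(xs, ys, coeffs):
    """Sum of squared differences between observed and fitted values."""
    total = 0.0
    for x, y in zip(xs, ys):
        fitted = sum(c * x ** i for i, c in enumerate(coeffs))
        total += (y - fitted) ** 2
    return total


# Hypothetical data that follow a quadratic curve exactly.
xs = [0, 1, 2, 3, 4]
ys = [1, 6, 17, 34, 57]
for degree in (1, 2):
    coeffs = fit_polynomial(xs, ys, degree)
    print(degree, residual_sum_of_squares(xs, ys, coeffs))
```

A clearly smaller residual sum of squares for the higher-degree model corresponds to the visibly better fit seen in the Curve Estimation plot.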
Repeat the previous exercise (Exercise 15) to show a straightforward multiple regression
using "unrelated" explanatory variables. Select W as the dependent variable and U and X as
the independent variables. It is not possible to show this on a two-dimensional graph.
Figure 23
Exercise 17
When regression analysis has been performed it is usual to want to save the predicted (or
fitted) values and residuals and plot them against each other to check that they are randomly
scattered about zero. SPSS allows you to both save and plot at the same time.
Select Analyze→Regression→Linear… and choose the simple regression of Y on X. This
time, however, click on the Save button and select Standardised Predicted Values and
Standardised Residuals.
Click on Continue to return to the previous screen and then click on Plots.
Choose *ZPRED as the X-variable and *ZRESID as the Y-variable. Click on Continue and
then OK. Look at the Output1 - SPSS Viewer and Reganal - SPSS Data Editor windows
to see the results. Did you achieve Figure 24?
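For readers who want to see what is being plotted, the Python sketch below fits a simple regression and standardises the predicted values and residuals. It rests on stated assumptions: predicted values are standardised using their own mean and sample standard deviation, and residuals are divided by the residual standard error sqrt(SSE / (n - 2)). Check the SPSS documentation for the exact definitions it uses. The data are hypothetical.

```python
import math


def simple_regression(xs, ys):
    """Least-squares intercept and slope for y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


def standardised_values(xs, ys):
    """Return (standardised predicted values, standardised residuals)."""
    intercept, slope = simple_regression(xs, ys)
    n = len(xs)
    fitted = [intercept + slope * x for x in xs]
    residuals = [y - f for y, f in zip(ys, fitted)]
    mean_fit = sum(fitted) / n
    # Assumed definitions, as described in the text above this block.
    sd_fit = math.sqrt(sum((f - mean_fit) ** 2 for f in fitted) / (n - 1))
    resid_se = math.sqrt(sum(r ** 2 for r in residuals) / (n - 2))
    zpred = [(f - mean_fit) / sd_fit for f in fitted]
    zresid = [r / resid_se for r in residuals]
    return zpred, zresid


# Hypothetical data, not the contents of the tutorial's regression file.
xs = [1, 2, 3, 4, 5]
ys = [2.1, 3.9, 6.2, 7.8, 10.1]
zpred, zresid = standardised_values(xs, ys)
for p, r in zip(zpred, zresid):
    print(round(p, 3), round(r, 3))
```

If the model is adequate, the pairs printed here would scatter randomly about zero on the vertical axis, which is what the plot of *ZRESID against *ZPRED is used to check.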
Figure 24
[Scatterplot, Dependent Variable: y. Y-axis: Regression Standardized Residual; X-axis: Regression Standardized Predicted Value]
16. Analysis of variance
SPSS also has many commands to perform analysis of variance, depending on how your data
were collected. This exercise demonstrates a simple analysis of variance of a balanced
designed experiment.
Exercise 18
Another file has been created called N:\SPSS_Tut\balaov.sav. Open this into the SPSS
Data Editor window (saving the previous results if required) and look at the data. The file
contains information on the yield of tomatoes from an agricultural experiment which was
set out in 3 blocks (blk) with 2 treatments (side with 2 levels and strain with 4 levels).
Analyse the data in the following way assuming that strain is randomised across all plots in
each block - a factorial randomised block design.
Select Analyze→General Linear Model→Univariate and in the resulting window select
yield as the dependent variable and then blk, side and strain as the fixed factors to be
defined.
Click on Plot and put side and strain into the top two boxes. Click on Continue.
Click on OK.
In order to display the means for side and strain, return to the Univariate dialogue-box (via
Analyze→General Linear Model→Univariate menu), select the Options button and
choose the factors
17. Saving command syntax
A copy of all of the syntax (SPSS command language) used to produce the analysis required
is automatically saved in a file called SPSS.JNL. However, this file is usually overwritten
every time SPSS is invoked.
It is also possible to direct certain parts of the syntax to a separate file for future reference;
for example, you may want to repeat the exact analysis on a new set of data containing the
same variables. It is possible to copy and paste syntax from the Output window into a Syntax
window. You can open a new syntax window by selecting File→New→Syntax… from the
main menu.
Having run a particular dialogue-box (e.g. Descriptive Statistics) and decided that this
would be needed for a subsequent data set you can return to this box and select Paste. A new
icon will appear on the taskbar which if you open it should contain similar information to that
shown in Figure 25.
Figure 25
It is possible to save the contents of the Syntax1 - SPSS Syntax Editor window for use at a
future date. If you want to save any of the command syntax that you have pasted into the
window before exiting then move into the Syntax1 - SPSS Syntax Editor window and click
on File→Save As…. Give a suitable name to the file which is given the default extension of
.sps.
This can be reused in a later session by opening the file straight into the Syntax1 - SPSS
Syntax Editor window and using Run→Current [or use Ctrl-R].
18. Using the HELP menu
All statistical analysis packages include an extensive Help system. Clicking on this will
give context sensitive help on the options available etc.
If you have time, make use of the drop-down menu and look at the Statistics Coach facility
that is loaded in this PC lab. This is a simple introduction to basic analysis.
When finished use File→Exit saving any files you might like to refer to at a later date.
Do seek statistical advice if you are not sure what you are doing, for example from the
Statistical Advisory Service run by Applied Statistics at
http://www.reading.ac.uk/biologicalsciences/appstats/biosci-as_SAS_info.aspx