Spsswin 10
to an SPSS data file. Patients were identified by the numeric variable cpr, and the admission
number (admno) for each patient was identified by:
SORT CASES BY cpr admdate.
COMPUTE admno=1.
IF (cpr=LAG(cpr))admno=LAG(admno)+1.
Fortunately it was detected that something went wrong: The same person apparently could
have more than one first admission. The explanation was that during translation inaccuracies
occurred, and the CPR number 0605401449 could be represented as 605401449.0000...01 or
as 605401448.9999...99, meaning that the lagged comparisons did not work. The solution was
to round cpr to the nearest integer before sorting with:
COMPUTE cpr=RND(cpr).
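The failure and the fix can be reproduced outside SPSS. The following is a minimal Python sketch, not SPSS syntax; the function name admission_numbers and the noise size of 0.000001 are assumptions chosen for illustration. It numbers admissions with the same "compare with the previous case" logic as the LAG commands above.

```python
def admission_numbers(cprs):
    """Number admissions per patient; cprs must already be sorted.

    Mirrors COMPUTE admno=1 followed by
    IF (cpr=LAG(cpr)) admno=LAG(admno)+1.
    """
    admno = []
    for i, cpr in enumerate(cprs):
        if i > 0 and cpr == cprs[i - 1]:
            admno.append(admno[-1] + 1)   # same patient as previous case
        else:
            admno.append(1)               # first case, or a new patient
    return admno

# Three admissions for the same person, with translation noise (sorted).
noisy = [605401448.999999, 605401449.0, 605401449.000001]
cleaned = [round(v) for v in noisy]       # corresponds to RND(cpr)

print(admission_numbers(noisy))    # every case looks like a first admission
print(admission_numbers(cleaned))  # admissions numbered consecutively
```

With the noisy values the comparison with the previous case never succeeds, so each case is numbered 1; after rounding, the three admissions are numbered consecutively.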
To test whether all values of a variable are integers, calculate the remainder after division by 1
and print a frequency table. If the frequency table includes one value only: 0.00000..., all
values of cpr are integers:
COMPUTE test=MOD(cpr,1).
FORMATS test (F20.16).
FREQUENCIES test.
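The same integer test can be sketched in Python for readers who want to check a translated file outside SPSS. This is an illustration only; the helper name non_integer_values and the example values are assumptions, not part of the original job.

```python
import math

def non_integer_values(values):
    """Return the values whose remainder after division by 1 is not 0."""
    return [v for v in values if math.fmod(v, 1.0) != 0.0]

# Hypothetical column of CPR numbers after a faulty translation.
cpr_values = [605401449.000001, 605401448.999999, 605401449.0]

suspect = non_integer_values(cpr_values)
rounded = [round(v) for v in cpr_values]   # corresponds to RND(cpr)

print(len(suspect))                  # number of non-integer values found
print(non_integer_values(rounded))   # empty list: all values are integers
```

An empty result plays the role of a frequency table with the single value 0.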
23. Dates, time, and Danish CPR numbers
Date variables
Date variables are numeric variables; the internal value is the number of seconds since 14 Oct
1582 (start of the Gregorian calendar). In output they can be displayed with different formats;
below I show the EDATE (European date) formats. For other date formats, see Syntax
Reference Guide, Universals.
Reading date variables
In DATA LIST the EDATE format reads a date of the format dd.mm.yy (eg. 06.05.02) or
dd.mm.yyyy (eg. 06.05.2002), depending on whether you specify 8 or 10 columns:
DATA LIST FILE='c:\...\alfa.dat'
/bdate 1-10 (EDATE) opdate 11-20 (EDATE).
If an ASCII file includes date information in a non-date format, (eg. 060502), you should read
the date information as three separate variables (day, month, year) to enable further
calculations:
DATA LIST FILE='c:\...\alfa.dat'
/bday 1-2 bmon 3-4 byear 5-6.
Output formats
EDATE8 displays dates with the format dd.mm.yy (06.05.02). EDATE10 displays
dd.mm.yyyy (06.05.2002). The above command read two dates (a birth date and a date of
operation). Since each variable occupied 10 digits, the output format was automatically set to
EDATE10.
The corresponding FORMATS command is:
FORMATS bdate opdate (EDATE10).
Calculations with dates
Internally, date values are seconds. You can calculate a time interval in years (here an age):
COMPUTE opage=(opdate-bdate)/(86400*365.25).
(1 day=86400 seconds; 1 year=365.25 days).
The DATE.DMY function creates a date variable (seconds since 14 Oct 1582):
COMPUTE bdate=DATE.DMY(bday,bmon,byear).
FORMATS bdate (EDATE10).
1 July 1985 will be displayed as 01.07.1985.
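To make the arithmetic concrete, here is a small Python sketch of the same calculations. It is not SPSS code: the function names date_dmy, edate10 and years_between are my own, and I assume that the internal value counts seconds from midnight at the start of 14 Oct 1582.

```python
from datetime import date, timedelta

EPOCH = date(1582, 10, 14)     # assumed zero point of the internal date value
SECONDS_PER_DAY = 86400

def date_dmy(day, month, year):
    """Seconds since 14 Oct 1582, in the spirit of DATE.DMY."""
    return (date(year, month, day) - EPOCH).days * SECONDS_PER_DAY

def edate10(seconds):
    """Display an internal date value as dd.mm.yyyy."""
    return (EPOCH + timedelta(seconds=seconds)).strftime("%d.%m.%Y")

def years_between(start_seconds, end_seconds):
    """Interval in years, using 1 year = 365.25 days."""
    return (end_seconds - start_seconds) / (SECONDS_PER_DAY * 365.25)

bdate = date_dmy(6, 5, 1940)     # hypothetical birth date
opdate = date_dmy(6, 5, 2002)    # hypothetical date of operation
opage = years_between(bdate, opdate)

print(edate10(date_dmy(1, 7, 1985)))
print(round(opage, 2))
```

Note that the age comes out marginally below 62 on the 62nd birthday, because 365.25 days is only the average length of a year.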
On CPR numbers: extracting key information
Sometimes you get date information as a CPR number. You can read the CPR number as one
variable and the date information from the same columns in the data file:
DATA LIST FILE='c:\...\alfa.dat'
/cprnum 1-10 bday 1-2 bmon 3-4 byear 5-6 control 7-10.
It is also possible to extract the date and sex information from a CPR number read as one
variable. cprnum is a numeric variable; cprstr is the corresponding string variable:
STRING cprstr (A10).
COMPUTE cprstr=STRING(cprnum,F10).
COMPUTE bday=NUMBER(SUBSTR(cprstr,1,2),F2).
COMPUTE bmon=NUMBER(SUBSTR(cprstr,3,2),F2).
COMPUTE byear=NUMBER(SUBSTR(cprstr,5,2),F2).
COMPUTE control=NUMBER(SUBSTR(cprstr,7,4),F4).
The information on sex can be extracted from the control variable, the MOD function
calculating the remainder after division by 2 (male=1, female=0):
COMPUTE sex=MOD(control,2).
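The extraction can be sketched in Python as follows. This is an illustration under assumptions: the function name split_cpr is hypothetical, and the number is padded with leading zeros to 10 digits so that a leading 0 is not lost. The example uses the CPR number quoted earlier in this booklet.

```python
def split_cpr(cprnum):
    """Split a numeric CPR number into day, month, year, control and sex."""
    cprstr = f"{int(cprnum):010d}"      # 10 characters with leading zeros
    control = int(cprstr[6:10])
    return {
        "bday": int(cprstr[0:2]),
        "bmon": int(cprstr[2:4]),
        "byear": int(cprstr[4:6]),
        "control": control,
        "sex": control % 2,             # male=1, female=0
    }

parts = split_cpr(605401449)            # the CPR number 0605401449
print(parts)
```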
Validation of CPR numbers
The modulus 11 test checks the validity of CPR numbers. To check a CPR number, multiply
the digits by 4,3,2,7,6,5,4,3,2,1, and sum these products. The result should be divisible by 11.
In order to perform the test, each digit must be a separate variable:
DATA LIST FILE='c:\...\alfa.dat'
/cprnum 1-10 c1 TO c10 1-10.
Alternatively, the digits can be extracted from the string variable cprstr:
STRING cprstr (A10).
COMPUTE cprstr=STRING(cprnum,F10).
COMPUTE test=0.
DO REPEAT #i=1 to 10
/#x=4,3,2,7,6,5,4,3,2,1.
COMPUTE #c=NUMBER(SUBSTR(cprstr,#i,1),F1).
RECODE #c(missing=0). (1st character in cprstr may be blank, meaning 0).
COMPUTE test=test + #x*#c.
END REPEAT.
Now perform the test and display invalid CPR numbers by:
COMPUTE test=MOD(test,11).
SELECT IF (test>0).
LIST cprnum test.
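For comparison, the modulus 11 test can be written compactly in Python. This is a sketch, not a replacement for the SPSS job; the names WEIGHTS, cpr_checksum and is_valid_cpr are my own, and the number is zero-padded to 10 digits so that a leading 0 counts as the digit 0.

```python
WEIGHTS = (4, 3, 2, 7, 6, 5, 4, 3, 2, 1)

def cpr_checksum(cprnum):
    """Weighted digit sum modulo 11; 0 means the number passes the test."""
    digits = [int(ch) for ch in f"{int(cprnum):010d}"]
    return sum(w * d for w, d in zip(WEIGHTS, digits)) % 11

def is_valid_cpr(cprnum):
    return cpr_checksum(cprnum) == 0

print(is_valid_cpr(605401449))   # the CPR number 0605401449 used earlier
print(is_valid_cpr(605401448))   # last digit changed
```

For 0605401449 the weighted sum is 0+18+0+35+24+0+4+12+8+9 = 110, which is divisible by 11; changing the last digit to 8 gives 109, which is not.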
A year 2000 crisis?
Hardly, but I recommend always recording years with 4 digits.
In CPR numbers the 7th digit includes information on the century of birth:
Pos. 7     Pos. 5-6 (year of birth)
           00-36      37-57      58-99
0-3        19xx       19xx       19xx
4, 9       20xx       19xx       19xx
5-8        20xx       not used   18xx
Source: http://www.cpr.dk
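The century rule can be expressed as a small function. The Python sketch below rests on my reading of the table above (rows are the 7th digit, columns the two-digit year of birth); the function name full_birth_year is hypothetical, and the "not used" combination is returned as None.

```python
def full_birth_year(byear, digit7):
    """Four-digit year of birth from positions 5-6 (byear) and position 7."""
    if not (0 <= byear <= 99 and 0 <= digit7 <= 9):
        raise ValueError("byear must be 0-99 and digit7 must be 0-9")
    if digit7 <= 3:
        return 1900 + byear
    if digit7 in (4, 9):
        return 2000 + byear if byear <= 36 else 1900 + byear
    # digit7 is 5, 6, 7 or 8
    if byear <= 36:
        return 2000 + byear
    if byear >= 58:
        return 1800 + byear
    return None                  # 37-57 with 5-8: "not used" in the table

print(full_birth_year(40, 1))    # 7th digit 1, year 40
print(full_birth_year(2, 4))     # 7th digit 4, year 02
print(full_birth_year(95, 5))    # 7th digit 5, year 95
```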
24. Random samples, simulations
Random number functions
SPSS can create 'pseudo-random' numbers:
COMPUTE y=UNIFORM(x). Uniformly distributed in the interval 0-x (each value has
the same probability).
COMPUTE y=NORMAL(x). Normal distribution, mean=0, SD=x.
COMPUTE y=10+NORMAL(2). Normal distribution, mean=10, SD=2.
A number of other random variable functions are available (see Syntax Reference Guide,
Universals).
If you run the same syntax twice, it will yield different numbers. If you need to reproduce a
series of random numbers, initialize the seed (a large integer used for the initial calculations):
SET SEED = 7654321.
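The effect of a seed is the same in any pseudo-random generator. The Python sketch below illustrates the principle only: Python's generator is not the one used by SPSS, so the numbers will differ from those SPSS produces, and the function name draw is my own.

```python
import random

def draw(seed, n=5):
    """Draw n uniform numbers in the interval 0-1 from a seeded generator."""
    rng = random.Random(seed)
    return [rng.uniform(0, 1) for _ in range(n)]

first = draw(7654321)
second = draw(7654321)     # same seed: the same series again
other = draw(1234567)      # different seed: a different series

print(first == second)
print(first == other)
```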
Random samples and randomization
You may use the SAMPLE transformation to select a random sample of your data set:
SAMPLE 0.1. Selects an approximately 10 per cent random sample.
You may also assign a random number to each case, and use that for selecting cases:
COMPUTE y=UNIFORM(1).
COMPUTE treat=1.
IF (y>0.5) treat=2.
Now the cases are assigned randomly to two treatments.
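The allocation rule can be sketched in Python as below. The function name assign_treatments and the seed are assumptions for illustration, and Python's generator differs from the one in SPSS. Since each case is allocated independently, the two groups will usually be of similar, but not exactly equal, size.

```python
import random

def assign_treatments(n_cases, seed):
    """Assign each case to treatment 1 or 2 from a uniform random number."""
    rng = random.Random(seed)
    treat = []
    for _ in range(n_cases):
        y = rng.uniform(0, 1)
        treat.append(2 if y > 0.5 else 1)
    return treat

treat = assign_treatments(100, seed=7654321)
print(treat.count(1), treat.count(2))   # group sizes
```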
Creating artificial data sets
You may use INPUT PROGRAM to create a working file with 'artificial' data, eg. for
simulation purposes. The following sequence defines a file with 10,000 cases and one
variable (i). Next it is used to study the behaviour of the difference (dif) between two
measurements (x1 x2), given information about components of variance (sdtotal
sdwithin sdbetw).
INPUT PROGRAM.
LOOP i=1 TO 10000.
END CASE.
END LOOP.
END FILE.
END INPUT PROGRAM.
EXECUTE.
COMPUTE sdtotal=20.
COMPUTE sdwithin=10.
COMPUTE sdbetw=SQRT(sdtotal**2-sdwithin**2).
COMPUTE x0=50+NORMAL(sdbetw).
COMPUTE x1=x0+NORMAL(sdwithin).
COMPUTE x2=x0+NORMAL(sdwithin).
COMPUTE dif=x2-x1.
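The same simulation can be sketched in Python to see what to expect from it. The function name simulate and the seed are assumptions, and the random numbers differ from those SPSS generates. Since x0 cancels in the difference, the standard deviation of dif should be close to SQRT(2)*sdwithin (about 14.1), while the standard deviation of x1 should be close to sdtotal (20).

```python
import math
import random
import statistics

def simulate(n_cases=10000, sdtotal=20.0, sdwithin=10.0, seed=7654321):
    """Return the observed SD of x1 and of dif=x2-x1."""
    rng = random.Random(seed)
    sdbetw = math.sqrt(sdtotal**2 - sdwithin**2)
    x1_values, dif_values = [], []
    for _ in range(n_cases):
        x0 = 50 + rng.gauss(0, sdbetw)       # the person's true level
        x1 = x0 + rng.gauss(0, sdwithin)     # first measurement
        x2 = x0 + rng.gauss(0, sdwithin)     # second measurement
        x1_values.append(x1)
        dif_values.append(x2 - x1)
    return statistics.stdev(x1_values), statistics.stdev(dif_values)

sd_x1, sd_dif = simulate()
print(round(sd_x1, 1), round(sd_dif, 1))
```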
25. Exchange of data with other programs
Possibilities vary somewhat between SPSS versions. Use the menus:
File > Save as... and File > Open > Data
and pick the appropriate file type.
If you need to exchange data with SPSS on other platforms (e.g. UNIX), create a file in
portable format (.por) which is common to all SPSS versions.
You may read and write e.g. Excel (.xls) and dBase (.dbf) files. SPSS versions prior to 10
read only Excel version 4.0 files.
SPSS writes Excel files version 4.0. The syntax is:
SAVE TRANSLATE OUTFILE='c:\...\....xls'
/TYPE=XLS
/KEEP=v1 v3 v4 v7
/FIELDNAMES.
The /FIELDNAMES subcommand instructs SPSS to write variable names to the first row in
the Excel worksheet.
DBMS/COPY and Stat/Transfer
These versatile programs translate between a large number of statistical packages.
Writing and reading ASCII files
Any spreadsheet or analysis program can write and read ASCII files. Exchange of information
via ASCII files is not very practical: any data documentation (variable names, labels, missing
values, etc.) is lost and must be defined again.
Precautions
Translation between programs may go wrong. Always check if the translation worked as
intended, by comparing the contents of the source and the target file. Especially missing value
definitions sometimes go wrong. Also take care with date variables. An example:
SigmaPlot imports Excel files, and SPSS data can thus be transferred to SigmaPlot via Excel.
SPSS SYSMIS is translated to #NULL! in Excel, and SigmaPlot translates #NULL! to 1, a
quite likely valid value.
Appendix 1. Exercises
The purpose of these exercises is to learn SPSS by doing.
READ THIS BEFORE YOU START.
You should start by setting preferences (see section 6).
Next, copy the files needed for exercises to your hard-disk.
I strongly recommend that you enter commands in the syntax window by writing them
(occasionally by pasting and editing them) before execution (see section 8). The reasons for
this recommendation are:
- You will soon learn that it is much faster to write commands in the syntax window than to
  zap around in the menus. It is easy to learn the fundamental commands and to recall them.
- Using the command language you have a nice tool to plan what to do. Intuitive computing
  has its merits, but if you are going to produce results of any importance, planning is a
  good idea.
- The syntax file documents what you did, while it can be impossible to reproduce a series
  of clicks with the mouse.
I also recommend that you save the syntax file for each question (2g.sps being the syntax
file for question 2g). Once you have saved a syntax file, delete its contents from the syntax
window to avoid confusion. This means that for most questions you should create a new
syntax file, starting with a GET FILE command.
For exercise 2g the syntax file (c:\dokumenter\spsskurs\2g.sps) should look like this:
get file='c:\dokumenter\spsskurs\ryge1.sav'.
frequencies tobacco.
list variables=cigaret cheroot pipe tobacco
/cases from 1 to 50.
Doing this, you will for each question have a good documentation of what you did. Including
the GET FILE command means that you will be certain what data set actually was analysed.
Some jobs create new system files. This syntax file must be saved for documentation (note the
recommendation on file names, section 9).
Delete unnecessary text in your output window before printing. In some cases you might want
to save the output (1a.spo being the output from question 1a).
In the exercise questions I sometimes give a hint about the procedure to be used (eg.
DISPLAY). Look up the command syntax in this booklet.
Exercise 1
Objective: To get used to running SPSS jobs and to be acquainted with various procedures
and their output. The exercise uses the SPSS system file beer.sav.
The meaning of variable names etc. is:
Variable   Meaning                Codes
ID         Brand of beer
RATING     Rating                 1 excellent
                                  2 good
                                  3 not good
COUNTRY    Country of origin
COST       Price, $ per bottle    0 missing
CALORIES   Kcal / litre           999 missing
SODIUM     Sodium g/l             99 missing
ALCOHOL    Alcohol vol per cent   99 missing
a) Look at file contents in the data window.
b) Create a list of variables in beer.sav, including labels etc. (DISPLAY). When you
have succeeded, print the output.
c) Create an overview of minimum and maximum values for all variables in beer.sav
(DESCRIPTIVES). Print it.
d) Create frequency tables for all variables in beer.sav (FREQUENCIES).
e) Examine the relationship between price and rating (MEANS).
f) Describe the distribution of rating in different price groups (CROSSTABS). cost
must first be grouped in eg. 3 groups (RECODE). When you have recoded, it is a good
idea to check that the result is correct (LIST).
g) Make a graphical description (GRAPH /SCATTERPLOT) of the relation between price
and alcohol content.
h) Examine other relationships which you might find interesting.
Exercise 2
Objective: Learn to create an SPSS system file from an ASCII data file. Further experience
with output.
Your input is the ASCII data file ryge.dat; it is concerned with smoking. The format of
ryge.dat is shown in the Codebook below:
Variable  Meaning                          Values               Digits  Position
ID        ID number                        1-250                3       1-3
SEX       Sex                              1 male               1       4
                                           2 female
                                           9 no information
AGE       Age in years                     0-98                 2       5-6
                                           99 no information
WEIGHT    Weight in kg                     40-150               3       7-9
                                           999 no information
HEIGHT    Height in cm                     100-250              3       10-12
SMOKER    Smoker?                          0 no                 1       13
                                           1 current smoker
                                           2 former smoker
                                           9 no information
CIGARET   Cigarettes/day                   0-98                 2       14-15
                                           99 no information
CHEROOT   Cigars or cheroots per day       0-98                 2       16-17
                                           99 no information
PIPE      Packs of pipe tobacco per week   0-8                  1       18
                                           9 no information
a) See ryge.dat on your screen by opening it in e.g. NotePad or a word processor.
Is ryge.dat an ASCII file?
b) Create the system file ryge.sav from ryge.dat (see an example in section 12).
You should define Variable labels, Value labels, and Missing values. You should
name the syntax-file gen.ryge.sps (see section 9 on recommended file names).
c) See ryge.sav on your screen by opening it in e.g. NotePad. Is it an ASCII file?
(NB! Don't print from the data window; you waste a lot of paper).
d) Do the same exercises with ryge.sav as in exercise 1, question c and d. However,
don't create a frequency table for id. Print the tables; you need them for the next
questions.
e) Examine graphically the relation between height and weight (GRAPH /SCATTERPLOT).
Do the same for women only (SELECT IF).
f) Create a new variable, agegrp, which is a reasonable grouping of age (RECODE).
Create a new variable, tobacco: tobacco use in grams per day (1 cigarette = 1 g, 1
cigar/cheroot = 2 g, 1 pack of pipe tobacco = 40 g) (COMPUTE). Define labels for
agegrp and tobacco (VARIABLE LABELS). Create a new system file, ryge1.sav,
including the two new variables (SAVE OUTFILE).
What name did you give the syntax file creating ryge1.sav? Did you save it?
g) From ryge1.sav: see a frequency table for tobacco. Compare with the frequency
tables for cigarettes etc. (from question 2d) and decide if the result makes sense. Also,
use LIST for the first 50 cases to see if calculations have been made as intended. If
wrong, redo exercise 2f.
h) agegrp could have been made with COMPUTE, using the TRUNC function. Try to do
that. (Don't feel sorry if you can't find out).
i) Describe the joint age and sex distribution of the study population (CROSSTABS).
j) Create a new variable, bmi (Body Mass Index) = weight/height² (weight in kg, height
in m). See a frequency table for bmi. See the average bmi by sex and age groups
(MEANS). Test if the bmi distribution is different for men and women (T-TEST).
k) Group bmi in 3 groups (RECODE). See the grouped bmi distribution by age and sex
(CROSSTABS).
l) kbmi (invented just for the sake of this exercise) is a corrected Body Mass Index. For
women kbmi=bmi. For men, kbmi is 90% of bmi. Examine the relationship
between kbmi and age (GRAPH, MEANS, CROSSTABS). When reasonable, group
age and kbmi.
m) Make a list of all men, showing the variables id age weight height bmi kbmi
(SELECT IF, LIST). For the new variables, you should first define the number
of decimals etc. in the output (FORMATS).
n) Create an ASCII data file rygm.dat (WRITE) with the same content as listed in
question 2m. See it on the screen. (An ASCII data file can be used as input to other
software).
Exercise 3
Objective: Experience with the whole process of data collection, preparation for data entry,
data entry, analysis, and documentation.
Imagine a survey among 15 persons. The questionnaire looked like this:
Questionnaire number:
Sex: [ ] Male [ ] Female
Which year were you born?
At what level did you leave school?
Before finishing 9th grade ............................................. 1
After 9th grade............................................................... 2
After 10th grade............................................................. 3
After high school (gymnasium) ..................................... 4
Other .............................................................................. 5
Do you have a vocational education? (write)
The information in the 15 questionnaires was:
Questionnaire:                      1 2 3 4 5 6 7 8 9 10 11 12 13 14 15
Sex:                                M F F F M M F F M M M F F M F
Year of birth (14 entries):         1940 1963 1936 1943 1950 1947 1964 1961 1957 1932 1939 1947 1951 1957
School education (14 entries):      4 3 1 4 2 3 4 4 5 1 2 2 3 4
Vocational education (14 entries):  Physician; Office clerk; None; Architect; Mason; Carpenter; Nurse;
                                    Social worker; Sailor; None; Tailor; Shop assistant; None; Law school
a) Write a codebook (using pencil and paper or a word processor) for the study, as shown
in exercise 2. Give numerical codes to sex. Group vocational education by your own
choice and give numerical codes.
b) In EpiData prepare for entering data by creating the dataset definition file educ0.qes
and the data entry form educ0.rec. See more on EpiData in section 12 B and in Take
good care of your data, appendix 7.
c) From EpiData print the data documentation for educ0.rec. Compare with your
codebook.
d) Enter data in EpiData and save the EpiData file educ1.rec and the SPSS file
educ1.sav.
e) In SPSS create the variables age (age by 31 December 1988) and agegrp (age in
groups 0-4, 5-14, 15-24, . . . , 65+). Save the file with the new variables as
educ2.sav.
f) Print the key tables for your data (DESCRIPTIVES, FREQUENCIES).
g) Create whatever tables you find interesting.
This exercise reflects the typical sequence of an investigation. At the end of the exercise you
should have the following vital documents and files:
Written documents   Syntax files    Data files
Codebook            educ0.qes       educ0.rec   Empty EpiData file
Questionnaires                      educ1.rec   EpiData file with data
                                    educ1.sav   SPSS data set
                    gen.educ2.sps   educ2.sav   File with added variables
The codebook, the questionnaires, the syntax files, and the data files should be stored in a safe
place (safety, documentation, accountability).
Appendix 2. On documentation and safety.
When keeping financial accounts you should be able to document all expenses by
identification of the original vouchers. This is accomplished by giving each voucher a unique
number, and by enabling you, and the auditor (revisor), to go back from the final balance
sheet to each voucher (the audit trail).
When working with data you should be able to document each piece of information by
identification of the original document (eg. questionnaire). This means that an ID (case
identifier) must be included both in the original documents and the data set. All modifications
to the data set must be documented (syntax files), and each analysis must be documented
(syntax files).
One purpose of this is to enable external audit (revision), but the main purpose is to protect
yourself against mistakes, errors, and loss of information.
Data documentation procedures must be included all the time when working with data;
otherwise it can be impossible, or at least very time-consuming, to reconstruct what
happened.
Source data:
Questionnaire, hospital records, etc.
Codebook:
Describes rules for coding of source data (see example in exercise 2)
If reading data from ASCII data file
Format (location of information) is described in the codebook
Syntax file creating first generation of SPSS system file:
DATA LIST FILE= 'c:\dokumenter\...\alfa.dat'
/....
VARIABLE LABELS...
VALUE LABELS...
MISSING VALUES...
SAVE OUTFILE='c:\dokumenter\...\alfa.sav'.
This syntax file could have the name gen.alfa.sps (see section 9 on recommended
filenames).
Error checks:
DESCRIPTIVES ALL. to see minimum and maximum values for all variables
FREQUENCIES v1 v7. to see more if needed for selected variables
CROSSTABS v1 BY v7. to check impossible combinations (e.g. pregnant males)
LIST. to identify cases with suspected errors:
TEMPORARY.
SELECT IF (sex > 2).
LIST id sex.
Corrections:
It is easy to change values in the data window, but it is dangerous, and it leaves no
documentation. I strongly recommend making corrections in syntax:
GET FILE='c:\dokumenter\...\alfa.sav'.
IF (id = 2473)sex=2.
IF (id = 2715)bday=17.
SAVE OUTFILE='c:\dokumenter\...\alfa1.sav'.
This syntax file could have the name gen.alfa1.sps.
Creating next generation of a data set:
GET FILE = 'c:\dokumenter\...\alfa1.sav'.
(transformations)
VARIABLE LABELS...
VALUE LABELS...
MISSING VALUES...
SAVE OUTFILE='c:\dokumenter\...\alfa2.sav'.
This syntax file could have the name gen.alfa2.sps.
Analyses:
You should be able to document the analyses leading to the published tables (syntax files).
For this reason (and to be sure to analyse the data set intended) include a GET FILE
command before the analysis:
GET FILE='c:\dokumenter\...\alfa2.sav'.
SELECT IF (sex = 1).
CROSSTABS agr BY item7.
If this created the information for table 7 you might save it as tab7.sps.
Remove external identification:
The data protection authorities (Registertilsynet) require that you remove external
identification from your analysis file as soon as possible. The syntax file
gen.alfakey.sps:
GET FILE='c:\dokumenter\...\alfa2.sav'.
SORT CASES BY id.
SAVE OUTFILE='a:\alfakey.sav'
/KEEP=id cpr.
SAVE OUTFILE='c:\dokumenter\...\alfa3.sav'
/DROP=cpr.
The key file (alfakey.sav) linking the internal identification (id) with the external
identification (cpr) should be stored separately (ie. not on the same computer as the
information). Here I used a diskette, but beware: diskettes are not very stable, so make an
extra backup copy.
If you later need to include cpr:
MATCH FILES FILE='a:\alfakey.sav'
/FILE='c:\dokumenter\...\alfa3.sav'
/BY id.
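The principle of splitting off a key file and later joining it back by id can be sketched in Python. This is an illustration only: the function names and the records (including the made-up cpr values) are hypothetical, and unlike MATCH FILES the join below keeps only cases present in the analysis file.

```python
def split_off_key(records):
    """Return (key_file, analysis_file): id+cpr sorted by id, and data without cpr."""
    key_file = sorted(
        ({"id": r["id"], "cpr": r["cpr"]} for r in records),
        key=lambda r: r["id"],
    )
    analysis_file = [{k: v for k, v in r.items() if k != "cpr"} for r in records]
    return key_file, analysis_file

def add_cpr_back(key_file, analysis_file):
    """Join cpr onto the analysis file again, matching on id."""
    cpr_by_id = {r["id"]: r["cpr"] for r in key_file}
    return [dict(r, cpr=cpr_by_id.get(r["id"])) for r in analysis_file]

# Hypothetical data set with made-up identification numbers.
records = [
    {"id": 2, "cpr": 2222222222, "sex": 2},
    {"id": 1, "cpr": 1111111111, "sex": 1},
]
key_file, analysis_file = split_off_key(records)
restored = add_cpr_back(key_file, analysis_file)
print(analysis_file)
```

The analysis file carries only the internal id, so it can be used without exposing cpr; the link exists only in the key file.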
ANOTHER NOTE ON FILE NAMES
Data input:  ALFA.DAT (ASCII data file)
Syntax file: GEN.ALFA.SPS
    DATA LIST FILE =
      'c:\dokumenter\..\alfa.dat'
      / (variable list) .
    VARIABLE LABELS...
    VALUE LABELS...
    MISSING VALUES...
    SAVE OUTFILE =
      'c:\dokumenter\..\alfa.sav'.
Result:      ALFA.SAV (1st generation SPSS data set)

Data input:  ALFA.SAV
Syntax file: GEN.ALFA1.SPS
    GET FILE =
      'c:\dokumenter\..\alfa.sav'.
    (transformations; create new variables)
    VARIABLE LABELS...
    VALUE LABELS...
    MISSING VALUES...
    SAVE OUTFILE =
      'c:\dokumenter\..\alfa1.sav'.
Result:      ALFA1.SAV (2nd generation SPSS data set)

Data input:  ALFA1.SAV
Syntax file: TAB1.SPS
    GET FILE =
      'c:\dokumenter\..\alfa1.sav'.
    CROSSTABS agegrp sex BY treat.
    MEANS age BY treat BY sex.
Result:      Analyses for table 1

Data input:  ALFA1.SAV
Syntax file: TAB2.SPS
    GET FILE =
      'c:\dokumenter\..\alfa1.sav'.
    (another analysis)
Result:      Analyses for table 2
Syntax files worth keeping forever:
- Syntax files generating new versions of the data set (gen.alfa.sps,
  gen.alfa1.sps). The purpose of the prefix (gen.) is to enable you to identify them
  easily and safely. These syntax files must include both the name of the input data file
  (DATA LIST or GET FILE) and the output data file (SAVE OUTFILE).
- Syntax files generating information for your final publication. Give them names like
  tab1.sps, tab2.sps. These syntax files must include the name of the input data file
  (GET FILE) to avoid ambiguity on which data set was actually used.
Syntax files probably not worth keeping (forever):
- Interim analyses not resulting in information for the final publication.
Appendix 3. SPSS modules and manuals
Module: Base
The module includes: All data handling and transformation procedures. Descriptive statistics,
Analysis of variance, linear regression etc.
Manuals (with US$ price) and comments:
- SPSS Base 10.0 User's Guide Package (US$ 49). The manual is good in describing operations
  via the menu system while syntax information is virtually absent.
- SPSS Base 10.0 Syntax Reference Guide (US$ 49). A systematic description of the complete
  syntax. During installation from CD-ROM you may download the manual in PDF format on
  your computer.
- SPSS Base 10.0 Applications Guide (US$ 49). Nice introduction to a variety of statistical
  analyses.

Module: Advanced Models
The module includes: General linear models, survival analysis including Kaplan-Meier and Cox
regression.
Manual: SPSS 10.0 Advanced Models (US$ 49). Needed eg. for survival analysis.

Module: Regression Models
The module includes: Binomial and multinomial logistic regression, nonlinear regression.
Manual: SPSS 10.0 Regression Models (US$ 49). Needed eg. for logistic regression.

Module: Tables
The module includes: Complex tabulations for presentation.
Manual: SPSS Tables 8.0 (US$ 41). Few users need Tables.

Module: Missing value analysis
The module includes: Examine missing value patterns. Tools for substitution of missing values.
Manual: SPSS Missing Value Analysis 7.5 (US$ 38). Substituting missing values should be done
with care, or not at all.

Module: Trends
The module includes: Time series and forecasting analysis.
Manual: SPSS Trends 10.0 (US$ 29). Hardly used in health research.

Module: Conjoint
The module includes: Conjoint analysis.
Manual: SPSS Conjoint 8.0 (US$ 20). Hardly used in health research.

Module: Categories
The module includes: Correspondence analysis.
Manual: SPSS Categories 10.0 (US$ 39). Hardly used in health research.

The primary manual is: SPSS Base 10.0 User's Guide. It describes how to use the menu
facilities in SPSS for Windows.
The command language (common to a number of SPSS platforms) is described in SPSS Base
10.0 Syntax Reference Guide. This guide is also included in the installation CD-ROM.
If you have manuals version 8 or 9 you probably don't need to replace them by version 10.
Manuals can be purchased from:
Polyteknisk Boghandel, Anker Engelundsvej 1, 2800 Lyngby, Tel. 4588 1488.
Appendix 4. A few remarks on Windows
It is rather unsafe to use any program without mastering the fundamental structure and
facilities in Windows. There are nice and cheap booklets for sale in many kiosks.
My main comments and recommendations apply to handling of the folder (directory,
bibliotek, mappe) structure. There are several ways to move and copy files; I only show one
technique.
Create a smart folder structure
Don't mix your own data and documents with program files; this is risky and will inevitably
lead to confusion.
Create a main folder for all of your own files (data, syntax files, text documents), eg.
C:\DOKUMENTER, with all of your own files in subfolders under your main folder.
Example of folder structure.
C:\
    Programs
        EpiData
        SPSS
        Stata
        WordPerfect
        WinZip
        Games
            Solitaire
            Doom
    Windows
    Dokumenter
        Personal
            CV
            Secrets (encrypted)
        Project 1
            Protocol
            Administration
            Data
            Safe
            Manuscripts
        Project 2
            Protocol
            Administration
            Data
            Safe
            Manuscripts
C:\ is the root folder.
Program folders should include programs only, never data nor documents created by yourself.
C:\Dokumenter is your own main folder. All of your own data and documents should be
placed in subfolders under your main folder.
Organize the folders by subject, not by file type.
This structure:
- Makes it easy for you to locate your own files.
- Facilitates the selection of files to be backed up
(C:\Dokumenter and its subfolders).
This structure has several advantages:
a. You avoid mixing own files with program files
b. You can select your main folder (C:\DOKUMENTER) as the default root folder for all
of your own folders (see below, and section 6: setting preferences), so that when
opening or saving files, you see only your own folders, not the program folders.
c. It is much easier to set up a practical backup procedure.
Use Windows Explorer (Stifinder)
I recommend using Explorer rather than My Computer, and putting a shortcut on your desktop:
- Right-click the [Start] button and select Open
- Open the Programs folder
- Select the Explorer shortcut icon and copy it to the clipboard by [Ctrl]+[C]
- Click anywhere on the desktop and paste the icon by [Ctrl]+[V]
Make your main folder default when opening Explorer
- Right-click the Explorer shortcut icon
- Properties > Shortcut > Path >
  (Egenskaber > Genvej > Sti > )
  C:\WINDOWS\EXPLORER.EXE /n, /e, C:\dokumenter
Make Explorer display file name extensions
For reasons not understood by me, Microsoft decided not to display file name extensions by
default. This is inconvenient (you can not distinguish the syntax file alpha.sps from the
data file alpha.sav), and you should set Explorer to display file name extensions.
- Open Explorer
- View > Options
  (Vis > Indstillinger)
- Uncheck: "Hide MS-DOS file extensions"
  ("Undlad at vise MS-DOS filtyper")
How to create a new folder
The example is to create the folder PROJECT3 under C:\DOKUMENTER
- Double-click the Explorer (Stifinder) icon on the desktop
- Click C:\DOKUMENTER (root folder for own files)
- Files > New > Folder
  (Filer > Ny > Mappe)
- Rename 'New Folder' (Ny Mappe) to 'project3'
How to rename a folder or file
- In Explorer, right-click the folder or file and select Rename
- Write the name desired and press [Enter]
How to copy a file or a folder to another folder or to a diskette
- In Explorer, highlight the source file or folder; press [Ctrl]+[C] (copy to clipboard)
- Move to the target folder (or A:); press [Ctrl]+[V] (paste from clipboard)
How to move a file or a folder to another folder
- In Explorer, highlight the source file or folder icon; press [Ctrl]+[X] (copy to clipboard
  and delete source file)
- Move to the target folder; press [Ctrl]+[V] (paste from clipboard)