SPSS for Applied Sciences: Basic Statistical Testing
By Cole Davis
About this ebook
This book offers a quick and basic guide to using SPSS and provides a general approach to solving problems using statistical tests. It is comprehensive in both the tests covered and the applied settings it refers to, yet it is short and easy to understand. Whether you are a beginner or an intermediate level test user, this book will help you to analyse different types of data in applied settings. It will also give you the confidence to use other statistical software and to extend your expertise to more specific scientific settings as required.
The author does not use mathematical formulae and leaves out arcane statistical concepts. Instead, he provides a very practical, easy and speedy introduction to data analysis, offering examples from a range of scenarios from applied science, handling both continuous and rough-hewn data sets.
Examples are given from agriculture, arboriculture, biology, computer science, ecology, engineering, farming and farm management, hydrology, medicine, ophthalmology, pharmacology, physiotherapy, spectroscopy, sports science, audiology and epidemiology.
PART ONE
Pre-test considerations
CHAPTER 1
Introduction
WHAT THIS BOOK DOES
After an introduction which should be invaluable to beginners and those returning to statistical testing after a break, this book introduces statistical tests in a well-organised manner, providing worked examples using both parametric and non-parametric tests.
Whether you are a beginner or an intermediate level test user, you should be able to use this book to analyse different types of data in applied settings. It should also give you the confidence to use other statistical software and to extend your expertise to more specific scientific settings as required.
This book assumes that many applied researchers, scientific or otherwise, will not want to use statistical equations or to learn about a range of arcane statistical concepts. Instead, it is a very practical, easy and speedy introduction to data analysis in the round, offering examples from a range of scenarios from applied science, handling both continuous and rough-hewn data sets.
Examples will be found from agriculture, arboriculture, audiology, biology, computer science, ecology, engineering, epidemiology, farming and farm management, hydrology, medicine, ophthalmology, pharmacology, physiotherapy, spectroscopy and sports science. These disciplines have not been covered in depth, as this book is intended to provide a general approach to solving problems using statistical tests.
The output, with permission from IBM, comes from SPSS (PASW) Student Version 18, for the purpose of the widest usability, and the Advanced Module of SPSS 20. It is completely compatible with SPSS versions 17 to 20 (including those packages with the title PASW) and will generally be usable with earlier editions. As SPSS tends not to change much over the years, this book is likely to be relevant for quite some time. SPSS features are used selectively here for the sake of clarity. Various manuals and handbooks are available on the internet and in print for those eager to know every possible detail of its use.
Similarly, as the book is essentially about statistical testing, research design is generally only touched on for the purposes of clarity. Again, there are a lot of sources of information out there, especially relating to different specialisms.
In contrast to many books on statistics, I favour coherence over conceptual comprehensiveness, although as will be seen, this book offers some tests not usually found in other introductory books.
THE ORGANISATION OF CONTENT
Although many core concepts are presented in the first part of the book, which should definitely be read by newcomers to statistical testing, other ideas appear where they logically arise. Although mathematics is barely touched upon, statistical jargon is introduced, as you will meet it in SPSS and other software as well as in research papers which you may read or even find yourself writing. Descriptive statistics are introduced, as they are important in the preliminary analysis of data, but are dealt with sparingly: inferential statistics are at the heart of statistical testing. The first part of the book also offers a quick and basic guide to using SPSS.
The second part of the book comprises the tests. Each test is accompanied by at least one worked example. Where possible, non-parametric equivalents are provided in addition to parametric tests; we recognise that data sets in the real world are not always as blandly measurable as we would wish them to be.
The chapter on experiments and quasi-experiments – essentially, the analysis of differences – is fairly conventional, apart from equal consideration being given to non-parametric tests as useful tools in applied settings. Factorial analysis of variance (e.g. two-way ANOVA) is also covered, although a discussion about the analysis of covariance (ANCOVA) is deferred until the brief chapter on advanced techniques.
The chapter on the frequency of observation – also known as qualitative (or categorical) analysis – offers a broader set of practical usages than in most introductory texts.
Survival analysis is also new to general introductory texts, but given its wide applicability outside the world of medicine, I prefer to call it the analysis of the time until events. Although this is also qualitative in nature, it is so different in function as to be worthy of a separate chapter.
The next chapter starts with correlations, but goes beyond some contemporary texts in introducing multiple regression, which is increasingly used in applied settings. It also provides a stripped-down account of factor analysis, which will meet the needs of people on master’s and doctoral projects (and others) who find themselves needing to use this technique in a hurry. Many so-called simple introductions are nothing of the sort. The core coverage provided here meets immediate needs, but will also make it easier to absorb more in-depth texts when necessary.
The third part of the book includes a short set of exercises. Problems in the real world are not usually accompanied by signposts saying ‘this problem involves correlations’, so I have avoided the common practice of putting a quiz at the end of each chapter. I think it makes most sense to tackle exercises once you have an overall grasp of what you have read and the experience of having worked through the preceding worked examples.
The chapter on reporting is intended for organisations with practical concerns; academic writers will need to use works of reference specific to their disciplines or universities. The book concludes with a brief summary of a few advanced statistical techniques.
DATA SETS AND ADDITIONAL INFORMATION
The data sets are small, to avoid lengthy data entry or the need for internet downloads. Following the same logic, some data sets are built upon as each chapter progresses. While the worked examples should be of interest to various practitioners, it should be noted that the data sets are for learning purposes only and are fictional unless there is a clear statement to the contrary.
The book contains various ‘discussion points’, which draw the reader’s attention to statistical topics that are philosophically interesting or controversial.
On the subject of controversy, I may add that independent researchers will find SPSS to be rather an expensive piece of software. A cheaper option is StatsDirect. I wrote a book to accompany this package (Davis 2010), but do note that the data sets and texts are similar in both books. I do not recommend buying both. If a choice has to be made, then this book is more comprehensive in its range of tests and concepts.
HOW TO USE THIS BOOK
If you do not have the time to read the whole book, it is still a good idea to read the introductory part before homing in on the chapter of interest. If time dictates dipping into a single chapter, then try to read the whole chapter and follow the worked examples.
References to statistical theory may be skipped over by first time readers, but they may in time improve your understanding of the issues. When you have a full grasp of this book, you should be able to use other software and more advanced tests.
ACKNOWLEDGEMENTS
I would particularly like to thank Dr George Clegg, a scientist with experience in academic research and the defence industry, who asked some hard questions about what I intended to write. Thanks are also due to Nick Jones for his encouragement during the development of this book, and Ofra Reuven, statistician and data analyst, for her speedy and reliable help creating images and checking through my data.
Permission was granted by IBM to use screenshots from the IBM statistical testing package.
I would also like to thank the Orwell Estate for their goodwill over the dedication of this book. George Orwell’s essays and books have given me food for thought and themes for debate over the decades. His integrity stands as a beacon.
The responsibility for any shortcomings remains my own.
DISCUSSION POINT
Statistical testing is like driving a car. You need to know where you are going and what to do when you get there, but the workings of the engine need not necessarily bother you. It is my contention that formulae are of little relevance to effective data analysis.
CHAPTER 2
Descriptive and inferential statistics introduced
DESCRIPTIVE STATISTICS
This book is primarily about inferential statistics, generalising from limited data, but some knowledge of descriptive statistics is essential. When we have all the data, the entire population rather than a sample, descriptive statistics may tell us all we need to know. When looking at samples, the descriptive data helps us to decide which statistical tests to use and indeed if any tests should be used. The statistical concepts discussed (lightly) here underlie what the tests try to achieve.
A statistic is a number which represents or summarises data. Descriptive statistics reveal how much data is involved and its shape.
There are times when an absolute number gives us what we want. We can have 99 red balloons, 20 000 drug addicts and 101 Dalmatians. There are also simple representative statistics such as the range, the maximum minus the minimum: if the maximum is 206 and the minimum is 186, then the range statistic is 20.
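For readers who like to check such figures outside SPSS, the range can be sketched in a few lines of Python. This is only an illustration: Python is not used in this book, and apart from the maximum (206) and minimum (186) quoted above, the values in the list are hypothetical.

```python
# Hypothetical measurements; only the maximum (206) and the
# minimum (186) come from the example in the text.
values = [186, 191, 198, 206]

# The range statistic is the maximum minus the minimum.
value_range = max(values) - min(values)
print(value_range)  # 20
```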
Measures of central tendency
When we contrast groups of data, we run into the limitations of absolute numbers. For example, the comparison of the effects of alcohol intake between individuals may be misleading if we do not take into account the size of the individual. Therefore, we tend to use central tendency as one of the ways to reduce irrelevant differences.
The measure of central tendency is also sometimes referred to as the ‘average’. However, the term average is problematic in more than one way.
Part of the problem is that of interpretation. We can see the dubious nature of the layman’s ‘average’ when we consider newspaper articles that refer to ‘average pay’. I do not know which average is being referred to – the mean, the mode or the median – and it is likely that the journalist is similarly unsure. A related problem is that the word ‘average’ is associated by many with just one particular measure of central tendency, the mean. This being the case, ‘central tendency’ is to be preferred when referring to statistical principles. (However, there are times when ‘average’ slips more easily from the tongue, pen or keyboard.)
THE MEAN
The mean adds the numbers in the data set and divides the sum by the number of items, as in this simple example: 2, 3, 3, 4, 8. The sum, Σ, = 20. The number of items, N, = 5. The mean is therefore Σ / N: 20/5 = 4.
If we use the mean to calculate the central tendency in workers’ salaries, the strength of this method is that it takes into account everyone from multimillionaires to the lowest paid. This is also its weakness, as the presence of one or two billionaires could provide a highly unrepresentative statistic.
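The calculation of the mean, and its sensitivity to one extreme value, can be sketched in Python as follows. The salary figures are invented purely for illustration; only the data set 2, 3, 3, 4, 8 comes from the text.

```python
from statistics import mean

# The small data set from the text.
data = [2, 3, 3, 4, 8]
total = sum(data)       # the sum: 20
n = len(data)           # the number of items: 5
data_mean = total / n   # 20 / 5
print(data_mean)        # 4.0

# Hypothetical salaries, to show how one extreme value drags the mean.
salaries = [20_000, 22_000, 25_000, 30_000, 40_000]
with_tycoon = salaries + [1_000_000_000]  # add a single billionaire

print(mean(salaries))     # mean of the five ordinary salaries
print(mean(with_tycoon))  # mean once the billionaire is included
```

With the billionaire included, the mean is far above what any of the other five people earn, which is the weakness described above.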
THE MODE
The mode is the number which appears most frequently in a data set, in this case the number 3.
The mode will successfully ignore the presence of our uber-tycoons, as most salary earners may well be clerical workers. But how representative is this of the earnings of the workforce in general?
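As a rough sketch, the mode of the same small data set can be found by counting how often each value appears. The Python below is an illustration only and is not part of the SPSS procedures in this book.

```python
from collections import Counter
from statistics import mode

data = [2, 3, 3, 4, 8]

# Count how many times each value occurs.
counts = Counter(data)
print(counts[3])  # 2: the value 3 appears twice, every other value once

# The standard library gives the same answer directly.
data_mode = mode(data)
print(data_mode)  # 3
```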
THE MEDIAN
The median is the value in the middle of the string of numbers when they are placed in order from smallest to biggest. We count inwards from both ends of our tiny data set, discounting first the 2 and the 8, then the outer 3 and the 4, leaving the central 3 in the middle as the median.
In our industrial example, the median statistic may find a middle-manager’s salary. This could also be useful, but it does not render the most common wage, for which we need the mode, nor does it take into account the purchasing power of the extremely rich and the extremely poor, as the mean does.
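The counting-inwards procedure for the median amounts to sorting the values and taking the one in the middle. A minimal Python sketch, assuming an odd number of items as in our example:

```python
from statistics import median

data = [2, 3, 3, 4, 8]

# Put the values in order, then pick the middle position.
ordered = sorted(data)
middle_index = len(ordered) // 2   # index 2 in a list of five items
data_median = ordered[middle_index]
print(data_median)                 # 3

# The standard library agrees.
print(median(data))                # 3
```

With an even number of items there is no single middle value; `statistics.median` then returns the mean of the two central values.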
Apart from demonstrating the importance of central tendency as a concept, this shows how interpretative statistical research can be (and I do not mean this in the cynical sense). The context may determine our use of different statistics.
The distribution of data
Central tendency is just part of what is known as the distribution of data, which can be shown using a histogram. Again, we use 2, 3, 3, 4, 8. Techniques such as histograms, as well as simple quantitative statistics such as measures of central tendency, allow us to consider the shape of a distribution and hence which type of distribution we are looking at.
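A histogram is essentially a count of how often each value occurs. The following Python sketch prints a crude text version for the data set 2, 3, 3, 4, 8; it is an illustration under that assumption, not a reproduction of the SPSS chart.

```python
from collections import Counter

data = [2, 3, 3, 4, 8]

# Frequency of each value; values that never occur count as zero.
counts = Counter(data)

# One row per whole number between the minimum and the maximum,
# with one '#' for each occurrence of that number.
for value in range(min(data), max(data) + 1):
    print(f"{value}: {'#' * counts[value]}")
```

The rows for 5, 6 and 7 come out empty, which makes the gap between the bulk of the data and the lone 8 easy to see.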
A common distribution is the ‘normal distribution’, otherwise known as the Gaussian distribution, the famous bell curve (an idealised symmetrical one is shown below). This generally represents a natural population, for example, animal running speeds or intelligence test results.
The chart shows some new figures. We already know about the measures of central tendency, the mean, median and mode. However, people can be misled by figures such as the mean and the median, which can be large or small without telling us very much. (The mode has another little foible: it may not be unique, as there may be two or more figures which come up particularly frequently.) So we are also interested in measures of dispersion: how spread out the numbers are around the mean.
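To see why central tendency alone can mislead, consider two data sets with the same mean but very different spreads. The Python sketch below is illustrative only: the second and third data sets are hypothetical, and the choice between the population and sample standard deviation is an assumption that depends on whether you have all the data or just a sample.

```python
from statistics import mean, multimode, pstdev, stdev

spread_out = [2, 3, 3, 4, 8]   # the data set used in the text
identical = [4, 4, 4, 4, 4]    # hypothetical: no spread at all

# Both data sets have the same mean.
print(mean(spread_out), mean(identical))

# The standard deviation tells them apart.
print(pstdev(identical))    # population SD of identical values is zero
print(pstdev(spread_out))   # population SD (treats the data as everything)
print(stdev(spread_out))    # sample SD (treats the data as a sample)

# The mode need not be unique: here two values tie (hypothetical data).
two_modes = multimode([2, 2, 3, 3, 8])
print(two_modes)            # [2, 3]
```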
The figures underneath the chart, running from –4 to +4 represent one measure of dispersion, the standard deviation. You will often read reports citing the standard deviation (SD) as well as the mean. As you will see, one standard deviation around the mean (the centre) represents over 68% of the data. Two standard deviations either way represent over 95%, with three