Introduction To Statistics Presentation of Data
PRESENTATION OF DATA
BUSINESS STATISTICS
Major topics:
1) Descriptive Methods: tabular, graphical and numerical summaries.
2) Sampling and Design of Experiments
3) Relationship between Quantitative and Categorical Variables
4) Probability Theory: how to measure what is probable?
5) Probability Distributions: binomial, Poisson and normal probabilities.
2) Inferential Statistics
Methods which facilitate the making of generalizations (i.e. inferences)
about population characteristics from information obtained from a
sample.
STATISTICAL TERMS
• Data: Individual pieces of information.
e.g. the number of Statistics students in semester 1, 2005 at Angell.
Many students, Many characteristics (e.g. age, income, gender)
Statistical data is usually stored in a computer file (e.g. a spreadsheet).
• Case: All responses from one person in a sample or census. Each row is a
case. E.g. one respondent’s answers to all questions on a questionnaire.
• Variable: All responses to a particular question in a sample or census.
Each column is a variable. E.g. the answers of all respondents to a
particular question on a questionnaire.
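The row/column layout described above can be sketched in a few lines of Python. The respondents and their answers are invented purely for illustration; only the variable names (age, income, gender) come from the earlier slide.

```python
# Hypothetical questionnaire data: each row (dict) is one case,
# each key is one variable. All values are invented for illustration.
survey = [
    {"age": 21, "income": 18000, "gender": "F"},
    {"age": 34, "income": 42000, "gender": "M"},
    {"age": 27, "income": 31000, "gender": "F"},
]

# A case: all responses from one respondent (one row).
first_case = survey[0]

# A variable: all responses to one question (one column).
ages = [case["age"] for case in survey]

print(first_case)
print(ages)
```

Reading across a row gives a case; reading down a column gives a variable.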
Ex 1: A councilor who is running for the office of mayor of a city with 25 000
registered voters commissions a survey. In the survey, 48% of the 200
registered voters interviewed said they planned to vote for her.
DESCRIPTIVE METHODS
Summarising Quantitative Data
Example 4
What was the average annual family income in Freiburg in 2005?
Population: annual incomes of all families living in Freiburg in 2005.
Examples of tabular summaries are the frequency distribution, the relative
frequency distribution and the cumulative (relative) frequency distribution.
i. The class intervals must not overlap each other, but they must be
exhaustive.
(Each data point must belong to one and only one class.)
iii. Select a width and starting value that are convenient to work with
(i.e. integer, some multiple of 5, 10 etc. units).
iv. For numerical data, use no fewer than 5 and no more than 15 class
intervals.
There is a simple relationship between the number of classes, the range of the
data and the class width:
number of classes ≈ range / class width
(Ex 4) If the common class width is 5, 87/5 = 17.4 → 18 classes (too many)
If the common class width is 10, 87/10 = 8.7 → 9 classes
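The Ex 4 arithmetic can be checked with a minimal Python sketch. It assumes, as the rounding of 17.4 to 18 suggests, that the quotient is rounded up to the next whole number; the function name `number_of_classes` is made up for this illustration.

```python
import math

def number_of_classes(data_range, class_width):
    """Round range / class width up to a whole number of classes."""
    return math.ceil(data_range / class_width)

data_range = 87  # range of the data used in Ex 4

for width in (5, 10):
    k = number_of_classes(data_range, width)
    # Guideline from the slides: between 5 and 15 class intervals.
    verdict = "within 5 to 15" if 5 <= k <= 15 else "outside 5 to 15"
    print(f"width {width}: {data_range / width:.1f} -> {k} classes ({verdict})")
```

A width of 10 keeps the number of classes inside the recommended range, while a width of 5 does not.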
Income ($’000)   Frequency
25 – < 35        230
35 – < 45        420
45 – < 55        475
55 – < 65        635
65 – < 75        930
75 – < 85        720
85 – < 95        590
95 – < 105       395
105 – < 115      105
Total            4500
(E.g. 590 families have an annual income between 85 and 95 ($’000).)
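The relative and cumulative relative frequency distributions mentioned earlier follow directly from these counts. The sketch below is one way to compute them from the Ex 4 frequencies; the variable names are chosen for this illustration only.

```python
# Frequency distribution from Ex 4 (income classes in $'000).
classes = [(25, 35), (35, 45), (45, 55), (55, 65), (65, 75),
           (75, 85), (85, 95), (95, 105), (105, 115)]
frequencies = [230, 420, 475, 635, 930, 720, 590, 395, 105]

total = sum(frequencies)  # total number of families in the table

# Relative frequency: share of all families falling in each class.
relative = [f / total for f in frequencies]

# Cumulative relative frequency: share of families up to and
# including each class.
cumulative = []
running = 0
for f in frequencies:
    running += f
    cumulative.append(running / total)

for (low, high), f, r, c in zip(classes, frequencies, relative, cumulative):
    print(f"{low:>3} - < {high:<3}  {f:>4}  {r:6.3f}  {c:6.3f}")
```

The relative frequencies sum to 1, and the last cumulative value is 1 because every family belongs to exactly one class.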
Ex 4: Histogram and polygon of annual family incomes in Freiburg in 2005
(vertical axis: Frequency).
DESCRIPTIVE METHODS
Summarising Qualitative Data
Example 5
Let’s suppose a second question was asked in the previous survey:
What do you think of your current standard of living? Is it lower, higher
or the same as five years ago?
… the frequency distribution of the replies is tabulated as follows:
• Column/bar chart: a graphical presentation of a (relative)
frequency distribution of qualitative data.
– Each category is illustrated with a rectangle;
– Each rectangle has the same width;
– The rectangles do not touch each other.
(Ex 5) Column chart of the replies; categories: Lower, Same, Higher;
vertical axis: Frequency.
• Pie chart: an alternative type of graphical tool for (relative)
frequency distributions of qualitative data.
– The whole data set is illustrated with a circle;
– The circle is subdivided into slices that represent the
categories;
– The size of each slice is proportional to the corresponding
(relative) frequency.
(Ex 5) Pie chart of the replies; slice frequencies: 1500, 2500 and 500.
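Since each slice is proportional to its (relative) frequency, the central angles can be worked out directly. The sketch below uses the three frequencies shown on the Ex 5 pie chart. Which reply category each frequency belongs to is not stated here, so the slice labels are placeholders, and `slice_angle` is a name made up for this illustration.

```python
# Frequencies shown on the Ex 5 pie chart. Which reply each slice
# represents is not stated, so the keys are placeholder labels.
frequencies = {"slice A": 1500, "slice B": 2500, "slice C": 500}

total = sum(frequencies.values())

def slice_angle(frequency, total):
    """Central angle in degrees, proportional to the relative frequency."""
    return 360 * frequency / total

angles = {label: slice_angle(f, total) for label, f in frequencies.items()}

for label, angle in angles.items():
    share = frequencies[label] / total
    print(f"{label}: {share:.1%} of the circle, {angle:.0f} degrees")
```

The angles add up to 360 degrees, just as the relative frequencies add up to 1.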
Clustered bar chart and stacked bar chart: Males and Females by field
(Science, Business, Engineering, Arts, Other).
Pie charts by field (Science, Business, Engineering, Arts, Other);
slice labels shown: 12%, 22%, 17%, 15%, 34%.