
Math

The document outlines basic concepts in statistics, including levels of measurement (nominal, ordinal, interval, and ratio), types of variables (qualitative and quantitative), and measures of central tendency (mean, median, mode). It also discusses measures of dispersion such as range, variance, and standard deviation, along with examples and procedures for calculating these statistics. Additionally, it covers sampling methods and provides examples of frequency distribution and measures of central tendency for both ungrouped and grouped data.
Copyright
© © All Rights Reserved
Basic Concepts in Statistics

Statistics
• It is a collection of methods for planning experiments, obtaining data, and then analyzing, interpreting, and drawing conclusions based on the data.
• Data are the values that the variables can assume.
• A variable is a characteristic that is observable or measurable in every unit of the population.
• Population is the set of all possible values of a variable.
• Sample is a subgroup of a population.

Nominal Level
• This is characterized by data that consist of names, labels, or categories only.
  - Gender
  - Most preferred color
  - Usual sleeping time
  - Civil status

Ordinal Level
• This involves data that can be arranged in some order, but differences between data values either cannot be determined or are meaningless.
  - Happiness index for the day
  - Highest educational attainment
  - Ranking of tennis players
  - Academic excellence awards

Interval Level
• This is the same as the ordinal level, with the additional property that we can determine meaningful amounts of differences between the data.
  - Body temperature
  - Intelligence quotient

Ratio Level
• This is the interval level modified to include an inherent zero starting point.
• It possesses a meaningful absolute, fixed zero point and allows all arithmetic operations.
  - Number of siblings
  - Weight
  - Height

Qualitative Variables
• Words or codes that represent a class or category
• Express a categorical attribute:
  - Gender
  - Religion
  - Marital status
  - Highest educational attainment

Quantitative Variables
• Numbers that represent an amount or a count
• Numerical data whose sizes are meaningful and answer questions such as "how many" or "how much".
  - Height
  - Weight
  - Household size
  - Number of registered cars

Discrete Variables
• Data that can be counted
  - Number of days
  - Number of siblings
  - Usual number of text messages sent in a day
  - Usual daily allowance in school

Continuous Variables
• They can assume any value between two specific values, such as 0.5 or 1.2; these are data that can be measured.
  - Weight
  - Height
  - Body temperature

Sampling Methods
• Random Sampling: this is done by using chance or random numbers.
• Systematic Sampling: this is done by numbering each subject of the population and selecting every nth subject.
• Stratified Sampling: if the population has distinct groups, it is possible to divide the population into these groups and to draw simple random samples (SRSs) from each of the groups.
• Cluster Sampling: this method uses intact groups called clusters.
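The four sampling methods can be sketched in Python as follows. This is only an illustration under stated assumptions: the population of 40 numbered students, the four sections, the sample sizes, and the fixed seed are all hypothetical and do not come from the notes.

```python
import random

def simple_random_sample(population, k, rng):
    """Random sampling: pick k subjects purely by chance."""
    return rng.sample(population, k)

def systematic_sample(population, step, rng):
    """Systematic sampling: random start, then every step-th numbered subject."""
    start = rng.randrange(step)
    return population[start::step]

def stratified_sample(strata, k_per_stratum, rng):
    """Stratified sampling: draw an SRS of size k from each distinct group."""
    return {name: rng.sample(members, k_per_stratum)
            for name, members in strata.items()}

def cluster_sample(clusters, n_clusters, rng):
    """Cluster sampling: choose whole groups at random and keep all their members."""
    chosen = rng.sample(list(clusters), n_clusters)
    return {name: clusters[name] for name in chosen}

rng = random.Random(0)          # fixed seed, assumed only for repeatability
students = list(range(1, 41))   # hypothetical population: students numbered 1..40
sections = {                    # hypothetical intact groups, used as strata and clusters
    "A": students[:10], "B": students[10:20],
    "C": students[20:30], "D": students[30:],
}

srs = simple_random_sample(students, 8, rng)
systematic = systematic_sample(students, 5, rng)
stratified = stratified_sample(sections, 2, rng)
clustered = cluster_sample(sections, 2, rng)
```

Note the difference between the last two: stratified sampling takes a few subjects from every group, while cluster sampling takes every subject from a few groups.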
Measures Of Central Tendency of Ungrouped Data

• Also referred to as measures of centre or central location, a measure of central tendency is a summary measure that attempts to describe a whole set of data with a single value that represents the middle or centre of its distribution.

There Are Three Main Measures Of Central Tendency:

Mean
• The mean is the most used measure of central tendency. When we speak of the average, we usually refer to the mean.

$$\bar{x} = \frac{\sum x}{N}$$

Example:
Six friends in a biology class of 20 students receive test grades of 92, 84, 65, 76, 88, and 90. Find the mean of these test scores.

$$\bar{x} = \frac{\sum x}{N} = \frac{92 + 84 + 65 + 76 + 88 + 90}{6} = \frac{495}{6} = 82.5$$

Median
• The median is the midpoint of the data array. Before finding this value, the data must be arranged in order, from least to greatest or vice versa. The median is the value in the exact middle.
• For an odd number of values, such as $x_1, x_2, x_3, x_4, x_5$, the median $\tilde{x}$ is the middle value $x_3$.
• For an even number of values, such as $x_1, x_2, x_3, x_4, x_5, x_6$, the median $\tilde{x}$ is the mean of the two middle values, $(x_3 + x_4)/2$.

Example:
Eight novels were randomly selected and the numbers of pages were recorded as follows: 415, 398, 402, 400, 420, 415, 407, 425

Arranged in order: 398, 400, 402, 407, 415, 415, 420, 425

$$\tilde{x} = \frac{407 + 415}{2} = \frac{822}{2} = 411$$

Mode
• It is the value that occurs most often in the data set.
• The number/value/observation in a data set which appears the greatest number of times.

Example:
Find the mode of the given data set: 15, 28, 25, 48, 22, 43, 39, 44, 43, 34, 22, 33, 27, 25, 22, and 30.

Arranged in order: 15, 22, 22, 22, 25, 25, 27, 28, 30, 33, 34, 39, 43, 43, 44, 48

In the given data, the number that appears the greatest number of times is 22. The data set is said to be unimodal.

Example:
The typing speeds per minute of ten stenographers are as follows: 121, 110, 120, 119, 112, 121, 118, 115, 107, 115

Arranged in order: 107, 110, 112, 115, 115, 118, 119, 120, 121, 121

Thus, the data set has two modes: 115 and 121. The data set is said to be bimodal.

Measures Of Central Tendency of Grouped Data

Mean

$$\bar{x} = \frac{\sum fx}{n}$$

where:
$fx$ = the product of frequency and classmark/midpoint
$n$ = total frequencies

Median

$$\tilde{x} = L + \frac{\frac{n}{2} - cf}{f} \times h$$

where:
$n$ = total frequencies
$L$ = lower boundary of the median class
$f$ = frequency of the median class
$h$ = class width
$cf$ = cumulative frequency before/preceding the median class

Mode

$$\hat{x} = L + \frac{f_m - f_{m-1}}{(f_m - f_{m-1}) + (f_m - f_{m+1})} \times h$$

where:
$h$ = class width
$L$ = lower boundary of the modal class
$f_m$ = frequency of the modal class
$f_{m-1}$ = frequency of the class preceding the modal class
$f_{m+1}$ = frequency of the class succeeding the modal class
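The ungrouped examples (the test grades, the page counts, and the two mode data sets) can be checked with Python's standard `statistics` module. This is a sketch for verification only; the variable names are illustrative, and the page counts are entered as the ordered list used in the worked solution.

```python
import statistics

grades = [92, 84, 65, 76, 88, 90]
pages = [398, 400, 402, 407, 415, 415, 420, 425]   # ordered list from the solution
scores = [15, 28, 25, 48, 22, 43, 39, 44, 43, 34, 22, 33, 27, 25, 22, 30]
speeds = [121, 110, 120, 119, 112, 121, 118, 115, 107, 115]

mean_grade = statistics.mean(grades)       # sum of the values divided by their count
median_pages = statistics.median(pages)    # even count: mean of the two middle values

# multimode returns every value tied for the highest count, so it
# distinguishes a unimodal data set from a bimodal one.
modes_scores = sorted(statistics.multimode(scores))
modes_speeds = sorted(statistics.multimode(speeds))

print(mean_grade, median_pages, modes_scores, modes_speeds)
```

`statistics.median` sorts the data internally, so the input does not have to be arranged first.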
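Example:
Compute the mean, median, and mode of the scores of the students in a basic statistics test.

Frequency Distribution

Score    Frequency (f)   Lower boundary (L)   Classmark/Midpoint (x)   fx     Cumulative frequency (cf)
11-15    1               10.5                 13                       13     1
16-20    2               15.5                 18                       36     3
21-25    5               20.5                 23                       115    8
26-30    11              25.5                 28                       308    19
31-35    12              30.5                 33                       396    31
36-40    11              35.5                 38                       418    42
41-45    5               40.5                 43                       215    47
46-50    1               45.5                 48                       48     48

$n = 48$, $\sum fx = 1549$

Mean

$$\bar{x} = \frac{\sum fx}{n}$$

$\sum fx = 1549$
$n = 48$

Solution:

$$\bar{x} = \frac{1549}{48} = 32.27$$

Median

$$\tilde{x} = L + \frac{\frac{n}{2} - cf}{f} \times h$$

Since $n/2 = 24$, the median class is the first class whose cumulative frequency reaches 24, which is 31-35 (cumulative frequency 31).

$cf = 19$
$L = 30.5$
$f = 12$
$h = 5$
$n = 48$

Solution:

$$\tilde{x} = 30.5 + \frac{\frac{48}{2} - 19}{12} \times 5$$
$$\tilde{x} = 30.5 + \frac{24 - 19}{12} \times 5$$
$$\tilde{x} = 30.5 + \frac{5}{12} \times 5$$
$$\tilde{x} = 30.5 + 0.416 \times 5$$
$$\tilde{x} = 30.5 + 2.08$$
$$\tilde{x} = 32.58$$

Mode

$$\hat{x} = L + \frac{f_m - f_{m-1}}{(f_m - f_{m-1}) + (f_m - f_{m+1})} \times h$$

The modal class is 31-35, the class with the highest frequency.

$L = 30.5$
$f_m = 12$
$f_{m-1} = 11$
$f_{m+1} = 11$
$h = 5$

Solution:

$$\hat{x} = 30.5 + \frac{12 - 11}{(12 - 11) + (12 - 11)} \times 5$$
$$\hat{x} = 30.5 + \frac{1}{1 + 1} \times 5$$
$$\hat{x} = 30.5 + \frac{1}{2} \times 5$$
$$\hat{x} = 30.5 + 0.5 \times 5$$
$$\hat{x} = 30.5 + 2.5$$
$$\hat{x} = 33$$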
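The grouped-data computations for this frequency distribution can be reproduced with a short Python sketch. The function names are hypothetical, each class is stored as a (lower boundary, frequency) pair, and a missing neighbour of the modal class is treated as frequency 0, which is an assumption not stated in the notes.

```python
# Each class: (lower boundary, frequency); the class width h is 5 throughout.
classes = [(10.5, 1), (15.5, 2), (20.5, 5), (25.5, 11),
           (30.5, 12), (35.5, 11), (40.5, 5), (45.5, 1)]
h = 5

def grouped_mean(classes, h):
    n = sum(f for _, f in classes)
    # classmark = lower boundary + half the class width
    return sum(f * (low + h / 2) for low, f in classes) / n

def grouped_median(classes, h):
    n = sum(f for _, f in classes)
    cf = 0  # cumulative frequency before the current class
    for low, f in classes:
        if cf + f >= n / 2:          # first class whose cumulative frequency reaches n/2
            return low + (n / 2 - cf) / f * h
        cf += f

def grouped_mode(classes, h):
    freqs = [f for _, f in classes]
    i = freqs.index(max(freqs))      # modal class: highest frequency
    f_prev = freqs[i - 1] if i > 0 else 0               # assumed 0 if no class precedes
    f_next = freqs[i + 1] if i < len(freqs) - 1 else 0  # assumed 0 if no class follows
    d1, d2 = freqs[i] - f_prev, freqs[i] - f_next
    return classes[i][0] + d1 / (d1 + d2) * h

mean = grouped_mean(classes, h)
median = grouped_median(classes, h)
mode = grouped_mode(classes, h)
print(round(mean, 2), round(median, 2), mode)
```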
Measures Of Dispersion

• A measure of variability of a set of data is a number that conveys the idea of spread for the data set.

There Are Three Statistical Values:

Range
• The range measures the distance between the largest and the smallest values and, as such, gives an idea of the spread of the data set. However, the range does not use the concept of deviation. It is affected by outliers and does not consider all values in the data set. Thus it is not a very useful measure of variability.

Range (R) = highest value - lowest value

Example:
Find the range of the numbers of ounces dispensed by Machine 1 and Machine 2.

Machine 1    Machine 2
9.52         8.01
6.41         7.99
10.07        7.95
5.85         8.03
8.15         8.02
mean = 8.0   mean = 8.0

Machine 1
R = 10.07 - 5.85
R = 4.22 oz

Machine 2
R = 8.03 - 7.95
R = 0.08 oz

Variance
• The variance for a given data set is the square of the standard deviation of the data.

Variance of the population
$$\sigma^2 = \frac{\sum (x - \mu)^2}{n}$$

Variance of the sample
$$s^2 = \frac{\sum (x - \bar{x})^2}{n - 1}$$

Standard Deviation
• The standard deviation is a measure of how spread out numbers are.
• Its symbol is $\sigma$ (the Greek letter sigma) for a population and $s$ for a sample.

Standard deviation of the population
$$\sigma = \sqrt{\frac{\sum (x - \mu)^2}{n}}$$

Standard deviation of the sample
$$s = \sqrt{\frac{\sum (x - \bar{x})^2}{n - 1}}$$

Procedures For Computing A Standard Deviation:

1) Determine the mean of the numbers.
2) For each number, calculate the deviation (difference) between the number and the mean of the numbers.
3) Calculate the square of each deviation and find the sum of these squared deviations.
4) If the data is a population, then divide the sum by $n$. If the data is a sample, then divide the sum by $n - 1$.
5) Take the square root of the quotient from step 4.

Example For Variance And Standard Deviation:
The following numbers were obtained by sampling a population: 2, 4, 7, 12, 15

Step 1
$$\bar{x} = \frac{2 + 4 + 7 + 12 + 15}{5} = \frac{40}{5} = 8$$

Step 2
$$s^2 = \frac{\sum (x - \bar{x})^2}{n - 1}$$
$$s^2 = \frac{(2 - 8)^2 + (4 - 8)^2 + (7 - 8)^2 + (12 - 8)^2 + (15 - 8)^2}{5 - 1}$$
$$s^2 = \frac{118}{4}$$
$$s^2 = 29.5$$

Step 3
$$s = \sqrt{\frac{\sum (x - \bar{x})^2}{n - 1}}$$
$$s = \sqrt{29.5}$$
$$s = 5.43$$

Example Of Using The Standard Deviation:
A consumer group has tested a sample of 8 size-D batteries from each of 3 companies. The results of the tests are shown in the following table. According to these tests, which company produces batteries for which the values representing hours of constant use have the smallest standard deviation?

Company        Hours of constant use per battery
EverSoBright   6.2, 6.4, 7.1, 5.9, 8.3, 5.3, 7.5, 9.3
Dependable     6.8, 6.2, 7.2, 5.9, 7.0, 7.4, 7.3, 8.2
Beacon         6.1, 6.6, 7.3, 5.7, 7.1, 7.6, 7.1, 8.5
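The procedure for computing a standard deviation, finished with a square root, can be sketched in Python and applied both to the sample 2, 4, 7, 12, 15 and to the battery table. The function name and the `sample` flag are hypothetical choices made for this illustration.

```python
import math

def standard_deviation(data, sample=True):
    mean = sum(data) / len(data)                       # 1) mean of the numbers
    deviations = [x - mean for x in data]              # 2) deviation of each number
    sum_sq = sum(d ** 2 for d in deviations)           # 3) sum of squared deviations
    divisor = len(data) - 1 if sample else len(data)   # 4) n - 1 for a sample, n for a population
    return math.sqrt(sum_sq / divisor)                 # square root of the quotient

numbers = [2, 4, 7, 12, 15]
s = standard_deviation(numbers)

batteries = {
    "EverSoBright": [6.2, 6.4, 7.1, 5.9, 8.3, 5.3, 7.5, 9.3],
    "Dependable":   [6.8, 6.2, 7.2, 5.9, 7.0, 7.4, 7.3, 8.2],
    "Beacon":       [6.1, 6.6, 7.3, 5.7, 7.1, 7.6, 7.1, 8.5],
}
sds = {company: standard_deviation(hours) for company, hours in batteries.items()}
most_consistent = min(sds, key=sds.get)   # company with the smallest standard deviation

print(round(s, 2), {c: round(v, 2) for c, v in sds.items()}, most_consistent)
```

All three companies have the same mean of 7 hours, so the standard deviation is what separates them.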
EverSoBright
Step 1
$$\bar{x} = \frac{6.2 + 6.4 + 7.1 + 5.9 + 8.3 + 5.3 + 7.5 + 9.3}{8} = \frac{56}{8} = 7$$

Step 2
$$s = \sqrt{\frac{(6.2 - 7)^2 + (6.4 - 7)^2 + (7.1 - 7)^2 + (5.9 - 7)^2 + (8.3 - 7)^2 + (5.3 - 7)^2 + (7.5 - 7)^2 + (9.3 - 7)^2}{8 - 1}}$$
$$s = \sqrt{\frac{12.34}{7}}$$
$$s = 1.33 \text{ h}$$

Dependable
Step 1
$$\bar{x} = \frac{6.8 + 6.2 + 7.2 + 5.9 + 7.0 + 7.4 + 7.3 + 8.2}{8} = \frac{56}{8} = 7$$

Step 2
$$s = \sqrt{\frac{(6.8 - 7)^2 + (6.2 - 7)^2 + (7.2 - 7)^2 + (5.9 - 7)^2 + (7.0 - 7)^2 + (7.4 - 7)^2 + (7.3 - 7)^2 + (8.2 - 7)^2}{8 - 1}}$$
$$s = \sqrt{\frac{3.62}{7}}$$
$$s = 0.72 \text{ h}$$

Beacon
Step 1
$$\bar{x} = \frac{6.1 + 6.6 + 7.3 + 5.7 + 7.1 + 7.6 + 7.1 + 8.5}{8} = \frac{56}{8} = 7$$

Step 2
$$s = \sqrt{\frac{(6.1 - 7)^2 + (6.6 - 7)^2 + (7.3 - 7)^2 + (5.7 - 7)^2 + (7.1 - 7)^2 + (7.6 - 7)^2 + (7.1 - 7)^2 + (8.5 - 7)^2}{8 - 1}}$$
$$s = \sqrt{\frac{5.38}{7}}$$
$$s = 0.88 \text{ h}$$

The batteries from Dependable have the smallest standard deviation. According to these results, the Dependable company produces the most consistent batteries with regard to life expectancy under constant use.

Measures Of Dispersion Of Grouped Data

Range
Scores of 40 students in a 60-point quiz.

Scores    Frequency (f)   Class Boundaries
53 - 58   3               52.5 - 58.5
47 - 52   4               46.5 - 52.5
41 - 46   1               40.5 - 46.5
35 - 40   2               34.5 - 40.5
29 - 34   10              28.5 - 34.5
23 - 28   11              22.5 - 28.5
17 - 22   4               16.5 - 22.5
11 - 16   3               10.5 - 16.5
5 - 10    2               4.5 - 10.5

Range = Highest Class Boundary - Lowest Class Boundary
Range = 58.5 - 4.5
Range = 54

Variance And Standard Deviation

Scores    Frequency (f)   Class Mark (x)   fx      (x - mean)^2   f(x - mean)^2
53 - 58   3               55.5             166.5   635.04         1905.12
47 - 52   4               49.5             198     368.64         1474.56
41 - 46   1               43.5             43.5    174.24         174.24
35 - 40   2               37.5             75      51.84          103.68
29 - 34   10              31.5             315     1.44           14.4
23 - 28   11              25.5             280.5   23.04          253.44
17 - 22   4               19.5             78      116.64         466.56
11 - 16   3               13.5             40.5    282.24         846.72
5 - 10    2               7.5              15      519.84         1039.68

$n = 40$, $\sum fx = 1212$, $\sum f(x - \bar{x})^2 = 6278.4$

Class Mark (x) = midpoint of the score class
(53 + 58) / 2 = 111 / 2 = 55.5

$fx$ = frequency * classmark
3 * 55.5 = 166.5

Get the mean
$$\bar{x} = \frac{\sum fx}{n} = \frac{1212}{40} = 30.3$$

$(x - \bar{x})^2$ = (classmark - mean)^2
55.5 - 30.3 = 25.2, and 25.2^2 = 635.04

$f(x - \bar{x})^2$ = frequency * $(x - \bar{x})^2$
3 * 635.04 = 1905.12

Variance
$$s^2 = \frac{\sum f(x - \bar{x})^2}{n - 1}$$
$$s^2 = \frac{6278.4}{40 - 1}$$
$$s^2 = \frac{6278.4}{39}$$
$$s^2 = 160.98$$

Standard Deviation
$$s = \sqrt{\frac{\sum f(x - \bar{x})^2}{n - 1}}$$
$$s = \sqrt{160.98}$$
$$s = 12.69$$
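The grouped-data range, variance, and standard deviation for the 40 quiz scores can be checked with a short Python sketch. The variable names are illustrative, and the code assumes integer class limits, so each class boundary sits 0.5 outside its limit.

```python
import math

# (lower limit, upper limit, frequency) for each score class
classes = [(53, 58, 3), (47, 52, 4), (41, 46, 1), (35, 40, 2), (29, 34, 10),
           (23, 28, 11), (17, 22, 4), (11, 16, 3), (5, 10, 2)]

n = sum(f for _, _, f in classes)
marks = [(lo + hi) / 2 for lo, hi, _ in classes]   # class mark = midpoint of the class
freqs = [f for _, _, f in classes]

mean = sum(f * x for f, x in zip(freqs, marks)) / n
sum_sq = sum(f * (x - mean) ** 2 for f, x in zip(freqs, marks))
variance = sum_sq / (n - 1)          # sample variance, dividing by n - 1
sd = math.sqrt(variance)

# Range = highest class boundary - lowest class boundary
highest_boundary = max(hi for _, hi, _ in classes) + 0.5
lowest_boundary = min(lo for lo, _, _ in classes) - 0.5
value_range = highest_boundary - lowest_boundary

print(n, round(mean, 1), round(variance, 2), round(sd, 2), value_range)
```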
SYMMETRIC AND ASYMMETRIC DISTRIBUTION

Symmetric Distribution
• Property of a distribution that has the mean as the center, with the two sides of the distribution acting as mirror images of each other
• Most of the data values are found near the mean, tapering off on both sides of the mean.
• The mean is equal to the median

Asymmetric Distribution
• Lack of symmetry
• Can be a right-skewed distribution or a left-skewed distribution

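Pearson Coefficient of Skewness

$$Sk = \frac{3(\bar{x} - \text{median})}{s}$$

where $\bar{x}$ is the mean and $s$ is the standard deviation.

Example:
Calculate the coefficient of skewness if the mean is 76 and the median is 75, with a standard deviation of 0.75.

$$Sk = \frac{3(\bar{x} - \text{median})}{s}$$
$$Sk = \frac{3(76 - 75)}{0.75}$$
$$Sk = \frac{3}{0.75}$$
$$Sk = 4$$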

IMPORTANT NOTES (FOR ASYMMETRIC DISTRIBUTION ONLY):


1. If the Sk is less than -1 or greater than 1, then the distribution is highly skewed.
2. If the Sk is between -1 and -1/2 or between 1/2 and 1, then the distribution is moderately skewed.
3. If the Sk is between -1/2 and 1/2, then the distribution is approximately symmetric.
4. If the Sk is 0, then the mean equals the median and the distribution is treated as symmetric, as a normal distribution is.

*If the mean is greater than the median, it is a RIGHT SKEWED DISTRIBUTION.
*If the median is greater than the mean, it is a LEFT SKEWED DISTRIBUTION.
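A small Python sketch of the Pearson coefficient and the interpretation notes follows. It assumes cut-offs of 1/2 and 1 on the absolute value of Sk; the function names and the wording of the returned labels are hypothetical.

```python
def pearson_skewness(mean, median, sd):
    """Pearson coefficient: Sk = 3 * (mean - median) / standard deviation."""
    return 3 * (mean - median) / sd

def describe_skewness(sk):
    """Label Sk using assumed cut-offs of 1/2 and 1 on its absolute value."""
    if sk == 0:
        return "symmetric"
    size = abs(sk)
    if size > 1:
        degree = "highly skewed"
    elif size >= 0.5:
        degree = "moderately skewed"
    else:
        degree = "approximately symmetric"
    # mean > median gives Sk > 0 (right skewed); median > mean gives Sk < 0 (left skewed)
    direction = "right" if sk > 0 else "left"
    return f"{degree} ({direction})"

sk = pearson_skewness(mean=76, median=75, sd=0.75)
print(sk, describe_skewness(sk))
```

For the worked example the mean exceeds the median, so the coefficient is positive and the distribution is right skewed.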
