Brm unit.5 data.analysis_interpretation_shriram.dawkhar.1

Business Research Methods, Unit: 5
PREPARED BY: Dr. Shriram S. Dawkhar, SIBAR, Pune. (For Student Distribution-MBA)
A) Frequency Distribution:
A frequency distribution is a tabular summary of data showing the frequency (or number) of items
in each of several nonoverlapping classes.
The objective is to provide insights about the data that cannot be quickly obtained by looking only
at the original data.
Example: Sun & Sand Hotel Customer rating
Guests staying at Sun & Sand were asked to rate the quality of their accommodations as being excellent,
above average, average, below average, or poor. The ratings provided by a sample of 20 guests are
shown below.
The frequency distribution (frequency table) is given below:
Rating            Frequency
Poor                  2
Below Average         3
Average               5
Above Average         9
Excellent             1
Total                20
*** The GM of Sun & Sand has a goal that no more than 10% of all guests will rate their stay as worse than
average. How is Sun & Sand doing?
The relative frequency of a class is the fraction or proportion of the total number of data
items belonging to the class.
A relative frequency distribution is a tabular summary of a set of data showing the relative
frequency for each class.
The percent frequency of a class is the relative frequency multiplied by 100.

A percent frequency distribution is a tabular summary of a set of data showing the percent
frequency for each class.
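The definitions above can be sketched in a few lines of Python, using the Sun & Sand frequency table from this section. The variable names are illustrative only. The last lines compare the share of guests who rated their stay worse than average with the GM's 10% goal.

```python
# Frequency table from the Sun & Sand example above.
frequencies = {
    "Poor": 2,
    "Below Average": 3,
    "Average": 5,
    "Above Average": 9,
    "Excellent": 1,
}

total = sum(frequencies.values())  # 20 guests in the sample

# Relative frequency = class frequency / total number of items.
relative = {rating: count / total for rating, count in frequencies.items()}

# Percent frequency = relative frequency * 100.
percent = {rating: 100 * share for rating, share in relative.items()}

for rating in frequencies:
    print(f"{rating:<14} {frequencies[rating]:>3} {relative[rating]:>6.2f} {percent[rating]:>6.1f}%")

# "Worse than average" means the Poor and Below Average classes together.
worse_than_average = percent["Poor"] + percent["Below Average"]
print(f"Worse than average: {worse_than_average:.0f}% (goal: at most 10%)")
```

In this sample 5 of the 20 guests (25%) rated their stay worse than average, so the 10% goal is not being met.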
Pie Chart:
• The pie chart is a commonly used graphical device for presenting relative frequency
distributions for qualitative data.
• First draw a circle; then use the relative frequencies to subdivide the circle into sectors that
correspond to the relative frequency for each class.
• Since there are 360 degrees in a circle, a class with a relative frequency of 0.25 takes up
0.25 × 360 = 90 degrees of the circle.
• Equivalently, multiply the percent frequency by 3.6: 25 × 3.6 = 90 degrees, or 10 × 3.6 = 36 degrees.
• (Angle = (X / total) × 360; see the example solved in class.)
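As a small illustration of the angle rule, the sketch below applies Angle = (X / total) × 360 to the Sun & Sand frequency table. The variable names are illustrative only.

```python
# Frequency table from the Sun & Sand example.
frequencies = {
    "Poor": 2,
    "Below Average": 3,
    "Average": 5,
    "Above Average": 9,
    "Excellent": 1,
}

total = sum(frequencies.values())

# Sector angle for each class: (class frequency / total) * 360 degrees.
angles = {rating: count * 360 / total for rating, count in frequencies.items()}

for rating, angle in angles.items():
    print(f"{rating:<14} {angle:>6.1f} degrees")

# The sectors together must fill the whole circle.
print("Sum of angles:", sum(angles.values()))
```

The "Average" class has a relative frequency of 0.25 and therefore gets a 90 degree sector, matching the rule stated above.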

Types of statistical measures:
Measures of central tendency:
• Mean / average / arithmetic mean
• Median
• Mode
Measures of dispersion:
• Range
• Quartile deviation or semi-interquartile range
• Mean absolute deviation
• Standard deviation
Problem: Mean, Mode & Median:
1) Given the height data of 31 students below, calculate the mean, mode, median and range.

Problem 2) Apartment Rents:
1) Mean:
The mean of a data set is the average of all the data values.
If the data are from a sample, the mean is denoted by $\bar{x}$ (x-bar):
$\bar{x} = \frac{\sum x_i}{n}$
If the data are from a population, the mean is denoted by $\mu$ (mu).
2) Median
The median of a data set is the value in the middle when the data items are arranged in ascending order.
If there is an odd number of items, the median is the value of the middle item.
If there is an even number of items, the median is the average of the values for the middle two items.

3) Mode:
The mode of a data set is the value that occurs with greatest frequency.
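A minimal sketch of the three measures of central tendency, using Python's standard `statistics` module. The rent figures are made up for illustration and are not the data set of the Apartment Rents problem.

```python
import statistics

# Hypothetical monthly rents; not the handout's data set.
rents = [425, 430, 430, 435, 440, 450, 450, 450, 465]

mean_rent = statistics.mean(rents)      # sum of the values / number of values
median_rent = statistics.median(rents)  # middle value of the sorted data (odd n)
mode_rent = statistics.mode(rents)      # value that occurs most often

# With an even number of items the median is the average of the two middle values.
median_even = statistics.median([425, 430, 440, 450])  # (430 + 440) / 2

print("Mean:", round(mean_rent, 2))
print("Median:", median_rent)
print("Mode:", mode_rent)
print("Median of an even-sized set:", median_even)
```

Note that the data need not be sorted before calling `statistics.median`; the function sorts a copy internally.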

4) Range:
The range of a data set is the difference between the largest and smallest data
values.
It is the simplest measure of variability.
It is very sensitive to the smallest and largest data values.
5) The coefficient of variation (CV) is a measure of relative variability. It is the ratio of the
standard deviation to the mean (average).
Coefficient of Variation = (Standard Deviation / Mean) * 100.
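The range and the coefficient of variation can be sketched as follows. The values are hypothetical, and the sample standard deviation (`statistics.stdev`, divisor n - 1) is used here as an assumption, since the text does not say whether the data are a sample or a population.

```python
import statistics

# Hypothetical data values, for illustration only.
values = [46, 54, 42, 46, 32]

# Range = largest value - smallest value.
data_range = max(values) - min(values)

mean = statistics.mean(values)
sd = statistics.stdev(values)  # assumption: sample standard deviation (n - 1)

# Coefficient of Variation = (Standard Deviation / Mean) * 100.
cv = sd / mean * 100

print("Range:", data_range)
print("Mean:", mean)
print("Standard deviation:", sd)
print(f"Coefficient of variation: {cv:.2f}%")
```

Because the CV is a ratio, it has no units, which is what makes it useful for comparing the variability of data sets measured on different scales.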
6) Standard Deviation:
The standard deviation of a data set is the positive square root of the variance.
It is measured in the same units as the data, so it is easier than the variance to compare with the
mean.
If the data set is a sample, the standard deviation is denoted s.
If the data set is a population, the standard deviation is denoted σ (sigma).

Non-Grouped Data
Non-grouped data is just a list of values. The standard deviation is given by the formula:
s means 'standard deviation'.
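A minimal sketch of the standard deviation of a plain list of values, computed step by step. The data are hypothetical. Both common conventions are shown: dividing the sum of squared deviations by n (population, σ) and by n - 1 (sample, s). Which one applies depends on whether the list is the whole population or a sample from it.

```python
import math

# Hypothetical non-grouped data: just a list of values.
values = [46, 54, 42, 46, 32]

n = len(values)
mean = sum(values) / n

# Sum of squared deviations from the mean.
sum_sq_dev = sum((x - mean) ** 2 for x in values)

# Population standard deviation (sigma): divide by n.
population_sd = math.sqrt(sum_sq_dev / n)

# Sample standard deviation (s): divide by n - 1.
sample_sd = math.sqrt(sum_sq_dev / (n - 1))

print("Mean:", mean)
print("Sum of squared deviations:", sum_sq_dev)
print("Population SD:", round(population_sd, 3))
print("Sample SD:", round(sample_sd, 3))
```

The same results can be obtained with `statistics.pstdev` and `statistics.stdev` from the standard library.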

Criteria for Hypothesis Testing (very important)
Hypothesis Testing (old question paper problems)
Rules for Decision Making (2 approaches):
A) 1st Approach: comparing the calculated (observed) value with the critical (table) value.
1) If Calculated Value (observed) < Critical (table) Value
………………. then accept (fail to reject) the null hypothesis.
2) If Calculated Value (observed) > Critical (table) Value
………………. then reject the null hypothesis.
B) 2nd Approach: using the probability value (p-value).
Reject the null hypothesis if the p-value is less than the significance level.
If p-value < Significance level …….. reject the null hypothesis.
If p-value > Significance level …….. accept (fail to reject) the null hypothesis.
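The two decision rules can be written as two small functions. This is only a sketch: the function names are made up, the absolute value is taken so that the same rule also works for a negative t or z value in a two-tailed test (an assumption), and a tie is treated as "accept", which the rules above do not specify. The figures 3.956 vs 5.99 and 11.40 vs 11.07 are the ones used in the solved problems of this handout.

```python
def decide_by_critical_value(calculated, critical):
    """1st approach: compare the calculated value with the table value."""
    # abs() is an assumption for two-tailed t/z tests; chi-square is never negative.
    if abs(calculated) > critical:
        return "reject H0"
    return "accept H0"


def decide_by_p_value(p_value, significance_level=0.05):
    """2nd approach: compare the p-value with the significance level."""
    if p_value < significance_level:
        return "reject H0"
    return "accept H0"


print(decide_by_critical_value(3.956, 5.99))   # calculated < critical
print(decide_by_critical_value(11.40, 11.07))  # calculated > critical
print(decide_by_p_value(0.03))                 # p-value below 5%
print(decide_by_p_value(0.20))                 # p-value above 5%
```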

Solution:
Null Hypothesis (H0): Magazine choice is independent of level of income.
Alternate Hypothesis (Ha or H1): Magazine choice is dependent on level of income.
Decision rule:
1) If Calculated Value (observed) < Critical (table) Value, accept the null hypothesis.
2) If Calculated Value (observed) > Critical (table) Value, reject the null hypothesis.
In the above example,
Calculated Value (observed) < Critical (table) Value
i.e. 3.956 < 5.99
Hence the null hypothesis is accepted, and we can say that magazine choice is
independent of level of income.

Solution:
Null hypothesis: The mean salary of the students is 45000.
Alternate hypothesis: The mean salary of the students is not 45000.
Decision rule:
1) If Calculated Value (observed) < Critical (table) Value, accept the null hypothesis.
2) If Calculated Value (observed) > Critical (table) Value, reject the null hypothesis.
Here, Calculated Value (observed) < Critical (table) Value
i.e. 0.83 < 2.131
Hence the null hypothesis is accepted. Thus, we can say that the mean salary of the
students is 45000.
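A sketch of how a calculated value for a test of this kind can be obtained, assuming a one-sample t test with t = (sample mean - hypothesised mean) / (s / √n). The salaries below are made up (in thousands) and are not the data of this problem, so the result differs from 0.83. The critical value 2.776 is the two-tailed 5% table value for 4 degrees of freedom, matching the five hypothetical observations.

```python
import math
import statistics


def one_sample_t(sample, hypothesised_mean):
    """t = (sample mean - hypothesised mean) / (s / sqrt(n))."""
    n = len(sample)
    sample_mean = statistics.mean(sample)
    sample_sd = statistics.stdev(sample)  # sample standard deviation (n - 1)
    return (sample_mean - hypothesised_mean) / (sample_sd / math.sqrt(n))


# Hypothetical salaries in thousands; H0: mean salary = 45 (i.e. 45000).
salaries = [46, 54, 42, 46, 32]
t_value = one_sample_t(salaries, 45)

critical_value = 2.776  # two-tailed, 5% level, n - 1 = 4 degrees of freedom

print("Calculated t:", round(t_value, 3))
if abs(t_value) < critical_value:
    print("Calculated < critical: accept H0")
else:
    print("Calculated > critical: reject H0")
```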

Solution:
Null Hypothesis: The die was fair.
Alternate hypothesis: The die was not fair.
Level of significance: 5%, i.e. 0.05 (given)
Degrees of freedom = n - 1 = 6 - 1 = 5 (n = 6 possible outcomes, i.e. faces 1, 2, 3, 4, 5, 6)
Calculated Value (observed) > Critical (table) Value
i.e. 11.40 > 11.07
Hence the null hypothesis is rejected and the alternate hypothesis can be accepted. Thus we
can say that the die was not fair.
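A minimal sketch of a chi-square goodness-of-fit calculation for a die, using the statistic χ² = Σ (O - E)² / E. The observed counts are hypothetical, so the result is not the 11.40 of this problem; only the critical value 11.07 (5% level, 5 degrees of freedom) is taken from the solution above.

```python
# Hypothetical observed counts for faces 1..6 (60 rolls in total).
observed = [5, 8, 9, 8, 10, 20]

total_rolls = sum(observed)
# Under H0 (fair die) every face is expected equally often.
expected = total_rolls / len(observed)

# Chi-square statistic: sum of (observed - expected)^2 / expected.
chi_square = sum((o - expected) ** 2 / expected for o in observed)

degrees_of_freedom = len(observed) - 1  # 6 - 1 = 5
critical_value = 11.07                  # table value at 5% for 5 d.f.

decision = "reject H0" if chi_square > critical_value else "accept H0"

print("Expected count per face:", expected)
print("Calculated chi-square:", round(chi_square, 2))
print("Degrees of freedom:", degrees_of_freedom)
print("Decision:", decision)
```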

Solution:
Null Hypothesis: The mean height of the students is 171.17.
Alternate hypothesis: The mean height of the students is not 171.17.
Calculated Value (observed) < Critical (table) Value
i.e. 1.27 < 1.96
Hence the null hypothesis is accepted, and we can say that the mean height of the students is
171.17.

Type I (Alpha: α) & Type II (Beta: β) error in hypothesis testing:
• Type I errors happen when we reject a true null hypothesis
• Type II errors happen when we fail to reject a false null hypothesis.
• Type I Error: The first kind of error that is possible involves the rejection of a null hypothesis that
is actually true. This kind of error is called a type I error and is sometimes called an error of the first
kind.
• Type II Error: The other kind of error that is possible occurs when we do not reject a null
hypothesis that is false. This sort of error is called a type II error and is also referred to as an error of
the second kind.
Example:
• Type I error - Concluding that the new drug is better than the standard (HA) when in fact it is no
better (H0). Ineffective drug is deemed better.
• Type II error - Failing to conclude that the new drug is better (HA) when in fact it is. Effective drug is
deemed to be no better.

Correlation
◼ Correlation: The degree of relationship between the variables under consideration is
measured through correlation analysis.
◼ The measure of correlation is called the correlation coefficient.
◼ The degree of relationship is expressed by a coefficient that ranges from -1 to +1 (-1 ≤ r ≤ +1).
◼ The direction of change is indicated by the sign.
◼ Correlation analysis enables us to have an idea about the degree & direction of the relationship
between the two variables under study.
◼ Correlation is a statistical tool that helps to measure and analyze the degree of relationship between
two variables.
◼ Correlation analysis deals with the association between two or more variables.
1) Positive and Negative Correlation:
◼ Positive Correlation: The correlation is said to be positive if the values of the two variables
change in the same direction.
Ex. Publicity/advertising expenditure & sales; height & weight.
◼ Negative Correlation: The correlation is said to be negative when the values of the variables
change in opposite directions.
Ex. Price & quantity demanded.
2) Simple and Multiple Correlation
◼ Simple correlation: Under simple correlation only two variables are studied.
(Figure: Types of Correlation: 1) Positive and Negative Correlation, 2) Simple and Multiple Correlation, 3) Linear and Non-linear Correlation)

◼ Multiple Correlation: Under multiple correlation three or more variables are studied.
Ex. Qd = f (P, PC, PS, t, y)
◼ Partial correlation: The analysis recognizes more than two variables but considers only two of them,
keeping the others constant.
◼ Total correlation: It is based on all the relevant variables, which is normally not feasible.
3) Linear and Non-linear Correlation
◼ Linear correlation: Correlation is said to be linear when the amount of change in one variable tends
to bear a constant ratio to the amount of change in the other. The graph of variables having a
linear relationship forms a straight line.
Ex. X = 1, 2, 3, 4, 5, 6, 7, 8
Y = 5, 7, 9, 11, 13, 15, 17, 19
Y = 3 + 2X
◼ Non-linear correlation: The correlation is non-linear if the amount of change in one
variable does not bear a constant ratio to the amount of change in the other variable.
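The "constant ratio" idea can be checked directly: for each pair of neighbouring points, divide the change in Y by the change in X. The sketch below does this for the X and Y series of the example above and, for contrast, for a made-up non-linear series (Y = X squared). The function name is illustrative only.

```python
def change_ratios(x, y):
    """Ratio of the change in y to the change in x between neighbouring points."""
    return [(y[i + 1] - y[i]) / (x[i + 1] - x[i]) for i in range(len(x) - 1)]


def is_linear(x, y, tolerance=1e-9):
    """True if every change ratio equals the first one (within a tolerance)."""
    ratios = change_ratios(x, y)
    return all(abs(r - ratios[0]) < tolerance for r in ratios)


# Series from the example above: Y = 3 + 2X.
x = [1, 2, 3, 4, 5, 6, 7, 8]
y = [5, 7, 9, 11, 13, 15, 17, 19]

# Hypothetical non-linear series for contrast: Y = X squared.
y_squared = [value ** 2 for value in x]

print("Ratios for Y = 3 + 2X:", change_ratios(x, y))
print("Linear?", is_linear(x, y))
print("Ratios for Y = X^2:", change_ratios(x, y_squared))
print("Linear?", is_linear(x, y_squared))
```

For the first series every ratio equals 2, the slope in Y = 3 + 2X; for the second the ratio keeps growing, so the relationship is non-linear.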
1) Scatter Diagram:
A scatter diagram is a graph of observed plotted points, where each point represents the
values of X & Y as a coordinate. It portrays the relationship between these two variables
graphically.
2) Karl Pearson’s Coefficient of Correlation (r)
(Figure: Methods of Studying Correlation: 1) Scatter Diagram Method, 2) Karl Pearson’s Coefficient of Correlation (r), 3) Spearman’s Rank Correlation Coefficient)

Karl Pearson’s coefficient of correlation is denoted by ‘r’. It measures the degree of linear
relationship between two variables, say x & y.
◼ Its value lies between -1 and +1: -1 ≤ r ≤ +1
◼ The degree of correlation is expressed by the value of the coefficient.
◼ The direction of change is indicated by the sign (-ve or +ve).
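A minimal sketch of the calculation of r, using the standard product-moment formula r = Σ(x - x̄)(y - ȳ) / √(Σ(x - x̄)² · Σ(y - ȳ)²), which is not written out in this handout. The first data set is the Y = 3 + 2X example from the linear correlation section; the price and quantity figures are made up to show a negative sign.

```python
import math


def pearson_r(x, y):
    """Karl Pearson's coefficient of correlation for paired lists x and y."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sum of products of deviations, and sums of squared deviations.
    s_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    s_xx = sum((a - mean_x) ** 2 for a in x)
    s_yy = sum((b - mean_y) ** 2 for b in y)
    return s_xy / math.sqrt(s_xx * s_yy)


# Perfect positive linear relationship: Y = 3 + 2X.
x = [1, 2, 3, 4, 5, 6, 7, 8]
y = [5, 7, 9, 11, 13, 15, 17, 19]
r_linear = pearson_r(x, y)

# Hypothetical price and quantity demanded: quantity falls as price rises.
price = [10, 12, 14, 16, 18]
quantity = [50, 45, 41, 34, 30]
r_demand = pearson_r(price, quantity)

print("r for Y = 3 + 2X:", r_linear)
print("r for price & quantity:", round(r_demand, 3))
```

The sign of r gives the direction of the relationship and its magnitude gives the degree, as described in the points above.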
3) Spearman’s Rank Coefficient of Correlation ρ (rho):
When the variables under study cannot be measured quantitatively but can be arranged in
serial order (ranked), Pearson’s correlation coefficient cannot be used; in such cases
Spearman’s rank correlation can be used.
$\rho = 1 - \frac{6 \sum D^2}{N(N^2 - 1)}$
where D is the difference between the two ranks of each item and N is the number of paired items.
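The rank correlation formula can be sketched as below. The ranks given by two judges to five contestants are hypothetical, and the formula in this form assumes there are no tied ranks.

```python
def spearman_rho(ranks_x, ranks_y):
    """Spearman's rank correlation: 1 - 6 * sum(D^2) / (N * (N^2 - 1))."""
    n = len(ranks_x)
    # D is the difference between the two ranks of each item.
    sum_d_squared = sum((a - b) ** 2 for a, b in zip(ranks_x, ranks_y))
    return 1 - (6 * sum_d_squared) / (n * (n ** 2 - 1))


# Hypothetical ranks given by two judges to the same five contestants.
judge_a = [1, 2, 3, 4, 5]
judge_b = [2, 1, 4, 3, 5]

rho = spearman_rho(judge_a, judge_b)
print("Spearman's rho:", rho)
```

Here the rank differences are -1, 1, -1, 1, 0, so ΣD² = 4 and ρ = 1 - 24 / 120 = 0.8, indicating strong agreement between the two judges.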

Interpretation of Correlation:

Data analysis:
• Data analysis involves entering data into computer files, inspecting data for errors
(data cleaning), running tabulations (frequencies), and conducting various statistical
tests.
– Before analyzing data, a few preliminary tasks must be completed. These
tasks include editing, coding, classification and tabulation.
i) Editing:

Editing is the procedure that improves the quality of the data for coding. Here the
researcher looks for inconsistencies and incompleteness in the data; such data are
discarded for lack of utility for interpretation and analysis.
• “The process of checking the completeness, consistency & legibility of data ready for
coding & transfer to storage.”
• “The process of checking and adjusting responses in the completed questionnaires for
omissions, legibility, and consistency and readying them for coding and storage.”
Purpose of editing
• The purpose of editing is to ensure that the data are:
• 1) Accurate
• 2) Consistent with the intent of the question & other information in the survey
• 3) Complete
• 4) Arranged to simplify coding & tabulation
Types of Editing:
1. Field Editing
– Preliminary editing by a field supervisor on the same day as the interview to
catch technical omissions, check legibility of handwriting, and clarify responses
that are logically or conceptually inconsistent.
2. In-house Editing
– Editing performed by central office staff; often done more rigorously than
field editing.
ii) Coding:
– Coding involves assigning suitable codes or symbols to categories of responses
so as to aid further human or computer analysis.
• The process of identifying and classifying each answer with a numerical score or other
character symbol
– The numerical score or symbol is called a code, and serves as a rule for
interpreting, classifying, and recording data
– Identifying responses with codes is necessary if data is to be processed by
computer

Example: Male = 1, Female = 2 & Transgender = 3
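As a small illustration of coding, the sketch below applies the code book from the example above to a list of made-up questionnaire responses and then counts how often each code occurs. The variable names and the responses are hypothetical.

```python
from collections import Counter

# Code book from the example above.
codes = {"Male": 1, "Female": 2, "Transgender": 3}

# Hypothetical responses to the gender question.
responses = ["Female", "Male", "Male", "Transgender", "Female", "Female"]

# Coding: replace every answer with its numerical code.
coded = [codes[answer] for answer in responses]

# Simple tabulation of the coded data.
tally = Counter(coded)

print("Coded responses:", coded)
for code in sorted(tally):
    print(f"Code {code}: {tally[code]} respondent(s)")
```

Once the answers are stored as codes, they can be counted, cross-tabulated or passed to statistical software without retyping the original text.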
iii) Classification:
The next stage is classification, where data having a common
characteristic are placed in one class.
– Classification according to attributes. It can be:
• Descriptive, e.g. literacy, gender etc., or
• Numerical, e.g. weight, height, income etc.
– Classification according to class intervals:
• Data relating to income, production, age, weight etc. come under
this category.
iv) Tabulation:
Here classified data are put in the form of tables.
**** For details about correlation types, graphical presentations, dot plots & other details, please check the complete
PPT at the link below:
https://www.slideshare.net/shriramd/shriram-correlation
For any query write to: ssdawkhar@gmail.com
Thank You.
