Q 1. (A) What Do You Understand by Word "Statistics", Give Out Its Definitions (Minimum by 4 Authors) As Explained by Various Distinguished Authors

BUSINESS STATISTICS
Q 1. (a) What do you understand by word “Statistics”, give out its definitions (minimum by
4 Authors) as explained by various distinguished authors.
Answer 1 [a].
SYNOPSIS
• Introduction
• BRIEF HISTORY OF STATISTICS
• DEFINITIONS EXPLAINED BY VARIOUS DISTINGUISHED AUTHORS
• Prof. Horace Secrist
• Croxton and Cowden
• Prof. Ya Lun Chou
• Wallis and Roberts
• Conclusion
Introduction
The word statistics in our everyday life means different things to different
people. To a football fan, statistics are the information about rushing yardage,
passing yardage, and first downs, given at halftime. To a manager of a power
generating station, statistics may be information about the quantity of
pollutants being released into the atmosphere.
To a school principal, statistics are information on absenteeism, test
scores and teacher salaries.
To a medical researcher investigating the effects of a new drug, statistics
are evidence of the success of research efforts.
And to a college student, statistics are the grades made on all the quizzes
in a course this semester.
Each of these people is using the word statistics correctly, yet each uses
it in a slightly different way and for a somewhat different purpose. Statistics is
a word that can refer to quantitative data or to a field of study. As a field of
study, statistics is the science of collecting, organizing and interpreting
numerical facts, which we call data. We are bombarded by data in our everyday
life.
The collection and study of data are important in the work of many
professions, so that training in the science of statistics is valuable preparation
for a variety of careers. Each month, for example, government statistical offices
release the latest numerical information on unemployment and inflation.
Economists and financial advisors as well as policy makers in government and
business study these data in order to make informed decisions.
Submitted by: AMIT ALEXANDER
Farmers study data from field trials of new crop varieties. Engineers gather
data on the quality and reliability of manufactured products. Most areas of
academic study make use of numbers, and therefore also make use of the methods
of statistics. Whatever else it may be, statistics is, first and foremost, a
collection of tools used for converting raw data into information to help
decision makers in their work. The science of data - statistics - is the subject
of this course.
BRIEF HISTORY OF STATISTICS
The word statistik comes from the Italian word statista (meaning
“statesman”). It was first used by Gottfried Achenwall (1719-1772), a professor
at Marburg and Göttingen. Dr. E.A.W. Zimmermann introduced the word
statistics to England. Its use was popularized by Sir John Sinclair in his work
“Statistical Account of Scotland 1791-1799”. Long before the eighteenth
century, however, people had been recording and using data. Official
government statistics are as old as recorded history. The emperor Yao had
taken a census of the population in China in the year 2238 B.C. The Old
Testament contains several accounts of census taking.
Governments of ancient Babylonia, Egypt and Rome gathered detailed
records of population and resources. In the Middle Ages, governments began to
register the ownership of land. In A.D. 762 Charlemagne asked for detailed
descriptions of church-owned properties. Early in the ninth century, he
completed a statistical enumeration of the serfs attached to the land. About
1086, William the Conqueror ordered the writing of the Domesday Book, a
record of the ownership, extent, and value of the lands of England. This work
was England’s first statistical abstract. Because of Henry VIII’s fear of the
plague, England began to register its dead in 1532. About this same time,
French law required the clergy to register baptisms, deaths and marriages.
During an outbreak of the plague in the late 1500s, the English government
started publishing weekly death statistics. This practice continued, and by
1632 these Bills of Mortality listed births and deaths by sex.
In 1662, Captain John Graunt used thirty years of these Bills to make
predictions about the number of persons who would die from various diseases
and the proportion of male and female births that could be expected.
Summarized in his work, Natural and Political Observations ...Made upon the
Bills of Mortality, Graunt’s study was a pioneer effort in statistical analysis. For
his achievement in using past records to predict future events, Graunt was
made a member of the original Royal Society. The history of the development of
statistical theory and practice is a lengthy one. We have only begun to list the
people who have made significant contributions to this field. Later we will
encounter others whose names are now attached to specific laws and methods.
Many people have brought to the study of statistics refinements or innovations
that, taken together, form the theoretical basis of what we will study in this
course.
DEFINITIONS EXPLAINED BY VARIOUS DISTINGUISHED AUTHORS
• Prof. Horace Secrist defines statistics as follows:
‘By statistics we mean aggregates of facts affected to a marked extent by
multiplicity of causes, numerically expressed, enumerated or estimated
according to reasonable standards of accuracy, collected in a systematic
manner for a pre-determined purpose and placed in relation to each other.’
Thus, according to Prof. Horace Secrist, the following characteristics of
statistics can be noticed.
1. Statistics means an aggregate of facts: Facts can be analysed only when
there is more than one fact. A single fact cannot be analysed. Thus, the fact
‘Mr. John is 180 cm tall’ cannot be statistically analysed. On the other
hand, if we know the heights of 40 students of a class, we can comment upon
the average height, variation, etc. Hence, only a collection of many facts can be
called statistics.
2. Statistics are affected to a marked extent by multiplicity of causes: The
facts are the results of the action and interaction of a number of factors. Thus,
the statistics of yield of paddy are the result of factors such as fertility of soil,
amount of rainfall, quality of seed used, quality and quantity of fertilizer used,
etc. These factors, in turn, are the results of many other factors.
3. Statistics are numerically expressed: Only numerical facts can be
statistically analysed. Therefore, facts such as ‘price decreases with increasing
production’ cannot be called statistics.
4. Statistics are enumerated or estimated according to reasonable
standards of accuracy: The facts should be enumerated (collected from the
field) or estimated (computed) with the required degree of accuracy. The degree
of accuracy differs from purpose to purpose. In measuring the length of screws,
accuracy up to a millimetre may be required, whereas in measuring the
heights of students in a class, accuracy up to a centimetre is enough.
5. Statistics are collected in a systematic manner: The facts should be
collected according to planned and scientific methods. Otherwise, they are
likely to be wrong and misleading.
6. Statistics are collected for a pre-determined purpose: There must be a
definite purpose for collecting facts. Otherwise, the facts become useless and
hence, they cannot be called statistics.
7. Statistics are placed in relation to each other: The facts must be placed
in such a way that a comparative and analytical study becomes possible. Thus,
only related facts which are arranged in logical order can be called statistics.
In summary, Prof. Secrist's definition implies that:
• Statistics are aggregates of facts
• Statistics are affected to a marked extent by multiplicity of causes
• Statistics are numerically expressed
• Statistics are enumerated or estimated according to reasonable standards of
accuracy
• Statistics are collected in a systematic manner
• Statistics are collected for a predetermined purpose
• Statistics should be placed in relation to each other
• According to Croxton and Cowden, “Statistics is the science of
collection, presentation, analysis and interpretation of numerical data.”
Thus, statistics contains the tools and techniques required for the
collection, presentation, analysis and interpretation of data. This
definition is precise and comprehensive.
• According to Prof. Ya Lun Chou, “Statistics is a method of decision
making in the face of uncertainty on the basis of numerical data and
calculated risks.”
• According to Wallis and Roberts, “Statistics is not a body of substantive
knowledge but a body of methods for obtaining knowledge.”
Conclusion
Thus, statistics is an important tool for carrying out complicated
calculations and is very useful for interpreting data of many kinds.

(b) Enumerate some important development of statistical theory, also explain merits and
limitations of statistics.
Answer 1 [b]
SYNOPSIS
• Introduction
• MERITS OF STATISTICS
• LIMITATIONS OF STATISTICS
• Conclusion
Introduction
Bernoulli or Bernouilli, name of a family distinguished
in scientific and mathematical history. The family, after leaving Antwerp, finally
settled in Basel, Switzerland, where it grew in fame.
Jacob, Jacques, or James Bernoulli, 1654–1705, became professor at
Basel in 1687. He was one of the chief developers both of the ordinary calculus
(the branch of mathematics that studies continuously changing quantities,
characterized by the use of infinite processes involving passage to a limit, that
is, the notion of tending toward, or approaching, an ultimate value)
and of the calculus of variations. He was the first to use the word integral in
solving Leibniz's problem of the isochronous curve. He wrote an important
treatise on the theory of probability (1713) and discovered the series of
numbers that now bear his name, i.e., the coefficients of the exponential series
expansion of $x/(1 - e^{-x})$.
Blaise Pascal and Pierre Fermat are credited with founding
mathematical probability because they solved the problem of points, the
problem of equitably dividing the stakes when a fair game is halted before
either player has enough points to win. This problem had been discussed for
several centuries before 1654, but Pascal and Fermat were the first to give the
solution we now consider correct. They also answered other questions about
fair odds in games of chance. Their main ideas were popularized by Christian
Huygens, in his De ratiociniis in ludo aleae, published in 1657.
During the century that followed this work, other authors, including
James and Nicholas Bernoulli, Pierre Rémond de Montmort, and Abraham
De Moivre, developed more powerful mathematical tools in order to calculate
odds in more complicated games. De Moivre, Thomas Simpson, and others
also used the theory to calculate fair prices for annuities and insurance
policies.
James Bernoulli's Ars conjectandi, published in 1713, laid the
philosophical foundations for broader applications. Bernoulli brought the
philosophical idea of probability into the mathematical theory, formulated rules
for combining the probabilities of arguments, and proved his famous theorem:
the probability of an event is morally certain to be approximated by the
frequency with which it occurs. Bernoulli advanced this theorem (later called
the law of large numbers by Poisson) as a justification for using observed
frequencies as probabilities, to be combined by his rules to settle practical
questions.
Bernoulli's ideas attracted philosophical and mathematical attention, but
gambling, whether on games or on lives, remained the primary source of new
ideas for probability theory during the first half of the eighteenth century. In
the second half of the eighteenth century, a new set of ideas came into play.
Workers in astronomy and geodesy began to develop methods for reconciling
observations, and students of probability theory began to seek probabilistic
grounding for such methods. This work inspired Pierre Simon Laplace's
invention of the method of inverse probability, and it benefitted from
the evolution of Bernoulli's law of large numbers into what we now call the
central limit theorem. It culminated in Adrien Marie Legendre's publication of
the method of least squares in 1805 and in the probabilistic rationales for least
squares developed by Laplace and by Carl Friedrich Gauss from 1809 to
1811. These ideas were brought together in Laplace's great treatise on
probability, Théorie analytique des probabilités, published in 1812.
The French-born British mathematician Abraham de Moivre
(May 26, 1667 – November 27, 1754) is best known for the
fundamental formula of complex numbers
$(\cos x + i \sin x)^n = \cos nx + i \sin nx$, where $i = \sqrt{-1}$,
called de Moivre’s theorem. It is the keystone of analytic
trigonometry, linking complex numbers with trigonometry. The result can be
used to find explicit expressions for the nth roots of unity, that is, complex
numbers satisfying the equation $z^n = 1$. De Moivre never explicitly stated it in
his work. However, his familiarity with it is clear from a related formula that he
discovered in 1722, namely,
$\cos \varphi = \frac{1}{2}(\cos n\varphi + i \sin n\varphi)^{1/n} + \frac{1}{2}(\cos n\varphi - i \sin n\varphi)^{1/n}$.
De Moivre’s chief works were The Doctrine of Chances (1718), a key contribution
to the early history of probability; the Miscellanea Analytica (1730), in which he
investigated infinite series; and A Treatise of Annuities on Lives (1752), an
application of probability to mortality statistics and the creation of the theory
of annuities. This remarkably original work laid the foundations of the
mathematics of life insurance. De Moivre was born in Vitry, near Paris, and
spent five years at a Protestant academy at Sedan. From 1682 to 1684, he
studied logic at Saumur, then entered the Collège de Harcourt in Paris and
took private mathematical lessons with Jacques Ozanam. De Moivre had the
misfortune to be a Huguenot (Calvinist) at the time that Roman Catholic
France revoked the Edict of Nantes and began persecuting French Protestants.
De Moivre studied Newton’s Principia and became such an expert on it
that in later years, when asked about some point or another in it, Newton
would say, “Go to Mr. De Moivre; he knows these things better than I do.” In
1733 de Moivre derived what is now known as the normal distribution as a
method for estimating discrete probabilities, in particular those involving the
binomial distribution. He sought to determine the probability of the most
frequent occurrences in a binomial distribution. De Moivre’s concern was with
games of chance, and his discovery showed the power of sampling to determine
patterns in a population by examining only a few members or cases. Later
Pierre Simon Laplace, motivated by observational science, discovered that the
means of various samples of n measurements are distributed approximately
according to the normal curve. From these approximations Laplace was able only
to state the high probability of sample means lying within a given range
according to the normal distribution. This approximation of the probability of
sums of binomial distribution values is now known as the de Moivre-Laplace limit
theorem. Prior to the discovery of this theorem, probability and statistics were
treated as two separate entities. It was the first example of the central limit
theorems, most of which were derived by Pafnuty Tchebycheff and his
students Andrei Markov and Aleksandr Lyapunov during the period 1880 to
1920. These theorems unified probability and statistics. One of the most
important examples of a continuous probability distribution, the normal
distribution is often referred to as “the bell-shaped curve,” after the shape of
its graph (Figure).

An extremely wide range of natural phenomena are accurately described
using the curve. The empirical rule of the normal distribution is that about 68%
of the area under the curve lies within one standard deviation of the center;
about 95% of the area under the curve lies within two standard deviations of the
center; and about 99.7% of the area under the curve lies within three standard
deviations of the center. A standard deviation is the square root of the average
of the squares of the deviations from the mean in a frequency distribution.
De Moivre’s The Doctrine of Chances marked the first appearance of the bell
curve, although the origin of the curve is sometimes attributed to Gauss, who
did the most important and fundamental work with the normal distribution; so
much so that it is sometimes referred to as the Gaussian distribution.
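The percentages in the empirical rule can be checked numerically. The sketch below is only an illustration: the function name normal_coverage is hypothetical, and it relies on the standard identity that, for a normal distribution, the probability of lying within k standard deviations of the mean equals erf(k/√2).

```python
import math

def normal_coverage(k: float) -> float:
    """Probability that a normal variable lies within k standard deviations of its mean."""
    # For any normal distribution, P(|X - mean| < k * sd) = erf(k / sqrt(2)).
    return math.erf(k / math.sqrt(2))

for k in (1, 2, 3):
    print(f"within {k} standard deviation(s): {normal_coverage(k):.4f}")
```

The three values come out close to 0.68, 0.95 and 0.997, which is where the rounded percentages in the rule come from.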
The theory of probability was initially developed by James Bernoulli, Daniel
Bernoulli, Laplace and Carl Friedrich Gauss. According to this theory, probability
starts with logic. There is a set of N elements. We can define a subset of n
favorable elements, where n is less than or equal to N. Probability is defined as
the ratio of the favorable cases to the total cases, calculated as:
$p = \frac{n}{N}$
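As a small illustration of the ratio of favorable cases to total cases, the sketch below counts outcomes for one roll of a fair die. The function name classical_probability and the die example are assumptions made for illustration, and the calculation is only valid when all N outcomes are equally likely.

```python
from fractions import Fraction

def classical_probability(favorable: set, sample_space: set) -> Fraction:
    """Return n/N, assuming every outcome in sample_space is equally likely."""
    if not favorable <= sample_space:
        raise ValueError("favorable outcomes must be a subset of the sample space")
    return Fraction(len(favorable), len(sample_space))

die = {1, 2, 3, 4, 5, 6}   # N = 6 possible outcomes
even = {2, 4, 6}           # n = 3 favorable outcomes
p_even = classical_probability(even, die)
print(p_even)              # prints 1/2
```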
The normal curve was discovered by Abraham de Moivre (1667-1754). The normal
distribution (also called the Gaussian distribution or the famous “bell-shaped”
curve) is the most important and widely used distribution in statistics. The
normal distribution is characterized by two parameters, $\mu$ and $\sigma^2$,
namely, the mean and variance of the population having the normal distribution.

Jacques Quetelet (1796-1874) discovered the fundamental principle of “the
constancy of great numbers”, which became the basis of sampling.
Regression was developed by Sir Francis Galton. It evaluates the relationship
between one variable (termed the dependent variable) and one or more other
variables (termed the independent variables). It is a form of global analysis, as
it produces only a single equation for the relationship and thus does not allow
any variation across the study area.
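To make the idea of a single fitted equation concrete, here is a minimal sketch of simple linear regression with one independent variable, fitted by least squares. The data and the names fit_line, spend and sales are hypothetical and chosen only for illustration.

```python
def fit_line(xs, ys):
    """Least-squares fit of y = intercept + slope * x for paired observations."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of cross-deviations and sum of squared deviations of x.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope

# Hypothetical data: advertising spend (independent) and sales (dependent).
spend = [1, 2, 3, 4, 5]
sales = [3, 5, 7, 9, 11]   # constructed to follow y = 1 + 2x exactly
intercept, slope = fit_line(spend, sales)
print(f"sales = {intercept:.1f} + {slope:.1f} * spend")
```

The one equation returned here summarizes the whole data set, which is what the text means by a global analysis.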
Karl Pearson developed the concept of the chi-square goodness-of-fit test. Sir
Ronald Fisher (1890-1962) made a major contribution in the field of experimental
design, turning it into a science. Since 1935 “design of experiments” has made
rapid progress, making the collection and analysis of statistical data prompter
and more economical. Design of experiments is the complete sequence of steps
taken ahead of time to ensure that the appropriate data will be obtained, which
will permit an objective analysis and will lead to valid inferences regarding the
stated problem.
MERITS OF STATISTICS
• Presenting facts in a definite form.
• Simplifying a mass of figures by condensing it into a few significant figures.
• Facilitating comparison.
• Helping in formulating and testing hypotheses and developing new
theories.
• Helping in predictions.
• Helping in the formulation of suitable policies.
LIMITATIONS OF STATISTICS
• Does not deal with individual measurements.
• Deals only with quantitative characteristics.
• Results are true only on an average.
• It is only one of the methods of studying a problem.
• Statistics can be misused. It requires skill to use it effectively;
otherwise misinterpretation is possible.
• It is only a tool or means to an end and not the end itself, which has to be
intelligently identified using this tool.
Conclusion
James Bernoulli, Pierre Fermat, Blaise Pascal, Christian Huygens,
Abraham De Moivre, Pierre Simon Laplace, Carl Friedrich Gauss, Pafnuty
Tchebycheff, Jacques Quetelet, Francis Galton and Karl Pearson are among
the great contributors to mathematics and statistics. They are the
pillars of statistics. Their contributions are important to the whole
world, and without them statistical calculations would not be as easy as they
seem today.

Q 2. (a) Define elementary theory of sets, also explain various methods by giving suitable
examples, Narrate the utility of “Set Theory” in an organization.
Answer 2 [a]
SYNOPSIS
• Introduction
• UTILITY OF SET THEORY IN BUSINESS ORGANISATION
• Conclusion
Introduction
A set is a collection of items, objects or elements, governed by a rule
indicating whether an object belongs to the set or not. In the conventional
notation of sets, capital letters like A, B, C, X, U, S etc. are used to denote
sets. Braces ‘{ }’ are used to enclose the collection of objects or elements in
the set. The Greek letter epsilon ‘∈’ is used to denote “belongs to”. A vertical
line ‘|’ is used to denote the expression ‘such that’. The letter ‘I’ is used to
denote the set of integers. Using the above notation, a set called A consisting
of the elements 0, 1, 2, 3, 4, 5 may be mathematically denoted by either of the
first two methods below; the remaining items define related kinds of sets.
1) List or roster method
This means all elements are actually listed.
A = {0, 1, 2, 3, 4, 5}, read as: A is a set with elements 0, 1, 2, 3, 4, 5.
2) Set-builder or rule method (in which a mathematical rule, equality or
inequality etc. is specified to generate the elements of the intended set)
A = {x | 0 ≤ x ≤ 5, x ∈ I}, where I is the set of integers
Read as: A is (a set of) (variable x) (such that) (x lies between 0 and 5, both
inclusive), where the variable x belongs to the integers.
3) Universal set: a set consisting of all objects or elements of a type or of a
given interest, normally denoted by the letters X, U or S.
E.g. the set of all digits may be expressed as X = {0, 1, 2, 3, 4, 5, 6, 7, 8, 9}
4) Finite set: one in which the number of elements can be counted.
E.g. A = {1, 2, 3, …, 15} having 15 elements,
or the set of employees in an organization.
5) Infinite set: one in which the counting of elements never ends.
E.g. the set of integers or the set of real numbers.
6) Subset: A is called a subset of B if every element of A is also an element of B.
This is represented as A ⊆ B and is read as “A is a subset of B”.
E.g. each of the sets A = {0, 1, 2, 3, 4, 5} and B = {1, 3, 5} is a subset of
the set C, where C = {0, 1, 2, 3, 4, 5}.
7) Superset: A is a superset of B if every element of B is an element of A.
This is represented as A ⊇ B and is read as “A is a superset of B”.
8) Equal sets: if A is a subset of B and B is a subset of A, then A and B are
called equal sets. This can be denoted as follows:
If A ⊆ B and B ⊆ A then A = B
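The roster and set-builder methods, and the subset, superset and equality relations, map directly onto Python's built-in set type. The sketch below is only an illustration; the variable names follow the examples in the text, and the search range -10 to 10 in the set-builder line is an arbitrary assumption standing in for "the integers".

```python
# Roster method: every element is listed explicitly.
A = {0, 1, 2, 3, 4, 5}

# Set-builder method: a rule selects the elements.
# range(-10, 11) is an assumed finite stand-in for the integers I.
A_rule = {x for x in range(-10, 11) if 0 <= x <= 5}

X = set(range(10))   # universal set of all digits
B = {1, 3, 5}
C = {0, 1, 2, 3, 4, 5}

print(A == A_rule)           # both methods describe the same set
print(B <= C, A <= C)        # B and A are subsets of C
print(C >= B)                # C is a superset of B
print(A <= C and C <= A)     # mutual subsets, so A and C are equal sets
print(len(A))                # a finite set: its elements can be counted
```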
UTILITY OF SET THEORY IN BUSINESS ORGANISATION.
A company consists of sets of resources like personnel, machines, material
stocks and cash reserves. The relationships between these sets and between the
subsets of each set are used to equate assets of one kind with another. The
subset of highly skilled production workers, within the set of all production
workers, is a critical subset that determines the productivity of other
personnel. Certain subsets of company products are highly profitable, and certain
materials may be subject to deterioration and must be stocked in greater
quantities than others. Thus the concept of sets is very useful in business
management.

(b) Explain the meaning and type of “Data” as applicable in any business. How would you
classify and tabulate the Data, support your answer with examples.
Answer 2 [b]
SYNOPSIS
• Introduction
• Classification and tabulation of data
• Conclusion
Introduction
Data is any group of observations or measurements related to an area of
business interest and to be used for decision making. It can be of the following
two types.
1) Qualitative data (representing non-numeric features or qualities of the object
under reference)
2) Quantitative data (representing properties of the object under reference with
numeric details)
Data can also be of the following types:
1) Primary data (observed and recorded as part of an original experiment
or survey)
2) Secondary data (compiled by someone other than the user of the data
for decision-making purposes)
Classification and tabulation of data:
Data can be classified by
• geographical areas;
• chronological sequences;
• qualitative attributes like urban or rural, male or female, literate or
illiterate, undergraduate, graduate or postgraduate, employed or unemployed
and so on;
while the most frequently used method of classification of data is the
quantitative classification.
Tabulation:
After data is classified, it is represented in a tabular form. A self-explanatory
and comprehensive table has a
• table number,
• title of the table,
• captions (column or sub-column headings),
• stubs (row headings),
• body containing the main data, which occupies the cells of the table after the
data has been classified under the various captions and stubs.
Head notes are added at the top of the table for general information regarding
the relevance of the table or for cross reference or links with other literature.
Foot notes are appended for clarification, explanation or as additional comments
on any of the cells in the table.
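As a rough illustration of classifying data by qualitative attributes and then tabulating it, the sketch below counts a handful of records and prints them under captions and stubs. The records, the table title and the function name build_table are all hypothetical.

```python
from collections import Counter

# Hypothetical survey records: (area, employment status).
records = [
    ("urban", "employed"), ("urban", "unemployed"), ("rural", "employed"),
    ("rural", "employed"), ("urban", "employed"), ("rural", "unemployed"),
]

# Classification: count how many records fall into each (area, status) class.
counts = Counter(records)

areas = ["urban", "rural"]               # stubs (row headings)
statuses = ["employed", "unemployed"]    # captions (column headings)

def build_table(counts, areas, statuses):
    """Lay the classified counts out as a titled text table."""
    lines = ["Table 1: Respondents by area and employment status"]
    lines.append(f"{'Area':<8}" + "".join(f"{s:>12}" for s in statuses))
    for area in areas:
        row = "".join(f"{counts[(area, s)]:>12}" for s in statuses)
        lines.append(f"{area:<8}" + row)
    return "\n".join(lines)

print(build_table(counts, areas, statuses))
```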

Q 3. (a) Describe Arithmetic, Geometric and Harmonic means with suitable examples.
Explain merits and limitations of Geometric mean.
Answer 3 [a]
Arithmetic mean
The arithmetic mean is the "standard" average, often simply called the "mean".
It is used for many purposes but also often abused by incorrectly using it to
describe skewed distributions, with highly misleading results. The classic
example is average income - using the arithmetic mean makes it appear to be
much higher than is in fact the case. Consider the scores {1, 2, 2, 2, 3, 9}. The
arithmetic mean is about 3.17, but five out of six scores are below this!
The arithmetic mean of a set of numbers is the sum of all the members of the
set divided by the number of items in the set. (The word set is used perhaps
somewhat loosely; for example, the number 3.8 could occur more than once in
such a "set".) The arithmetic mean is what pupils are taught very early to call
the "average." If the set is a statistical population, then we speak of the
population mean. If the set is a statistical sample, we call the resulting statistic
a sample mean. The mean may be conceived of as an estimate of the median.
When the mean is not an accurate estimate of the median, the set of numbers,
or frequency distribution, is said to be skewed.
We denote the set of data by $X = \{x_1, x_2, \ldots, x_n\}$. The symbol µ (Greek:
mu) is used to denote the arithmetic mean of a population. We use the name of the
variable, X, with a horizontal bar over it as the symbol ("X bar") for a sample
mean. Both are computed in the same way:
$\bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i = \frac{x_1 + x_2 + \cdots + x_n}{n}$
The arithmetic mean is greatly influenced by outliers. In certain situations, the
arithmetic mean is the wrong concept of "average" altogether. For example, if a
stock rose 10% in the first year, 30% in the second year and fell 10% in the
third year, then it would be incorrect to report its "average" increase per year
over this three year period as the arithmetic mean (10% + 30% + (-10%))/3 =
10%; the correct average in this case is the geometric mean which yields an
average increase per year of only 8.8%.
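The stock example can be reproduced in a few lines. This is a sketch only; the function name average_growth_rate is hypothetical, and yearly changes are written as fractions (0.10 for a 10% rise, -0.10 for a 10% fall).

```python
import math

def average_growth_rate(rates):
    """Geometric-mean growth rate per period for a list of fractional changes."""
    factors = [1 + r for r in rates]          # 10% rise -> factor 1.10
    return math.prod(factors) ** (1 / len(factors)) - 1

rates = [0.10, 0.30, -0.10]
arithmetic = sum(rates) / len(rates)          # the misleading 10% figure
geometric = average_growth_rate(rates)        # roughly 8.8% per year

print(f"arithmetic: {arithmetic:.1%}, geometric: {geometric:.1%}")
```

Compounding the geometric rate for three years reproduces the actual overall change (1.10 × 1.30 × 0.90), which the arithmetic figure does not.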
Arithmetic mean (AM)
The arithmetic mean is the "standard" average, often simply called the "mean".

The mean may often be confused with the median, mode or range. The mean is
the arithmetic average of a set of values, or distribution; however, for skewed
distributions, the mean is not necessarily the same as the middle value
(median), or the most likely (mode). For example, mean income is skewed
upwards by a small number of people with very large incomes, so that the
majority have an income lower than the mean. By contrast, the median income
is the level at which half the population is below and half is above. The mode
income is the most likely income, and favors the larger number of people with
lower incomes. The median or mode are often more intuitive measures of such
data.
Nevertheless, many skewed distributions are best described by their mean,
such as the exponential and Poisson distributions.
For example, the arithmetic mean of six values: 34, 27, 45, 55, 22, 34 is
$\frac{34 + 27 + 45 + 55 + 22 + 34}{6} = \frac{217}{6} \approx 36.17$
Geometric mean
The geometric mean is an average which is useful for sets of numbers which
are interpreted according to their product and not their sum (as is the case
with the arithmetic mean). For example rates of growth.
The geometric mean of a set of positive data is defined as the product of all the
members of the set, raised to a power equal to the reciprocal of the number of
members. In a formula: the geometric mean of $a_1, a_2, \ldots, a_n$ is
$(a_1 a_2 \cdots a_n)^{1/n}$, which is $\sqrt[n]{a_1 a_2 \cdots a_n}$. The
geometric mean is useful to determine "average factors". For example, if a stock
rose 10% in the first year, 20% in the second year and fell 15% in the third
year, then we compute the geometric mean of the factors 1.10, 1.20 and 0.85
as $(1.10 \times 1.20 \times 0.85)^{1/3} = 1.0391\ldots$ and we conclude that the
stock rose on average 3.91 percent per year. The geometric mean of a data set is
always smaller than or equal to the set's arithmetic mean (the two means are
equal if and only if all members of the data set are equal). This allows the
definition of the arithmetic-geometric mean, a mixture of the two which always
lies in between. The geometric mean is also the arithmetic-harmonic mean in the
sense that if two sequences $(a_n)$ and $(h_n)$ are defined:
$a_{n+1} = \frac{a_n + h_n}{2}, \quad a_0 = x$
and
$h_{n+1} = \frac{2}{\frac{1}{a_n} + \frac{1}{h_n}}, \quad h_0 = y$
then $a_n$ and $h_n$ will converge to the geometric mean of x and y.
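The convergence of the arithmetic-harmonic iteration to the geometric mean can be observed numerically. The sketch below is illustrative: the function name arithmetic_harmonic_mean, the starting values 4 and 9, and the fixed number of steps are all assumptions.

```python
def arithmetic_harmonic_mean(x, y, steps=20):
    """Repeatedly replace (a, h) by their arithmetic and harmonic means."""
    a, h = x, y
    for _ in range(steps):
        # Both new values are computed from the previous pair (a, h).
        a, h = (a + h) / 2, 2 / (1 / a + 1 / h)
    return a, h

a, h = arithmetic_harmonic_mean(4.0, 9.0)
print(a, h)   # both approach 6, the geometric mean of 4 and 9
```

The product a × h stays the same at every step, which is why the common limit must be the square root of x × y.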
Geometric mean (GM)
The geometric mean is an average that is useful for sets of positive numbers
that are interpreted according to their product and not their sum (as is the
case with the arithmetic mean) e.g. rates of growth.
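For example, the geometric mean of six values: 34, 27, 45, 55, 22, 34 is:
$(34 \times 27 \times 45 \times 55 \times 22 \times 34)^{1/6} = 1{,}699{,}493{,}400^{1/6} \approx 34.545$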
Harmonic mean
The harmonic mean is an average which is useful for sets of numbers which
are defined in relation to some unit, for example speed (distance per unit of
time).
In mathematics, the harmonic mean is one of several methods of calculating an
average.
The harmonic mean of the positive real numbers $a_1, \ldots, a_n$ is defined to be
$H = \frac{n}{\frac{1}{a_1} + \frac{1}{a_2} + \cdots + \frac{1}{a_n}}$
The harmonic mean is never larger than the geometric mean or the arithmetic
mean (see generalized mean). In certain situations, the harmonic mean
provides the correct notion of "average". For instance, if for half the distance of
a trip you travel at 40 miles per hour and for the other half of the distance you
travel at 60 miles per hour, then your average speed for the trip is given by the
harmonic mean of 40 and 60, which is 48; that is, the total amount of time for
the trip is the same as if you traveled the entire trip at 48 miles per hour.
Similarly, if in an electrical circuit you have two resistors connected in parallel,
one with 40 ohms and the other with 60 ohms, then the average resistance of
the two resistors is 48 ohms; that is, the total resistance of the circuit is the
same as it would be if each of the two resistors were replaced by a 48-ohm
resistor. (Note: this is not to be confused with their equivalent resistance, 24
ohm, which is the resistance needed for a single resistor to replace the two
resistors at once.)
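The trip and resistor examples can be checked directly. In this sketch the leg length of 120 miles and the function name average_speed are assumptions; any equal leg length gives the same average speed.

```python
from statistics import harmonic_mean

def average_speed(distance_per_leg, speeds):
    """Total distance divided by total time, for legs of equal length."""
    total_time = sum(distance_per_leg / s for s in speeds)
    return distance_per_leg * len(speeds) / total_time

trip_speed = average_speed(120, [40, 60])   # 240 miles in 3 + 2 = 5 hours
hm_speed = harmonic_mean([40, 60])          # harmonic mean of the two speeds

# Equivalent resistance of 40-ohm and 60-ohm resistors in parallel.
equivalent = 1 / (1 / 40 + 1 / 60)

print(trip_speed, hm_speed, equivalent)
```

The first two values agree at 48, while the equivalent resistance is half the harmonic mean, as the note in the text points out.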
Harmonic mean (HM)
The harmonic mean is an average which is useful for sets of numbers which
are defined in relation to some unit, for example speed (distance per unit of
time).

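For example, the harmonic mean of the six values: 34, 27, 45, 55, 22, and 34
is
$\frac{6}{\frac{1}{34} + \frac{1}{27} + \frac{1}{45} + \frac{1}{55} + \frac{1}{22} + \frac{1}{34}} \approx 33.018$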
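The three means of the six example values can be compared side by side using Python's standard statistics module (geometric_mean requires Python 3.8 or later). This is a sketch for checking the arithmetic, not part of the original answer.

```python
from statistics import mean, geometric_mean, harmonic_mean

values = [34, 27, 45, 55, 22, 34]

am = mean(values)             # arithmetic mean
gm = geometric_mean(values)   # geometric mean
hm = harmonic_mean(values)    # harmonic mean

print(f"AM = {am:.3f}, GM = {gm:.3f}, HM = {hm:.3f}")

# For positive data the means are always ordered HM <= GM <= AM.
print(hm <= gm <= am)
```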
Merits and limitations of geometric mean
Merits:
• It is based on each and every item of the series.
• It is rigidly defined.
• It is useful in averaging ratios and percentages and in determining rates of
increase or decrease.
• It gives less weight to large items and more to small items. Thus the
geometric mean of a set of positive values is never greater than their
arithmetic mean.
• It is capable of algebraic manipulation, like computing the grand
geometric mean from the geometric means of different sets of values.
Limitations:
• It is relatively difficult to comprehend, compute and interpret.
• It cannot be meaningfully computed when any value is zero (the G.M. then
becomes zero) or negative.
(b) What do you understand by Concept of Probability, Explain various theories of
probabilities.
Answer 3 [b]
Probability theory is the branch of mathematics concerned with the analysis of
random phenomena. The central objects of probability theory are random
variables, stochastic processes, and events: mathematical abstractions of non-
deterministic events or measured quantities that may either be single
occurrences or evolve over time in an apparently random fashion. Although an
individual coin toss or the roll of a die is a random event, if repeated many
times the sequence of random events will exhibit certain statistical patterns,
which can be studied and predicted.
Using Probability Theory to reason under uncertainty
• Probabilities quantify uncertainty regarding the occurrence of events.
• Are there alternatives? Yes, e.g., Dempster-Shafer Theory, disjunctive
uncertainty, etc. (Fuzzy Logic is about imprecision, not uncertainty.)
• Why is Probability Theory better? de Finetti: Because if you do not
reason according to Probability Theory, you can be made to act
irrationally.
• Probability Theory is key to the study of action and communication:
Decision Theory combines Probability Theory with Utility Theory, and
Information Theory is "the logarithm of Probability Theory".
• Probability Theory gives rise to many interesting and important
philosophical questions (which we will not cover).
Probability is a branch of mathematics that measures the likelihood that an
event will occur, giving a quantitative description of how likely a particular
event is. Probabilities are expressed as numbers between 0 and 1. The
probability of an impossible event is 0, while an event that is certain to occur
has a probability of 1. A rare event has a probability close to zero; a very
common event has a probability close to one.
Four theories of probability
1) Classical or a priori probability: this is the oldest concept, evolved in the
17th century, and is based on the assumption that the outcomes of random
experiments (like tossing a coin, drawing cards from a pack or throwing a die)
are equally likely. When we toss a coin, the probability of a head coming up is
½ because there are two equally likely events, namely the appearance of a head
or that of a tail. This is an approach of determining a probability from
deductive logic. Because it rests on equally likely outcomes, it is not valid
in the following cases:
(a) Where the outcomes of experiments are not equally likely, for example the
lives of different makes of bulbs.
(b) Quality of products from a mechanical plant operated under different
conditions.
However, despite these demerits, it is possible to work out mathematically the
probability of complex events. A priori probabilities are of considerable
importance in applied statistics.
2) Empirical concept: this was developed in the 19th century for insurance
business data and is based on the concept of relative frequency. Historical
data are used for future prediction: the probability of an event is estimated
as the proportion of times it occurred in a large number of past trials.
3) Subjective or personal approach: this approach was adopted by Frank
Ramsey in 1926 and developed by others. It is based on the personal beliefs of
the person making the probability statement, formed from past information,
noticeable trends and an appreciation of the future situation. Experienced
people use this approach for decision making in their own field.
4) Axiomatic approach: this approach was introduced by the Russian
mathematician A. N. Kolmogorov in 1933. Probability is treated as a set
function; no precise definition is given, but the following axioms or
postulates are adopted.
a) The probability of an event ranges from 0 to 1. That is, an event that is
sure not to happen has probability 0 and an event that is sure to happen has
probability 1.
b) The probability of the entire sample space (that is, the set of all
possible outcomes of an experiment) is 1. Mathematically, P(S) = 1.
c) If A and B are mutually exclusive events, the probability that either occurs
is the sum of their probabilities: P(A or B) = P(A) + P(B).
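The difference between the classical and the empirical concept can be sketched in Python with a fair die. Everything in this example is an assumption made for illustration: the die, the event "an even face", the number of trials and the random seed. The last lines check that the probabilities of the sample space add up to 1 and that probabilities of mutually exclusive events add, which is the standard additivity property of Kolmogorov's system.

```python
import random
from fractions import Fraction

# Classical (a priori) probability: favourable outcomes / equally likely outcomes.
die_faces = [1, 2, 3, 4, 5, 6]
classical_p_even = Fraction(sum(1 for face in die_faces if face % 2 == 0), len(die_faces))

# Empirical probability: relative frequency in repeated trials (simulated here).
rng = random.Random(42)          # fixed seed so the run is repeatable
trials = 100_000
even_count = sum(1 for _ in range(trials) if rng.choice(die_faces) % 2 == 0)
empirical_p_even = even_count / trials

# Axioms: each probability lies in [0, 1], P(S) = 1, and probabilities of
# mutually exclusive events add up.
p_face = {face: Fraction(1, 6) for face in die_faces}
p_sample_space = sum(p_face.values())
p_one_or_two = p_face[1] + p_face[2]

print(classical_p_even, empirical_p_even, p_sample_space, p_one_or_two)
```

With a large number of trials the relative frequency settles close to the a priori value, which is the idea behind the empirical concept.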

Q 5. (a) What is “Chi-Square” (χ²) test, narrate the steps for determining value of χ² with
suitable examples. Explain the conditions for applying χ² and uses of Chi-Square test.
Answer 5 [a]
Chi-square test
The chi-square is one of the most popular statistics because it is easy to
calculate and interpret. There are two kinds of chi-square tests. The first is
called a one-way analysis, and the second is called a two-way analysis. The
purpose of both is to determine whether the observed frequencies (counts)
markedly differ from the frequencies that we would expect by chance.
The observed cell frequencies are organized in rows and columns like a
spreadsheet. This table of observed cell frequencies is called a contingency
table, and the chi-square test is part of a contingency table analysis.
The chi-square statistic is the sum of the contributions from each of the
individual cells. Every cell in a table contributes something to the overall chi-
square statistic. If a given cell differs markedly from the expected frequency,
then the contribution of that cell to the overall chi-square is large. If a cell is
close to the expected frequency for that cell, then the contribution of that cell
to the overall chi-square is low. A large chi-square statistic indicates that
somewhere in the table, the observed frequencies differ markedly from the
expected frequencies. It does not tell which cell (or cells) are causing the high
chi-square...only that they are there. When a chi-square is high, you must
visually examine the table to determine which cell(s) are responsible.
When there are exactly two rows and two columns, the chi-square approximation
becomes less accurate, and Yates' correction for continuity is usually applied.
Statistics Calculator will automatically use Yates' correction for two-by-two
tables when the expected frequency of any cell is less than 5 or the total N is
less than 50.
If there is only one column or one row (a one-way chi-square test), the degrees
of freedom is the number of cells minus one. For a two-way chi-square, the
degrees of freedom is the number of rows minus one times the number of
columns minus one.
Using the chi-square statistic and its associated degrees of freedom, the
software reports the probability that the differences between the observed and
expected frequencies occurred by chance. Generally, a probability of .05 or less
is considered to be a significant difference.

A standard spreadsheet interface is used to enter the counts for each cell. After
you've finished entering the data, the program will print the chi-square,
degrees of freedom and probability of chance.
Use caution when interpreting the chi-square statistic if any of the expected
cell frequencies are less than five. Also, use caution when the total for all cells
is less than 50.
Example
A drug manufacturing company conducted a survey of customers. The
research question is: Is there a significant relationship between packaging
preference (size of the bottle purchased) and economic status? There were four
packaging sizes: small, medium, large, and jumbo. Economic status was:
lower, middle, and upper. The following data was collected.
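          Lower   Middle   Upper
Small       24      22       18
Medium      23      28       19
Large       18      27       29
Jumbo       16      21       33
------------------------------------------------
Chi-square statistic = 9.743
Degrees of freedom = 6
Probability of chance = .1359
Since the probability of chance (.1359) is greater than .05, the relationship
between packaging preference and economic status is not significant.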
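The reported figures can be recomputed from the table of counts. The sketch below is written in plain Python; the function name chi_square_tail_even_dof is hypothetical, and the closed-form tail probability it uses is a shortcut that holds only when the degrees of freedom are even (as they are here), not a general-purpose routine.

```python
import math

observed = {
    "Small":  [24, 22, 18],
    "Medium": [23, 28, 19],
    "Large":  [18, 27, 29],
    "Jumbo":  [16, 21, 33],
}

rows = list(observed.values())
row_totals = [sum(row) for row in rows]
col_totals = [sum(col) for col in zip(*rows)]
n = sum(row_totals)

# Expected frequency of a cell = (row total) * (column total) / N.
chi_square = 0.0
for row, row_total in zip(rows, row_totals):
    for obs, col_total in zip(row, col_totals):
        expected = row_total * col_total / n
        chi_square += (obs - expected) ** 2 / expected

dof = (len(rows) - 1) * (len(col_totals) - 1)

def chi_square_tail_even_dof(x, dof):
    """Upper-tail probability of chi-square; closed form valid only for even dof."""
    if dof % 2 != 0:
        raise ValueError("this closed form needs an even number of degrees of freedom")
    half = x / 2
    return math.exp(-half) * sum(half ** k / math.factorial(k) for k in range(dof // 2))

p_value = chi_square_tail_even_dof(chi_square, dof)
print(f"chi-square = {chi_square:.3f}, dof = {dof}, p = {p_value:.4f}")
```

The output should agree with the statistic, degrees of freedom and probability listed above, up to rounding.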
“Chi-Square” (χ²)
This test was developed by Karl Pearson (1857-1936), analytical statistician
and professor of applied mathematics, London, whose coefficient of correlation
is most widely used. The test considers the magnitude of the discrepancy
between theory and observation and is defined as
χ² = Σ (Oi - Ei)² / Ei
where Oi = observed frequencies
Ei = expected frequencies
Steps for determining the value of χ²
1) When data are given in tabulated form, calculate the expected frequency
for each cell using the following formula:
E = (row total) * (column total) / (total number of observations)
2) Take the difference between O and E for each cell and calculate its square,
(O - E)².
3) Divide each (O - E)² by the respective expected frequency and total up to
get χ².
4) Compare the calculated value with the table value at the given degrees of
freedom and specified level of significance. If, at the stated level, the
calculated value is more than the table value, the difference between the
theoretical and observed frequencies is considered significant; it could not
have arisen from fluctuations of simple sampling. However, if the calculated
value is less than the table value, the difference is not considered
significant; it is regarded as due to fluctuations of simple sampling and is
therefore ignored.
Conditions for applying χ²
1) N must be large, say more than 50, to ensure the similarity between the
theoretically correct distribution and our sampling distribution.
2) No theoretical cell frequency should be too small, say less than 5, because
that may lead to overestimation of the value of χ² and may result in wrongful
rejection of the hypothesis. If we get such frequencies, we should pool them
with the preceding or succeeding frequencies. (For two-by-two tables, the
adjustment usually applied is Yates' correction for continuity.)
USES OF CHI SQUARE TEST:
1) As a test of independence
The Chi Square Test of Independence tests the association between two
categorical variables.
Whether two or more attributes are associated or not can be tested by framing a
hypothesis and testing it against the table value. Examples are whether the use
of quinine is effective in the control of fever, or whether the complexions of
husbands and wives are related. Consider two variables at the nominal or
ordinal levels of measurement. A question of interest is: are the two variables
independent (not related) or are they related (dependent)?

When the variables are independent, we are saying that knowledge of one gives
us no information about the other variable. When they are dependent, we are
saying that knowledge of one variable is helpful in predicting the value of the
other variable. One popular method used to check for independence is the
chi-squared test of independence. This use of the chi-squared distribution is a
nonparametric procedure, whereas in the test of significance about a single
population variance it was a parametric procedure.
Assumptions:
1. We take a random sample of size n.
2. The variables of interest are nominal or ordinal in nature.
3. Observations are cross classified according to two criteria such that each
observation belongs to one and only one level of each criterion.
2) As a test of goodness of fit
A statistical test in which the validity of one hypothesis is tested without
specification of an alternative hypothesis is called a goodness-of-fit test. (By
contrast, the test for independence, one of the most frequent uses of Chi
Square, tests the null hypothesis that two criteria of classification, when
applied to a population of subjects, are independent; if they are not
independent, there is an association between them.) The general procedure
consists in defining a test statistic, which is some function of the data
measuring the distance between the hypothesis and the data (in fact, the
badness-of-fit), and then calculating the probability of obtaining data which
have a still larger value of this test statistic than the value observed,
assuming the hypothesis is true. This probability is the p-value (observed
significance level) of the test. Small probabilities (say, less than one
percent) indicate a poor fit. Especially high probabilities (close to one)
correspond to a fit which is too good to happen very often, and may indicate a
mistake in the way the test was applied, such as treating data as independent
when they are correlated.
An attractive feature of the chi-square goodness-of-fit test is that it can be
applied to any univariate distribution for which you can calculate the
cumulative distribution function. The chi-square goodness-of-fit test is
applied to binned data (i.e., data put into classes). This is not really a
restriction, since for non-binned data you can simply calculate a histogram or
frequency table before computing the chi-square statistic. However, the value
of the chi-square test statistic depends on how the data are binned. Another
disadvantage of the chi-square test is that it requires a sufficient sample
size for the chi-square approximation to be valid.
3) As a test of homogeneity: it is an extension of the test for independence
that asks whether two or more independent random samples are drawn from the
same population or from different populations. The Test for Homogeneity
examines the proposition that several populations are homogeneous with respect
to some characteristic.
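To make the goodness-of-fit use concrete, here is a minimal one-way chi-square sketch in Python. The observed counts are hypothetical (120 imagined rolls of a die), the hypothesis tested is that all faces are equally likely, and the table value 11.07 is the usual chi-square table entry for 5 degrees of freedom at the 5% level.

```python
# Hypothetical counts of each face in 120 rolls of a die (illustrative only).
observed = [18, 22, 16, 25, 19, 20]

# Hypothesis being tested: the die is fair, so every face is equally likely.
n = sum(observed)
expected = [n / len(observed)] * len(observed)

chi_square = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
dof = len(observed) - 1            # one-way test: number of cells minus one

# Table value of chi-square for 5 degrees of freedom at the 5% level.
TABLE_VALUE_5_DOF_5_PERCENT = 11.07
poor_fit = chi_square > TABLE_VALUE_5_DOF_5_PERCENT

print(f"chi-square = {chi_square:.2f}, dof = {dof}, poor fit: {poor_fit}")
```

Because the calculated value is below the table value, these made-up counts would be regarded as consistent with a fair die.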

(b) How do you define “Index Numbers” ? Narrate the nature and types of Index numbers
with adequate examples.
Answer 5 [b]
According to Croxton and Cowden, index numbers are devices for measuring
differences in the magnitude of a group of related variables.
According to Morris Hamburg, “in its simplest form an index number is nothing
more than a relative which expresses the relationship between two figures, where
one figure is used as a base.”
According to M. L. Berenson and D. M. Levine, “generally speaking, index
numbers measure the size or magnitude of some object at a particular point in
time as a percentage of some base or reference object in the past.”
According to Richard I. Levin and David S. Rubin, “an index number is a
measure of how much a variable changes over time.”
The concept of an index number is one of the most important in economics.
Anyone attempting any work in applied economics should have a good
understanding of the principle of index numbers and a practical knowledge of
how to work with them.
NATURE OF INDEX NUMBER
1) Index numbers are specialised averages used for comparison in situations
where two or more series are expressed in different units or represent
different items, e.g. the consumer price index representing prices of various
items or the index of industrial production representing various commodities
produced.
2) Index numbers measure the net change in a group of related variables over a
period of time.
3) Index numbers measure the effect of change over a period of time, across a
range of industries, geographical regions or countries.
4) The construction of index numbers is carefully planned according to the
purpose of their computation, the collection of data, the application of an
appropriate method, and the assigning of correct weightages and formula.
TYPES OF INDEX NUMBERS:
Price index numbers: A price index is any single number calculated from an
array of prices and quantities over a period. Since not all prices and quantities
of purchases can be recorded, a representative sample is used instead. Prices
are generally represented by p in formulae. These are also expressed as price
relatives, defined as follows:
Price relative = (current year's price / base year's price) * 100
= (p1/p0) * 100
Any increase in the price index amounts to a corresponding decrease in the
purchasing power of the rupee or other affected currency.
Quantity index number: a quantity index number measures how much the number or
quantity of a variable changes over time. Quantities are generally represented
as q in formulae.
Value index number: a value index number measures changes in total monetary
worth, that is, it measures changes in the rupee value of a variable. It
combines price and quantity changes to present a more informative index.
Composite index number: a single index number may reflect a composite, or
group, of changing variables. For instance, the consumer price index measures
the general price level for specific goods and services in the economy. These
are also known simply as index numbers. In such cases the price relatives with
respect to a selected base are determined separately for each item and their
statistical average is computed.
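The price relative and a simple composite index can be worked out as follows. The commodities and rupee prices are hypothetical, and the composite is taken here as the simple arithmetic mean of the price relatives, which is only one of the possible statistical averages.

```python
# Hypothetical base-year (p0) and current-year (p1) prices in rupees.
prices = {
    "rice":  (40.0, 50.0),
    "wheat": (25.0, 30.0),
    "sugar": (50.0, 45.0),
}

def price_relative(p0, p1):
    """Price relative = (current year's price / base year's price) * 100."""
    return p1 / p0 * 100

relatives = {item: price_relative(p0, p1) for item, (p0, p1) in prices.items()}

# Composite index as the simple arithmetic mean of the price relatives.
composite_index = sum(relatives.values()) / len(relatives)

for item, relative in relatives.items():
    print(f"{item}: {relative:.1f}")
print(f"composite index: {composite_index:.1f}")
```

A relative above 100 means the price has risen since the base year, and a relative below 100 means it has fallen.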

Q 6. (a) What are the important Index Numbers used in Indian Economy. Explain index
numbers of Industrial Production.
Answer 6. [a]
IMPORTANT INDEX NUMBERS USED IN INDIAN ECONOMY:
Cost of living index or consumer price index
The cost of living index number or consumer price index, expressed as a
percentage, measures the relative amount of money necessary to derive equal
satisfaction during two periods of time, after taking into consideration the
fluctuations of the retail prices of consumer goods during these periods. This
index is relevant because the real wages of workers are defined as (actual
wages / cost of living index) * 100. Generally the list of items consumed varies
for different classes of people (rich, middle class, or poor) at the same place
of residence. Also, people of the same class belonging to different geographical
regions have different consumption habits. Thus the cost of living index always
relates to a specific class of people and a specific geographical area, and it
helps in determining the effect of changes in prices on different classes of
consumers living in different areas. The process of construction of the cost of
living index number is as follows:
1) Decide on the class of people for whom the index number is to be
computed, for instance industrial personnel, officers or teachers. Also
decide on the geographical area to be covered.
2) Conduct a family budget enquiry covering the class of people for whom the
index number is to be computed. The enquiry should be conducted for the base
year by the process of random sampling. This gives information regarding
the nature, quality and quantities of commodities consumed by an average
family of the class and also the amount spent on different items of
consumption.
3) The items on which information regarding money spent is to be collected
are food (rice, wheat, sugar, milk, tea etc.), clothing, fuel and lighting,
housing and miscellaneous items.
4) Collect retail prices in respect of these items from the localities in which
the class of people concerned reside, or from the markets where they usually
make their purchases.
5) As the relative importance of various items for different classes of people
is not the same, the prices or price relatives are always weighted; therefore,
the cost of living index is always a weighted index.
6) The percentage expenditure on an item constitutes the weight of the item,
and the percentage expenditure on each of the five groups constitutes the group
weight.
7) Separate index numbers are first determined for each of the five major
groups, by calculating the weighted average of the price relatives of the
selected items in the group.
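The weighting described in the steps above can be sketched in Python. All figures are hypothetical. The sketch also assumes, as a further step not spelled out above, that the five group indices are combined into the overall index using the group weights, and it then applies the real-wage formula given earlier.

```python
# Hypothetical group indices (weighted averages of price relatives within each
# group) and group weights (percentage of family expenditure on the group).
groups = {
    "food":              (130.0, 50),
    "clothing":          (120.0, 10),
    "fuel and lighting": (140.0, 10),
    "housing":           (110.0, 20),
    "miscellaneous":     (125.0, 10),
}

def weighted_index(index_weight_pairs):
    """Weighted arithmetic mean: sum(I * w) / sum(w)."""
    pairs = list(index_weight_pairs)
    total_weight = sum(w for _, w in pairs)
    return sum(i * w for i, w in pairs) / total_weight

cost_of_living_index = weighted_index(groups.values())

# Real wages = (actual wages / cost of living index) * 100.
actual_wage = 25_000.0             # hypothetical money wage in rupees
real_wage = actual_wage / cost_of_living_index * 100

print(f"cost of living index = {cost_of_living_index:.1f}")
print(f"real wage = {real_wage:.2f}")
```

The same weighted_index helper could be applied one level down, to the price relatives and item weights inside a single group.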
INDEX NUMBER OF INDUSTRIAL PRODUCTION
The index number of industrial production is designed to measure the increase
or decrease in the level of industrial production in a given period of time
compared to some base period. Such an index measures changes in the
quantities of production and not their values. Data about the level of
industrial output in the base period and in the given period are collected
first under the following heads:
• Textile industries, to include cotton, woollen, silk etc.
• Mining industries like iron ore, coal, copper, petroleum etc.
• Mechanical industries like automobiles, locomotives, aeroplanes etc.
• Industries subject to excise duties like sugar, tobacco, matches etc.
• Miscellaneous like glass, detergents, chemicals, cement etc.
The figures of output for the various industries classified above are obtained
on a monthly, quarterly or yearly basis. Weights are assigned to the various
industries on the basis of some criterion such as capital invested, turnover,
net output, production etc. Usually the weights in the index are based on the
values of net output of the different industries. The index of industrial
production is obtained by taking the weighted arithmetic mean or geometric
mean of the relatives. When the weighted arithmetic mean is used, the formula
for constructing the index is as follows.
Index of industrial production = (100 / Σw) * Σ{(q1/q0) * w} = (100 / Σw) * Σ(I * w)
where q1 = quantity produced in the given period
q0 = quantity produced in the base period
w = relative importance (weight) of the different outputs
I = (q1/q0) = index for the respective commodity
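A small numerical sketch of this formula in Python follows. The industries, quantities and weights are hypothetical, and the function name industrial_production_index is introduced only for this example.

```python
# Hypothetical output quantities (base period q0, given period q1) and weights w.
industries = {
    "textiles":      {"q0": 200.0, "q1": 220.0, "w": 30},
    "mining":        {"q0": 500.0, "q1": 450.0, "w": 20},
    "miscellaneous": {"q0": 100.0, "q1": 150.0, "w": 50},
}

def industrial_production_index(data):
    """Index = (100 / sum(w)) * sum((q1 / q0) * w)."""
    total_weight = sum(row["w"] for row in data.values())
    weighted_relatives = sum(row["q1"] / row["q0"] * row["w"] for row in data.values())
    return 100 / total_weight * weighted_relatives

index = industrial_production_index(industries)
print(f"index of industrial production = {index:.1f}")
```

An index above 100 indicates that weighted output in the given period is higher than in the base period.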

(b) Enumerate Probability Distributions, explain the Histogram and Probability
Distribution curve.
Answer 6 [b]
PROBABILITY DISTRIBUTION CURVE:
Probability distributions are a fundamental concept in statistics. They are used
both on a theoretical level and a practical level.
Some practical uses of probability distributions are:
• To calculate confidence intervals for parameters and to calculate critical
regions for hypothesis tests.
• For univariate data, it is often useful to determine a reasonable
distributional model for the data.
• Statistical intervals and hypothesis tests are often based on specific
distributional assumptions. Before computing an interval or test based on a
distributional assumption, we need to verify that the assumption is justified for
the given data set. In this case, the distribution does not need to be the
best-fitting distribution for the data, but an adequate enough model so that the
statistical technique yields valid conclusions.
• Simulation studies with random numbers generated from a specific
probability distribution are often needed.
The probability distribution of the variable X can be uniquely described by its
cumulative distribution function F(x), which is defined by
F(x) = Pr[ X ≤ x ]
for any x in R.
A distribution is called discrete if its cumulative distribution function consists
of a sequence of finite jumps, which means that it belongs to a discrete random
variable X: a variable which can only attain values from a certain finite or
countable set. A distribution is called continuous if its cumulative distribution
function is continuous, which means that it belongs to a random variable X for
which Pr[ X = x ] = 0 for all x in R.
Several probability distributions are so important in theory or applications that
they have been given specific names:

The Bernoulli distribution, which takes value 1 with probability p and value 0
with probability q = 1 - p.
THE POISSON DISTRIBUTION
In probability theory and statistics, the Poisson distribution is a discrete
probability distribution (discovered by Siméon-Denis Poisson (1781–1840) and
published, together with his probability theory, in 1838). It applies to random
variables N that count, among other things, a number of discrete occurrences
(sometimes called "arrivals") that take place during a time-interval of given
length. The probability that there are exactly k occurrences (k being a
non-negative integer, k = 0, 1, 2, ...) is
P(N = k) = (λ^k * e^(-λ)) / k!
where e is the base of the natural logarithm (e = 2.71828...),
k! is the factorial of k, and λ is a positive real number, equal to the expected
number of occurrences during the given interval. For instance, if the
events occur on average every 2 minutes, and you are interested in the number
of events occurring in a 10 minute interval, you would use as model a Poisson
distribution with λ = 5.
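The 10-minute example can be tabulated with a short Python sketch. The function name poisson_pmf is hypothetical; it uses the standard Poisson probability, the expected number of occurrences raised to the power k, times e to the minus that expected number, divided by k factorial.

```python
import math

def poisson_pmf(k, lam):
    """P(N = k) for a Poisson distribution with mean lam."""
    return lam ** k * math.exp(-lam) / math.factorial(k)

# Events every 2 minutes on average, observed over 10 minutes: lam = 10 / 2 = 5.
lam = 10 / 2
probabilities = [poisson_pmf(k, lam) for k in range(11)]

for k, p in enumerate(probabilities):
    print(f"P(N = {k:2d}) = {p:.4f}")

# The probabilities over all k sum to 1; the first 50 terms are already very close.
total = sum(poisson_pmf(k, lam) for k in range(50))
```

The most likely counts are those around the expected value of 5, and the probabilities fall away on either side.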
THE NORMAL DISTRIBUTION
The normal or Gaussian distribution is one of the most important probability
density functions, not least because many measurement variables have
distributions that at least approximate to a normal distribution. It is usually
described as bell shaped, although its exact characteristics are determined by
the mean and standard deviation. It arises when the value of a variable is
determined by a large number of independent processes. For example, weight
is a function of many processes both genetic and environmental. Many
statistical tests make the assumption that the data come from a normal
distribution.
THE PROBABILITY DENSITY FUNCTION IS GIVEN BY THE FOLLOWING FORMULA
f(x) = (1 / (σ * √(2π))) * e^(-(x - μ)² / (2σ²))
where x = value of the continuous random variable
μ = mean of the normal random variable (Greek letter ‘mu’)
e = exponential constant = 2.7183
σ = standard deviation of the distribution (Greek letter ‘sigma’)
π = mathematical constant = 3.1416
HISTOGRAM AND PROBABILITY DISTRIBUTION CURVE
A histogram is a bar graph in which the area over each class interval is
proportional to the relative frequency of data within that interval. In plotting
a histogram, one starts by dividing the range of all values into non-overlapping
intervals, called class intervals, in such a way that every piece of data is
contained in some class interval.
A histogram displays continuous data in ordered columns. Categories are of
continuous measure such as time, inches, temperature, etc.
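The construction just described, dividing the range into class intervals and counting the data in each, can be sketched in Python. The scores are hypothetical, the class width of 10 is an assumed choice, and the function name frequency_table is introduced only for illustration.

```python
# Hypothetical test scores (illustrative data only).
scores = [23, 45, 67, 34, 56, 78, 12, 49, 51, 38, 62, 47, 55, 29, 41]

def frequency_table(values, start, width, n_classes):
    """Count how many values fall in each class interval [lower, lower + width)."""
    counts = [0] * n_classes
    for value in values:
        index = int((value - start) // width)
        if not 0 <= index < n_classes:
            raise ValueError(f"{value} lies outside the class intervals")
        counts[index] += 1
    return counts

start, width, n_classes = 10, 10, 7        # classes 10-19, 20-29, ..., 70-79
counts = frequency_table(scores, start, width, n_classes)

# Text histogram: one row per class interval, bar length = frequency.
for i, count in enumerate(counts):
    lower = start + i * width
    print(f"{lower:2d}-{lower + width - 1:2d} | {'#' * count} ({count})")
```

Since all class intervals here have equal width, bar length (frequency) is proportional to bar area, as the definition requires.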
Advantages
• Visually strong
• Can be compared to the normal curve
• The vertical axis is usually a frequency count of items falling into each
category
Disadvantages
• Exact values cannot be read because data are grouped into categories
• More difficult to compare two data sets
• Used only with continuous data

The height of each rectangular bar of the histogram is determined by the
frequency of its interval, and the horizontal axis shows the midpoints of the
intervals. Bars are centered above the midpoints of the intervals. The vertical
axis of the histogram shows the frequency of the scores in each of the intervals
on the horizontal axis. In the histogram discussed here, the interval 20 to 29
shows an extremely high or low score. The histogram has no title. The scales
used are suitable because the data set is small. However, the raw data are not
shown.
The Normal Probability Curve
The graph of the normal probability curve has the shape of the famous bell
curve. It is a density curve: its height at any point shows the relative
likelihood of values near that point, and the area under the curve over an
interval gives the probability of that interval. The curve illustrates the
strong central tendency of the data: values near the mean are the most likely.
The Cumulative Curve

The shape of the cumulative curve is probably not as familiar, but it is more
useful. Starting from the left and moving to the right, the height of the
cumulative curve at any point is the total area under the bell (density) curve
to the left of that point. This is the curve used to calculate useful
probabilities.
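The relation between the bell curve and the cumulative curve can be illustrated with Python's math module. The mean of 100 and standard deviation of 15 are assumed values for the example, and the function names normal_pdf and normal_cdf are hypothetical; the cumulative curve is evaluated through the error function math.erf.

```python
import math

def normal_pdf(x, mu, sigma):
    """Height of the bell (density) curve at x."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2 * math.pi))

def normal_cdf(x, mu, sigma):
    """Height of the cumulative curve at x: area under the density to the left of x."""
    return 0.5 * (1 + math.erf((x - mu) / (sigma * math.sqrt(2))))

# Assumed distribution: mean 100, standard deviation 15.
mu, sigma = 100.0, 15.0

density_at_mean = normal_pdf(mu, mu, sigma)
p_below_mean = normal_cdf(mu, mu, sigma)
p_within_one_sd = normal_cdf(mu + sigma, mu, sigma) - normal_cdf(mu - sigma, mu, sigma)

print(f"density at the mean      = {density_at_mean:.5f}")
print(f"P(X <= mean)             = {p_below_mean:.3f}")
print(f"P(within 1 std. dev.)    = {p_within_one_sd:.4f}")
```

Probabilities of intervals are obtained as differences of the cumulative curve, which is why it is the more useful of the two for calculation.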
