
MMW Lesson 4 Reviewer


Lesson 4: Statistics and Data Management

Statistics is the study of the collection, organization, analysis, interpretation, and presentation of data. It deals with all
aspects of data, including the planning of its collection in terms of the design of surveys and experiments. Some consider
statistics a mathematical body of science that pertains to the collection, analysis, interpretation or explanation, and
presentation of data, while others consider it a branch of mathematics concerned with collecting and interpreting data.
Because of its empirical roots and its focus on applications, statistics is usually considered a distinct mathematical science
rather than a branch of mathematics.
Statistics is defined as a branch of mathematics that is concerned with facilitating wise decision-making in the face of
uncertainty and that, therefore, develops and utilizes techniques for the collection, effective presentation, and proper
analysis of data.
Branches of Statistics
1. Descriptive Statistics is concerned with the description and summarization of data. It deals with the techniques used in
the collection, presentation, organization, and analysis of the data on hand.
2. Inferential Statistics is concerned with the drawing of conclusions from data. It deals with the techniques used in
generalizing from samples to populations, performing estimations and hypothesis tests, determining relationships among
variables, and making predictions.
Functions of Statistics
1. Condensation. Generally speaking, by the verb ‘to condense’, we mean to reduce or to lessen. Condensation aims at
making a huge mass of data understandable by summarizing it in only a few observations.
2. Comparison. Classification and tabulation are the two methods that are used to condense the data. They help us to
compare data collected from different sources. Grand totals, measures of central tendency, measures of dispersion, graphs
and diagrams, coefficients of correlation, etc. provide ample scope for comparison. As statistics is an aggregate of facts
and figures, comparison is always possible, and in fact comparison helps us to understand the data in a better way.
3. Forecasting. By the word forecasting, we mean to predict or to estimate beforehand. Given the data of the last ten years
on the number of students enrolled in PUP, it is possible to predict or forecast the number of students that will
enroll in the near future. In business, forecasting also plays a dominant role in connection with production, sales, profits,
etc. The analysis of time series and regression analysis play an important role in forecasting.
4. Estimation. One of the main objectives of statistics is to draw inferences about a population from the analysis of a
sample drawn from that population.
5. Tests of Hypothesis. A statistical hypothesis is some statement about the probability distribution characterizing a
population, made on the basis of the information available from the sample observations. In the formulation and testing of
hypotheses, statistical methods are extremely useful. Whether the grades of students increased because they are motivated
or whether the new teaching method is effective in discussing a particular topic are some examples of statements of
hypothesis, and these are tested by proper statistical tools.
Scope of Statistics
1. Statistics and Industry. Statistics is widely used in many industries. In industries, control charts are widely used to
maintain a certain quality level. In production engineering, to find whether the product is conforming to specifications or
not, statistical tools, namely inspection plans, control charts, etc., are of extreme importance. In inspection plans we have
to resort to some kind of sampling - a very important aspect of Statistics.
2. Statistics and Commerce. Statistics is the lifeblood of successful commerce. No businessman can afford either to
understock or to overstock his goods. In the beginning he estimates the demand for his goods and then takes
steps to adjust his output or purchases. Thus statistics is indispensable in business and commerce.
3. Statistics and Economics. Statistical methods are useful in measuring numerical changes in complex groups and
interpreting collective phenomena. Nowadays statistics is used abundantly in any economic study. Both in
economic theory and practice, statistical methods play an important role.
4. Statistics and Education. Statistics is widely used in education. Research has become a common feature in all branches
of activities. Statistics is necessary for the formulation of policies to start new courses, consideration of facilities
available for new courses, etc. There are many people engaged in research work to test past knowledge and evolve new
knowledge. These are possible only through statistics.
5. Statistics and Planning. Statistics is indispensable in planning. In the modern world, which can be termed the “world
of planning”, almost all the organizations in the government are seeking the help of planning for efficient working, for the
formulation of policy decisions and execution of the same. In order to achieve the above goals, statistical data relating
to production, consumption, demand, supply, prices, investments, income, expenditure, etc. and various advanced statistical
techniques for processing, analyzing and interpreting such complex data are of importance. In India, statistics plays an
important role in planning and commissioning at both the central and state government levels.
6. Statistics and Medicine. In the medical sciences, statistical tools are widely used. To test the efficiency of a new
drug or medicine, a t-test is used; to compare the efficiency of two drugs or two medicines, a t-test for two samples is
used. More and more applications of statistics are at present used in clinical investigation.
7. Statistics and Modern Applications. Recent developments in the fields of computer technology and information
technology have enabled statistics to integrate their models and thus make statistics a part of the decision-making
procedures of many organizations. There are many software packages available for solving design of experiments,
forecasting, simulation problems, etc.
Limitations of Statistics
1. Statistics is not suitable for the study of qualitative phenomena. Since statistics is basically a science and deals with a
set of numerical data, it is applicable only to the study of those subjects of enquiry which can be expressed in terms of
quantitative measurements. As a matter of fact, qualitative phenomena like honesty, poverty, beauty, intelligence, etc.
cannot be expressed numerically, and statistical analysis cannot be directly applied to these qualitative phenomena.
2. Statistics does not study individuals. Statistics does not give any specific importance to the individual items; in fact, it
deals with an aggregate of objects. Individual items, when they are taken individually, do not constitute any statistical
data and do not serve any purpose for any statistical enquiry.
3. Statistical laws are not exact. It is well known that mathematical and physical sciences are exact. But statistical laws are
not exact and statistical laws are only approximations. Statistical conclusions are not universally true. They are true only
on an average.
4. Statistics may be misused. Statistics must be used only by experts; otherwise, statistical methods are the most
dangerous tools in the hands of the inexpert. The use of statistical tools by inexperienced and untrained persons might
lead to wrong conclusions.
5. Statistics is only one of the methods of studying a problem. Statistical methods do not provide a complete solution to
problems, because problems are to be studied taking the background of the country's culture, philosophy or religion into
consideration. Thus the statistical study should be supplemented by other evidence.
Population and Sample
In statistics, we are often interested in gathering information from a group of objects (the population). If the group under
consideration consists of a large number of objects, we try to obtain information about the group by examining a subgroup of
it (a sample).
In order for the data from the sample to be informative about the population, the sample must be representative of the
population. Being representative of the population does not mean that the characteristics of the sample are exactly those of
the total population, but instead that the sample was obtained in such a way that every member of the population had an equal
chance of being included in the sample.

4.2 Steps in Statistical Investigation


1. Defining the problem
(a) Identify a specific problem.
(b) Define the scope and limitations, assumptions to be made, and expected outcomes.
2. Collection of data
(a) Make sure to collect the data properly.
(b) Incomplete, fabricated, outdated, and inaccurate data are useless.
3. Summarization and tabulation of data
(a) This refers to the organization of data in text, tables, graphs and charts, so that logical conclusions can be derived
from them.
(b) Explore the data to obtain additional insight that could contribute to the study.
4. Analysis of data
(a) This pertains to the process of deriving from the given data relevant information from which numerical descriptions
can be formulated.
(b) Summarized data must be examined so that insights and meaningful information can be produced to support
decision-making or solutions to the question or problem at hand.
5. Interpretation of data and results
(a) Refers to the task of drawing conclusions from the analyzed data.
(b) Results must be able to answer the research problem and give recommendations.
6. Presentation of the result
(a) Present all pertinent results in a clear and concise manner.
(b) Use appropriate form of media to present results.
4.3 Sampling and Sampling Techniques
Sampling refers to the process of obtaining samples from the population. Sampling may be categorized as either
probability sampling or non-probability sampling. Probability sampling, also referred to as random sampling, is the
method of sampling in which every member of the population has an equal chance of being selected for the sample; otherwise,
it is considered non-probability sampling. We should note that, in order to properly use the techniques of statistical
inference, probability sampling must be used to obtain samples.
Probability Sampling Techniques
1. Simple Random Sampling. A probability sampling technique wherein all possible subsets consisting of n elements
selected from the N elements of the population have the same chances of selection.
2. Systematic Sampling. This is a probability sampling technique wherein the selection of the first element is at random
and the selection of other elements in the sample is systematic by subsequently taking every kth element from the random
start where k is the sampling interval.
3. Stratified Random Sampling. A probability sampling method where we partition the population into non-overlapping
strata or groups and then choose a proportional sample from each stratum. The actual sample is the union of the samples
drawn from each stratum.
4. Cluster Sampling. A probability sampling technique wherein we partition the population into non-overlapping groups or
clusters consisting of one or more elements, and then select a sample of clusters. Every member of each selected cluster
is included in the sample.
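The four probability sampling techniques above can be sketched in Python as follows. This is a minimal illustration using only the standard library; the function names, the `seed` parameter, and the example population are assumptions made for this sketch and do not come from the lesson.

```python
import random

def simple_random_sample(population, n, seed=None):
    """Draw n elements so that every subset of size n is equally likely."""
    rng = random.Random(seed)
    return rng.sample(population, n)

def systematic_sample(population, n, seed=None):
    """Pick a random start among the first k elements, then take every kth element."""
    rng = random.Random(seed)
    k = len(population) // n      # sampling interval (requires n <= len(population))
    start = rng.randrange(k)      # random start, 0 <= start < k
    return [population[start + i * k] for i in range(n)]

def stratified_sample(strata, n, seed=None):
    """Draw from each stratum in proportion to its share of the population.

    strata maps a stratum label to the list of its members. Because of
    rounding, the total drawn may differ slightly from n.
    """
    rng = random.Random(seed)
    total = sum(len(members) for members in strata.values())
    sample = []
    for members in strata.values():
        size = round(n * len(members) / total)
        sample.extend(rng.sample(members, size))
    return sample

def cluster_sample(clusters, n_clusters, seed=None):
    """Select whole clusters at random and keep every member of each one."""
    rng = random.Random(seed)
    chosen = rng.sample(clusters, n_clusters)
    return [member for cluster in chosen for member in cluster]

# Hypothetical population of 100 student ID numbers.
students = list(range(1, 101))
print(simple_random_sample(students, 10, seed=1))
print(systematic_sample(students, 10, seed=1))
```

Note that the stratified version allocates proportionally, as described in item 3; other allocation rules exist but are outside the scope of this lesson.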
Non-Probability Sampling Techniques
1. Accidental Sampling. The sample is chosen by the researcher by obtaining members of the population in a convenient,
often haphazard way.
2. Quota Sampling. A specified number of persons of certain types is included in the sample. The researcher is
aware of categories within the population and draws samples from each category. The size of each categorical sample is
proportional to the proportion of the population that belongs to that category.
3. Purposive Sampling. The researcher employs his or her judgment in choosing the members he or she believes are
representative of the population.
4. Snowball Sampling. This technique is also called referral sampling. A primary set of samples is chosen based on the
criteria set by the researcher. Information on where to find a succeeding set of samples meeting the same criteria is
gathered from this primary set in order to expand the number of samples.
4.4 Sample Size Considerations
The sample size is typically denoted by n and it is always a positive integer. No exact sample size can be prescribed here,
since it varies across research settings. However, all else being equal, a larger sample leads to increased precision
in estimates of various properties of the population. To determine the sample size we can apply one of the following
methods:
1. Slovin’s Formula. Slovin’s formula is used to calculate the sample size n given the population size N and a margin
of error E. It is a formula used to estimate the size of a random sample from a given population. We can
compute n = N / (1 + N·E²), rounding up to the next whole number.

2. Minimum Sample Size for Estimating a Population Mean. The estimated minimum sample size n needed to

3. Minimum Sample Size for Estimating a Population Proportion. The estimated minimum
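The three sample size methods can be sketched in Python as follows. Since the formulas for items 2 and 3 are cut off in this copy, the sketch assumes the common textbook forms: n = (z·σ / E)² for a population mean and n = z²·p·(1 − p) / E² for a population proportion, where z is the critical value for the chosen confidence level, σ is the population standard deviation, and p is a prior guess of the proportion (0.5 when nothing is known). The function names are hypothetical, and results are rounded up because n must be a whole number.

```python
import math

def slovin(N, E):
    """Slovin's formula: n = N / (1 + N * E^2), rounded up."""
    return math.ceil(N / (1 + N * E**2))

def min_n_mean(z, sigma, E):
    """Assumed textbook form for estimating a mean: n = (z * sigma / E)^2."""
    return math.ceil((z * sigma / E) ** 2)

def min_n_proportion(z, E, p=0.5):
    """Assumed textbook form for estimating a proportion: n = z^2 * p * (1 - p) / E^2."""
    return math.ceil(z**2 * p * (1 - p) / E**2)

# Illustrative inputs (not from the lesson):
print(slovin(1000, 0.05))            # population of 1000, 5% margin of error
print(min_n_mean(1.96, 15, 3))       # 95% confidence, sigma = 15, E = 3
print(min_n_proportion(1.96, 0.05))  # 95% confidence, E = 0.05, p unknown
```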
4.5 Methods of Data Collection
1. Survey Method. The survey is a method of collecting data on the variable of interest by asking people questions. This
may be done by interview or by using questionnaires.
2. Observation. Observation is a method of obtaining data or information by using our primary senses.
3. Experiment. Experiment is a method of collecting data where there is direct human intervention on the conditions that
may affect the values of the variable of interest.
4.6 Levels of Measurement
1. The nominal level of measurement classifies data into mutually exclusive (non-overlapping) categories in which no
order or ranking can be imposed on the data. Example: Gender (male, female), Zip Code, Color, Nationality, Political
affiliation, Religious affiliation.
2. The ordinal level of measurement classifies data into categories that can be ranked; however, precise differences
between the ranks do not exist. Example: Grade (A, B, C, D, F), Rating Scale/Likert scale, Ranking of tennis players,
Judging (first place, second place, etc.).
3. The interval level of measurement ranks data, and precise differences between units of measure do exist; however, there
is no meaningful zero. Example: Temperature (in degrees Celsius or Fahrenheit), IQ, SAT score.
4. The ratio level of measurement possesses all the characteristics of interval measurement, and there exists a true zero. In
addition, true ratios exist when the same variable is measured on two different members of the population. Example:
Height, Weight, Volume, Time, Salary, Age.
4.7 Presentation of Data
After data have been collected, the researcher can present them using the following methods.
1. Textual Form. Data are presented in paragraphs of text. The text highlights the important figures or results that the
researcher wishes to focus on.
2. Tabular Form. Data appear in a systematic manner in rows and columns. The following is an example of a Simple or
One-way Table
4. Graphical Form. Data or relationships among variables can be presented in visual form, through graphs or diagrams.
In that manner, the reader can easily perceive what is meant by the figure or any trend portrayed by
the data.
4.8 Measures of Central Tendency
A measure of central tendency or average is a location measure that pinpoints the center or typical middle value of
a data set. It is a convenient way of describing a set of data with a single value that represents the average
characteristic of the data set. The three common measures of central tendency are the mean, median and mode.

There are cases when the observations in a data set carry respective weights. When the weights are positive
integers, we can call these weights frequencies. The weighted mean of a weighted data set with observations x₁, ..., xₙ
and weights w₁, ..., wₙ is the sum of the products wᵢxᵢ divided by the sum of the weights wᵢ.
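As a small illustration of the measures named in this section, the sketch below computes the mean, median, and mode with Python's standard `statistics` module and adds a weighted mean. The `weighted_mean` helper and the sample values are assumptions made for this example; the weighted mean is taken in its usual form, the sum of each weight times its observation divided by the sum of the weights.

```python
from statistics import mean, median, multimode

def weighted_mean(values, weights):
    """Sum of weight * value divided by the sum of the weights."""
    return sum(w * x for x, w in zip(values, weights)) / sum(weights)

# Hypothetical quiz scores.
scores = [70, 85, 85, 90, 100]
print(mean(scores))       # arithmetic mean
print(median(scores))     # middle value of the sorted data
print(multimode(scores))  # list of the most frequent value(s)

# Two hypothetical grades, the second counting three times as much as the first.
print(weighted_mean([80, 90], [1, 3]))
```

`multimode` (available from Python 3.8) returns a list, which handles data sets with more than one mode.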

4.9 Measures of Dispersion or Variability


Measures of dispersion are descriptive summary measures that help us characterize the data set in terms of how
varied the observations are from the center. If the value of a measure of dispersion is small, this indicates that the
observations are not too different from the center. On the other hand, if its value is large, this indicates that the
observations are very different from the center or that they are widely spread out from the center.

The sample variance s² differs from the population variance σ² mainly because of the divisor n − 1. The reason behind this
is rather technical and mathematical in nature. Simply put, the divisor n − 1 removes the “bias” in s² when we want it to
estimate σ² for the purposes of making inferences.
Notice that the variance is a nonnegative quantity because it comes from averaging squared quantities. There is, however,
one major drawback to using the variance. If we follow the steps in calculating the variance, we find that the
variance is measured in squared units because we took the squares of the deviations. For example, if our sample
data is measured in meters, then the variance is given in square meters.
To eliminate the problem of squared units, we can take the square root of the variance. This gives the standard deviation,
a measure of spread that has the same units as our original sample or population data.
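The difference between the two divisors, and the effect of taking the square root, can be seen in a short Python sketch. The variance formulas themselves are missing from this copy, so the sketch assumes the standard definitions: the population variance divides the sum of squared deviations by n, and the sample variance divides it by n − 1. The function names and the data are hypothetical.

```python
import math

def population_variance(data):
    """Average squared deviation from the mean, with divisor n."""
    mu = sum(data) / len(data)
    return sum((x - mu) ** 2 for x in data) / len(data)

def sample_variance(data):
    """Sum of squared deviations from the mean, with divisor n - 1."""
    xbar = sum(data) / len(data)
    return sum((x - xbar) ** 2 for x in data) / (len(data) - 1)

# Hypothetical lengths in meters.
lengths = [2, 4, 4, 4, 5, 5, 7, 9]
print(population_variance(lengths))             # in square meters
print(sample_variance(lengths))                 # slightly larger, divisor n - 1
print(math.sqrt(population_variance(lengths)))  # standard deviation, back in meters
```

The standard library functions `statistics.pvariance`, `statistics.variance`, `statistics.pstdev`, and `statistics.stdev` compute the same quantities.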
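4.10 Measures of Central Tendency (Grouped Data)

2. The Median. The median for grouped data is equal to the lower boundary of the
median class plus the difference between half the total number of
observations and the cumulative frequency before the median class,
divided by the frequency of the median class, all multiplied by the class
width. In symbols, Median = LB + ((n/2 − cf) / f) × w, where LB is the lower boundary of the median class, n is the
total number of observations, cf is the cumulative frequency before the median class, f is the frequency of the median
class, and w is the class width.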
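The verbal rule for the grouped median can be sketched in Python as follows. The function name, its parameters, and the frequency table are assumptions made for this illustration; the median class is taken to be the first class whose cumulative frequency reaches half the total number of observations.

```python
def grouped_median(lower_boundaries, frequencies, width):
    """Median of grouped data: LB + ((n/2 - cf) / f) * width.

    lower_boundaries and frequencies list the classes in increasing order;
    width is the common class width.
    """
    n = sum(frequencies)
    if n <= 0:
        raise ValueError("frequencies must sum to a positive number")
    half = n / 2
    cumulative = 0  # cumulative frequency before the current class
    for lb, f in zip(lower_boundaries, frequencies):
        if f > 0 and cumulative + f >= half:
            return lb + (half - cumulative) / f * width
        cumulative += f
    raise ValueError("lower_boundaries and frequencies must have the same length")

# Hypothetical frequency table with classes 10-19, 20-29, 30-39, 40-49.
boundaries = [9.5, 19.5, 29.5, 39.5]
counts = [4, 6, 10, 5]
print(grouped_median(boundaries, counts, 10))
```

Here n = 25, so half the observations is 12.5; the cumulative frequency before the third class is 10, which makes 29.5 to 39.5 the median class.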
