Collection of Data and Sampling Methods

The document discusses the stages of decision making including identifying problems, analyzing problems, making decisions, and implementing decisions. It also covers topics like collecting appropriate and unbiased data, defining the relevant population, and sources of primary and secondary data.

Uploaded by

Mila Zibak
Copyright
© All Rights Reserved

Collection of data and sampling methods
Topic 3
Stages in decision-making
Stage 1: Identify a problem
• At the end of this stage, managers should have a clear
understanding of the problem they are facing, its context
and the requirements of their solution. For this stage they
might:
• (a) Do an initial investigation – looking at operations,
identifying difficulties and recognizing that there really is a
problem.
• (b) Define the problem – adding details to the initial
investigation, saying exactly what the problem is (and not
just its symptoms), its context, scope, boundaries and any
other relevant details.
• (c) Set objectives – identifying the decision-makers, their
aims, improvements they want, effects on the organization
and measures of success.
• (d) Identify variables, possible alternatives and courses of
action.
• (e) Plan the work – showing how to solve the problem,
schedule activities, design timetables and check resources.
Stage 2: Analyze the problem
• At the end of this stage, managers should have a clear
understanding of their options and the consequences. For this
they might:
• (a) Consider different approaches to solving the problem.
• (b) Check work done on similar problems and see if they can use
the same approach.
• (c) Study the problem more closely and refine the details.
• (d) Identify the key variables and relationships between them.
• (e) Build a model of the problem and test its accuracy.
• (f) Collect data needed by the model and analyze it.
• (g) Run more tests on the model and data to make sure that
they are working properly, are accurate and describe the real
conditions.
• (h) Experiment with the model to find results in different
circumstances and under different conditions.
• (i) Analyze the results, making sure that they are accurate and
consistent.
Stage 3: Make decisions
• This is where managers consider the results from analyses,
review all the circumstances and make their decisions. There
are three steps:
• (a) Compare solutions, looking at all aspects of their
performance.
• (b) Find solutions that best meet the decision-makers’
objectives.
• (c) Identify and agree the best overall solution.
Stage 4: Implement the decisions
• At this point managers turn ideas into practice, moving from
‘we should do this’ to actually doing it. For this they:
• (a) Check that the proposed solution really works and is an
improvement on current performance.
• (b) Plan details of the implementation.
• (c) Change operations to introduce new ways of doing things.
• (d) Monitor actual performance – after implementing their
decisions, managers still have to monitor operations using
feedback to compare actual performance with plans to make
sure that predicted results actually occur. And if things are not
going as expected, they have to adjust the operations and
plans.
Data and their presentation
• Definition: Collecting together facts and opinions,
typically in numerical form, provides data. Whether this data
is useful or not depends on the purpose it is required to
serve, the method of collection and its analysis. The data
becomes information (good or bad) when it informs the
decision-making of the user.
Data selection
• In terms of data selection, we will want to know whether the
data is:
• ➔ appropriate
• ➔ adequate
• ➔ without bias.
Appropriate data
• In answering a question like ‘what sort of business is it?’ we
will soon begin to talk numerically (if we have the data) about
size, profitability, product range, the market place, the
characteristics of the workforce and a host of other factors. To
be of value, the questions and the responses need to be
appropriate.
Data and purpose
• If your budget for car purchase is no more than £2000, it is
not particularly useful being given a range of prices for
Porsche cars between two and four years old. Data must
serve its purpose.
• We could, for example, be given sales figures on an annual,
quarterly, monthly, weekly, daily, hourly or minute-by-
minute basis.
• If all we want to know is the general trend over time, then
quarterly or monthly data might be sufficient. If we want to
predict demand for Thursday and Friday of next week,
recent sales on a daily or even hourly basis are likely to be
required.
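The effect of time granularity can be sketched in code. The example below is a minimal illustration, assuming hypothetical daily sales figures: the dates, the amounts and the function name `monthly_totals` are invented for this sketch and do not come from the source. It rolls daily data up into monthly totals, which may be enough to see a general trend but discards the day-level detail needed for short-term prediction.

```python
from collections import defaultdict
from datetime import date, timedelta

# Hypothetical daily sales: 60 consecutive days starting 1 January 2024.
start = date(2024, 1, 1)
daily_sales = {start + timedelta(days=i): 100 + (i % 7) * 5 for i in range(60)}

def monthly_totals(sales_by_day):
    """Aggregate daily sales into totals keyed by (year, month)."""
    totals = defaultdict(int)
    for day, amount in sales_by_day.items():
        totals[(day.year, day.month)] += amount
    return dict(totals)

monthly = monthly_totals(daily_sales)
for (year, month), total in sorted(monthly.items()):
    print(f"{year}-{month:02d}: {total}")
```

The monthly view has only two numbers where the daily view has sixty, so the weekly pattern in the daily figures can no longer be recovered from it.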
Adequate data
• Working with the numbers given should be informative, but
we should also be prepared to take account of other factors.
Making sense of the numbers for travel to work time may
mean that we take account of local road works or a major
sporting or music event. Numbers alone are unlikely to give us
an adequate understanding of any business problem. We also
need to take account of the people involved, the culture, the
legal and economic environment.
• Knowing whether the information is adequate is a problem for
the problem solver. It is always possible to collect more and more
data. You will:
• ➔ need to be clear about problem boundaries;
• ➔ need to know what the problem owner or client expects
from you;
• ➔ need to know if any data is missing (there are many
examples of computer files being lost or wiped);
• ➔ be expected to work within time and resource constraints;
• ➔ need to decide whether the current data is sufficient for the
purpose (as defined by agreed objectives) or whether additional
data should be acquired.
Data without bias
• The data we have will be the basis of any inferences we
make. If the data in some way misrepresents those of
interest, we have the problem of bias. The results of a survey
of married women could not, for example, be taken to
represent the views of all women. If this survey did not give
a fair chance of inclusion to younger married women, then a
further source of bias would exist. The underlying principle
is that of fair representation. We need to be clear about who
or what we want to talk about (make inferences about) and how
the sources of data can fairly represent them.
Issues of data collection
• The 5 Ws and H technique is frequently used in problem or
issue clarification.
• By asking the questions: who?, what?, where?, when?, why?
and how? we should be able to establish more clearly what the
problem actually is.
Who???
• Who? Who is an important first question in any problem. Data
will always relate to a particular group of people or set of
items in time and we use this concept to define the population
we will be working with.
Population
• The population is defined as all those people, items or
organizations of interest. Given limited resources, including time,
the identification of the relevant population is essential. If, for
example, you were concerned with the acceptability to women of
a new contraceptive pill it would be pointless contacting a group
of people to find that half were men. A similar problem can arise
if the group you have identified as the relevant population does
not include everyone for whom the survey is relevant.
• If you were interested in the purchase of music, you would need
to be careful not to exclude those under 16 years of age.
• If we were concerned with job opportunities, the population
could be all the jobs offered by local businesses, all those
organizations employing one or more persons, or could be
concerned with the national or international job market. It
should be clear from the purpose of your research what
population you actually need.
Census
• Having decided who, we must then consider whether we
need information on all of them or just a selection.
• A census is defined as a complete enumeration of all those
people, items or organizations of interest (whereas a sample
is just a selection from all those people or items of interest).
If we were interested in the services offered by rail
operating companies, we might include all of them in our
data collection given that the number of companies is
(relatively) small and the differences between them might
be of particular interest.
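The difference between a census and a sample can be sketched as follows. This is a minimal illustration, assuming a hypothetical population of 20 operators and using simple random sampling as one possible selection method; the names `take_census` and `take_sample` and the operator labels are invented for this sketch and are not from the source.

```python
import random

# Hypothetical population: 20 rail operating companies, labelled for illustration only.
population = [f"operator_{i:02d}" for i in range(1, 21)]

def take_census(items):
    """A census enumerates every member of the population."""
    return list(items)

def take_sample(items, size, seed=0):
    """A simple random sample: every member has the same chance of selection."""
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    return rng.sample(list(items), size)

census = take_census(population)
sample = take_sample(population, size=5)

print(len(census))  # number of units covered by the census
print(sample)       # the five units selected for the sample
```

With a population this small, the census is cheap and keeps every between-company difference; the sample would only be preferable if contacting each unit were costly.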
What???
• What? What data we need will depend on what we are trying to
achieve. A statement of objectives should be helpful. The
more we seek detail and description, the more likely we are to
restrict general coverage and seek in-depth information.
• Research concerned with the long-term impact of smoking on the
individual and their family unit is more likely to use illustrative
case studies or case histories. In contrast, if we are more
concerned with the purchase of cigarettes by brand for
promotional purposes, then we are likely to choose a design that
has good coverage by region, age, gender and other factors, and
is up to date. The nature of the data will also inform decisions on
the method of data collection. If we are interested in the use of
car seat belts, then observational methods could be most
effective. Experience suggests that respondents do not always
accurately report their wearing of seat belts or their frequency of
exceeding the speed limit!
Source of data
• A statistical enquiry may require the collection of new data,
referred to as primary data, or be able to use existing data,
referred to as secondary data. Most, however, require some
combination of both sources.
Primary data
• Sources of primary data include observation, group
discussions and the use of questionnaires.
• The distinguishing feature of primary data is its collection for
a specific project. As a result, primary data can take a long
time to collect and be expensive.
Secondary data
• Secondary data, in contrast, has been collected for some other
purpose. It is usually available at relatively low cost but may
be inadequate for all aspects of the enquiry. Where the data
requirements are fairly complex, it is normally seen as good
practice to first collect the lower cost secondary data, which is
usually more general but can be of good quality (it has been
published) and let this inform thinking about the enquiry.
Where??? When???
• Where? Where to find the right kind of data when you need it
or where to find the people of interest when you need them
requires skill and knowledge. The chances are that someone
somewhere will already have done some research on your
topic of interest. You only need to look to discover the number
of train passenger miles travelled each year, gross domestic
product or the number of diving fatalities in the previous year.
Internal and external data
• In organizational research it is often useful to distinguish
between internally and externally generated data. Recent sales
volume, sales value, number of employees, expenditure on
advertising and expenditure on research are all examples of
internal information that is likely to reside within the
organization but may be difficult to obtain as an outsider.
External information includes all the data generated by
national governments, local government, chambers of commerce
and other commercial sources, and it may be available from the
Internet.
Why???
• Why? Why is always worth asking. It is seen as part of a questioning
approach that should lead to a greater clarification of the problem
situation and a justification of approach.
• In fact there is a useful technique called the why technique. By probing
problems and possible solutions with the question why?, a better
understanding of the causes and effects can be achieved. As a
technique it involves repeatedly asking the question why?, perhaps with
probing statements like ‘why did you say that?’ or ‘why should that be
the case?’ or ‘why use that data?’ We should in general be interested in
why we need particular data and whether more appropriate data could
be obtained by alternative means. Anticipating the ‘why questions’ can
help you plan your analysis and any presentation of the findings that
may be required.
How???
• How? How to make things happen is often the difficult bit.
• You should use numbers to argue and win a case.
• You should make effective presentations using numbers.
Issues to be addressed:
• Having defined the population of interest and the purpose of
the research, a number of issues will need to be addressed:
• ➔ whether existing published sources provide sufficient
information;
• ➔ whether useful information can be found through an
Internet search;
• ➔ if additional data is required, how data should be
collected;
• ➔ what type of sampling should be used, if any;
• ➔ if required, how we should design and ask questions.
Example of a secondary source of data
Drawing graphs
• A graph, sometimes called a line graph, shows the relationship
between two variables.
Cartesian axes
• The most common type of graph uses two rectangular (or
Cartesian) axes. A horizontal axis is traditionally labelled x,
and a vertical axis is labelled y. Then x is the independent
variable, which is the one that we can set or control, and y is
the corresponding dependent variable, whose value is set by
x.
• Then x might be the amount we spend on advertising (which
we control) and y the resulting sales (which we cannot
control); x might be the interest rate that we charge for lending
money, and y the corresponding amount borrowed; x might be
the price we charge for a service, and y the resulting demand.
• When we talk about dependent and independent variables,
we do not assume any cause and effect. There might be a
clear relationship between two variables, but this does not
necessarily mean that a change in one actually causes a
change in the other. For example, a department store might
notice that when it reduces the price of overcoats the sales
of ice cream rise. There is a clear relationship between the
price of overcoats and the sales of ice cream, but one does
not cause the other – and both are really a result of the
prevailing weather.
Figure: Locating points with Cartesian coordinates
• The point where the two axes cross is the origin. This is the
point where both x and y have the value zero. At any point
above the origin y is positive, and at any point below it y is
negative; at any point to the right of the origin x is positive,
and at any point to the left of it x is negative. Often, we are
only interested in positive values of x and y – perhaps with a
graph of income against sales. Then we show only the top
right-hand corner of the graph, which is the positive
quadrant.
• We can describe any point on a graph by two numbers
called coordinates.
• The first number gives the distance along the x-axis from
the origin, and the second number gives the distance up the
y-axis. For example, the point x = 3, y = 4 is three units
along the x-axis and four units up the y-axis. A standard
notation describes coordinates as (x, y) – so this is point (3,
4). The only thing you have to be careful about is that (3, 4)
is not the same as (4, 3). And these points are some way
from (−3, 4), (3, −4) and (−3, −4).
• Points on the x-axis have coordinates (x, 0) and points on the
y-axis have coordinates (0, y). The origin is the point where
the axes cross; it has coordinates (0, 0).
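The sign rules for coordinates can be checked with a short sketch. The function name `describe_point` and its return strings are assumptions made for this illustration; the points tested are the ones mentioned in the text.

```python
def describe_point(x, y):
    """Say where the point (x, y) lies relative to the Cartesian axes."""
    if x == 0 and y == 0:
        return "origin"
    if y == 0:
        return "on the x-axis"
    if x == 0:
        return "on the y-axis"
    x_sign = "positive" if x > 0 else "negative"  # right or left of the origin
    y_sign = "positive" if y > 0 else "negative"  # above or below the origin
    return f"x {x_sign}, y {y_sign}"

# Order matters: (3, 4) and (4, 3) are different points.
for point in [(3, 4), (4, 3), (-3, 4), (3, -4), (-3, -4), (0, 0)]:
    print(point, describe_point(*point))
```

Only points described as "x positive, y positive" fall in the positive quadrant.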
Exercise
Solution
Solution
Diagrams for presenting data
• Diagrams attract people’s attention, and we are more likely to
look at them than read the accompanying text; as the saying goes,
‘One picture is worth a thousand words’. Good diagrams are
attractive, they make information more interesting, give a
clear summary of data, emphasize underlying patterns and
allow us to extract a lot of information in a short time. But
they do not happen by chance – they have to be carefully
designed.
• There are many types of diagram for presenting data, with the
most common including:
• tables of numerical data and frequency distributions
• graphs to show relationships between variables
• pie charts, bar charts and pictograms showing frequencies
• histograms that show frequencies of continuous data.
• The choice of best format is often a matter of personal
judgement and preference.
• But always remember that you want to present information
fairly and efficiently – and you are not just looking for the
prettiest picture.
Guidelines for choosing the type of diagram include the following:
• choose the most suitable format for the purpose
• always present data fairly and honestly
• make sure all diagrams are clear and easy to understand
• state the source of data
• use consistent units and say what these are
• include totals, sub-totals and any other useful summaries
• give each diagram a title
• add notes to highlight assumptions and reasons for unusual or
atypical values.
Tables of numerical data
• Tables are probably the most common way of summarizing
data.
• The next table shows the weekly sales of the product and this
gives the general format for tables.
• This is clearer than the original list, and you can now see that
sales are higher in the first two quarters and lower in the
last two. But the table still shows only the raw data: it
does not really give a feel for a typical week’s sales, it is
difficult to find the minimum and maximum sales, and patterns
are not clear. We can emphasize the underlying patterns by
reducing the data.
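One common way of reducing data is to group it into a frequency distribution. The sketch below is a minimal illustration under stated assumptions: the weekly sales figures are hypothetical (the source table is not reproduced here), and the class width of 10 and the function name `frequency_distribution` are choices made for this example.

```python
from collections import Counter

# Hypothetical weekly sales figures, invented for illustration.
weekly_sales = [51, 60, 58, 56, 62, 69, 58, 76, 80, 82, 68, 90, 72, 84, 91, 82]

def frequency_distribution(values, class_width):
    """Count how many values fall in each class [lower, lower + class_width)."""
    counts = Counter((v // class_width) * class_width for v in values)
    return {lower: counts[lower] for lower in sorted(counts)}

width = 10
table = frequency_distribution(weekly_sales, class_width=width)
for lower, count in table.items():
    print(f"{lower}-{lower + width - 1}: {count}")
print("minimum:", min(weekly_sales), "maximum:", max(weekly_sales))
```

Sixteen raw values collapse into five classes, and the minimum and maximum are read off directly rather than searched for in a table.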
Pie charts
• Pie charts are simple diagrams that give a summary of
categorical data. To draw a pie chart you draw a circle – the
pie – and divide this into slices, each of which represents one
category. The area of each slice – and hence the angle at the
center of the circle – is proportional to the number of
observations in the category.
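The proportionality rule translates directly into a calculation of slice angles. The sketch below assumes four hypothetical categories with invented counts; the function name `slice_angles` is also an assumption of this example.

```python
def slice_angles(counts):
    """Return the angle in degrees of each pie slice, proportional to its count."""
    total = sum(counts.values())
    return {category: 360 * value / total for category, value in counts.items()}

# Hypothetical observations per category, invented for illustration.
observations = {"A": 25, "B": 45, "C": 10, "D": 20}

angles = slice_angles(observations)
for category, angle in angles.items():
    print(f"{category}: {angle:.1f} degrees")
```

Because each angle is a share of 360 degrees, the angles always sum to a full circle whatever the counts are.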
Example
• Hambro GmbH has
operations in four regions of
Europe, with annual sales in
millions of euros given in
the following table.
Example of a pie chart
• Pie charts are very simple and have an immediate impact, but
they show only very small amounts of data. When there are
more than, say, six or seven slices they become too
complicated and confusing. There is also some concern about
whether people really understand data presented in this format
and whether it gives a misleading view.
Bar charts
• Like pie charts, bar charts show the number of
observations in different categories.
• Each category is represented by its own line or bar, and the
length of this bar is proportional to the number of
observations.
• One constant rule, though, is that you should always start
the scale for the bars at zero, and never be tempted to save
space by omitting the lower parts of bars.
• Truncating the scale is sometimes unavoidable in graphs, but in
bar charts the result is simply confusing.
Example of bar charts
Pictograms
• In many types of presentations, it is more important to
attract attention and maintain interest than to give detailed
statistical accuracy. It may be necessary to make a few
important points effectively (think about the methods a
politician might employ) and not confuse people with
details in the limited time or space available.
• A pictogram can be very effective in such circumstances.
• The bars drawn on a bar chart are replaced by an
appropriate picture or pictures, either vertically or
horizontally.
Histograms
• Histograms are frequency distributions for continuous data.
They look very similar to bar charts, but there are important
differences. The most important is that histograms are used
only for continuous data, so the classes are joined and form a
continuous scale. When we draw bars on this scale, their width
– as well as their length – has a definite meaning. The width
shows the class size, and the area of the bar shows the
frequency.
Frequency distributions for two sets of data
Describing the location and spread of data
Conclusion
• A measure of location is used to show where the center of
the data is, giving some kind of typical or average value.
• A measure of spread is used to show how the data is
scattered around this center, giving an idea of the range of
values.
• In a typical bar chart or histogram, measures of location show
where the data lies on the x-axis, while measures of spread
show how dispersed the data is along the axis.
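As a minimal sketch of these two ideas, the example below computes some common measures of location and spread with Python's standard `statistics` module. The data set is hypothetical, and the choice of mean, median, range and population standard deviation is an assumption of this example; the text itself does not name specific measures.

```python
import statistics

# Hypothetical observations, invented for illustration.
data = [2, 4, 4, 4, 5, 5, 7, 9]

# Measures of location: where the center of the data is.
mean = statistics.mean(data)
median = statistics.median(data)

# Measures of spread: how the data is scattered around the center.
value_range = max(data) - min(data)
std_dev = statistics.pstdev(data)  # population standard deviation

print("mean:", mean, "median:", median)
print("range:", value_range, "standard deviation:", std_dev)
```

Two data sets can share the same mean and still differ widely in range or standard deviation, which is why a measure of location is usually reported together with a measure of spread.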