6. Many different studies might be helpful in this situation. Probably the two major
lines of investigation should be on (1) the firm’s operating inefficiencies and (2) its
future prospects in the industry. In the first case the president might request research
to determine whether the inefficiency is caused by internal production problems,
poor organization, ineffective cost controls, weak sales management, and so on. It
might be in the form of a company management audit. The examination of the
industry and the firm’s future in it would probably be directed at possible trends in
consumption patterns and distribution systems. With this database the president
might then consider possible changes in company operations that could help the company
adapt to these trends.
7. When needed, the instructor should elaborate on these points. In this question,
understanding of the concept of morale is important. How is it defined? Is the
definition consistent with the literature? How is it measured? Does the measurement
operationalize the definition? What can we infer about the disproportionate sample
sizes and their composition? Does this suggest that there are only clerical and
executive employees in the headquarters? How are the findings from the separate
samples treated? Should we be concerned about the effects of social desirability on
the secretaries’ responses? Other questions on methodology, data analysis, and the
consultant’s credentials can be raised in conjunction with the nine criteria.
8. (1) The manager-researcher may tend to generalize from limited experience to all situations. This limits the scope of the study
and any method selected unless the researcher makes a deliberate effort to
incorporate other ways of thinking. (2) There is the constant danger of mixing
personal motives and research. In this situation, particular research results can have
employment implications for the manager-researcher. (3) A common source of bias
in using sales managers as researchers is that they prefer to understate demand.
Subsequently, sales are achieved more easily, and the manager looks better. This
tendency also leads managers to rely on their subordinates as key information
sources, or to use quantitative methodologies that give conservative estimates at the
cost of objectivity.
Bringing Research to Life
9. The student would look for the following in a proposed research design:
Purpose clearly defined: Research must provide an estimate of the size of the
outboard engine market, in sales and units, and an estimate of current market share
of all industry participants.
Research process detailed: A research proposal including budget will be approved
before the research is conducted, and the researcher will report weekly on the
progress of the process. Research will be completed within 30 days.
Research design thoroughly planned: Research will include internal data mining,
extensive secondary data search of industry specific sources, as well as interviews
with industry experts.
Limitations frankly revealed: Research will focus exclusively on outboard engines of
the size currently manufactured or in development.
High ethical standards applied: Each estimate used in the forecasting will be
confirmed by two or more sources to avoid bias in the calculation.
Adequate analysis: Footnotes explaining the calculations will be presented with
the findings.
Findings presented unambiguously: Findings will be presented in spreadsheet
format, with pessimistic, expected, and optimistic forecasts.
Conclusions justified: Recommendations will not include strategies or tactics for
expansion of company sales and market share.
Researcher's experience reflected: Researcher's credentials will appear in the
proposal.
10. Myra revealed her company is interested in assessing the level of customer
satisfaction (or dissatisfaction) with their post-sale servicing of their MindWriter
laptop computers. This is applied research, of a descriptive nature. Once the
satisfaction level is known, Myra might find that a predictive study would be of
value. In the proposal development stage, Jason should use his research expertise to
determine if Myra's company will be satisfied with the likely results of a descriptive
study.
11. For management to call a meeting requiring significant travel, they are obviously
concerned about something. Management asks Myra to attend the meeting with her
number cruncher, which would imply that there are numbers of some sort to analyze.
If they just wanted a benchmark study to facilitate continuous monitoring and
subsequent tracking of customer satisfaction, such a request would be unlikely to
require such a management meeting. Additionally, Jason suggests that Myra discern
"what facts management has gathered". He obviously believes that facts will be
shared at the meeting, without anticipating what those facts might be.
1. A. Concepts and constructs are both abstractions, the former from our perceptions of reality
and the latter from some invention that we have made. A concept is a bundle of meanings
or characteristics associated with certain objects, events, situations and the like.
Constructs are images or ideas developed specifically for theory building or research
purposes. Constructs tend to be more abstract and complex than concepts. Both are
critical to thinking and research processes since one can think only in terms of meanings
we have adopted. Precision in concepts and constructs is particularly important in research
since we usually attempt to measure meaning in some way.
B. Both deduction and induction are basic forms of reasoning. While we may emphasize one
over the other from time to time, both are necessary for research thinking. Deduction is
reasoning from generalizations to specifics that flow logically from the generalizations. If
the generalizations are true and the deductive form valid, the conclusions must also be
true. Induction is reasoning from specific instances or observations to some
generalization that is purported to explain the instances. The specific instances are
evidence and the conclusion is an inference that may be true.
C. Dictionary definitions are those used in most general discourse to describe the nature of
concepts through word reference to other familiar concepts, preferably at a lower
abstraction level. Operational definitions are established for the purposes of precision in
measurement. With them we attempt to classify concepts or conditions unambiguously
and use them in measurement. Operational definitions are essential for effective research,
while dictionary definitions are more useful for general discourse purposes.
D. Concepts are meanings abstracted from our observations; they classify or categorize
objects or events that have common characteristics beyond a single observation (see A).
A variable is a concept or construct to which numerals or values are assigned; this
operationalization permits the construct or concept to be empirically tested. In informal
usage, a variable is often used as a synonym for construct or property being studied.
E. A proposition is a statement about concepts that can be evaluated as true or false when
compared to observable phenomena. A hypothesis is a proposition made as a tentative
statement configured for empirical testing. This further distinction permits the
classification of hypotheses for different purposes, e.g., descriptive, relational,
correlational, causal, etc.
F. A theory is a set of systematically interrelated concepts, constructs, definitions, and
propositions advanced to explain and predict phenomena or facts. Theories differ from
models in that their function is explanation and prediction whereas a model’s purpose is
representation. A model is a representation of a system constructed for the purpose of
investigating an aspect of that system, or the system as a whole. Models are used with
equal success in applied or theoretical work.
G. The characteristics of the scientific method are confused in the literature primarily
because of the numerous philosophical perspectives one may take when “doing” science.
A second problem stems from the fact that the emotional characteristics of scientists do
not easily lend themselves to generalization. For our purposes, however, the scientific
method is a systematic approach involving hypothesizing, observing, testing, and
reasoning processes for the purpose of problem solving or amelioration. The scientific
method may be summarized with a set of steps or stages but these only hold for the
simplest problems. In contrast to the mechanics of the process, the scientific attitude
reflects the creative aspects that enable and sustain the research from preliminary
thinking to discovery and on to the culmination of the project. Imagination, curiosity,
intuition, and doubt are among the predispositions involved. One Nobel physicist
described this aspect of science as doing one’s utmost with no holds barred.
2. The scientific method emphasizes (1) direct observation of phenomena, (2) clearly
defined variables, methods, and procedures, (3) empirically testable hypotheses, (4)
the ability to rule out rival hypotheses, (5) statistical rather than linguistic
justification of conclusions, and (6) the self-correcting process. A schematic
approach to the scientific method is found in the section entitled Reflective Thinking
and the Scientific Method. Here, observation, the processes of induction and
deduction, and hypothesis testing combine in a systematic way, as we proceed from
observations to a critical, self-correcting approach to theory development.
3. Marketing, management, organizational behavior and other studies where the human
input is the central focus tend toward empirical solutions to problems and theory
building. Examples of such studies are (1) media effectiveness, (2) the impact of
different incentive schemes, (3) the organizational efficiency of different patterns of
organizing. Management science, operations research, production, and associated
areas tend toward rationalist approaches. Examples are (1) linear programming
transportation studies, (2) inventory modeling and cost forecasting, and (3) production and
machine-time scheduling. A quick survey of the current issue of the top two or three
journals in each field will reveal much about preferred methodologies. From this you
may wish to construct a pie chart on the board showing the range of research
approaches for each discipline.
Car sales (DV) will increase as per capita income increases (IV), as long as low
interest rates (IVV) increase ease of access to credit among younger (EV) men (EV) and
competitors do not introduce more attractive models (EV), increase advertising
(EV), or increase their discounts (EV).
B. Given that buyer behavior is fickle with respect to ego-involved purchases
(e.g., a car), and given the number of uncontrollable extraneous variables, a
model based on the above theory is unlikely to be relevant for any length of
time.
Constructs—customer defection.
B. Female sales representatives who are more culturally supported in establishing and
maintaining relationships extend that personal behavior into the workplace.
Lower customer defections = fewer current customers lost at time of contract renewal
resulting in a smaller customer defection percentage.
Customer defection percentage = the number of customers who do not renew their
contract during the measurement period divided by the total number of customers at
the start of the measurement period.
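As a minimal sketch of this operational definition (the function name and the customer counts below are hypothetical, used only for illustration), the defection percentage could be computed as follows:

    # Hypothetical illustration of the operational definition above.
    def customer_defection_percentage(customers_at_start, customers_not_renewing):
        # Defection % = customers not renewing during the period,
        # divided by total customers at the start of the period, times 100.
        return 100.0 * customers_not_renewing / customers_at_start

    # Assumed example figures (not from the text):
    print(customer_defection_percentage(customers_at_start=400, customers_not_renewing=36))  # 9.0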
8. Hypothesis 1—Receptionist misdirects calls due to his inability to correctly hear the
problem as stated by the caller.
The above hypothesis is induced from the situation described in the problem. From
the hypothesis we must be able to deduce some other factual conditions implied by
this hypothesis. For example:
Fact 1—The complaints of misdirected calls only occur when the 20-year
employee works the reception desk.
Fact 2—A fully hearing employee does not generate complaints of
misdirected calls.
Fact 3—Employee has requested two sick days in the last 3 months for ear-
related infections.
Hypothesis 2—A faulty switch causes the misrouted calls.
Fact 1—Switch A is tested by placing several sample calls and the calls are
correctly routed.
Fact 2—Switch B is tested by placing several sample calls and the calls are
misdirected.
9. This question should lead to a lively discussion. ITE members were primarily concerned
by State Farm’s failure to factor in traffic volume measures (as an indication of
probability of an accident occurring) and police incident reports (as an indication of
severity) in arriving at their designated ‘dangerous intersections’.
a. (1) What constitutes an accident? This is the most fundamental construct. Is an
accident only one in which the police are involved? Or is it one in which either
personal or property injury occurs? Or is it one in which medical treatment is
delivered? Or maybe an accident is only one where two vehicles collide in an
intersection. (2) Another construct is “What is an intersection?” Should an
intersection have to achieve some base level of traffic to be considered part of the
pool of intersections that are considered? (3) Traffic volume is another construct, as it
is based on a statistical projection. In most instances volume is a sample measure
based on number of cars proceeding through an intersection during a specified period
of time in each of four directions. The period of time varies based on staff or
equipment used to make the counts. The volume is likely to vary depending on time
of day, day of week, and week of the year, as well as based on direction of
movement. Most volume counts are projections based on an actual traffic count of
vehicles moving through an intersection for a short period (sample) of time. So while
the ‘count’ may be a concept, ‘proceeding through an intersection’ is a construct. (4)
Accident severity is yet another construct, which can be defined with any number of
additional constructs and concepts. Should State Farm use medical recovery cost and
property damage repair estimates to determine severity or some other measure like
loss of life or limb? Even ‘loss of life’ may become a construct if one considers that a
severe accident might place a victim in a coma, and thus technically alive in medical
terms while not functioning in the layman’s eyes.
b. One possible hypothesis of ITE is that “a dangerous intersection is one in which the
largest percentage of vehicles moving through an intersection are involved in a
collision with another vehicle or an immovable object, regardless of whether an insurance
claim is filed.” Another hypothesis might be that “only intersections with at least X
traffic volume can be considered dangerous and only then if X percent of vehicles
moving through that intersection are involved in a collision.” Ultimately ITE
members are responsible for solving intersection problems—whether or not such
intersections are brought to their attention by State Farm. So students might also
propose hypotheses related to intersection problems: “Intersections without signals
visible for a minimum of 500 yards are more dangerous than those with signals
visible for a minimum of 500 yards.”
Bringing Research to Life
10. Numerous variables were introduced in the text discussion of BRTL's Army firing range
vignette. Among these:
• "danger" warning • loss of lucrative mining • proximity to firing range (MV)
(EV-control) jobs (EV)
• belief in fate (MV) • nocturnal scavenging of • resident education level (EV-
firing range for copper control)
wire (DV)
• fighter bombing • post-fire handling (MV) • speed of impact (IV)
with active artillery
(IVV)
• detonation delay • price of copper wire (EV)
(EV)
• kerosene lantern • profit from scavenged
marking (EV) materials (IVV)
• An active (non-dud) shell when fired from a cannon (IV) explodes on impact
(DV).
• A "dud" shell when fired from a cannon (IV) does not explode on impact (DV).
• A "dud" shell when fired from a cannon (IV) does not immediately explode on
impact (DV) and cannot subsequently be detonated by any means, human or
mechanical (IV).
• Locals' conviction that predetermined fate dictates time and place of death (IV)
leads them to undertake life-threatening behaviors—scavenging on the firing range
(DV). If locals could be warned of the danger (IV) of their actions, they would
change their nocturnal behavior (DV).
• The loss of mining jobs (IV) leads to acceptance of higher-risk behaviors to earn
a family-supporting income—race-car driving or nocturnal scavenging (DV)
especially due to the proximity of the firing range (MV) and limited education (MV)
of the residents.
• Among residents with less than a high school education (EV-control), the loss of
high income mining jobs (IV) leads to acceptance of higher-risk behaviors to earn a
family-supporting income—race-car driving or nocturnal scavenging (DV) especially
due to the proximity of the firing range (MV).
• Marking “dud” shells with kerosene lanterns for same-evening detonation (IV)
will reduce nocturnal scavenging (DV) among poorly educated local residents (MV)
by eliminating the profit motive for such behavior (IVV).
• An increase in the price of salvaged copper wire (IV) leads to an increase in
scavenging (DV) on the Army firing range.
1. When we consider whether a topic is researchable it is wise to note that the answer is one
of degree. Research can shed some light on most topics, but there are two situations where
research can not provide much help. In the first case there are questions relating to “value”
where fact gathering can not contribute much. For example, we may consider making a
merger offer to company X and ask the question, “Do we really want to grow in this way?”
Or, “Will we be happier making this offer rather than an offer to company Y?”
A second situation where research is limited concerns those questions where data
gathering could be helpful, but our techniques or procedures are inadequate. In the
merger case, for example, we might ask the question, “Will the stockholders of
Company X welcome our merger offer?” This type of question is answered if there
is a way to get the data before making the offer. However, there may be no method
to enable us to make this assessment short of making the offer. Or again, the
question, “Will the U.S. Department of Justice fight our merger plan?” While there
are no legitimate techniques to gain certainty about such information, research may
provide probable or at least possible clues as to reactions from stockholders,
government agencies and other stakeholders.
2. This question addresses the issue of shortsightedness in research planning. The
use of exploration, even under budgetary restraints, may make the difference
between terminating the project early or spending a substantial sum to rediscover
what is already known. The use of published data and experience surveys, for
example, may permit answering the question, changing the question, refining the
question, or selecting an optimal methodology, all of which save costs.
Whereas exploration avoids costly mistakes on the front end of the problem, pilot
testing identifies methodological misapplications and measurement problems before
data collection. Both activities add to the cost of the project, but without them the
information may be completely without value. The failure to do some exploration
may result in studying the wrong problem and the failure to pretest may threaten the
validity of the study.
3. In a study of an inventory management situation the use of the last year’s audit as the key
database would imply that previous data on inventory levels, inventory mistakes, and slow-moving
and surplus stock are the basis for decision making. Normally such data are used to
evaluate the appropriateness of desired inventory levels (minimum, maximum, average).
Simulations are also possible based on alternative models. The second proposal uses a
study of systems and procedures to recommend changes. The focus here is procedural
efficiency and the results of new procedures are normally more conjectural than
quantitative data based approaches.
Prior evaluation is generally difficult; however a data-driven, audit based approach
can predict or simulate cost changes for carrying inventories. This is based on the
evaluation of past inventory levels, and targeted inventory levels using revised
inventory carrying norms. The same exercise becomes more difficult for a systems-
and procedure-based approach, because it is far more difficult to impute quantitative
cost saving to procedural changes. While ‘ex post facto’ evaluations are easier in
either case, with the latter problem, this may be the only form of evaluation possible.
The two research proposals represent decision alternatives, with savings expectations
computed ex ante. One interesting point is that there is a possibility that the two
proposals finally yield similar expectations of savings. However, one of the options
may involve a more risky proposition: a more risky proposition anticipates a greater
deviation between the best and worst case scenarios, even though the savings
expectation could be the same. Here decisions would have to go beyond a decision
theory of expectations and the final decision would depend on a choice between risk
averse and risk seeking behavior.
Making Research Decisions
4. The President of Oaks International Inc. faces a management dilemma: the company is
plagued by low productivity. The management question should seek to identify the factors
that lead to low productivity and identify the strategies that can lead to increases in
productivity. In this case the President is assuming that (1) the cause of low productivity in
the organization is low job satisfaction and (2) there is a relationship between job satisfaction and
productivity. The latter relationship, even if partially valid, may be largely influenced by
moderating and intervening variables. For instance, performance or productivity is an
outcome of “work input” or effort, and this becomes a key intervening variable; a similar
variable may be absenteeism. Focusing on these variables may be important, as their
salience may be as high as that of job satisfaction. Finally, the cause of low productivity
may not lie in personnel issues, but in other matters such as the plant, equipment, materials
availability, or technology. The President’s approach biases the results and an exploratory
exercise to determine possible causes of low productivity is necessary. This may be
followed up by a pilot study to narrow key research questions to factors that have greater
importance with respect to productivity.
5. The editor of Gentlemen’s Magazine has asked you to carry out a research study. The
magazine has been unsuccessful in attracting advertising revenue from shoe manufacturers.
The shoe manufacturers claim that men’s clothing stores are a small and declining segment
of the men’s shoe business. Since Gentlemen’s Magazine sells chiefly to men’s clothing
store managers, the manufacturers have reasoned that it is not a good advertising medium
for their shoes.
The editor disagrees about the size and importance of shoe marketing through men’s
clothing stores. Neither side has much direct evidence on the matter, so the editor
asks you to carry out a study to determine the facts about these stores as a channel of
distribution for men’s shoes. You agree to do so and proceed by first structuring the
problem.
1. Management Question. The problem facing the magazine management is that of
expanding their advertising revenue. There are a number of ways that this might be
done and research might help in many of them. In this case, however, the
management has already decided that the management’s problem is to secure more
advertising from the shoe industry. The researcher accepted the problem definition as
stated.
2. Research Question. The research question was defined as “Are the actual or potential
sales of men’s shoes in menswear stores large enough to represent an advertising
opportunity to shoe manufacturers?”
3. Investigative Questions. Three major investigative questions were proposed:
A. Are shoe sales important to menswear stores?
B. Have shoe sales been growing in importance in menswear stores, and are they
expected to grow in the future?
C. Does the situation in menswear stores present an advertising opportunity for shoe
manufacturers?
It was agreed that the only feasible research design given the available budget
was a national mail survey. Because of the limited scope of the study and the
problems of population estimation from a mail survey, it was agreed that the
research study would be concerned chiefly with the first two questions. The third
question would be approached within the limits of the research design, but a full
answer to this question would require information not available to the researcher.
To answer the major investigative questions posed above it was necessary to
obtain information on a number of much more specific points. These represent
investigative sub-questions that we must know to answer the three major
questions. Parenthetical statements attached to them suggest the rationale for
including each question in the study.
D. Are shoe sales important to menswear stores?
1. What percentage of menswear stores carry men’s shoes? (One important
measure of the reach of this channel of distribution and an indicator of the
importance of shoes to retailers.)
2. What percentages of menswear store sales come from shoes? (A measure of
the importance of shoes to total store operation.)
3. How do shoe gross margins compare with total store gross margins?
(Another measure of shoe importance to the store and an indicator of whether
or not shoes are likely to secure the store management’s attention.)
4. How do space/sales ratios compare between shoe departments and the total
store? (Another measure of the contribution of shoes to the total store
volume. Also a measure of how secure shoes might be as a part of the store
product mix.)
5. How important are shoes as traffic builders, extra sales, or extra business
from those who do not buy other merchandise at the time? (Another measure
of the value of shoes to store success and an indicator of the mix security of
shoes.)
E. Have shoe sales been growing in importance in menswear stores and are they
expected to grow in the future?
1. What percentage of the stores has added, expanded, dropped, or reduced shoe
operations in the last five years? (A measure of the changes in importance of
shoes in menswear stores in recent years.)
2. Have shoe sales grown or declined in absolute dollars in recent years? As a
percentage of total store sales? (Measures of the changes that have taken
place in menswear stores concerning shoes.)
3. What are store managers expecting in the next few years concerning shoes?
Add or drop? Reduce or enlarge? (A measure of store operator expectations
concerning shoes.)
F. Does the situation in menswear stores present an advertising opportunity for shoe
manufacturers?
1. What percentage of the men’s shoe volume is sold through menswear stores?
(A very crude measure of the importance of this channel to the manufacturer.
This estimate will be carefully hedged since there is little hope that the
survey can provide an unbiased estimate of total sales of men’s shoes.)
2. How important are leased departments in men’s shoes operations in
menswear stores? (A measure of the need for shoe manufacturers to contact
individual store managers to secure distribution, or whether the approach of a
few central buying firms might suffice.)
3. What price lines are carried, and what is the relative importance of each? (A
measure of the type of demand for men’s shoes represented in this channel.)
4. What is the importance of casual shoes versus dress shoe sales? (Another
measure of the type of shoe demand that prevails in this channel.)
5. What brands are carried and what is the relative importance of each brand?
(A measure of the penetration by various manufacturers as well as a measure
of the product characteristics that are popular in this channel.)
4. Measurement Questions. The 13 investigative questions listed under the three major
questions translated into about 20 specific measurement questions that were finally
asked of the participants. There were six additional classification-type questions.
They differed from the investigative questions chiefly in wording and format, so as to
aid the participant in replying with a minimum of difficulty and a maximum of
reliability.
6. A. Students should be asked for their suggestions of why sales of beef entrees
might be declining. They should generate many plausible reasons. Sales of
beef could be down due to fear (as suggested) but might also have decreased
due to a shift to fish, chicken, or vegetarian foods. Any of these alternative
entrees would be appropriate choices if customers are experiencing increasing
concern over high cholesterol as promoted by the area’s medical community.
Declining sales of beef might also be due to the closing of a primary beef
supplier, causing beef deliveries to be interrupted with a resulting out-of-stock
condition occurring. Sales could also be down due to waitress suggestions,
price incentives on desserts which led customers to reduce the size of their
entrees, or a new chef who prefers to use non-beef products for the featured
entrée. We need to know more about the internal situation to move forward.
When did the sales decline first start? Can the sales declines be linked to
changes in other internal behaviors? A survey of area restaurants, if the
stimulus is external, might support the idea of diminished sales of beef but
wouldn’t provide a reason for why or a clue how to correct the situation. Why,
for example, is a survey of other restaurants better than a survey of the
restaurant’s customers? The manager might be jumping to an unwarranted
conclusion that would result in misspent research dollars.
B. Encourage the students to use the management-research question hierarchy to develop
questions that could help reverse the trend of beef entrée sales.
11. The students' answers to this question can focus on many management problems. Some
students should be invited to share their answers to this question, and form the hierarchy.
Other students with similar management dilemmas can assist that student to build a
complete hierarchy. Each student should start with a symptom to reveal the management
dilemma. Some examples of management dilemmas that students might use are:
a. workers not reporting for scheduled shifts…explore scheduling
b. high worker turnover (losing workers to competitive firms)…explore hiring, pay,
scheduling, training, customer service complaints
c. complaints from workers about disproportionate effort or results not reflected in
their pay…explore pay, benefits, scheduling, incentive programs, productivity
programs
d. high waste level within production…explore machinery accuracy and
maintenance, employee training, pay, supervision
e. high level of defect within production…explore machinery accuracy, training,
pay, supervision
f. low number of customers who remember advertising…explore message, media,
timing
12. The proposal content should include what information they want to collect with the 300
interviews to answer the management question chosen, and how they think such research
will help answer the management question. For example, the student might choose the
police chief study (Question 10e) on correcting dispatching problems. They might suggest
doing telephone interviews with the head dispatching officer in 300 police precincts around
the country, collecting information on the types of dispatching equipment used, the process
that guides the dispatching of officers to crimes or incidents, the average response times,
etc. If a similar process or machinery is used within the most efficient of the 300
departments, then a management question about switching to that process or machinery can
be raised.
5. Any number of journal reports can be chosen for this one. Each should be evaluated on
its own merits. A useful exercise is for small groups to critique proposal outlines.
The proposals are distributed without identification and the group works as a team to
prepare a critique for each of 4-5 proposals.
6. A. This is a study with major public relations consequences: if the effectiveness is
perceived to be minimal the bank may stop supporting specific organizations or
institute new policies governing the loaning of executives to boards and community
initiatives. Also, information about the effectiveness of community contributions is
unlikely to be available from a published source, although a bank trade association
might have explored the issue. Yet even given the likely decisions and their
influence on company reputation, most firms would consider this a small study. The
proposal would typically be in the form of a memo to the president detailing the
following:
1. Problem Statement – Are financial and manpower contributions to community
not-for-profits providing any measurable return to the bank?
2. Research Objectives -
a. identify how perceived value is measured by various publics
b. define benefits (‘measurable return’) to the bank
c. determine time period
d. other related items
3. Research Design - descriptive
4. Results anticipated
5. Schedule
7. The Seagate Proposal follows the model of the large-scale contract study with some
exceptions. There is no clearly noted section ‘executive summary’ nor is there a literature
review. The project objectives are fairly straightforward which would explain the absence
of these sections. The section detailing the qualifications of the supplier teams provides the
information on project management, facilities and special resources. It is not unusual that
the measurement instrument is not included; developing it is part of the proposed project
and the research teams have extensive experience in the area of consumer satisfaction
research. The level of detail in the research design and the data analysis is the strongest
clue as to the type of study. These sections are very extensive, as they should be given the
proposed cost of the project ($142,000). Your students might have liked to find a glossary
of terms. They should be encouraged to discuss whether such a glossary was necessary and
what might justify its exclusion or inclusion. The same could be said of the literature
review. What would it have added? Why was it not included? Some likely reasons for
exclusion are a prior relationship between the parties, an understanding of the background
of the deciding manager, or specifications within an RFP that specifically requested the
omission of these sections.
Abbreviated Student Proposal with Comments
1.
a. Exploratory versus formalized studies: Exploratory studies tend to have loose
structures with the purpose of discovering future tasks, or to develop future
hypotheses and questions for further research. The goal of a formal research design is
generally specific: to test hypotheses or answer research questions.
b. Experimental and ex post facto designs: In an experimental design the researcher
seeks to study the effects of variables by controlling and manipulating them. In an ex
post facto design the researcher does not influence the variables but reports on what
has happened.
c. Descriptive and causal studies: A descriptive study is concerned with description,
that is, the who, what, where, when, or how much in observations, whereas a causal
study seeks to identify, verify, and establish relationships between variables.
2.
a. A relationship between variables is latent, but what is manifest are only the possible
effects. A relationship itself can only be theoretically postulated. For instance, a
higher income level may induce the purchase of higher priced cars, and this can be
theoretically postulated. Yet the data focuses on a manifest variable (purchase) rather
than the latent psychological processes.
b. Inductive conclusions, unlike deductive conclusions, have no “necessary”
connections between facts and conclusions. Thus the conclusion of an induction may
be simply one explanation for an observed fact whereas the conclusion of a deduction
is the explanation, if the deduction’s requirements are met. This means that when
dealing with causal relationships we require other more rigorous devices to assure
ourselves that our probabilistic statements contain the least possible margin for error.
Methods such as experimentation and statistical tests help to improve our confidence
in ascribing cause to inductive conclusions.
c. A correlation between the following variable pairs may be found statistically at a
point in time; however, there is no causal relationship between the variables. Such
correlations are said to be spurious.
Increases in productivity : Increases in stock offerings to the public
Decreases in job satisfaction : Decreases in the consumer price index
A classic example of spurious correlation is the fallacious argument: all alcoholic
beverages contain water; hence excessive consumption of water leads to cirrhosis of
the liver.
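The point can be illustrated with a short simulation (all numbers and series names below are invented): two series that merely share a time trend will correlate strongly even though neither causes the other.

    import numpy as np

    rng = np.random.default_rng(42)
    t = np.arange(50)

    # Two hypothetical series that share only an upward time trend.
    productivity = 100 + 0.8 * t + rng.normal(0, 2, size=t.size)
    stock_offerings = 20 + 0.5 * t + rng.normal(0, 2, size=t.size)

    # A strong correlation appears despite the absence of any causal link.
    r = np.corrcoef(productivity, stock_offerings)[0, 1]
    print(f"r = {r:.2f}")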
3.
a. Stimulus-response: When you are challenged to justify your position during a
management meeting your pulse rate increases rapidly, and you speak out strongly in
defense of your position.
b. Property-disposition: You are a member of a minority ethnic group and this makes
you very sensitive to ethnic type comments by others.
c. Disposition-behavior: You have strong opinions about the degradation of our
physical environment by some industries; as a result you are highly selective in
choosing the companies with whom you interview for career opportunities.
d. Property-behavior: You have grown up as a member of the upper-lower social class
and now follow the typical consumption practices of that class.
4. There are virtually an infinite number of possible extraneous variables that may confound
a causal relationship. Many are unanticipated and unidentified. We also have a
limited ability to control more than a few variables. By randomization we can,
within specific limits of variance, expect to equalize out the influence or potential
influence of these many extraneous variables. We can, however, control for a few
variables that are expected to be most important. By so doing we can assume that
they do not confound our study results.
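A small simulation (with assumed values) illustrates the logic of randomization: when participants are randomly assigned, an extraneous variable such as age tends to be equalized across groups within the limits of sampling variance.

    import numpy as np

    rng = np.random.default_rng(0)

    # An extraneous variable (say, participant age) for 200 hypothetical subjects.
    age = rng.normal(40, 10, size=200)

    # Random assignment to treatment and control groups.
    shuffled = rng.permutation(age)
    treatment, control = shuffled[:100], shuffled[100:]

    # The group means differ only by chance.
    print(f"treatment mean age: {treatment.mean():.1f}")
    print(f"control mean age:   {control.mean():.1f}")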
5.
a. These two approaches are similar in their objective of trying to show IV-DV, or
causal relationships, basically by means of:
1. Studying co-variation patterns between variables.
2. Determining time order relationships.
3. Attempting to eliminate the confounding effects of other variables on the IV-DV
relationship.
They often use the same data collection and data manipulation methods. For
example, either may use interviews or observation, use certain statistical methods,
and the like.
b. They differ in their ability to measure causal effects. In experimental design we
can set up situations, manipulate variables, assign participants to exposure or
control groups, and control other variables. With ex post facto research, we must
accept what is, or what has been, uncover comparative groups that have been
exposed and others who have not been exposed to the “causal” factor, attempt to
learn the time order effect after the fact, and attempt to “control” other variables
by various after-the-event statistical or classification procedures. Given these
problems it is easily apparent why experimental design is the more powerful of
the two methods for causal analysis.
6. A good first step would be some exploratory research. This could be initiated by a study
of secondary data:
• Journal and newspaper articles on hospital volunteers.
• Following up the references listed in those publications.
• Publications of major hospitals, especially those targeting volunteers.
• Publications relating to the subject of volunteer training.
• Hospital Internet sites for pages related to volunteers.
• An Internet search related to volunteers and training.
8. a. The index of consumer confidence and the business cycle: While the relationship
between consumer confidence and the business cycle may be interdependent, the
most frequent form has the consumer confidence as the independent variable and the
business cycle as the dependent variable. In fact, consumer confidence is usually
viewed as a leading indicator of the business cycle. We might have as moderating
variables the level of employment, actions taken by the federal government in tax
policy, and others. Extraneous variables might include such things as inflation rate,
development of exciting new automobile models, the price of gold, and changes in
the role of the family in society. Note that the designation of moderating and
extraneous variables is often a matter of choice.
b. Level of worker output and closeness of supervision of the worker: The closeness of
supervision of the worker implies, and is generally accompanied by, behavioral
control and guidance on work. Both of these are factors that positively impact the
level of output. Supervised work leads to higher productivity; hence the level of
worker output may be regarded as the dependent variable and the closeness of
supervision as the independent or explanatory variable. Guidance at work may be
regarded as the intervening variable.
c. Degree of effort in a class and the student’s GPA: The student may postulate that
the amount of time spent reading class material, working text exercises, and
discussing text cases will lead to a better understanding of course material,
regardless of the student’s innate intelligence and assuming a high level of
interest due to courses being in their major, and thus will result in a better grade
for each class in the major, hence a better overall GPA. In this theoretical
formulation, GPA is the dependent variable, and effort is the independent
variable, while level of intelligence, level of understanding, and level of interest
are moderating variables.
Bringing Research to Life
10.
Exhibit 6-1 criteria Snapshot: Second GEM
Research question crystallized Formal study
Method of data collection Interrogation/communication
Power of researcher to produce effects Ex post facto
Purpose of study Descriptive
Time dimension Longitudinal
Topical scope Statistical
Research environment Field setting
Respondents’ perception of research activity Not disguised
10. (continued)
Snapshot: John Deere
Research question crystallized Exploratory
Method of data collection Interrogation/communication and monitoring
Power of researcher to produce effects Ex post facto
Purpose of study Descriptive
Time dimension Cross-sectional and longitudinal
Topical scope Case and statistical studies
Research environment Field and lab settings
Respondents’ perception of research activity Not disguised
Snapshot: Kool-Aid
Research question crystallized Formal
Method of data collection Interrogation/communication
Power of researcher to produce effects Ex post facto
Purpose of study Descriptive
Time dimension Cross-sectional
Topical scope Statistical
Research environment Field
Respondents’ perception of research activity Actual routine
Exhibit 6-1 Sample Design Critique Instructions
The purpose of your paper is to critique the design of the chosen article using criteria established
in Chapter 6. Consider the following characteristics in writing your paper.
Some journals are listed below. Select a recent issue (during the last year):
The article The Commitment of Social Workers to Affirmative Action, from the Journal of Sociology and Social
Welfare, is an attempt to study the relationship of several variables concerned with affirmative action. The Texas
chapter of the National Association of Social workers (NASW/Texas) was selected as the population for the study.
Surveys were mailed to 474 members, and 193 responded, creating a response rate of 46.3 percent.
Three general study variables were selected, from which the authors analyzed the data collected from the study
participants. The first variable concerns the participant’s commitment to affirmative action. The participant’s
knowledge of affirmative action is another variable. Finally, the participants were asked if their experiences with
affirmative action were negative or positive. By dividing the participants into certain groups (sex, age, ethnic
background, political party, job level, public or private institution, and community size), the researchers were able
to determine which of these factors had an effect on the participants’ experience with affirmative action.
The purpose of this critique is to determine which criteria for proper research design are utilized in this particular
study on affirmative action experiences. The criteria are outlined in Business Research Methods by Emory and
Cooper (1991). The basic elements of design criteria and the use of them in this article are described below.
First, the degree of problem crystallization must be addressed. The study may be exploratory or formal. An
exploratory study loosely structures the research with the objective of learning what major research tasks are
required. The purpose is to develop hypotheses and questions to aid in subsequent research efforts. Formal
research begins with a hypothesis or question, and the goal is to test the hypotheses or answer the research
questions. This study is a formal study, although the statement of the goal is rather vague. On page 124 of the
article, a central question is stated which reads: “to what degree and under what circumstances do social workers
in Texas support the concept and implementation of affirmative action policies?”
The method of data collection can be monitoring or interrogation. Monitoring is observational, and the researcher
only views the activity or material to be studied. Interrogatory studies occur when the researcher actually
questions the subjects and collects data about the responses. This article is an example of interrogation: 474
questionnaires were mailed to randomly selected members of NASW/Texas, and the researchers gathered data
concerning certain characteristics of the participants and certain attitudes toward affirmative action.
The third criterion for research design concerns the researcher control of the variables; this can be experimental,
where the researcher attempts to manipulate the variables, or it can be ex post facto, where the researcher has no
control over the variables and cannot manipulate them. This particular study is an example of an experimental
design, since the researchers are attempting to determine certain characteristics that may create certain attitudes,
and certain variables are used to determine an effect on other variables. An ex post facto study would merely
report that a particular condition existed, not a potential reason for the condition.
The next criterion involves the purpose of the study. Descriptive studies are used to determine who, what, where,
when, or how much. Causal studies are used to determine why and to explain relationships. This article is
summarizing a causal study—again, the characteristics of the individuals responding to the questions are expected
to determine the response indicated. An individual possessing these characteristics would be expected to reply to
the questionnaire in a certain manner.
Time dimension can be cross sectional, when a snapshot is taken from one point in time, and the study is only
conducted once, or it can be longitudinal, when the study is repeated over a period of time. The affirmative action
study is a one-time occurrence, taken from a randomly selected group of people, and is therefore a cross-sectional
study.
The next criterion is the topical scope. This can be either statistical, which is an attempt to determine
characteristics of a population by analyzing data that is collected from a representative sample, resulting in
quantitative tests of the responses. The topical scope can also be a case study, which emphasizes a full analysis of
a limited number of events or conditions. Qualitative data is relied upon, making it difficult to support or reject a
hypothesis. This article describes a statistical study—a random sample of social workers was selected, and the
results were quantitatively calculated by statistical means, resulting in generalizations about the population. The
Results section (p. 120) and the Measures of Association section (p.130) describe the particular statistics utilized
to analyze the data collected from the survey.
The research environment is the next criterion: field studies take place under actual environmental conditions, and
laboratory studies take place under other conditions that are not the natural environment and which may be
simulated. The affirmative action study is a field study, which was conducted in the natural environment of the
participant.
* Used by permission of Ms. Seaber.
The final criterion for research design involves the subjects’ perceptions of the study. When people in the study
perceive that the research is being conducted, it may affect the results of the study. The subjects are capable of
influencing the outcome of the results by having knowledge of being studied. In this study, the participants
obviously are aware of the fact that they are being studied, and although there is no evidence that this will bias the
results, it is possible that this particular outcome could occur.
The previous criteria are directly from the textbook Business Research Methods, but obviously there are other
criteria that may serve to describe a research design. In this particular case, the question can be asked: “Can this
study be generalized to other locations or is it likely to be specific only to Texas?” In my opinion, this study
cannot be generalized to a population in other states. For instance, the minority population in Texas may differ
greatly from other states, resulting from the state’s proximity to Mexico. Also, affirmative action may be
approached in a different manner in other states for various reasons.
Another criterion that may be useful would be to determine whether the study focuses upon factual, non-
negotiable responses by participants or upon opinions of the participants. This study is a combination of the two: the
factual characteristics are described, and are then used to determine the effect on other variables. In some cases,
though, it may be important to use only factual data, while the majority of studies will rely on participants’
opinions.
This critique has so far determined the specific characteristics of the study conducted using social workers and
their attitudes toward affirmative action. The article indicates that this study was designed with the purpose of
describing certain characteristics of individuals in the Texas social workers population and to determine whether
or not they (1) have knowledge of affirmative action, (2) have a commitment to affirmative action, and (3) have
been negatively or positively impacted by affirmative action.
The results indicate that surprisingly, having knowledge of affirmative action is not significantly associated with
commitment to affirmative action. In other words, educational efforts to improve commitment to affirmative
action would not be effective, according to this survey. Also, a negative commitment usually corresponded with
negative experiences, and positive commitment usually corresponded with positive experiences. Furthermore,
those participants with negative or positive experiences are more likely to have higher scores in the area of
knowledge of affirmative action.
Commitment to affirmative action was described as follows: women scored higher than men; and racial or ethnic
minorities scored higher than non-minorities. Commitment was also affected by political party affiliation and job
position. Variables that did not affect commitment included public or private employment, community size, or
level of education.
One weakness of this article was the limited response of the selected population. According to Rubin and Babbie
in Research Methods for Social Work (1989), a 50 percent response rate is considered merely “adequate” for
analysis and reporting. While a 50 percent response rate was quoted as “adequate,” this survey generated only a
46.3 percent response rate. Also, although a general research question was stated, the actual predictions and
desired outcomes were very vague, resulting in some confusion concerning the actual reason for the study.
Another possible weakness is that, while the NASW/Texas membership consists of 74.5 percent females and 25.5
percent males, the actual response rate was 62.1 percent female and 37.9 percent male. This indicates a possible
problem concerning representativeness of the sample.
Despite these potential weaknesses, the authors did indicate that this particular study is preliminary and “designed
to identify key issues and to design instruments which would be applicable with a larger nationally representative
sample” (p. 134). They acknowledge that possible weaknesses exist, and encourage other researchers to expand on
the study in order to develop more information that may be helpful in determining the problem and solving it.
References
Emory, C. William and Donald R. Cooper (1991). Business Research Methods. Chicago: Irwin.
Rubin and Babbie (1989). Research Methods for Social Work. Belmont, CA: Wadsworth.
Stout, Karen D. and William E. Buffum (1993). The Commitment of Social Workers to Affirmative Action. Journal of
Sociology and Social Welfare, 123-134.
Exhibit 6-3 Research Design Paper: Evaluation Form (Graduate Students)
Student: ______________________________
In the research design critique, the student:
_________________
Level of Attainment
Low High
Comments:
Grading scale:
A Scholarly work
A- Superior work
B+ Very good work
B Good work
B- Deficient in some areas
C+ Deficient in most areas
*These may be unrated in some cases because they are unique to a subset of students.
A. A parameter is a value of a population, while a statistic is a similar value based on sample
data. For example, the population mean is a parameter, while a sample mean is a
statistic.
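A brief numerical sketch of the distinction (the population values below are simulated, not from the text): the population mean is a parameter; the mean of a sample drawn from it is a statistic that estimates the parameter.

    import numpy as np

    rng = np.random.default_rng(1)

    population = rng.normal(50, 12, size=100_000)  # assumed population of values
    parameter = population.mean()                  # the population mean is a parameter

    sample = rng.choice(population, size=100, replace=False)
    statistic = sample.mean()                      # the sample mean is a statistic

    print(f"parameter (population mean): {parameter:.2f}")
    print(f"statistic (sample mean):     {statistic:.2f}")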
C. Unrestricted sampling occurs when sample elements are selected individually and
directly from the population at large. Restricted sampling occurs when additional
controls are placed on the process of element selection.
G. Sample accuracy refers to the degree to which bias is absent from the sample. An
accurate sample has a balance of underestimates and overestimates among the sample
members. Precision, on the other hand, refers to the amount of random sampling error
in the sample estimate: the smaller the estimated error variance, the more precise the sample.
I. Variable data parameters are normally computed from ratio or interval scale data.
Examples are age, length, dollars, scores, etc. The most common measures of
variable data are the arithmetic mean and the standard deviation. Attribute data
parameters are expressed in numbers or proportions of the population or sample that
have or do not have certain characteristics. They are necessary for nominal data and
are widely used for other scales. The most frequently used measure of attributes is
the percentage.
J. The mean of the sample is the point estimate and the best predictor of the unknown
population mean. The interval estimate brackets the point estimate and reveals the
range within which the population mean is expected to fall at a specified level
of confidence.
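A sketch of the two estimates, assuming a simple random sample and the normal approximation (the data below are simulated for illustration):

    import numpy as np
    from scipy import stats

    rng = np.random.default_rng(7)
    sample = rng.normal(50, 12, size=100)          # hypothetical sample data

    point_estimate = sample.mean()                 # single best estimate of the population mean
    std_error = sample.std(ddof=1) / np.sqrt(len(sample))
    z = stats.norm.ppf(0.975)                      # 95 percent confidence
    lower, upper = point_estimate - z * std_error, point_estimate + z * std_error

    print(f"point estimate: {point_estimate:.2f}")
    print(f"95% interval estimate: ({lower:.2f}, {upper:.2f})")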
K. A proportionate sample, particularly within the context of stratified sampling, is one
where each stratum is proportionate to its share of the total population.
Disproportionate samples are departures from proportionate relationships among the
strata.
C. The finite population adjustment is of value when the sample size required to give a
certain degree of precision exceeds about 5 per cent of the population size. The net
effect is to reduce the size of sample needed to give the desired degree of precision.
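A minimal sketch of the adjustment (the sample and population sizes below are assumed for illustration): when the unadjusted sample size exceeds roughly 5 percent of the population, the finite population adjustment shrinks the required sample.

    import math

    def adjusted_sample_size(n0, population_size):
        # Finite population adjustment: n = n0 / (1 + (n0 - 1) / N)
        return n0 / (1 + (n0 - 1) / population_size)

    n0 = 400   # size computed as if the population were infinite (assumed)
    N = 2000   # population size (assumed); n0 is 20% of N, well above 5%
    print(math.ceil(adjusted_sample_size(n0, N)))  # 334, a smaller required sample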
3. You must determine the size of the acceptable interval range and the degree of confidence
you wish to have that the population parameter will be within that range.
4. This is an exercise in the reading and interpretation of the table of the proportion of area
under the normal curve (Table F-1 in the Appendix).
z = 1.3 / 3.0 = .4333 s
Entering Exhibit G-1 and interpolating between Z = .43 and .44 we find that .3324 of
the lower half of the distribution extends beyond (below) .4333 s. Thus the percent
of cases above 5.0 consists of:
- all cases in top half of distribution = .5000
- plus a portion of bottom half = .1676
(.5000 - .3324 = .1676)
Total .6676
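The same figure can be checked with a normal-distribution routine (a sketch; scipy is assumed to be available):

    from scipy.stats import norm

    z = 1.3 / 3.0                         # 0.4333 standard deviations below the mean
    proportion_above = 1 - norm.cdf(-z)   # proportion of cases above that point
    print(f"{proportion_above:.4f}")      # about 0.6676, matching the table interpolation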
5. A quota sample is appropriate here. Logical choices of data on which to stratify the
sample are the gender of the concert attendee and the musical type, since both of these are
reflected in historical data. Using the historical data, you could set desired cell proportions
based on popularity of the music (number of concert attendees for that music type
compared to attendees overall). Since data are available by gender, you would set interview
quotas for males and females, in proportion to the importance of each in the total attendee
pool. Students should be asked how far back in the historical database they would reach for information to determine quotas. Someone is likely to identify an important point: musical tastes change, so caution should be used about relying on data that are too old. Someone else should note that some music is enduring, so it wouldn't matter much how far back you look. Others will note that musical taste has a generational bias, and since the age of concert attendees is not known, that bias might influence the musical-type quota. Ultimately the decision will depend on the manager's confidence in the historical data as a good reflection of attendees by type.
6. When we examine the issue of sample design and size, the following points emerge:
a. Variance in populations: If different technologies are in use for the manufacture of military and consumer grades of gaskets, differences in the dispersion or variance of gasket thickness may be expected. In particular, if greater production process control is exercised for the military-grade application, a lower variance may be expected. However, tentative conclusions on this may be reached only after estimating a sample variance for both production runs. The higher the population variance (or its sample estimate), the greater the need for a larger sample size.
b. Need for exactness in specification: The need for accuracy in gasket thickness may be expected to be higher for military applications, given the sensitive nature of the application. Correspondingly, the interval range specification will be narrower. This is a subjective evaluation, made on technical grounds. Stringency in this requirement would tend to increase the required sample size.
c. Level of confidence: The degree of confidence required for the military application would be higher, once again based on the argument that the critical nature of the product implies that we "cannot take chances" and so must use a higher confidence level.
How the various aspects (a, b, and c) are combined through a single formula is illustrated by Question 7. The confidence interval combines the considerations of the interval range and confidence level.
d. The higher levels of production for the consumer automobile production runs imply a need for larger sample sizes. If the production runs of gaskets for the consumer sector are really large (they may be in the millions), sample size could be selected based on the assumption of an infinite population. On the other hand, for military applications a smaller sample is indicated, and the sample size adjustment for this may be computed using the finite population adjustment detailed in Question 8. The adjustment is required only if the sample is more than 5% of the population.
7. In this problem a small sample is used to provide the estimate of the population dispersion (s = 10). In addition, we must make subjective decisions about the size of the interval estimate we wish and the degree of confidence; here we use a 95% level of confidence (α = .05) and an interval estimate of $.50. Then we need a sample of 1,539:

$.50 = 1.96 σX (where σX is the standard error of the mean)
σX = .255 = 10 / √(n - 1)
n ≈ 1,539
8. The sample size estimate should be adjusted for the fact that there is a finite population of 2,500. With this adjustment we reduce the needed sample size from 1,539 to 953:

n' = n / (1 + n/N) = 1539 / (1 + 1539/2500) ≈ 953
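A short Python sketch reproduces both the Question 7 and Question 8 computations. Rounding the standard error of the mean to .255 mirrors the arithmetic shown above; carrying full precision gives a figure closer to 1,538.

from math import ceil

z, s, e, N = 1.96, 10.0, 0.50, 2500

se = round(e / z, 3)            # allowable standard error of the mean, .255
n = ceil((s / se) ** 2 + 1)     # from se = s / sqrt(n - 1), as in the text: n = 1539
n_adj = ceil(n / (1 + n / N))   # finite population adjustment: n = 953
print(n, n_adj)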
9. This is a good question to use to discuss the cost associated with accuracy. This problem involves attributes data. We will assume that the firm has a large number of technicians since it is a large firm and the employee pool is unionized. The phrase "an accurate evaluation" would imply that the supervisor wants a narrow error interval, so we might use anywhere from ±1% to ±3%. One could also assume the supervisor wants a high degree of confidence. A 95% degree of confidence (1.96 standard errors) would probably be the minimum the supervisor would accept, while 99% might be what is desired. Also needed is a measure of population dispersion. We must either make such an estimate based on past data collection (e.g., let's say 65% of technicians voted in favor of a union during the union authorization election, thus indicating dissatisfaction at some level, so pq = .65 × .35 = .2275) or set the estimate at a pq ratio of .25. This represents the largest possible value for dispersion (p = q = .5). With this dispersion estimate we calculate the maximum sample size needed. The various answers based on the above assumptions in this question are below.
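Because the final figures depend on the assumptions chosen, a brief sketch can generate the full set. The ±1% to ±3% error intervals, the two confidence levels, and the two dispersion estimates are exactly the assumptions named above; the population is assumed large enough that no finite population adjustment is needed.

from math import ceil

for pq in (0.65 * 0.35, 0.25):           # dispersion: past union vote vs. maximum (p = q = .5)
    for z in (1.96, 2.576):              # 95% and 99% confidence levels
        for e in (0.01, 0.02, 0.03):     # acceptable error interval (plus or minus)
            n = ceil(z ** 2 * pq / e ** 2)
            print(f"pq={pq:.4f}  z={z}  e={e}: n={n}")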
11. Geographic clustering has intuitive appeal given that the bus routes are already geographically based. Students want internally heterogeneous subgroups, so drawing a random or systematic sample of residential city blocks would be feasible. Then the students need to randomly or systematically select the sample units (housing units) from each block. No list of sample elements is needed for this probability sample. Once the students determine the selection rule for each city block and for sample elements within each block, Burbidge need only travel to that block location to execute the cluster sample as designed. Of course, if he is the only interviewer, this will take considerably more time than using the captive audience on Bus 99.
However, it must be noted that this is not truly "interval scaled data," since the origin, or first class interval, is NOT arbitrary. As mentioned in the answer to question 6, true interval scaled data are not often found in business situations. On the assumption of symmetric distributions within each class interval, it is possible to use salary class interval midpoints for multiplication operations and comparisons. Because this example is not strictly interval scaled, some operations that are applicable to ratio scaled data are possible. Yet the student should be guided to understand (1) the loss of data involved in the use of class intervals, and (2) the fact that the use of the midpoint involves assumptions and inaccuracies. At this level of analysis it is still possible to say that employees in the class interval $20,000-$40,000 have, on average (assuming a symmetric distribution within each class interval), a salary half that of employees in the joint interval $40,000-$80,000.
An ordinal ranking of the employees may be made, in terms of their salaries. Now in
terms of ranking we would know that A is higher ranked than B, and B is higher
ranked than C. However, the ranking does not tell us the extent to which B’s salary
is higher than that of C, and the extent to which A’s salary is higher than that of B.
While A’s salary may be greater than that of B by $2000 annually, B’s may be
greater than that of C by $6000.
An example of nominal classification of salary data on the employees would be
classification into three categories, such as: executive, supervisory, and hourly
salaries. No ordinal ranking of incomes is implied; for instance, there may be
supervisors who have been with the company for a long time, and their present
salaries may exceed the salaries of junior executives.
6. There are relatively few pure interval scales found in business research. Almost all text discussions of this scale refer to the example of temperature scales. However, some attitude scales, such as the Likert and semantic differential, are claimed to approach interval characteristics. In addition, approximate interval scales can be developed from paired comparisons and rank orders of objects.
A. Store customers
Nominal - Group them by race, ethnic background, married or single status, etc.
Ordinal - Rank them as very frequent buyers, frequent buyers, infrequent buyers.
Interval - Some scale of attractiveness in which the scale is presumed to be interval.
Ratio - Average size of monthly purchases.
B. Voter attitudes
Nominal - grouped as Republican, Democrat, Independent, and other.
Ordinal - Rank of candidates in order of preference.
Interval - Likert - type scale
Ratio - Count of votes for various candidates in each district.
C. Hardness of alloy
Nominal - Identification of alloys that include nickel and those that do not.
Ordinal - Ranking of hardness by determining which alloys scratch which others.
Interval - Use of an interval scale designed to rate alloy hardness.
Ratio - Amount of nickel per pound of steel in various alloys.
D. Common stock preference
Nominal - Industry classification of preferred stocks.
Ordinal - Rank order of five stocks as to your preference for them.
Interval - Rating of preference for the stock by converting the results of a paired comparison rating into a presumed interval scale.
Ratio - Six-month changes in price of various preferred stocks.
E. Division profitability
Nominal - Classification of sources of division profits, e.g., manufacturing, assembly,
trading, price changes, etc.
Ordinal - Ranking of divisions by the size of their dollar profits in 200X.
Interval - Use of Semantic Differential scale in evaluating the profit performance
image of various divisions.
Ratio - Dollar profits for each division in 200X.
7. Assume that the instrument has several scale items (ordinal or interval data) and is administered to students and their parents at the end of each school year. Each item calls for a reply ranging from "5" strongly agree to "1" strongly disagree.
A. Stability (or test-retest reliability) would be measured by repeating the
administration of the questionnaire to the same sample units about 2 weeks later and
then correlating the results of the two administrations to measure the stability.
B. Equivalence: We might develop two parallel versions of the same test and
correlate the results.
C. Internal consistency: Here we are interested in the homogeneity of the items. Even- and odd-numbered items, or randomly selected halves, may be correlated and adjusted with the Spearman-Brown formula. The KR20 and Cronbach's alpha have better overall utility for multi-item scales. (A brief numerical sketch of coefficient alpha appears after item F below.)
D. Content validity. This measure is largely judgmental. Using the research question
hierarchy to organize the topic is one good way to try to include all aspects of the
topic in the instrument. It is probably desirable to use a panel of judges
knowledgeable about school curricula to determine whether the content coverage in
an instrument is adequate.
E. Predictive Validity: Good criterion-related validity is difficult to show since there are no clear pragmatic measures of course or curricula quality. If the instrument predicts how students vote in a "quality of course" poll, or if it predicts how parents would rate the courses as to quality, then it has predictive validity.
F. Construct validity requires that the results of the measurement compare well with
other measures that purport to indicate course or curricula quality. For example, we
would expect high quality courses to be particularly popular with academically
strong students and their parents and less popular with weak students. Entirely
different methods of attempting to evaluate courses (e.g., accreditation agency
evaluation, a faculty poll, recognition of the quality of the course from syllabus
analysis by experts, and the like) should all correlate well with the results from using
the instrument. If these results do correlate well with your results you would have an
example of convergent validity, a subtype of construct validity.
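As noted in item C, here is a minimal numerical sketch of coefficient alpha. The five-participant, four-item response matrix is invented purely for illustration; any 1-to-5 agreement data could be substituted.

import numpy as np

# Hypothetical responses: rows are participants, columns are scale items (1 = strongly disagree, 5 = strongly agree)
scores = np.array([
    [4, 5, 4, 5],
    [3, 3, 4, 3],
    [5, 5, 5, 4],
    [2, 3, 2, 2],
    [4, 4, 3, 4],
])

k = scores.shape[1]
item_variances = scores.var(axis=0, ddof=1)      # variance of each item
total_variance = scores.sum(axis=1).var(ddof=1)  # variance of the summed scale
alpha = (k / (k - 1)) * (1 - item_variances.sum() / total_variance)
print(round(alpha, 3))                           # about .93 for these invented data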
8. The ExxonMobil Travel Guide in 2001 awarded only 42 U.S. hotels and restaurants the coveted Five Stars rating. "One of the industry's highest accolades, that top rating indicates a level of quality, service and excellence that makes visiting each establishment a unique and memorable experience." Exhibit 8-1 displays the five-star criteria; you are also encouraged to visit ExxonMobil's website (http://www.mobil.com/mobil_consumer/travel/guides/index.html) for more detailed information on the other star levels and the methodology.
Physical Property: All materials are of highest caliber and integrated to provide a strikingly memorable effect, a world-class experience. The interior design is in keeping with the overall theme and aspirations of the restaurant. Every element, such as the floor covering, wall covering, artwork, etc., is integrated into the overall décor, and no one element stands out or detracts from the dining experience. A customized, highest-quality cloth or true linen is used throughout the establishment. Cloth items are replaced as they are soiled during the meal. A special house design may be integrated throughout the entire service. China service is unique, comprehensive, and of highest quality. The highest quality stemware is used (leaded crystal perhaps), possibly utilizing a unique design. There is a thoroughly comprehensive silver inventory including steak knives, lobster picks, sauce spoons, dessert and fish forks, fish knives, escargot holders, etc. as appropriate. Unique design features may be incorporated, and a full complement of support equipment (silver chafing dishes, silver or marble mahogany carts) may be used. A unique effort is obvious in the menu design (such as commissioned artwork or logo integration) and the menus are printed on highest quality bond. Restrooms are expected to have attendants; linen or cloth hand towels are also expected.

Service: Service staff evidences some preparation skill such as carving, the use of service spoon and fork, boning, salad preparation, dessert merchandising, flambéing. The staff is familiar with the chef's culinary strengths and background. A sommelier is present. Empty glasses are immediately attended to and spirits may be served from a small carafe or decanter. Decanting services are available, or offered. Staff hygiene, personal grooming, and use of cosmetics are impeccable. Service personnel put the guest at ease in promoting an enjoyable experience. The guest is never challenged by the service environment or nervous about committing some culinary error. Tableside presentations do not delay the dining experience.

Guest Departure: The check is presented in a discreet manner when requested. There are no unnecessary delays in the payment of checks. Staff assists with apparel and possibly securing of transportation. Special amenities are provided (candies, menus, and souvenirs).
1 Excerpts from ExxonMobil's website (http://www.mobil.com/mobil_consumer/travel/guides/index.html).
9. A. There are many concepts and constructs which might be suggested and
operationalized here, but for purposes of illustration we restrict the discussion to two
definitions: full-time student and student morale.
B. Full-time students will be defined as those who are taking at least 12 semester
credit hours of coursework and are studying either for a bachelor’s or a graduate
degree. Morale may be divided into three major dimensions: academic, living
conditions, and social life.
C. For measuring student status, we classify all students as full-time students if they report, when asked, that they are currently registered for at least 12 semester hours of study and are working for a degree. For measuring morale we develop four
statements concerning each of the three morale dimensions and ask students to show
their degree of agreement with these statements on a 1 to 5 agreement scale. (How to
develop and choose these scale items will be discussed in Chapter 8).
D. The simplest way might be to sum the answers to the four items and average them for each of the three dimensions. These can, in turn, be combined in the same way to secure an overall index. In some cases we might wish to use a weighting system in compiling the index. (A brief sketch of this computation appears after item E below.)
E. Reliability can be determined by using several versions of the scales to determine
morale and calculating coefficient alpha. We might also administer the scales
several times over a short period of time to the same students to determine if they
give stable results. We can test for validity by looking for other indicators of a
person’s morale. We might, for example, talk with friends of the students or query
professors, advisors, or parents.
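As noted in item D, the summing-and-averaging approach can be sketched in a few lines. The item responses and the optional weights below are hypothetical and serve only to illustrate the computation.

# Hypothetical 1-to-5 responses for one student: four items per morale dimension
responses = {
    "academic":          [4, 3, 5, 4],
    "living conditions": [2, 3, 3, 2],
    "social life":       [5, 4, 4, 5],
}
weights = {"academic": 0.5, "living conditions": 0.25, "social life": 0.25}  # optional weighting system

dimension_scores = {d: sum(items) / len(items) for d, items in responses.items()}
overall_index = sum(weights[d] * score for d, score in dimension_scores.items())
print(dimension_scores)
print(round(overall_index, 2))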
From Concept to Practice
1. Data type for each question:

1. In your kind of work, if a person tries to change his usual way of doing things, how does it generally turn out? (Ordinal)
2. Some people prefer doing a job in pretty much the same way because this way they can count on always doing a good job. Others like to go out of their way in order to think up new ways of doing things. How is it with you on your job? (Ordinal or Interval)
3. How often do you try out, on your own, a better or faster way of doing something on the job? (Ordinal)
4. How often do you get chances to try out your own ideas on the job, either before or after checking with your supervisor? (Ordinal)
5. In my kind of job, it's usually better to let your supervisor worry about new or better ways of doing things. (Ordinal or Interval)
6. How many times in the past year have you suggested to your supervisor a different or better way of doing something on the job? (Ordinal)
• Students could add a question about the respondent's gender to obtain nominal data. (Nominal)
• Students could add a question asking the respondent's actual age or the number of years the respondent had worked to obtain ratio data. (Ratio)
1. A. Rating scales have an advantage in that they require less time, are interesting to
use, and have a wider range of application than ranking methods. They can also be
used with a large number of properties or variables. The major disadvantage of
rating scales is that they assume that a person can and will make good judgments.
The human element in rating scales makes the scale subject to the common errors
of leniency, central tendency, and the halo effect. These errors are discussed
individually in the section entitled "Problems to Avoid with Rating Scales."
Ranking scales do not have the wide application of rating scales, nor can they be
used with a large number of properties or variables. However, ranking scales
permit the participant to express his/her attitude in an unambiguous manner.
Whereas rating scales are generally viewed as interval scaled, ranking scales are
ordinal. There is a body of opinion that holds that interval scales can be developed
from some ranking comparisons, but this is not true of all systems of ranking. As indicated earlier in Chapter 7, there is a loss of data when we use ordinal rather than interval approaches. An equal difference in ranks does not imply an equal difference in the attribute by which the subjects are ranked.
B. Likert scales are relatively easy to develop compared to differential scales. They
are most useful when it is possible to compare the person's score with a distribution
of scores from some well-defined group. If constructed in the classical manner, each item that is included in the scale has met an empirical test for discriminating ability. Since participants answer each item, the Likert scale is probably more reliable than a differential scale in which only a few items are chosen. Also, it is easy to use this scale in both participant-centered and stimulus-centered studies.
The disadvantage of the Likert scale is that the same total score can be secured by a wide variety of answer patterns; thus there are questions as to the meaning of the total score.
Aside from the relative merits and demerits of differential scales that can be
inferred from the above, some additional points are: (1) The cost and effort
required to construct differential scales has limited their use. (2) This approach has
been criticized on the grounds that the values assigned to various statements by the
judges may reflect their own attitudes and will thus be biased.
C. Unidimensional scales are usually easy to construct and are not difficult to
understand. Conceptually, however, there is a question of whether we are truly
measuring a single dimension. The Guttman technique is one effort made to assure
that a so-called unidimensional scale is actually unidimensional. Multidimensional scales are a way of recognizing that many concepts cannot be represented by a single dimension. However, these techniques are difficult to use and to understand. The classical semantic differential scale was the first major multidimensional form to receive considerable attention. Currently there are many developments in multidimensional scaling using computer-based procedures that are attracting widespread interest, especially in marketing research.
2. A. The Thurstone scale would be a set of statements about the person’s confidence in
the economic system. Each statement would have been rated regarding its degree
of favorableness along some assumed dimension. For example, the statement "The U.S. economic system will rebound quickly from the consequences of the terrorist attacks in New York and Washington" might be presented among a list of 10 statements. This one might have been appraised as a "7.6" on a scale of 1 to 10.
The participant would be expected to select one or more statements with which he
agrees.
B. A Likert scale would consist of a series of statements about the resiliency of the
economic system, each with a 5 point scale (may also be 3-point or 7-point scale)
expressing a degree of agreement or disagreement. If the classical Likert approach
is used the items on the scale would have been chosen by a pretest using an item
analysis approach.
C. A semantic differential scale of the classical type would use a set of bipolar
adjective scales with the subject being the "resiliency of the U.S. economic
system," or a dimension of the U.S. economic system, such as its speed of
recovery. In a semantic differential scale the adjective pairs would be chosen to fit
the subject. These adjective pairs are normally chosen on an arbitrary basis.
D. A Stapel scale is used when bipolar adjectives usable with the semantic differential
scale cannot be found. Now, for each attribute of the U.S. economy identified,
there is a set of 10 response categories. For each item participants assign a rating
number (between +5 and -5). The more appropriate the description, the closer the
rating is to +5. Correspondingly, -5 would refer to the most inappropriate
description.
E. Suppose a forced ranking scale were used to study the U.S. economy on, say, three dimensions: growth, stability, and employment. Now participants can be asked to
rank the performance of the economy before and following the terrorist attacks of
September 11, 2001 for each of the selected dimensions. Hence, for employment,
‘before’ may be ranked 1 (least unemployment), and ‘after’ may be ranked 4 (most
unemployment). The participant is essentially forced to make a ranking for each of
the periods of time specified.
3. Job Involvement:
A. A graphic rating scale:
You may express your level of job involvement by placing a "/" at the selected point along the line:
The concept of job involvement may have multiple dimensions, such as (1) hours input per month (verbal anchors: high/low), (2) work output quality (excellent/poor), and (3) energy level at work (high/low); each may then be rated separately, using a graphic scale.
A multiple rating list scale is similar to a numerical scale, which has equal intervals between numeric scale points, and it allows visualization of the results. Here the question may look like:
Choose, and circle, the alternatives that best describe various aspects of your job.
Aspect of your job: High Average Low
Involvement 5 4 3 2 1
Man hours input 5 4 3 2 1
Work output quality 5 4 3 2 1
Energy level 5 4 3 2 1
              Koak    Zip   Pabze   Mr. Peepers
Koak             X     50     115            35
Zip            150      X     160            70
Pabze           85     40       X            45
Mr. Peepers    165    130     155             X
--------------------------------------------------------------------
TOTAL          400    220     430           150
RANK ORDER       2      3       1             4
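The totals and rank order in the table can be verified with a short script. Reading each column total as the number of times that brand was preferred (each complementary pair of cells sums to 200, suggesting 200 paired judgments) is an interpretation inferred from the figures, not stated in the original table.

brands = ["Koak", "Zip", "Pabze", "Mr. Peepers"]
# counts[i][j] = times the column brand j was preferred when paired with row brand i (None on the diagonal)
counts = [
    [None, 50, 115, 35],
    [150, None, 160, 70],
    [85, 40, None, 45],
    [165, 130, 155, None],
]

totals = [sum(row[j] for row in counts if row[j] is not None) for j in range(len(brands))]
order = sorted(range(len(brands)), key=lambda j: -totals[j])
ranks = {brands[j]: position + 1 for position, j in enumerate(order)}
print(totals)   # [400, 220, 430, 150]
print(ranks)    # {'Pabze': 1, 'Koak': 2, 'Zip': 3, 'Mr. Peepers': 4}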
5. A. These choices may be criticized because there is no way for a participant to express
an "undecided" response or a "don't know" response.
B. This is a widely used set of response choices and is generally acceptable. One problem is the operational definition of the various terms. What is "fair" to one participant may be "good" to another when both are actually making the same judgment.
C. This assumes that the average must fall between good and fair, while it might be
that average should be elsewhere.
D. The central choice of "neither agree nor disagree" does not adequately reflect the
situation that might be called "uncertain" or "indifferent."
Question 1: If you were to purchase a bicycle today, keeping all your normal
considerations in mind, how would you rate the following brands:
A truly excellent buy (5)---(4)---Average---(2)---(1) A really bad buy
Brand X (5)---(4)---Average---(2)---(1)
Brand Y (5)---(4)---Average---(2)---(1)
Brand Z (5)---(4)---Average---(2)---(1)
These data have the advantage that they measure distance; they give us a measure of how much a brand is perceived as better or worse than another brand on a particular dimension, rather than a mere ordinal ranking. To analyze the data, the numerical ratings for purchase preference, and for each of the dimensions, can be aggregated for the sample. The arithmetic means can be used to compute the average perceptions, and the variance or standard deviation can measure the consistency (or spread) of opinions.
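A brief sketch of that aggregation, using invented ratings from six participants; the statistics module handles the mean and standard deviation.

from statistics import mean, stdev

# Hypothetical 1-to-5 purchase-preference ratings from six participants
ratings = {
    "Brand X": [5, 4, 4, 5, 3, 4],
    "Brand Y": [3, 3, 2, 4, 3, 3],
    "Brand Z": [2, 1, 3, 2, 2, 1],
}

for brand, values in ratings.items():
    print(brand, "mean =", round(mean(values), 2), "std dev =", round(stdev(values), 2))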
7. A. SD = 5
B. SA = 5
C. SA = 5
D. SD = 5
E. SA = 5
F. SD = 5
The responses for each person answering could be totaled to give a measure of each
person's attitude toward the program. The purpose would be to measure the
attitudes of the participants for their "total attitude." On the other hand each
statement could be tallied separately to determine how the program scores on each
point. In this case the emphasis is on different perceptions of the program rather
than attitudes of different students toward the program.
8. Jason and Myra argue with Jean-Claude that the CompleteCare research project
needs a unique scale (arbitrary) because 1) the scales used by the company's
marketing staff for their consumer products don't relate to the computer servicing
problem, 2) the focus group research revealed the need to address "expectations"
and virtually none of the scales developed for customer satisfaction deal with
expectations, and 3) that the combination of in-house and outsourcing of repairs
places special demands on measuring precision and reliability.
Chapter Exercise
Instructors often want a hands-on exercise that brings the details of the chapter
together with practical experience. Semantic differential scales allow students to
examine a scaling application using a topic familiar to them (the attitude object can
be their MBA program, current class or other topic of your choice). Likert, graphic,
or other rating scales typically require more preparation or item analysis work than
you can spare for a brief in-class illustration. Semantic differential scales also allow
a transition to multidimensional scales for instructors with advanced students.
1. Using the semantic differential scale below, record your impressions of your
current educational program by placing a small X on each adjective continuum.
2. Score your responses by assigning a 7 to the positive end of each continuum
and a 1 to the negative adjective end. Note that some scales are reversed.
3. Which of the adjective pairs are evaluative? potency? activity?
Instructors may wish to provide a scoring key on the board after the ratings
are made: 7s are good, strong, active, complete, fast, hot, meaningful, and
heavy.
Optional: If you have time, compare the results for "ideal program" to the previous
findings. Examine carefully the dimensions that contribute most to the differences.
a. A source's purpose indicates the likely bias of the material provided by the source. If the source advocates a specific position rather than simply enlightening the reader, it is likely to be biased in the direction of that position. Sources designed for entertainment take poetic license with their material to ensure the material is entertaining. Serious information would be suspect if it came from a clearly biased or solely entertaining source.
b. A source's scope indicates the time frame and range of material presented by the
site. A site designed to provide current highlights would not provide the detail
needed by many exploratory searches designed to enlighten the researcher. But,
such a source might be a portal to other richer and more comprehensive sources.
c. A source's authority refers to the degree to which primary vs. secondary data
sources are reported, and the credentials and reputation of the authors of the
material in that source. This criterion is especially important in a web source
where anyone can post anything.
d. A source's audience indicates for whom the information was collected. A source
designed for children might be written at too low a level to be useful for the
serious business researcher. A source written for members only might provide too narrow a scope to be of use in solving a more comprehensive problem.
e. A source's format relates to how the information is presented: narrative prose, tables, charts, links to detailed data, etc. The more organized the format, the easier the source is to use, and thus the more value it has for the serious business researcher.
2. Primary sources in a secondary search report original research, usually containing raw data without interpretation. Secondary sources are often compilations of information containing interpretations and summaries of primary sources. Tertiary sources are usually compilations of secondary sources into indexes, bibliographies, and other finding aids. Internet search engines fall into the latter category. Each has value for the business researcher engaging in an exploratory search for information. All things considered, the more removed the information is from its primary source, the greater the distortion of the original information.
3. Data mining is the process of extracting knowledge (valid, novel, useful, and
ultimately understandable patterns in data) from information contained within
databases stored in data marts or data warehouses. The process involves a
disciplined set of analytical and statistical procedures, including data visualization,
clustering, neural networks, tree models, classification, estimation, association,
market-basket analysis, sequence-based analysis, fuzzy logic, genetic algorithms, and
fractal-based transformations.
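To give students a concrete feel for one of the listed techniques, a minimal clustering illustration (using scikit-learn and an invented two-variable customer file) might look like the following; it is a sketch, not a description of any particular vendor's data mining tool.

import numpy as np
from sklearn.cluster import KMeans

# Hypothetical customer records: [annual purchases in $000, store visits per month]
customers = np.array([
    [12, 2], [15, 3], [14, 2],     # an apparent low-volume group
    [48, 9], [52, 11], [50, 10],   # an apparent high-volume group
])

model = KMeans(n_clusters=2, n_init=10, random_state=0).fit(customers)
print(model.labels_)            # cluster assignment for each customer
print(model.cluster_centers_)   # average profile of each discovered segment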
4. In internal data mining, the researcher is accessing primary sources of data, and
exploring the patterns within and between data in various databases. In a literature
search, much of the information revealed comes from secondary and tertiary data
sources that another researcher has explored. As a result, some of the knowledge will
have been lost, ignored, or buried.
5. This might occur when the cost in time and money is so great that only secondary sources may be used. In some cases the costs may be so great as to be prohibitive. In other cases we cannot gather the data, no matter how much money and time are spent, because we do not have access to the information. For example, raw data in IRS files
are not available to researchers. Also, information about historical events is available
only from published sources.
7. There are two chief problems. One is the question of accuracy. Every study is done
for some reason (purpose, scope, audience, and authority) and we should assure
ourselves that the data we use is not biased in such a way that it is unusable. In
addition, there is the definition problem. Do we know the operational definitions
used in the study? Are they compatible with our own?
8. Below are the likely sources for each of the research needs. The list of sources in
each case is not necessarily exhaustive nor does it cover many web site possibilities.
A. The president wants a list of six of the best references that have appeared on executive compensation during the last year. Because current information is sought, students should use a web search engine such as Google, using the key words: executive compensation. There are several sites that compile information about executive compensation. Students should also supplement this Google search with periodical databases such as ABI Inform.
B. Has the FTC published any recent statements (within the last year)
concerning its position on quality stabilization? For this, students should
definitely use the FTC web site (http://www.ftc.gov/). Students can
reach it directly or through Google or just about any Web search engine.
Once there, the student would have to use the search engine on the site,
however, to find the info requested; it would not be available directly
through a search engine.
C. I need a list of the major companies located in Greensboro, North Carolina. Students could use many different sources for this one, including Standard and Poor's Net Advantage, Harris directories, Mergent's FISonline, Dun and Bradstreet's Million Dollar Directory, Gale's Business and Company Resource Center, and probably local web sites.
9. One plausible research question is "What problems have surfaced with the Westridge project that have the county commission awaiting the recommendation of the county planners and that have Croyand Associates seeking advice from another research firm?" The key words to include in several bibliographic search queries include
research companies, Croyand Associates, real estate, ethics, InterMountBanc,
shopping centers, malls, and banks. The first search query might be: Croyand
Associates AND InterMountBanc OR Westridge.
10. Government sources might be tapped to reveal zoning applications, zoning rulings, building permits, building inspections, and building certifications. While Jason could
likely sort through these physically were he in Colorado rather than Florida, it is
more likely that Jason would turn to city, county, and possibly state government
databases accessible via the Internet.
11. This question offers the perfect opportunity for your students to get a head start on
their term project. They likely have developed several research questions that can
serve to get them started on this exploratory phase of their project. Encourage them to
use the Business Reference Sources CD and Appendix A and jump in with both feet!
You can even have them submit their search plan as an assignment.
4. What was last year's rate of unemployment among the 20-24 year age group?
(Monthly Labor Review)
5. Has the U.S. Senate held hearings on any matters related to small business in
the last year? (Monthly Catalog or CD-ROM CIS Congressional Masterfile)
6. What is the name and address of the association for the peanut industry?
(Encyclopedia of Associations)
8. What is the nearest university that receives the Journal of the Oil Chemist's
Society?
(Union List of Serials)
9. What was the most recently published monthly production figure for U.S.
bituminous coal?
(Survey of Current Business)
12. How many establishments were there in 1972 in industry SIC 3498 - fabricated
pipes and fittings? (U.S. Census of Manufacturers, publication MC72 (2) 34F)
13. What doctoral dissertations, if any, have recently been written on consumer
behavior?
(American Doctoral Dissertations, Dissertation Abstracts Online, or CD-ROM
UMI Dissertation Abstracts)
14. What is the most recently reported quarterly GNP figure for the U.S. in
constant dollars?
(Business Conditions Digest or Survey of Current Business)
15. What was the output of U.S. hosiery manufacturing last year?
(Statistical Abstract)
19. What are two or three important types of information available at a local level
concerning retail trade in the U.S.?
(Census of Retail Trade, state volumes)
20. What was the total amount of outstanding installment credit debt in the U.S. last year?
(Federal Reserve Bulletin)
21. One area that is not fully covered in Chapter 10 is the vast amount of statistical
information available in published form. One can develop a class exercise that
uses these statistical sources only. In such an exercise use the various
Censuses of Business, Population, Housing, Manufacturing, and Agriculture,
as well as current economic statistics from the Survey of Current Business,
Federal Reserve Bulletin, and the Monthly Labor Review. Other publications
such as the County Business Patterns, the various Census "p" bulletins, and
other monthly industrial, foreign trade, and agriculture publications are also
possibilities. In such an exercise the instructor may wish to specify the particular information that the students should collect and report.
Algorithms used in data mining are complicated and beyond the technical ability of most students, but because of their tremendous exploratory potential, it is desirable to illustrate the text material in
class. The SAS Institute offers resources at their web site including demonstration CDs and
brochures that are beneficial for helping students visualize this unit. Visit their site at
www.sas.com or send them an e-mail at software@sas.sas.com.
Response error occurs when the recorded data differ from the true data; such errors can come from the participant or can be introduced during the editing, coding, or data entry stages.
Interviewer error occurs when the interviewer in some fashion corrupts the data.
This can occur as a result of inconsistent treatment of participants or questionnaires,
ineffective participant motivation and cooperation created by the interviewer, social
differences between participant and interviewer, and cheating.
Nonresponse error occurs when the researcher has difficulty securing interviews from participants who have been selected into the sample, and the non-participants differ from the participants in a systematic way.
2. Many environmental conditions such as the degree of urbanization in the area, the
day of the week, the time of day, the location of the interview, and the other demands
being made on the participant at the time of contact are all important factors in
determining the rate of response. For example, many urban residents feel threatened
in their neighborhoods and are reluctant to answer the door when a stranger calls.
Likewise, in a business shopping center at dinnertime it may be difficult to secure
cooperation from persons rushing home.
These problems can be only partially offset. Care in the selection and training of
interviewers is one of the best ways to lessen these problems. Some interviewers
consistently turn in better response rates. Techniques such as varying callback times,
making calls at the times of highest probability of contact, setting up interviews by
phone, interviewing other members of the family (when this is acceptable), and
seeking assistance from neighbors to learn the best time to call are also ways to
improve performance.
3. There are many ways to motivate participants in such a case. Some methods are
economic, such as cash, merchandise, or discount coupons for products sold in the
mall. Other methods are psychological, such as developing good rapport with the
participant, showing interest in the participant's thoughts and feelings, and convincing the participant that the research project and his or her participation are important and appreciated. Here it is often useful to indicate the utility of the research. In most
cases a single method is not sufficient for adequately motivating participants; a
combination of motivational approaches is necessary.
4. Three suggestions for decreasing personal interviewing costs while increasing the
response rates include:
(1) Develop a strategy for dealing with intermediaries at the door or with the participant's absence, in case the required participant is not available. This can be a system of leaving a short note and visiting card rescheduling the interview, together with a confirmatory telephone call.
(2) Use an incentive system to reward efficient interviewing, and use the telephone for scheduling and screening personal interviews.
(3) Use self-administered questionnaires.
Improved selection of interviewers, role playing, and other forms of training are also
effective ways to improve interview efficiency.
a. In this first one it is more likely that a personal interview will be used because of the
compact study area and a topic which will be of high interest to participants. This is
especially the case if there are a substantial number of questions and a certain degree
of free form to the interview. Telephone interviewing would be the second choice.
b. Personal interview or telephone interviews for much the same reasons as above. The
sampling approach probably will be a major factor. If it were going to be a
convenience sample then personal interviewing at several spots on campus would be
adequate. If it is a random sample it might be desirable to use telephone interviewing
where possible and personal or mail surveys where telephone contact cannot be
made.
c. Probably mail survey would be the most appropriate. It is possible that some written
material will be sent (policy statements, etc.) and this must be handled by mail.
Telephone surveys are possible if the questioning is not too complex and lengthy.
Personal interviews would be the high-cost alternative and probably would not be used unless this project had a substantial budget and was aimed at an in-depth study.
d. A mail survey would probably hold costs down and improve the chances of making
contact with the special executive. One might also use a letter to inform the financial
officer of the project and then call long distance at a pre-specified time to secure the
answers to the questions.
e. College campuses today are fairly well connected with operating Intranets, and
supervisors of student workers typically provide some written information on the job
openings available in their departments. Thus, a combination of self-administered
questionnaire via web or mail, along with either telephone or personal interview is
likely.
This study was actually tested by mail survey first, but responses were incomplete
and in some instances provided confusing information. A second, more successful
test used a web-delivered instrument (to both students and supervisors) and
supplemented this with follow-up personal interviews (with supervisors) to clarify
variables that were answered inappropriately or incompletely. This is an excellent
opportunity to discuss incentives for participation; students were anxious to
participate because they perceived it was likely to bring about a new pay structure in
which many of these participants would benefit. Supervisors, on the other hand, had
no such motivation. Timely participant cooperation and organizational culture were the primary reasons cited for using personal interviews rather than telephone interviews for the follow-up with supervisors. A large pool of student
volunteers was used to conduct the interviews, thus eliminating the extra cost of a
personal interview and providing quick results as well.
6. The best way would be by random digit dialing within the exchange. There are 9,999
possible numbers from which to choose a group by using a table of random numbers or a
random number generator. Record the chosen random numbers in the order in which they
appear. Use at least 70 numbers in order to have replacements for unassigned numbers,
business establishment numbers, and other non-households. When a number does not answer, continue to call back up to 5-7 times at different times of day before replacing it with another number. Continue the calling process until the sample of 40 families is achieved.
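A minimal sketch of the random digit dialing procedure just described; the three-digit exchange shown is hypothetical, and drawing from 0001-9999 matches the 9,999 possible numbers noted above.

import random

random.seed(1)                                   # reproducible draw for class demonstration
exchange = "555"                                 # hypothetical exchange prefix
suffixes = random.sample(range(1, 10000), 70)    # at least 70 numbers, drawn without replacement
numbers = [f"{exchange}-{s:04d}" for s in suffixes]
print(numbers[:5])                               # dial in the order drawn, replacing non-households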
7. This can be answered in several ways. The Kanuk and Berensen review article would
suggest that follow-ups be used, that a respected sponsorship be used, that a stamped return
envelope be enclosed, and that a money incentive be included. The Dillman approach is
concerned more with the total design of the project. He urges concern for the improvement
of all aspects of the study that could give it an aura of importance, quality, personalization,
and usefulness. He addresses the problem in terms of the appearance and content of the
materials as well as the process used. He stresses envelope appearance, cover letter
appearance (equal to a normal business letter), careful design of cover letter content (to
make a strong appeal), and the importance of the participant's participation. He advocates multiple follow-ups and careful timing of the mailings.
8. A. Sample selection: In the case of a large corporation, it may be assumed that the
survey is a multi-location exercise. Further, since the issue is one of sexual
harassment, the study may wish to examine the "age" aspect of sexual harassment, i.e., whether the problem is more prevalent among women of certain age groups and possibly men of certain age groups. Sample selection may be random but stratified by age and location.
C. Purpose: A survey such as this may be highly sensitive. It can prove to be a catalyst
to militant unionization by adversely affected employees. Alternatively, if it is
perceived that the survey is an instrument that will be used to determine future policies, demands may be "exaggerated" by certain segments, facts falsified, and the survey results biased in this sense. This bias may be oriented toward males or females, and even the numbers in each sex may bias results, unless the statistical analysis takes cognizance of this. Whether issues of relationship building, fact finding, or other issues are involved, it is important to (a) assure anonymity or confidentiality of the participants, (b) deliberate whether the survey should be carried out by an internal agency or a more obviously "distanced" outside agency, and (c) assess the impact of a complete disclosure of the purpose of the study versus relative lack of openness, as well as determine the corresponding impact of rumors and grapevine effects.
On the issue of response and nonresponse error, reference may be made to sections (b) and (c) and to the consideration of a two-stage survey approach.
9. The chapter frequently refers to the Albany case as issues are discussed. The vignette provides lots of information, perceptions, and analyses for the student to collect prior to defining appropriate actions. This question may be assigned as a partial out-of-class/in-
class exercise. Teams are formed to enumerate the various problems with the study. Once
they have shared the problems, teams are reformed to design an appropriate communication
study, given the measurement questions that were deemed necessary.
To facilitate discussion, here are some observations on the Albany clinic study drawn from the chapter:
• The Albany clinic study was an intercept/self-administered central-location study.
• Interviewer error source: failure to secure full participant cooperation. The sample loses credibility and is likely to be biased if interviewers do not do a good job of enlisting participant cooperation. Certainly there is a question about the quality of the data collected from Edna during the Albany clinic study. Toward the end of the communication, there is some doubt about the seriousness with which questions were answered. Stressing the importance of the information to the surgery that follows and having a receptionist serve as question interpreter/prober could reduce this type of error.
• The intercept interview would have been a possibility in the Albany clinic study, although more admissions clerks would likely have been needed if volunteers were not available to do this task.
• With her eyesight and the problems of question clarity that Edna experienced at
the Albany Outpatient Laser Clinic, a personal interview rather than the self-administered
questionnaire might have been a preferable communication method.
• As Edna and her friends discussed the Albany clinic survey, they each applied their own operational definitions to the concepts and constructs being asked about. This confusion would have created a bias that might have been eliminated by a well-trained interviewer.
• Interviewer error source: failure to consistently execute interview procedures. In the Albany clinic study, providing differing concept or construct definitions to different clinic patients would have created bias.
• Interviewer error source: failure to establish an appropriate interview environment. Since the Albany clinic study asked for factual rather than attitudinal data, interviewer-injected bias would have been limited. If the clinic had required the admissions clerk (who insulted Edna by referring to her negative attitude) to also conduct a post-surgery study of patient satisfaction, the results of the latter study may have been influenced by interviewer bias.
• Answers to many of the questions on the patient survey might have been known
by a caregiver, especially since Edna was over 80. And the clinic’s admissions department
could have been confident that such information was as accurate as if given by Edna herself.
• The absence of assistance to interpret questions in the Albany clinic study was a clear weakness that would have been improved by the presence of an interviewer. Interviewers can note conditions of the interview, probe with additional questions, and gather supplemental information through observation. Edna was obviously in good spirits and very relaxed after she and her fellow patients had critiqued the questionnaire. This attitude would have been observed and noted by an interviewer. Of course, we're hopeful that the interviewer would correctly interpret laughter as a sign of humor, not as a negative attitude, as did the admissions clerk.
• In the Albany clinic study, the researcher could have taken several actions to improve the quality of the data. Distributing the questionnaire to the patient's eye doctor or to the patient (by mail) prior to arrival would have increased the accuracy of identifying medications, diagnoses, hospitalizations, and so forth. The patient's eye doctor was in the best position to encourage compliance with the collection process but was not consulted. Having the patient bring the completed questionnaire to the admissions procedure, where the admissions clerk could review the completed instrument for accuracy and completeness, would have given the researcher the opportunity to clarify any confusion with the questions, concepts, and constructs. Finally, pretesting the instrument with a sample of patients would have revealed difficulties with the process and operational definitions. Edna's concerns could have been eliminated before they surfaced.
The logical conclusions a student should reach are: 1) while the study might have elements indicating self-administration, it should include interviewer contact with the participant/patient; 2) the patient's doctor could be recruited to increase the participant's motivation to participate; 3) the receptionist could be trained to facilitate the process if a self-administered study is chosen; and 4) the measurement questions should be evaluated to determine their critical necessity.
1. A. Direct questions are those that the participant should be able to answer openly and
unambiguously. Indirect questions are those designed to provide answers through
inferences from what the participant says or does.
B. Open questions allow participants to reply with their own choice of words and
concepts. Closed questions limit participants to a few predetermined response
possibilities.
C. These types of questions compose three of the four levels of the management-research question hierarchy. Measurement questions are at the bottom of the hierarchy and are designed to gather specific information from research participants. The investigative questions usually compose several levels of questions and are
research question is the basic information question or questions that the researcher
must answer to contribute to the solution of the management problem. The research
questions are answered through the information obtained by the investigative
questions.
2. The survey technique is popular in business and social science research because so many of
the research questions concern human attitudes, levels of knowledge, and other types of
cognitive or affective information. Much of this information can be secured only by
interrogation, or at least can be secured more efficiently by interrogation than by other
means. In addition, because we have spent our lives conversing with others, the thought of "asking people" comes naturally.
3. Two major problems are the frame of reference problem and the irrelevant response. The
former grows out of the participant interpreting the thrust of the question in a different way
than was intended by the researcher. This can lead to unanticipated responses and even responses that appear to be acceptable but have a different meaning. For example, consider a question like, "What do you think of Mr. 'X' as the Democratic presidential candidate in the next election?" A response of, "He is the best candidate the Democrats have," might introduce a dimension that is different from the expected preference statement. Or, an answer of "great" might mean, "I prefer him because I'd like to see him be president," or it could mean, "I'm a Republican and his nomination would assure a Democratic loss." An irrelevant response might be, "I think the Democrats can't win, no matter who they run."
4. There are several reasons why a researcher may disguise the questioning objective. The
major reason would probably be to guard against introducing biases. The topic may be so
sensitive that a direct question will elicit a refusal, or the expression of socially approved
statements that do not accurately reflect the participant's views. Even if the topic is not
sensitive, a question may be so uninteresting or difficult to answer adequately that the
participant replies in a stereotypical way. A third situation in which indirect questioning
may be useful is when we seek information that is available from the participant, but not at
the conscious level.
5. While opinions will vary on these, four important faults of the survey instrument designer
are:
1) Failure to understand the full dimensions of the subject, hence the topic is covered
inadequately and the information is not secured in its most useful form,
2) Failure in selecting the most appropriate communication process or combination of
processes,
3) Failure in drafting specific measurement questions: hence inadequate attention is
given to each question's content, wording, and the sequence of the questions, and
4) Failure to test the instrument properly.
Finding the best wording for a question calls for experimentation with different versions,
particularly when positive and negative versions are usable. Another good guide is to
follow the test-revise-retest process with all questions. These six questions are helpful.
The section entitled Purposes of Pretesting provides a more complete set of criteria.
7. The first few questions need to awaken interest and motivate the participant to participate in the study. Often students try to achieve this by starting off with questions that have little or no information value. Another tendency they have is to ask for personal classification information at the start; normally such questions should come at the end (except when needed to qualify the participant or if there is a fear that the percentage of completed interviews will be low). Questionnaire designers generally begin with questions that are
easy to answer and of more interest to the participant. They ask the simpler and more
general questions first, becoming more complex and specific as they progress (the funnel
approach is described in the section entitled Question Sequence). It is generally advisable
to deal with a single sub-topic before moving to another and there are often obvious
sequences for such topics that should be considered. Finally, they should watch out for
interactive questions where, for example, the answer to question 18 influences the later
answer to question 25. It may be more desirable for question 25 to come first, or it may be
necessary to use alternative versions to balance out the biasing effects of one question on
the other.
A. It leads the participant by asking whether he/she reads a prestige magazine. It will
probably over-report readership. And, "regularly" can be interpreted in different
ways.
B. It presents the participant with a difficult estimation task requiring information that
he is unlikely to have readily available. It is unclear whether the question is
concerned with verbal requests only or any form of request.
C. It concerns a rather trivial event in the distant past that is not likely to be
remembered.
D. The meaning of "discretionary buying power" is not clear and not easily understood.
In addition, the question asks for information that is not likely to be easily available
or calculable even if the term is understood.
E. It is a simple "why" question that leaves the frame of reference open to the
participant. If the answer is of substantial importance to the research it should
probably be expanded.
F. It does not explicitly offer the alternative of "not doing a good job." It will
probably bias responses in a positive direction. Also, the question is too vague and
general for any purpose other than as a rough attitude indicator. Furthermore, how
recent is "now"?
10. A number of criticisms may be made, but among the most likely are the following: first,
there are several format problems that could be improved. The order of questions is not
bad except for putting the overall evaluation at the beginning where it is likely to influence
how the individual questions are answered. A second weakness is the failure to provide
adequate space for responses, especially in questions 3 and 4. It would also be wise to give
some more guidance as to the form of answers sought. The various parts of question 2
apparently should be answered by yes or no, but the lack of specific indications to this
effect may result in some participants merely checking some parts and not checking others,
or answering in other unexpected ways. When using what are essentially closed response
questions it is wise to specify the response choices.
Question 1 - should be placed much later as a summary question. Some scale other than a
good-fair-poor scale would be better since these are vague concepts. It would probably be
wise to use a scale to compare the professor to other professors in the student's experience.
Question 2 - in all of these parts there should be at least "yes" and "no" response choices; it
might be even better to use a more sensitive scale, say of 1 to 5 points. In 2a there is some
question of the meaning of "good delivery." Does this refer to speaking delivery skills,
ability to conduct class discussions, or does delivery refer to total classroom performance?
Question 2b is better than most, but is "know the subject" too crude or vague in concept? A
similar criticism could be made in 2c where "positive attitude" is too vague and subject to
variable interpretations. "Grade fairly" (2d) and "sense of humor" (2e) are probably
acceptable, but 2f is a multiple question that should be broken into several. In addition,
there should be some measure of usage. In question 2g, how prompt is "promptly?"
Question 3 - after providing more response space, we might improve on "strongest point."
A statement asking for the professor's "greatest strength" probably conveys the writer's
intention more clearly.
Question 4 - it would improve the quality of response if more effort was made to seek that
aspect of the professor's work that most "needs improvement."
Question 5 - "Kind of class" is confusing. Does this mean subject matter, type of class
operation, good or bad, or what?
A better letter would have had some degree of personalization. It would have suggested that
there is a need for such a study, offered some information about the purpose, or thanked the
addressee for participation, and, in general, adopted more of a persuasive tone. The
accompanying envelope should have been stamped as well as already addressed.
The questionnaire cannot be fully evaluated in terms of coverage since we do not know the
precise objectives of the study. On balance, however, the instrument is poorly designed
and the question construction is generally bad. To begin, the heading, a single word
"Questionnaire" is not very helpful. Something that identifies the project would have been
better. The statement of directions is more what one would expect on some class
examination or bureaucratic form rather than a study that seeks voluntary cooperation.
Most researchers want as much detail as possible, rather than as little. This suggests that
the researcher is really not interested in his own topic, so why should the participant be
interested?
It is difficult to show the space allocation problem in this text illustration, but clearly there
is insufficient room for answering most of the questions. The actual questionnaire had a
similar amount of blank space for each reply. For each question the blank for response was
as long as the remaining space to the right margin. Additionally, some comments can be
made of each question:
Question 1 - This question assumes that all members of the ASTD are actually in the field
of training while this may not be the case. The participant does not know how the "field" is
defined. Finally, the participant may have entered the field with some organization that is
not a company, e.g., university, nonprofit organization, or local, state, or federal
government.
Question 2 - Assumes that there is a clear "field of training" definition. Also assumes that
the participant is now in the field of training while he/she may now be in a different area.
In such a case does one answer by giving the total time since entering training field, or
total time in that department?
Question 4 - Gives a full line to answer, where in this case, the reply expected is something
like "X years."
Question 5 - Word "department" may present problems. How does one answer if training
is a part of a division? It also assumes that there is a training department.
Question 8 - Inadequate space for persons with more than one degree.
Question 9 - "Why" question is almost impossible to answer since a full response would
contain too many factors. This compound question could be handled better if broken into
several questions.
Note also that there are no identification or classification questions. There was a written
code number on the back of the questionnaire that was never referred to.
12. Encourage your students to use Exhibit 12-10, as well as the cover letter with the Inquiring
Minds Want to Know--Now! case to answer this question. A good approach to this task is to
ask students about their own high school reunion experiences. If they haven't been out of
school long enough to have experienced a reunion, ask them what will encourage them to
return. Some will have strong feelings supporting the two extreme positions (can't wait to
return/ hope never to have to return for any reason) but most will be in the neutral position
(high school was okay, maybe it would be fun to see or hear from some of those I went to
school with).
From this exercise, move to one that has the students develop the investigative questions
for the likely research question (What do we need to know about graduates to write a
newsworthy article that the local paper will print?). Investigative questions will usually
focus on where graduates are: physically, but also financially, professionally, and
personally. The list of investigative questions should also include questions related to what
graduates have done since graduation that is newsworthy—unusual or impressive in and of
itself or unexpected based on who is involved.
Finally, ask students to identify what the letter must accomplish to get the highest possible
response rate. And also what it must NOT contain that might discourage response. Students
might reveal the following purposes, among numerous others, that the letter must
accomplish to get graduates of Palm Grove to consider completing and mailing back the
questionnaire:
1) The letter must help the participant reconnect with their hometown (as many
will have likely moved on to other towns, states or countries).
2) The letter must stimulate positive attitudes about participant's high school
experience (a tall order, given that high school is a time of painful adjustment for
many teens).
3) The letter must convince them that the community in which they live(d) will
benefit from their sharing of personal information.
4) The letter must make graduates feel important, that their information is vital
to the effort.
5) The letter must make completing the survey seem enjoyable, at best, and
painless and easy, at the very least.
After all this preliminary planning, drafting the letter may seem easy. This last task could
be assigned as an out-of-class assignment.
13. This exercise is a natural extension of the investigative questions developed in question 12.
Using this list, and knowing that the mail survey for a self-administered study is the most
likely instrument, have students as an out-of-class team assignment come up with a brief
survey of plausible measurement questions. You might want to specify the number and
type of questions you will accept (e.g. 3-5 classification questions, 4-5 target questions).
Have them refer to Exhibits 9-1, 9-2, and Exhibit 12-6 to obtain plausible scale designs and
other response strategies. And have them be prepared to specify the preliminary descriptive
statistics they want to collect for each question as a means of defending the scale or
response strategy that they chose. Each survey should be complete in all its parts and not
violate any of the basic design rules enumerated in the 21 issues detailed in the chapter.
Each instrument should also follow the rules for clarity, reliability and validity. Have the
team bring multiple copies (enough for another team to use) to the next class session.
At the next class session, each team could provide their instrument as a transparency or
PDF and explain their preliminary analysis plan. Then you can direct the teams to swap
surveys and critique the alternative instrument as a "researcher pretest", using the same
rules that they used to build their own instrument. You can have the teams critique the
instrument on several levels. Do the questions accomplish the research objective? Are the
response strategies appropriate given the group's preliminary analysis plan? Are the
questions properly sequenced? Are the transitions adequate? Are the questions free of
error? One spokesperson from each team should be asked to report on what the critique
reveals using the terminology of the research profession. Thus each team must identify the
problems they find by their correct terms and provide the solution to fix each error.
14. Some students learn graphically, so translating their team's instrument development process
into the detailed process graphics in the text can be enlightening. This also can be an
opportunity to demonstrate a questionnaire flowchart, noting places of likely
discontinuation or termination, skip directions, transitions, introductions and conclusions.
Some instructors will assign this problem at the same time they assign the out-of-class
assignment of problem 13. Sometimes it helps the students map the elements in Exhibit 12-
10 and 12-9 before they actually put their questions in survey format.
1. The observational method is a more useful method for collecting data from children,
illiterate, and functionally illiterate persons. The intrusion of observation is often better
accepted than questioning. Disguised and unobtrusive measures are often easier to carry
out than disguised questioning. It is generally a slow, expensive process with limited
opportunity to learn about the past. Further comparisons are found in the accompanying
table.
Referring to the chapter on Ethics in the book, the instructor may wish to review some
ethical issues. Considering the four classes of observational studies, those that are
completely structured and conducted in a laboratory setting will resemble experiments and will
produce ethical problems (for example: B, C, D, E, F, G, H, I). Unstructured observations
in a natural setting may be problematic only with respect to A, H, and J.
4. A. Observational data can be collected as events occur; observation often can secure
information that would otherwise have been ignored by those present. It is possible to capture the
total event, it is often less obtrusive than surveys, and often information can be
secured only by observation. On the other hand, observation is slow and expensive
to conduct, is limited to overt action information, the observer often has to be at the
scene of the event to secure the information, and for many types of historical
information the method is not usable.
On the other hand, questioning enables one to gather more information about the
past, about future expectations, and to gather information such as attitudes and
expectations that are not reflected in overt behavior. Information can be gathered
with less cost and from long distance by phone and mail.
C. Factual observation is restricted to those manifest events, actions, and conditions that
are reported as they occur or fail to occur. Inferential observation calls for the
observer to interpret events and actions by drawing conclusions about latent factors
that are at work, or reasons why one reacts in a certain way. Evaluations of
performance are an example of inferential observation.
5. There are three decision aspects to this question: whether to observe directly or indirectly,
whether to conceal the observer or the mission in some way, and whether the observer is to
participate in any functioning that goes on. Any combination of these three elements is
possible, so the answer often depends upon the assumptions made and the preferences of
the researcher.
A. If the observer is a member of the class then direct observation in which the
observation is concealed is likely, but the observer participates in class activity. If
the observer is a visitor the concealment is hardly possible and participation is
unlikely.
B. This could be direct but it is not uncommon for this to be done by a video camera
which takes a frame every second or so. The process of observation would probably
be concealed from the customer, the observer would probably be visible, but a
nonparticipant.
C. May be direct, concealed, and non-participatory if the client visits the scene of the
focus group and observes from behind a one way mirror. Also videotape may be
used and in this case it might be indirect, probably not concealed, and non-
participatory.
D. Probably direct, concealed, and participant. It is unlawful in the U.S. to "spy" on
union activities on behalf of management.
6. A. Student answers might take a variety of forms, so this question is ideal for class
discussion. Students might first suggest following a particular form through the
insurance company (e.g. a claim request form). Students might also suggest that
employee movements could be logged (e.g. when did an employee leave their work
space and what did they go to retrieve; when did another employee enter someone
else’s work space, and why; what forms were most used and where were they stored,
etc.) Students could also observe how employees communicated with each other (e.g.
by yelling out questions, or calling a colleague, or leaving a personal workspace to
have a face to face discussion with another employee). Efficiency is often hampered
by the condition of the office, so observations might be made about a participant’s
work space, its degree of organization, the availability of work vs. storage spaces,
etc. The list of potential observations is fairly extensive. In order to keep the
discussion on track, you might need to periodically remind the class that the company
wants to improve efficiency and paperwork flow.
B. You might suggest that the class choose between the content areas of ‘interoffice
communication’ or ‘form XYZ use and storage’ or ‘use of reference documentation’
or some other personal area of interest. Then in the content area, you should ask for
just what and how the observation should be done. For example, regarding
interoffice communication, students might operationalize the following observations:
• If employees yelled questions and answers across or between workspace
cubicles, and how often this occurred during the observation timeframe. They
should be required to indicate what they will observe and what and how they will
record the observation. For example, will they record the question or the
substance area of the question, the time duration of the conversation, the time of
day of the conversation, whether the question was adequately answered, the
names of the parties, the resulting behavior following the conversation, etc.?
Students should be encouraged to develop an observation checklist to record their
observations.
• Whether an employee left his/her workspace to ask questions of another
employee or deliver answers from previously asked questions, and how the
employee behaved if the targeted ‘expert’ was unavailable to answer the question
due to absence or involvement in a phone conversation. In this area, besides
some of the elements above, a student might suggest that they should record the
distance of the ‘trip’, the nature of the question, who was the instigator, amount
of time in the conversation, whether the work space of the targeted expert was
invaded or the conversation took place at the door or opening, whether others in
the area showed signs of distraction, whether an adequate answer was received,
whether paperwork or other materials (e.g. office supplies) changed hands,
whether the conversation was extended by non-work related topics, what was
done by each party immediately after the interaction, etc. Students should be
encouraged to develop an observation checklist to record their observations.
7. Same as 4, above.
8. A. Some of the standard information items that would be noted are sex, age, whether
alone or with other adults or children, time of day, day of week, and weather
conditions. Evidence of shopping activity, apparent income, race, apparent social
class, and many other items might also be included, depending upon the objectives of
the study.
B. The study objectives would be a major determinant of what to observe. There would
also be a number of variables that one might want to use as modifying or control
variables.
C. These would depend upon the variables used, but students should have no problem
establishing reasonable operational definitions. For example, age might be separated
into three or four age categories by inspection. Shopping status might be defined in
terms of carrying any shopping bag or item that has apparently just been purchased.
D. Generally the instructions should tell the observer how to act and what to do. They
would include such information as to how to dress, how to conduct her/himself on the
job, where to stand, how to record, and how to deal with questions or other situations
which may develop. In addition, they would cover how to sample, when to conduct the
observations, any special instructions as to what to observe, and how to adapt to conditions
that might occur.
E. Probably there will be a time sampling such as 15 minutes of every hour, with the
particular 15-minute segment being chosen originally by a random method. The
observer may also be instructed to choose every 4th person passing a given point on
the sidewalk, in one direction only.
9. At ProSec Electronics, the study of defective parts should have given some direction to the
observation. For example, if defective parts had greasy residue, then the observation
checklist would have been seeking residue on work stations, either inherent or contributed
by the employee. If the defective part was due to excess stress on the part, then machines
or processes could be evaluated to determine where excess material stress was occurring.
The clandestine methods Jason and Otto used were partially due to not having identified
plausible causes of defects before the observation started or assuming that employees were
at fault because all other causes had been discarded.
This exercise should encourage students to imagine all the possible ways an electronic
device could be damaged during manufacture. One way to organize the brainstorming is to
identify major arenas for problems. Many such ideas come from five major arenas:
• the work environment itself (excessive dust, foreign particles invading
the electronic surface, etc.), or
• the machines which make the parts (worn die or out of alignment
machine), or
• the people who operate the machines or supplement the mechanical
processes (Bertha's peanut snacks, skipped quality check due to smoking break, or
smoking at the workstation), or
• the materials (lower grade plastic or metal component parts causing
stress on a proprietary part), or
• the supplier processes (excessive dampness or rodent infestation in
storage facility).
From Concept to Practice
1. A. Internal validity was called simply "validity" in Chapter 8. It involves the question of
whether we are measuring what we think we are, i.e., is the experimental treatment
the real cause of the result we find in the experimental group? External validity
concerns the degree to which the experiment can be generalized across persons,
times, or settings. That is, can the experiment be viewed as an accurate sample of
some more general conditions?
B. Pre-experiment designs are the crudest forms of "experimentation" because they fail
to control extraneous variables and they often omit the basic process of comparison.
History, maturation, and instrumentation problems often plague these designs.
Quasi-experiment designs are more sophisticated than pre-experiment designs, but
they too do not qualify as true experiments. These designs are used when the
researcher can control only some of the variables. In the quasi-experiment the
researcher cannot establish equivalent experimental and control groups through
random assignment, and often he/she cannot determine when or to whom to expose
the experimental variable. On the other hand, researchers can often determine when
and whom to measure.
C. Both are problems of internal validity. History effects represent specific events that
occur during a study that can influence the IV-DV relationship. Maturation effects
occur purely as a function of the passage of time and are not specific to a given event or
condition.
D. Random Sampling (Chapter 7) is the special case of the probability sample where
each population element has an equal chance of selection. Randomization and
matching are both useful devices by which one can improve the equivalency of
control and experimental groups. Neither method is perfect, but randomization is the
basic method because it is the primary means of assuring compatibility within some
known error interval. Participants are randomly assigned to groups by probability
sampling, the type depending on the nature of the experimental design. Matching,
which employs a nonprobability quota sampling approach, is a way to supplement
random assignment and can improve the equivalence of test and control groups.
E. Active factors are those variables that an experimenter can manipulate by causing
various participants to receive more or less of the factor. Blocking factors are those
that a participant has in some degree and can not be changed by the experimenter.
The experimenter can only identify and classify participants on these blocking
factors.
3. There is complete data on three variables: temperature (T), humidity (H), and artisan
experience (E), available for a year (365 days). T, H, and E are the IVs; the
corresponding data on the percentage of defective glass shells being manufactured (the DV)
is also available. This would permit the use of the "factorial design approach," which
allows us to test for both main and interaction effects. While not an experiment, there is
time series field data available for each of the 27 (3 X 3 X 3) cells in the factorial design
approach. At this point it is suggested that the student consider the concept of interaction,
and how the effects of temperature and humidity may not be just additive main effects, but
there can be an "interaction." Analogously, a more experienced artisan may be better able
to combat the negative effects of adverse temperature and humidity conditions, as
compared to less experienced artisans, explaining a possible interaction between E and T or
E and H.
The information says that while "supervisors" may be a factor impacting the number of
defectives the data for this is available for only 242 out of the 365 days. The instructor may
discuss two possibilities. Suppose there are four supervisors. In that case the study may be
conducted using only the data of the 242 days for which data on T, H, E and supervisor
identity is complete, to come to conclusions, using a (T X H X E X Supervisor Identity)
factorial framework, with 3 X 3 X 3 X 4 = 108 cells. However, this implies a loss of data for
365 - 242 = 123 days. The alternative would be to: (1) use the 3 X 3 X 3 X 4 framework to
assign to various cells the observations for the 242 days for which data are complete; (2) for
the remaining 123 days, classification data in terms of T, H, and E are available, but the
supervisor identity is unknown. In such cases, while the T, H, and E classifications would be
used as before, the cases can be randomly assigned among the different supervisors.
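For instructors who want to demonstrate the analysis as well as the design, a minimal sketch in Python is shown below. It fits the factorial model described above with statsmodels; the file name glass_defects.csv and the column names defect_rate, temp, humidity, and experience are illustrative assumptions, not part of the case data.

import pandas as pd
import statsmodels.formula.api as smf
from statsmodels.stats.anova import anova_lm

# assumed file: one row per day, factors coded into three levels each
df = pd.read_csv("glass_defects.csv")

# main effects plus all two-way and three-way interactions
model = smf.ols(
    "defect_rate ~ C(temp) * C(humidity) * C(experience)", data=df
).fit()

# Type II ANOVA table: one row per main effect and interaction term
print(anova_lm(model, typ=2))

The ANOVA table lists the three main effects together with the interaction terms, which mirrors the discussion of interaction above.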
Performance Differences - Define performance as the speed at which the system completes
a task. Several tasks could be measured and combined in a weighted average to come up
with a scalar number that represents the performance of each of the two computer setups, in
a particular customer environment. For example, the following tasks could be included:
time to retrieve client information from a database; time to update a database, time to
complete a calculation, time to check the spelling of a document, time to print a document.
Since the research question involves more than one "workstation," the tasks should be
measured with varying numbers of people doing the tasks (i.e., 1, 5, and 10). The
performance differences are represented by the difference between the time it takes the first
system to complete the tasks and the time it takes the second system to complete the tasks.
Local area network (LAN) - A local area network is defined as a set of microcomputers
connected together by cabling, able to share data, programs, printers, scanners, etc.
Several types of LANs are available and the type of LAN will affect the performance.
Therefore, LANs should be further operationally defined to be token ring or Ethernet and
the speed at which it runs should be defined (4 or 16 Mb/sec for token ring, 10 Mb/sec for
Ethernet).
Terminals - Includes a keyboard and a screen that connects to a computer. A terminal has
no "intelligence" (processor) to do computing on its own and depends on the computer to
which it is linked for data storage as well.
5. A. For the testing of drugs we need to study/test both main effects and interaction effects. The
effects of the same dosages on different age groups are not expected to be equal. To take the
more obvious case, infants require lower dosages than adults. Analogously, the elderly may be
more sensitive to certain drugs. Such effects refer to the interaction between age and dosage.
The factorial design allows us to vary all factors simultaneously and test for main and interaction
effects. A completely randomized factorial design is suggested with each of the dosage levels
being randomly assigned to each of the age groups. Alternatively the randomized block design
serves the same purpose, and age may be used as the blocking factor. Whether the design
improves the precision of the experimental measurement depends on how successfully the design
reduces the within-block variance and maximizes the between-block variance.
B. In drug testing, control is made more stringent through the use of double blinds.
When participants do not know that they are receiving the experimental treatment
then they are "blind." When the experimenters also do not know whether they are
giving the treatment to the experimental or the control group it is referred to as a
double blind. The issue of human mortality during experiments is a complicated one.
Attrition in an experimental group is expected and this is normally handled through
random assignment of such cases. However, in the case of medical experiments even
the few deaths have high "real" significance. Age, dosage, and their interaction
may, in however few cases, have main or interaction effects that are related to
death. One historical solution has been to collect, over experimental
replications, enough such observations to test the age/dosage main and
interaction effects in such cases alone, taking into account other possible extraneous
factors through techniques such as covariance analysis.
Making Research Decisions
6. (See Chapter 5 on Ethics). In the past there has often been too little concern among
researchers regarding this problem. Clearly participants in experiments have rights that can
be violated easily, particularly in research involving students who may not feel free to
refuse participation. The federal government has promulgated regulations concerning the
use of humans as participants in research and many universities and colleges have formed
committees to monitor faculty research projects in this respect. Student-run projects have
generally not been monitored but there is no compelling reason why they should not be
regulated.
The discussion of this point should consider the degree to which the following ten items are
pertinent to the experimental method:
7. The statement essentially says that the model of experimental design is the most powerful
basis we have for determining causation. Therefore research efforts should seek to
approach this ideal model as closely as possible.
8. The major characteristic of the true experiment is the achievement of equivalency between
experimental and comparison groups through the use of random assignment. In this way
we can enhance internal validity.
A. Completely randomized design: participants are randomly assigned (R) to one of three
treatment groups, and each group is then measured (O):
A = no incentive    R   A   O1
B = $1 incentive    R   B   O2
C = $3 incentive    R   C   O3
B. Randomized block design
Assume that there are strong reasons to believe that the experiment should be blocked
on political affiliation, with three political classifications serving as the blocks.
Within each block, participants are randomly assigned across the three treatments:
No incentive   R   A   A   A
$1             R   B   B   B
$3             R   C   C   C
C. Latin square
Assume that a second extraneous factor, age, is believed to have an important effect.
We divide all participants into young, middle age, and old groups.
Young A C B
Middle B A C
Old C B A
D. Factorial design
Assume that we wish to test the effect of the sex of the interviewer at the same time
we test the incentives. The factorial design might be a 2 x 3 layout in which each
interviewer-sex group is crossed with each incentive level:
                      No incentive     $1     $3
Male interviewer
Female interviewer
With the information given it is not possible to determine which design to use. If
there is no apparent reason to use a more complex design, one might use the
completely randomized design. If there is a useful basis on which to block, this
would normally increase the precision of results as this type of stratified sampling is
typically more statistically efficient than simple random sampling. With a total
sample of 300 one might prefer a design with only a few cells.
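If students want to see how the random assignment behind these designs might be carried out, the short Python sketch below illustrates it; the participant labels, block labels, and group sizes are hypothetical.

import random

participants = [f"P{i:03d}" for i in range(300)]
treatments = ["no incentive", "$1", "$3"]

# Completely randomized design: shuffle, then deal participants
# into the three treatment groups in round-robin fashion.
random.shuffle(participants)
assignment = {p: treatments[i % 3] for i, p in enumerate(participants)}

# Randomized block design: repeat the same shuffle-and-deal
# separately within each block (e.g., each political affiliation).
def assign_within_blocks(blocks):
    """blocks maps a block label to the list of participants in it."""
    result = {}
    for label, members in blocks.items():
        members = members[:]              # copy so the caller's list is untouched
        random.shuffle(members)
        for i, p in enumerate(members):
            result[p] = (label, treatments[i % 3])
    return result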
10. A. Probably the most appropriate design is a quasi-experiment called the nonequivalent
control group. One might assume that there are three different factories (e.g.
assembly plants) and each one will use a different compensation method. While the
specific method assigned to each plant could be done randomly, this limited
randomization does not give much equivalency assurance. The use of the same
compensation system within each plant would at least partially guard against the
contamination effect that might be found if one were to try three experimental
patterns within the same plant.
B. This case calls for a factorial design since there are two variables that are being tested
simultaneously. This project might call for setting up the experiment in several cities
so as to achieve different levels of advertising.
C. Several different designs might be suggested here. Some will suggest a time series
quasi-experiment design in which various time periods will be control periods (i.e.,
no music) while others will be experimental periods. In this case each time period is
a unit of observation.
Another approach might be to use a randomized block design in which blocking is
done on time periods with different traffic levels since this may affect shopping
speed. Individuals entering the store during either control or experimental study
periods would be timed.
11. A good place to start this problem would be to challenge the student to use their secondary
data knowledge and skills to find experiments that have been done with twins. A quick
search reveals more than 40,000 twin research hits and approximately 7,000 twin
experiments hits, the most historically notorious done by Mengele at Auschwitz. A sample
of organizations that do studies involving twins includes
• National Organization of Mothers of Twins (www.nomotc.org)
• St. Thomas Hospital, Twin Research Center (UK) (www.twin-research.ac.uk)
• University of Wisconsin-Madison Twin Center
(http://psych.wisc.edu/goldsmith/nletters)
• The Twins Foundation (www.twinsfoundation.com)
• International Society for Twin Studies
• MIT Twins Study (http://web.mit.edu/jganger/Public/ourhome.html)
You can link to the above and many more from this portal for twin information:
• www.4twins.com
12. In Nose for Problem Odors, there are several treatments and several measures. In symbols,
the quasi-experiment might look like this:
R X1 X2 O1 O2
R1 X1 X2 O1 O2
R2 X1 X2 O3 O4
O1 O2 X1 O3 O4
O5 O6 O7 O8
13. In this question, the student is asked for subjective impressions of the experimental
procedure. The response should be a short essay from a participant’s perspective. The
chapter section Conducting an Experiment is a good starting point for a critique.
B. Spreadsheet data entry uses personal computer software for spreadsheet computation
as a data entry device. Preliminary analysis, data manipulation, summary statistics,
and graphics are frequently available in this medium.
C. Bar codes are digital/graphical labels used for tracking inventory movement and
movement of a product or service through a service process stage. It is a multi-part
identifier composed of alphanumeric characters that provides key elements of
information, including producer, category, and product/service specifics. As a result
it is one device that has made data mining more effective. Bar codes are most
frequently used to track the responses to administrative questions in research (who
did the interview, where and when the interview was collected, etc.).
E. Measures of shape include skewness and kurtosis which describe departures from
symmetry and the relative flatness (or peakedness) of a distribution, respectively.
G. Missing values are data not provided by a participant (or object during observation).
These may be due to lack of knowledge, participation, or motivation. A researcher
must decide how such data will be coded during data preparation.
2. There are several ways to handle DK answers. If there are only a few they might be
left as a separate category and included in the findings. Another approach is to
eliminate them. In doing this we are assuming that they have no particular bearing on
the pattern of answers we are getting. In cases where we suspect that they may be a
disguise for some other answer, we may attempt to correlate the DK answers with
these other answers. Comparing answers to other questions and checking with the
interviewer may at times also help interpret these replies.
3. While the standard deviation and the variance are both measures of spread, the standard
deviation is more easily interpreted: it measures distance from the mean in the original
units of measure, rather than in squared units as the variance does, while still capturing
the variability of individual values within a data set.
4. This exercise is perfect for learning to code open questions. Note: Items labeled a - n
are not separate questions but 14 representative responses to the specified open
question. The challenge for the student is to categorize the responses in some
meaningful way. Assigned to teams for an in-class exercise, students will come up
with several different coding schemes to discuss. The classification scheme below is
taken from one in which more than 30 replies were used. In the longer exercise the
following classification system was the best that resulted from the efforts of the
student team involved.
B. Product selection
1. Less style emphasis _______ _______
2. Fewer resources/lines _______ _______
3. Concentrate on known brands _______ _______
4. Reduce item duplication _______ _______
5. Concentrate on in-stock shoes _______ _______
6. Eliminate fringe items _______ _______
C. Price reduction
1. Small early markdowns _______ _______
2. Markdowns at season end _______ _______
3. Do not markdown staples _______ _______
5. This question requires student-collected data. Any of the following units may be used
to analyze this data: syntactical, referential, propositional, or thematic. One might
use syntactical units (word counts) for specific mention of position, activity, goal
structure (immediate or deferred), or strategy. A propositional approach (e.g., actor,
action, action object) might mention goal attainment through family, school
colleagues, mentoring, or personal environmental scanning.
6. The median is the value in the middle position, dividing the ordered distribution into
halves. (median = (154 + 160)/2 = 157)
The two small data sets below show why the median is a resistant measure: the single
extreme value changes the mean but leaves the median untouched.
Data set A: 12, 13, 23, 32, 43       mean = 24.6    median = 23
Data set B: 12, 13, 23, 32, 143      mean = 44.6    median = 23
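A quick Python check of these two data sets (values taken from the answer above) makes the point concrete:

from statistics import mean, median

a = [12, 13, 23, 32, 43]
b = [12, 13, 23, 32, 143]    # same data except for one extreme value

print(mean(a), median(a))    # 24.6  23
print(mean(b), median(b))    # 44.6  23  -> the mean shifts, the median does not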
8. The table below contains the spreadsheet (text Exhibit 15-9) computations. The first
column of the exhibit is the 100 Best US Rank 2001 identifier (N = 15, no missing values);
summary statistics are reported for the remaining variables:

Statistic     Employees   Job growth (%/yr)   Training hrs/prof.   Entry salaries   Revenue ($ mil)
N (valid)     15          15                  13                   15               14
Missing       0           0                   2                    0                1
Mean          6014.00     8.20                62.85                59208.00         2663.29
Median        2047.00     7.00                52.00                51957.00         883.50
Mode (a)      566         14                  32                   45000            146
Std. dev.     8593.427    5.955               42.152               24578.066        5759.713
Skewness      2.008       .089                1.492                1.717            3.506
Kurtosis      2.950       -1.370              1.603                2.985            12.696
Range         26306       18                  141                  90202            22147
Minimum       566         0                   21                   34798            146
Maximum       26872       18                  162                  125000           22293

(a) Multiple modes exist; the smallest value is shown.
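Instructors who prefer to demonstrate the computations outside a spreadsheet could use a short pandas sketch such as the one below; the file name forbes_best_companies.csv and the column names are placeholders for however the Exhibit 15-9 data are stored. The sample-adjusted skewness and kurtosis that pandas reports should closely match the SPSS-style figures above.

import pandas as pd

df = pd.read_csv("forbes_best_companies.csv")
cols = ["employees", "job_growth_pct", "training_hours",
        "entry_salary", "revenue_mil"]

summary = pd.DataFrame({
    "N":        df[cols].count(),
    "Mean":     df[cols].mean(),
    "Median":   df[cols].median(),
    "Std dev":  df[cols].std(),
    "Skewness": df[cols].skew(),
    "Kurtosis": df[cols].kurt(),
    "Range":    df[cols].max() - df[cols].min(),
})
print(summary.round(2))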
1. A. Marginals are the row and column totals that appear at the bottom and right
“margins” of a data table. They show the counts and percentages of the separate rows
and columns.
B. A Pareto Diagram is a bar chart whose percentages sum to 100 percent. The variables
represented by the bars are sorted in decreasing importance (bar height) with bar
height descending from left to right.
E. Nonresistant statistics (e.g., the mean and standard deviation) are influenced by
outlying values in a distribution and may change significantly in response to a small
change in the data set.
F. Lower control limit: on a process control chart, the lowest acceptable value of the data.
Data that fall below the lower control limit are evidence that the process is out of control
or that special causes are adversely affecting it.
G. The five-number summary consists of the median, upper and lower quartiles, and
largest and smallest observations in a distribution.
3. A. Histograms are less helpful for error detection than frequency tables, stem-and-leaf
displays, or boxplots. A histogram of family size that had an unexpected code, such
as a midpoint at 20, might reveal a coding error or alert the analyst to a potential
outlier.
C. Boxplots through whiskers and outlier identification call attention to values that
extend beyond the main body of the data. Since extreme values have a substantial
influence on numerical summaries, errors in this area are often detected early.
4. A. The main body of the data is between the upper and lower hinges labeled "H" in Exhibit
16-1, below. This area represents 50% of the observed values.
Exhibit 16-1 Stem-and-Leaf Display for Net Profit Variable*
B. The upper and lower "inner fences" are plus or minus 1.5 IQRs from the hinges. The IQR is
1406.36. Since the lower hinge is 807.6 and the minimum value in the distribution is 251, there
are no lower outside values.
The upper "inner fence" would be 3130.16 (1723.80 + 1406.36) and the last observed
value inside that fence is 2975. Thus, four values are outside: 3758, 3825, 3939, and
4224.3 . (See Exhibit 16-1.)
(Note: With reference to the answer to question 6e, the outliers and extreme cases are
identified above.)
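A small Python helper shows the fence logic; it uses the conventional quartiles and the 1.5 x IQR rule, so its hinge values may differ slightly from the letter-value hinges reported in the exhibit. The Series name net_profit is a placeholder.

import pandas as pd

def inner_fences(series: pd.Series):
    # hinges approximated by the quartiles; IQR = Q3 - Q1
    q1, q3 = series.quantile([0.25, 0.75])
    iqr = q3 - q1
    lower, upper = q1 - 1.5 * iqr, q3 + 1.5 * iqr
    # observations beyond the inner fences are flagged as outside values
    outside = series[(series < lower) | (series > upper)]
    return lower, upper, outside

# example call with a placeholder Series:
# lower, upper, outside = inner_fences(net_profit)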
C. The upper "inner fence," 1.5 IQRs from the end of the rectangular box, is 46820.
Seven observed values lie beyond this point, each of which is designated by a case
number in the exhibit presented here (Exhibit 16-2) and has a corresponding
participant number in the Data Table for this problem. This distribution is positively
skewed and somewhat peaked. Extreme values are primarily from the following
sectors: Durable - Capital Equipment, Energy, and Hi - Tech.
Exhibit 16-3 Histogram of Market Value in 5000-Unit Intervals*
6. A. The histograms for this problem are shown as Exhibit 16-3 (5000-unit intervals) and 16-4
(2000-unit intervals). There is considerably less information in the 1000- and 2000-unit
histograms than in the 5000-unit version. Extreme values and outliers are apparent in all
examples, but the 2000- and 5000-unit versions display them best. The 5000-unit chart provides
a clear picture of gaps in the main body of the distribution but not sufficiently more so
to justify its size.
B. The 1000 or 2000 unit version would be the most desirable for a management report
based on the tradeoffs of size, clarity, and completeness of information.
D. Exhibit 16-5 shows a spread-and-level plot for this data. At the bottom is a
recommended transformation of .445. Rounding up, our first transformation attempt
would use the square root.
Exhibit 16-4 Histogram with 2000-Unit Intervals*
MKTVAL
Exhibit 16-5 Spread-and-Level Plot for Transformation Decision*
7. The standard rule in determining which direction to compute percentage totals in a cross
tabulation is to compute them in the direction of the causal factor, following this rule: (a)
the age variable would total 100%, (b) family income would total 100%, (c) marital status
would total 100%, and (d) unemployment would total 100%. However, some may point out
that there are circumstances where it would be useful to run the percentages in the other
direction. Consider, for example, the relationship between age and consumption of
breakfast food, a property-behavior relationship. One would not suggest that cereal
changes cause age changes, but a cereal manufacturer might still find it to be very
informative to have an age profile of the users of each of its cereal products. In like manner
we might be very interested, in an economic survey, to know the family income patterns of
the optimists and pessimists regarding their family futures.
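The direction of percentaging is easy to demonstrate with pandas; the data frame below is a made-up illustration of the age/cereal example, not data from the text.

import pandas as pd

df = pd.DataFrame({
    "age_group":   ["18-34", "18-34", "35-54", "55+", "55+", "35-54"],
    "eats_cereal": ["yes", "no", "yes", "yes", "no", "no"],
})

# Percentages total 100% within each level of the presumed cause (age):
print(pd.crosstab(df["age_group"], df["eats_cereal"], normalize="index"))

# Run the other way (an age profile of each user group), and the
# percentages total 100% within each column instead:
print(pd.crosstab(df["age_group"], df["eats_cereal"], normalize="columns"))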
8. At first glance it would appear that students on aid have a higher tendency to drop out of
school than do students on no aid. The remainder of the data, in which aid grants are cross-
tabulated with drop out by nearness of the student's home to school, indicates that the
original aid/drop out relationship is reversed. That is, when adjusted for nearness of home
to school, the students receiving aid tend to be retained in school better than the non-aid
students. This occurs because a much higher percentage of the "home far" students are on
aid than is the case with the "home near" students. The table shown would have resulted if
the absolute numbers involved were as follows:
C. One would conclude that both social class and appeal type are important variables,
but that social class is the stronger of the two, perhaps by 50% or so. This can be shown
by comparing the WC/B to MC/A cell response rates.
Percent Replies
Appeal Middle Class Working Class
A 20 40
B 15 30
Why use these cells rather than the MC/B vs. WC/A? To compare the joint effects
we compare mixed pairs of categories (mixed in the sense that each pair contains one
better and one worse category). Since working class is the better category and appeal B
the worse, the intersection of these two factors is compared to the other worse-better cell
(MC/A).
10. Part A indicates that salaried employees have generally lower turnover than wage earners.
There is also evidence that persons from rural origins have lower turnover than
persons from urban backgrounds (but only in salaried positions). When education
level is included, we see that it is a very important factor in the relationship.
It seems to be several times stronger than rural/urban origin. In fact, among wage
earners the rural/urban difference becomes confused and is probably immaterial.
When rural/urban origin is "held constant," it appears that education is perhaps twice
as important as job type, and even more so among those of urban origin.
Procedure: Using SPSS, select the Graphs menu, select Control, and
define the chart type as X-bar and R.
Data:
V1 V2
1.00 10.22
1.00 10.31
1.00 9.79
2.00 10.25
2.00 10.33
2.00 10.12
3.00 10.37
3.00 9.92
3.00 10.80
4.00 10.46
4.00 9.94
4.00 10.26
5.00 10.06
5.00 9.39
5.00 10.31
6.00 10.59
6.00 10.15
6.00 10.23
7.00 10.82
7.00 10.85
7.00 10.20
8.00 10.52
8.00 10.14
8.00 10.07
9.00 10.13
9.00 10.69
9.00 10.15
10.00 9.88
10.00 10.32
10.00 10.31
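For classes without SPSS, the X-bar and R chart limits for these data can be computed directly; a sketch follows. The A2, D3, and D4 values are the usual tabled control-chart constants for subgroups of size three (values may differ slightly by table).

import numpy as np

subgroups = np.array([
    [10.22, 10.31,  9.79], [10.25, 10.33, 10.12], [10.37,  9.92, 10.80],
    [10.46,  9.94, 10.26], [10.06,  9.39, 10.31], [10.59, 10.15, 10.23],
    [10.82, 10.85, 10.20], [10.52, 10.14, 10.07], [10.13, 10.69, 10.15],
    [ 9.88, 10.32, 10.31],
])

xbar = subgroups.mean(axis=1)                           # subgroup means
rng = subgroups.max(axis=1) - subgroups.min(axis=1)     # subgroup ranges
xbarbar, rbar = xbar.mean(), rng.mean()

A2, D3, D4 = 1.023, 0.0, 2.574                          # constants for n = 3
print("X-bar chart:", xbarbar - A2 * rbar, xbarbar, xbarbar + A2 * rbar)
print("R chart:    ", D3 * rbar, rbar, D4 * rbar)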
1. A. Parametric tests assume:
1. Independent observations.
2. Observations from a normally distributed population.
3. Populations that have equal variances.
Nonparametric tests typically have fewer and less restrictive assumptions than
parametric tests. The particular assumptions vary from test to test, but nonparametric
tests are often referred to as "distribution-free tests." While this is an exaggeration,
they usually do have fewer distribution requirements.
B. When testing a null hypothesis we have two choices: we can reject the null
hypothesis, or we can fail to reject it (loosely, we refer to this as accepting the null,
but this is not technically correct). We also find that two circumstances may exist -
either the null hypothesis is true or it is false. This combination of states of nature
(i.e., the null hypothesis is true or false) and decisions (i.e., reject or do not reject the null)
gives us four conditional outcomes, two of which are correct and two of which are
not. They may be displayed as follows:

                              Null hypothesis is true             Null hypothesis is false
Accept the null hypothesis    Correct decision                    Incorrect decision (Type II error)
                              p = 1 - alpha                       p = beta
Reject the null hypothesis    Incorrect decision (Type I error)   Correct decision (power of the test)
                              p = alpha                           p = 1 - beta
Type I error is measured by alpha which we set in our hypothesis testing. It is the
probability (or significance) level at which we will risk rejecting the null hypothesis.
We define it by choosing some critical value beyond which the occurrence of a
sample value leads us to conclude that the null hypothesis is false. Given such a
critical value, we then compute the probability that we will incorrectly fail to reject
the null hypothesis when it is false. This is Type II error or beta error. It is not
difficult to compute this arithmetically, but it is sometimes difficult to explain
conceptually. The beta error depends upon what assumptions we make as to how far
the population parameter may have changed. For example, if mu was originally 50
and changes to 52 we get a different beta than if mu changes only to 51.
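The arithmetic can be demonstrated with a short sketch for the mu = 50 example mentioned above; the standard deviation of 10, sample size of 25, and alpha of .05 are assumed values chosen only to make the illustration concrete.

from scipy import stats
import math

mu0, sigma, n, alpha = 50, 10, 25, 0.05
se = sigma / math.sqrt(n)
crit = mu0 + stats.norm.ppf(1 - alpha) * se   # one-tailed critical sample mean

for mu_true in (51, 52):
    # beta = probability the sample mean falls below the critical value
    # even though the true mean has shifted
    beta = stats.norm.cdf(crit, loc=mu_true, scale=se)
    print(f"true mean {mu_true}: beta = {beta:.3f}, power = {1 - beta:.3f}")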
D. The acceptance region delineates the area in a distribution where the occurrence of a
sample statistic value leads us to fail to reject the null hypothesis. The rejection
region is that portion of the distribution where we have agreed to reject the null
hypothesis. The "critical value" divides these two areas.
F. The probability that we will incorrectly fail to reject the null hypothesis when it is
false is a Type II error or beta error. The power of the test is calculated as 1- beta and
is the probability with which we will correctly reject the false null hypothesis (see
Exhibit 17-3).
2. There are six relatively well-defined steps: (1) State the null hypothesis; (2) Choose
the statistical test; (3) Select the desired level of significance; (4) Compute the
calculated difference value; (5) Obtain the critical test value; and (6) Make the
decision to accept or reject the null.
3. The Mean Square between is the between-groups sum of squares divided by
the between-groups degrees of freedom (number of groups minus one). This
calculation represents the effect of the treatment condition. The Mean Square within,
calculated by dividing the within-groups sum of squares by the within-groups degrees
of freedom (total observations minus number of groups), represents the error due to
sampling and random fluctuations. When MSb is divided by MSw, the F ratio results.
If the null hypothesis is true, the ratio reflects the fact that there is relatively
little difference between the presumed treatment variance (numerator) and error variance
(denominator).
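A hand computation along these lines, using three small illustrative groups (the numbers are not from the text), shows how MSb, MSw, and F fit together and that the result matches scipy's one-way ANOVA:

import numpy as np
from scipy import stats

groups = [np.array([4.0, 5.0, 6.0]),
          np.array([6.0, 7.0, 8.0]),
          np.array([8.0, 9.0, 10.0])]

grand = np.concatenate(groups).mean()
ss_between = sum(len(g) * (g.mean() - grand) ** 2 for g in groups)
ss_within = sum(((g - g.mean()) ** 2).sum() for g in groups)

ms_between = ss_between / (len(groups) - 1)                           # df = k - 1
ms_within = ss_within / (sum(len(g) for g in groups) - len(groups))   # df = N - k
print("F =", ms_between / ms_within)
print(stats.f_oneway(*groups))   # should report the same F ratio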
4. The required assumptions for ANOVA are: (1) independent random samples from
each of the represented populations, (2) normal distribution of the represented
populations, and (3) equal variances of said populations. Assumptions 2 and 3 are
checked with diagnostic procedures. Measures of location, shape, and spread were
discussed in Chapter 15 -- along with graphic techniques for examining distributions
for normality. In addition, Exhibit 17-6 provides an example of normal probability
plotting and detrended probability plotting for examining normality. With respect to
equal variance, there are a number of homogeneity-of-variance tests, many of which
are dependent on the normality of the sample. For this reason we have referred to the
Levene test in both Chapters 16 and 17. It is less dependent on the assumption of
normality and, when using the SPSS Procedure EXAMINE, is nested within a
spread-and-level plot which also serves to determine whether the data should be
transformed.
5. The tradeoffs between Type I and Type II errors have a practical dimension as
defined by the costs incurred for each error. Often a change in the status quo is
associated with great cost (e.g., gambling the future of a firm on a new technology,
an acquisition, a sizable investment in equipment, etc.). Since the change must be
beneficial, the risks associated with alpha should be kept very low. However, if it is
essential to detect changes from a hypothesized mean, the risk of a beta error would
be paramount. Thus, we would choose a higher, less critical level for alpha.
A. We can reduce the probability of a Type I error by moving our critical values farther
from the expected mean, i.e., by expanding our region of acceptance and reducing our area
of rejection. A second way is to hold the same critical value but increase the sample size.
A third way is to shift from a statistically inefficient sample design to one that is more
efficient, keeping the same sample size.
Type II error can be reduced, given an assumed population mean, by increasing the
sample size. A second way is to increase our alpha risk by moving our critical
value closer to the original mean. A third way to reduce Type II error is to use a one-
tailed test, that is, to place the entire region of rejection in the tail of the
distribution in the direction of concern.
C. Yes. This is obviously the case since a census of both groups was taken. Whether
the difference has any real meaning or practical value is another question.
6. A. The appropriate test in this case is the Chi-Square test, which is particularly useful for
situations involving nominal data but can also be used for higher scales. The test is
used extensively in cases where persons, events or objects are grouped into two or
more nominal categories such as yes-no, or as in this case accepted-rejected. The
question is similar in principle to the example in the textbook (Chapter 17, section on
the Chi-square test) regarding 200 students and their intentions to join a club.
C. The required test will be a test for "two independent samples"; the Z test and the t test
are generally used. The Z test is used with large samples (above 30 observations)
and the t test with small samples. Alternatively, the Z test can be used with smaller
samples when the data are normally distributed and population variances are known.
In this case with a sample of 45 automobiles from each facility, the Z test can be
used. Here, we only seek to examine whether the "gas mileage is the same" for
products of the two facilities. Hence a two-tailed test is appropriate. If, on the other
hand, we sought to examine whether "facility A produced
automobiles that gave higher gas mileage than those manufactured at facility B,"
then a one-tailed test would be appropriate.
D. The data are in three categories of managers and three categories of motivation. For
data that is "categorical" or "classificatory," the Chi-square test is appropriate.
Further reference may be made to the answer to question 3A in this guide, and the
sections on One-Sample Tests and the Chi-Square test in the textbook. It is useful to
note that the binomial nonparametric test is not usable, since it is applicable only
when the population is viewed as two classes, e.g., male and female.
E. The two-related samples t test is used here, as we are measuring the sales
performance of the same trainees, before and after a training program. This implies
that the data are not independent. Further, we are assuming that the measure of sales
performance is "ratio" (in fact sales, either value or quantity is often taken as measure
of sales performance.) As we are seeking to examine whether the "training improves
performance," rather than whether there is "a difference before and after training," a
one-tailed test is appropriate.
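A sketch of the related-samples test in Python appears below; the before/after sales figures are hypothetical.

from scipy import stats

before = [52, 48, 61, 55, 47, 59, 50, 58]
after  = [58, 50, 66, 57, 49, 65, 55, 61]

# ttest_rel tests the paired differences; a one-tailed p-value for
# "training improves performance" is half the two-tailed value when the
# mean difference is in the predicted direction (recent scipy versions
# also accept alternative="greater" to get this directly).
t, p_two_tailed = stats.ttest_rel(after, before)
print(t, p_two_tailed / 2)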
F. Here the dependent variable is sales measured in dollars or as units sold (ratio scale),
and the independent variables are product quality and advertising. If the data on
product quality and advertising are "classificatory," (e.g. high, medium and low
grades of a product; advertising budgets from 0-5mill, 5-15 mill, and 15 mill dollars
and above) the technique of choice would be ANOVA. F tests would test the
significance of the independent variables. We would be testing the Main effects:
Quality, and Advertising, as well as the Interaction effect: Quality x Advertising.
The instructor may want to suggest that were the independent variables obtained as
"continuous" ratio- or interval-scaled variables, e.g., through the use of 7- or 11-point
semantic differential scales, then the technique of regression could be
used, which is detailed in Chapter 18.
7. (1) The null hypothesis is that there is no difference between the long run average
of 3.0 and the current year's average of 3.2 other than what is expected by sampling
variations. It would seem appropriate here to use a one tailed test since we are
concerned only with whether the sampling variation could have gone as high as 3.2.
(2) The proper test is the parametric test involving one-sample results being
compared to a population value. Use the t test.
t = (X-bar - µ) / (s/√n) = (3.2 - 3.0) / (.4/√25) = .20 / .08 = 2.5
(5) Enter Table F-2 with 24 degrees of freedom. We find a critical value (one-
tailed test) of 1.711 for alpha = .05.
(6) Since the calculated value 2.50 > critical value (1.711) the null hypothesis is
rejected.
The question also asks: "At what alpha level would it be significant?'' Checking text
Exhibit G-2, again we find that the critical value of 2.492 is found for an alpha of .01.
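The same computation can be reproduced from the summary figures in a few lines of Python:

import math
from scipy import stats

xbar, mu0, s, n = 3.2, 3.0, 0.4, 25
t = (xbar - mu0) / (s / math.sqrt(n))          # = 2.5
crit = stats.t.ppf(0.95, df=n - 1)             # one-tailed critical value, alpha = .05 (about 1.711)
p_one_tailed = stats.t.sf(t, df=n - 1)         # just under .01, matching the note above
print(t, crit, p_one_tailed)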
8. (1) The null hypothesis would be that there is no difference between the degree of
conservatism of professors and students, and that the differences found are due to
sampling variations only.
(4) Calculated value:
t = (p1 - p2) / σ(p1 - p2) = (.5 - .3) / √(.25/20 + .21/20) = .200 / .152 = 1.32
(5) Critical value from Table F-2 with 38 d.f. is 1.69 for a one-tailed test and 2.03
for a two-tailed test.
(6) The calculated value of 1.32 < either critical value, so the null hypothesis is not
rejected. That is, the differences found between professors and students could be due
to sampling errors only.
9. (1) Null hypothesis. H0: There is no difference between average annual starting
salaries of the graduates at the two universities. HA: Graduates of Eastern University
received higher starting salaries than graduates of Western University.
(2) Statistical test. The t test is chosen because the data are at least interval in form
and the samples are independent in a two-sample test situation.
(3) Significance level. alpha = .05 (one-tailed test; the alternate hypothesis states
direction)
(4) Calculated value: t = 3.44.
(6) Decision. Since the calculated value is larger than the critical value (3.44 >
1.66), reject the null hypothesis and conclude that graduates of Eastern University
secured higher average annual starting salaries.
10.
[Exhibit: number interviewed, percent, and observed Oi with expected (Ei) frequencies
for Favorable, Neutral, and Unfavorable attitudes, by college class.]
(1) Null hypothesis. H0: Oi = Ei. That is, the attitudes toward corporations in the
population are independent of college class. HA: Oi≠ Ei. That is, the attitudes toward
corporations in the population do vary by class.
(2) Statistical test. Choose the k sample chi-square test to compare the observed
distribution to a hypothesized distribution. The chi-square test is used because the
responses are classified into nominal categories and there are sufficient observations
in each cell.
χ² = Σ (Oi - Ei)² / Ei = 33.6
(6) Decision. The calculated value is greater than the critical value, so reject the
null hypothesis and conclude that there is a significant difference among the classes
as to attitude toward corporations.
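A minimal sketch of the k-sample chi-square test described above follows. The observed counts are placeholders (the exhibit's figures are not reproduced here); the layout assumes one row per college class and columns for favorable, neutral, and unfavorable responses.

    import numpy as np
    from scipy.stats import chi2_contingency

    # rows = college classes, columns = favorable / neutral / unfavorable (invented counts)
    observed = np.array([
        [100,  90,  80],
        [ 80,  70,  70],
        [ 60,  70,  90],
        [ 30,  60,  80],
    ])
    chi2, p, dof, expected = chi2_contingency(observed)
    print(chi2, dof, p)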
11. (1) Null hypothesis. H0: There is no difference in readership rates between
business school and liberal arts students. HA: Readership rates differ between
business and liberal arts students.
(3) Significance level. alpha = .05, two- tailed test (alternative hypothesis is not
directional).
t = [(5.6 - 4.5) - 0] / √{ [100(2.0)² + 100(1.5)²] / [(100 + 100) - 2] × (1/100 + 1/100) }
  = 1.1 / .251 = 4.378
(5) Critical value = 1.96. alpha = .05, two-tailed test, d.f. = 198. Consult Table
G-2.
(6) Calculated value of 4.4 > critical value (1.96). Reject the null hypothesis.
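The pooled-variance computation above can be reproduced directly from the summary statistics (x̄1 = 5.6, s1 = 2.0, n1 = 100; x̄2 = 4.5, s2 = 1.5, n2 = 100). Note the hand calculation pools n·s² rather than (n - 1)·s², so the sketch follows that same convention.

    from math import sqrt

    x1, s1, n1 = 5.6, 2.0, 100
    x2, s2, n2 = 4.5, 1.5, 100
    pooled_var = (n1 * s1 ** 2 + n2 * s2 ** 2) / (n1 + n2 - 2)
    t = (x1 - x2) / sqrt(pooled_var * (1 / n1 + 1 / n2))   # about 4.38
    print(t)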
12. (1) Null hypothesis. H0: There are no durability differences among the three
products. HA: There are durability differences among the three products.
(2) Statistical test. We use the F test with one-way analysis of variance because
the data are ratio scaled. There are k independent samples, and we accept the
assumptions underlying this parametric test.
(5) Critical test value for F = 3.89, alpha = .05, d.f. = (2,12)
(6) Decision. Since the calculated value is greater than the critical value (5.49 >
3.89) we reject the null hypothesis.
In the continuation of this exhibit, a post hoc comparison (multiple comparison test)
of the three paints using the Scheffe procedure shows that the significant differences
are between One-Koat and Competitor B (groups 1 and 3).
Exhibit 17-1 continued Multiple Comparison Test for Question 12*
13. (2) Statistical test. We use the F test with one-way analysis of variance because
the data are ratio scaled. There are k independent samples, and we accept the
assumptions underlying this parametric test.
(6) Decision. Since the calculated value is greater than the critical value (5.43 >
3.89) we reject the null hypothesis. A post hoc comparison of the three store types
using the Scheffe procedure shows that the significant differences are between the
electronics store and the department store (groups 1 and 2).
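The one-way ANOVA and post hoc step used in questions 12 and 13 can be sketched as below. The durability scores are invented; also, the text's post hoc procedure is Scheffé, while pairwise_tukeyhsd is shown here only as a readily available alternative for illustrating pairwise comparisons.

    import pandas as pd
    from scipy.stats import f_oneway
    from statsmodels.stats.multicomp import pairwise_tukeyhsd

    one_koat     = [52, 55, 50, 53, 54]
    competitor_a = [50, 51, 49, 52, 50]
    competitor_b = [45, 47, 44, 46, 45]

    F, p = f_oneway(one_koat, competitor_a, competitor_b)
    print(F, p)          # compare F with the critical value for (2, 12) d.f.

    scores = pd.DataFrame({
        "durability": one_koat + competitor_a + competitor_b,
        "paint": ["One-Koat"] * 5 + ["Competitor A"] * 5 + ["Competitor B"] * 5,
    })
    print(pairwise_tukeyhsd(scores["durability"], scores["paint"]))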
14. A test for related samples should be used since the data are collected on the same
firms and are paired for the one-year interval.
(5) Critical test value = 1.81. d.f. =10, one-tailed test. alpha = .05 .
(6) Decision. Since the calculated value is smaller than the critical value
(1.53 < 1.81), we fail to reject the null hypothesis and conclude that there is no
difference between the profits of the two years.
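A related-samples sketch matching the structure of question 14 is shown below; the profit figures are invented for eleven firms (so d.f. = 10, as in the answer), and the resulting t will of course differ from the text's 1.53.

    from scipy.stats import ttest_rel

    # hypothetical profits (in $ millions) for the same 11 firms, one year apart
    profits_year1 = [12, 15, 9, 20, 7, 14, 11, 16, 10, 13, 8]
    profits_year2 = [13, 14, 10, 22, 7, 15, 11, 18, 10, 14, 9]

    t, p_two_tailed = ttest_rel(profits_year2, profits_year1)
    print(t, p_two_tailed / 2)   # halve for the one-tailed test used in the answer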
15. A. Students might use the Fortune categories from the article: 1 =
software/services, 2 = entertainment/sports, 3 =
hardware/semiconductors, 4 = communications equipment, 5 =
biotechnology, 6 = food, 7 = venture capital.
Net Worth - Tukey B (subsets for alpha = .05)
Group 4.00:  5237.2000
Group 2.00:  7294.8571
Group 1.00: 20730.5000
Group 3.00: 44673.7500
(2) Statistical test. The McNemar test is chosen because the groups have been
matched on a control characteristic: the standard number of viruses that each
anti-virus product is presumed to remove. Although nominal measurements were
used, the chi-square test of independence is not appropriate.
Z = (33 - 58) / √(33 + 58) = -25 / √91 = -2.62
(6) Decision. Since the (absolute value) calculated value is larger than the critical
value (|-2.62| > |+1.96|), reject the null hypothesis and conclude that the proportion
removed by Anti-V is not equal to the proportion removed by Q-Cure.
Note: the other approach to McNemar, a modified chi-square that was illustrated in
the text, produced a chi-square value of 6.86. With 1 d.f. at alpha = .05, the critical
value is 3.84. We reject the null. (The square root of 6.86 is the Z value of 2.62.)
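The McNemar statistics above can be recomputed from the two discordant counts implied by the calculation (33 cases removed by one product only and 58 by the other; the assignment of these counts to Anti-V and Q-Cure is an assumption here).

    from math import sqrt

    a, d = 33, 58                     # discordant cells: cases removed by only one product
    z = (a - d) / sqrt(a + d)         # -25 / sqrt(91), about -2.62
    chi2 = (a - d) ** 2 / (a + d)     # about 6.87, consistent with the chi-square approach
    print(z, chi2)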
(1) Null hypothesis. H0: Oi = Ei. That is, the styling preferences are independent
of buyer behavior. HA: Oi ≠ Ei.
(2) Statistical test. Choose the two independent sample chi-square test to compare
the observed distribution to a hypothesized distribution. The chi-square test is used
because the responses are classified into nominal categories and there are sufficient
observations in each cell.
(6) Decision. The calculated value (20.94) is greater than the critical value (3.84)
so reject the null hypothesis and conclude that styling preference is not independent
of buyer characteristic.
B. Many analysts would apply the correction for continuity since the sample size is
larger than 40 and a 2 x 2 table is used. In this case, the calculated value drops only
slightly to 19.11 and we also reject the null hypothesis.
18. (1) Null hypothesis. H0: There are no differences between the means of the two
groups relative to flight service ratings.
(2) This is a case of two independent samples. The appropriate test is parametric
because the data are interval. Use the t test.
(3) The desired level of significance is again set at alpha = .05
(5) Critical value from text Exhibit G-2 with 38 d.f. is 2.03 for a two-tailed test.
The F test on the left of the exhibit indicates that we cannot reject the hypothesis of
equality of variances; thus we interpret the pooled variance estimate section.
(6) The calculated value (absolute value) of |-2.19| > critical value (2.03). The null
hypothesis is rejected. That is, there is a significant difference between the
two groups with respect to the service rating.
C. The test of the slope (b1 = 0) is a very important test in bivariate linear regression. If
the true slope is found to be zero, there is no linear relationship between the X and Y
variables. The test of the intercept is only to determine if the regression line goes
through the origin. The test of r² = 0 is similar to the test of the slope with a slightly
different interpretation. As a goodness-of-fit test, r² tells us how well the regression
line fits the data. By partitioning the sum of squares in the dependent variable, we
discover the proportion of variation in the dependent variable explained by the
model.
D. The coefficient of correlation, r, describes the relationship between the two measured
variables. The coefficient of determination is r². It shows the degree to which the
variables in question share common variance. If r is found to be .90, r² is equal to
.81; that is, 81% of the variance in X is explained by Y and vice versa.
4. See Exhibit 18-1. Somers' d symmetric is .707 and asymmetric, with opinion
dependent, is .70. The symmetric coefficient results from averaging the two
asymmetric coefficients. Somers' d is an extension of gamma that takes into account
the number of pairs not tied on the independent variable. The magnitude of the coefficients is quite
similar to the tau results. We would assume that education influences opinion on the
tax and that the asymmetric coefficient with opinion dependent is the most
appropriate one to interpret. We conclude from this problem that there is a relatively
strong positive relationship between education and opinion about a tax on stock and
bond transactions.
Exhibit 18-1 Contingency Table for Measures of Association *
5. We have computed four measures of association in Exhibit 18-2. Of the four, phi is
the most appropriate because the table in question is 2 x 2. As noted elsewhere,
Cramer's V simplifies to phi for 2 x 2 tables. The Pearson contingency coefficient C
is problematic because of its restricted upper limit and its limited comparability to
other measures. All three of these are based on chi-square. Lambda, on the other hand, offers a PRE
interpretation. Given the information in the problem, it would be difficult to identify
a predictor variable for lambda in that it could be argued that one's position on taxes
in general is just as good a predictor of party affiliation as party affiliation would be
for predicting favorableness on a specific position.
Exhibit 18-2 Contingency Table and Correlations for Question 5*
7. A. Symmetric Measures
                                Value    Approx. Sig.
    Phi                         .827     .000
    Cramer's V                  .827     .000
    Contingency Coefficient     .637     .000
    N of Valid Cases            685
Phi is appropriate for this 2 x 2 table. It is identical to Cramer’s V in this case. The
Contingency Coefficient’s upper range restriction makes it unsuitable for comparison. The
relationship between the two variables is relatively strong. It is also statistically significant
at .0005. The Chi-square was significant and since the three measures are based on Chi-
square, we would expect them to have the same confidence level.
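The chi-square-based measures above can be recovered from a 2 x 2 table as sketched below. The cell counts are placeholders chosen only so that they total the reported n = 685; the .827 and .637 values in the answer come from the actual data, not from these invented counts.

    import numpy as np
    from scipy.stats import chi2_contingency

    table = np.array([[300,  40],
                      [ 30, 315]])          # placeholder 2 x 2 counts, n = 685
    chi2, p, dof, _ = chi2_contingency(table, correction=False)
    n = table.sum()
    phi = np.sqrt(chi2 / n)                 # equals Cramer's V for a 2 x 2 table
    contingency_c = np.sqrt(chi2 / (chi2 + n))  # restricted upper limit, as noted above
    print(phi, contingency_c, p)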
B. The value for lambda is .68 with the clinical condition as the
designated dependent variable. This means that 68% of the error
in predicting the clinical condition (pregnancy) is eliminated by
knowledge of the test result.
8. A. With d.f. (1,8) the critical value of F is 5.32. In this table, the calculated value is
95.75 (found from the mean squares once the student fills in the d.f. and sums of
squares by subtraction). We reject the null hypothesis that b = 0.
B. The t value (9.79) is the square root of F. It is the primary test of the slope for
bivariate regression.
9.
Exhibit 18-5 Correlation Matrix for Question 9
10. The largest Pearson coefficient is .9704, the relationship between cash flow and
market value. The Spearman rank order correlation is found to be .9758. Normally
we would expect the Spearman to display a lower coefficient, since information is lost
when the ratio data are converted to ranks. In this case, the rank order preserved the
relationship with precision.
11. The variables selected for illustration were net profits and market value. The
regression appears in Exhibit 18-6. The R² is .92 and the t value for the slope is 9.79.
The null hypothesis was rejected at the .00005 level.
Exhibit 18-6 Regression Results and ANOVA Summary for Question 11*
Linearity: The scatterplot graphs the relationship of the X and Y variables. There is
little doubt that the function is linear and that a straight line fits the data.
Residuals: The summary of residual statistics in Exhibits 18-8 and 18-9 indicates
that the ranges (about -1.6 to 1.7 for standardized values) are not a cause for
concern.
Recommendations: Generally, the diagnostics suggest that this small random sample
from Forbes 500 data conforms rather well to regression assumptions. Equality of
variance could be improved with transformation and that would be our next step in
this problem. Otherwise, the departures are relatively minor.
Exhibit 18-11 Normal Probability Plot for Question 11*
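A hedged sketch of the bivariate regression and a quick residual check follows; the ten observations are placeholders standing in for the Forbes sample (net profits predicting market value), so the R² and t values will not match the exhibit.

    import numpy as np
    from scipy.stats import linregress

    net_profits  = np.array([1.2, 0.8, 2.5, 3.1, 0.5, 1.9, 2.2, 4.0, 0.9, 1.5])
    market_value = np.array([14.0, 9.0, 28.0, 35.0, 6.0, 22.0, 25.0, 44.0, 11.0, 17.0])

    res = linregress(net_profits, market_value)
    t_slope = res.slope / res.stderr            # the t test of b1 = 0; its square equals the model F
    print(res.rvalue ** 2, t_slope, res.pvalue)

    residuals = market_value - (res.intercept + res.slope * net_profits)
    print(residuals / residuals.std(ddof=2))    # rough standardized residuals for a quick diagnostic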
12. A. The relationship between X and Y is shown as - .84 in Exhibit 18-12.
C. The square of the correlation, r², is equal to .71; that is, 71% of the variance in X is
explained by Y, and vice versa.
E. Refer to Exhibit 18-12. The test of the slope (b = 0) was rejected below our
conventional alpha = .05, with a t value of -4.16 and a significance of .0042. The test for
the significance of the correlation coefficient was a two-tailed t test, as
shown at the top of the exhibit. This null was rejected at .01. Finally, the F test of the
regression model was equal to t² (17.3).
Exhibit 18-13 (cont.) Regression Results for Question 12*
3. If a MANOVA problem has two factor levels and several dependent variables, the position
of the variables could be reversed and treated with discriminant analysis. The two levels of
the factor, already a technically nominal classification, become the dependent variable in
the discriminant equation with the several dependent variables (from the MANOVA)
becoming the predictors. In this way, the degree to which each dependent contributed to a
linear equation predicting the correct classification could be assessed.
4. There are a variety of possibilities here about which your students will quite possibly have
some expertise. Some of the factors and levels that could be considered are:
Factor Levels
Brands 3 (required)
Hardness of shocks 3 (soft/medium/hard)
Gearing 2 (low/high)
Weight 2 (light/heavy)
Tire size 3 (car/truck/other)
Tread type 3 (highway/off-road/other)
Number of seats 3 (1/2/4)
If we use the first 5 factors of the list, we have a 3 x 3 x 2 x 2 x 2 design or a 72-option full
concept design.
B. This situation also involves the dependency relationships and presents a metric
dependent variable and at least some independent variables that are metric. The
appropriate multivariate technique is multiple regression, perhaps with some dummy
(0-1) independent variables.
C. A multiple regression equation for the reasons of dependency plus metric predictor
and criterion variables.
D. This appears to call for a statistical technique which relates the various test results
and extracts a fewer number of latent variables or dimensions which "explain" sales
success. This would suggest factor analysis.
Sales: S ($)
Salesperson's level of education: E (Years of education).
Gender: We use a single dummy variable for the two categories. If the salesperson is
male, then M = 1; if female, M = 0.
Consumer income: I ($)
Wealth: W ($)
Ethnicity: Let us consider three categories: Caucasian, Hispanic, and Asian. For
three categories we need two dummy variables.
Asian: A, which takes the value 1 if the consumer is Asian and 0 otherwise.
Hispanic: H, which takes the value 1 if the consumer is Hispanic and 0 otherwise.
The model can then be written S = b0 + b1E + b2M + b3I + b4W + b5A + b6H + e,
where b0 is a constant, b1 through b6 are coefficients, and e is the residual error term.
The instructor may observe that when the dummy variables M, A, and H are all zero,
the predicted value of S is that for a female salesperson selling to a Caucasian consumer.
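A minimal sketch of this dummy-variable regression is given below; the observations are invented, and the point is only that M, A, and H enter as 0/1 indicators so that the all-zero case corresponds to a female salesperson and a Caucasian consumer.

    import pandas as pd
    import statsmodels.api as sm

    df = pd.DataFrame({
        "S": [120, 95, 140, 88, 160, 105, 130, 99, 112, 150],    # sales
        "E": [12, 14, 16, 12, 18, 13, 15, 12, 14, 17],            # years of education
        "M": [1, 0, 1, 0, 1, 0, 1, 0, 1, 0],                      # 1 = male salesperson
        "I": [40, 35, 55, 30, 60, 38, 50, 33, 42, 58],            # consumer income ($000)
        "W": [200, 150, 300, 120, 350, 160, 280, 140, 210, 320],  # consumer wealth ($000)
        "A": [0, 1, 0, 0, 1, 0, 0, 1, 0, 0],                      # 1 = Asian consumer
        "H": [0, 0, 1, 0, 0, 1, 0, 0, 1, 0],                      # 1 = Hispanic consumer
    })
    X = sm.add_constant(df[["E", "M", "I", "W", "A", "H"]])
    model = sm.OLS(df["S"], X).fit()
    print(model.params)   # b0 (constant) plus b1 through b6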
7. A. This particular exercise illustrates the sort of problems one encounters almost
routinely in real life situations, and how the form and scale in which data is collected
restricts the analysis and determines the techniques that can be used.
Exhibit 19-1 (Selecting the Most Common Multivariate Techniques) in the textbook
indicates that if there are no dependent variables, and the variables are non-metric the
possibly usable techniques are non metric factor analysis, latent structure analysis,
non metric cluster analysis, and non metric multidimensional scaling.
Of these, cluster analysis groups similar objects or people. Here cluster analysis
would examine "whether and where is there a clustering" of people in terms of job
satisfaction, promotions and departments. Do people high on job satisfaction seem to
be those who also have many promotions? Do these seem to be from one department
alone? These are the kinds of insights that cluster analysis may provide.
The instructor can point out that the data on promotions could easily have been
collected as "number of promotions" for each participant (ratio scaled), and job
satisfaction data could be collected on a semantic differential seven point scale etc.
which would allow us to use methods appropriate for metric data. If this were done,
then only "Departments" would be nominally scaled.
Taking the data as it is, an exploratory exercise to test the hypothesis that "there is no
relation between departments, promotions and job satisfaction" can be done using the
Chi Square Test (refer to Chapter 16). Further, if hypotheses were formulated and a
dependent variable specified, then depending on the manner in which the data is
collected (metric or non metric) discriminant analysis (non metric dependent
variable) or multiple regression (metric dependent variable) could have been used.
B. Brand choice (3 brands) is a nominally scaled non-metric variable, income levels and
extent of advertising (say, advertising budgets, or the number of advertisements) are
ratio scaled metric variables. With the dependent variable for study being brand
choice, the appropriate methodology would be discriminant analysis.
Another way of approaching the issue would be to treat fabric color choice as the
dependent variable in a discriminant analysis, with income, temperature and ethnicity
being the explanatory or independent variables. Ethnicity is a nominal variable and
to use it in a discriminant analysis procedure would require the use of dummy
variables, in a manner similar to that used in regression analysis. In this case the
discriminant function can be used to extrapolate and "predict" the number of dark
color fabric users, when the data on income, temperature and ethnicity is known.
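The discriminant analysis described above can be sketched with scikit-learn's LinearDiscriminantAnalysis, used here as an accessible stand-in; brand choice is the grouping variable and income and advertising exposure are metric predictors. All observations are invented.

    import numpy as np
    from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

    X = np.array([[30, 2], [35, 3], [52, 5], [58, 6], [75, 9], [80, 8],
                  [33, 2], [55, 5], [78, 9], [40, 4]], dtype=float)   # income ($000), ads seen
    y = np.array(["A", "A", "B", "B", "C", "C", "A", "B", "C", "A"])  # brand chosen

    lda = LinearDiscriminantAnalysis().fit(X, y)
    print(lda.predict([[45.0, 4.0]]))   # classify a new consumer
    print(lda.score(X, y))              # proportion of cases correctly classified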
8. The R2 value indicates that the four independent variables statistically account for 92
percent of the variation in the annual sales. The standard error of estimate is a measure of
the precision of the Y estimates. This value of $11.9 million indicates that, about two out
of three times, the equation's estimate of company sales was within plus or minus $11.9
million of the true sales figure.
9. One way to judge the predictive value of a discriminant analysis is to determine the percent
of correctly classified dependent variable cases. In this problem 210 of 280 or 75 percent
of the cases were correctly classified. From the detailed data we see that the best record
was in predicting which persons would take alternative A.
Using the Factor procedure in SPSS with principal components extraction and varimax
rotation, eighty-three percent of the variance is explained by three factors. The rotated
solution is shown below.
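A rough Python analogue of the principal components step is sketched below with made-up respondent scores; it reports the share of variance retained by three components. The varimax rotation of the retained loadings is not shown and would be applied afterward (for example, with a third-party factor-analysis package).

    import numpy as np
    from sklearn.decomposition import PCA
    from sklearn.preprocessing import StandardScaler

    rng = np.random.default_rng(1)
    scores = rng.normal(size=(50, 6))          # placeholder: 50 respondents, 6 test items
    scores[:, 1] += scores[:, 0]               # build in some shared variance
    scores[:, 3] += scores[:, 2]

    pca = PCA(n_components=3)
    pca.fit(StandardScaler().fit_transform(scores))
    print(pca.explained_variance_ratio_.sum()) # analogous to the "83 percent" figure
    print(pca.components_)                     # unrotated loadings; varimax would follow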
Terms in Review
1. A. A speaker-centered presentation is one in which the focus (generally resulting from the
speaker's sense of inadequacy) relies on memorization of the manuscript. The style of
delivery reflects a preoccupation with the memorized message to the detriment of
establishing rapport with and adapting to the needs of the audience. It is considered
self- or speaker-centered because it is strictly one-way communication. In contrast, an
extemporaneous presentation replaces a script with an organized set of ideas that
may be presented from notes or an outline. This approach takes into consideration the
need for adaptability to the occasion, flexible response to audience feedback, and a
conversational delivery of the message.
B. Technical reports include both a full presentation of the analysis as well as sufficient
procedural information to permit another researcher to replicate the study. The report
structure follows the steps of the research study itself: prefatory items, introduction,
methodology, findings, conclusions, appendixes, and bibliography. The appendixes
contain detailed information such as instrumentation, data analysis methods, and
instructions for the field personnel.
Management reports are written for the nontechnical client. They minimize
methodological details. Conclusions are presented before the findings that support
them. Graphics are used to enhance comprehension. The appendix is short compared
to the technical report and the bibliographic references are often omitted.
C. The topic outline is a format in which only a key word or two are used for each item.
Sentence outlines express the essential thoughts using brief sentences associated with
the specific topic.
B. Tables appearing in the body of the report or presentation should be kept as simple as
possible in order to allow the reader/listener to grasp one or two specific points. In
place of tables, use charts and graphs whenever possible. If complex tables are
necessary in the report, an appropriate place is in the appendix.
C. The physical presentation of the report is critical to its being read. It is especially
important to design the report to fit its audience and to deal adequately with the
knowledge gap that may exist between the writer and reader. The particular form of
the report should be appropriate to the audience, occasion, and importance of the
subject.
D. "Pace" concerns the problem of how quickly concepts are developed and how deeply
they are explained. The pace tends to be slow and the depth limited if the topic is
complex and the audience unsophisticated. If the topic is complex, but the audience
is sophisticated, the writer can assume a certain depth of knowledge and move
quickly and deeply into the subject.
B. A technical report which follows the suggested format of the journal in question.
4. Several different graphic forms are acceptable for each of the cases mentioned.
A. If the intent is to show yearly data for the decade it is probably wise to use a line
diagram on a semilog chart because semilog scales show rates of relative change
much better than do arithmetic scales. If one is showing only the percentage change
between annual income in 1990 and 2000 then an arithmetic line scale diagram may
be used if 1990 for each country is set at 100 and 2000 is expressed as a ratio to
1990. A better choice would be to use a multiple variable bar chart, with both
countries listed above the years, 1990 and 2000. The raw data and the percentage
change can thus be shown on the same chart, using numbers to augment the graphic.
B. Data can be presented in several ways, depending on the message to be delivered. If there
were a desire to compare only the percentages of income spent on items, it would
probably be best to use a horizontal bar chart of the 100% component type. This is
the best way to show comparable allocations among subparts. A vertical version of
the same type of chart is often found and is also acceptable. If, on the other hand, the
absolute dollar comparisons were desired, a vertical stacked bar chart would be a
good choice. Other possibilities include a set of pie charts (for percentages only), a
3-D column chart if the data has been collected for more than one year, or a stratum
chart.
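A small matplotlib sketch of the 100% component (stacked) horizontal bar chart suggested above is shown below; the country names, spending categories, and percentage shares are invented placeholders.

    import matplotlib.pyplot as plt

    categories = ["Housing", "Food", "Transport", "Other"]
    shares = {"Country A": [35, 25, 15, 25],     # percentages summing to 100
              "Country B": [28, 30, 20, 22]}
    colors = ["tab:blue", "tab:orange", "tab:green", "tab:red"]

    fig, ax = plt.subplots()
    for country, values in shares.items():
        left = 0
        for cat, pct, color in zip(categories, values, colors):
            ax.barh(country, pct, left=left, color=color,
                    label=cat if country == "Country A" else None)  # label each category once
            left += pct
    ax.set_xlabel("Percent of income")
    ax.legend(loc="lower right")
    plt.show()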
C. This could be the two-way percent change horizontal bar chart. It is an ideal way for
comparing percent changes between two points for a limited number of entities,
especially when one or more may be negative percent changes. Alternatively, a
multiple variable vertical bar chart could be used to graphically show the changes
from year to year. This would be best done with the years 1996 and 1997 as the
variables and the six firms as the heading for the groups of bars. A third possibility is
to use a line chart with each firm's price shown as a different line.
5. This can be a good exercise to have students prepare in advance and then have
several shown on chalkboard or via transparencies. One suggested outline for the
first assignment is given below. Answers to parts B and C depend on the
circumstances at the time.
I. Prewriting Considerations
A. Purpose of report
B. Define audience
C. Circumstances and limitations
D. How will it be used?
II. Report Choice
A. Informational or research?
B. Long or short?
C. Management or technical?
D. Format?
III. Draft body of report
A. Analysis of data
B. Select important findings
C. Statistical tests
D. Draft conclusions
E. Draft recommendations
F. Review and revise
1. by self
2. by others
IV. Writing report
A. Outline body of report
B. How to present data
1. draft tables
2. set up charts
3. develop graphics
C. Introduction
D. Appendix
E. Synopsis
F. First draft - total report
G. Review and revise
1. by self
2. by others
H. Final printing and finishing touches
V. Submit report
B. All tables should be labeled adequately to ensure that the reader could understand
their contents.
C. Writers should seldom put more than a few statistical figures in the body of the text.
When more need to be presented, the statistics should go into tables or semi-tabular
presentations.
E. Writers should recognize that charts and graphs represent visual comparisons that
have more impact than the labels on the data. For example, the omission of a zero
base line may give an incorrect visual impression even though the scales may be
labeled correctly.
F. Statistical data from which findings are drawn should be adjacent to those findings.