
Statistics Notes (Catatan Statistik FIX)


C1: STATISTICS

The science of collecting, organizing, presenting, analyzing, and interpreting data to assist in making more
effective decisions.
} Descriptive statistics: methods of organizing, summarizing, and presenting data in an informative way.
(They organize data into a meaningful form and summarize it so the information is easy to understand.)
} Inferential statistics: the methods used to estimate a property of a population on the basis of a sample.
(They can be used to estimate properties of a population and to make decisions based on a limited set of data.)
Population: The entire set of individuals/objects of interest or the
measurements obtained from all individuals/objects of interest.

Sample: a portion/part of the population of interest.

Types of Variables
1. Qualitative variable: observed and recorded as a non-numeric characteristic or attribute.
Ex: gender, state of birth, eye color
2. Quantitative variable: observed and recorded numerically; can be discrete or continuous.
Ex: balance in your checking account, the life of a car battery
} Discrete = typically the result of counting; there are “gaps” between the possible values. Examples: the number of rooms
(1, 2, etc.), the number of students in class (326, 421, etc.), always whole numbers. Not possible: 1.5 people or 2.3 rooms.
} Continuous = the result of measuring something, can assume any value within a specific range. Examples:
Duration of flights from A to B (5.25 hours), grade point average (3.258).

Levels of Measurement (determines the type of statistical analysis that can be performed)

1. Nominal Level of Measurement: data is represented as labels, can only be classified & counted.
Examples: classifying M&M candies by color, identifying students at a football game by gender.
2. Ordinal Level of Measurement: data is based on a relative ranking or rating of items based on a
defined attribute or qualitative variable. Variables based on this level of measurement are only ranked
and counted. (The rankings are known but not the magnitude of differences between groups)
Examples: the list of top ten states for best business climate, student ratings of professors.
3. Interval Level of Measurement: the interval or distance between values is meaningful, based
on a scale with a known unit of measurement. (This data has all the characteristics of ordinal level
data, plus the differences between the values are meaningful; there is no natural 0 point.) Examples: the
Fahrenheit temperature scale (zero degrees does not mean no temperature at all), dress sizes.
4. Ratio Level of Measurement: Data based on a scale with a known unit of measurement & a
meaningful interpretation of zero on the scale. (The data has all the characteristics of the interval
scale & ratios between numbers are meaningful, the 0 point represents the absence of the
characteristic) Examples: wages (Zero dollar = no money), changes in stock prices, and height.

C2: DESCRIBING DATA: FREQUENCY TABLES, FREQUENCY DISTRIBUTIONS, &
GRAPHIC PRESENTATION
Frequency Table: a grouping of qualitative data into mutually exclusive and collectively exhaustive classes
showing the number of observations in each class.
} Mutually exclusive means the data fit in just one class. In frequency distributions, classes are mutually
exclusive if each individual, object, or measurement is included in only one category.
} Collectively exhaustive means there is a class for each value.

Constructing Frequency Tables


1. Sort the data into classes.
2. Count the number in each class and report it as the class frequency.
3. Convert each frequency to a relative frequency.

Example
1) A frequency distribution is a grouping of quantitative data into overlapping classes showing the
number of observations in each class
⊚true
⊚false -> Classes in a frequency distribution may not overlap and must be mutually exclusive

2) What is the relative class frequency for the $25 up to $35 class?

9/50 = 0.18

Graphic Presentation of Qualitative Data

Bar Chart: A graph that shows the qualitative classes on the horizontal axis and the
class frequencies on the vertical axis. The class frequencies are proportional to the
heights of the bars. Use a bar chart when you wish to compare the number of
observations for each class of a qualitative variable.

Pie Chart: A chart that shows the proportion or percentage that each class represents of the
total number of frequencies. Use a pie chart when you wish to compare relative differences
in the percentage of observations for each class of a qualitative variable.

Frequency Distribution: A grouping of quantitative data into mutually exclusive and collectively exhaustive
classes showing the number of observations in each class.
Constructing Frequency Distributions

Step 1: Decide on the number of classes.

Use the 2^k > n rule, where
k = the number of classes
n = the number of values in the data set (here n = 180)

2^7 = 128 is not greater than 180, but 2^8 = 256 is, so let k = 8 and use 8 classes.
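The 2^k > n rule can be checked with a few lines of Python. This is only a sketch; the function name `number_of_classes` is made up for illustration and is not part of the notes.

```python
def number_of_classes(n: int) -> int:
    """Return the smallest k such that 2**k > n (the 2^k > n rule)."""
    k = 1
    while 2 ** k <= n:
        k += 1
    return k


# n = 180 observations, as in the example above
print(number_of_classes(180))  # 8, because 2**7 = 128 and 2**8 = 256
```

The rule only suggests a starting point; the final number of classes is still a judgment call.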

Step 2: Determine the class interval, i.

The interval, also referred to as the class width, should satisfy i ≥ (Maximum value − Minimum value)/k.
Round the result up to some convenient number, like a multiple of 10 or 100; here we’ll use an interval of $400.

Step 3: Set the individual class limits. Lower limits should be rounded to an easy-to-read
number when possible.

Step 4: Tally the individual data into the classes and determine the number of observations in each class.
The number of observations is the class frequency.

What is the class midpoint for the $58 up to $68 class?
A) 62.00
B) 62.50
C) 63.00
D) 63.50
Answer: C, since (58 + 68)/2 = 63.

Relative Frequency Distributions

Graphic Presentation of a Frequency Distribution

Histogram: A graph in which the classes are marked on the horizontal axis and the class frequencies on the
vertical axis. The class frequencies are represented by the heights of the bars, and the bars are drawn adjacent
to each other. A histogram shows the shape of a distribution.

A frequency polygon: similar to a histogram, also shows the shape of a distribution. Good to use when
comparing two or more distributions

Cumulative Relative Frequency Distribution


- To construct a cumulative frequency distribution, add each frequency to the frequencies before it
- This shows how many values have accumulated as you move from one class down to the next class
- Divide the cumulative frequencies by the total number of observations

As shown in Table 2-9, the cumulative relative frequency of the fourth class is 80/180 = 44%.
This means that 44% of the vehicles sold for less than $1,800.

Cumulative Frequency Polygon


To plot a cumulative frequency distribution, scale the upper limit of each class
along the X-axis and the corresponding cumulative frequencies along the Y-axis.
Label the vertical axis on the right in terms of cumulative relative frequencies.

C3: DESCRIBING DATA: NUMERICAL MEASURES

Example:
There are 42 exits on I-75 through the state of Kentucky.
Listed below are the distances between exits (in miles).

1. Why is this information a population?


This is a population because we are considering all of the exits in Kentucky.

2. What is the mean number of miles between exits?

Example:

Median: The midpoint of the values after they have been ordered from the minimum to the maximum values.

The number of hours a sample of 10 adults used Facebook last month:


3 5 7 5 9 1 3 9 17 10
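Arranging the data in ascending order gives:
1 3 3 5 5 7 9 9 10 17
With an even number of observations (n = 10), the median is the average of the two middle values, 5 and 7.
Thus, the median is (5 + 7)/2 = 6.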
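As a quick check of the Facebook example, the sketch below sorts the ten values, picks out the two middle ones, and lets Python's standard `statistics` module compute the median. The variable names are illustrative only.

```python
import statistics

# Hours of Facebook use for the sample of 10 adults
hours = [3, 5, 7, 5, 9, 1, 3, 9, 17, 10]

ordered = sorted(hours)
# With an even number of values, the median averages the two middle values
middle_pair = (ordered[len(ordered) // 2 - 1], ordered[len(ordered) // 2])
median_hours = statistics.median(hours)

print(ordered)
print(middle_pair)
print(median_hours)
```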
Mode: The value of the observation that occurs most frequently.

Relative Positions of Mean, Median, and Mode


If a distribution is highly skewed, the mean is probably not a representative measure of central tendency and
the median or mode should be used.

Example:
The Carter Construction Company pays its hourly employees $16.50, $19.00, or $25.00 per hour. There are
26 hourly employees: 14 are paid at the $16.50 rate, 10 at the $19.00 rate, and 2 at the $25.00 rate. What is
the mean hourly rate paid for the 26 employees?
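The Carter Construction question is a weighted mean: each rate is weighted by the number of employees paid at that rate. A minimal sketch follows, with `weighted_mean` as a hypothetical helper name.

```python
def weighted_mean(values, weights):
    """Sum of value * weight divided by the sum of the weights."""
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)


rates = [16.50, 19.00, 25.00]   # hourly rates from the example
employees = [14, 10, 2]         # number of employees at each rate

mean_rate = weighted_mean(rates, employees)
print(round(mean_rate, 2))  # 471 / 26, about 18.12 dollars per hour
```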

Dispersion: Measures of dispersion also allow us to compare two or more distributions.


} Measures of dispersion include the range, the variance, and the standard deviation.

} The range is influenced by extreme values.

Example: Population Variance
The number of traffic citations issued last year, by month, in Beaufort County, South Carolina, is reported.

Determine the population variance.

So, the population variance for the number of citations is 124.


Standard Deviation

The major characteristics of the standard deviation are:


} It is in the same units as the original data
} It is the square root of the average squared distance from the mean
} It cannot be negative
} It is the most widely used measure of dispersion

Interpretations and Uses of the Standard Deviation


THE EMPIRICAL RULE For a symmetrical, bell-shaped frequency distribution, approximately 68% of the
observations will lie within plus and minus one standard deviation of the mean, about 95% of the

observations will lie within plus or minus 2 standard deviations of the mean, and practically all (99.7%) will
lie within 3 standard deviations of the mean.
If we have a symmetrical distribution, we can use the Empirical Rule, sometimes called the Normal Rule.

Here is a symmetrical distribution with a mean of 100 and a standard deviation of 10.

Applying the Empirical Rule, we’ll find about
• 68% of the values between 90 and 110
• 95% of the values between 80 and 120
• 99.7% of the values between 70 and 130
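The three intervals are simply the mean plus or minus 1, 2, and 3 standard deviations. The sketch below reproduces them; the function name is an assumption for illustration, and the rule itself only applies to symmetrical, bell-shaped distributions.

```python
def empirical_rule_intervals(mean, std_dev):
    """Return the mean +/- k standard deviations for k = 1, 2, 3."""
    coverage = {1: "68%", 2: "95%", 3: "99.7%"}
    return {
        label: (mean - k * std_dev, mean + k * std_dev)
        for k, label in coverage.items()
    }


intervals = empirical_rule_intervals(100, 10)
for label, (low, high) in intervals.items():
    print(f"about {label} of the values lie between {low} and {high}")
```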

Sample Mean of Grouped Data

Standard Deviation of Grouped Data

Calculating the Standard Deviation of Grouped Data


Using the Applewood Auto Group frequency distribution, compute the standard deviation of the vehicle profits.

Begin by estimating the mean (in this example, the mean is $1,851). Then find the deviation of each class
midpoint from the mean, square the result, and multiply by the class frequency. Sum these products,
divide by n-1, and finally take the square root of that calculation.
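The steps above can be sketched in Python. The Applewood frequency table is not reproduced in these notes, so the midpoints and frequencies below are made-up numbers used only to show the mechanics.

```python
import math


def grouped_mean_and_std(midpoints, frequencies):
    """Estimate the sample mean and standard deviation from grouped data."""
    n = sum(frequencies)
    mean = sum(f * m for m, f in zip(midpoints, frequencies)) / n
    # Squared deviation of each midpoint from the mean, weighted by frequency
    squared = sum(f * (m - mean) ** 2 for m, f in zip(midpoints, frequencies))
    return mean, math.sqrt(squared / (n - 1))


# Hypothetical grouped data (NOT the Applewood table)
midpoints = [10, 20, 30]
frequencies = [2, 5, 3]

mean, std = grouped_mean_and_std(midpoints, frequencies)
print(mean, round(std, 2))
```

Because every observation in a class is treated as if it sat at the class midpoint, the result is an estimate rather than the exact standard deviation of the raw data.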
C4 DESCRIBING DATA: DISPLAYING AND EXPLORING DATA

Dot Plots: Use dot plots to compare two data sets, such as the number of vehicles serviced last
month at two different dealerships.

Measures of Position
Location of a percentile: Lp = (n + 1)(P/100), where
} n = number of observations
} P = the desired percentile

Quartiles divide a set of observations into four equal parts


The interquartile range is the difference between the third quartile and the first quartile

Example
Morgan Stanley is an investment company with offices located throughout the United States. Listed below
are the commissions earned last month by a sample of 15 brokers
} First, sort the data from smallest to largest

} Next, find the median


L50 = (15+1)*50/100 = 8
So the median is $2,038, the value at position 8
L25 = (15 + 1)(25/100) = 4 and L75 = (15 + 1)(75/100) = 12
Therefore, the first and third quartiles are located at the 4th and 12th
positions, respectively:
Q1 = $1,721, the 4th value.
Q3 = $2,205, the 12th value.
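The location calculation used for the median and quartiles can be written as a one-line function. This is a sketch of the (n + 1)(P/100) convention used in this example; other software may use a different percentile convention, and the function name is illustrative.

```python
def percentile_location(n, percentile):
    """Position of the given percentile in sorted data: (n + 1) * P / 100."""
    return (n + 1) * percentile / 100


# 15 brokers in the Morgan Stanley example
for p in (25, 50, 75):
    print(p, percentile_location(15, p))
```

When the location is not a whole number, the value is usually found by interpolating between the two neighbouring observations.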

} If a distribution of wages, incomes, turnover, etc. is arranged in order, the quartiles are the values which
divide the distribution into four equal parts. Thus, for a distribution of wages:
• The first quartile (Q1) is the wage below which 25% of the wages are situated or the first
quartile is the wage above which 75% of the wages are situated
• The second quartile (Q2) is the wage below which 50% of the wages are situated or the
wage above which 50% of the wages are situated
• The third quartile (Q3) is the wage below which 75 % of the wages are situated or above
which 25% of the wages are situated.

Box Plot: a graphic display that shows the general shape of a variable’s distribution. It is based on five
descriptive statistics: the maximum and minimum values, the first and third quartiles, and the median.

} The interquartile range is Q3 – Q1
} Outliers are values that are inconsistent with the rest of the data and are identified with asterisks in
box plots

Example
Alexander’s Pizza offers free delivery of its pizza within 15 miles. How long does a typical delivery take?
Within what range will most deliveries be completed?
Using a sample of 20 deliveries,
Alexander determined the following:
} Minimum value = 13 minutes
} Q1 = 15 minutes
} Median = 18 minutes
} Q3 = 22 minutes
} Maximum value = 30 minutes
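From the five-number summary we can compute the interquartile range and check for outliers. The sketch below assumes the common convention that an outlier lies more than 1.5 interquartile ranges below Q1 or above Q3; that cutoff is not stated in these notes.

```python
# Five-number summary for the Alexander's Pizza delivery times (minutes)
minimum, q1, median, q3, maximum = 13, 15, 18, 22, 30

iqr = q3 - q1                      # interquartile range
lower_fence = q1 - 1.5 * iqr       # assumed 1.5 * IQR outlier convention
upper_fence = q3 + 1.5 * iqr
has_outliers = minimum < lower_fence or maximum > upper_fence

print(iqr, lower_fence, upper_fence, has_outliers)
```

Half of the deliveries fall between Q1 and Q3, that is, between 15 and 22 minutes.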

Common Shapes of Data


Skewness

} The coefficient of skewness can range from -3 to +3


} A value near -3 indicates considerable negative skewness
} A value of 1.63 indicates moderate positive skewness
} A value of 0 means the distribution is symmetrical
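A coefficient with the -3 to +3 range described here is Pearson's coefficient of skewness, sk = 3(mean − median)/s. The notes do not spell out the formula, so treat this as an assumed reading; the input values below are hypothetical.

```python
def pearson_skewness(mean, median, std_dev):
    """Pearson's coefficient of skewness: 3 * (mean - median) / s."""
    return 3 * (mean - median) / std_dev


# Hypothetical summary statistics, for illustration only
print(pearson_skewness(5.0, 4.0, 2.0))   # positive: mean is above the median
print(pearson_skewness(10, 10, 3))       # zero: symmetrical
```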

Skewness Example
Following are the earnings per share for a sample of 15 software companies for the year 2018. The
earnings per share are arranged from smallest to largest.

} Begin by finding the mean, median, and standard deviation. Find the coefficient of skewness.

} What do you conclude about the shape of the distribution? The distribution is moderately
positively skewed.

Describing the Relationship Between Two Variables Scatter Diagrams

Correlation Coefficient

} Can range from -1.0 to +1.0


} The closer the coefficient is to −1.0 or +1.0, the stronger the relationship
} If r is close to 0.0, there is little or no linear relationship between the variables

Contingency Tables: A table used to classify observations according to two identifiable characteristics.
} It is a cross-tabulation that simultaneously summarizes two variables of interest
} Both variables need only be nominal or ordinal
Example
Applewood Auto Group’s profit comparison
} 90 of the 180 cars sold had a profit above the median and the other half had a profit below it. This
meets the definition of the median.
} The percentages of profits above the median are Kane 48%, Olean 50%, Sheffield 42%, and Tionesta 60%.

C17 INDEX NUMBERS

INDEX NUMBER: number that expresses the relative change in price, quantity, or value compared to a
base period.

Hourly Wages Index Example


According to the Bureau of Labor Statistics, in 2000, the average hourly earnings of production workers was
$14.02. In January 2019, it was $27.56. What is the index of hourly earnings of production workers for
January 2019 based on 2000 data?

P = (Average hourly wage January 2019 / Average hourly wage 2000) x 100
P = ($27.56 / $14.02) x 100 = 196.58
Thus, hourly earnings in January 2019 were 196.58% of hourly earnings in 2000. This means there was a 96.58% increase
in hourly earnings during the period, found by 196.58 – 100.0 = 96.58.
A simple price index is constructed by taking the price in a selected year and dividing it by the price in the
base year and multiplying the result by 100.
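A simple index is one division and one multiplication, as this sketch shows using the hourly earnings figures. The function name `simple_index` is an illustrative choice.

```python
def simple_index(current_value, base_value):
    """Simple index: value in the given period relative to the base period, times 100."""
    return current_value / base_value * 100


wage_index = simple_index(27.56, 14.02)
print(round(wage_index, 2))        # index of hourly earnings, base 2000 = 100
print(round(wage_index - 100, 2))  # percent increase over the base period
```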

Population Index Example


An index can also compare one item to another. The population of the Canadian province of British
Columbia in 2019 was 4,862,610, and for Ontario it was 14,374,084. What is the population index of British
Columbia compared to Ontario?
The index of population for British Columbia is 33.8, found by:
P = (Population of British Columbia / Population of Ontario)(100) = (4,862,610 / 14,374,084)(100) = 33.8
The population of British Columbia is 33.8% (about one-third) of the population of Ontario, or another way
to say that is the population of British Columbia is 66.2% less than the population of Ontario (100 – 33.8 =
66.2).

E-Commerce Sales Index Example


US retail e-commerce sales in 2018 were $504,582,000
In 2010, e-commerce sales were $168,895,000
An increase of $335,687,000

Construction of Index Numbers

Suppose the price of a fall weekend package at Tryon Mountain Lodge in western North Carolina in 2000
was $450. The price rose to $795 in 2019. What is the price index for 2019 using 2000 as the base period
and 100 as the base value?

P = (pt / p0)(100) = ($795 / $450)(100) = 176.7
The fall weekend package increased 76.7% from 2000 to 2019.

Converting to Indexes using Different Base Periods


Below is a table of prices for a Benson Automatic Stapler,
converted to indexes using three different base periods. First, a single year (2015) is used and each year’s
price is divided by $20.
Next, two years (2015-2016) are used as the base; the base price of the stapler would be $21,
found by averaging the price of the stapler in the two years, (20 + 22) ÷ 2 = $21, and each year’s price is divided by 21.

Finally, if we use three years (2015-2017) as the base, the prices $20, $22, and $23 are averaged and then each
year’s price is divided by 21.67 to obtain the price index.
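The effect of the base period can be sketched as follows. Only the three prices quoted in the text ($20, $22, $23 for 2015-2017) are used, since the full table is not reproduced here; the function name is hypothetical.

```python
# Stapler prices quoted in the text (the rest of the table is not available)
prices = {2015: 20, 2016: 22, 2017: 23}


def index_series(prices, base_years):
    """Index each year's price against the average price over the base years."""
    base = sum(prices[year] for year in base_years) / len(base_years)
    return {year: round(price / base * 100, 1) for year, price in prices.items()}


print(index_series(prices, [2015]))              # base price 20
print(index_series(prices, [2015, 2016]))        # base price 21
print(index_series(prices, [2015, 2016, 2017]))  # base price about 21.67
```

The same prices produce different index values under each base, so an index should always be reported together with its base period.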

Unweighted Indexes
} In an unweighted index, we do not consider the quantities

Simple Average of Price Indexes Example


This table reports the prices for several food items in 2009 and 2019. We would like to develop an index
for this food group for 2019 using 2009 prices as the base. This is written 2009=100

First, we use formula 17-1 to compute the simple index for each food item.
For instance, the index for bread is P = (pt / p0)(100) = (1.274 / 1.381)(100) = 92.3

Then average the six simple indexes:
P = ΣPi / n = (92.3 + … + 147.9) / 6 = 591.0 / 6 = 98.5
The mean price of food decreased 1.5% from 2009 to 2019.
Simple Aggregate Price Index

The simple aggregate index for the food items above is found by dividing the sum of prices in
2019 by the sum of the prices in 2009.
P = (Σpt / Σp0)(100) = 103.7
This means that the aggregate group of prices increased 3.7% from 2009 to 2019.

Weighted Indexes: Laspeyres Method

In a weighted index, the quantities are considered. In the Laspeyres method, base period quantities are
used as weights for both the base period and the given period.
} Advantage: Only quantity data from the base period is used which allows for more meaningful
comparison over time
} Disadvantage: Does not reflect changes in buying patterns over time & It may overweight goods
whose prices increase

Laspeyres Price Index Example


First, we determine the total amount spent for the six items in the base period, 2009.
To find this value, multiply the base period (2009) price for bread, $1.381, by the base period quantity of
50.
The result is $69.05. Continue that for all items and total the result. The base period total is $648.23.
The current year total is computed in a similar fashion. For bread, we multiply the quantity in 2009 by the
price of bread in 2019; 50 times $1.274 is $63.70.
We make the same calculation for the other items and total to get $667.783.
P = (Σpt q0 / Σp0 q0)(100) = ($667.783 / $648.23)(100) = 103.0

We conclude the price of this group of items has increased 3.0% from 2009 to 2019.

Weighted Indexes: Paasche Method

} Advantage: Current buying habits are reflected


} Disadvantage
} It requires quantity data for the current year and it tends to overweight goods whose prices
have declined
} It requires the product of prices and quantities to be recomputed each year

Paasche Price Index Example
The following table shows the calculations to determine the Paasche index.
P = (Σpt qt / Σp0 qt)(100) = ($667.78 / $753.00)(100) = 88.7
This result indicates that there has been a decrease of 11.3% in the price of this “market basket” of goods
between 2009 and 2019.

Fisher’s Ideal Index
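Fisher's ideal index is the geometric mean of the Laspeyres and Paasche indexes. The sketch below computes all three for a small two-item basket; the prices and quantities are hypothetical, because the food table from the examples above is not reproduced in these notes.

```python
import math


def laspeyres(p0, pt, q0):
    """Weighted price index using base-period quantities as weights."""
    return sum(p * q for p, q in zip(pt, q0)) / sum(p * q for p, q in zip(p0, q0)) * 100


def paasche(p0, pt, qt):
    """Weighted price index using current-period quantities as weights."""
    return sum(p * q for p, q in zip(pt, qt)) / sum(p * q for p, q in zip(p0, qt)) * 100


def fisher_ideal(laspeyres_index, paasche_index):
    """Geometric mean of the Laspeyres and Paasche indexes."""
    return math.sqrt(laspeyres_index * paasche_index)


# Hypothetical two-item basket (not the data from the notes)
p0, pt = [1.0, 2.0], [1.5, 2.0]   # base and current prices
q0, qt = [10, 5], [8, 6]          # base and current quantities

lasp = laspeyres(p0, pt, q0)
paas = paasche(p0, pt, qt)
print(lasp, paas, round(fisher_ideal(lasp, paas), 1))
```

The Fisher index always lies between the other two, which is why it is used to balance the opposite weighting tendencies of the Laspeyres and Paasche methods.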

Value Index

The prices and quantities sold at the Waleska Clothing Emporium for ties, suits, and shoes for May 2015 and
May 2019 are given in the table. What is the index of value for May 2019 using May 2015 as the base period?

Total sales in May 2019 were $10,600 and in May 2015 were $9,000.

V = (Σpt qt / Σp0 q0)(100) = ($10,600 / $9,000)(100) = 117.8

The value of apparel sales increased 17.8% from May 2015 to May 2019.

Special-Purpose Indexes

Example
The Seattle Chamber of Commerce wants to develop a measure of general business activity. It will be
called the General Business Activity Index of the Northwest and will include department store sales (40%),
regional employment (30%), freight car loadings (10%), and exports from Seattle harbor (20%).

Business activity has increased 57.0% from 2005 to 2010 and 57.1% from 2005 to 2018.

Real Income
Suppose Ms. Watts earned $20,000 per year in the base period of 1982, 1983, and 1984. She has a current
income of $40,000. Note that although her money income has doubled since the base period of 1982-84,
the prices she pays for food, gasoline, clothing, and other items have also doubled. Compute her real
income.
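Real income is money income divided by the price index, times 100. The sketch assumes a CPI of 200 (1982-84 = 100), which is how "prices have doubled" translates into an index value; the function name is illustrative.

```python
def real_income(money_income, cpi):
    """Deflate money income by the CPI (base period = 100)."""
    return money_income / cpi * 100


# Prices doubled since the base period, so the CPI is assumed to be 200
print(real_income(40_000, 200))  # 20000.0: her real income has not changed
```

The same calculation deflates sales: divide each year's sales by that year's price index and multiply by 100.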

Deflating Sales

The sales of Hill Enterprises, a small injection molding company in upstate New York, increased from
1982 to 2018. The owner, Harry Hill, realizes that the price of raw materials used in the production process
has also increased, so Mr. Hill wants to deflate sales to account for the increase in raw materials. What are
the deflated sales for 1990, 2000, 2005, 2010, 2015, and 2018 expressed in constant 1982 dollars?

Purchasing Power of the Dollar
Suppose the Consumer Price Index this month is 200.0 (1982-84 = 100). What is the purchasing power of
the dollar?

Purchasing power of dollar = ($1 / CPI)(100) = ($1 / 200.0)(100) = $0.50

The CPI of 200 indicates that prices have doubled from the years 1982-84 to this month. Thus, the
purchasing power of the dollar has been cut in half. That is, a 1982-84 dollar is only worth 50 cents this
month.

C5 A SURVEY OF PROBABILITY CONCEPTS

Probability: a value between 0 and 1 inclusive that represents the likelihood a particular event happens.

Experiment: a process that leads to the occurrence of one and only one of several possible results.

Outcome: a particular result of an experiment.

Event: a collection of one or more outcomes of an experiment.

Classical Probability

Mutually Exclusive: The occurrence of one event means that none of the other events can occur at the
same time.
Collectively Exhaustive: At least one of the events must occur when an experiment is conducted.

Empirical Probability
The empirical approach is based on relative frequencies: the number of times an event happened is divided
by the total number of observations.
Empirical Probability: The probability of an event happening is the fraction of the time similar events
happened in the past.

Law of Large Numbers: Over a large number of trials, the empirical probability of an event will approach
its true probability


Subjective Probability
Subjective Concept of Probability: The likelihood (probability) of a particular event happening that is
assigned by an individual based on whatever information is available.
Examples of subjective probability are:
} Estimating the likelihood the New England
Patriots will be in the Super Bowl next year
} Estimating the likelihood the U.S. budget deficit
will be reduced by half in the next 10 years

Rules of Addition
} The rules of addition refer to the probability that any two or more
events can occur
} The special rule of addition is used when the events are mutually
exclusive

Rules of Addition Example


A machine fills plastic bags with a mixture of beans, broccoli, and other vegetables. Most of the bags
contain the correct weight, but because of the variation in the size of the beans and other vegetables, a
package might be underweight or overweight. A check of 4,000 packages filled in the past month
revealed:

What is the probability that a particular package will be either underweight or overweight?
P(A or C) = P(A) + P(C) = .025 + .075 = .10

Complement Rule
} The complement rule is used to determine the probability of an event happening by subtracting the
probability of the event not happening from 1

You can also use the complement rule:
P(A or C) = P(~B) = 1 − P(B) = 1 − .900 = .10
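Both routes to .10 can be verified with exact fractions. The package counts below (100 underweight, 3,600 satisfactory, 300 overweight) are back-calculated from the stated probabilities and the 4,000 packages checked, so treat them as an assumption about the missing table.

```python
from fractions import Fraction

total = 4000
# Counts implied by P(A) = .025, P(B) = .900, P(C) = .075 (assumed, table not shown)
counts = {"A_underweight": 100, "B_satisfactory": 3600, "C_overweight": 300}

p = {event: Fraction(count, total) for event, count in counts.items()}

# Special rule of addition: the events are mutually exclusive
p_a_or_c = p["A_underweight"] + p["C_overweight"]
# Complement rule: everything that is not satisfactory
p_not_b = 1 - p["B_satisfactory"]

print(float(p_a_or_c), float(p_not_b))
```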

General Rule of Addition


} The general rule of addition is used when the events are not mutually exclusive

Joint Probability: a probability that measures the likelihood two or more events will happen concurrently.
General Rule of Addition Example:

A sample of 200 tourists in Florida shows 120 went to Disney, 100 went to Busch Gardens, and 60 visited both.

P(Disney) = 120/200 = .60
P(Busch) = 100/200 = .50
P(Disney and Busch) = 60/200 = .30

P(Disney or Busch) = P(Disney) + P(Busch) – P(Disney and Busch)
= .60 + .50 – .30 = .80
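The general rule subtracts the joint probability so that tourists who visited both attractions are not counted twice. A short sketch with the survey counts, using illustrative variable names:

```python
from fractions import Fraction

tourists = 200
disney, busch, both = 120, 100, 60

p_disney = Fraction(disney, tourists)
p_busch = Fraction(busch, tourists)
p_both = Fraction(both, tourists)

# General rule of addition: P(A or B) = P(A) + P(B) - P(A and B)
p_either = p_disney + p_busch - p_both
print(float(p_either))
```

Without the subtraction the sum would be 1.10, which cannot be a probability.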

Special Rule of Multiplication
} The rules of multiplication are applied when two or more events occur simultaneously
} The special rule of multiplication refers to events that are independent

Independence: The occurrence of one event has no effect on the probability of the occurrence of another event.

A survey by the American Automobile Association (AAA) revealed 60% of its members made airline
reservations last year. Two members are selected at random. What is the probability both made airline
reservations last year?
P(R1 and R2) = P(R1)P(R2) = (.60)(.60) = 0.36

General Rule of Multiplication


} The general rule of multiplication refers to events that are not independent
} A conditional probability is the likelihood an event will happen, given that another event has already
happened

Conditional Probability: The probability of a particular event occurring, given that another event has occurred.

} The conditional probability is represented as P(B|A) and is read, the probability of B given A

General Rule of Multiplication Example

A golfer has 12 golf shirts in his closet. Suppose 9 of these shirts are white and the others are blue. He gets
dressed in the dark, so he just grabs a shirt and puts it on. He plays golf two days in a row and does not return
the shirts to the closet. What is the probability both shirts are white?

P(W1 and W2) = P(W1)P(W2|W1) = 9/12 x 8/11 = 72/132 = 0.545

So the likelihood of selecting two shirts and finding them both to be white is about .55. This can be extended to more
than two events.
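The shirt example can be checked with exact fractions; the second factor is conditional because one white shirt has already been removed. The AAA example is included for contrast, since independent events multiply directly.

```python
from fractions import Fraction

# General rule: dependent events (shirts are not returned to the closet)
p_first_white = Fraction(9, 12)
p_second_white_given_first = Fraction(8, 11)
p_both_white = p_first_white * p_second_white_given_first

# Special rule: independent events (two AAA members chosen at random)
p_both_reserved = 0.60 * 0.60

print(p_both_white, round(float(p_both_white), 3))
print(round(p_both_reserved, 2))
```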

Contingency Tables
Contingency Table: A table used to classify sample observations according to two or more identifiable
categories or classes.

One hundred fifty adults were asked if they were older than
50 years of age and the number of Facebook accounts they
used. The following table summarizes the results.


6. Are age and watching movies independent (unrelated)? The answer is no: they are not independent, because age
influences the decision to watch a movie.

Tree Diagrams

Bayes’ Theorem
} Bayes’ Theorem is a method of revising a probability, given that additional information is obtained

Prior Probability: The initial probability based on the present level of information.
Posterior Probability: A revised probability based on additional information.

Suppose 5% of the population of Umen have a disease. A1 represents the part of the population that has the
disease and A2 represents those who do not. Let B denote a test result that shows the disease is present.
P(A1) = 0.05 Individual has the disease
P(A2) = 0.95 Individual does not have the disease
P(B|A1) = 0.90 Test shows the individual has the disease and is correct
P(B|A2) = 0.15 Test incorrectly shows the individual has the disease

Randomly select an individual and perform the test. The test results indicate the disease is present. What is
the probability the test is correct?
Use Bayes’ theorem to solve.

P(A1|B) = P(A1)P(B|A1) / [P(A1)P(B|A1) + P(A2)P(B|A2)]
= (.05)(.90) / [(.05)(.90) + (.95)(.15)] = .0450 / .1875 = .24
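The same revision can be done for any number of mutually exclusive events. The sketch below is a generic helper (the name `posterior` is an assumption for illustration) applied to the disease example.

```python
def posterior(priors, likelihoods):
    """Bayes' theorem: revise each prior P(Ai) given the likelihoods P(B|Ai)."""
    joint = [p * l for p, l in zip(priors, likelihoods)]   # P(Ai) * P(B|Ai)
    total = sum(joint)                                     # P(B)
    return [j / total for j in joint]


priors = [0.05, 0.95]        # P(A1) has the disease, P(A2) does not
likelihoods = [0.90, 0.15]   # P(B|A1), P(B|A2): test says disease is present

revised = posterior(priors, likelihoods)
print(round(revised[0], 2))  # P(A1|B)
```

Even with a positive test, the posterior probability stays well below the test's 90% detection rate because the disease is rare in the population.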

Multiplication Formula
} The multiplication formula states that if there are n ways of doing one thing, and m ways of doing
another thing, then there are m*n ways of doing both

} This can be extended to more than two events. For three events m, n, and x: Total
number of arrangements = (m)(n)(x)

Multiplication Formula Example


When the American Red Cross receives a blood donation, the blood is analyzed and classified by
group and Rh factor. There are four blood groups: A, B, AB, and O. The Rh factor can be either
positive or negative. How many different blood types are there?
Total possible arrangements = (m)(n) = (4)(2) = 8

The Permutation Formula
Permutation: Any arrangement of r objects selected from a single group of n possible objects.

There are three electronic parts to be assembled, so n=3. Because all three are to be inserted into the plug-
in component, r=3.
Note: 0! = 1! = 1

nPr = n!/(n-r)!, so 3P3 = 3!/(3-3)! = (3x2x1)/0! = 6/1 = 6

Label the parts A, B, and C -> ABC BAC CAB ACB BCA CBA

The Combination Formula

The Grand 16 movie theater uses teams of three employees to work the concession stand each evening.
There are seven employees available to work. How many different teams can be scheduled?
7C3 = n!/(r!(n-r)!) = 7!/(3!(7-3)!) = 7!/(3!4!) = 35
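Python's standard library (version 3.8 or later) has both counting formulas built in, which makes it easy to check the two examples. The listing of arrangements uses `itertools.permutations`.

```python
import math
from itertools import permutations

# Permutations: order matters (three parts A, B, C, all three used)
arrangements = ["".join(p) for p in permutations("ABC")]
print(arrangements, math.perm(3, 3))

# Combinations: order does not matter (teams of 3 from 7 employees)
print(math.comb(7, 3))
```

A team of employees is the same team regardless of the order in which its members are listed, which is why the combination count (35) is smaller than the corresponding permutation count (7P3 = 210).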

C6 PROBABILITY CASE STUDY


} A survey of top executives revealed that 35% of them regularly read Time magazine, 20% read
Newsweek, and 40% read U.S. News & World Report. A total of 10% read both Time and U.S.
News & World Report. What is the probability that a particular top executive reads either Time or
U.S. News & World Report regularly?
A) 0.85 B) 0.06
C) 1.00 D) 0.65

The three events—reading Time, Newsweek, or U.S. News & World Report
—are not mutually exclusive because executives can read more than one of the magazines.
The P(Time or U.S. News) = P(Time) + P(U.S. News) − P(Time and U.S. News)
= 0.35 + 0.40 − 0.10 = 0.65.
} There are 25 AAA batteries in a box and 7 are defective. Two batteries are selected without
replacement. What is the probability of selecting a defective battery followed by another defective
battery?
A) 1/2 or 0.50 B) 1/8 or 0.13
C) 1/700 or about 0.0014 D) 7/100 or about 0.07

We use the general rule of multiplication to solve this problem because the selections are not independent.
The probability of a defective battery on the first selection is 7/25 = 0.28. The probability of selecting
a second defective battery is a conditional probability that assumes the first selection was defective,
so the probability of a second defective battery is 6/24. The joint probability is
(7/25)(6/24) = 0.07, so the answer is D.

} A developer of a new subdivision wants to build homes that are all different. There are three different
interior plans that can be combined with any of five different home exteriors. How many different
homes can be built?
A) 8 B) 10
C) 15 D) 30

Using the multiplication formula, (3)(5) = 15.

} The ABCD football association is considering a Super Ten Football Conference. The top 10 football
teams in the country, based on past records, would be members of the Super Ten Conference. Each
team would play every other team in the conference during the season and the team winning the most
games would be declared the national champion. How many games would the conference
commissioner have to schedule each year? (Remember, Oklahoma versus Michigan is the same as
Michigan versus Oklahoma.)
A) 45 B) 50
C) 125 D) 14

Because the order of the two teams does not matter, we use the combination formula with n = 10 and r = 2:
10!/(2!(10 − 2)!) = 90/2 = 45.

} Alonzo, Bob, and Casper work bussing tables at a restaurant. Alonzo has a 40% chance, Bob has a
25% chance, and Casper has a 35% chance of bussing tables in the middle area of the restaurant. If
Alonzo is bussing tables, he has a 5% chance of breaking a dish. If Bob is bussing tables, he has a
2% chance of breaking a dish. Finally, if Casper is bussing tables, he has a 4% chance of breaking a
dish. If there is a broken dish in the middle of the restaurant, what is the probability it was broken by
Bob?
A) 0.014 B) 0.128
C) 0.359 D) 0.513

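By Bayes' theorem, P(Bob|Break) = (0.25)(0.02) / [(0.40)(0.05) + (0.25)(0.02) + (0.35)(0.04)] = 0.005/0.039 = 0.128,
so the answer is B.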

C6 DISCRETE PROBABILITY DISTRIBUTIONS
Probability Distribution: a listing of all the outcomes of an experiment and the probability associated with
each outcome.

CHARACTERISTICS OF A PROBABILITY DISTRIBUTION


1. The probability of a particular outcome is between 0 and 1 inclusive.
2. The outcomes are mutually exclusive.
3. The list of outcomes is exhaustive. So the sum of the probabilities of the outcomes is equal to 1.

Probability Distribution Example


} Suppose we are interested in the number of heads showing face up with 3 tosses of a coin
} The possible outcomes are 0 heads, 1 head, 2 heads, and 3 heads

There are 8 possible outcomes when tossing a coin three times: each toss shows either a head or a tail (two
possibilities), and the toss is repeated 3 times, so (2)(2)(2) = 8.

Using this data we can construct a probability distribution.

[Table and chart: probability distribution for the events of zero, one, two, and three heads]
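
The distribution for the number of heads can be rebuilt by listing the eight outcomes and counting. The sketch below assumes a fair coin, so every outcome is equally likely; the variable names are illustrative.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

# Enumerate all (2)(2)(2) = 8 equally likely outcomes of three tosses.
outcomes = list(product("HT", repeat=3))

# Count how many outcomes show 0, 1, 2, or 3 heads.
counts = Counter(toss.count("H") for toss in outcomes)

# Probability of each number of heads = favorable outcomes / total outcomes.
distribution = {
    heads: Fraction(counts[heads], len(outcomes)) for heads in sorted(counts)
}

for heads, prob in distribution.items():
    print(heads, prob)
```

The probabilities are 1/8, 3/8, 3/8, and 1/8, and they sum to 1, as the characteristics listed above require.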

Random Variables
Random Variable: quantity resulting from an experiment that, by chance, can assume different values.
Examples
} The number of employees absent from the day shift on Monday: the number might be 0, 1, 2, 3,
and so on. The number absent is the random variable.
} The grade level (Freshman, Sophomore, Junior, or Senior) of the members of the St. James High
School Varsity girls’ basketball team. Grade level is the random variable (and notice that it is a
qualitative variable).

Sample, Random Variable and Outcome

Two Types of Random Variables
1. Discrete Random Variable:
A random variable that can assume only certain clearly separated values. (usually the result of counting)
Example: Tossing a coin three times and counting the number of heads
For example, the Bank of the Carolinas counts the number of credit cards carried by a group of
customers. The number of cards carried is the discrete random variable.

2. Continuous Random Variable


A random variable that may assume an infinite number of values within a given range. (usually the
result of measuring)
Examples
} The flight time between Atlanta and LA: 4.67 hours, 5.13 hours, and so on
} The annual snowfall in Minneapolis, MN measured in inches
} A random variable represents the outcome of an experiment.
⊚ true
⊚ false
Random variables are defined as a quantity resulting from an experiment that, by chance, can
assume different values.
} A probability distribution is a mutually exclusive and collectively exhaustive listing of
experimental outcomes that can occur by chance, and their corresponding probabilities
⊚ true
⊚ false
This is what a probability distribution shows. All possible outcomes are listed, and the
probability of each outcome is shown.

Mean and Variance of a Probability Distribution

} The mean is a typical value used to represent the central location of the data
} The mean is also referred to as the expected value

} The amount of spread (or variation) in the data is described by the variance
The standard deviation of the probability distribution is the positive square root of the variance

Probability Distribution Mean Example


John Ragsdale sells new cars for Pelican Ford. John usually sells the most cars on Saturday. He has
developed a probability distribution for the number of cars he expects to sell on Saturday.

1. What type of distribution is this? This is a discrete probability distribution

2. How many cars does John expect to sell on a typical Saturday? We find John can expect to sell 2.1
cars on a typical Saturday
In other words, over the long run, say 50 Saturdays in a year,
he can expect to sell (50)(2.1) = 105 cars.

3. What is the variance?

Take the (positive) square root of the variance to get the standard deviation.
The mean is 2.1, the variance is 1.290, and the standard deviation is 1.136 cars.
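
The mean is computed as μ = Σ[xP(x)] and the variance as σ² = Σ[(x − μ)²P(x)]. John's probability table is not reproduced in these notes, so the sketch below uses an assumed distribution that happens to reproduce the stated mean, variance, and standard deviation; treat the individual probabilities as hypothetical.

```python
import math

# ASSUMED distribution (the original table is not shown in the notes):
# number of cars sold on Saturday -> probability.
cars_sold = {0: 0.10, 1: 0.20, 2: 0.30, 3: 0.30, 4: 0.10}

# Mean (expected value): sum of x * P(x).
mean = sum(x * p for x, p in cars_sold.items())

# Variance: sum of (x - mean)^2 * P(x).
variance = sum((x - mean) ** 2 * p for x, p in cars_sold.items())

# Standard deviation: positive square root of the variance.
std_dev = math.sqrt(variance)

print(round(mean, 1), round(variance, 3), round(std_dev, 3))
```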

Binomial Distribution
There are four requirements of a binomial probability distribution
1. There are only two possible outcomes on each trial, a success or a failure, and the outcomes are mutually
exclusive
2. The number of trials is fixed and known
3. The probability of a success is the same for each trial
4. Each trial is independent of any other trial
The binomial distribution is a widely occurring discrete probability distribution.

} Example: A young family has two children, both boys. The probability of the third birth being a boy
is still .50. The gender of the third child is independent of the gender of the other two.
} Winning the lottery or not
} Hitting a red light on your way to work or not
} Developing a side effect from a certain drug or not
} She says “Yes” or not

Binomial Probability Experiment

} Use the number of trials, n, and the probability of a success, π, to compute a binomial probability
1. An outcome on each trial of an experiment is classified into one of two mutually exclusive categories
— a success or a failure.
2. The random variable is the number of successes in a fixed number of trials.
3. The probability of success is the same for each trial.

4. The trials are independent, meaning that the outcome of one trial does not affect the outcome of any
other trial.
Note: Do not confuse the symbol π, the probability of success, with the mathematical constant 3.1416

Recently, www.creditcards.com reported that 28% of purchases at coffee shops were made with a debit
card. For 10 randomly selected purchases at the Starbucks on the corner of 12th Street and Main:

What is the probability that no purchases were made with a debit card?
P(x) = nCx (π)^x (1 − π)^(n − x)
P(0) = 10C0 (.28)^0 (1 − .28)^(10 − 0) = 0.0374

What is the probability that exactly one was made with a debit card?
P(1) = 10C1 (.28)^1 (1 − .28)^(10 − 1) = 0.1456
The probability that exactly one of the 10 purchases is made with a debit card is .1456 or 14.56 percent
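
The binomial formula translates directly into a few lines of Python. This is a sketch using only the standard library; the function name `binomial_pmf` is an illustrative choice.

```python
from math import comb

def binomial_pmf(x, n, pi):
    """P(x) = nCx * pi^x * (1 - pi)^(n - x)."""
    return comb(n, x) * pi**x * (1 - pi) ** (n - x)

# 10 purchases, 28% chance each is made with a debit card.
p_none = binomial_pmf(0, 10, 0.28)
p_one = binomial_pmf(1, 10, 0.28)

print(round(p_none, 4), round(p_one, 4))
```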

Binomial Probability Distribution


Recently, www.creditcards.com reported that 28% of purchases at coffee shops were made with a debit card.
For 10 randomly selected purchases at the Starbucks on the corner of 12th Street and Main:
} What is the probability that exactly one of the purchases was made with a debit card?
} What is the probability distribution for the random variable, number of purchases made with a debit card?
} What is the probability that six or more purchases out of 10 are made with a debit card?
} What is the probability that five or fewer purchases out of 10 are made with a debit card?
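
One way to answer these questions without a printed table is to build the whole distribution and add up the relevant rows. The sketch below is an illustration under the stated n = 10 and π = .28; the variable names are assumptions made here.

```python
from math import comb

def binomial_pmf(x, n, pi):
    """Probability of exactly x successes in n independent trials."""
    return comb(n, x) * pi**x * (1 - pi) ** (n - x)

n, pi = 10, 0.28

# Full probability distribution: index x holds P(x) for x = 0..10.
distribution = [binomial_pmf(x, n, pi) for x in range(n + 1)]

# Cumulative probabilities are sums over the relevant outcomes.
p_five_or_fewer = sum(distribution[:6])   # x = 0, 1, 2, 3, 4, 5
p_six_or_more = sum(distribution[6:])     # x = 6, 7, 8, 9, 10

for x, p in enumerate(distribution):
    print(x, round(p, 4))
print(round(p_five_or_fewer, 4), round(p_six_or_more, 4))
```

Because "five or fewer" and "six or more" cover every outcome, the two cumulative probabilities must add to 1.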

Shortcut Formulas
Mean of a binomial distribution: μ = nπ
Variance of a binomial distribution: σ² = nπ(1 − π)

Binomial Probability Tables
In the Southwest, 5% of all cell phone calls are dropped. What is the probability that out of six randomly
selected calls, none was dropped? Exactly one?

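From the binomial table with n = 6 and π = .05, the probability that none of the six calls was dropped is
.735, and the probability of exactly one dropped call in a sample of six calls is .232.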

Cumulative Binomial Probability Distributions

And the probability of selecting 12 cars and finding that the occupants of 7 or more vehicles were
wearing seat belts is .9562

Hypergeometric Distribution
} When sampling from relatively small populations without replacement, use the hypergeometric
distribution

Example Hypergeometric Distribution:
Play Time Toys Inc. employs 50 people in the Assembly Dept. 40 of the employees belong to a union and
10 do not. Five employees are selected at random to form a committee. What is the probability that four of
the five belong to a union?

P(4) = [(40C4)(10C1)]/(50C5) = [(91,390)(10)]/2,118,760 = .431

Thus, the probability of selecting 5 assembly workers at random from the 50 workers and finding 4 of the
5 are union members is 0.431.
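
The hypergeometric calculation can be checked with a short Python sketch. The function name and the parameter names (N for population size, S for successes in the population, n for sample size) are illustrative choices.

```python
from math import comb

def hypergeometric_pmf(x, N, S, n):
    """P(x) = (S C x) * ((N - S) C (n - x)) / (N C n), sampling without replacement."""
    return comb(S, x) * comb(N - S, n - x) / comb(N, n)

# 50 employees, 40 union members, committee of 5, exactly 4 union members.
p_four_union = hypergeometric_pmf(4, N=50, S=40, n=5)
print(round(p_four_union, 3))  # 0.431
```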

Poisson Distribution

Poisson Distribution Example


Budget Airlines is a seasonal airline that operates flights from Myrtle Beach, South Carolina, to various
cities in the northeast. Recently Budget has been concerned about the number of lost bags. Ann Poston from
the Analytics Department was asked to study the issue. She randomly selected a sample of 500 flights and
found that a total of twenty bags were lost on the sampled flights.
The mean number of bags lost per flight, μ, is found by 20/500 = .04.
The probability that no bags are lost is found using formula 6-7:
P(0) = μ^x e^(−μ) / x! = (.04)^0 e^(−.04) / 0! = .9608

Then calculate the probability that one or more bags is lost:
P(x ≥ 1) = 1 − P(0) = 1 − (.04)^0 e^(−.04) / 0! = 1 − .9608 = .0392
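
The Poisson formula P(x) = μ^x e^(−μ)/x! is easy to evaluate directly. A minimal Python sketch for the lost-bag example follows; the function name `poisson_pmf` is an illustrative choice.

```python
from math import exp, factorial

def poisson_pmf(x, mu):
    """P(x) = mu^x * e^(-mu) / x! for a Poisson distribution with mean mu."""
    return mu**x * exp(-mu) / factorial(x)

mu = 20 / 500                 # mean number of bags lost per flight
p_none = poisson_pmf(0, mu)   # no bags lost
p_one_or_more = 1 - p_none    # complement rule

print(round(p_none, 4), round(p_one_or_more, 4))
```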

Poisson Probability Distribution Tables

NewYork-LA Trucking company finds the mean number of breakdowns on the New York to Los Angeles
route is 0.30. From the table, we can locate the probability of no breakdowns on a particular run. Find the
column 0.3, then read down that column to the row labeled 0; the value is .7408. The probability of 1
breakdown is .2222

Case Study Chapter 6

A total of 60% of the customers of a fast-food chain order a hamburger, French fries, and a drink. If a
random sample of 15 cash register receipts is selected, what is the probability that 10 or more will show
that the above three food items were ordered?
A) 1.000 B) 0.186
C) 0.403 D) 0.000

Applying the binomial distribution, go to the binomial probability table, find the case where the number of
trials is n = 15, and the probability of success is π = 0.60. Find the row where x, the number of successes, is
10. Finally, add the probabilities for 10 through 15 successes
(0.186 + 0.127 + 0.063 + 0.022 + 0.005 + 0.000). The result is 0.403.

A management professor receives an average of five e-mail messages per day from students. Assume the
number of messages approximates a Poisson distribution. What is the probability that on a randomly
selected day she will have five messages?
A) 0.0067 B) 0.8750
C) 0.1755 D) 1.0000

Applying the Poisson probability distribution, the mean of the distribution is 5. Referring to the Poisson
probability tables or using the Poisson probability formula for x = 5 and a mean of 5, the probability of five
messages is 0.1755.

A committee of three people needs to be chosen. There are three men and five women available to serve on
the committee. If the committee members are randomly chosen, what is the probability that two of the three
people chosen on the committee are men?

A) 0.667
B) 0.536
C) 0.268
D) 0.376

The hypergeometric distribution applies here as selection is without replacement from a finite population.
Here, N = 8 (three men, five women), n = 3 (the committee size), S = 3 (the number of men in the population),
and x = 2 (the number of men selected for the committee). Using formula 6-6, P(x=2) = [(3C2) × (5C1)] /(8C3)
= 15/56 = 0.268.

C7 CONTINUOUS PROBABILITY DISTRIBUTIONS


Continuous Probability Distributions
} A continuous probability distribution usually results from measuring something, such as the distance
from the dormitory to the classroom, the weight of an individual, or the amount of bonus earned by
CEOs
} It is important to realize that a continuous random variable has an infinite number of values within
a particular range. So, for a continuous random variable, probability is for a range of values
} The probability for a specific value of a continuous random variable is 0

Uniform Distribution
} It is rectangular in shape. The height of the distribution
is constant or uniform for all values between a and b.
} The mean and the median are equal
} It is completely described by its minimum value a and
its maximum value b

Uniform Distribution Example
Southwest Arizona State University provides bus service to students while they are on campus. A bus
arrives at the North Main Street and College Drive stop every 30 minutes between 6 a.m. and 11 p.m.
during weekdays. Students arrive at the bus stop at random times. The time that a student waits is
uniformly distributed from 0 to 30 minutes.

The minimum wait time is 0 minutes and the maximum wait time is 30 minutes, so the range of the
distribution is 30 minutes. The height is
1/(b-a)= 1/30 = .0333

The area of the uniform distribution is found by multiplying height × base:
Area = [1/(b − a)](b − a) = 1.00
The mean is μ = (a + b)/2 = (0 + 30)/2 = 15
The standard deviation is σ = √[(b − a)²/12] = √[(30 − 0)²/12] = 8.66

To find the probability that a student will wait more than 25 minutes, find the area between 25 and 30 minutes.

P(25 < wait time < 30) = (height)(base) = [1/(30 − 0)](30 − 25) = 0.1667

To find the probability that a student will wait between 10 and 20 minutes, find the area between 10 and 20
minutes.

P(10 < wait time < 20) = (height)(base) = [1/(30 − 0)](20 − 10) = .3333
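
The height-times-base rule for the bus example can be written as a small Python helper. This is a sketch; the function name `uniform_probability` and the clipping of the interval to [a, b] are choices made here for illustration.

```python
from math import sqrt

def uniform_probability(low, high, a, b):
    """P(low < X < high) for X uniformly distributed between a and b."""
    # Clip the interval to [a, b]; values outside the range contribute no area.
    low, high = max(low, a), min(high, b)
    if high <= low:
        return 0.0
    height = 1 / (b - a)
    return height * (high - low)  # area of the rectangle

a, b = 0, 30  # minimum and maximum wait time in minutes
mean = (a + b) / 2
std_dev = sqrt((b - a) ** 2 / 12)

p_more_than_25 = uniform_probability(25, 30, a, b)
p_10_to_20 = uniform_probability(10, 20, a, b)
print(mean, round(std_dev, 2), round(p_more_than_25, 4), round(p_10_to_20, 4))
```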

} The time to fly between New York City and Chicago is uniformly distributed with a minimum of 120
minutes and a maximum of 150 minutes. What is the probability that a flight is between 125 and 140
minutes?
A) 1.00 B) 0.50 C) 0.33 D) 0.67

The probability is computed as the area under the curve. For a uniform distribution, it is the area of a
defined rectangle. In this case, the base is (140 − 125) and the height is 1/(150 − 120), so the probability is
(140 − 125) × [1/(150 − 120)] = 15/30 = 0.5.

} What is the probability that a flight is more than 140 minutes?


A) 1.000 B) 0.500 C) 0.333 D) 0.067

The probability is computed as the area under the curve. For a uniform distribution, it is the area of a
defined rectangle. In this case, the base is (150 − 140) and the height is 1/(150 − 120), so the probability is
(150 − 140) × [1/(150 − 120)] = 10/30 = 0.333.

} What is the probability that a flight is less than 135 minutes?


A) 1.00 B) 0.50 C) 0.25 D) 0.00

In this case, the base is (135 − 120) and the height is 1/(150 − 120), so the probability is
(135 − 120) × [1/(150 − 120)] = 15/30 = 0.5.
Normal Probability Distribution
characteristics:
} It is bell-shaped and has a single peak at the center of the distribution
} The distribution is symmetric
} It is asymptotic, meaning the curve approaches but never touches the X-axis
} It is completely described by its mean and standard deviation
} There is a family of normal probability distributions
} Another normal probability distribution is created when either the mean or the stdev changes

The formula for normal probabilities is complex, but we will not need to use it.
Use the table given in Appendix B.3 instead.

The Normal Curve

Family of Normal Probability Distributions

From Normal Probability Distribution to Standard Normal Probability Distribution


} The number of normal distributions is unlimited, each having a different mean (μ), standard deviation
(σ), or both
} Fortunately, one member of the family can be used to determine the probabilities for all normal
probability distributions. It is called the standard normal probability distribution
} it is unique because it has a mean of 0 and a standard deviation of 1

Standard Normal Probability Distribution


z VALUE: The signed distance between a selected value, designated x, and the mean, μ, divided by
the standard deviation, σ.

Any normal probability distribution can be converted into a standard normal probability distribution
by subtracting the mean from each observation and dividing this difference by the standard deviation.
The results are called z values or z scores
Once the normally distributed observations are standardized, the z values are normally distributed
with a mean of 0 and a standard deviation of 1
Areas Under the Normal Curve
} Here is a portion of the “z” Table
} For example, if you have a z=1.50, this reflects an area (or probability) of .4332.
The entire table can be found in Appendix B.3.
Standard Normal Probability Example
Rideshare services are available internationally. A customer uses a smartphone app to request a ride.
Then, a driver receives the request, picks up the customer, and takes the customer to the desired
location. No cash is involved; the payment for the transaction is handled digitally.
Suppose the weekly income of rideshare drivers follows the normal probability distribution with a
mean of $1,000 and a standard deviation of $100. What is the z value of income for a driver who
earns $1,100 per week? For a driver who earns $900 per week?

What is the z-value of income for a driver who earns $1,100?
z = (x − μ)/σ = ($1,100 − $1,000)/$100 = 1.00

What is the z-value of income for a driver who earns $900?
z = (x − μ)/σ = ($900 − $1,000)/$100 = −1.00

Regardless of whether z is +1 or −1, the area under the curve between the mean and z is .3413
A z of 1.00 indicates that a weekly income of $1,100 is one standard deviation above the mean and
a z of -1.00 shows that a $900 weekly income is one standard deviation below the mean. Both
incomes are the same distance from the mean.

The Empirical Rule

The Empirical Rule Example
As part of its quality assurance program, the Autolite Battery Company conducts tests on battery life. For a
particular D-cell alkaline battery, the mean life is 19 hours. The useful life of the battery follows a normal
distribution with a standard deviation of 1.2 hours.

1. About 68% of the batteries failed between what two values?


19 ± 1(1.2) hours;
About 68% of batteries will fail between 17.8 and 20.2 hours.
2. About 95% of the batteries failed between what two values?
19 ± 2(1.2) hours;
About 95% of batteries will fail between 16.6 and 21.4 hours.
3. Virtually all of the batteries failed between what two values?
19 ± 3(1.2) hours;
Practically all of the batteries will fail between 15.4 and 22.6 hours.

Finding Areas under the Normal Curve


What is the z-value of income for a driver who earns $1,100?
z = (x − μ)/σ = ($1,100 − $1,000)/$100 = 1.00

Using the weekly incomes of Uber drivers:
P($1,000 < weekly income < $1,100) = 0.3413
P(weekly income < $1,100) = 0.3413 + 0.5000 = 0.8413
Therefore, 34.13% of drivers earn between $1,000 and $1,100 and 84.13% of drivers earn less than $1,100.

What is the z-value of income for a driver who earns $790?
z = (x − μ)/σ = ($790 − $1,000)/$100 = −2.10

Using the weekly incomes of Uber drivers:
P($790 < weekly income < $1,000) = 0.4821
P(weekly income < $790) = 0.5000 − 0.4821 = 0.0179
Therefore, 48.21% of drivers earn between $790 and $1,000 and 1.79% of drivers earn less than $790.

What is the z-value of income for a driver who earns $840?
z = (x − μ)/σ = ($840 − $1,000)/$100 = −1.60

What is the z-value of income for a driver who earns $1,200?
z = (x − μ)/σ = ($1,200 − $1,000)/$100 = 2.00

Using the weekly incomes of Uber drivers:


P($840 <weekly income < $1,000) = .4452
P($1,000 <weekly income < $1,200) = .4772
P($840 < weekly income < $1,200) = .4452 + .4772 = .9224

What is the z-value of income for a driver who earns $1,250?
z = (x − μ)/σ = ($1,250 − $1,000)/$100 = 2.50

What is the z-value of income for a driver who earns $1,150?
z = (x − μ)/σ = ($1,150 − $1,000)/$100 = 1.50

Using the weekly incomes of Uber drivers:


P($1,000 <weekly income < $1,250) = .4938
P($1,000 <weekly income < $1,150) = .4332
P($1,150 < weekly income < $1,250) = .4938 − .4332 = .0606

In this example, about 6% of the drivers earn between $1,150 and $1,250.
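
The table lookups in the income examples can be cross-checked in Python. The sketch below uses `statistics.NormalDist` from the standard library (Python 3.8 or later) in place of the Appendix B.3 table, so results may differ from table values in the fourth decimal place; the helper name `prob_between` is illustrative.

```python
from statistics import NormalDist

# Weekly income: normal with mean $1,000 and standard deviation $100.
income = NormalDist(mu=1000, sigma=100)

def prob_between(dist, low, high):
    """P(low < X < high) as the difference of two cumulative probabilities."""
    return dist.cdf(high) - dist.cdf(low)

p_below_1100 = income.cdf(1100)
p_below_790 = income.cdf(790)
p_840_to_1200 = prob_between(income, 840, 1200)
p_1150_to_1250 = prob_between(income, 1150, 1250)

print(round(p_below_1100, 4), round(p_below_790, 4))
print(round(p_840_to_1200, 4), round(p_1150_to_1250, 4))
```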

Finding a Value for x Using z
Layton Tire and Rubber Company wishes to set a minimum mileage guarantee on its new MX100 tire. Tests
reveal the mean mileage is 67,900 with a standard deviation of 2,050 miles and that the distribution follows
the normal distribution. Let x represent the minimum guaranteed mileage and use the formula for z to solve
so that no more than 4% of tires need to be replaced.

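From the table, the area closest to 0.5000 − 0.0400 = 0.4600 is 0.4599, which gives z = −1.75. So
−1.75 = (x − 67,900)/2,050, and x = 67,900 − 1.75(2,050) = 64,312 miles.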
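
Solving z = (x − μ)/σ for x gives x = μ + zσ. The sketch below works the tire example both with the table value z = −1.75 and with an exact inverse from `statistics.NormalDist` (Python 3.8 or later); the small gap between the two comes from table rounding. Variable names are illustrative.

```python
from statistics import NormalDist

mu, sigma = 67_900, 2_050   # mean and standard deviation of tire mileage
replace_share = 0.04        # at most 4% of tires should fall below the guarantee

# Using the table value z = -1.75 quoted in the notes.
table_guarantee = mu + (-1.75) * sigma

# Using the exact inverse of the standard normal CDF instead of the table.
z_exact = NormalDist().inv_cdf(replace_share)
exact_guarantee = NormalDist(mu, sigma).inv_cdf(replace_share)

print(table_guarantee, round(z_exact, 4), round(exact_guarantee, 1))
```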

The Family of Exponential Distributions


There is not just one exponential distribution, but a family of them.
λ (lambda) is the rate parameter.
The lower the rate parameter, the “less skewed” the shape of the
distribution.

The exponential distribution usually describes situations such as:


} The time between “hits” on a website
} The time until the next phone call arrives in a customer service center.
etc.

Poisson vs Exponential Distribution
} To explain the relationship between the Poisson and the exponential distributions, suppose customers
arrive at a family restaurant during the dinner hour at a rate of six per hour
} The Poisson distribution would have a mean of 6. For a time interval of 1 hour, we can use the Poisson
distribution to find the probability that one, or two, or ten customers arrive.
} But suppose instead of studying the number of customers arriving in an hour, we wish to study the time
between their arrivals
} The time between arrivals is a continuous distribution because time is measured as a continuous random
variable.

Exponential Distribution Formulas

Probability that the arrival time is LESS than x: P(arrival time < x) = 1 − e^(−λx)

Probability that the arrival time is MORE than x: P(arrival time > x) = e^(−λx)

Mean = 1/λ = μ, so the rate parameter λ is equal to 1/μ; e = 2.71828

Exponential Distribution Example


Orders for prescriptions arrive at a pharmacy website according to an exponential probability distribution at
a mean of one every 20 seconds.
a. Find the probability the next order arrives in less than 5 seconds.

b. Find the probability the next order arrives in more than 40 seconds.
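
Both parts follow from the exponential formulas with λ = 1/20 per second. A minimal Python sketch, with function names chosen here for illustration:

```python
from math import exp

def exp_less_than(x, mean):
    """P(arrival time < x) = 1 - e^(-lambda * x), with lambda = 1 / mean."""
    rate = 1 / mean
    return 1 - exp(-rate * x)

def exp_more_than(x, mean):
    """P(arrival time > x) = e^(-lambda * x), with lambda = 1 / mean."""
    rate = 1 / mean
    return exp(-rate * x)

mean_seconds = 20  # one order every 20 seconds on average

p_a = exp_less_than(5, mean_seconds)    # a. less than 5 seconds
p_b = exp_more_than(40, mean_seconds)   # b. more than 40 seconds

print(round(p_a, 4), round(p_b, 4))
```

With these inputs the results are 1 − e^(−0.25) for part a and e^(−2) for part b, about 0.2212 and 0.1353.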

CHAPTER 7 PRACTICE PROBLEMS
A uniform distribution is defined over the interval from 6 to 10.

a. What are the values for a and b?
a = 6, b = 10
b. What is the mean of this uniform distribution?
(6 + 10)/2 = 8
c. What is the standard deviation?
1.1547
d. Show that the probability of any value between 6 and 10 is equal to 1.0.
Area is a rectangle, so height × base = [1/(10 – 6)](10 – 6) = 1
e. What is the probability that the random variable is more than 7?
[1/(10 – 6)](10 – 7) = 0.75
f. What is the probability that the random variable is between 7 and 9?
[1/(10 – 6)](9 – 7) = 0.5
g. What is the probability that the random variable is equal to 7.91?
P(x = 7.91) = 0. For a continuous probability distribution, the area for a point value is zero. (LO7-1)

The mean of a normal probability distribution is 500; the standard deviation is 10.

a. About 68% of the observations lie between what two values?
490 and 510, found by 500 ± 1(10)
b. About 95% of the observations lie between what two values?
480 and 520, found by 500 ± 2(10)
c. Practically all of the observations lie between what two values?
470 and 530, found by 500 ± 3(10)

The weekly mean income of a group of executives is $1,000 and the standard deviation of this group is $100.
The distribution is normal. What percent of the executives have an income of $925 or less?
A) 27.34% B) 77.34%
C) 7.5% D) 22.66%

z = (925 − 1,000)/100 = −0.75, which corresponds to an area of 0.2734 between 925 and the mean.
Less than 925 means 0.5 − 0.2734 = 0.2266, or 22.66%.

Waiting times to receive food after placing an order at the local Subway sandwich shop follow an exponential
distribution with a mean of 60 seconds. Calculate the probability a customer waits:

a. Less than 30 seconds.
1 – e^[(–1/60)(30)] = 0.3935
b. More than 120 seconds.
e^[(–1/60)(120)] = 0.1353
c. Between 45 and 75 seconds.
e^[(–1/60)(45)] – e^[(–1/60)(75)] = 0.1859

The Internal Revenue Service reported the average refund in 2017 was $2,878 with a standard deviation of
$520. Assume the amount refunded is normally distributed.

a. What percent of the refunds are more than $3,500? 0.1151
Find the z value for $3,500, which is (3,500 − 2,878)/520, or 1.20.
See Appendix B.3 to find the area between 0 and 1.20, which is 0.3849.
Finally, since the area of interest is beyond 1.20, subtract that probability from 0.5000. The result is
0.5000 − 0.3849 = 0.1151.
b. What percent of the refunds are more than $3,500 but less than $4,000? 0.0997
Find the z value for $4,000, which is (4,000 − 2,878)/520, or 2.16.
See Appendix B.3 for the area under the standard normal curve.
That probability is 0.4846.
Since the two points (1.20 and 2.16) are on the same side of the mean,
subtract the smaller probability from the larger. The result is 0.4846 − 0.3849 = 0.0997.
c. What percent of the refunds are more than $2,400 but less than $4,000? 0.8058
Find the z value for $2,400, which is −0.92, found by (2,400 − 2,878)/520.
The corresponding area is 0.3212. Since −0.92 and 2.16 are on different sides of the mean, add the
corresponding probabilities.
Thus, we find 0.3212 + 0.4846 = 0.8058. (LO7-3)

A large manufacturing firm tests job applicants. Test scores are normally distributed with a mean of 500 and
a standard deviation of 50. Management is considering placing a new hire in an upper-level management
position if the person scores in the upper 6% of the distribution. What is the lowest score a new hire must
earn to qualify for a responsible position?
A) 50 B) 625 C) 460 D) 578

Recall that the area under the normal curve to the right of the mean is 0.5000.
The area between the mean and the desired "cutoff" score is 0.5000 − 0.0600 = 0.4400.
Now refer to the "areas under the normal curve" table.
Search the body of the table for the area closest to 0.4400. The closest area is 0.4406.
Move to the margins from this value and read the z-value of 1.56.
Finally, the lowest score a new hire must earn to qualify is
x = μ + zσ = 500 + 1.56(50) = 578

C8 SAMPLING, SAMPLING METHODS, AND THE CENTRAL LIMIT THEOREM


Probability Sampling Methods
1. Simple Random Sample: A sample selected so that each item or person in the population has the same
chance of being selected.
} Example: There were 750 Major League Baseball players at the end of the 2016 season. A committee
of 10 players is to be formed to study the issue of concussions. To make sure every player has an
equal chance of being selected, write each name on a piece of paper, place the names in a box and
mix them up, then draw 10 names.
Using a Table of Random Numbers

To choose a starting point, you could close your eyes and simply point at a number in the table. Another
way is to randomly pick a column and a row.

2. Systematic Random Sample: A random starting point is selected, and then every kth member of the
population is selected.
} Example: Stood’s Grocery Store wants to study the length of time customers spend in their store.
Randomly select the days of the week, the times, and the starting point of the study, then systematically
select the customers and measure the time each spends in the store.
Caution: If the population is in some order already, like invoices arranged in increasing dollar amounts, the
systematic procedure should not be used.
3. Stratified Random Sample: A population is divided into subgroups, called strata, and a sample is randomly
selected from each stratum.
} Example: A study of 50 of the 352 largest US firms’ ad spending. Begin by identifying the strata, then use
random sampling within each group based on relative frequencies to collect the sample.
4. Cluster Sampling: A population is divided into clusters using naturally occurring geographic or other
boundaries. Then clusters are randomly selected and a sample is collected by randomly selecting from each
cluster.
} Example
Suppose we wish to sample residents of the 12 counties in the greater Chicago area
about government policy. Randomly select 3 counties and then select a random
sample of the residents in each of the 3 counties
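
The first two sampling methods above can be sketched with Python's standard `random` module. The player ID numbers 1 to 750 and the fixed seed are hypothetical choices made only so the example is repeatable.

```python
import random

rng = random.Random(42)            # fixed seed so the sketch is repeatable
population = list(range(1, 751))   # hypothetical ID numbers for 750 players

# Simple random sample: every player has the same chance of being selected.
simple_sample = rng.sample(population, 10)

# Systematic random sample: random starting point, then every kth member.
k = len(population) // 10          # k = 75
start = rng.randrange(k)           # random start within the first k members
systematic_sample = population[start::k]

print(sorted(simple_sample))
print(systematic_sample)
```

As the caution above notes, the systematic version should not be used when the population list is already ordered in a way related to the variable being studied.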

SAMPLING ERROR: The difference between a sample statistic and its corresponding population parameter.
} Sampling error: x̄ − μ

The Foxtrot Inn’s number of rooms rented in June. The mean number of rooms rented, μ, is 3.13

How can we determine how accurate the sample mean is?


Sampling Distribution of The Sample Mean: A probability distribution of all possible sample means of a
given sample size.
} For a given sample size, the mean of all possible sample means selected from a population is equal
to the population mean
} There is less variation in the distribution of the sample mean than in the population distribution
} The sampling distribution of the sample mean tends to become bell-shaped

Sampling Distribution Example


Tartus Industries has seven production employees (the population). The hourly earnings of each employee
is given in the table.
1. What is the population mean?
2. What is the sampling distribution of the sample mean for samples of size 2?
3. What is the mean of the sampling distribution?
4. What observations can be made about the population and the sampling distribution?

1. What is the population mean?

μ = Σx/N = $15.43

2. What is the sampling distribution of the sample mean for samples of size 2?



3. What is the mean of the sampling distribution?

μx̄ = (Sum of all sample means)/(Total number of samples) = $324/21 = $15.43

4. What observations can be made about the population and the sampling distribution?
} The mean of the distribution of the sample mean ($15.43) is equal to the mean of the population: μ = μx̄
} The spread in the distribution of the sample mean is less than the spread in the population values
(the sample means range from $14 to $17 while the population values range from $14 to $18)
} The shapes of the population and sample distributions are different
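
The sampling distribution for samples of size 2 can be built by listing every possible pair. The earnings table is not reproduced in these notes, so the seven values below are assumed; they were chosen to be consistent with the stated population mean ($15.43), the 21 samples, and the $324 sum of sample means.

```python
from itertools import combinations

# ASSUMED hourly earnings for the seven employees (original table not shown).
earnings = [14, 14, 16, 16, 14, 16, 18]

population_mean = sum(earnings) / len(earnings)

# Every possible sample of size 2 (order does not matter) and its mean.
samples = list(combinations(earnings, 2))
sample_means = [sum(pair) / 2 for pair in samples]

mean_of_sample_means = sum(sample_means) / len(sample_means)

print(len(samples), sum(sample_means))
print(round(population_mean, 2), round(mean_of_sample_means, 2))
print(min(sample_means), max(sample_means))
```

The two means agree, and the sample means span a narrower range than the individual earnings, which matches the observations listed above.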

Central Limit Theorem


The Central Limit Theorem: If samples of a particular size are selected from any population, the sampling
distribution of the sample mean is approximately a normal distribution. The approximation improves with
larger samples.
} If the population follows a normal probability distribution, then for any sample size the sampling
distribution of the sample mean will also be normal
} If the population distribution is symmetrical, you will see the normal shape of the distribution of the
sample mean emerge with samples as small as 10
} If the distribution is skewed or has thick tails, it may require samples of 30 or more to observe the
normality feature

The mean of the distribution of sample means will be exactly equal to the population mean if we select all
possible samples of the same size from the population: μ = μx̄
The standard deviation of the sampling distribution of the sample mean is also called the standard error of
the mean.

There will be less dispersion in the sampling distribution of the sample mean, σ/√n, than in the population, σ.

Normal Distribution
} If the population follows a normal distribution, the sampling distribution of the sample mean will also
follow the normal distribution for samples of any size
} If the population is not normally distributed, the sampling distribution of the sample mean will approach
a normal distribution when the sample size is at least 30
} Assume the population standard deviation is known
} To determine the probability that a sample mean falls in a particular region, use the following formula:

Individual observation: z = (x − μ)/σ
Sample mean: z = (x̄ − μ)/(σ/√n)

Using the Sampling Distribution Example


The Quality Assurance Dept. for Cola, Inc. maintains records regarding the amount of cola in its jumbo
bottle. The actual amount of cola in each bottle varies a small amount from one bottle to another. Records
indicate the amounts of cola follow the normal distribution, the mean amount of cola in the bottles is 31.2
ounces, and the standard deviation is 0.4 ounces. At 8 a.m. today, the quality technician randomly selected
16 bottles from the filling line. The mean amount was 31.38 ounces. Is this an unlikely result? Is it likely
the process is putting too much soda in the bottle? Is the sampling error of 0.18 ounce unusual?

z = (x̄ − μ)/(σ/√n) = (31.38 − 31.20)/(0.4/√16) = 1.80

We conclude that it is unlikely; there is less than a 4% chance. The process is putting too much soda in the
bottles.

First, we find z, then we use the table in Appendix B.3


In this example, we find that it is unlikely, less than a 4% chance, we could select a sample of 16 observations
from a normal population with a mean of 31.2 ounces and a population standard deviation of 0.4 ounce and
find the sample mean equal to or greater than 31.38 ounces. We conclude the process is putting too much
cola in the bottles. The quality technician should see the production supervisor about reducing the amount
of soda in each bottle.
In other words, the probability that the sample mean is equal to or greater than 31.38 ounces is less than
4 percent (0.5000 − 0.4641 = 0.0359).
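
The cola calculation uses z = (x̄ − μ)/(σ/√n) followed by a normal-table lookup. A Python sketch of both steps is below; it uses `statistics.NormalDist` (Python 3.8 or later) instead of Appendix B.3, and the function name is an illustrative choice.

```python
from math import sqrt
from statistics import NormalDist

def z_for_sample_mean(xbar, mu, sigma, n):
    """z = (xbar - mu) / (sigma / sqrt(n)), with sigma the population std. deviation."""
    return (xbar - mu) / (sigma / sqrt(n))

# Cola example: population mean 31.2 oz, sigma 0.4 oz, sample of 16 with mean 31.38 oz.
z = z_for_sample_mean(31.38, 31.2, 0.4, 16)

# Probability of a sample mean this large or larger (upper-tail area).
p_at_least = 1 - NormalDist().cdf(z)

print(round(z, 2), round(p_at_least, 4))
```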

CHAPTER 8 PRACTICE PROBLEMS


According to an IRS study, it takes a mean of 330 minutes for taxpayers to prepare, copy, and electronically
file a 1040 tax form. This distribution of times follows the normal distribution and the standard deviation is
80 minutes. A consumer watchdog agency selects a random sample of 40 taxpayers.

• What is the standard error of the mean in this example?

Standard error of the mean = 80/√40 = 12.65

• What is the likelihood the sample mean is greater than 320 minutes?
z = (320 − 330)/(80/√40) = −0.79, which corresponds to an area of 0.2852 between 320 and 330 minutes.
Total likelihood = 0.5000 + 0.2852 = 0.7852
The likelihood that the sample mean is greater than 320 minutes is 78.52 percent.

• What is the likelihood the sample mean is between 320 and 350 minutes?
Probability is 0.7281, found by 0.2852 + 0.4429.

• What is the likelihood the sample mean is greater than 350 minutes?
0.0571, found by 0.5000 - 0.4429.

• What is the probability that the sampling error would be more than 20 minutes?
Sampling error of more than 20 minutes corresponds to times of less than 310 or more than 350 minutes.
z = (310 − 330) / (80/√40) = −1.58; z = (350 − 330) / (80/√40) = 1.58
Subtracting: 0.5000 − 0.4429 = 0.0571 in each tail.
Multiplying by 2, the final probability is 0.1142.
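
As a cross-check of the tax-form answers, the sketch below models the sampling distribution of the mean directly, assuming Python's statistics.NormalDist instead of the z table. The variable names are illustrative. The results differ from the table answers in the third or fourth decimal because the table rounds z to two places.

```python
from math import sqrt
from statistics import NormalDist

mu, sigma, n = 330, 80, 40

# Standard error of the mean
se = sigma / sqrt(n)

# Sampling distribution of the sample mean: normal with mean mu and spread se
sampling_dist = NormalDist(mu, se)

p_above_320 = 1 - sampling_dist.cdf(320)
p_between_320_350 = sampling_dist.cdf(350) - sampling_dist.cdf(320)
p_above_350 = 1 - sampling_dist.cdf(350)

# Sampling error of more than 20 minutes: below 310 or above 350
p_error_over_20 = sampling_dist.cdf(310) + (1 - sampling_dist.cdf(350))

print(round(se, 2), round(p_above_320, 4), round(p_between_320_350, 4),
      round(p_above_350, 4), round(p_error_over_20, 4))
```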

C9 ESTIMATION AND CONFIDENCE INTERVALS


Point Estimate: The statistic, computed from sample information, that estimates a population parameter.
Example
} Suppose the Bureau of Tourism for Barbados wants an estimate of the mean amount spent by tourists visiting that
country. They randomly select 500 tourists as they depart and ask these tourists about their spending while there.
(Since it is not feasible to contact all the tourists visiting Barbados, the bureau relies on sample information.)
} The mean amount spent by the sample of 500 tourists serves as an estimate of the unknown population parameter.
The survey of the 500 sampled tourists showed mean spending of $168 per day. This sample mean is the point
estimate of the population mean of spending per day.

Confidence Interval: A range of values constructed from sample data so that the population parameter is
likely to occur within that range at a specified probability. The specified probability is called the level of
confidence.
The factors that determine the width of a confidence interval for a mean are:
} The number of observations in the sample, n
} The variability in the population, usually estimated by the sample standard deviation, s
} The desired level of confidence

Level of Confidence, σ (standard deviation) Known


To determine the confidence limits when the population σ is known, we use the z distribution (formula 9-1):

x̄ ± z(σ/√n)

x̄ - sample mean
z - z-value for a particular confidence level
σ - the population standard deviation
n - the number of observations in the sample

Finding a Value of z
The method for finding z for a 95% confidence interval is
} Divide the confidence level in half, 0.9500 ÷ 2 = 0.4750
} Find the value 0.4750 in the body of the table
} Identify the row and column and add the values

The probability of finding a value between 0 and 1.96 (1.9 + 0.06) is 0.4750
So the probability of finding a value between −1.96 and +1.96 is 0.9500
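
The table lookup above can also be sketched in code. This assumes Python's statistics.NormalDist.inv_cdf as a stand-in for the printed table; the helper name z_for_confidence is hypothetical.

```python
from statistics import NormalDist


def z_for_confidence(level):
    """Return z such that the area between -z and +z equals the confidence level."""
    # Half of the level lies above 0, so the cumulative area up to +z is 0.5 + level / 2
    upper_area = 0.5 + level / 2
    return NormalDist().inv_cdf(upper_area)


for level in (0.90, 0.95, 0.99):
    print(level, round(z_for_confidence(level), 3))
```

For 0.95 this gives a value that rounds to the table's 1.96.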

Level of Confidence, z Example


The American Management Association is studying the income of store managers in the retail industry.
A random sample of 49 managers reveals a sample mean of $45,420. The standard deviation of the
population is $2,050.
1. What is the population mean?
We do not know the population mean, so we can use the sample mean, $45,420 as our best estimate.

2. What is a reasonable range of values for the population mean?


The AMA decides to use a 95% level of confidence -> table z = 1.96

x̄ ± z(σ/√n) = $45,420 ± 1.96($2,050/√49) = $45,420 ± $574

$45,420 – $574 = $44,846
$45,420 + $574 = $45,994
3. How do we interpret these results?
The confidence interval is from $44,846 to $45,994.
The value $574 is called the margin of error.

95% of all confidence intervals computed from random samples selected from a population will
contain the population mean. To illustrate, suppose we select many samples of 49 store managers,
perhaps several hundred. We could expect about 95% of these confidence intervals to contain the
population mean. About 5% of the intervals would not contain the population mean. This is due to
sampling error and is the risk we assume when we select the level of confidence.
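
A minimal sketch of the store manager interval, assuming the z value 1.96 is taken from the table as above. The function name z_interval is illustrative, not part of the notes.

```python
from math import sqrt


def z_interval(x_bar, sigma, n, z):
    """Confidence limits for a mean when the population sigma is known."""
    margin = z * sigma / sqrt(n)  # margin of error
    return x_bar - margin, x_bar + margin, margin


# Store managers: sample mean 45,420, sigma 2,050, n = 49, 95% confidence
low, high, margin = z_interval(45420, 2050, 49, 1.96)
print(round(low), round(high), round(margin))
```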

Level of Confidence, σ (standard deviation) Unknown


To determine the confidence limits when the population σ is unknown, we use the t distribution (formula 9-2):

x̄ ± t(s/√n), with n − 1 degrees of freedom

Level of Confidence, t Example


The Dean of the Business College wants to estimate the mean number of hours full-time students work at
paying jobs each week. He randomly selects a sample of 30 students and asks them how many hours they
worked last week. He can calculate the sample mean, but it is unlikely he would know the population
standard deviation required for formula 9-1.
However, we can use the sample standard deviation, s, as an estimate of σ and replace the z distribution
with the t distribution and use formula (9-2).

Characteristics of the t Distribution


} The t distribution is a continuous distribution
} It is mound-shaped and symmetrical
} It is flatter, or more spread out, than the standard normal distribution
} There is a family of t distributions, depending on the number of
degrees of freedom

Finding a Value of t
} First assume the population is normal
} Using Appendix B.5, move across the columns identified for confidence intervals
} In the next example, we want to use the 95% level of confidence, so move to that column
} Then find the degrees of freedom (df), the sample size minus 1 -> n − 1

Thus, 1 degree of freedom is lost in a sampling problem involving the standard deviation of the sample
because one number (the arithmetic mean) is known. For a 95% level of confidence and 9 degrees of freedom,
we select the row with 9 degrees of freedom. The value of t is 2.262.

Level of Confidence, t Example


A tire manufacturer wishes to investigate the tread life of its tires. A sample of 10 tires driven 50,000 miles
revealed a sample mean of 0.32 inch of tread remaining with a standard deviation of 0.09 inch. Construct
a 95% confidence interval for the population mean.
Would it be reasonable for the manufacturer to conclude that after 50,000 miles the population mean
amount of tread remaining is 0.30 inch?

Sample mean (x̄) = 0.32
Sample standard deviation (s) = 0.09
n = 10
Degrees of freedom = n − 1 -> 10 − 1 = 9

t = 2.262 (read from the table using df and the level of confidence)

x̄ ± t(s/√n) = 0.32 ± 2.262(0.09/√10) = 0.32 ± 0.064

0.32 – 0.064 = 0.256
0.32 + 0.064 = 0.384

The endpoints of the confidence interval are 0.256 and 0.384. The margin of error is 0.064. The manufacturer
can be reasonably sure (95% confident) that the mean remaining tread depth is between 0.256 and 0.384
inch. Because the value 0.30 is in this interval, it is possible that the mean of the population is 0.30.

How do we interpret this result? If we repeated this study 200 times, calculating the 95% confidence interval
with each sample’s mean and the standard deviation, we expect 190 of the intervals would include the
population mean. Ten of the intervals would not include the population mean. This is the effect of sampling
error.
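
The tire calculation can be sketched as follows. Python's standard library has no t distribution, so this sketch assumes the t value (2.262 for df = 9 at 95%) is read from Appendix B.5 and passed in. The function name t_interval is illustrative.

```python
from math import sqrt


def t_interval(x_bar, s, n, t):
    """Confidence limits for a mean when sigma is unknown; t comes from the table (df = n - 1)."""
    margin = t * s / sqrt(n)  # margin of error
    return x_bar - margin, x_bar + margin, margin


# Tire tread: sample mean 0.32, s = 0.09, n = 10, t = 2.262
low, high, margin = t_interval(0.32, 0.09, 10, 2.262)
print(round(low, 3), round(high, 3), round(margin, 3))

# Is 0.30 inch a plausible population mean?
print(low <= 0.30 <= high)
```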

What is the interpretation of a 96% confidence level?
A. Approximately 96 out of 100 such intervals would include the true value of the population
parameter
B. There's a 4% chance that the given interval does not include the true value of the population
parameter
C. The interval contains 96% of all sample means.

If 100 samples were collected from the same population and, based on each sample, 100 sample means
were calculated, and they were used to construct 100 confidence intervals, 96% or 96 of the 100 confidence
intervals are expected to include the population mean. A total of 4%, or 4 of the 100 confidence intervals,
are not expected to include the population mean

Confidence Intervals for Proportions


PROPORTION: The fraction, ratio, or percent indicating the part of the sample or the population having a
particular trait of interest.

Sample proportion: p = x/n
} x = the number of successes
} n = the number of observations

Examples
} Southern Tech career services reports that 80% of its graduates enter the job market in a position
related to their field of study
} A recent study of married men between the ages 35 and 50 found that 63% felt that both partners
should earn a living

A population proportion is identified by π


Two requirements
} The binomial conditions have been met
a. The sample data are the number of successes in n trials.
b. There are only two possible outcomes. (We usually label one of the outcomes a “success” and the
other a “failure.”)
c. The probability of a success remains the same from one trial to the next.
d. The trials are independent. This means the outcome on one trial does not affect the outcome on
another
} The values n𝜋 and n (1- π) should both be greater than or equal to 5

Confidence Interval, π Example
The union representing the Bottle Blowers of America (BBA) is considering a proposal to merge with the
Teamsters Union. At least three-fourths of the BBA membership must approve any merger. A random
sample of 2,000 current members reveals 1,600 plan to vote for the merger proposal. What is the
estimate of the population proportion? Can you conclude that the necessary proportion of BBA members
favor the merger? Why?

First, calculate the sample proportion,

p = 1600/2000= .80

*For a 95% level of confidence, the z-value is 1.96

Next, use formula (9-4) to determine the 95% confidence interval,

p ± z√(p(1 − p)/n) = 0.80 ± 1.96√(0.80(1 − 0.80)/2,000) = 0.80 ± 0.018

0.80 – 0.018 = 0.782
0.80 + 0.018 = 0.818

The endpoints of the confidence interval are 0.782 and 0.818, so we conclude the merger will likely pass
because every value in the interval estimate is greater than 0.75, the required three-fourths of the membership.
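
A minimal sketch of the BBA merger interval, assuming z = 1.96 from the table. The function name proportion_interval is illustrative.

```python
from math import sqrt


def proportion_interval(successes, n, z):
    """Confidence limits for a population proportion."""
    p = successes / n                 # sample proportion
    margin = z * sqrt(p * (1 - p) / n)
    return p - margin, p + margin, p


# BBA merger: 1,600 of 2,000 sampled members plan to vote for the proposal
low, high, p = proportion_interval(1600, 2000, 1.96)
print(p, round(low, 3), round(high, 3))

# The merger needs at least three-fourths approval
print(low > 0.75)
```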

Determining Sample Size for Means


There are three factors that determine the sample size when we wish to estimate the mean
} The margin of error, E
} The desired level of confidence, for example 95%
} The variation in the population

Sample Size to Estimate a Population Mean Example


A student in public administration wants to estimate the mean monthly earnings of city council members
in large cities. She can tolerate a margin of error of $100 in estimating the mean. She would also prefer to
report the interval estimate with a 95% level of confidence. The student found a report by the
Department of Labor that reported a standard deviation of $1,000. What is the required sample size?
n = (zσ/E)² = (1.96 × $1,000/$100)² = (19.6)² = 384.16
The computed value of 384.16 is rounded up to 385. A sample size of 385 is required to meet the
specifications.
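
The sample size rule for a mean, including the round-up step, can be sketched as below. The function name sample_size_for_mean is illustrative.

```python
from math import ceil


def sample_size_for_mean(z, sigma, margin):
    """Sample size needed to estimate a mean; any partial value is rounded up."""
    return ceil((z * sigma / margin) ** 2)


# City council earnings: z = 1.96 (95%), sigma = 1,000, margin of error E = 100
print(sample_size_for_mean(1.96, 1000, 100))
```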

Determining Sample Size for Proportions
There are three factors that determine the sample size when we wish to estimate a proportion:
} The margin of error, E
} The desired level of confidence
} A value for π, used to calculate the variation in the population

Sample Size for the Population Proportion Example


The student in the previous example also wants to estimate the proportion of cities that have private refuse
collectors. The student wants to estimate the population proportion with a margin of error of .10, prefers
a level of confidence of 90%, and has no estimate for the population proportion.
What is the required sample size?

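n = π(1 − π)(z/E)² = 0.50(1 − 0.50)(1.645/0.10)² = 67.65

The computed value is rounded up, so the student needs a random sample of 68 cities.

*Use a population proportion of 0.50 in the formula since we do not have an estimate for this value.
*For a 90% level of confidence, the z-value is 1.645.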
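
The same idea for a proportion can be sketched as below. The default of 0.50 for π reflects the note above about having no prior estimate; the function name sample_size_for_proportion is illustrative.

```python
from math import ceil


def sample_size_for_proportion(z, margin, pi=0.50):
    """Sample size needed to estimate a proportion; pi defaults to 0.50 when no estimate exists."""
    return ceil(pi * (1 - pi) * (z / margin) ** 2)


# Private refuse collectors: 90% confidence (z = 1.645), margin of error 0.10
print(sample_size_for_proportion(1.645, 0.10))
```

Using π = 0.50 gives the largest value of π(1 − π), so it is the most conservative choice of sample size.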

Finite Population Adjustment


} A population that has a fixed upper bound is finite
} For a finite population, the standard error is adjusted by the finite-population correction factor,
√((N − n)/(N − 1))
} This will make your estimate more precise by reducing the standard error and resulting in a smaller
range of values in estimating the population mean

Finite Population Adjustment Example


There are 250 families residing in Scandia, Pennsylvania. A random sample of 40 of these families
revealed the mean annual church contribution was $450 and the standard deviation of this was $75.
1. What is the population mean? What is the best estimate of the population mean?
We do not know the population mean. The best estimate is $450.

2. Develop a 90% confidence interval for the population mean.


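In this case, we know x̄ = 450, s = 75, N = 250, and n = 40.
We do not know the population standard deviation, so we use the t distribution. Because the population is
finite (N = 250), we also apply the finite-population correction factor.

Using Appendix B.5, move across the top row to 90% and then down to the df = 39 row; the t value is 1.685.

x̄ ± t(s/√n)√((N − n)/(N − 1)) = 450 ± 1.685(75/√40)√((250 − 40)/(250 − 1)) = 450 ± 18.35

450 – 18.35 = $431.65
450 + 18.35 = $468.35

The population mean is likely to be more than $431.65 but less than $468.35.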
3. Using the confidence interval, explain why the population mean could be $445. Could the population
mean be $425? Why?
The endpoints are $431.65 and $468.35, so the population mean could be $445. It is not likely the
population mean is $425 since $425 is not within the confidence interval.
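
The Scandia interval can be sketched as below, assuming the t value 1.685 is read from Appendix B.5 and passed in. The function name fpc_t_interval is illustrative.

```python
from math import sqrt


def fpc_t_interval(x_bar, s, n, population_size, t):
    """t-based confidence limits with the finite-population correction factor applied."""
    fpc = sqrt((population_size - n) / (population_size - 1))
    margin = t * (s / sqrt(n)) * fpc
    return x_bar - margin, x_bar + margin, margin


# Scandia families: x-bar = 450, s = 75, n = 40, N = 250, t = 1.685 (90%, df = 39)
low, high, margin = fpc_t_interval(450, 75, 40, 250, 1.685)
print(round(low, 2), round(high, 2), round(margin, 2))

# Could the population mean be $445? Could it be $425?
print(low <= 445 <= high, low <= 425 <= high)
```

Because the correction factor is less than 1, the margin of error is smaller than it would be without the adjustment.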

CHAPTER 9 PRACTICE PROBLEMS


A survey of 50 retail stores revealed that the average price of a microwave was $375 with a sample
standard deviation of $20. Assuming the population is normally distributed, what is the 99% confidence
interval to estimate the true cost of the microwave?
A) $367.42 to $382.58 B) $315.00 to $415.00
C) $323.40 to $426.60 D) $335.82 to $414.28

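Using the t distribution (σ is unknown)
n = 50
df = 50 − 1 = 49
t = 2.680
x̄ ± t(s/√n) = 375 ± 2.680(20/√50) = 375 ± 7.58 -> $367.42 to $382.58 (answer A)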

A group of statistics students decided to conduct a survey at their university to find the average (mean)
amount of time students spent studying per week. Assuming a population standard deviation of three
hours, what is the required sample size if the error should be less than a half hour with a 99% level of
confidence?
A) 196 B) 239
C) 15 D) 554
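Using the formula n = (zσ/E)². When determining sample size, remember to round any partial value up to the
next whole number.
z value for a 99% level of confidence: 0.99 ÷ 2 = 0.4950
The probability 0.4950 falls between z = 2.57 and z = 2.58 in the table; the more precise value is 2.576.
n = (2.576 × 3/0.5)² = 238.9 -> 239 (answer B)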

C10 ONE-SAMPLE TESTS OF HYPOTHESIS


Hypothesis Testing
} Hypothesis testing begins with a hypothesis statement about a population parameter
HYPOTHESIS: A statement about a population parameter subject to verification
Examples
} The mean speed of automobiles passing milepost 150 on the West Virginia Turnpike is 68 mph
} The mean cost to remodel a kitchen is $20,000
} The objective of hypothesis testing is to verify the validity of a statement about a population
parameter
Hypothesis Testing: A procedure based on sample evidence and probability theory to determine whether
the hypothesis is a reasonable statement.

Step 1 State the null hypothesis (H0) and the alternate hypothesis (H1)

NULL HYPOTHESIS: A statement about the value of a population parameter developed for the purpose of
testing numerical evidence.
} The null hypothesis always includes the equal sign
} For example; =, ≥, or ≤ will be used in H0

ALTERNATE HYPOTHESIS: A statement that is accepted if the sample data provide sufficient evidence that
the null hypothesis is false.
} The alternate hypothesis never includes the equal sign
} For example; ≠, <, or > is used in H1

Step 2 Select the level of significance


LEVEL OF SIGNIFICANCE: The probability of rejecting the null hypothesis when it is true.
} It is designated with the Greek letter alpha, α. Sometimes called the level of risk
} Can be any value between 0 and 1
} Traditionally, 0.05 is used for consumer research projects, 0.01 for quality assurance, 0.10 for
political polling

Possible Error in Hypothesis Testing


Since the researcher cannot study every item or individual in the population, error is possible

TYPE I ERROR: Rejecting the null hypothesis, H0, when it is true.


} The probability of a Type I error is designated with the Greek letter alpha, α

TYPE II ERROR: Not rejecting the null hypothesis when it is false


} The probability of a Type II error is designated with the Greek letter beta, β

Step 3 Select the test statistic


TEST STATISTIC: A value, determined from sample information, used to determine whether to reject the
null hypothesis.

In hypothesis testing for the mean, μ, when σ is known, the test statistic z is computed with formula 10-1:
z = (x̄ − μ) / (σ/√n)

Step 4 Formulate the decision rule

The decision rule is a statement of specific conditions under which the null hypothesis is rejected and the
conditions under which it is not rejected

The region or area of rejection defines the location of all the values that are either so large or so small
that their probability of occurrence under a true null hypothesis is remote

Critical Value: The dividing point between the region where the null hypothesis is rejected and the region
where it is not rejected.
} The sampling distribution of the statistic z follows the normal distribution
} Here, an α of .05 is used in a one-tailed test
} The value 1.645 separates the regions where the null hypothesis is rejected and where it is not rejected
} The value 1.645 is the critical value

Step 5 Make a decision


} First, select a sample and compute the value of the test statistic
} Compare the value of the test statistic to the critical value
} Then, make the decision regarding the null hypothesis

Step 6 Interpret the results


What can we say or report based on the results of the statistical test?

One-Tailed and Two-Tailed Tests

Two-Tailed Test Example, σ Known
Jamestown Steel Company manufactures and assembles desks and other office equipment at several
plants in New York State. At the Fredonia plant, the weekly production of the Model A325 desk follows a
normal distribution with a mean of 200 and a standard deviation of 16. New production methods have
been introduced and the vice president of manufacturing would like to investigate whether there has been
a change in weekly production of the Model A325. Is the mean number of desks produced different from
200 at the 0.01 significance level?

Step 1: State the null hypothesis and alternate hypothesis.


H0: μ = 200 desks
H1: μ ≠ 200 desks
Step 2: Select the level of significance. Here α = .01
Step 3: Select the test statistic. In this example, we’ll use z

Step 4: Formulate the decision rule by first determining the critical values of z.
Decision Rule:
If the computed value of z is not between −2.576 and 2.576, reject the null hypothesis.
If z falls between −2.576 and 2.576, do not reject the null hypothesis.

Step 5: Take sample, compute the test statistic, make decision.


The mean number of desks produced last year (50 weeks because the plant was shut down 2 weeks for
vacation) is 203.5. The standard deviation of the population is 16 desks per week.
Compute z with formula 10-1.

z = (x̄ − μ) / (σ/√n) = (203.5 − 200) / (16/√50) = 1.547
Decision: Because 1.547 does not fall in the rejection region, we decide not to reject H0.

Step 6: Interpret the result.


We did not reject the null hypothesis, so we have failed to show that the population mean has changed
from 200 per week.
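
Steps 4 and 5 of the desk example can be sketched as below, assuming the critical value 2.576 from the decision rule above. The function name z_statistic is illustrative.

```python
from math import sqrt


def z_statistic(x_bar, mu0, sigma, n):
    """Test statistic for a mean when the population sigma is known."""
    return (x_bar - mu0) / (sigma / sqrt(n))


# Jamestown Steel: H0 mu = 200, sigma = 16, n = 50 weeks, sample mean 203.5
z = z_statistic(203.5, 200, 16, 50)

# Two-tailed test at alpha = 0.01: reject H0 if z is outside -2.576 .. 2.576
critical = 2.576
reject_h0 = abs(z) > critical

print(round(z, 3), reject_h0)
```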

One-Tailed Test
Suppose instead of wanting to know if there had been a change in the mean number of desks assembled,
the vice president wanted to know if there had been an increase in the number of units assembled. Can
we conclude, because of the improved production methods, that the mean number of desks assembled in
the last 50 weeks was more than 200?
Before:
A two-tailed test
H0: μ = 200 desks
H1: μ ≠ 200 desks

Now:
A one-tailed test
H0: μ ≤ 200 desks
H1: μ > 200 desks

The critical values for a one-tailed test are different from a two-tailed test at the same significance level.
In the two-tailed test, we split the significance level in half and put half in the lower tail and half in the
upper tail. In a one-tailed test, we put all the rejection region in one tail. Using Appendix B.5 again, move
to the top heading called “Level of Significance for One-Tailed Test,” select the column with α = .01, and
move to the last row (infinite degrees of freedom), where the z value is 2.326.

The p-Value in Hypothesis Testing


p-VALUE: The probability of observing a sample value as extreme as, or more extreme than the value
observed, given that the null hypothesis is true.
} Compare the p-value with the level of significance, α
} If the p-value is smaller than the significance level, reject H0
} If the p-value is larger than α, H0 is not rejected
} A p-value not only results in a decision about H0, but gives additional insight about the strength of
that decision

Finding a p-Value
} In the previous example about desk production, the computed z was 1.547 and H0 was not
rejected
} Round the computed z-value to two decimal places, 1.55
} Using the z-table, find the probability of finding a z-value of 1.55
or more by 0.5000 − 0.4394 = 0.0606

} Since this is a two-tailed test, the p-value is 2(0.0606) = 0.1212
} The p-value, 0.1212, is greater than the significance level, 0.01, so H0 is not rejected
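
The p-value steps above can be sketched as below, assuming Python's statistics.NormalDist in place of the z-table. The function name two_tailed_p_value is illustrative. The result differs from 0.1212 in the fourth decimal because the table areas are rounded.

```python
from statistics import NormalDist


def two_tailed_p_value(z):
    """Probability of a z value at least as extreme as the one observed, in either tail."""
    one_tail = 1 - NormalDist().cdf(abs(z))
    return 2 * one_tail


# Desk production example: computed z rounded to 1.55, alpha = 0.01
p_value = two_tailed_p_value(1.55)
alpha = 0.01

print(round(p_value, 4), p_value < alpha)
```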

Hypothesis Testing, σ Unknown

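If the population standard deviation is not known, s is substituted for σ and the test statistic follows the
t distribution: t = (x̄ − μ) / (s/√n), with n − 1 degrees of freedom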


} The major characteristics of the t distribution are:
} It is a continuous distribution
} It is bell-shaped and symmetrical
} There is a family of t distributions, depending on the number of degrees of freedom
} It is flatter, or more spread out, than the standard normal distribution

Hypothesis Testing, σ Unknown Example


The Myrtle Beach International Airport provides a cell phone parking lot where people can wait for a
message to pick up arriving passengers. To decide if the cell phone lot has enough parking places, the
manager of airport parking needs to know if the mean time in the lot is more than 15 minutes. A sample
of 12 recent customers showed they were in the lot the following lengths of time, in minutes (see below).
At the .05 significance level, is it reasonable to conclude that the mean time in the lot is more than 15
minutes?

Step 1: State the null hypothesis and the alternate hypothesis


H0: μ ≤ 15
H1: μ > 15

Step 2: Select the level of significance; we will use 0.05


Step 3: Select the test statistic; we will use t
Step 4: Formulate the decision rule; reject H0 if t is greater than 1.796 (one-tailed test, α = .05, df = 12 − 1 = 11)
Step 5: Take sample, make decision
t = (x̄ − μ) / (s/√n) = (23 − 15) / (9.835/√12) = 2.818

The test statistic of 2.818 is greater than our critical value of 1.796.
Therefore, our decision is: Reject H0
Step 6: Interpret the result
We conclude that the mean time customers spend in the lot is more than 15 minutes. This result indicates
that the airport may need to add more parking places.
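
A minimal sketch of the airport test. It works from the summary values used in Step 5 (mean 23, s = 9.835, n = 12) rather than the raw times, and it assumes the critical value 1.796 is read from Appendix B.5. The function name t_statistic is illustrative.

```python
from math import sqrt


def t_statistic(x_bar, mu0, s, n):
    """Test statistic for a mean when sigma is unknown (df = n - 1)."""
    return (x_bar - mu0) / (s / sqrt(n))


# Cell phone lot: H0 mu <= 15, H1 mu > 15
t = t_statistic(23, 15, 9.835, 12)

# One-tailed (right) test at alpha = 0.05 with df = 11: reject H0 if t > 1.796
critical = 1.796
reject_h0 = t > critical

print(round(t, 3), reject_h0)
```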

CHAPTER 10 PRACTICE PROBLEMS

The average cost of tuition plus room and board for a small private liberal arts college is reported to be
$9,500 per term, but a financial administrator believes that the average cost is higher. A study conducted
using 350 small liberal arts colleges showed that the average cost per term is $9,845. The population
standard deviation is $1,200. Let α = 0.05. What is our decision about the average cost?
A) Equal to $9,500 B) Greater than $9,500
C) Less than $9,500 D) Not equal to $9,500

The null and alternate hypotheses are:


H0: µ ≤ $9,500
H1: µ > $9,500
This is a one (right)-tailed test, and the population standard deviation is known, so we use z.

Based on the sample information, the test statistic is
z = (x̄ − µ) / (σ/√n) = (9,845 − 9,500) / (1,200/√350) = 5.38
As this test statistic is greater than the critical value of +1.645, we reject the null hypothesis and
conclude the mean is greater than $9,500 (answer B).

Alternatively, the p-value is the probability of observing a sample value as extreme as, or more extreme
than, the value observed, given that the null hypothesis is true.

The probability of getting a sample mean of $9,845 or greater, assuming a population mean of $9,500
corresponds to the probability of obtaining a z-value greater than 5.38.

This probability is beyond the range of the "areas under the normal curve" table, so the probability is
extremely small or virtually zero.

The p-value, 0.0000, is less than the significance level 0.05, so the decision is to reject the null hypothesis
and conclude the mean is greater than $9,500.
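
Both routes to the tuition decision (critical value and p-value) can be sketched together, assuming Python's statistics.NormalDist for the tail area that the printed table cannot show. The variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

# Tuition example: H0 mu <= 9,500, H1 mu > 9,500, sigma known
x_bar, mu0, sigma, n, alpha = 9845, 9500, 1200, 350, 0.05

z = (x_bar - mu0) / (sigma / sqrt(n))

# Critical-value approach: right-tailed test, reject H0 if z > 1.645
reject_by_critical_value = z > 1.645

# p-value approach: area in the upper tail beyond the computed z
p_value = 1 - NormalDist().cdf(z)
reject_by_p_value = p_value < alpha

print(round(z, 2), reject_by_critical_value, reject_by_p_value)
```

The two approaches lead to the same decision, and the p-value is far below 0.0001, which is why the notes report it as 0.0000.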

The mean income per person in the United States is $60,000, and the distribution of incomes follows a
normal distribution. A random sample of 10 residents of Wilmington, Delaware, had a mean of $70,000
with a standard deviation of $10,000. At the .05 level of significance, is that enough evidence to conclude
that residents of Wilmington, Delaware, have more income than the national average?
