
Mod 1 Statistics and Probability


Statistics and Probability

Statistics as a Tool in Decision-Making


Statistics is the science of studying data in order to make decisions. Hence, it
is a tool in the decision-making process. As a science, Statistics involves the
methods of collecting, processing, summarizing, and analyzing data in order to
provide answers or solutions to an inquiry. One also needs to interpret and
communicate the results of these methods to support a decision made when faced
with a problem or an inquiry.
Statistics enables us to:
• characterize persons, objects, situations, and phenomena;
• explain relationships among variables;
• formulate objective assessments and comparisons; and, more importantly,
• make evidence-based decisions and predictions.
Statistical Process in Solving a Problem
• Planning or designing the collection of data to answer statistical questions in a
way that maximizes information content and minimizes bias;
• Collecting the data as required in the plan;
• Verifying the quality of the data after they were collected;
• Summarizing the information extracted from the data; and
• Examining the summary statistics so that insight and meaningful information can
be produced to support decision-making or solutions to the question or problem at
hand.
KEY POINTS
• There is a difference between questions that can be answered using Statistics
and those that cannot.
• Statistics is a science that studies data.
• There are many uses of Statistics, but its main use is in decision-making.
• Logical decisions or solutions to a problem can be attained through a statistical
process.
Random Variables and Probability Distributions
Probability
• Historically, probability was studied by gamblers who wanted to increase their
winnings (or at least decrease their losses).
• In the study of statistics, we are concerned basically with the presentation and
interpretation of chance outcomes that occur in a planned study or scientific
investigation.
• Probability as a general concept can be defined as the chance of an event
occurring.
Basic concepts of Probability
A probability experiment is a chance process that leads to well-defined
results called outcomes.
Random experiment…
• the outcome is certain.
• the outcome is impossible.
• the outcome has an even chance of occurring.
• the outcome has a strong but not a certain chance of occurring.
• Sample Space is the set of all possible outcomes of a statistical experiment and is
represented by the symbol S.
• An outcome is the result of a single trial of a probability experiment.
• An event consists of a set of outcomes of a probability experiment. It is a subset of
a sample space, usually denoted by capital letters.
• Null Space or empty space (∅, { }) is a subset of the sample space that contains
no elements.
• Mutually Exclusive. Two events are mutually exclusive if they have no common
element.
• A tree diagram is a device consisting of line segments emanating from a starting
point and also from the outcome point. It is used to determine all possible
outcomes of a probability experiment.

Properties of Probability

• The probability of an event is a non-negative value; in fact, it ranges from zero
(when the event is impossible) to one (when the event is sure); the closer the value
is to one, the more likely the event will occur.
• The probability of the sure event is one (in other words, the chance of a sure event
is 100 percent).

Axioms of Probability
1. The probability of an impossible event is 0.
2. The probability of an event that is certain to occur is 1.
3. For any event A, the probability of A is between 0 and 1 inclusive.

Approaches to Probability
1. Classical approach to probability
…uses sample spaces to determine the numerical probability that an event will
happen.
…assumes that all outcomes in the sample space are equally likely to occur.
The probability of any event A is given by
P(A) = n(A) / n(S)
where n(A) is the number of outcomes in A and n(S) is the number of outcomes in
the sample space S.
Example. In rolling a balanced die, what is the probability of getting
a. an odd number? Answer: {1, 3, 5}, so P = 3/6 = 1/2
b. an even number? Answer: {2, 4, 6}, so P = 3/6 = 1/2
c. a perfect square number? Answer: {1, 4}, so P = 2/6 = 1/3
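
The die example above can be checked with a short Python sketch. The function name `classical_probability` is only an illustrative choice, and the code assumes that every outcome in the sample space is equally likely.

```python
from fractions import Fraction

def classical_probability(event, sample_space):
    """Return P(A) = n(A) / n(S), assuming equally likely outcomes."""
    favorable = event & sample_space  # outcomes of A that belong to S
    return Fraction(len(favorable), len(sample_space))

die = {1, 2, 3, 4, 5, 6}  # sample space S for one roll of a balanced die

p_odd = classical_probability({1, 3, 5}, die)
p_even = classical_probability({2, 4, 6}, die)
p_square = classical_probability({1, 4}, die)

print(p_odd, p_even, p_square)  # 1/2 1/2 1/3
```

Using `Fraction` keeps the answers as exact fractions, matching the hand computation.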

2. Empirical or relative frequency approximation of probability


…relies on actual experience to determine the likelihood of outcomes.
…uses frequency distribution based on observations to determine numerical
probabilities of events.
Given a frequency distribution, the probability of an event A being in a given class
is
P(A) = f / n
where f is the frequency of the class and n is the total number of observations.
Example. Hospital records indicated that knee replacement patients stayed in the
hospital for the number of days shown in the distribution.

No. of days stayed    Frequency
3                     15
4                     32
5                     56
6                     19
7                     5
Total                 127

Find these probabilities.


a. A patient stayed exactly 5 days. Answer: 56/127
b. A patient stayed less than 6 days. Answer: 103/127
c. A patient stayed at most 4 days. Answer: 47/127
d. A patient stayed at least 5 days. Answer: 80/127
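
As a rough check of the four answers above, the frequency table can be placed in a dictionary and each probability computed as f / n. The names `stay_frequency` and `empirical_probability` are illustrative, not part of the module.

```python
from fractions import Fraction

# No. of days stayed -> frequency, taken from the hospital example
stay_frequency = {3: 15, 4: 32, 5: 56, 6: 19, 7: 5}
n = sum(stay_frequency.values())  # total number of patients (127)

def empirical_probability(condition):
    """Return f / n, where f counts the patients whose stay meets the condition."""
    f = sum(freq for days, freq in stay_frequency.items() if condition(days))
    return Fraction(f, n)

p_exactly_5 = empirical_probability(lambda d: d == 5)
p_less_than_6 = empirical_probability(lambda d: d < 6)
p_at_most_4 = empirical_probability(lambda d: d <= 4)
p_at_least_5 = empirical_probability(lambda d: d >= 5)

print(p_exactly_5, p_less_than_6, p_at_most_4, p_at_least_5)
# 56/127 103/127 47/127 80/127
```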

3. Subjective probabilities
… uses a probability value based on an educated guess or estimate, employing
opinions and inexact information.
…a person or group makes an educated guess at the chance that an event will
occur based on the experience and evaluation of a solution.
Example. 1. A financial analyst claims that there is an 80% probability that the
peso-dollar exchange rate will decrease by 3 pesos.
2. A physician might say that, on the basis of her diagnosis, there is a 30% chance
the patient will need an operation.

Random Variables and Probability Distributions

Statistical Experiment – an activity that will produce outcomes, or a process that
will generate data.
A random variable is a function that associates a real number with each element
in the sample space. We use a capital letter, say X, to denote a random variable
and its corresponding small letter, x, for one of its values. Each possible value of X
represents an event that is a subset of the sample space for the given experiment.
Note: Random variables are conceptually different from mathematical variables. A
random variable is linked to observations in the real world, where uncertainty is
involved.
Example. Frequency and Relative Frequency Distributions of the Number of CPs
owned by each student.

Number of CPs Owned    Frequency    Relative Frequency
0                      34           0.068
1                      316          0.632
2                      132          0.264
3                      18           0.036
Total                  N = 500      1.000
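
The relative frequencies in the table above are each frequency divided by N. A minimal sketch, with variable names chosen only for illustration:

```python
# Number of CPs owned -> frequency, taken from the table above
frequency = {0: 34, 1: 316, 2: 132, 3: 18}
N = sum(frequency.values())  # total number of students

# Relative frequency = frequency / N for each value
relative_frequency = {cps: f / N for cps, f in frequency.items()}

for cps, rel in relative_frequency.items():
    print(cps, frequency[cps], f"{rel:.3f}")
print("N =", N, f"{sum(relative_frequency.values()):.3f}")
```

The relative frequencies add up to 1, which is why they can serve as estimated probabilities for the random variable "number of CPs owned".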

Illustrating a Random Variable (Discrete and Continuous)
What’s In
In the study of basic probability, you learned that an experiment is any activity
that can be repeated under similar conditions. The set of all possible outcomes of
an experiment is called the sample space. You have also learned how to list the
possible outcomes of a given experiment systematically. In tossing a coin, for
example, the possible outcomes are a head or a tail.
To begin, note that probability distributions can be classified as discrete
probability distributions or continuous probability distributions, depending on
whether they define probabilities associated with discrete variables or continuous
variables.
A variable X whose value depends on the outcome of a random process is
called a random variable. In other words, a random variable is a variable whose
value is a numerical outcome of a random phenomenon.
A random variable is denoted with a capital letter. The probability
distribution of a random variable X tells what the possible values of X are and how
probabilities are assigned to those values.
A random variable can be discrete or continuous.

Two Types of random variables


Recall: A variable is any information, attribute, characteristic, number, or
quantity that describes a person, place, event, thing, or idea that can be
measured or counted. A variable can be qualitative or quantitative; and a
quantitative variable can either be discrete or continuous.
1. A discrete variable is a quantitative variable whose value can only be
obtained through counting. Its set of possible values can be finite, or
countably infinite if the counting process has no end.
• In an experiment, the outcome is said to be a discrete random variable if
the experiment has only a finite or countably infinite number of
outcomes. No other outcome exists between two consecutive
outcomes.
Examples: number of shoes, rolling a die, flipping/tossing a coin,
student's grade level
2. A continuous variable is a quantitative variable that can assume
uncountably many real number values. The value given
to an observation can include values as small as the instrument of
measurement allows.
• In an experiment, the outcome is said to be a continuous random
variable if it can take an uncountably infinite number of
possible values within a specified real number interval. Here, it is
always possible to have an outcome between any two existing ones.
Examples: length, weight, width, time, distance

Activities to be answered in your Activity Notebook

Activity 1
Direction: Determine the relative frequency of the following variables.

Variable    Frequency    Relative Frequency
dark        10
milk        21
white       9
cream       15
            N =

Activity 2 - See my Prob-Ability

Direction: Identify the probability of the following statements.

1. What is the probability of getting a HEART from a deck of cards?
2. There are 20 marbles in a container: 4 are red, 5 are blue, and 11 are yellow.
What is the probability that a blue marble will be picked?
3. Choosing a month from a year, what is the probability of selecting a month with
31 days?
4. Two fair coins are tossed simultaneously. What is the probability of showing a
tail (T) followed by a head (H)?
5. If one letter is chosen at random from the word TRUSTWORTHY, what is the
probability that the letter chosen is a consonant?

Activity 3

Direction: Classify whether the given experiment implies a discrete random
variable or a continuous random variable. Write Discrete if discrete and
Continuous if continuous.

1. The number of students present in a Class Temperance
2. The average distance travelled by a tricycle in a month
3. The number of motorcycles owned by a randomly selected household
4. The number of girls taller than 5 feet in a random sample of 6 girls.
5. The weight of a box of soft drinks labeled 12 ounces.
PROBABILITY DISTRIBUTION for DISCRETE RANDOM VARIABLE
A probability distribution gives the probability for each value of the random
variable.
• A probability distribution function is a function P(X) that shows the
relative probability that each outcome of an experiment will happen.
• A discrete probability distribution is a table of values that shows the
probability of each outcome of an experiment.
Two Requirements for a Probability Distribution
• The sum of the probabilities of all the events in the sample space must be
equal to 1.
• The probability of each event in the sample space must be between 0 and 1,
inclusive. That is, 0 ≤ P(X) ≤ 1.

Mean and Variance of Discrete Random Variables


The mean of a discrete random variable, also known as the expected value,
is the weighted average of all possible values of the random variable. The symbol
used for the mean is μ.
The variance is a measure of spread or dispersion. It measures the variation
of the values of a random variable from the mean. The symbol used for the
variance is σ², and its square root σ is called the standard deviation.
If P(x) is the probability of each value x,
1. Mean: μ = ∑ [x • P(x)]
2. Variance: σ² = ∑ [(x − μ)² • P(x)]
3. Standard deviation: σ = √σ²
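
The three formulas above translate directly into code. The sketch below is only an illustration: the function name `describe` is hypothetical, and the distribution used (one roll of a fair die) is an assumed example rather than one taken from this module.

```python
from fractions import Fraction
from math import sqrt

def describe(distribution):
    """distribution maps each value x to P(x); returns (mean, variance, sd)."""
    if sum(distribution.values()) != 1:
        raise ValueError("probabilities must sum to 1")
    mean = sum(x * p for x, p in distribution.items())
    variance = sum((x - mean) ** 2 * p for x, p in distribution.items())
    return mean, variance, sqrt(variance)

# Assumed example: one roll of a fair die, each face with probability 1/6
one_die = {face: Fraction(1, 6) for face in range(1, 7)}
mean, variance, sd = describe(one_die)

print(mean, variance, round(sd, 3))  # 7/2 35/12 1.708
```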
Probability histogram of a discrete random variable
• A probability histogram shows relative probabilities of the sample points in
the form of a bar graph.
• Let the x-axis denote the independent variable, which is the set of sample
points, and let the y-axis denote the dependent variable, which is the
corresponding probabilities of the sample points.

Example 1. Find the probability distribution of an experiment where the sum of
the numbers shown on a pair of dice in a single throw is considered.

                 Die 2
Die 1    1    2    3    4    5    6
1        2    3    4    5    6    7
2        3    4    5    6    7    8
3        4    5    6    7    8    9
4        5    6    7    8    9    10
5        6    7    8    9    10   11
6        7    8    9    10   11   12

Let x represent the sum of the two dice. Then the probability distribution is as follows:

x       2      3      4      5      6      7      8      9      10     11     12
P(x)    1/36   2/36   3/36   4/36   5/36   6/36   5/36   4/36   3/36   2/36   1/36
or      .028   .056   .083   .111   .139   .167   .139   .111   .083   .056   .028

x     P(x)           x•P(x)    x − μ      (x − μ)²    P(x)•(x − μ)²
2     1/36 or .028   0.056     −5.007     25.070      0.702
3     2/36 or .056   0.168     −4.007     16.056      0.899
4     3/36 or .083   0.332     −3.007     9.042       0.750
5     4/36 or .111   0.555     −2.007     4.028       0.447
6     5/36 or .139   0.834     −1.007     1.014       0.141
7     6/36 or .167   1.169     −0.007     0.00004     0.000006
8     5/36 or .139   1.112     0.993      0.986       0.137
9     4/36 or .111   0.999     1.993      3.972       0.441
10    3/36 or .083   0.830     2.993      8.958       0.744
11    2/36 or .056   0.616     3.993      15.944      0.893
12    1/36 or .028   0.336     4.993      24.930      0.698

Sample working for x = 2: x•P(x) = (2)(.028) = 0.056; x − μ = 2 − 7.007 = −5.007;
(x − μ)² = (−5.007)² = 25.070; P(x)•(x − μ)² = (.028)(25.070) = 0.702.

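Mean: μ = ∑ [x • P(x)]
= 0.056 + 0.168 + 0.332 + 0.555 + 0.834 + 1.169 + 1.112 + 0.999 +
0.830 + 0.616 + 0.336
= 7.007

Variance: σ² = ∑ [(x − μ)² • P(x)]
= 0.702 + 0.899 + 0.750 + 0.447 + 0.141 + 0.000006 + 0.137 + 0.441 + 0.744 +
0.893 + 0.698
= 5.852
Standard deviation: σ = √σ²
= √5.852
= 2.419
≈ 2.42
Note: These results carry rounding error because each P(x) was rounded to three
decimal places. With the exact fractions, μ = 252/36 = 7, σ² = 210/36 ≈ 5.833, and
σ ≈ 2.415.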
Probability Histogram
[Figure: bar graph titled "Rolling a Dice", with x on the horizontal axis and P(x)
on the vertical axis (scale 0 to 0.18).]
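
The distribution in Example 1 can also be built by listing all 36 equally likely rolls. The sketch below uses exact fractions, so its results are free of the rounding present in a hand computation with three-decimal probabilities. The variable names are illustrative.

```python
from collections import Counter
from fractions import Fraction
from itertools import product
from math import sqrt

# Count how many of the 36 equally likely (die 1, die 2) pairs give each sum
counts = Counter(a + b for a, b in product(range(1, 7), repeat=2))
dist = {x: Fraction(c, 36) for x, c in sorted(counts.items())}

mean = sum(x * p for x, p in dist.items())
variance = sum((x - mean) ** 2 * p for x, p in dist.items())
sd = sqrt(variance)

print(mean, variance, round(sd, 3))  # 7 35/6 2.415
```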

Example 2. Suppose that a coin is tossed twice so that the sample space is
S = {HH, TH, HT, TT}. Let X represent the number of heads that can come up. Based
on the discrete probability distribution of the random variable X, calculate the
mean, variance, and standard deviation.

Applying the concepts from the activity given, let us complete the table
below.

x    P(x)          x•P(x)    x − μ    (x − μ)²    P(x)•(x − μ)²
0    1/4 or .25    0         −1       1           0.25
1    2/4 or .50    0.50      0        0           0
2    1/4 or .25    0.50      1        1           0.25

Mean: μ = ∑ [x • P(x)]
= 0(0.25) + 1(0.50) + 2(0.25)
= 0 + 0.50 + 0.50
= 1

Variance: σ² = ∑ [(x − μ)² • P(x)]
= 0.25 + 0 + 0.25
= 0.50
Standard deviation: σ = √σ²
= √0.50 = 0.71
Probability Histogram
[Figure: bar graph titled "Tossing a Coins", with x = 0, 1, 2 on the horizontal axis
and P(x) on the vertical axis (scale 0 to 0.6).]
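
A small sketch of Example 2, assuming a fair coin so that the four outcomes in S are equally likely. It builds the distribution of the number of heads from the sample space and then applies the mean and variance formulas.

```python
from fractions import Fraction
from itertools import product

# Sample space for two tosses: HH, HT, TH, TT (assumed equally likely)
sample_space = ["".join(toss) for toss in product("HT", repeat=2)]

# X = number of heads; accumulate P(x) over the outcomes
dist = {}
for outcome in sample_space:
    heads = outcome.count("H")
    dist[heads] = dist.get(heads, 0) + Fraction(1, len(sample_space))

mean = sum(x * p for x, p in dist.items())
variance = sum((x - mean) ** 2 * p for x, p in dist.items())
sd = float(variance) ** 0.5

print(mean, variance, round(sd, 2))  # 1 1/2 0.71
```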

Example 3. There are 8 balls in a jar: 3 blue balls and 5 red balls. Three balls are
chosen at random without replacement. Find the probability distribution of the
number of blue balls chosen, and its mean, variance, and standard deviation.
Let x = number of blue balls
x = 0, 1, 2, 3
P(x = 0) = (3C0)(5C3) / (8C3) = 10/56

P(x = 1) = (3C1)(5C2) / (8C3) = 30/56

P(x = 2) = (3C2)(5C1) / (8C3) = 15/56

P(x = 3) = (3C3)(5C0) / (8C3) = 1/56

x    P(x)             x•P(x)    x − μ      (x − μ)²    P(x)•(x − μ)²
0    10/56 or .179    0         −1.126     1.268       0.227
1    30/56 or .536    0.536     −0.126     0.016       0.009
2    15/56 or .268    0.536     0.874      0.764       0.205
3    1/56 or .018     0.054     1.874      3.512       0.063

Mean: μ = ∑ [x • P(x)]
= 0 + 0.536 + 0.536 + 0.054
= 1.126

Variance: σ² = ∑ [(x − μ)² • P(x)]
= 0.227 + 0.009 + 0.205 + 0.063
= 0.504
Standard deviation: σ = √σ²
= √0.504 = 0.71

Probability Histogram
[Figure: bar graph titled "Drawing out a ball", with x = 0, 1, 2, 3 on the
horizontal axis and P(x) on the vertical axis (scale 0 to 0.6).]
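
The combinations in Example 3 can be evaluated with `math.comb`. This sketch assumes the three balls are drawn without replacement and keeps exact fractions, so its mean and variance differ slightly in the third decimal place from values computed with rounded probabilities. The constant names are illustrative.

```python
from fractions import Fraction
from math import comb, sqrt

BLUE, RED, DRAWN = 3, 5, 3               # jar contents and number of balls drawn
total_ways = comb(BLUE + RED, DRAWN)     # 8C3 = 56

# P(x) = (3Cx)(5C(3 - x)) / 8C3 for x = 0, 1, 2, 3 blue balls
dist = {
    x: Fraction(comb(BLUE, x) * comb(RED, DRAWN - x), total_ways)
    for x in range(DRAWN + 1)
}

mean = sum(x * p for x, p in dist.items())
variance = sum((x - mean) ** 2 * p for x, p in dist.items())
sd = sqrt(variance)

print(float(mean), round(float(variance), 3), round(sd, 3))  # 1.125 0.502 0.709
```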

Activity to be answered in your Activity Notebook

Activity 1
The probabilities that a player will get 5 to 10 questions right on a trivia quiz
are shown below. Solve for μ, σ², and σ, then graph.

x     P(x)    x•P(x)    x − μ    (x − μ)²    P(x)•(x − μ)²
5     0.04
6     0.20
7     0.30
8     0.10
9     0.25
10    0.10

Binomial Distribution
• A binomial distribution is the probability distribution of the number of
successes in a fixed number of trials, where each trial has only two possible
outcomes: success and failure.

Characteristics of a Binomial Experiment

1. The experiment consists of “n” identical trials.
2. There are only two possible outcomes on each trial. We will denote one
outcome by “s” (for success) and the other by “f” (for failure).
3. The probability of “s” remains the same from trial to trial. The probability of
“s” is denoted by “p” and the probability of “f” is denoted by “q”.
Note: q = 1 – p
4. The trials are independent.
5. The binomial random variable “x” is the number of “s” outcomes in “n” trials.
Binomial Probability Distribution
P(x) = nCx • p^x • q^(n – x)
Where:
P(x) = probability of x successes
p = probability of a “success” on a single trial
q = 1 – p
n = number of trials
x = number of “successes” in n trials
(x = 0, 1, 2, 3, …, n)
n – x = number of failures in n trials

Examples: 1. You toss 1 coin 5 times in a row and note the number of tails. What is
the probability of getting exactly 3 tails?
n = 5   p = .5   q = .5   x = 3
P(x) = nCx • p^x • q^(n – x)
P(x=3) = 5C3 (.50)^3 (.50)^(5 – 3)
= 10 (.125) (.25)
= 0.3125

2. x = 5   p = .30   q = .70   n = 10
P(x) = nCx • p^x • q^(n – x)
P(x=5) = 10C5 (.30)^5 (.70)^(10 – 5)
= 252 (.00243) (.16807)
= 0.1029
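
The binomial formula can be evaluated without rounding the intermediate powers, which matters when p^x is small. A minimal sketch, with `binomial_pmf` as an illustrative name:

```python
from math import comb

def binomial_pmf(x, n, p):
    """P(x) = nCx * p^x * q^(n - x), with q = 1 - p."""
    q = 1 - p
    return comb(n, x) * p**x * q ** (n - x)

p_three_tails = binomial_pmf(3, 5, 0.50)    # Example 1: exactly 3 tails in 5 tosses
p_example_2 = binomial_pmf(5, 10, 0.30)     # Example 2: x = 5, n = 10, p = .30

print(p_three_tails)          # 0.3125
print(f"{p_example_2:.4f}")   # 0.1029
```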

3. If I toss a coin 20 times, what is the probability of getting 2 or fewer
heads?

x = 0, 1, 2   p = .50   q = .50   n = 20

P(x ≤ 2) = P(x=0) + P(x=1) + P(x=2)

P(x=0) = 20C0 (.50)^0 (.50)^(20 – 0)
= 1 (1) (.00000095)
= .00000095

P(x=1) = 20C1 (.50)^1 (.50)^(20 – 1)
= 20 (.50) (.00000191)
= .0000191

P(x=2) = 20C2 (.50)^2 (.50)^(20 – 2)
= 190 (.25) (.00000381)
= .000181

P(x ≤ 2) = P(x=0) + P(x=1) + P(x=2)
= .00000095 + .0000191 + .000181
= .000201
4. If I toss a coin 15 times, what is the probability of getting at least 3 heads?
The complement of "at least 3 heads" is x = 0, 1, 2.
p = .50   q = .50   n = 15

P(x ≥ 3) = 1 – [P(x=0) + P(x=1) + P(x=2)]

P(x=0) = 15C0 (.50)^0 (.50)^(15 – 0)
= 1 (1) (.0000305)
= .0000305

P(x=1) = 15C1 (.50)^1 (.50)^(15 – 1)
= 15 (.50) (.00006104)
= .0004578

P(x=2) = 15C2 (.50)^2 (.50)^(15 – 2)
= 105 (.25) (.00012207)
= .0032043

P(x ≥ 3) = 1 – [P(x=0) + P(x=1) + P(x=2)]
= 1 – (.0000305 + .0004578 + .0032043)
= 1 – .0036926
= .9963
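
Examples 3 and 4 both add several binomial probabilities. The sketch below sums them directly and uses the complement rule for "at least 3". The helper names are illustrative.

```python
from math import comb

def binomial_pmf(x, n, p):
    """P(x) = nCx * p^x * (1 - p)^(n - x)."""
    return comb(n, x) * p**x * (1 - p) ** (n - x)

def binomial_cdf(x_max, n, p):
    """P(X <= x_max): sum of P(x) for x = 0, 1, ..., x_max."""
    return sum(binomial_pmf(x, n, p) for x in range(x_max + 1))

p_at_most_2 = binomial_cdf(2, 20, 0.50)        # Example 3: 2 or fewer heads in 20 tosses
p_at_least_3 = 1 - binomial_cdf(2, 15, 0.50)   # Example 4: complement of x = 0, 1, 2

print(f"{p_at_most_2:.6f}")   # 0.000201
print(f"{p_at_least_3:.4f}")  # 0.9963
```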

Hypergeometric Experiment
It is a statistical experiment that has the following properties:
a. A sample of size n is randomly selected without replacement from a population
of N items.
b. In the population, k items can be classified as successes and N – k items
can be classified as failures.

Hypergeometric Random Variable
It is the number of successes that result from a hypergeometric experiment.

Example: 1. You have an urn of 10 marbles, 5 red and 5 green. You randomly select
marbles one at a time without replacement.
a. What is the probability of selecting a red marble on the 1st trial and on the
2nd trial?
Ans: 1st trial = 5/10 or 1/2; 2nd trial = 5/9 if the first marble drawn was green
b. On your first trial, you get 1 red marble. What is the probability of getting
another red marble on the 2nd trial?
Ans: 4/9
c. What is the probability of selecting a green marble on each of the 1st to 4th
trials if every earlier trial produced a red marble?
Ans: 1st trial = 5/10 or 1/2, 2nd trial = 5/9,
3rd trial = 5/8, and 4th trial = 5/7
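
Because the marbles are not replaced, the urn changes after every draw. The sketch below tracks the urn contents to reproduce the answers to parts (b) and (c). The function name `draw_probability` and the dictionary layout are illustrative assumptions.

```python
from fractions import Fraction

def draw_probability(color, urn):
    """Probability that the next marble drawn has the given color."""
    return Fraction(urn[color], sum(urn.values()))

# Part (b): a red marble was drawn first and not replaced
urn = {"red": 5, "green": 5}
urn["red"] -= 1
p_second_red = draw_probability("red", urn)

# Part (c): chance of green on each trial when every earlier draw was red
urn = {"red": 5, "green": 5}
p_green_by_trial = []
for _ in range(4):
    p_green_by_trial.append(draw_probability("green", urn))
    urn["red"] -= 1  # assume this trial produced a red marble

print(p_second_red)                        # 4/9
print([str(p) for p in p_green_by_trial])  # ['1/2', '5/9', '5/8', '5/7']
```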

Hypergeometric Distribution
It is the probability distribution of a hypergeometric random variable.

Formula:
p(x) = (kCx) [(N − k)C(n − x)] / (NCn)

where:
N = number of items in a population
k = number of items in the population that are classified as successes
n = number of items in the sample
x = number of items in the sample that are classified as successes

Mean: μ = nk / N
Variance: σ² = k(N − k) n(N − n) / [N²(N − 1)]
Standard deviation: σ = √{ k(N − k) n(N − n) / [N²(N − 1)] }

Example: 1. Suppose we randomly select 5 cards without replacement from an ordinary
deck of playing cards. What is the probability of getting exactly 2 red cards, and
what are its mean, variance, and standard deviation?
N = 52   n = 5   x = 2   k = 26

μ = nk / N = 5(26) / 52 = 130 / 52 = 2.5

σ² = k(N − k) n(N − n) / [N²(N − 1)]
= 26(52 − 26) 5(52 − 5) / [52²(52 − 1)]
= 158,860 / 137,904 = 1.152

σ = √{ k(N − k) n(N − n) / [N²(N − 1)] }
= √(158,860 / 137,904)
= √1.152 = 1.07

p(x) = (kCx) [(N − k)C(n − x)] / (NCn)
= (26C2) [(52 − 26)C(5 − 2)] / (52C5)
= (26C2)(26C3) / (52C5)
= 325(2,600) / 2,598,960 = 0.3251
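
The red-card example above can be verified with a short sketch that follows the same formulas for p(x), μ, σ², and σ. The function names are illustrative, and the arguments follow the module's symbols N, k, n, and x.

```python
from math import comb, sqrt

def hypergeom_pmf(x, N, k, n):
    """p(x) = (kCx) * ((N - k)C(n - x)) / (NCn)."""
    return comb(k, x) * comb(N - k, n - x) / comb(N, n)

def hypergeom_moments(N, k, n):
    """Return (mean, variance, standard deviation)."""
    mean = n * k / N
    variance = k * (N - k) * n * (N - n) / (N**2 * (N - 1))
    return mean, variance, sqrt(variance)

# 5 cards from a 52-card deck, with the 26 red cards counted as successes
p_two_red = hypergeom_pmf(2, N=52, k=26, n=5)
mean, variance, sd = hypergeom_moments(N=52, k=26, n=5)

print(f"{p_two_red:.4f}")                        # 0.3251
print(mean, round(variance, 3), round(sd, 2))    # 2.5 1.152 1.07
```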
2. Suppose we select 5 cards from an ordinary deck of playing cards. What is the
probability of obtaining 2 or fewer hearts?
N = 52   n = 5   x = 0, 1, 2   k = 13

p(x ≤ 2) = p(x=0) + p(x=1) + p(x=2)

μ = nk / N = 5(13) / 52 = 65 / 52 = 1.25

σ² = k(N − k) n(N − n) / [N²(N − 1)]
= 13(52 − 13) 5(52 − 5) / [52²(52 − 1)]
= 119,145 / 137,904 = .86

σ = √{ k(N − k) n(N − n) / [N²(N − 1)] }
= √(119,145 / 137,904)
= √.86 = .93

p(x=0) = (13C0) [(52 − 13)C(5 − 0)] / (52C5)
= (1)(39C5) / (52C5)
= 575,757 / 2,598,960 = 0.2215

p(x=1) = (13C1) [(52 − 13)C(5 − 1)] / (52C5)
= (13)(39C4) / (52C5)
= 13(82,251) / 2,598,960 = 1,069,263 / 2,598,960 = 0.4114

p(x=2) = (13C2) [(52 − 13)C(5 − 2)] / (52C5)
= (78)(39C3) / (52C5)
= 78(9,139) / 2,598,960 = 712,842 / 2,598,960 = 0.2743

p(x ≤ 2) = p(x=0) + p(x=1) + p(x=2)
= 0.2215 + 0.4114 + 0.2743
= .9072
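
A "2 or fewer" hypergeometric probability is the sum of the individual p(x) values. The sketch below repeats the hearts example, with `hypergeom_pmf` as an illustrative helper name.

```python
from math import comb

def hypergeom_pmf(x, N, k, n):
    """p(x) = (kCx) * ((N - k)C(n - x)) / (NCn)."""
    return comb(k, x) * comb(N - k, n - x) / comb(N, n)

# 5 cards from a 52-card deck, with the 13 hearts counted as successes
terms = [hypergeom_pmf(x, N=52, k=13, n=5) for x in (0, 1, 2)]
p_at_most_2_hearts = sum(terms)

print([f"{t:.4f}" for t in terms])    # ['0.2215', '0.4114', '0.2743']
print(f"{p_at_most_2_hearts:.4f}")    # 0.9072
```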

Activities to be answered in your Activity Notebook

1. If I toss a coin 14 times, what is the probability of getting 4 or fewer heads?

2. Suppose we randomly select 4 cards without replacement from an ordinary
deck of playing cards. What is the probability of getting 2 or fewer red cards
from Ace to 10, and what are its mean, variance, and standard deviation?
