
Statistics With R Unit 3


Unit III

Probability, Probability & Sampling Distribution

Experiment, Sample Space and Events, Classical Probability, General Rules Of Addition,
Conditional Probability, General Rules For Multiplication, Independent Events, Bayes’ Theorem,
Discrete Probability Distributions: Binomial, Poisson, Continuous Probability Distribution, Normal
Distribution & t-distribution, Sampling Distribution and Central Limit Theorem.
Probability
A random experiment is a mechanism that produces a definite outcome that cannot be
predicted with certainty. The sample space associated with a random experiment is the set of all
possible outcomes. An event is a subset of the sample space.
An event E is said to occur on a particular trial of the experiment if the outcome observed is an
element of the set E.

Example 1 Tossing a coin. The sample space is S = {H, T}. E = {H} is an event.

Example 2 Tossing a die. The sample space is S = {1, 2, 3, 4, 5, 6}. E = {2, 4, 6} is an event,

which can be described in words as "the number is even".

A sample space is a collection or set of possible outcomes of a random experiment. The
sample space is represented using the symbol "S". A subset of the possible outcomes of an
experiment is called an event. The number of outcomes a sample space contains depends on
the experiment. If it contains a finite number of outcomes, it is known as a discrete or finite
sample space.

A probability event can be defined as a set of outcomes of an experiment. In other words, an
event in probability is a subset of the respective sample space. The sample space is the entire
set of possible outcomes of a random experiment. The likelihood of occurrence of an event is
known as its probability, and the probability of occurrence of any event lies between 0 and 1.

 Sample space: A sample space can be defined as the list of all possible
outcomes of a random experiment.
 Outcome: An outcome is a possible result of the random experiment.
 Event: An event is a subset of the sample space, i.e. a set of outcomes of the
experiment.
 Trial: When a random experiment is repeated many times, each repetition is known
as a trial.
Classical Probability

Probability is a measure of the likelihood that an event will occur. Many events cannot be
predicted with total certainty. Using probability, we can predict only the chance that an event
will occur, i.e. how likely it is to happen. Probability can range from 0 to 1, where 0 means the
event is impossible and 1 indicates a certain event. The probabilities of all the elementary
events in a sample space add up to 1.

The probability formula states that the probability of an event is equal to the ratio of
the number of favourable outcomes to the total number of outcomes.

Probability of an event: P(E) = Number of favourable outcomes / Total number of outcomes
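As an illustrative sketch (not part of the original text), the classical probability formula can be checked in Python using the die example from above:

```python
from fractions import Fraction

# Classical probability: P(E) = favourable outcomes / total outcomes.
sample_space = {1, 2, 3, 4, 5, 6}   # S for a single die roll
event = {2, 4, 6}                   # E = "the number is even"

p_event = Fraction(len(event), len(sample_space))
print(p_event)  # 1/2
```

Using `Fraction` keeps the answer exact instead of a rounded decimal.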

General Rules Of Addition

The addition rule gives the probability that at least one of two events occurs. For any two
events A and B:

P(A ∪ B) = P(A) + P(B) − P(A ∩ B)

The term P(A ∩ B) is subtracted because outcomes belonging to both A and B would
otherwise be counted twice. If A and B are mutually exclusive, then P(A ∩ B) = 0 and the
rule simplifies to:

P(A ∪ B) = P(A) + P(B)
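A minimal Python sketch of the general addition rule P(A ∪ B) = P(A) + P(B) − P(A ∩ B), with die events chosen purely for illustration:

```python
from fractions import Fraction

# One die roll: A = "even number", B = "number greater than 4".
S = {1, 2, 3, 4, 5, 6}
A = {2, 4, 6}
B = {5, 6}

def p(event):
    # Classical probability: favourable / total.
    return Fraction(len(event), len(S))

# Both sides of the addition rule should agree.
lhs = p(A | B)                      # P(A ∪ B) directly
rhs = p(A) + p(B) - p(A & B)        # P(A) + P(B) − P(A ∩ B)
print(lhs, rhs)  # 2/3 2/3
```

Without subtracting P(A ∩ B), the outcome 6 (which is even and greater than 4) would be double-counted.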

Conditional Probability
Conditional probability is defined as the likelihood of an event or outcome
occurring given that a previous event or outcome has occurred.
Conditional probability is calculated by dividing the probability that both
events occur by the probability of the conditioning event.

P(B|A) = P(A∩B) / P(A)

Where
P = Probability
A = Event A
B = Event B
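A small Python sketch of the conditional-probability formula, using illustrative die events:

```python
from fractions import Fraction

# P(B|A) = P(A ∩ B) / P(A), illustrated on one die roll.
S = {1, 2, 3, 4, 5, 6}
A = {2, 4, 6}        # even number
B = {4, 5, 6}        # number greater than 3

def p(event):
    return Fraction(len(event), len(S))

# Given the roll is even, the chance it is greater than 3:
p_b_given_a = p(A & B) / p(A)
print(p_b_given_a)  # 2/3
```

Conditioning on A shrinks the sample space to {2, 4, 6}; two of those three outcomes lie in B.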
General Rules For Multiplication
The multiplication rule of probability relates the joint occurrence of two events. For two events
A and B associated with a sample space S, the set A∩B denotes the outcomes in which both
event A and event B have occurred. Hence, (A∩B) denotes the simultaneous occurrence of
events A and B. The event A∩B can also be written as AB. The probability of event AB is obtained
by using the properties of conditional probability.

According to the multiplication rule of probability, the probability that both
events A and B occur is equal to the product of the probability that B occurs and the
conditional probability that event A occurs given that event B has occurred.

If A and B are dependent events, then the probability of both events occurring simultaneously is
given by:

P(A ∩ B) = P(B) . P(A|B)


If A and B are two independent events in an experiment, then the probability of both events
occurring simultaneously is given by:

P(A ∩ B) = P(A) . P(B)
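The two multiplication rules can be illustrated with a short Python sketch (the card and coin examples below are illustrative additions, not from the text):

```python
from fractions import Fraction

# Dependent events: drawing two aces from a 52-card deck without replacement.
p_first_ace = Fraction(4, 52)               # P(A)
p_second_ace_given_first = Fraction(3, 51)  # P(B|A): one ace already gone
p_both_aces = p_first_ace * p_second_ace_given_first  # P(A ∩ B) = P(A) · P(B|A)
print(p_both_aces)  # 1/221

# Independent events: two fair coin flips both landing heads.
p_heads = Fraction(1, 2)
p_two_heads = p_heads * p_heads             # P(A ∩ B) = P(A) · P(B)
print(p_two_heads)  # 1/4
```

In the card example the second probability changes because the first draw alters the deck; in the coin example the two factors stay the same because the flips do not influence each other.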

Mutually Exclusive Events


Two or more events are mutually exclusive events if the occurrence of one event
precludes the occurrence of the other event(s). This characteristic means that
mutually exclusive events cannot occur simultaneously and therefore can have no
intersection.

For mutually exclusive events: P(X ∩ Y) = 0

Independent Events
Two or more events are independent events if the occurrence or nonoccurrence of
one of the events does not affect the occurrence or nonoccurrence of the other
event(s).
For independent events: P(X|Y) = P(X) and P(Y|X) = P(Y)
For dependent events: P(X ∩ Y) = P(Y) · P(X|Y), where P(X|Y) ≠ P(X)

Independent events
Independent events are events whose occurrence does not depend on any other event. For
example, suppose we flip a coin in the air and get heads, then flip the coin again and this
time get tails. In both cases, the occurrence of each event is independent of the other. This is
one of the types of events in probability. Let us learn here the complete definition of
independent events, along with examples and how they differ from mutually exclusive events.

In probability, a set of outcomes of an experiment is called an event. There are different types of
events such as independent events, dependent events, mutually exclusive events, and so on.

If the probability of occurrence of an event A is not affected by the occurrence of another event
B, then A and B are said to be independent events.

Consider an example of rolling a die. If A is the event 'the number appearing is odd' and B is the
event 'the number appearing is a multiple of 3', then

P(A) = 3/6 = 1/2 and P(B) = 2/6 = 1/3

Here A ∩ B = {3}, so P(A ∩ B) = 1/6 = P(A) · P(B), which confirms that A and B are independent.
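A short Python sketch can confirm that these two die events are independent, i.e. that P(A ∩ B) = P(A) · P(B):

```python
from fractions import Fraction

# A = "odd number", B = "multiple of 3" on one die roll.
S = {1, 2, 3, 4, 5, 6}
A = {1, 3, 5}
B = {3, 6}

def p(event):
    return Fraction(len(event), len(S))

# A and B are independent exactly when P(A ∩ B) = P(A) · P(B).
print(p(A & B))     # 1/6
print(p(A) * p(B))  # 1/6
```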

Bayes' Theorem

Bayes' theorem is also known as Bayes' rule, Bayes' law, or Bayesian reasoning;
it determines the probability of an event under uncertain knowledge.

In probability theory, it relates the conditional probabilities and marginal probabilities
of two random events.

Bayes' theorem is named after the British mathematician Thomas Bayes.

Bayesian inference, which is fundamental to Bayesian statistics, is an application of
Bayes' theorem.

It is a way to calculate the value of P(B|A) with the knowledge of P(A|B).

Bayes' theorem allows updating the probability prediction of an event by observing
new information about the real world.

Example: if the risk of cancer depends on a person's age, then by using Bayes' theorem
we can determine the probability of cancer more accurately with the help of age.

Bayes' theorem can be derived using product rule and conditional probability of
event A with known event B:
From the product rule we can write:

P(A ∩ B) = P(A|B) P(B)

Similarly, the probability of event B with known event A:

P(A ∩ B) = P(B|A) P(A)

Equating the right-hand sides of both equations, we get:

P(A|B) = P(B|A) P(A) / P(B)

Application of Bayes' theorem in Artificial intelligence:

Following are some applications of Bayes' theorem:

o It is used to calculate the next step of the robot when the already executed
step is given.
o Bayes' theorem is helpful in weather forecasting.
o It can solve the Monty Hall problem.
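Bayes' theorem can also be applied numerically. A minimal Python sketch follows; the diagnostic-test numbers are hypothetical, chosen only for illustration:

```python
from fractions import Fraction

# Hypothetical diagnostic test: P(A|B) = P(B|A) · P(A) / P(B).
p_disease = Fraction(1, 100)              # P(A): prior probability of disease
p_pos_given_disease = Fraction(95, 100)   # P(B|A): test sensitivity
p_pos_given_healthy = Fraction(5, 100)    # false-positive rate

# Total probability of a positive test, P(B), by the law of total probability:
p_pos = (p_pos_given_disease * p_disease
         + p_pos_given_healthy * (1 - p_disease))

# Bayes' theorem: probability of disease given a positive test.
p_disease_given_pos = p_pos_given_disease * p_disease / p_pos
print(p_disease_given_pos)  # 19/118, roughly 16%
```

Even with a sensitive test, the posterior probability stays modest because the prior P(A) is small, which is exactly the kind of update Bayes' theorem captures.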

Discrete Probability Distribution

A discrete probability distribution is a type of probability distribution that
shows all possible values of a discrete random variable along with the
associated probabilities. In other words, a discrete probability distribution
gives the likelihood of occurrence of each possible value of a discrete
random variable.
Geometric distributions, binomial distributions, and Bernoulli distributions
are some commonly used discrete probability distributions. This section
covers the definition of a discrete probability distribution, its formulas,
types, and various associated examples.

Binomial Distribution
Binomial distribution is a probability distribution used in statistics that
summarizes the likelihood that a value will take one of two independent
values under a given set of parameters or assumptions.

The underlying assumptions of the binomial distribution are that there is only
one outcome for each trial, that each trial has the same probability of
success, and that the trials are independent of one another.

KEY TAKEAWAYS

 Binomial distribution is a probability distribution in statistics that
summarizes the likelihood that a value will take one of two
independent values under a given set of parameters or assumptions.
 The underlying assumptions are that there is only one outcome for
each trial, that each trial has the same probability of success, and
that the trials are independent of one another.
 Binomial distribution is a common discrete distribution used in
statistics, as opposed to a continuous distribution, such as the normal
distribution.

The binomial distribution formula is calculated as:

P(X = x) = nCx · p^x · (1 − p)^(n − x)

where:

 n is the number of trials (occurrences)
 x is the number of successful trials
 p is the probability of success in a single trial
 nCx is the number of combinations of n things taken x at a time. A
combination is the number of ways to choose a sample of x elements
from a set of n distinct objects where order does not matter and
replacements are not allowed. Note that nCx = n!/(x!(n − x)!), where
! denotes the factorial (so, 4! = 4 × 3 × 2 × 1).

Properties of Binomial Distribution


The properties of the binomial distribution are:

 There are two possible outcomes: true or false, success or failure, yes or no.
 There is ‘n’ number of independent trials or a fixed number of n times repeated trials.
 The probability of success or failure remains the same for each trial.
 Only the number of successes is counted out of n independent trials.
 Every trial is an independent trial, which means the outcome of one trial does not affect
the outcome of another trial.
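A minimal Python sketch of the binomial formula above; the coin-toss numbers are illustrative:

```python
from math import comb

# Binomial pmf: P(X = x) = C(n, x) · p^x · (1 − p)^(n − x).
def binomial_pmf(x, n, p):
    return comb(n, x) * p**x * (1 - p)**(n - x)

# Probability of exactly 3 heads in 5 fair coin tosses.
print(binomial_pmf(3, 5, 0.5))  # 0.3125
```

Summing the pmf over x = 0 … n gives 1, as required of any probability distribution.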

Poisson distribution:
The Poisson distribution is another probability distribution formula. Unlike the
binomial distribution, we are not given the number of trials or the probability of
success on a certain trial. Instead, the average number of successes in a
certain time interval is given. The average number of successes is called
"lambda" and denoted by the symbol λ.

The formula for the Poisson distribution is given below:

P(X = x) = e^(−λ) λ^x / x!

Here,

λ is the average number of occurrences during an interval
x is a Poisson random variable
e is the base of the natural logarithm, e ≈ 2.71828
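The Poisson formula can be sketched in Python with only the standard library (λ = 2 is an illustrative value, not from the text):

```python
from math import exp, factorial

# Poisson pmf: P(X = x) = e^(−λ) · λ^x / x!
def poisson_pmf(x, lam):
    return exp(-lam) * lam**x / factorial(x)

# With an average of λ = 2 events per interval, P(X = 3):
print(round(poisson_pmf(3, 2), 4))  # 0.1804
```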

Continuous probability distribution

Continuous probability distribution: a probability distribution in which the random
variable X can take on any value (is continuous). Because there are infinitely many
values that X could assume, the probability of X taking on any one specific value is
zero. Therefore we often speak in ranges of values, e.g. P(X > 0) = 0.50. The normal
distribution is one example of a continuous distribution. The probability that X falls
between two values a and b equals the integral (area under the curve) of the density
f(x) from a to b:

P(a ≤ X ≤ b) = ∫_a^b f(x) dx

Gaussian distribution
The Gaussian distribution, also called the normal distribution, is a probability
distribution that is symmetric about the mean, showing that data near the mean
are more frequent in occurrence than data far from the mean.
KEY TAKEAWAYS

 The normal distribution is the proper term for a probability bell curve.
 In the standard normal distribution the mean is 0 and the standard
deviation is 1. It has zero skew and a kurtosis of 3.
 Normal distributions are symmetrical, but not all symmetrical
distributions are normal.
 Many naturally occurring phenomena tend to approximate the
normal distribution.
 In finance, however, most pricing distributions are not perfectly
normal.

The normal probability density function is:

f(x) = (1 / (σ √(2π))) · e^(−(x − μ)² / (2σ²))

where:

 x = value of the variable or data being examined, and f(x) the
probability density
 μ = the mean
 σ = the standard deviation

What Is Meant By the Normal Distribution?


The normal distribution describes a symmetrical plot of data around its
mean value, where the width of the curve is defined by the standard
deviation. It is visually depicted as the "bell curve."
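A sketch of the normal density and its familiar 68% rule, using only the Python standard library (`math.erf` gives the CDF without external packages):

```python
from math import erf, exp, pi, sqrt

# Normal pdf: f(x) = (1 / (σ√(2π))) · e^(−(x−μ)² / (2σ²))
def normal_pdf(x, mu=0.0, sigma=1.0):
    return exp(-((x - mu) ** 2) / (2 * sigma**2)) / (sigma * sqrt(2 * pi))

# Normal cdf via the error function.
def normal_cdf(x, mu=0.0, sigma=1.0):
    return 0.5 * (1 + erf((x - mu) / (sigma * sqrt(2))))

# About 68% of values fall within one standard deviation of the mean.
print(round(normal_cdf(1) - normal_cdf(-1), 4))  # 0.6827
```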

t-Distribution

The t distribution (also called Student's t distribution) is a family of
distributions that look almost identical to the normal distribution curve, only
a bit shorter and fatter. The t distribution is used instead of the normal
distribution when you have small samples. The larger the sample size, the
more the t distribution looks like the normal distribution. In fact, for sample
sizes larger than about 20 (i.e., more degrees of freedom), the distribution is
almost exactly like the normal distribution.

Sampling distribution

A sampling distribution is a probability distribution of a statistic obtained
from a larger number of samples drawn from a specific population. The
sampling distribution of a given population is the distribution of frequencies
of a range of different outcomes that could possibly occur for a statistic of
that population.

 A sampling distribution is a probability distribution of a statistic that is obtained through
repeated sampling of a specific population.
 It describes a range of possible outcomes for a statistic, such as the mean or mode of some
variable, of a population.
 The majority of data analyzed by researchers are actually samples, not populations.

Central Limit Theorem

The central limit theorem is a statistical theory which states that when sufficiently large
samples are drawn from a population with finite variance, the sample means will be
approximately normally distributed, and the mean of the sample means will be
approximately equal to the mean of the whole population.

In other words, the central limit theorem states that for any population with mean μ and
standard deviation σ, the distribution of the sample mean for sample size n has mean μ and
standard deviation σ / √n.

Central Limit Theorem for Sample Means:

Z = (x̄ − μ) / (σ / √n)

In probability theory, the central limit theorem (CLT) states that the distribution of a sample
variable approximates a normal distribution (i.e., a “bell curve”) as the sample size becomes
larger, assuming that all samples are identical in size, and regardless of the population's actual
distribution shape.

KEY TAKEAWAYS

 The central limit theorem (CLT) states that the distribution of sample
means approximates a normal distribution as the sample size gets
larger, regardless of the population's distribution.
 Sample sizes equal to or greater than 30 are often considered
sufficient for the CLT to hold.
 A key aspect of the CLT is that the average of the sample means
will equal the population mean, while the standard deviation of the
sample means equals the population standard deviation divided by √n.
 A sufficiently large sample size can predict the characteristics of a
population more accurately.
 CLT is useful in finance when analyzing a large collection of
securities to estimate portfolio distributions and traits for returns, risk,
and correlation.

Key Components of the Central Limit Theorem


The central limit theorem comprises several key characteristics.
These characteristics largely revolve around samples, sample sizes, and
the population of the data.
1. Sampling is successive. This means some sample units are
common with sample units selected on previous occasions.
2. Sampling is random. All samples must be selected at random so
that they have the same statistical possibility of being selected.
3. Samples should be independent. The selections or results from
one sample should have no bearing on future samples or other
sample results.
4. Samples should be limited. It's often cited that a sample should be
no more than 10% of a population if sampling is done without
replacement. In general, larger population sizes warrant the use of
larger sample sizes.
5. Sample size is increasing. The central limit theorem is relevant as
more samples are selected.
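The behaviour described above can be sketched with a small simulation; the die population, the sample size of 30, and the 2,000 repetitions are illustrative choices:

```python
import random
import statistics

# CLT sketch: means of samples drawn from a very non-normal population
# (a fair die, population mean 3.5) cluster around the population mean.
random.seed(42)

sample_means = [
    statistics.mean(random.randint(1, 6) for _ in range(30))  # n = 30 per sample
    for _ in range(2000)                                      # 2000 repeated samples
]

# The grand average of the sample means lands very close to 3.5.
print(round(statistics.mean(sample_means), 2))
```

Plotting a histogram of `sample_means` would show the familiar bell shape, even though each individual die roll is uniformly distributed.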
