Reviewer in Eda
2022-2023
MIDTERM REVIEWER MATH 403 – EDA
Lesson 1
OBTAINING DATA
STATISTICS: Defined as the science that deals with the collection, organization, presentation, analysis, and interpretation of data in order to be able to draw judgments or conclusions that help in the decision-making process.
2 Major Branches of Statistics:
▪ Descriptive Statistics: Deals with the procedures that organize, summarize and describe quantitative data.
▪ Inferential Statistics: Deals with making a judgment or a conclusion about a population based on the findings from a sample that is taken from the population.
COMMON STATISTICAL TERMS
Population: Refers to the totality of objects, persons, places or things used in a particular study.
Sample: Any subset of a population, or a few members of a population. If the sample is to be informative about the total population, it must be, in some sense, representative of that population. The purpose of a sample is to save time and money.
Parameter: Descriptive measure of a characteristic of a population.
Statistic: Descriptive measure of a characteristic of a sample.
PARAMETER AND STATISTIC:
▪ Average height of all students of Batangas State University. (Population)
▪ Average weight of 100 students of Batangas State University. (Sample)
▪ Average test score of 10 students in a class of 50. (Sample)
▪ Number of registered voters in Batangas. (Population)
DATA: Facts, figures and information collected on some characteristics of a population or sample.
Constant - Characteristic or property of a population or sample which is common to all members of the group. (The value is fixed.)
Variable - A measure, characteristic or property of a population or sample that may have several different values. (The value varies or changes.)
Ungrouped Data - Data which are not organized in any specific way. Also called RAW DATA.
Ex: {1.0, 1.25, 2.5, 1.0, 1.25, 1.75, 1.0}
Grouped Data - Raw data organized into groups or categories with corresponding frequencies.
Ex:
▪ # of 1.0 to 1.50 – 5
▪ # of 1.5 to 2.5 – 2
TYPES OF DATA
Qualitative Data: Data that can be placed into distinct categories, according to some characteristic or attribute.
Ex: Course, Religion
Quantitative Data: Data that are numerical and can be ordered or ranked.
Ex: Age, Height, Weight
Identify if data is qualitative or quantitative:
▪ Number of bicycles sold in 1 year by a large sporting goods store. (Quantitative)
▪ Colors of baseball caps in a store. (Qualitative)
▪ Time it takes to cut a lawn. (Quantitative)
▪ Capacity in cubic feet of six truck beds. (Quantitative)
▪ Classification of children in a day care center (infant, toddler, preschool). (Qualitative)
▪ Weights of fish caught in Taal Lake. (Quantitative)
▪ Marital status of faculty members in a large university. (Qualitative)
Quantitative data can be classified into two groups:
Discrete - It can only have specific values.
▪ They can be counted
▪ Examples
– Sides of dice
– Number of students in a class
Continuous - It can take on any value in an interval. They are obtained by measuring.
SECOND YEAR – FIRST SEMESTER EE-2103 A.Y. 2022-2023
Sometimes the data can be counted in principle, but the total is too large to count in practice, like the total number of grains of sand on Earth.
Examples:
Temperature
Speed
Weight
Height
Identify Discrete and Continuous data:
▪ Number of doughnuts sold each day by Krispy Kreme. (Discrete)
▪ Water temperatures of six swimming pools in Sports Complex on a given day. (Continuous)
▪ Weights of cats in a pet shelter. (Continuous)
▪ Lifetime of 12 flashlight batteries. (Continuous)
▪ Number of cheeseburgers sold each day by a burger stand at BSU. (Discrete)
▪ Number of DVDs rented each day by a video store. (Discrete)
▪ Time between two flashes of lightning. (Continuous)
METHODS OF DATA COLLECTION
Data Collection - The process of acquiring information from different sources about the topic under study. This involves acquiring information from published literature, surveys through questionnaires or interviews, experimentations, documents and records, tests or examinations and other forms of data gathering instruments.
SOURCES OF DATA
Primary Data - Data collected by the researcher, for the very first time, from different resources, with a particular problem, question, or specific purpose in mind.
Ex: Survey, Questionnaire, Experiment, Observation, Telephone Interview
Secondary Data - Data collected by any person, organization or agency in the past through surveys, experiments or studies, for some other purpose, but used by the researcher to deal with the problem at hand.
Ex: Newspaper, Websites, Government Publications, Records, Books
BASIC METHODS OF COLLECTING DATA
▪ Observational Study
▪ Simulation
▪ Survey
▪ Designed Experiment
Observational Study
▪ A researcher observes and measures characteristics of interest of part of a population
▪ Two types:
– Retrospective Observational Study: A study that observes past (historical) data.
– Prospective Observational Study: A study that starts at the present and follows a group for a set period of time.
SIMULATION
▪ Uses a mathematical or physical model to reproduce the conditions of a situation or process.
▪ Often involves the use of computers.
▪ Allows you to study situations that are impractical or even dangerous to create in real life.
▪ Often saves time and money.
▪ Example:
– Crash test dummies
SAMPLING TECHNIQUE
▪ Sampling is the process of selecting units (e.g., people, organizations) from a population of interest.
▪ The sample must be representative of the target population.
▪ The target population is the entire group a researcher is interested in; the group about which the researcher wishes to draw conclusions.
TWO WAYS OF SELECTING A SAMPLE
1. NON-PROBABILITY SAMPLING
2. PROBABILITY SAMPLING
NON-PROBABILITY SAMPLING
▪ Not every member of the population has a chance of being selected.
▪ Non-probability sampling is also called judgment or subjective sampling.
▪ This method is convenient and economical but the inferences made based on the findings are not so reliable.
▪ Common types:
– Convenience Sampling - Selecting a sample based on the availability of the respondent and/or proximity to the researcher. Also known as
accidental, opportunity or grab sampling.
– Purposive Sampling - Samples are chosen based on the goals of the study. They may be chosen based on their knowledge of the study being conducted or if they satisfy the traits or conditions set by the researcher.
– Quota Sampling
▪ PROPORTIONAL - In proportional quota sampling, the major characteristics of the population are represented by sampling a proportional amount of each.
▪ NON-PROPORTIONAL - In this method, a minimum number of sampled units in each category is specified, and the researcher is not concerned with having numbers that match the proportions in the population.
PROBABILITY SAMPLING
▪ Every member of the population is given an equal chance to be selected as a part of the sample.
▪ There are several probability techniques:
– Simple Random Sampling
– Systematic Random Sampling
– Stratified Sampling
– Cluster Sampling
SIMPLE RANDOM SAMPLING (Draw lots) - is the basic sampling technique where a group of subjects (a sample) is selected for study from a larger group (a population).
Each individual is chosen entirely by chance and each member of the population has an equal chance of being included in the sample.
SYSTEMATIC RANDOM SAMPLING - is a random sampling that uses a list of all the elements in the population; elements are then selected at a consistent interval, taking every kth element. To get the interval k, divide the population size by the sample size.
STRATIFIED SAMPLING - A stratified sample is obtained by taking samples from each stratum or sub-group of a population.
When a sample is to be taken from a population with several strata, the proportion of each stratum in the sample should be the same as in the population.
CLUSTER SAMPLING - is a sampling technique where the entire population is divided into groups, or clusters, and a random sample of these clusters is selected. All observations in the selected clusters are included in the sample.
Identify the sampling being performed.
▪ Each of the 30 high school basketball teams has 12 players. The organizer wants to have a quick survey to know the average height of the players.
1. Each team was asked to place papers with its players’ names into separate fishbowls, and five names were randomly drawn from each bowl. The five names from each team were combined to make up the sample. Which of the following sampling techniques was used in this situation?
a. Cluster
b. Simple
c. Stratified
d. Systematic
2. The organizer listed all the players on a sheet of paper and then assigned a unique number to each. Sixty numbers were picked to get the sample. Which random sampling technique did the organizer apply?
a. Cluster
b. Simple
c. Stratified
d. Systematic
3. All players were grouped according to their ages, and players were chosen from each group to measure their heights. Which random sampling technique did he apply?
a. Cluster
b. Simple
c. Stratified
d. Systematic
4. The organizer created a list of all players, decided to survey every sixth name on the list, and later asked those players that were selected to answer a questionnaire. Which random sampling technique did he apply?
a. Cluster
b. Simple
c. Stratified
d. Systematic
5. A team was randomly selected to answer the question prior to the study. Which random sampling technique did he apply?
a. Cluster
b. Simple
c. Stratified
d. Systematic
▪ A National High School has 2,000 first year high school students. Mrs. Mogol, the school principal, wanted to obtain information from these students about last year’s lesson that had not been tackled.
1. What is the target population in her study?
a. All students in her school
b. Parents of all students in her school
c. First year high school students in her school
d. Parents of first year high school students in her school
2. The principal created a list of all grade 7 students and decided to survey every seventh student on the list. Which random sampling technique did she apply?
a. Cluster
b. Simple
c. Stratified
d. Systematic
3. Mrs. Mogol wrote the name of each first-year high school student on a small piece of paper, put the papers in a box and drew 300 names to participate in the study. Which random sampling technique did she apply?
a. Cluster
b. Simple
c. Stratified
d. Systematic
4. The principal grouped the first-year high school students according to the barangay where they live. She randomly picked a barangay and all of the students living in that barangay answered the questionnaire. Which random sampling technique did she apply?
a. Cluster
b. Simple
c. Stratified
d. Systematic
5. Mrs. Mogol grouped the first-year high school students according to the school last attended. She proportionately and randomly chose students from each group. Which random sampling technique did she apply?
a. Cluster
b. Simple
c. Stratified
d. Systematic
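To make the four probability sampling techniques from this lesson concrete, here is a minimal Python sketch built on the 30 teams of 12 players from the exercise. The function names, the fixed seed, and the choice to treat each team as both a stratum and a cluster are assumptions made for illustration only. Because every team has the same size, drawing the same number from each team is also proportional.

```python
import random

def simple_random_sample(population, n, rng):
    """Draw n distinct units purely by chance (draw lots)."""
    return rng.sample(population, n)

def systematic_sample(population, n, rng):
    """Take every k-th unit from the list, k = population size // sample size."""
    k = len(population) // n
    start = rng.randrange(k)  # random starting point inside the first interval
    return [population[start + i * k] for i in range(n)]

def stratified_sample(strata, per_stratum, rng):
    """Draw the same number of units at random from every stratum."""
    sample = []
    for members in strata.values():
        sample.extend(rng.sample(members, per_stratum))
    return sample

def cluster_sample(clusters, n_clusters, rng):
    """Pick whole clusters at random and keep every unit inside them."""
    chosen = rng.sample(sorted(clusters), n_clusters)
    return [unit for name in chosen for unit in clusters[name]]

rng = random.Random(403)  # arbitrary fixed seed so the run is repeatable

# Hypothetical roster: 30 teams, 12 players each, labelled (team, player).
teams = {t: [(t, p) for p in range(1, 13)] for t in range(1, 31)}
players = [unit for members in teams.values() for unit in members]

srs = simple_random_sample(players, 60, rng)       # 60 of the 360 players
systematic = systematic_sample(players, 60, rng)   # every 6th name on the list
stratified = stratified_sample(teams, 5, rng)      # 5 players from each team
cluster = cluster_sample(teams, 1, rng)            # one whole team

print(len(srs), len(systematic), len(stratified), len(cluster))  # 60 60 150 12
```

The only difference between the stratified and cluster functions is what gets randomized: the stratified draw samples inside every group, while the cluster draw samples the groups themselves and keeps all of their members.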
Lesson 2
PROBABILITY
Probability is the likelihood or chance of an event occurring.
Probability = successful outcomes / total number of possible outcomes = S / T
Example:
Dice
Find the probability of showing an even number from a single roll of a die.
Coin
Find the probability of showing 2 heads from tossing the coin twice.
Sample space (two dice):
S = {(1,1), (1,2), (1,3), (1,4), (1,5), (1,6),
(2,1), (2,2), (2,3), (2,4), (2,5), (2,6),
(3,1), (3,2), (3,3), (3,4), (3,5), (3,6),
(4,1), (4,2), (4,3), (4,4), (4,5), (4,6),
(5,1), (5,2), (5,3), (5,4), (5,5), (5,6),
(6,1), (6,2), (6,3), (6,4), (6,5), (6,6)}
E = sum of 7
E = {(1,6), (2,5), (3,4), (4,3), (5,2), (6,1)}
VENN DIAGRAM - Is a rectangle (the universal set) that includes circles depicting the subsets.
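The ratio of successful outcomes to total outcomes can be checked by listing each sample space and counting. The sketch below does this for the die, the two coin tosses, and the sum-of-7 event on two dice. The helper name `probability` is an assumption for illustration, and the approach assumes every outcome is equally likely.

```python
from fractions import Fraction
from itertools import product

def probability(event, sample_space):
    """Successful outcomes / total outcomes, assuming equally likely outcomes."""
    favorable = [outcome for outcome in sample_space if event(outcome)]
    return Fraction(len(favorable), len(sample_space))

die = list(range(1, 7))
two_tosses = list(product("HT", repeat=2))   # (H,H), (H,T), (T,H), (T,T)
two_dice = list(product(die, repeat=2))      # the 36 ordered pairs in S

p_even = probability(lambda face: face % 2 == 0, die)
p_two_heads = probability(lambda toss: toss == ("H", "H"), two_tosses)
p_sum_7 = probability(lambda roll: sum(roll) == 7, two_dice)

print(p_even, p_two_heads, p_sum_7)  # 1/2 1/4 1/6
```

Using `Fraction` keeps the answers exact, so they can be compared directly with hand-computed ratios such as 6/36.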
INTERSECTION OF EVENTS
A∪B=?
A ∪ B = {1, 3, 5, 7, 9} ∪ {2, 4, 6, 8, 10}
A ∪ B = {1, 2, 3, 4, 5, 6, 7, 8, 9, 10}
E[X] = ∑ X ⋅ P(X)
BINOMIAL EXPERIMENT
A binomial experiment is a statistical experiment that has the following properties:
▪ The experiment consists of n repeated trials.
▪ Each trial can result in just two possible outcomes. We call one of these outcomes a success and the other, a failure.
▪ The probability in each trial is constant.
▪ The probability of success, denoted by p, is the same on every trial.
▪ The probability of failure, denoted by q, is the same on every trial.
▪ The trials are independent; that is, the outcome on one trial does not affect the outcome on other trials.
BINOMIAL PROBABILITY
Binomial probability refers to the probability that a binomial experiment results in exactly “r” successes.
P[X] = nCr ⋅ p^r ⋅ q^(n−r)
Where:
n = number of trials
r = number of successes that result from a binomial experiment
p = probability of success in every trial
q = probability of failure in every trial
q = 1 – p
[Table: “less than / fewer than” and “greater than / more than” cases]
BINOMIAL DISTRIBUTION
It can be represented by:
· Table
· Graph
· Formula
Note:
· The sum of all probabilities in a binomial distribution is 1.
· Every probability is a number between 0 and 1.
The binomial distribution has the following properties:
The mean of the distribution is
E[X] = np
The variance is
Var[X] = npq
The standard deviation is
SD[X] = √(npq)
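The binomial formula and the properties E[X] = np, Var[X] = npq and SD[X] = √(npq) can be checked numerically. The sketch below uses hypothetical values n = 10 and p = 0.5, which are not from the notes. It builds the whole distribution from P[X] = nCr ⋅ p^r ⋅ q^(n−r) and compares the mean and variance computed from the table against the shortcut formulas.

```python
from math import comb, sqrt

def binomial_pmf(r, n, p):
    """P[X = r] = nCr * p^r * q^(n - r), with q = 1 - p."""
    q = 1 - p
    return comb(n, r) * p**r * q ** (n - r)

n, p = 10, 0.5  # hypothetical values chosen for illustration
q = 1 - p

# Distribution as a table: index r holds P[X = r].
pmf = [binomial_pmf(r, n, p) for r in range(n + 1)]

total = sum(pmf)                                    # should be 1
mean = sum(r * pr for r, pr in enumerate(pmf))      # compare with n*p
variance = sum((r - mean) ** 2 * pr for r, pr in enumerate(pmf))  # compare with n*p*q
at_most_3 = sum(pmf[:4])                            # cumulative: P[X <= 3]

print("sum of probabilities:", total)
print("mean:", mean, "vs np =", n * p)
print("variance:", variance, "vs npq =", n * p * q)
print("standard deviation:", sqrt(n * p * q))
print("P[X <= 3]:", at_most_3)
```

Cumulative questions ("at most", "fewer than", "at least", "more than") are answered by adding the relevant entries of `pmf`, as `at_most_3` does.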
POISSON DISTRIBUTION
A POISSON RANDOM VARIABLE is
the number of successes that result
from a Poisson experiment.
The probability distribution of a Poisson
random variable is called a POISSON
DISTRIBUTION
It can be represented by:
·Table
·Graph
·Formula
Note:
·The sum of all probabilities in a poisson
distribution is 1.
·Every probability is a number between
0 and 1.
CUMULATIVE POISSON PROBABILITY
A CUMULATIVE POISSON PROBABILITY
refers to the probability that the Poisson
random variable is greater than some specified
lower limit and less than some specified upper
limit.
[Table: “at most” and “at least” cases]
POISSON PROBABILITY
Poisson probability describes how likely an event is to happen a given number of times within a specific period of time.
The Poisson probability that exactly “r” successes occur in a Poisson experiment, when the mean number of successes is μ, is given by the formula:
P = (μ^r ⋅ e^(−μ)) / r!
Where:
μ = average number of successes
μ = variance of Poisson distribution
μ = mean of the distribution
r = exact number of successes
e = Euler’s number ≈ 2.71828
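As a quick check of the Poisson formula P = μ^r ⋅ e^(−μ) / r!, the sketch below evaluates an exact probability and two cumulative ones. The mean μ = 3 is a hypothetical value chosen for illustration, not a figure from the notes.

```python
from math import exp, factorial

def poisson_pmf(r, mu):
    """P(exactly r successes) = mu^r * e^(-mu) / r!"""
    return mu**r * exp(-mu) / factorial(r)

mu = 3  # hypothetical mean number of successes per period

p_exactly_2 = poisson_pmf(2, mu)
p_at_most_2 = sum(poisson_pmf(r, mu) for r in range(0, 3))  # r = 0, 1, 2
p_at_least_3 = 1 - p_at_most_2                              # complement rule

print(round(p_exactly_2, 4), round(p_at_most_2, 4), round(p_at_least_3, 4))
```

"At least" questions have no upper limit on r, so the complement of the matching "at most" sum is used instead of adding infinitely many terms.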
Addition Rule of Probability
● The probability that a set of mutually
exclusive events will happen in a single trial
is the sum of the probabilities of the
separate events.
P=P(1)+P(2)+P(3)+…+P(n)
EXAMPLES
1. Find the probability of drawing an ACE or a
Face card in a single draw from an ordinary
deck of 52 playing cards.
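The card example can be verified by building the deck and counting. The sketch below applies the addition rule to "ace or face card" (mutually exclusive events) and, with the same counting helper, checks the club-or-number-card figure that these notes compute for inclusive events. The deck representation and the helper name `prob` are assumptions for illustration.

```python
from fractions import Fraction

ranks = ["A"] + [str(n) for n in range(2, 11)] + ["J", "Q", "K"]
suits = ["clubs", "diamonds", "hearts", "spades"]
deck = [(rank, suit) for rank in ranks for suit in suits]  # 52 cards

def prob(event):
    """Fraction of the deck for which the event holds (one random draw)."""
    return Fraction(sum(1 for card in deck if event(card)), len(deck))

def is_ace(card):
    return card[0] == "A"

def is_face(card):
    return card[0] in ("J", "Q", "K")

def is_club(card):
    return card[1] == "clubs"

def is_number(card):
    return card[0].isdigit()  # ranks 2 through 10

# Mutually exclusive: no card is both an ace and a face card, so just add.
p_ace_or_face = prob(is_ace) + prob(is_face)

# Inclusive: some cards are both clubs and number cards, so subtract the overlap.
p_club_or_number = (prob(is_club) + prob(is_number)
                    - prob(lambda c: is_club(c) and is_number(c)))

print(p_ace_or_face, p_club_or_number)  # 4/13 10/13
```

Counting the union directly, `prob(lambda c: is_club(c) or is_number(c))`, gives the same 10/13, which is a useful cross-check on the subtraction step.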
PROBABILITY OF INCLUSIVE EVENTS
● Two or more events are said to be inclusive when one or the other or both can occur. In other words, two events are said to be inclusive if they have a common outcome.
P(A or B) = P(A) + P(B) − P(A and B)
P(A ∪ B) = P(A) + P(B) − P(A ∩ B)
P(CLUB ∪ NUMBER) = P(CLUB) + P(NUMBER) − P(CLUB & NUMBER)
P(CLUB ∪ NUMBER) = 13/52 + 36/52 − 9/52
P(CLUB ∪ NUMBER) = 40/52
P(CLUB ∪ NUMBER) = 10/13
P = nCr⋅p^r⋅q^(n-r)
Where:
• n = number of trials
• N = r1 + r2 + r3 + … + rk
• r = number of selections or occurrence per event
• p = probability of each event
P(R) = 6/10
Probability of B|R (2nd draw):
P(B|R) = 4/10
P(R&B) = P(R)⋅P(B|R)
P(R&B) = 6/10 ⋅ 4/10
P(R&B) = 24/100
P(R&B) = 6/25
P(A&B) = P(A)P(B|A)
P(A ∩ B) = P(A)P(B|A)
Multiplication Rule of Probability (Dependent Events)
● The probability that a set of dependent events will happen is the product of their separate probabilities, where each probability is conditional on the events before it.
P(M|E) = (0.60⋅0.05) / ((0.60⋅0.05) + (0.40⋅0.02))
P(M|E) = 15/19 = 0.7895
EXAMPLE
1. Four persons are chosen at random from a group of 3 men, 2 women and 4 children. Find the probability that of the 4 persons selected, exactly 2 are children.
N = population size
N = 3 + 2 + 4 = 9
p = number of successes in the population (N)
p = 4
q = number of failures in the population (N)
q = 5
n = sample size
n = 4
r = number of successes in the sample (n)
r = 2
P = ((pCr)⋅(qC(n−r))) / NCn
P = ((4C2)⋅(5C(4−2))) / 9C4
P = 10/21
That is, the probability that a continuous random variable X takes a value in the interval [a, b] is given by an integral of the probability density function f(x):
P(a < X < b) = ∫_a^b f(x) dx
For the density function
f(x) = x^2/3 for −1 < x < 2, and f(x) = 0 elsewhere,
evaluate P(0 < x ≤ 1).
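Both results in this part can be reproduced with exact arithmetic. The sketch below evaluates the selection formula P = (pCr ⋅ qC(n−r)) / NCn for the four-persons example, then integrates the density f(x) = x^2/3 on −1 < x < 2 using its antiderivative x^3/9. The function names are assumptions for illustration, and the antiderivative is worked out by hand rather than taken from the notes.

```python
from fractions import Fraction
from math import comb

def selection_probability(p, q, n, r):
    """P(exactly r successes) when n units are drawn without replacement
    from a population of p successes and q failures."""
    return Fraction(comb(p, r) * comb(q, n - r), comb(p + q, n))

# 4 children (successes), 5 others (failures), 4 persons drawn, 2 children wanted.
p_two_children = selection_probability(p=4, q=5, n=4, r=2)

def cdf_piece(x):
    """Antiderivative of f(x) = x^2 / 3, valid on -1 < x < 2."""
    return Fraction(x) ** 3 / 9

total_area = cdf_piece(2) - cdf_piece(-1)   # a valid density integrates to 1
p_0_to_1 = cdf_piece(1) - cdf_piece(0)      # P(0 < X <= 1)

print(p_two_children, total_area, p_0_to_1)  # 10/21 1 1/9
```

Checking that the total area equals 1 before computing the interval probability is a cheap way to catch a mistyped density.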
EMPIRICAL RULE
The empirical rule is better known as the 68% - 95% - 99.7% rule.
Empirical Formula:
• About 68% of all values will lie within μ ± σ.
• About 95% of all values will lie within μ ± 2σ.
• About 99.7% of all values will lie within μ ± 3σ.
CONVERTING NORMAL DISTRIBUTION TO STANDARD NORMAL DISTRIBUTION
STANDARDIZING or STANDARDIZATION of a random variable is to convert the random variable X to a standard normal variable or z-score.
z = (X − μ) / σ → random variable (X) to z-score (z)
X = zσ + μ → z-score (z) to random variable (X)
Where:
z = standard normal variable or z-score
X = random variable X
μ = mean
σ = standard deviation
Example:
The mean number of hours a Filipino worker spends on the computer is 3.1 hours
per workday. Assuming that the standard deviation is 0.5 hour and that the hours are normally distributed, how long does a worker spend on the computer if his z-score is 1.2?
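The conversion X = zσ + μ answers this example directly, and the standard normal curve can be used to confirm the 68% - 95% - 99.7% figures. The sketch below is one way to check both; the function names are assumptions, and `NormalDist` is the normal distribution class from Python's standard `statistics` module.

```python
from statistics import NormalDist

def to_z(x, mu, sigma):
    """z = (X - mu) / sigma"""
    return (x - mu) / sigma

def from_z(z, mu, sigma):
    """X = z * sigma + mu"""
    return z * sigma + mu

# Worker example: mu = 3.1 hours, sigma = 0.5 hour, z = 1.2
hours = from_z(1.2, mu=3.1, sigma=0.5)
print(round(hours, 2))  # 3.7

# Empirical rule: area of the standard normal curve within k standard deviations.
std_normal = NormalDist()  # mean 0, standard deviation 1
for k in (1, 2, 3):
    within = std_normal.cdf(k) - std_normal.cdf(-k)
    print(k, round(within, 4))  # 0.6827, 0.9545, 0.9973
```

Converting the answer back with `to_z(hours, 3.1, 0.5)` should return the original z-score of 1.2, which is a simple way to check the arithmetic.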
Examples:
EXPONENTIAL DISTRIBUTION
Exponential Distribution describes the
time between events that follow a
Poisson Distribution. It is often
concerned with the amount of time until
some specific event occurs.
For the density function of exponential distribution:
f(x) = λ⋅e^(−λx) for x ≥ 0, and f(x) = 0 elsewhere
Where λ = the average number of events per unit of time
Example:
Time between two flashes of lightning
during a storm
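For the lightning example, the standard exponential density f(x) = λ⋅e^(−λx) and its cumulative form 1 − e^(−λx) can be sketched as follows. The rate of 2 flashes per minute is a hypothetical value chosen for illustration, and the function names are assumptions.

```python
from math import exp

def exponential_pdf(x, lam):
    """Density f(x) = lam * e^(-lam * x) for x >= 0, else 0."""
    return lam * exp(-lam * x) if x >= 0 else 0.0

def exponential_cdf(x, lam):
    """P(X <= x) = 1 - e^(-lam * x) for x >= 0, else 0."""
    return 1 - exp(-lam * x) if x >= 0 else 0.0

lam = 2.0  # hypothetical rate: 2 flashes per minute on average

p_within_1_min = exponential_cdf(1.0, lam)   # next flash within 1 minute
p_in_excess_1_min = 1 - p_within_1_min       # wait in excess of 1 minute
mean_wait = 1 / lam                          # average time between flashes

print(round(p_within_1_min, 4), round(p_in_excess_1_min, 4), mean_wait)
```

"In excess of" questions use the complement, which for this distribution simplifies to e^(−λx).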
In excess