
Engineering Data Analysis Reviewer

The document outlines key concepts in engineering data analysis, including types of data (univariate, bivariate, multivariate), methods of summarizing data (frequency distribution, contingency tables, scatter plots), and statistical measures (mean, median, mode, variance). It also covers probability concepts, including random variables, probability distributions, and sampling methods. Additionally, it discusses the Central Limit Theorem and various probability rules applicable to data analysis.

Uploaded by

Kesh Ya
Copyright
© All Rights Reserved

Chapter 1
• Data: facts and figures from which conclusions can be drawn
• Data set: the data collected for a particular study
• Elements: may be people, objects, events, or other entities
• Variable: any characteristic of an element; its value may change from one element to another in the population
• Univariate: the data set consists of observations on a single variable
• Bivariate: data arise when observations are made on each of two variables
• Multivariate: data arise when observations are made on more than one variable
• Measurement: a way to assign a value of a variable to an element
• Quantitative: the possible measurements of the values of a variable are numbers that represent quantities
• Qualitative: the possible measurements fall into several categories
• Cross-sectional data: data collected at the same or approximately the same point in time
• Time series data: data collected over different time periods

Data Sources
• Existing sources: data already gathered by public or private sources, such as
  o Internet
  o Library
  o US Government
  o Data collection agency
• Experimental and observational studies: data we collect ourselves for a specific purpose

Response variable: the variable of interest
Independent variable: a variable related to the variable of interest that will also be measured
Experimental study: we are able to manipulate the independent variables
Observational study: we are unable to control the independent variables

Population and Samples
• Population: the set of all elements about which we wish to draw conclusions
• Census: an examination of all the population measurements
• Sample: a subset of the elements of a population
• Descriptive Statistics: the science of describing the important aspects of a set of measurements
• Statistical Inference: the science of using a sample of measurements to make generalizations about the important aspects of a population of measurements

Scales of Measurement
• Nominative: tags or names that label categories
• Ordinal: ordering and ranking data
• Interval: known equal intervals of the same distance
• Ratio: scales that have measurable intervals and a true zero point

Chapter 2

Graphically Summarizing Qualitative Data
• Frequency distribution: a table that summarizes the number (or frequency) of items in each of several non-overlapping classes
• Relative frequency: summarizes the proportion of items in each class

Formula: $\text{relative frequency} = \dfrac{\text{class frequency}}{n}$, where $n$ is the total number of items

Steps in Constructing a Frequency Distribution
1. Find the number of classes
2. Find the class length
3. Form non-overlapping classes of equal width
4. Tally and count the number of measurements in each class
5. Graph the histogram

Contingency Tables:
Classify data on two dimensions
• Rows classify according to one dimension
• Columns classify according to a second dimension
Require three variables
• The row variable
• The column variable
• The variable counted in the cells

Scatter Plots:
• Used to study relationships between two variables
• Place one variable on the x-axis
• Place a second variable on the y-axis
• Place a dot at each pair of coordinates

Types of Relationships
• Linear: a straight-line relationship between the two variables
• Positive: when one variable goes up, the other variable goes up
• Negative: when one variable goes up, the other variable goes down
• No Linear Relationship: there is no coordinated linear movement between the two variables

Chapter 3

Measures of Central Tendency
• Mean, μ: the average or expected value
• Median, Md: the value of the middle point of the ordered measurements
• Mode, Mo: the most frequent value

Measures of Variation
• Range: the largest minus the smallest measurement
• Variance: the average of the squared deviations of all the population measurements from the population mean
• Standard Deviation: the square root of the variance
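
As a rough illustration of the Chapter 3 measures, the following minimal sketch computes them with Python's standard `statistics` module. The data values are hypothetical and chosen only for illustration; `pvariance` and `pstdev` are used because the reviewer defines variance over all population measurements.

```python
import statistics

# Hypothetical measurements, made up for this illustration only.
data = [2, 4, 4, 4, 5, 5, 7, 9]

mean = statistics.mean(data)            # average value
median = statistics.median(data)        # middle point of the ordered data
mode = statistics.mode(data)            # most frequent value
data_range = max(data) - min(data)      # largest minus smallest
variance = statistics.pvariance(data)   # population variance
std_dev = statistics.pstdev(data)       # square root of the variance

print("mean:", mean)
print("median:", median)
print("mode:", mode)
print("range:", data_range)
print("variance:", variance)
print("standard deviation:", std_dev)
```

For a sample rather than a whole population, `statistics.variance` and `statistics.stdev` divide by n - 1 instead of n.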
Empirical Rule
For a normally distributed population:
• 68.26% of the measurements lie within one standard deviation of the mean
• 95.44% of the measurements lie within two standard deviations of the mean
• 99.73% of the measurements lie within three standard deviations of the mean

Z-scores: the number of standard deviations that x is from the mean
  o A positive z-score is for x above the mean
  o A negative z-score is for x below the mean
  o The mean has a z-score of zero

Formula: $z = \dfrac{x - \mu}{\sigma}$

Percentiles, Quartiles
• The first quartile Q1 is the 25th percentile
• The second quartile (or median) is the 50th percentile
• The third quartile Q3 is the 75th percentile
• The interquartile range IQR is Q3 - Q1

Formula: $\mathrm{IQR} = Q_3 - Q_1$

Covariance
• A positive covariance indicates a positive linear relationship between x and y
  o As x increases, y increases
• A negative covariance indicates a negative linear relationship between x and y
  o As x increases, y decreases

Formula (sample covariance): $s_{xy} = \dfrac{\sum_{i=1}^{n}(x_i - \bar{x})(y_i - \bar{y})}{n - 1}$

Correlation Coefficient: a measure of the strength of the linear relationship that does not depend on the magnitude of the data

Formula: $r = \dfrac{s_{xy}}{s_x s_y}$, where $s_x$ and $s_y$ are the sample standard deviations of x and y

Weighted Means: calculated by multiplying each value by the weight or probability associated with that event or outcome, summing the products, and dividing by the sum of the weights.

Formula: $\bar{x}_w = \dfrac{\sum w_i x_i}{\sum w_i}$

Example:

                Weight (%)   Grade (%)
Q1                  10           70
Q2                  10           65
Q3                  30           70
Q4                  50           85
Weighted Mean                    77%

$(0.10)(70) + (0.10)(65) + (0.30)(70) + (0.50)(85) = 77$

Geometric Mean:
• For rates of return of an investment, use the geometric mean
• Suppose the rates of return are R1, R2, …, Rn for periods 1, 2, …, n
• The mean of all these returns is calculated as the geometric mean:

Formula: $R_g = \sqrt[n]{(1 + R_1)(1 + R_2)\cdots(1 + R_n)} - 1$

Chapter 4
• Experiment: any process of observation with an uncertain outcome
• Experimental outcomes: the possible outcomes for an experiment, also known as sample space outcomes
• Probability: a measure of the chance that an experimental outcome will occur when an experiment is carried out
  o If E is a sample space outcome, then P(E) denotes the probability that E will occur

Conditions:
  o 0 ≤ P(E) ≤ 1
  o If E can never occur, then P(E) = 0
  o If E is certain to occur, then P(E) = 1
  o The probabilities of all the sample space outcomes must sum to 1

Sample Space
• Sample Space: the set of all possible experimental outcomes
• Sample Space Outcomes: the experimental outcomes in the sample space
• Event: a set of sample space outcomes
• Probability: the probability of an event is the sum of the probabilities of the sample space outcomes that correspond to the event

Probability Rules
• Addition Rule
  o If A and B are mutually exclusive, then the probability that A or B will occur is P(A∪B) = P(A) + P(B)
  o If A and B are not mutually exclusive: P(A∪B) = P(A) + P(B) - P(A∩B), where P(A∩B) is the joint probability of A and B both occurring together
• Conditional Probability
  o The probability of an event A, given that the event B has occurred, is called the conditional probability of A given B, denoted P(A|B)
  o Further, P(A|B) = P(A∩B) / P(B), with P(B) ≠ 0
  o Likewise, P(B|A) = P(A∩B) / P(A), with P(A) ≠ 0
• Multiplication Rule
  o Given any two events A and B:
    P(A∩B) = P(A)P(B|A)
           = P(B)P(A|B)

Chapter 5
• Random variable: a variable that assumes numerical values determined by the outcome of an experiment
  o Discrete random variable: possible values can be counted or listed
    Examples
    • The number of defective units in a batch of 20
    • A rating on a scale of 1 to 5
  o Continuous random variable: may assume any numerical value in one or more intervals
    Examples
    • The waiting time for a credit card authorization
    • The interest rate charged on a business loan

Discrete Probability Distribution
• Probability distribution: a table, graph, or formula that gives the probability associated with each possible value that the variable can assume
  o Denote the values of the random variable by x and the value's associated probability by p(x)

• Binomial Experiments
1. The experiment consists of n identical trials
2. Each trial results in either "success" or "failure"
3. The probability of success, p, is constant from trial to trial
4. Trials are independent
  o If x is the total number of successes in n trials of a binomial experiment, then x is a binomial random variable

• Poisson Distribution: gives the probability of a number of independent events occurring in a fixed interval of time when events occur at a constant mean rate.
Formula: $p(x) = \dfrac{e^{-\mu}\mu^{x}}{x!}$, where $\mu$ is the mean number of occurrences in the interval

Example:
In a café, customers arrive at a mean rate of 2 per minute. Find the probability of 5 arrivals in 1 minute.
With $\mu = 2$ and $x = 5$: $p(5) = \dfrac{e^{-2}\,2^{5}}{5!} \approx 0.0361$

• Hypergeometric Distribution: gives the probability of a certain number of successes in a series of draws made without replacement from a fixed population.
Formula: $p(x) = \dfrac{\binom{r}{x}\binom{N-r}{n-x}}{\binom{N}{n}}$, where N is the population size, r is the number of successes in the population, n is the number of draws, and x is the number of successes drawn

Chapter 6

Continuous Probability Distribution
• A continuous random variable may assume any numerical value in one or more intervals
  o Car mileage
  o Temperature

Properties of Continuous Probability Distribution
• Properties of f(x): f(x) is a continuous function such that
  o f(x) ≥ 0 for all x
  o The total area under the curve of f(x) is equal to 1
• Essential point: an area under a continuous probability distribution is a probability

Uniform Distribution: all values between a minimum value a and a maximum value b are equally likely, so the density is constant on that interval.
Formula: $f(x) = \dfrac{1}{b - a}$ for $a \le x \le b$, and $f(x) = 0$ otherwise

Normal Probability Distribution
• In a normal probability plot, a straight line indicates a normal distribution

Formula: $f(x) = \dfrac{1}{\sigma\sqrt{2\pi}}\, e^{-\frac{(x-\mu)^2}{2\sigma^2}}$

Finding Normal Probabilities
1. Formulate the problem in terms of x values
2. Calculate the corresponding z values, and restate the problem in terms of these z values
3. Find the required areas under the standard normal curve by using the table

Exponential Distribution
Formula: $f(x) = \lambda e^{-\lambda x}$ for $x \ge 0$, and $f(x) = 0$ otherwise

Chapter 7
• Random Sample: every set of n elements in the population has the same chance of being selected
• Probability sampling: sampling where we know the chance that each element in the population will be included in the sample
  o Allows making statistical inferences
• Convenience sampling: we select elements because they are easy or convenient to sample
• Voluntary response sampling: participants self-select
• Judgment sampling: a knowledgeable person selects population elements
• Sampling distribution of the sample mean: the probability distribution of the population of the sample means obtainable from all possible samples of size n from a population of size N
• Sampling distribution of the sample proportion: the distribution of all possible sample proportions
  o Formula: $\mu_{\hat{p}} = p$ and $\sigma_{\hat{p}} = \sqrt{\dfrac{p(1 - p)}{n}}$

Central Limit Theorem: even if the population is non-normal, the sampling distribution of the sample mean is approximately normal when the sample size n is large
  o The larger n, the better the approximation
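
The Central Limit Theorem can be checked numerically. The sketch below is only an illustration under stated assumptions: it uses an exponential population with mean 1 and standard deviation 1 (chosen here because it is clearly non-normal), a fixed seed, and a hypothetical helper named `sample_means`. It compares sample means for n = 2 and n = 50.

```python
import random
import statistics

def sample_means(n, num_samples, rng):
    """Return num_samples sample means, each computed from n exponential draws."""
    means = []
    for _ in range(num_samples):
        # expovariate(1.0) draws from an exponential population (mean 1, sd 1).
        sample = [rng.expovariate(1.0) for _ in range(n)]
        means.append(statistics.mean(sample))
    return means

rng = random.Random(42)                 # fixed seed so the run is repeatable
small = sample_means(2, 5000, rng)      # sample size n = 2
large = sample_means(50, 5000, rng)     # sample size n = 50

for label, n, means in (("n = 2", 2, small), ("n = 50", 50, large)):
    print(label)
    print("  mean of sample means:", round(statistics.mean(means), 3))
    print("  sd of sample means:  ", round(statistics.stdev(means), 3))
    print("  sigma / sqrt(n):     ", round(1 / n ** 0.5, 3))
```

Under these assumptions the sample means should center near the population mean of 1, with a spread close to sigma / sqrt(n) that shrinks as n grows. A histogram of `large` would also look far more bell-shaped than one of `small`.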
