Engineering Data Analysis Reviewer
intervals
Chapter 1

Data: facts and figures from which conclusions can be drawn
Data set: the data that are collected for a particular study
Elements: may be people, objects, events, or other entities
Variable: any characteristic of an element; it may change from one object to another in the population
Univariate: a data set consisting of observations on a single variable
Bivariate: data that arise when observations are made on each of two variables
Multivariate: data that arise when observations are made on more than one variable
Measurement: a way to assign a value of a variable to the element
Quantitative: the possible measurements of the values of a variable are numbers that represent quantities
Qualitative: the possible measurements fall into several categories
Cross-sectional data: data collected at the same or approximately the same point in time
Time series data: data collected over different time periods

Data Sources

Existing sources: data already gathered by public or private sources like
o Internet
o Library
o US Government
o Data collection agency

Experimental and observational studies: data we collect ourselves for a specific purpose

Response variable: the variable of interest
Independent variable: a variable related to the variable of interest; it is also measured
Experimental study: able to manipulate the independent variables
Observational study: unable to control the independent variables

Population and Samples

Population: a set of all elements about which we wish to draw conclusions
Census: an examination of all the population measurements
Sample: a subset of the elements of a population

Descriptive Statistics: the science of describing the important aspects of a set of measurements
Statistical Inference: the science of using a sample of measurements to make generalizations about the important aspects of a population of measurements

Scales of Measurement
Nominative: tags or names that label categories, with no meaningful order
Ordinal: categories that can be ordered or ranked
Interval: numeric values separated by known, equal intervals

Chapter 2

Graphically Summarizing Qualitative Data

Frequency distribution: a table that summarizes the number (or frequency) of items in each of several non-overlapping classes
Relative frequency: summarizes the proportion of items in each class

Formula: $\text{relative frequency} = \dfrac{\text{class frequency}}{n}$, where n is the total number of items

Steps in Constructing a Frequency Distribution

1. Find the number of classes
2. Find the class length
3. Form non-overlapping classes of equal width
4. Tally and count the number of measurements in each class
5. Graph the histogram

Contingency Tables:

Classifies data on two dimensions
Rows classify according to one dimension
Columns classify according to a second dimension
Requires three variables
o The row variable
o The column variable
o The variable counted in the cells

Scatter Plots:
Used to study relationships between two variables
Place one variable on the x-axis
Place a second variable on the y-axis
Place a dot at each pair of coordinates

Types of Relationships
Linear: a straight-line relationship between the two variables
Positive: when one variable goes up, the other variable goes up
Negative: when one variable goes up, the other variable goes down
No Linear Relationship: there is no coordinated linear movement between the two variables

Chapter 3

Measures of Central Tendency
Mean: the average or expected value
Median, Md: the value of the middle point of the ordered measurements
Mode, Mo: the most frequent value

Measures of Variation
Range: the largest minus the smallest measurement
Variance: the average of the squared deviations of all the population measurements from the population mean
Standard Deviation: the square root of the variance
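The measures of central tendency and variation defined above can be checked on a small data set. The sketch below is only an illustration: the function name `summarize` and the six sample values are hypothetical, and the variance follows the population definition given here (dividing by the number of measurements, not by n - 1).

```python
import math
import statistics


def summarize(measurements):
    """Return the Chapter 3 summary measures for a population of measurements."""
    ordered = sorted(measurements)
    n = len(ordered)
    mean = sum(ordered) / n
    # Population variance: average squared deviation from the population mean.
    variance = sum((x - mean) ** 2 for x in ordered) / n
    return {
        "mean": mean,
        "median": statistics.median(ordered),  # middle point of the ordered data
        "mode": statistics.mode(ordered),      # most frequent value
        "range": ordered[-1] - ordered[0],     # largest minus smallest
        "variance": variance,
        "std_dev": math.sqrt(variance),        # square root of the variance
    }


# Hypothetical measurements, chosen so the arithmetic is easy to verify by hand.
sample_data = [2, 4, 4, 5, 7, 8]
summary = summarize(sample_data)
for name, value in summary.items():
    print(f"{name}: {value}")
```

For these values the squared deviations from the mean of 5 are 9, 1, 1, 0, 4 and 9, which sum to 24. That gives a population variance of 24 / 6 = 4 and a standard deviation of 2.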
Empirical Rule (for a normally distributed population)
68.26% of the measurements lie within one standard deviation of the mean
95.44% of the measurements lie within two standard deviations of the mean
99.73% of the measurements lie within three standard deviations of the mean

Geometric Mean:
For rates of return of an investment, use the geometric mean
Suppose the rates of return are R1, R2, …, Rn for periods 1, 2, …, n
The mean of all these returns is calculated as the geometric mean:

Formula: $R_g = \sqrt[n]{(1 + R_1)(1 + R_2) \cdots (1 + R_n)} - 1$

Chapter 5
Random variable: a variable that assumes numerical values determined by the outcome of an experiment
o Discrete random variable: possible values can be counted or listed
Examples
The number of defective units in a batch of 20
A rating on a scale of 1 to 5
o Continuous random variable: may assume any numerical value in one or more intervals
Examples
The waiting time for a credit card authorization
The interest rate charged on a business loan

Binomial Experiments
1. Experiment consists of n identical trials
2. Each trial results in either “success” or “failure”
3. Probability of success, p, is constant from trial to trial
4. Trials are independent
o If x is the total number of successes in n trials of a binomial experiment, then x is a binomial random variable

Properties of f(x): f(x) is a continuous function such that
o f(x) ≥ 0 for all x
o The total area under the curve of f(x) is equal to 1
Essential point: An area under a continuous probability distribution is a probability

Uniform Distribution: all intervals of equal length between a minimum and a maximum value are equally likely.
Formula: $f(x) = \dfrac{1}{d - c}$ for $c \le x \le d$, and $f(x) = 0$ otherwise, where c is the minimum value and d is the maximum value

Finding Normal Probabilities
1. Formulate the problem in terms of x values
2. Calculate the corresponding z values, and restate the problem in terms of these z values
3. Find the required areas under the standard normal curve by using the table
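The three steps for finding normal probabilities can be sketched in code. This is a minimal illustration under stated assumptions: it replaces the table lookup with the error function from Python's standard library, uses the usual standardization z = (x - mean) / standard deviation, and the function names and the example mean of 100 and standard deviation of 15 are hypothetical.

```python
import math


def standard_normal_cdf(z):
    """Area under the standard normal curve to the left of z.

    Stands in for the printed table; computed from the error function.
    """
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def normal_probability(lower_x, upper_x, mean, std_dev):
    """P(lower_x <= x <= upper_x) for a normal random variable x."""
    # Step 2: convert the x values into z values.
    z_lower = (lower_x - mean) / std_dev
    z_upper = (upper_x - mean) / std_dev
    # Step 3: the required area is the difference of two left-tail areas.
    return standard_normal_cdf(z_upper) - standard_normal_cdf(z_lower)


# Step 1: state the problem in terms of x (hypothetical values):
# find P(85 <= x <= 115) when the mean is 100 and the standard deviation is 15.
probability = normal_probability(85.0, 115.0, mean=100.0, std_dev=15.0)
print(round(probability, 4))  # 0.6827
```

The interval here spans one standard deviation on either side of the mean, so the result agrees with the first figure of the Empirical Rule up to rounding.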