Introduction To Engineering Data Analysis

12/6/2022

Engineering Data Analysis
ENENDA30
Engr. Jeanfel P. Tumbaga

Topic Outline
• Definition of Statistics
• Descriptive and Inferential Statistics
• Variables and Types of Data
• Data Collection and Sampling Techniques
• Experimental Design

No part of this material may be reproduced, distributed, or transmitted in any form or by any means, including photocopying, recording, or other electronic or mechanical methods, without the prior written permission of the owner, except for personal academic use and certain other noncommercial uses permitted by copyright law.

Statistics

The science of conducting studies to collect, organize, summarize, analyze and draw conclusions from data.

Areas of Statistics
1. Descriptive Statistics
consists of the collection, organization, summarization and presentation of data.
2. Inferential Statistics
consists of generalizing from samples to populations, performing estimations and hypothesis tests, determining relationships among variables, and making predictions.

Common Statistical Terms

a) Variable is a characteristic or attribute that can assume different values.
b) Data are the values (measurements or observations) that the variables can assume.
c) Random Variables are variables whose values are determined by chance.
d) Data Set is formed by a collection of data values.
e) Datum (Data Value) is the term used for each value in a data set.
f) Population consists of all subjects (human or otherwise) that are being studied.
g) Sample is a group of subjects selected from a population.
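
The terms above can be illustrated with a short sketch. Everything in it is an assumption made for illustration: the population is simulated (hypothetical resistor readings), and the names `population` and `sample` are not from the lesson. It shows a sample drawn from a population, a descriptive summary of that sample, and the sample mean used as an estimate of the population mean, which is the basic idea behind inferential statistics.

```python
import random
import statistics

# Hypothetical population: resistance readings (ohms) for 200 resistors.
# The values are simulated for illustration, not real measurements.
rng = random.Random(42)
population = [round(rng.gauss(100.0, 2.0), 2) for _ in range(200)]

# A sample is a group of subjects selected from the population.
sample = rng.sample(population, 25)

# Descriptive statistics: summarize the data that were actually collected.
sample_mean = statistics.mean(sample)
sample_stdev = statistics.stdev(sample)

# Inferential statistics: use the sample to say something about the population.
# Here the sample mean serves as a point estimate of the population mean.
population_mean = statistics.mean(population)

print(f"sample mean = {sample_mean:.2f}, sample stdev = {sample_stdev:.2f}")
print(f"population mean = {population_mean:.2f}")
```

In practice the population mean is usually unknown; it is computed here only because the population is simulated.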

Sample Problem 1

Determine whether descriptive or inferential statistics were used.
a. The average price of a 30-second ad for the Academy Awards show in a recent year was 1.90 million dollars.
b. The Department of Economic and Social Affairs predicts that the population of Mexico City, Mexico, in 2030 will be 238,647,000 people.
c. A medical report stated that taking statins is proven to lower heart attacks, but some people are at a slightly higher risk of developing diabetes when taking statins.
d. A survey of 2234 people conducted by the Harris Poll found that 55% of the respondents said that excessive complaining by adults was the most annoying social media habit.

Sample Problem 2

A study conducted at Manatee Community College revealed that students who attended class 95 to 100% of the time usually received an A in the class. Students who attended class 80 to 90% of the time usually received a B or C in the class. Students who attended class less than 80% of the time usually received a D or an F or eventually withdrew from the class.
Based on this information, attendance and grades are related. The more you attend class, the more likely it is you will receive a higher grade. If you improve your attendance, your grades will probably improve. Many factors affect your grade in a course. One factor that you have considerable control over is attendance. You can increase your opportunities for learning by attending class more often.


Questions

1. What are the variables under study?
2. What are the data in the study?
3. Are descriptive, inferential, or both types of statistics used?
4. What is the population under study?
5. Was a sample collected? If so, from where?
6. From the information given, comment on the relationship between the variables.

Classification of Variables

Variables
   Qualitative Variables
   Quantitative Variables
      Discrete Variables
      Continuous Variables

a) Qualitative Variables are variables that can be placed into distinct categories, according to some characteristic or attribute.
b) Quantitative Variables are numerical and can be ordered or ranked.
   1) Discrete Variables assume values that can be counted.
   2) Continuous Variables can assume an infinite number of values between any two specific values. They are obtained by measuring and often include fractions and decimals.

Measurement Level of Variables


a) Nominal Level classifies data into mutually exclusive, non-overlapping
categories in which no order or ranking can be imposed on the data.
Examples: Subjects, Gender, Religion, Marital Status
b) Ordinal Level classifies data into categories that can be ranked;
however, precise differences between the ranks do not exist.
Examples: Student Evaluations, Ranking of Contestants
c) Interval Level ranks data, and precise differences between units of
measure do exist; however, there is no meaningful zero.
Examples: IQ Level, Temperature
d) Ratio Level possesses all the characteristics of interval
measurement, and there exists a true zero.
Examples: Height, Weight, Area, Time
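
The difference between the interval and ratio levels can be checked with a little arithmetic. The sketch below uses made-up temperature and weight values (assumptions, not from the lesson). Celsius temperature has no true zero, so differences are meaningful but ratios are not; converting to the Kelvin scale, which does have a true zero, shows that 20 °C is not "twice as hot" as 10 °C.

```python
def celsius_to_kelvin(c):
    """Convert a Celsius temperature to kelvins (a scale with a true zero)."""
    return c + 273.15

# Interval level: Celsius temperatures (hypothetical readings).
t1_c, t2_c = 10.0, 20.0
diff_c = t2_c - t1_c        # differences are meaningful: 10 degrees apart
naive_ratio = t2_c / t1_c   # 2.0, but this ratio has no physical meaning

# The same two temperatures on the Kelvin scale give a very different ratio.
kelvin_ratio = celsius_to_kelvin(t2_c) / celsius_to_kelvin(t1_c)

# Ratio level: weight has a true zero, so ratios are meaningful.
w1_kg, w2_kg = 40.0, 80.0
weight_ratio = w2_kg / w1_kg  # 2.0: the second object is twice as heavy

print(f"Celsius ratio = {naive_ratio:.3f}, Kelvin ratio = {kelvin_ratio:.3f}")
print(f"Weight ratio = {weight_ratio:.1f}")
```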

Sample Problem 3

Classify each variable as a discrete or continuous variable.
a. The number of heads when flipping three coins
b. The duration of time it takes to get to school
c. The weights of the football players on the teams that play in the NFL this year
d. The student’s grade level

Sample Problem 4

What level of measurement would be used to measure each variable?
a. The ages of authors who wrote the hardback versions of the top 25 fiction books sold during a specific week.
b. The colors of baseball hats sold in a store for a specific year.
c. The highest temperature for each day of a specific month.
d. The ratings of bands that played in the homecoming parade at a college.


Sample Problem 5

The chart shows the number of job-related injuries for each of the transportation industries for 1998.

Industry         Number of Injuries
Railroad         4520
Intercity Bus    5100
Subway           6850
Trucking         7144
Airline          9950

1. What are the variables under study?
2. Categorize each variable as quantitative or qualitative.
3. Categorize each quantitative variable as discrete or continuous.
4. Identify the level of measurement for each variable.
5. The railroad is shown as the safest transportation industry. Does that mean railroads have fewer accidents than the other industries? Explain.
6. What factors other than safety influence a person’s choice of transportation?
7. From the information given, comment on the relationship between the variables.

Framework of Statistical Analysis

Planning → Data Collection → Organization and Presentation of Data → Data Analysis → Interpretation and Conclusion → Recommendation

PLANNING
• Starts with a concise and clear definition of the problem.
• Identify the target population and important factors that affect the problem or that may contribute to its solutions.
• Choose the appropriate sampling technique or observation procedure, which requires data-gathering instruments.
• Determine the sample size.
DATA COLLECTION
• The process of acquiring measurements, counts, or raw data.
ORGANIZATION AND PRESENTATION OF DATA
• Summarizing, organizing and presenting of data
• Presentation can use graphs, frequency distribution tables, charts, diagrams or numerical techniques.
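
As one illustration of organizing and presenting data, the injury counts from Sample Problem 5 can be arranged as a sorted table. This is only a sketch: the variable names and the added "share of total" column are assumptions for illustration and are not part of the original chart.

```python
# Job-related injuries by transportation industry, 1998 (from Sample Problem 5).
injuries_1998 = {
    "Railroad": 4520,
    "Intercity Bus": 5100,
    "Subway": 6850,
    "Trucking": 7144,
    "Airline": 9950,
}

# Organize: total count and rows sorted from most to fewest injuries.
total = sum(injuries_1998.values())
rows = sorted(injuries_1998.items(), key=lambda item: item[1], reverse=True)

# Present: a simple text table with each industry's share of the total.
print(f"{'Industry':<15}{'Injuries':>10}{'Share':>9}")
for industry, count in rows:
    print(f"{industry:<15}{count:>10}{count / total:>9.1%}")
print(f"{'Total':<15}{total:>10}")
```

The table only reorganizes the counts; it says nothing about how many people work in or use each industry.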


Framework of Statistical Analysis

DATA ANALYSIS
• Includes conversion of the data into relevant information that leads to the formulations of clear, summarized and comprehensible numerical description
• Involves the treatment of data with appropriate and carefully selected statistical tools
INTERPRETATION AND CONCLUSION
• This is where intelligent conclusions are drawn
• Statistical inferences or generalizations are made about the population of interest and predictions are also formulated
• Inference has three categories: estimation, hypothesis testing and association.
RECOMMENDATION
• Takes the form of suggested actions to be undertaken or recommended solutions and further research on the problem investigated upon

NATURE OF DATA

DATA
   Primary Data
   Secondary Data

a) Primary Data
• Gathered directly from an original source
• The analyst/researcher is involved in collecting the data relevant to his/her investigation and the information gathered is based on the direct or first-hand experience.
b) Secondary Data
• Gathered from published or unpublished materials that have been previously obtained by

Data Collection

In research, statisticians use data in many ways. Data can be used to describe situations or events. Data can be collected in a variety of ways. One of the most common methods is using surveys.
Surveys can be done by using a variety of methods such as:

1. Telephone Survey
Telephone surveys have an advantage over personal interview surveys in that they are less costly.
Major Drawbacks:
• Some people in the population will not have phones or will not answer when the calls are made;
• Many people now have unlisted numbers and cellphones; and
• Even the tone of voice of the interviewer might influence the response of the person who is being interviewed.

2. Mailed Questionnaire Survey
Mailed Questionnaire Surveys can cover a wider geographic area and are less costly to conduct. Also, respondents can remain anonymous if they desire.
Major Drawbacks:
• Low number of responses and inappropriate answers to questions.
• Some people may have difficulty reading and understanding the questions.

3. Personal Interview
Personal interview surveys have the advantage of obtaining in-depth responses to questions from the person being interviewed.
Major Drawbacks:
• More costly than the other two survey methods.
• Interviewer may be biased in his or her selection of respondents.

Sampling Techniques

Remember, samples cannot be selected in haphazard ways because the information obtained might be biased. To obtain unbiased samples, in which each subject in the population has an equally likely chance of being selected, statisticians use four basic methods of sampling:

1. Random Sampling
A random sample is a sample in which all members of the population have an equal chance of being selected.

2. Systematic Sampling
A systematic sample is a sample obtained by selecting every kth member of the population, where k is a counting number.

3. Stratified Sampling
A stratified sample is a sample obtained by dividing the population into subgroups or strata according to some characteristics relevant to the study. Then subjects are selected at random from each subgroup.
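
The first three sampling methods can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not a definitive procedure: the student roster is hypothetical, the function names are made up for this example, and the systematic sample uses a random starting position (a common convention that the definition above does not specify).

```python
import random

def simple_random_sample(population, n, rng):
    """Every member has an equal chance of being selected."""
    return rng.sample(population, n)

def systematic_sample(population, k, rng):
    """Select every kth member, starting from a random position below k
    (the random start is an assumption of this sketch)."""
    start = rng.randrange(k)
    return population[start::k]

def stratified_sample(population, stratum_of, n_per_stratum, rng):
    """Divide the population into strata, then sample at random from each."""
    strata = {}
    for subject in population:
        strata.setdefault(stratum_of(subject), []).append(subject)
    sample = []
    for members in strata.values():
        sample.extend(rng.sample(members, min(n_per_stratum, len(members))))
    return sample

# Hypothetical population: 100 students, 25 in each of four year levels.
rng = random.Random(0)
students = [{"id": i, "year": 1 + i % 4} for i in range(1, 101)]

srs = simple_random_sample(students, 10, rng)
every_10th = systematic_sample(students, 10, rng)
by_year = stratified_sample(students, lambda s: s["year"], 3, rng)

print(len(srs), len(every_10th), len(by_year))
```

With 100 students, k = 10 yields 10 subjects, and 3 subjects from each of the 4 year levels yields 12.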


Sampling Techniques

4. Cluster Sampling
A cluster sample is obtained by dividing the population into sections or clusters and then selecting one or more clusters at random and using all members in the cluster(s) as the members of the sample.

• Other sampling techniques such as sequential, double and multistage sampling are discussed later in this course.

Sampling Error

Since samples are not perfect representatives of the populations from which they are selected, there is always some error in the results. This error is called sampling error.

Sampling error is the difference between the results obtained from a sample and the results obtained from the population from which the sample was selected.

A nonsampling error occurs when the data are obtained erroneously, or the sample is biased, i.e. nonrepresentative.

• Caution and vigilance should be used when collecting data
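
Cluster sampling and sampling error can be sketched together. The data below are simulated and purely hypothetical (eight production batches of part lengths), and all names are assumptions for illustration. Two whole clusters are chosen at random, every member of the chosen clusters enters the sample, and the sampling error is computed as the sample mean minus the population mean.

```python
import random
import statistics

rng = random.Random(1)

# Hypothetical population: 8 production batches (clusters), each holding
# 25 simulated part lengths in millimetres.
clusters = {
    batch: [round(rng.gauss(50.0, 0.5), 3) for _ in range(25)]
    for batch in range(8)
}
population = [x for members in clusters.values() for x in members]

# Cluster sampling: pick whole clusters at random and use all their members.
chosen = rng.sample(sorted(clusters), 2)
cluster_sample = [x for batch in chosen for x in clusters[batch]]

# Sampling error: sample result minus the corresponding population result.
population_mean = statistics.mean(population)
sample_mean = statistics.mean(cluster_sample)
sampling_error = sample_mean - population_mean

print(f"chosen clusters: {chosen}")
print(f"sampling error of the mean: {sampling_error:+.4f} mm")
```

The sampling error can only be computed here because the whole simulated population is available; with real data it is normally unknown and has to be estimated.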


Sample Problem 6

State which sampling method was used.
a. Out of 10 hospitals in a municipality, a researcher selects one and collects records for a 24-hour period on the types of emergencies that were treated there.
b. A researcher divides a group of students according to gender, major field, and low, average, and high grade point average. Then she randomly selects six students from each group to answer questions in a survey.
c. The subscribers to a magazine are numbered. Then a sample of these people is selected using random numbers.
d. Every 10th bottle of Energized Soda is selected, and the amount of liquid in the bottle is measured. The purpose is to see if the machines that fill the bottles are working properly.

Experimental Design

There are several ways to classify statistical studies. This section explains two types of studies.

1. Observational Studies
In an observational study, the researcher merely observes what is happening or what has happened in the past and tries to draw conclusions based on these observations.

There are three main types of observational studies:
• When all data are collected at one time, the study is called a cross-sectional study.
• When the data are collected using records obtained from the past, the study is called a retrospective study.
• If the data are collected over a period of time, say, past and present, the study is called a longitudinal study.


Experimental Design

2. Experimental Studies
In an experimental study, the researcher manipulates one of the variables and tries to determine how the manipulation influences other variables.

The independent variable in an experimental study is the one that is being manipulated by the researcher. The independent variable is also called the explanatory variable. The resultant variable is called the dependent variable or the outcome variable.

A confounding variable is one that influences the dependent or outcome variable but was not separated from the independent variable.

The purpose of a statistical study is to gain and process information obtained from the study in order to answer specific questions about the subject being investigated.
Statistical researchers use a specific procedure:
1. Formulate the purpose of the study.
2. Identify the variables for the study.
3. Define the population.
4. Decide what sampling method you will use to collect the data.
5. Collect the data.
6. Summarize the data and perform any statistical calculations needed.
7. Interpret the results.


Uses and Misuses of Statistics

Statistical techniques can be used to describe data, compare two or more data sets, determine if a relationship exists between variables, test hypotheses and make estimates about population characteristics.

However, there is another aspect of statistics, and that is the misuse of statistical techniques to sell products that don’t work properly, to attempt to prove something true that is really not true, or to get our attention by using statistics to evoke fear, shock and outrage.

Suspect Samples
• The first thing to consider is the sample that was used in the research study.
• Is the sample size large enough?
• How was the sample selected?
• Is there a built-in bias?
• Was the sample a convenience sample or a volunteer sample?

Ambiguous Averages
• There are four commonly used measures that are loosely called “averages”. Their values can differ greatly for the same data, so the choice of average can change the impression the data give.
• Mean
• Median
• Mode
• Mid-range

Changing the Subject
• Another type of statistical distortion can occur when different values are used to represent the same data.
• Example: Obama might say, “During my term, expenditures increased by 3%.” Romney might say, “During my opponent’s term, expenditures increased by six million USD.”
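
The point about ambiguous averages can be seen with a small made-up data set. The salary figures below are hypothetical (in thousands) and chosen only so that one large value pulls the four "averages" apart; the variable names are likewise assumptions for illustration.

```python
import statistics

# Hypothetical salaries in thousands; one very large value skews the data.
salaries = [30, 30, 35, 40, 45, 50, 400]

mean = statistics.mean(salaries)                 # arithmetic average
median = statistics.median(salaries)             # middle value when sorted
mode = statistics.mode(salaries)                 # most frequent value
midrange = (min(salaries) + max(salaries)) / 2   # halfway between extremes

print(f"mean = {mean}, median = {median}, mode = {mode}, midrange = {midrange}")
```

Here the mean is 90, the median 40, the mode 30 and the mid-range 215, so a report that says only "the average salary" could refer to any of four quite different numbers.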

Uses and Misuses of Statistics

Detached Statistics
• A claim where no comparison is made.

Misleading Graph
• If graphs are drawn inappropriately, they misrepresent the data.
• Misrepresented data can lead a reader to false conclusions.

Implied Connections
• A claim that attempts to imply connections between two variables that may not actually exist.

Faulty Survey Questions
• A researcher needs to ensure that questions are properly written since the way questions are phrased can often influence the way people answer them.

Sample Problem 7

As the evidence on the adverse effects of cigarette smoke grew, people tried many different ways to quit smoking. Some people tried chewing tobacco or, as it was called, smokeless tobacco. A small amount of tobacco was placed between the cheek and gum. Certain chemicals from the tobacco were absorbed into the bloodstream and gave the sensation of smoking cigarettes. This prompted studies on the adverse effects of smokeless tobacco. One study in particular used 40 university students as subjects. Twenty were given smokeless tobacco to chew, and twenty were given a substance that looked and tasted like smokeless tobacco, but did not contain any of the harmful substances. The students were randomly assigned to one of the groups. The students’ blood pressure and heart rate were measured before they started chewing and 20 minutes after they had been chewing. A significant increase in heart rate occurred in the group that chewed the smokeless tobacco.
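
The random assignment described in the study can be sketched as follows. The student labels, the seed and the group names are hypothetical; the study itself does not say how the assignment was carried out, so shuffling the roster and splitting it in half is just one reasonable way to give every student the same chance of landing in either group.

```python
import random

rng = random.Random(7)

# Hypothetical labels for the 40 university students in the study.
students = [f"S{i:02d}" for i in range(1, 41)]

# Shuffle a copy of the roster, then split it into two groups of 20.
shuffled = students[:]
rng.shuffle(shuffled)
tobacco_group = shuffled[:20]     # given smokeless tobacco to chew
lookalike_group = shuffled[20:]   # given the look-alike substance

print("tobacco group:   ", sorted(tobacco_group))
print("look-alike group:", sorted(lookalike_group))
```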

Questions
1. What type of study was this (observational, quasi-experimental, or experimental)?
2. What are the independent and dependent variables?
3. Which was the treatment group?
4. Could the students’ blood pressures be affected by knowing that they are part of a study?
5. List some possible confounding variables.
6. Do you think this is a good way to study the effect of smokeless tobacco?

END OF LESSON 1

Walpole, R. E., Myers, R. H. (2007). Probability and Statistics for Engineers & Scientists (8th Edition). Pearson Education International.
Montgomery, D. C., Runger, G. C. (2003). Applied Statistics & Probability for Engineers (3rd Edition). John Wiley & Sons, Inc.