Module 1

This document provides an introduction to obtaining data for engineering data analysis. It discusses descriptive and inferential statistics. The main methods of collecting primary data are outlined as observation, interviews, questionnaires, and case studies. Observation methods include structured, unstructured, participant, and non-participant. Interview types include personal, structured, unstructured, focused, and group. The document also differentiates between qualitative and quantitative variables and examples of data.

Uploaded by

japsbatman

ES209

Engineering Data Analysis


Module No. 01
Topic OBTAINING DATA
Period Week no. 01 Date: September 05-10,2022

OBTAINING DATA

Introduction
Hello dear young engineers!
Welcome to this module on Engineering Data Analysis. This module will help you understand
how to obtain data: the methods of data collection, and how to plan and conduct surveys and
experiments. But first, what is statistics?
Statistics may be defined as the science that deals with the collection, organization, presentation,
analysis, and interpretation of data in order to be able to draw judgments or conclusions that help
in the decision-making process. The two parts of this definition correspond to the two main
divisions of Statistics: Descriptive Statistics and Inferential Statistics. Descriptive
Statistics, which is referred to in the first part of the definition, deals with the procedures that
organize, summarize, and describe quantitative data. It seeks merely to describe data. Inferential
Statistics, implied in the second part of the definition, deals with making a judgment or a
conclusion about a population based on the findings from a sample that is taken from the
population.
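
As a rough illustration of the two divisions, the sketch below uses Python's standard `statistics` module. The sample values are made up for illustration, and the interval relies on a normal approximation (z = 1.96), which is a simplifying assumption rather than a recommended procedure.

```python
import math
import statistics

# Hypothetical sample: measured lengths (mm) of 8 parts taken from a larger batch.
sample = [10.1, 9.8, 10.3, 10.0, 9.9, 10.2, 10.1, 9.6]

# Descriptive statistics: organize and summarize the data we actually have.
sample_mean = statistics.mean(sample)
sample_sd = statistics.stdev(sample)  # sample standard deviation (n - 1 divisor)

# Inferential statistics: use the sample to say something about the population.
# Rough 95% interval for the population mean, assuming a normal approximation;
# for a sample this small a t-based interval would be somewhat wider.
standard_error = sample_sd / math.sqrt(len(sample))
ci_low = sample_mean - 1.96 * standard_error
ci_high = sample_mean + 1.96 * standard_error

print(f"mean = {sample_mean:.2f}, sd = {sample_sd:.2f}")
print(f"approx. 95% interval for the population mean: ({ci_low:.2f}, {ci_high:.2f})")
```

The first two numbers only describe the eight parts that were measured; the interval is a statement about the whole batch they came from.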

Objective/Intended Learning Outcomes

At the end of this module, you are expected to:

• Demonstrate understanding of the different methods of obtaining data.


• Explain the procedures for planning and conducting surveys and experiments.

WHAT IS DATA ?

▪ “Data are values of qualitative or quantitative variables, belonging to a set of items.”

Set of items: sometimes called the population; the set of objects you are interested in.

Variables: a measurement or characteristic of an item.

Qualitative
• A categorical variable.
• A variable that is not numerical. It describes data that fits into categories.
• Examples:
    • Eye colors (values include: blue, green, brown, hazel).
    • States (values include: Florida, New Jersey, Washington).
    • Dog breeds (values include: Alaskan Malamute, German Shepherd, Siberian Husky, Shih Tzu).

Quantitative
• A measurement variable or numerical variable.
• Examples:
    • Counts, percents, or numbers.

Practice Activity 01

Determine whether the following is a qualitative or a quantitative variable.


Write QLV if qualitative variable and QTV if quantitative variable.

1. High school Grade Point Average (e.g. 4.0, 3.2, 2.1).


2. Number of pets owned (e.g. 1, 2, 4).
3. How many cousins you have (e.g. 0, 12, 22).
4. Your race (e.g. Asian, Latino, black).
5. Party affiliation (e.g. Republican, Democrat, Independent).

What have you observed about the first three statements? How about statements
four and five?

The general rule of thumb: if you can add it, it is quantitative; if you cannot
add it, it is qualitative.
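
The rule of thumb can be sketched in a few lines of Python. The function name `classify_variable` and the example values are hypothetical, chosen only for illustration. Note that the check is as rough as the rule itself: numeric codes such as ZIP codes would pass it even though adding them is meaningless.

```python
from numbers import Number

def classify_variable(values):
    """Return 'QTV' if every value is a number we could add up, else 'QLV'."""
    is_numeric = all(
        isinstance(v, Number) and not isinstance(v, bool) for v in values
    )
    return "QTV" if is_numeric else "QLV"

# Hypothetical examples (not taken from the practice activity).
examples = {
    "height in cm": [152.0, 160.5, 171.2],
    "eye color": ["blue", "green", "brown", "hazel"],
}
for name, values in examples.items():
    print(name, "->", classify_variable(values))
```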

WHAT DO DATA LOOK LIKE ?

Figure 1: Image
Figure 2: Music
Figure 3: News
Figure 4: Excel File

Practice Activity 02

List at least five examples of data:

___________________________
___________________________
___________________________
___________________________
___________________________

METHODS OF DATA COLLECTION

Data collection is the process of gathering and measuring information on variables of interest, in
an established systematic fashion that enables one to answer stated research questions, test
hypotheses, and evaluate outcomes.

TYPES OF DATA

PRIMARY DATA: data which are collected fresh and for the first time, and thus happen to
be original in character.

SECONDARY DATA: data which have been collected by someone else and which have already
been passed through the statistical process.

METHODS OF DATA COLLECTION |Primary Data

1. Observation

• is a method under which data from the field is collected with the help of observation
by the observer or by personally going to the field.

Advantages:
• Subjective bias eliminated
• Current information
• Independent of respondent’s variable

Disadvantages:
• Time consuming
• Limited information
• Unforeseen factors

Types of Observation

1a. Structured Observation

• when observation is done by characterizing the style of recording the observed
information, standardized conditions of observation, the definition of the units to be
observed, and selection of pertinent data of observation.
• Example: An auditor performing inventory analysis in a store.

1b. Unstructured Observation

• when observation is done without any plan or structure defined before the observation.
• Example: Observing children playing with new toys.

1c. Participant Observation

• when the observer is a member of the group he is observing.


• Advantages:
1. Observation of natural behavior.
2. Closeness with the group.
3. Better understanding.

1d. Non-participant Observation

• when the observer observes people without giving any information to them.
• Advantages:
1. Objectivity and neutrality.
2. More willingness of the respondent.

1e. Uncontrolled

• when the observation takes place in natural conditions. It is done to get a spontaneous
picture of life and persons.

1f. Controlled

• when an observation takes place according to definite pre-arranged plans, with an
experimental procedure, it is a controlled observation. It is generally done in the
laboratory under controlled conditions.

2. Interview

• This method of collecting data involves presentation of oral-verbal stimuli and replies
in terms of oral-verbal responses.
• The interview method is an oral-verbal communication in which the interviewer asks the
respondent questions that are aimed at getting the information required for the study.

Types of Interview

1. Personal interviews: The interviewer asks questions, generally in face-to-face
contact with the other person or persons.

2. Structured interviews: A set of pre-decided questions is used.

3. Unstructured interviews: No system of pre-determined questions is followed.

4. Focused interviews: Attention is focused on the given experience of the
respondent and its possible effects.

5. Clinical interviews: Concerned with broad underlying feelings or motivations,
or with the course of an individual’s life experience, rather than with the effects
of a specific experience, as in the case of a focused interview.

6. Group interviews: A group of 6 to 8 individuals is interviewed.

7. Qualitative and quantitative interviews: Divided on the basis of subject matter,
i.e. whether qualitative or quantitative.

8. Individual interviews: The interviewer meets a single person and interviews that person.

9. Selection interviews: Done for the selection of people for certain jobs.

10. Depth interviews: Deliberately aim to elicit unconscious as well as other
types of material, relating especially to personality dynamics and motivations.

11. Telephonic interviews: Contacting samples by telephone.

3. Questionnaire

• This method of data collection is quite popular, particularly in the case of big
enquiries.
• The questionnaire is mailed to respondents, who are expected to read and understand the
questions and write down the reply in the space meant for the purpose in the questionnaire itself.
• The respondents have to answer the questions on their own.

Advantages:
• Low cost even if the geographical area is too large.
• Answers are in the respondents’ own words, so they are free from bias.
• Adequate time to think about answers.
• Non-approachable respondents may be conveniently contacted.
• Large samples can be used, so results are more reliable.

Disadvantages:
• Low rate of return of duly filled questionnaires.
• Slowest method of data collection.
• Difficult to know if the expected respondent has filled in the form or if it was
filled in by someone else.

4. Case Study

• is essentially an intensive investigation of the particular unit under consideration.

Advantages:
• They are less costly and less time-consuming; they are advantageous when exposure data
is expensive or hard to obtain.
• They are advantageous when studying dynamic populations in which follow-up is difficult.

Disadvantages:
• They are subject to selection bias.
• They generally do not allow calculation of incidence (absolute risk).

5. Survey

• Undertaking surveys is one of the common methods of diagnosing and solving social problems.

Advantages:
• Relatively easy to administer.
• Can be developed in less time (compared to other data-collection methods).
• Cost-effective, but cost depends on survey mode.

Disadvantages:
• Respondents may not feel encouraged to provide accurate, honest answers.
• Surveys with closed-ended questions may have a lower validity rate than
other question types.
• Data errors due to question non-responses may exist.

Practice Activity 03
Define the following:

• Registration Method
• Experimentation Method
METHODS OF DATA COLLECTION |Secondary Data

Sources of Data

• Publications of central, state, and local governments
• Technical and trade journals
• Books, magazines, newspapers
• Reports and publications of industry, banks, stock exchanges
• Reports by research scholars, universities, economists
• Public records

Factors to Consider Before Using Secondary Data

Reliability of data: Who collected the data, when, by which methods, etc.

Suitability of data: The object, scope, and nature of the original inquiry should be studied;
if the original study had a different objective, then the data are not suitable for the
current study.

Adequacy of data: If the level of accuracy is insufficient, or the data cover a different
area, then the data are not adequate for the study.

Factors to Consider When Choosing a Data Collection Method

There are various factors to consider when choosing a data collection method. As such the
researcher must judiciously select the method/methods for his own study, keeping in view the
following factors:

Nature, scope and object of inquiry

This constitutes the most important factor affecting the choice of a particular method. The
method selected should be such that it suits the type of enquiry that is to be conducted by the
researcher. This factor is also important in deciding whether the data already available (secondary
data) are to be used or the data not yet available (primary data) are to be collected.

Availability of funds

The availability of funds for the research project determines to a large extent the method to be
used for the collection of data. When funds at the disposal of the researcher are very limited, he
will have to select a comparatively cheaper method which may not be as efficient and effective as
some other costly method. Finance, in fact, is a big constraint in practice and the researcher has
to act within this limitation.

Time factor

Availability of time has also to be taken into account in deciding a particular method of data
collection. Some methods take relatively more time, whereas with others the data can be
collected in a comparatively shorter duration. The time at the disposal of the researcher, thus,
affects the selection of the method by which the data are to be collected.

Precision required

Precision required is yet another important factor to be considered at the time of selecting the
method of collection of data.

Designing a Survey

Surveys can take different forms. They can be used to ask only one question or they can ask a
series of questions. We can use surveys to test out people’s opinions or to test a hypothesis.

When designing a survey, the following steps are useful:

1. Determine the goal of your survey: What question do you want to answer?
2. Identify the sample population: Whom will you interview?
3. Choose an interviewing method: face-to-face interview, phone interview, self-administered
paper survey, or internet survey.
4. Decide what questions you will ask in what order, and how to phrase them. (This is
important if there is more than one piece of information you are looking for.)
5. Conduct the interview and collect the information.
6. Analyze the results by making graphs and drawing conclusions.

Example:

Martha wants to construct a survey that shows which sports students at her school like to play
the most.
Step 1: List the goal of the survey
Step 2: What population should she interview?
Step 3: How should she administer the survey?
Step 4: Create a data collection sheet that she can use to record her results

Step 1: GOAL
The goal of the survey is to find the answer to the question: “Which sports do students
at Martha’s school like to play the most?”
Step 2: POPULATION
A suitable sample would be a random sample of the student population in
Martha’s school. A good strategy would be to randomly select students (using dice or a
random number generator) as they walk into an all-school assembly.
Step 3: METHODS

Face-to-face interviews are a good choice in this case. Interviews will be easy to conduct
since the survey consists of only one question which can be quickly answered and
recorded, and asking the question face to face will help eliminate non-response bias.

Step 4: DATA
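
One possible way to sketch Steps 2 and 4 in Python is shown below: a random number generator picks the students, and a simple tally plays the role of the data collection sheet. The roster size, the list of sports, and the recorded answers are all made-up placeholders, not data from Martha's survey.

```python
import random
from collections import Counter

# Hypothetical roster: student ID numbers 1..500 (the school size is assumed).
student_ids = list(range(1, 501))

rng = random.Random(2022)  # fixed seed so the sketch is repeatable
sample_ids = rng.sample(student_ids, k=30)  # simple random sample, no repeats

# Placeholder answers; in a real survey these come from the interviews.
sports = ["basketball", "volleyball", "badminton", "swimming"]
answers = {sid: rng.choice(sports) for sid in sample_ids}

# The data collection sheet: one tally row per sport.
tally = Counter(answers.values())
for sport, count in tally.most_common():
    print(f"{sport:<12}{'|' * count} ({count})")
```

Each row of the printout corresponds to one row of a paper tally sheet, with one mark per student who named that sport.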

Basis of Conducting Experiment

1. With an experiment, the researcher is trying to learn something new about the world, an
explanation of 'why' something happens.
2. The experiment must maintain internal and external validity, or the results will be useless.
3. When designing an experiment, a researcher must follow all of the steps of the scientific
method, from making sure that the hypothesis is valid and testable, to using controls and
statistical tests.

Introduction to Design of Experiments (DOE)

What is the Scientific Method?

Do you remember learning about this back in high school or junior high even? What were
those steps again?

• Decide what phenomenon you wish to investigate.
• Specify how you can manipulate the factor and hold all other conditions fixed, to
ensure that these extraneous conditions aren't influencing the response you plan to
measure.
• Then measure your chosen response variable at several (at least two) settings of the
factor under study. If changing the factor causes the phenomenon to change, then
you conclude that there is indeed a cause-and-effect relationship at work.
• How many factors are involved when you do an experiment? Some say two -
perhaps this is a comparative experiment? Perhaps there is a treatment group and a
control group? If you have a treatment group and a control group then, in this case,
you probably only have one factor with two levels.

How many of you have baked a cake? What are the factors involved to ensure a successful
cake? Factors might include preheating the oven, baking time, ingredients, amount of
moisture, baking temperature, etc. What else? You probably follow a recipe so there are
many additional factors that control the ingredients - i.e., a mixture. In other words,
someone did the experiment in advance! What parts of the recipe did they vary to make the
recipe a success? Probably many factors, temperature and moisture, various ratios of
ingredients, and the presence or absence of many additives. Now, should one keep all the
factors involved in the experiment at a constant level and just vary one to see what would
happen? This is a strategy that works but is not very efficient. This is one of the concepts
that we will address in this course.
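
To make the contrast concrete, here is a small sketch that lists the runs for two strategies: changing one factor at a time from a baseline, and running every combination of levels (a full factorial). The factor names and levels are hypothetical choices for the cake example.

```python
from itertools import product

# Hypothetical factors and levels for the cake example.
factors = {
    "temperature_C": [160, 180],
    "baking_time_min": [30, 40],
    "moisture": ["low", "high"],
}

# Full factorial: every combination of levels is run.
full_factorial = [dict(zip(factors, combo)) for combo in product(*factors.values())]

# One factor at a time: start at a baseline, change a single factor per run.
baseline = {name: levels[0] for name, levels in factors.items()}
ofat_runs = [baseline]
for name, levels in factors.items():
    for level in levels[1:]:
        ofat_runs.append({**baseline, name: level})

print(len(full_factorial), "factorial runs;", len(ofat_runs), "one-at-a-time runs")
```

The one-at-a-time plan never includes a run in which two factors change together (for example, higher temperature with longer baking time), so it cannot show whether the factors interact.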

“All experiments are designed experiments, it is just that
some are poorly designed and some are well-designed.”

What is your thought about this quote, young engineers?

Engineering Experiments
If we had infinite time and resource budgets there probably wouldn't be a big fuss made over
designing experiments. In production and quality control we want to control the error and learn
as much as we can about the process or the underlying theory with the resources at hand. From
an engineering perspective we're trying to use experimentation for the following purposes:
• reduce time to design/develop new products & processes
• improve performance of existing processes
• improve reliability and performance of products
• achieve product & process robustness
• perform an evaluation of materials, design alternatives, setting component & system
tolerances, etc.

We always want to fine-tune or improve the process. In today's global world this drive for
competitiveness affects all of us, both as consumers and producers.
Robustness is a concept that enters into statistics at several points. At the analysis stage,
robustness refers to a technique that isn't overly influenced by bad data. Even if there is an
outlier or bad data you still want to get the right answer. For a process, robustness means that
regardless of who or what is involved in the process, it is still going to work.
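
A quick way to see robustness at the analysis stage is to compare the mean and the median when one bad value is present. The readings below are invented for illustration, and the median is used only as a familiar example of a summary that resists outliers.

```python
import statistics

# Hypothetical readings; the last value in the second list is a recording error.
clean = [10.1, 9.9, 10.0, 10.2, 9.8]
with_outlier = clean + [100.0]

mean_clean = statistics.mean(clean)
mean_bad = statistics.mean(with_outlier)
median_clean = statistics.median(clean)
median_bad = statistics.median(with_outlier)

print(f"mean:   {mean_clean:.2f} -> {mean_bad:.2f}")
print(f"median: {median_clean:.2f} -> {median_bad:.2f}")
```

The single bad reading drags the mean far away from the bulk of the data, while the median barely moves.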

Every experiment design has input. Back to the cake baking example: we have our ingredients
such as flour, sugar, milk, eggs, etc. Regardless of the quality of these ingredients we still want
our cake to come out successfully. In every experiment there are inputs and in addition, there are
factors (such as time of baking, temperature, the geometry of the cake pan, etc.), some of which
you can control and others that you can't control. The experimenter must think about factors
that affect the outcome. We also talk about the output and the yield or the response to your
experiment. For the cake, the output might be measured as texture, height, size, or flavor.

The Basic Principles of DOE

Randomization
This is an essential component of any experiment that is going to have validity. If you are doing a
comparative experiment where you have two treatments, a treatment and a control, for instance,
you need to include in your experimental process the assignment of those treatments by some
random process. An experiment includes experimental units. You need to have a deliberate
process to eliminate potential biases from the conclusions, and random assignment is a critical
step.
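
Random assignment can be sketched as follows, assuming ten hypothetical experimental units and two treatments. The helper name `randomize_assignment` is invented for this example; shuffling the units and then dealing them out in turn is one simple way to get equal-sized groups.

```python
import random

def randomize_assignment(units, treatments, seed=None):
    """Randomly assign treatments to units in (near-)equal group sizes."""
    rng = random.Random(seed)
    shuffled = list(units)
    rng.shuffle(shuffled)
    # Deal the shuffled units out to the treatments in turn.
    return {unit: treatments[i % len(treatments)] for i, unit in enumerate(shuffled)}

units = [f"unit_{i}" for i in range(1, 11)]  # 10 hypothetical experimental units
assignment = randomize_assignment(units, ["treatment", "control"], seed=1)
for unit in units:
    print(unit, "->", assignment[unit])
```

Because the order comes from the random number generator and not from the experimenter, no one's judgment decides which unit receives the treatment.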

Replication

Blocking
Blocking is a technique to include other factors in our experiment which contribute to
undesirable variation. Much of the focus in this class will be to creatively use various blocking
techniques to control sources of variation that will reduce error variance. For example, in human
studies, the gender of the subjects is often an important factor. Age is another factor affecting
the response. Age and gender are often considered nuisance factors which contribute to the
variability and make it difficult to assess the systematic effects of treatment. By using these as
blocking factors, you can avoid biases that might occur due to differences in the allocation
of subjects to the treatments, and you can account for some of the noise in the experiment. We
want the unknown error variance at the end of the experiment to be as small as possible. Our
goal is usually to find out something about a treatment factor (or a factor of primary interest),
but in addition to this, we want to include any blocking factors that will explain variation.
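
One simple form of blocking, randomizing the treatments separately inside each block, can be sketched like this. The subjects, the two age groups, and the helper name `blocked_assignment` are hypothetical; this is an illustration of the idea, not a full design procedure.

```python
import random
from collections import defaultdict

def blocked_assignment(subjects, block_of, treatments, seed=None):
    """Randomize treatments separately inside each block."""
    rng = random.Random(seed)
    blocks = defaultdict(list)
    for subject in subjects:
        blocks[block_of[subject]].append(subject)
    assignment = {}
    for members in blocks.values():
        rng.shuffle(members)  # random order within the block
        for i, subject in enumerate(members):
            assignment[subject] = treatments[i % len(treatments)]
    return assignment

# Hypothetical subjects: s1..s4 are under 40, s5..s8 are 40 or older.
block_of = {f"s{i}": ("under_40" if i <= 4 else "40_plus") for i in range(1, 9)}
assignment = blocked_assignment(list(block_of), block_of, ["A", "B"], seed=7)
for subject, treatment in sorted(assignment.items()):
    print(subject, block_of[subject], "->", treatment)
```

Each age group ends up with the same number of subjects on each treatment, so a difference between age groups cannot masquerade as a treatment effect.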


Multi-factor Designs

Confounding
Confounding is something that is usually considered bad! Here is an example. Let's say we are
doing a medical study with drugs A and B. We put 10 subjects on drug A and 10 on drug B. If
we categorize our subjects by gender, how should we allocate our drugs to our subjects? Let's
make it easy and say that there are 10 male and 10 female subjects. A balanced way of doing this
study would be to put five males on drug A and five males on drug B, five females on drug A
and five females on drug B. This is a perfectly balanced experiment such that if there is a
difference between males and females at least it will equally influence the results from drug A
and the results from drug B.
An alternative scenario might occur if patients were randomly assigned treatments as they came
in the door. At the end of the study, the researchers might realize that drug A had only been
given to the male subjects and drug B had only been given to the female subjects. We would call
this design totally confounded. This refers to the fact that if you analyze the difference
between the average response of the subjects on A and the average response of the subjects on B,
this is exactly the same as the difference between the average response of males and the average
response of females. You would not have any reliable conclusion from this study at all. The
difference between the two drugs A and B might just as well be due to the gender of the subjects
since the two factors are totally confounded.
Confounding is something we typically want to avoid, but when we are building complex
experiments we can sometimes use confounding to our advantage. We will confound things we
are not interested in, in order to have more efficient experiments for the things we are interested in.
This will come up in multiple factor experiments later on. We may be interested in main effects
but not interactions, so we will confound the interactions in this way in order to reduce the
sample size, and thus the cost of the experiment, but still have good information on the main
effects.
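
The drug example can be worked through with numbers. The sketch below assumes a made-up additive model with no noise (a drug effect of 5 for drug A and a gender effect of 3 for males) purely to keep the arithmetic visible; none of these values come from a real study.

```python
from statistics import mean

# Assumed (made-up) additive effects, with no noise.
DRUG_EFFECT = {"A": 5.0, "B": 0.0}
GENDER_EFFECT = {"male": 3.0, "female": 0.0}

def response(drug, gender):
    """Response of one subject under the assumed additive model."""
    return 50.0 + DRUG_EFFECT[drug] + GENDER_EFFECT[gender]

def drug_difference(design):
    """Average response on drug A minus average response on drug B."""
    on_a = [response(d, g) for d, g in design if d == "A"]
    on_b = [response(d, g) for d, g in design if d == "B"]
    return mean(on_a) - mean(on_b)

# Balanced design: five males and five females on each drug.
balanced = ([("A", "male")] * 5 + [("A", "female")] * 5
            + [("B", "male")] * 5 + [("B", "female")] * 5)
# Totally confounded design: all males on A, all females on B.
confounded = [("A", "male")] * 10 + [("B", "female")] * 10

print(drug_difference(balanced))    # 5.0: the gender effect cancels out
print(drug_difference(confounded))  # 8.0: drug and gender effects are mixed
```

In the balanced design the comparison recovers the assumed drug effect. In the confounded design the same comparison returns the drug effect plus the gender effect, and the data give no way to separate the two.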

Steps for Planning, Conducting and Analyzing an Experiment

The practical steps needed for planning and conducting an experiment include: recognizing the
goal of the experiment, choice of factors, choice of response, choice of the design, analysis and
then drawing conclusions. This pretty much covers the steps involved in the scientific method.

1. Recognition and statement of the problem


2. Choice of factors, levels, and ranges
3. Selection of the response variable(s)
4. Choice of design
5. Conducting the experiment
6. Statistical analysis
7. Drawing conclusions, and making recommendations

What this topic will deal with primarily is the choice of design. This focus includes all the related
issues about how we handle these factors in conducting our experiments.

Factors
We usually talk about "treatment" factors, which are the factors of primary interest to you. In
addition to treatment factors, there are nuisance factors which are not your primary focus, but
you have to deal with them. Sometimes these are called blocking factors, mainly because we will
try to block these factors to prevent them from influencing the results.

There are other ways that we can categorize factors:

Experimental vs. Classification Factors

Experimental Factors

These are factors that you can specify (and set the levels) and then assign at random as the
treatment to the experimental units. Examples would be temperature, level of an additive,
fertilizer amount per acre, etc.

Classification Factors

These can't be changed or assigned; they come as labels on the experimental units. The
age and sex of the participants are classification factors which can't be changed
or randomly assigned. But you can select individuals from these groups randomly.


Quantitative vs. Qualitative Factors

Quantitative Factors

You can assign any specified level of a quantitative factor. Examples: percent or pH
level of a chemical.

Qualitative Factors

These factors have categories which are different types. Examples might be species of
a plant or animal, a brand in the marketing field, or gender. These are not ordered or
continuous but are arranged perhaps in sets.

You are finally done with Module 1! Hop on for more exciting and challenging activities in
Module 2!

References

• Dodge, Y.; Cox, D.; Commenges, D.; Davidson, A; Solomon, P.; and Wilson, S. (Eds.). The
Oxford Dictionary of Statistical Terms, 6th Edition. New York: Oxford University Press,
2006.
• Beyer, W. H. CRC Standard Mathematical Tables, 31st ed. Boca Raton, FL: CRC Press, pp.
536 and 571, 2002.
• Agresti A. (1990) Categorical Data Analysis. John Wiley and Sons, New York.
• Kotz, S.; et al., eds. (2006), Encyclopedia of Statistical Sciences, Wiley.
• Lindstrom, D. (2010). Schaum’s Easy Outline of Statistics, Second Edition (Schaum’s Easy
Outlines) 2nd Edition. McGraw-Hill Education
• Selection of appropriate method for data collection in research methodology tutorial 04
September 2022 - learn selection of appropriate method for data collection in research
methodology tutorial (11495): Wisdom Jobs India. Wisdom Jobs. (n.d.). Retrieved September
4, 2022, from
https://www.wisdomjobs.com/e-university/research-methodology-tutorial-355/selection-of-
appropriate-method-for-data-collection-11495.html
• Lesson 1: Introduction to design of experiments: Stat 503. PennState: Statistics Online
Courses. (n.d.). Retrieved September 4, 2022, from
https://online.stat.psu.edu/stat503/lesson/1
