Chapter 1: Introduction to Statistics
nguyenthuhang.cs2@ftu.edu.vn
Assessment
Attendance: 10%
Mid-term test: 30%
Final exam: 60%
Course outline
Chapter 1: Introduction to Statistics
Chapter 2: Summarizing Data
Chapter 3: Numerical Descriptive Techniques
Chapter 4: Inferences Based on a Single Sample:
Confidence Intervals and Tests of Hypothesis
Chapter 5: Inferences Based on Two Samples:
Confidence Intervals and Tests of Hypothesis
Chapter 6: Analysis of Variance (ANOVA)
Chapter 7: Regression Analysis
Chapter 8: Time Series Analysis
Textbook
3. Characterizing Data (e.g., Average)
Decision-Making
1. What is statistics?
Economics: Forecasting, Demographics
Engineering: Construction, Materials
Sports: Individual & Team Performance
Business: Consumer Preferences, Financial Trends
Objectives of Statistics
Drawing conclusions
(making estimates, decisions, predictions, etc.)
about sets of data based on sampling
Types of Statistics
Statistics
The branch of mathematics that transforms data into
useful information for decision makers.
Collect data
e.g., Survey
Present data
e.g., Tables and graphs
Characterize data
e.g., Sample mean: $\bar{X} = \frac{\sum X_i}{n}$
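The sample mean formula can be illustrated with a minimal sketch. The function name sample_mean and the five weights below are assumptions made up for illustration; they do not come from the course material.

```python
def sample_mean(values):
    """Return the sample mean: the sum of the observations divided by n."""
    n = len(values)
    if n == 0:
        raise ValueError("sample mean is undefined for an empty sample")
    return sum(values) / n

# Hypothetical sample of five weights in pounds (invented for illustration).
weights = [118, 122, 125, 119, 121]
x_bar = sample_mean(weights)
print(x_bar)  # 121.0
```

Here the sum of the observations is 605 and n is 5, so the sample mean is 121.0.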
Descriptive Statistics
Descriptive statistics
utilizes numerical and graphical methods to
explore data,
i.e., to look for patterns in a data set,
to summarize the information revealed in a
data set,
and to present the information in a convenient
form.
Inferential Statistics
Estimation
e.g., Estimate the population
mean weight using the sample
mean weight
Hypothesis testing
e.g., Test the claim that the
population mean weight is 120
pounds
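As a rough sketch of the two tasks above, the code below estimates a population mean weight with the sample mean and computes a one-sample t statistic for the claim that the population mean is 120 pounds. The sample values and the helper name one_sample_t are hypothetical, and the choice of a t statistic is an assumption; the slides do not say which test is intended.

```python
import math
import statistics

def one_sample_t(sample, mu0):
    """Return (sample mean, t statistic) for testing a claimed mean mu0."""
    n = len(sample)
    x_bar = statistics.mean(sample)   # point estimate of the population mean
    s = statistics.stdev(sample)      # sample standard deviation (n - 1 divisor)
    standard_error = s / math.sqrt(n)
    t = (x_bar - mu0) / standard_error
    return x_bar, t

# Hypothetical sample of five weights in pounds (invented for illustration).
weights = [118, 122, 125, 119, 121]
x_bar, t = one_sample_t(weights, mu0=120)
print("Estimated population mean:", x_bar)
print("t statistic:", round(t, 3))
```

The t statistic would then be compared with a critical value from the t distribution with n - 1 degrees of freedom; that lookup is left out to keep the sketch within the standard library.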
a decision maker
Parameter
a summary measure (e.g., mean) that is computed
to describe a characteristic of a population
Population Sample
Types of Data
Quantitative Data
Qualitative Data
Quantitative Data
Measured on a numeric scale.
Number of defective items in a lot.
Salaries of CEOs of
at a company.
Method of payment (e.g., credit)
Example
Chemical and manufacturing plants sometimes
discharge toxic-waste materials such as DDT into
nearby rivers and streams. These toxins can
adversely affect the plants and animals inhabiting
the river and the riverbank. The U.S. Army Corps
of Engineers conducted a study of fish in the
Tennessee River (in Alabama) and its three
tributary creeks: Flint Creek, Limestone Creek, and
Spring Creek. A total of 144 fish were captured,
and the following variables were measured for
each:
1. River/creek where each fish was captured
2. Species (channel catfish, largemouth bass, or
smallmouth buffalo fish)
3. Length (centimeters)
4. Weight (grams)
5. DDT concentration (parts per million)
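The five variables can be sorted into qualitative and quantitative data. The sketch below does this by checking whether a recorded value is a number; the record for a single fish is hypothetical (the values are invented, not taken from the study), and the helper name data_type is an assumption.

```python
def data_type(value):
    """Classify a recorded value as quantitative (numeric) or qualitative."""
    # bool is a subclass of int in Python, so treat it as a category.
    if isinstance(value, bool):
        return "qualitative"
    if isinstance(value, (int, float)):
        return "quantitative"
    return "qualitative"

# Hypothetical record for one captured fish (values invented for illustration).
fish = {
    "location": "Flint Creek",
    "species": "channel catfish",
    "length_cm": 42.5,
    "weight_g": 732,
    "ddt_ppm": 10.0,
}

types = {name: data_type(value) for name, value in fish.items()}
for name, kind in types.items():
    print(name, "->", kind)
```

Location and species come out as qualitative, while length, weight, and DDT concentration come out as quantitative. The type check is only a heuristic: categories stored as numeric codes would be misclassified.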
Data
Categorical (Defined categories)
Examples: Marital Status, Political Party, Eye Color
Numerical
Discrete (Counted items)
Examples: Number of Children, Defects per hour
Continuous (Measured characteristics)
Examples: Weight, Voltage
Levels of Measurement
Primary Sources: The data collector is the one using the data
for analysis
Data from a political survey
Data collected from an experiment
Observed data
Secondary Sources: The person performing data analysis is
not the data collector
Analyzing census data
Examining data from print journals or data published on the internet.
5. Sources of Data
Published source:
book, journal, newspaper, Web site
(https://www.wider.unu.edu/data, https://data.worldbank.org/)
Designed experiment:
researcher exerts strict control over the units
Survey:
a group of people are surveyed and their responses
are recorded
Observational study:
units are observed in their natural setting and variables of
interest are recorded
Designed Experiment
Questionnaires
Valid: measures the quantity or concept that is
supposed to be measured
Reliable: measures the quantity or concept in a
consistent manner
Steps to design a questionnaire:
Step 1: Write out the primary and secondary aims
of your study.
Step 2: Write out concepts/information to be
collected that relate to these aims.
Step 3: Review the current literature to identify
already validated questionnaires that measure
your specific area of interest.
Step 4: Compose a draft of your questionnaire.
Step 5: Revise the draft.
Step 6: Assemble the final questionnaire.
Step 1: Define the aims of the
study
Compose a draft
Question: What brand of computer do you own?
(A) IBM PC
(B) Apple
Principle: Avoid hidden assumptions. Make sure to
accommodate all possible answers.
Solution:
(1) Make each response a separate dichotomous item
Do you own an IBM PC? (Circle: Yes or No)
Compose a draft
Question: Which one of the following do you think increases a
person’s chance of having a heart attack the most? (Check
one.)
[ ] Smoking [ ] Being overweight [ ] Stress
Principle: Encourage the respondent to consider each possible
response to avoid the uncertainty of whether a missing item may
represent either an answer that does not apply or an overlooked
item.
Solution: Which of the following increases the chance of having
a heart attack?
Smoking: [ ] Yes [ ] No [ ] Don’t know
Being overweight: [ ] Yes [ ] No [ ] Don’t know
Stress: [ ] Yes [ ] No [ ] Don’t know
Compose a draft
Question:
(1) Do you currently have a life insurance policy?
(Circle: Yes or No)
If no, go to question 3.
(2) How much is your annual life insurance premium?
Principle: Avoid branching as much as possible
to avoid confusing respondents.
Solution: If possible, write as one question.
How much did you spend last year for life insurance?
(Write 0 if none).
Step 5: Revise the questionnaire
Include white space to make answers clear and
to help increase response rate.
Space response scales widely enough so that it
is easy to circle or check the correct answer
without the mark accidentally including the
answer above or below.
Open-ended questions: the space for the response
should be big enough to allow respondents with large
handwriting to write comfortably in the space.
Closed-ended questions: line up answers vertically
and precede them with boxes or brackets to check, or
by numbers to circle, rather than open blanks.
Non-responders
Conclusions