Week1 Introduction

biostatistics

Uploaded by

thuy36030

Biostatistics

Spring semester, 2023-2024


General information

- Instructors:
  Theory class: A.Prof.Dr. Nguyen Tan Khoi
  Practice class: Ms. Do Ngoc Phuc Chau
- The content of the theory and practice classes is linked
- The scores are assessed separately:
  Theory class: decided by A.Prof. Khoi
  Practice class: detailed in the scoring rubrics (Blackboard)
    50% from in-class quizzes
    20% from the group project
    30% from the average of the midterm test and final exam*
  * If you have already passed the theory class, the old scores will be used.
Course outline

- Chapter 1: Introduction
- Chapter 2: Descriptive statistics
- Chapter 3: Probability & Distribution of probability
- Chapter 4: Continuous distribution of probability
- Chapter 5: Hypothesis testing
- Chapter 6: ANOVA
- Chapter 7: Regression and correlation analysis
- Chapter 8: Normality test
- Chapter 9: Non-parametric tests
Course syllabus

Course learning outcomes
Knowledge
- CLO1. Present the data collected from an experiment
- CLO2. Distinguish between different hypothesis testing methods for analysis
Skill
- CLO3. Use Excel to present and analyse the data
Attitude
- CLO4. Reason about how to design the question list for a survey experiment

Content
- Week 1: Introduction & Descriptive statistics: processing and presenting data, calculating simple statistical parameters using Excel
- Week 2: Probability & Distribution of probability: calculating the frequency of data using Excel
- Week 3: Group project - Survey preparation: understanding an experiment carried out by survey, knowing different research terms, designing the questions for the survey
- Week 4: Parametric statistical tests: carrying out simple hypothesis testing using Excel, carrying out ANOVA tests using Excel
- Week 5: Non-parametric tests & Normality tests: carrying out non-parametric tests using Excel, performing normality tests using Excel
- Week 6: Review & Group project - Presentation: revising all hypothesis testing methods, presenting the results of the group project

Reading list
1. Jim Fowler, Lou Cohen, and Ph. Jarvis. 1998. Practical Statistics for Field Biology. 2nd Edition. John Wiley & Sons.
2. Chap T. Le. 2003. Introductory Biostatistics. John Wiley & Sons.
3. J. Susan Milton and Jesse C. Arnold. 2003. Introduction to Probability and Statistics. 4th Edition. McGraw-Hill.
INTRODUCTION
Content
1. Biostatistics
2. Types of data
3. Displaying data
4. Sample and Population
5. Different tools for biostatistics

Learning objectives
- Distinguish between qualitative and quantitative data
- Describe the four scales of measurement
- Describe the difference between population, census and sample
- Use different types of chart for different types of data
WHAT IS BIOSTATISTICS?
Biostatistics is statistics used in biological fields.

So, what is statistics?
- The process of converting data into information.
- It consists of steps such as generating hypotheses, collecting data, and applying analysis methods.

Biostatistics, then, teaches us how to summarize, analyze, and draw meaningful inferences from data, which in turn lead to the confirmation of hypotheses that relate to biological problems.
CATEGORIES of STATISTICS
How many pairs of shoes does each
student in our class own?
Descriptive Statistics
✓ Collect
✓ Organize
✓ Summarize
✓ Display
✓ Analyze

Inferential Statistics
✓ Predict and forecast
values of population
parameters
✓ Test hypotheses about
values of population
parameters
Key terms in the shoe example: observation, variable
Youtube channel: Dr Nic’s Maths and Stats
CATEGORIES of STATISTICS
How many pairs of shoes does each student in our class own?
Follow-up questions:
- How many pairs of shoes might the 162nd student have?
- How many pairs does that student own if she is a girl (or he is a boy)?
- Does the girl own more shoes than the boy?
- Do students in our class own more shoes than the norm of 3 pairs?
TYPES of DATA
Qualitative data (Categorical or Nominal)
Examples are:
✓ Color
✓ Gender
✓ Level of agreement

Quantitative data (Measurable or Countable)
• Discrete variable
• Continuous variable
Examples are:
✓ Temperatures
✓ Salaries
✓ Number of students in a group
✓ Level of agreement

SCALES of MEASUREMENT
Nominal Scale – groups or classes
Ordinal Scale – order matters
Interval Scale – difference or distance matters – has arbitrary zero value
Ratio Scale – ratio matters – has a natural zero value
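The difference between the interval and ratio scales can be illustrated with temperature. This is a small sketch with made-up values: degrees Celsius has an arbitrary zero (interval scale), while kelvins have a natural zero (ratio scale).

```python
# Temperatures in degrees Celsius are on an interval scale (arbitrary zero).
celsius_a, celsius_b = 10.0, 20.0  # hypothetical readings

# Convert to kelvins, a ratio scale with a natural zero.
kelvin_a = celsius_a + 273.15
kelvin_b = celsius_b + 273.15

# Differences are meaningful on both scales and agree with each other.
diff_celsius = celsius_b - celsius_a
diff_kelvin = kelvin_b - kelvin_a

# Ratios are only meaningful on the ratio scale.
ratio_celsius = celsius_b / celsius_a  # 2.0, but 20 °C is not "twice as hot" as 10 °C
ratio_kelvin = kelvin_b / kelvin_a     # roughly 1.035

print(diff_celsius, round(diff_kelvin, 2))
print(ratio_celsius, round(ratio_kelvin, 3))
```

The two differences match, but the two ratios do not, which is why "ratio matters" only when the zero is natural.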
DISPLAYING DATA

Discrete variables
Pie chart
Bar chart
Line graph

Continuous variables
Histogram
Frequency polygon
Ogive (or Cumulative frequency graph)
Stem-and-Leaf diagram
Scatter plot
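As an illustration of one of the displays listed above, the following is a minimal sketch of a stem-and-leaf diagram in plain Python. It assumes non-negative whole numbers, uses the tens as the stem and the ones digit as the leaf, and the `ages` data are hypothetical.

```python
def stem_and_leaf(values):
    """Group whole numbers by tens (stem) and ones digit (leaf)."""
    ordered = sorted(values)
    # Include every stem between the smallest and largest, even if empty.
    stems = range(ordered[0] // 10, ordered[-1] // 10 + 1)
    plot = {stem: [] for stem in stems}
    for value in ordered:
        plot[value // 10].append(value % 10)
    return plot


# Hypothetical data set, for illustration only.
ages = [23, 35, 21, 48, 36, 29, 31, 52, 27, 33]

for stem, leaves in stem_and_leaf(ages).items():
    print(f"{stem} | {' '.join(str(leaf) for leaf in leaves)}")
```

Each printed row shows one stem followed by its sorted leaves, so the shape of the distribution is visible while the individual values are preserved.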
SAMPLE and POPULATION

A population – consists of the set of all measurements in which the investigator is interested

A sample – is a subset of the measurements selected from the population

A census – is a complete enumeration of every item in a population

Question: what is the difference between a population and a census?
POPULATION or SAMPLE
What is the average height of IU students?

Population: all IU students (more than 7,000)
✓ Impossible
✓ Impractical
✓ Too costly
✓ Takes a long time

Sample: 300 students
✓ Possible
✓ Easy to achieve
✓ Cheaper
✓ Faster
Sampling and Simple random sample
• Sampling from the population is often done randomly,
such that every possible sample of equal size (n) will have
an equal chance of being selected
• A sample selected in this way is called a simple random
sample or just a random sample
• A random sample allows chance to determine its elements
How can we draw a random sample of 300 students from all IU students?
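One possible answer, sketched in Python: number every student, then let a random number generator pick 300 IDs without replacement. The ID range 1 to 7000 and the seed are assumptions for illustration; the slide only states that there are more than 7,000 students.

```python
import random

# Hypothetical sampling frame: one ID per student (assumed to be 1..7000).
student_ids = list(range(1, 7001))

# A fixed seed makes the draw reproducible; any seed would do.
rng = random.Random(2024)

# Simple random sample: every subset of 300 IDs is equally likely,
# and no student can be picked twice.
sample_ids = rng.sample(student_ids, k=300)

print(len(sample_ids), sorted(sample_ids)[:5])
```

In practice the sampling frame would be the real enrolment list rather than a generated range.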
EXPERIMENT, SET and EVENT
Set and Complement of set
Intersection of sets
Union of sets
Mutually exclusive or disjoint sets
Partitions
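The set terms listed above can be tried out with Python's built-in `set` type. This sketch assumes a simple experiment, rolling one six-sided die, with events chosen purely for illustration.

```python
# Experiment: roll one die. The sample space is the set of all outcomes.
sample_space = {1, 2, 3, 4, 5, 6}

# Two events (subsets of the sample space).
even = {2, 4, 6}
greater_than_3 = {4, 5, 6}

# Complement: outcomes in the sample space that are not in the event.
complement_of_even = sample_space - even

# Intersection: outcomes in both events. Union: outcomes in either event.
intersection = even & greater_than_3
union = even | greater_than_3

# Mutually exclusive (disjoint) sets share no outcome.
mutually_exclusive = even.isdisjoint(complement_of_even)

# A partition: disjoint sets whose union is the whole sample space.
parts = [even, complement_of_even]
is_partition = set().union(*parts) == sample_space and all(
    a.isdisjoint(b) for i, a in enumerate(parts) for b in parts[i + 1:]
)

print(complement_of_even, intersection, union, mutually_exclusive, is_partition)
```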
TOOLS FOR BIOSTATISTICS…
DESCRIPTIVE STATISTICS
Content
1. Measures of Central Tendency: Mean, Mode, Median
2. Percentiles and Quartiles
3. Measures of Variability: Range, Interquartile range, Variance, Standard deviation
4. Using Excel to calculate the measures
5. Skewness and Kurtosis
6. Relations between the Mean and Standard deviation

Learning objectives
- Explain measures of central tendency and how to compute them
- Calculate and interpret percentiles and quartiles
- Explain measures of variability and how to compute them
- Distinguish different symbols
- Understand the relations between mean and standard deviation
Does the girl own more shoes than the boy?
MEASURES of CENTRAL TENDENCY

Median – middle value when sorted in order of magnitude
Mode – most frequently occurring value
Mean – average
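The course computes these measures in Excel; as a cross-check, the same three measures can be sketched with Python's standard `statistics` module. The `shoe_pairs` values are hypothetical survey answers, not data from the class.

```python
import statistics

# Hypothetical answers to "How many pairs of shoes do you own?"
shoe_pairs = [2, 3, 3, 4, 5, 3, 6, 2, 4, 3]

mean_pairs = statistics.mean(shoe_pairs)      # sum of values / number of values
median_pairs = statistics.median(shoe_pairs)  # middle value of the sorted data
mode_pairs = statistics.mode(shoe_pairs)      # most frequently occurring value

print(mean_pairs, median_pairs, mode_pairs)
```

With an even number of observations, as here, the median is the average of the two middle values.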
Arithmetic Mean or Average

Population Mean Sample Mean
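The formulas that accompanied these two labels did not survive extraction. The standard textbook definitions are given here as an assumption about what the slide showed, with N the population size, n the sample size, and x_i the individual observations:

```latex
\mu = \frac{1}{N}\sum_{i=1}^{N} x_i
\qquad\qquad
\bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i
```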


MEASURES of VARIABILITY or DISPERSION

Range – difference between maximum and minimum values
Interquartile range (IQR) – difference between the third and first quartiles
Variance – average of the squared deviations from the mean

Population Variance Sample Variance
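The variance formulas themselves were lost in extraction. The standard definitions are sketched below as an assumption about the slide's content, using μ and N for the population and x̄ and n for the sample; note that the sample variance divides by n − 1 rather than n. The standard deviations σ and s are the square roots of these quantities.

```latex
\sigma^2 = \frac{1}{N}\sum_{i=1}^{N} (x_i - \mu)^2
\qquad\qquad
s^2 = \frac{1}{n-1}\sum_{i=1}^{n} (x_i - \bar{x})^2
```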

Standard deviation (SD) – square root of the variance
Population SD Sample SD
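The measures of variability above can be sketched with Python's standard `statistics` module, which keeps population and sample versions separate (`pvariance`/`pstdev` versus `variance`/`stdev`). The data are hypothetical, and `method="exclusive"` is chosen because it uses the (n + 1) position rule for quartiles.

```python
import statistics

# Hypothetical answers to "How many pairs of shoes do you own?"
shoe_pairs = [2, 3, 3, 4, 5, 3, 6, 2, 4, 3]

# Range: maximum minus minimum.
data_range = max(shoe_pairs) - min(shoe_pairs)

# Quartiles and interquartile range (Q3 - Q1).
q1, q2, q3 = statistics.quantiles(shoe_pairs, n=4, method="exclusive")
iqr = q3 - q1

# Population versions divide by N; sample versions divide by n - 1.
population_variance = statistics.pvariance(shoe_pairs)
sample_variance = statistics.variance(shoe_pairs)
population_sd = statistics.pstdev(shoe_pairs)
sample_sd = statistics.stdev(shoe_pairs)

print(data_range, iqr)
print(round(population_variance, 3), round(sample_variance, 3))
print(round(population_sd, 3), round(sample_sd, 3))
```

The sample variance is always a little larger than the population variance computed from the same numbers, because it divides by n − 1 instead of n.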
Percentile – the value below which a given percentage of the observations fall
Example: the 85th percentile of the GPA set is 72, which means that 85% of the observations in the GPA set are lower than 72

Quartiles – the percentile points that break the ordered data set into quarters
• The first quartile, Q1, or lower quartile is the 25th percentile – the point below which lie ¼ of the data
• The second quartile, Q2, or middle quartile is the 50th percentile – the point below which lie ½ of the data. This is also called the median
• The third quartile, Q3, or upper quartile is the 75th percentile – the point below which lie ¾ of the data
* Find the 80th percentile:
1/ Order the data ascending (from smallest to largest)
2/ Find the position of the 80th percentile using the equation (n + 1) x P / 100, with n = number of observations and P = the percentile
➔ (20 + 1) x 80 / 100 = 16.8
Position 16.8 lies between the 16th and 17th observations in the ordered data, closer to the 17th.
3/ Calculate the value at position 16.8 using the equation (x_after – x_before) x (decimal part of position) + x_before
→ (x of 17th – x of 16th) x 0.8 + x of 16th
➔ (33 – 32) x 0.8 + 32 = 32.8

32.8 is the 80th percentile of the data set or, equivalently, 80% of the data set has a value smaller than 32.8
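The three steps above can be sketched as a small Python function. The function name `percentile` and the data set are hypothetical; the data were chosen only so that the 16th and 17th ordered values are 32 and 33, as in the worked example. Positions that fall before the first or after the last observation are clamped to the ends, which is an added assumption.

```python
def percentile(values, p):
    """Percentile by the (n + 1) * P / 100 position rule with linear interpolation."""
    data = sorted(values)                # step 1: order the data ascending
    n = len(data)
    position = (n + 1) * p / 100         # step 2: position of the percentile
    if position <= 1:                    # assumption: clamp to the smallest value
        return data[0]
    if position >= n:                    # assumption: clamp to the largest value
        return data[-1]
    before = int(position)               # whole part, e.g. 16 for 16.8
    fraction = position - before         # decimal part, e.g. 0.8
    x_before = data[before - 1]          # positions are 1-based, lists are 0-based
    x_after = data[before]
    return (x_after - x_before) * fraction + x_before  # step 3


# Hypothetical data set of 20 observations.
data = [12, 14, 15, 17, 18, 20, 21, 22, 24, 25,
        26, 27, 28, 29, 30, 32, 33, 35, 36, 38]

print(round(percentile(data, 80), 1))
```

Other software may use slightly different position rules, so results can differ a little from this method for the same data.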
OTHER MEASURES

Skewness – measure of the degree of asymmetry of a frequency distribution
• Skewed to left
• Symmetric or unskewed
• Skewed to right

Kurtosis – measure of flatness or peakedness of a frequency distribution
• Platykurtic (relatively flat)
• Mesokurtic (normal)
• Leptokurtic (relatively peaked)
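The slides give no formulas for these two measures. One common choice, used here as an assumption, is the moment-based population definition: skewness = m3 / m2^1.5 and excess kurtosis = m4 / m2^2 − 3, where m_k is the average of the k-th powers of the deviations from the mean. Spreadsheet functions typically apply sample-size corrections, so their results will differ somewhat from this sketch.

```python
def central_moment(values, k):
    """Average of the k-th powers of the deviations from the mean."""
    mean = sum(values) / len(values)
    return sum((x - mean) ** k for x in values) / len(values)


def skewness(values):
    """Moment-based skewness: negative = skewed left, 0 = symmetric, positive = skewed right."""
    return central_moment(values, 3) / central_moment(values, 2) ** 1.5


def excess_kurtosis(values):
    """Moment-based excess kurtosis: negative = platykurtic, about 0 = mesokurtic, positive = leptokurtic."""
    return central_moment(values, 4) / central_moment(values, 2) ** 2 - 3


# Hypothetical data sets, for illustration only.
symmetric = [1, 2, 3, 4, 5]
right_skewed = [1, 1, 1, 2, 10]

print(skewness(symmetric), round(excess_kurtosis(symmetric), 2))
print(round(skewness(right_skewed), 2))
```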
RELATIONS between the MEAN and S.D.

Chebyshev’s Theorem
•Applies to any distribution, regardless of shape
•Places lower limits on the percentages of observations within a
given number of standard deviations from the mean

Empirical Rule
•Applies only to roughly mound-shaped and symmetric
distributions
•Specifies approximate percentages of observations within a given
number of standard deviations from the mean
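The two rules can be compared numerically. Chebyshev's Theorem guarantees that at least 1 − 1/k² of the observations lie within k standard deviations of the mean (for k > 1), while the Empirical Rule gives the familiar approximate figures of 68%, 95% and 99.7% for k = 1, 2, 3. The sketch below uses hypothetical data, and the helper names are illustrative.

```python
import statistics


def chebyshev_lower_bound(k):
    """Minimum proportion of observations within k SDs of the mean (any distribution, k > 1)."""
    return 1 - 1 / k ** 2


# Approximate proportions for mound-shaped, symmetric distributions.
EMPIRICAL_RULE = {1: 0.68, 2: 0.95, 3: 0.997}


def proportion_within(values, k):
    """Observed proportion of values within k population SDs of the mean."""
    mean = statistics.mean(values)
    sd = statistics.pstdev(values)
    inside = [x for x in values if abs(x - mean) <= k * sd]
    return len(inside) / len(values)


# Hypothetical data set, for illustration only.
shoe_pairs = [2, 3, 3, 4, 5, 3, 6, 2, 4, 3]

for k in (2, 3):
    print(k, round(chebyshev_lower_bound(k), 3), EMPIRICAL_RULE[k],
          proportion_within(shoe_pairs, k))
```

Chebyshev's bound is deliberately conservative: it holds for every distribution, so the observed proportion is usually well above it.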
Distribution of data (figure): Chebyshev’s Theorem compared with the Empirical Rule
