FIN 640 - Lecture Notes 4 - Sampling and Estimation
FALL 2020
1. Sampling
2. The Central Limit Theorem
3. Point and Interval Estimates
4. Sampling Bias
1. Sampling
Simple Random Sampling
When an analyst chooses to sample, he must formulate a sampling
plan. A sampling plan is the set of rules used to select a sample. The
basic type of sample from which we can draw statistically sound
conclusions about a population is the simple random sample
(random sample, for short).
A simple random sample is a subset of a larger population created
in such a way that each element of the population has an equal
probability of being selected to the subset.
The procedure of drawing a sample to satisfy the definition of a
simple random sample is called simple random sampling.
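As a minimal sketch (not part of the lecture), the snippet below draws a simple random sample from a hypothetical population of simulated monthly returns; the population values and seed are assumptions chosen only for illustration.

    import numpy as np

    # Hypothetical population: 500 simulated monthly returns (values are assumptions)
    rng = np.random.default_rng(seed=42)
    population = rng.normal(loc=0.01, scale=0.05, size=500)

    # Simple random sample: each element has an equal probability of selection
    sample = rng.choice(population, size=30, replace=False)

    print("Population mean:", round(population.mean(), 4))
    print("Sample mean:    ", round(sample.mean(), 4))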
Systematic sampling
With systematic sampling, we select every kth member until we
have a sample of the desired size. The sample that results from this
procedure should be approximately random. Real sampling
situations may require that we take an approximately random
sample.
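A hedged sketch of systematic sampling on the same kind of hypothetical simulated data: pick a random starting point within the first interval, then take every kth member.

    import numpy as np

    rng = np.random.default_rng(seed=0)
    population = rng.normal(loc=0.01, scale=0.05, size=500)   # hypothetical returns

    n = 50                        # desired sample size
    k = len(population) // n      # sampling interval: take every kth member
    start = rng.integers(k)       # random starting point within the first interval
    systematic_sample = population[start::k][:n]

    print("Sample size:", len(systematic_sample),
          "Sample mean:", round(systematic_sample.mean(), 4))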
Sampling Error
Sampling error is the difference between the observed value of a
statistic and the quantity it is intended to estimate.
Sampling distribution
The sampling distribution of a statistic is the distribution of all the
distinct possible values that the statistic can assume when
computed from samples of the same size randomly drawn from the
same population.
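To make the definition concrete, the following illustrative snippet (a made-up five-element population, not from the lecture) enumerates every possible sample of size 2 and lists the resulting sample means; that list of values is the sampling distribution of the mean for this population and sample size.

    from itertools import combinations
    import numpy as np

    # Tiny made-up population so every possible sample can be listed
    population = np.array([2.0, 4.0, 6.0, 8.0, 10.0])

    # All distinct samples of size 2 drawn without replacement, and their means
    sample_means = [np.mean(s) for s in combinations(population, 2)]

    # The sampling distribution of the mean is the distribution of these values
    print(sorted(sample_means))
    print("Mean of the sample means:", np.mean(sample_means))   # equals the population mean, 6.0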
2. The Central Limit Theorem
The central limit theorem states that, for a population with mean μ
and finite variance σ², the sampling distribution of the sample mean
X̄ computed from samples of size n becomes approximately normal,
with mean μ and variance σ²/n, as the sample size n becomes large.
Let’s recall the example we used in estimating mean stock returns
for IBM.
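A simple simulation sketch, illustrative only and not the IBM example from class, that is consistent with the theorem: sample means drawn from a decidedly non-normal population have mean close to μ and standard deviation close to σ/√n.

    import numpy as np

    rng = np.random.default_rng(seed=1)

    # Strongly skewed (non-normal) population: exponential with mean 1 and variance 1
    n, n_trials = 50, 10_000
    sample_means = rng.exponential(scale=1.0, size=(n_trials, n)).mean(axis=1)

    print("Mean of sample means:", round(sample_means.mean(), 4))        # close to mu = 1
    print("Std of sample means: ", round(sample_means.std(ddof=1), 4))   # close to sigma/sqrt(n)
    print("sigma / sqrt(n):     ", round(1.0 / np.sqrt(n), 4))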
Standard Error of the Sample Mean
For a sample mean X̄ calculated from a sample generated by a
population with standard deviation σ, the standard error of the
sample mean is given by one of two expressions.
When we know σ, the population standard deviation:
σ_X̄ = σ / √n
When we do not know σ, we use the sample standard deviation s:
s_X̄ = s / √n
We will soon see how we can use the sample mean and its
standard error to make probability statements about the
population mean by using the technique of confidence intervals.
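A short sketch of the two expressions, using a hypothetical return sample and an assumed value of σ for the known-σ case:

    import numpy as np

    sample = np.array([0.02, -0.01, 0.03, 0.00, 0.015, -0.005, 0.01, 0.02])  # hypothetical returns
    n = len(sample)

    # Case 1: population standard deviation sigma is known (value assumed for illustration)
    sigma = 0.02
    se_known = sigma / np.sqrt(n)

    # Case 2: sigma is unknown, so we use the sample standard deviation s (divisor n - 1)
    s = sample.std(ddof=1)
    se_unknown = s / np.sqrt(n)

    print("Standard error (known sigma): ", round(se_known, 4))
    print("Standard error (estimated s): ", round(float(se_unknown), 4))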
3. Point and Interval Estimates
We care most about the population mean, so we use estimators
calculated from the sample to estimate the population mean.
The formulas that we use to compute the sample mean and all the
other sample statistics are examples of estimation formulas or
estimators.
The particular value that we calculate from sample observations
using an estimator is called an estimate.
Point Estimate
To take the example of the mean, the calculated value of the
sample mean in a given sample, used as an estimate of the
population mean, is called a point estimate of the population
mean.
In many applications, we have a choice among a number of
possible estimators for estimating a given parameter. How do we
make our choice? We often select estimators because they have
one or more desirable statistical properties. Following is a brief
description of three desirable properties of estimators:
unbiasedness (lack of bias), efficiency, and consistency.
Unbiasedness
An unbiased estimator is one whose expected value (the mean of its
sampling distribution) equals the parameter it is intended to estimate.
For example, the expected value of the sample mean, X̄, equals μ, the
population mean, so we say that the sample mean is an unbiased estimator
(of the population mean).
The sample variance, s², which is calculated using a divisor of n − 1 (equation
3), is an unbiased estimator of the population variance, σ². If we were to
calculate the sample variance using a divisor of n, the estimator would be
biased: its expected value would be smaller than the population variance. We
would say that the sample variance calculated with a divisor of n is a biased
estimator of the population variance.
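A quick simulation sketch consistent with this claim (the population parameters and seed are arbitrary assumptions): averaging the two estimators over many samples shows the n − 1 divisor centering on σ² while the n divisor falls short.

    import numpy as np

    rng = np.random.default_rng(seed=2)
    mu, sigma, n, n_trials = 0.0, 1.0, 5, 100_000

    samples = rng.normal(mu, sigma, size=(n_trials, n))

    var_unbiased = samples.var(axis=1, ddof=1)   # divisor n - 1
    var_biased = samples.var(axis=1, ddof=0)     # divisor n

    print("Average s^2 with divisor n - 1:", round(var_unbiased.mean(), 3))  # close to sigma^2 = 1.0
    print("Average s^2 with divisor n:    ", round(var_biased.mean(), 3))    # close to (n-1)/n = 0.8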
Unbiasedness
Sample mean and sample variance are both unbiased estimators of
the population mean and variance.
Consistency
A consistent estimator is one for which the probability of estimates
close to the value of the population parameter increases as sample
size increases.
Consistency
Law of Large Numbers (LLN)
The weak law of large numbers (also called Khinchin's law) states
that the sample average converges in probability towards the
expected value.
The strong law of large numbers states that the sample average
converges almost surely to the expected value.
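An illustrative sketch of the law of large numbers (the true mean of 1 percent and the other parameters are assumptions): the running average of simulated returns drifts toward the expected value as the number of observations grows.

    import numpy as np

    rng = np.random.default_rng(seed=3)

    # Running average of simulated monthly returns with an assumed true mean of 1 percent
    draws = rng.normal(loc=0.01, scale=0.05, size=100_000)
    running_mean = np.cumsum(draws) / np.arange(1, draws.size + 1)

    for n in (10, 100, 1_000, 100_000):
        print(f"n = {n:>7}: running average = {running_mean[n - 1]: .5f}")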
Efficiency
An unbiased estimator is efficient if no other unbiased estimator of
the same parameter has a sampling distribution with smaller
variance.
Sample mean X̄ is an efficient estimator of the population mean;
sample variance s² is an efficient estimator of σ².
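The following simulation sketch (illustrative assumptions only) compares the sample mean with the sample median, which is also an unbiased estimator of the mean of a normal population, and shows the mean's smaller sampling variance.

    import numpy as np

    rng = np.random.default_rng(seed=4)
    n, n_trials = 50, 20_000

    samples = rng.normal(loc=0.0, scale=1.0, size=(n_trials, n))
    means = samples.mean(axis=1)
    medians = np.median(samples, axis=1)

    # Both estimators are unbiased for the mean of a normal population,
    # but the sample mean has the smaller sampling variance (it is more efficient)
    print("Variance of sample means:  ", round(means.var(ddof=1), 4))     # about 1/n = 0.02
    print("Variance of sample medians:", round(medians.var(ddof=1), 4))   # about pi/(2n) ≈ 0.031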
Interval Estimate
Confidence Interval:
A confidence interval is a range for which one can assert with a given
probability 1 − α, called the degree of confidence, that it will contain the
parameter it is intended to estimate. This interval is often referred to as
the 100(1 − α)% confidence interval for the parameter.
The endpoints of a confidence interval are referred to as the lower and
upper confidence limits.
Confidence interval estimate
A 100(1 − α)% confidence interval for a parameter has the
following structure:
Point estimate ± Reliability factor × Standard error
Reliability Factors for Confidence Intervals Based on the Standard Normal
Distribution. We use the following reliability factors when we construct
confidence intervals based on the standard normal distribution:
90 percent confidence intervals: use z_0.05 = 1.65
95 percent confidence intervals: use z_0.025 = 1.96
99 percent confidence intervals: use z_0.005 = 2.58
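These factors can be recovered from the standard normal quantile function; a small sketch using scipy (rounding explains the 1.65 and 2.58 quoted above):

    from scipy.stats import norm

    # z_(alpha/2) reliability factors for standard-normal confidence intervals
    for conf in (0.90, 0.95, 0.99):
        alpha = 1 - conf
        z = norm.ppf(1 - alpha / 2)
        print(f"{conf:.0%} interval: alpha/2 = {alpha / 2:.3f}, z = {z:.3f}")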
Confidence interval for the Population Mean
(with Unknown Pop. Variance)
If we are sampling from a population with unknown variance, then a 100(1 −
α)% confidence interval for the population mean μ is given by
X̄ ± t_{α/2} × (s / √n)
where t_{α/2} is the reliability factor from the t distribution. For a sample of
size n, the t distribution will have n − 1 degrees of freedom, denoted t(n − 1).
Example
Suppose an investment analyst takes a random sample of US equity
mutual funds and calculates the average Sharpe ratio. The sample
size is 100, and the average Sharpe ratio is 0.45. The sample has a
standard deviation of 0.30.
Calculate and interpret the 90 percent confidence interval for the
population mean of all US equity mutual funds.
Recognizing that the population variance of the distribution of
Sharpe ratios is unknown, the analyst decides to calculate the
confidence interval using the theoretically correct t-statistic.
Example (solution)
The standard error of the sample mean is s/√n = 0.30/√100 = 0.03.
With n − 1 = 99 degrees of freedom, the reliability factor for a 90
percent interval is t_0.05 ≈ 1.66, so the interval is
0.45 ± 1.66 × 0.03 = 0.45 ± 0.0498, or roughly 0.40 to 0.50.
We are 90 percent confident that this interval contains the
population mean Sharpe ratio of all US equity mutual funds.
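A sketch that reproduces the calculation with scipy's t distribution (the inputs are taken from the example above):

    import numpy as np
    from scipy.stats import t

    n, xbar, s = 100, 0.45, 0.30              # figures from the example above
    alpha = 0.10                              # 90 percent confidence

    t_crit = t.ppf(1 - alpha / 2, df=n - 1)   # reliability factor t_0.05 with 99 degrees of freedom
    se = s / np.sqrt(n)                       # standard error of the sample mean

    lower, upper = xbar - t_crit * se, xbar + t_crit * se
    print(f"t reliability factor = {t_crit:.3f}")
    print(f"90% confidence interval = [{lower:.4f}, {upper:.4f}]")   # roughly 0.40 to 0.50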
Selection of Sample Size
The width of a confidence interval depends on the standard error of the
estimator, which shrinks in proportion to 1/√n.
Hence, the larger the sample size, the greater the precision with which we can
estimate the population parameter.
This might explain why we sometimes do not observe statistically significant
results: the sample may simply be too small.
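As an illustrative extension (the numbers and the 95 percent level are assumptions, not from the notes), the standard-error formula can be inverted to find the sample size needed for a target interval half-width:

    import numpy as np
    from scipy.stats import norm

    # Sample size needed so that a 95 percent confidence interval for the mean
    # has a half-width of at most E, given a planning value for the standard deviation s
    s, E = 0.30, 0.02
    z = norm.ppf(0.975)                        # 95 percent reliability factor, about 1.96

    n_required = int(np.ceil((z * s / E) ** 2))
    print("Required sample size:", n_required)   # (1.96 * 0.30 / 0.02)^2 ≈ 865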
4. Sampling Bias
Data-Mining Bias
Data-mining is the practice of determining a model by extensive
searching through a dataset for statistically significant patterns
(that is, repeatedly “drilling” in the same data until finding
something that appears to work).
Out-of-sample test
The usual safeguard against data-mining bias is an out-of-sample test:
evaluate the model on data other than the data used to develop it.
If we were to just report the significant variables, without also
reporting the total number of variables that we tested that were
unsuccessful as predictors, we would be presenting a very
misleading picture of our findings (e.g., in asset pricing tests).
Sample Selection Bias
When data availability leads to certain assets being excluded from
the analysis, we call the resulting problem sample selection bias.
Funds or companies that are no longer in business do not appear
in many commonly used databases. So, a study that uses these
types of databases suffers from a type of sample selection bias
known as survivorship bias.
Look-ahead Bias
A test design is subject to look-ahead bias if it uses information that
was not available on the test date.
For example, tests of trading rules that use stock market returns
and accounting balance sheet data must account for look-ahead
bias. In such tests, a company’s book value per share is commonly
used to construct the P/B variable. Although the market price of a
stock is available for all market participants at the same point in
time, fiscal year-end book equity per share might not become
publicly available until sometime in the following quarter.
Time-Period Bias
A test design is subject to time-period bias if it is based on a time
period that may make the results time-period specific.
Q&A