
Programming Exercises | Submission Deadline 31.03.2021 | Statistics for Data Science 20/21

(1) Introduction
1. Sample a univariate Gaussian using scipy.stats.

2. Evaluate the PDF of a univariate Gaussian using scipy.stats.

3. Visualize the PDF of a univariate Gaussian and a normalized histogram of samples from a univariate
Gaussian with identical parameters on top of each other using Matplotlib (a sketch covering all three items follows this list).
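
A minimal sketch covering all three items, assuming a standard normal (mean 0, standard deviation 1) as the example parameter setting:

```python
import numpy as np
import matplotlib.pyplot as plt
from scipy import stats

mu, sigma = 0.0, 1.0                                           # example parameters (assumed)
samples = stats.norm.rvs(loc=mu, scale=sigma, size=10_000)     # (1) sampling
x = np.linspace(-4, 4, 400)
pdf = stats.norm.pdf(x, loc=mu, scale=sigma)                   # (2) PDF evaluation

# (3) overlay: normalized sample histogram and analytical PDF
plt.hist(samples, bins=50, density=True, alpha=0.5, label="sample histogram")
plt.plot(x, pdf, label="N(0, 1) PDF")
plt.legend()
plt.show()
```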

(2) Probability spaces


1. (Dice experiment 1) Consider the probability space model of rolling a fair die. Let A = {2, 4, 6}
and B = {1, 2, 3, 4} be two events. Then P(A) = 1/2, P(B) = 2/3, and P(A ∩ B) = 1/3. Since
P(A ∩ B) = P(A)P(B), the events A and B are independent. Simulate draws from the outcome
space and verify that P̂(A ∩ B) ≈ P̂(A)P̂(B), where P̂(E) denotes the proportion of times an event
E occurs in the simulation (a sketch follows this list).

2. (Dice experiment 2) Consider the probability space model of rolling a fair die. Identify two events
A and B that are not independent. Analytically evaluate P(A), P(B), P(A ∩ B), P(A|B) and
P(B|A), and verify these values by means of simulation.

3. (Coin experiment) Consider the probability space model of tossing a fair coin twice, i.e. a uniform
probability measure on Ω = {HH, HT, TH, TT}, where H indicates heads and T indicates tails.
Simulate draws from this probability space and verify that the events "H appears on the first toss",
"H appears on the second toss", and "both tosses have the same outcome" each have probability
1/2.
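
A minimal sketch for the first dice experiment, assuming 100,000 simulated rolls:

```python
import numpy as np

rng = np.random.default_rng(0)
rolls = rng.integers(1, 7, size=100_000)     # draws from the outcome space {1, ..., 6}

A = np.isin(rolls, [2, 4, 6])
B = np.isin(rolls, [1, 2, 3, 4])

p_A = A.mean()                               # P̂(A), close to 1/2
p_B = B.mean()                               # P̂(B), close to 2/3
p_AB = (A & B).mean()                        # P̂(A ∩ B), close to 1/3

print(p_A, p_B, p_AB, p_A * p_B)             # P̂(A ∩ B) ≈ P̂(A) P̂(B)
```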

(3) Random variables


1. Simulate the probability space model of throwing two dice and the random variable corresponding
to the sum of the pips. Visualize a normalized histogram of simulated outcomes of this random
variable and compare it to the theoretical prediction (a sketch follows this list).

2. Visualize the PMF of a Bernoulli random variable and a normalized histogram of many samples of
a Bernoulli random variable with identical parameter setting on top of each other.

3. Visualize the PDF of a Gaussian random variable and a normalized histogram of many samples of
a Gaussian random variable with identical parameter settings on top of each other.
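
A minimal sketch for the two-dice item, assuming 100,000 simulated throws; the theoretical PMF of the sum s ∈ {2, ..., 12} is (6 − |s − 7|)/36:

```python
import numpy as np
import matplotlib.pyplot as plt

rng = np.random.default_rng(1)
sums = rng.integers(1, 7, size=(100_000, 2)).sum(axis=1)   # sum of the pips of two dice

support = np.arange(2, 13)
pmf = (6 - np.abs(support - 7)) / 36                        # theoretical PMF of the sum

plt.hist(sums, bins=np.arange(1.5, 13.5), density=True, alpha=0.5, label="simulation")
plt.plot(support, pmf, "o", label="theory")
plt.legend()
plt.show()
```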

(4) Joint distributions


1. Write a simulation that demonstrates that the marginal distributions of a bivariate Gaussian distribution
with expectation parameter and covariance parameter

   µ = (1, 2)ᵀ   and   Σ = [[0.3, 0.2], [0.2, 0.5]],          (1)

respectively, are given by univariate Gaussian distributions with expectation parameters µ1 = 1, µ2 = 2
and variance parameters σ1² = 0.3 and σ2² = 0.5, respectively (a sketch follows this list).

2. Write a simulation that verifies that obtaining samples from 2 independent univariate Gaussian
distributions with parameters µi, σi² > 0, i = 1, 2, is equivalent to obtaining samples from a two-
dimensional Gaussian distribution with the appropriately specified parameters µ ∈ R² and Σ ∈ R²ˣ².


3. Write a simulation that verifies, by way of example, the analytical results on conditional Gaussian distributions
for the case of a bivariate Gaussian distribution.
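
A minimal sketch for the first item, assuming 100,000 samples and comparing the sample moments of each coordinate with the stated univariate parameters:

```python
import numpy as np
from scipy import stats

mu = np.array([1.0, 2.0])
Sigma = np.array([[0.3, 0.2],
                  [0.2, 0.5]])

samples = stats.multivariate_normal.rvs(mean=mu, cov=Sigma, size=100_000)

# the marginal of each coordinate should match the univariate parameters
# (mu_1, sigma_1^2) = (1, 0.3) and (mu_2, sigma_2^2) = (2, 0.5)
print(samples.mean(axis=0))           # ≈ [1, 2]
print(samples.var(axis=0, ddof=1))    # ≈ [0.3, 0.5]
```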

(5) Transformations
1. Write a program that generates pseudo-random numbers from an exponential distribution using a
uniform pseudo-random number generator and the probability integral transform theorem (a sketch follows this list).

2. Let X ∼ N (0, 1) and let Y = exp(X). Evaluate the PDF of Y analytically and verify your
evaluation using a simulation based on drawing random numbers from N (0, 1).

3. Let X ∼ N (0, 1) and let Y = X². By simulation, validate that Y is distributed according to
a chi-squared distribution with one degree of freedom. Next, let X1, ..., X10 ∼ N (0, 1) and let
Y = X1² + ... + X10². By simulation, validate that Y is distributed according to a chi-squared distribution
with ten degrees of freedom.
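
A minimal sketch for the first transformation item, assuming an example rate parameter λ = 2; by the probability integral transform, if U ∼ Uniform(0, 1) then −ln(1 − U)/λ follows an exponential distribution with rate λ:

```python
import numpy as np
import matplotlib.pyplot as plt
from scipy import stats

rng = np.random.default_rng(2)
lam = 2.0                                  # example rate parameter (assumed)

u = rng.uniform(size=100_000)              # uniform pseudo-random numbers
x = -np.log(1.0 - u) / lam                 # inverse exponential CDF applied to U

grid = np.linspace(0, 4, 400)
plt.hist(x, bins=100, density=True, alpha=0.5, label="transformed uniforms")
plt.plot(grid, stats.expon.pdf(grid, scale=1 / lam), label="Exp(rate=2) PDF")
plt.legend()
plt.show()
```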

(6) Expectation and covariance


1. Sample n = 10 data points from a univariate Gaussian distribution and evaluate the sample mean,
sample variance, and sample standard deviation (a sketch follows this list).

2. Sample n = 10 data points from a bivariate Gaussian distribution and evaluate the sample covariance
and sample correlation.

3. Validate the theorem on the variances of sums and differences of random variables using a sampling
approach in a bivariate Gaussian scenario.
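
A minimal sketch for the first item, assuming N(1, 4) as the example distribution:

```python
import numpy as np
from scipy import stats

sample = stats.norm.rvs(loc=1.0, scale=2.0, size=10, random_state=3)  # n = 10 data points

sample_mean = sample.mean()
sample_var = sample.var(ddof=1)      # sample variance (divides by n - 1)
sample_std = sample.std(ddof=1)      # sample standard deviation

print(sample_mean, sample_var, sample_std)
```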

(7) Inequalities and limits


1. Write simulations that validate the Markov and Chebyshev inequalities.

2. Write a simulation that validates the Weak Law of Large Numbers (a sketch follows this list).

3. Write a simulation that validates the Lindeberg-Lévy Central Limit Theorem.

4. Write a simulation that validates the Lyapunov Central Limit Theorem.
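
A minimal sketch for the Weak Law of Large Numbers item, assuming an Exp(1) example distribution with expectation 1; the running sample mean should settle near 1 as the number of samples grows:

```python
import numpy as np
import matplotlib.pyplot as plt

rng = np.random.default_rng(4)
x = rng.exponential(scale=1.0, size=100_000)          # i.i.d. draws with E[X] = 1

running_mean = np.cumsum(x) / np.arange(1, x.size + 1)

plt.plot(running_mean, label="running sample mean")
plt.axhline(1.0, color="k", linestyle="--", label="E[X] = 1")
plt.xscale("log")
plt.legend()
plt.show()
```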

(8) Maximum likelihood estimation


1. Let X1, ..., Xn ∼ Bern(µ) be n = 20 i.i.d. Bernoulli random variables. Using an optimization
routine of your choice, formulate and implement the numerical maximum likelihood estimation of
µ for true, but unknown, values of µ = 0.7 and µ = 1 based on X1, ..., Xn (a sketch follows this list).

2. Let X1, ..., Xn ∼ Bern(µ). For a large number n, sample X1, ..., Xn and evaluate the maximum
likelihood estimator µ̂^ML. Repeat this m times and create a histogram of the realized estimates
µ̂^ML_1, ..., µ̂^ML_m.
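
A minimal sketch for the first item, assuming scipy.optimize.minimize_scalar as the optimization routine and a numerically bounded parameter range:

```python
import numpy as np
from scipy import optimize, stats

def bernoulli_mle(x):
    """Numerically maximize the Bernoulli log-likelihood over mu in (0, 1)."""
    eps = 1e-9
    def neg_log_lik(mu):
        return -np.sum(x * np.log(mu) + (1 - x) * np.log(1 - mu))
    res = optimize.minimize_scalar(neg_log_lik, bounds=(eps, 1 - eps), method="bounded")
    return res.x

for mu_true in (0.7, 1.0):
    x = stats.bernoulli.rvs(mu_true, size=20, random_state=5)   # n = 20 samples
    print(mu_true, bernoulli_mle(x))                            # numerical MLE ≈ sample mean
```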

(9) Finite estimator properties


1. For X1 , ..., Xn ∼ Bern(µ) implement a simulation which validates the unbiasedness of the sample
mean, the unbiasedness of the sample variance, the biasedness of the sample standard deviation,
and the biasedness of the maximum likelihood variance parameter estimator.


2. For X1, ..., Xn ∼ N (µ, σ²) implement a simulation which validates the unbiasedness of the sample
mean, the unbiasedness of the sample variance, the biasedness of the sample standard deviation,
and the biasedness of the maximum likelihood variance parameter estimator (a sketch follows this list).
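
A minimal sketch for the Gaussian case, assuming N(0, 4), n = 5, and 100,000 replications; the maximum likelihood variance estimator divides by n and should come out biased, while the sample variance (dividing by n − 1) should not:

```python
import numpy as np

rng = np.random.default_rng(6)
n, reps, sigma2 = 5, 100_000, 4.0

x = rng.normal(loc=0.0, scale=np.sqrt(sigma2), size=(reps, n))

sample_var = x.var(axis=1, ddof=1)     # divides by n - 1
ml_var = x.var(axis=1, ddof=0)         # maximum likelihood estimator, divides by n
sample_std = np.sqrt(sample_var)

print(sample_var.mean())               # ≈ 4.0 (unbiased)
print(ml_var.mean())                   # ≈ 4.0 * (n - 1)/n = 3.2 (biased)
print(sample_std.mean())               # < 2.0 (biased downward)
```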

(10) Asymptotic estimator properties


1. Write a simulation that verifies the asymptotic unbiasedness of the maximum likelihood estimator
for the variance parameter of a univariate Gaussian distribution. Include a verification of the
unbiasedness of the sample variance (a sketch follows this list).

2. Write a simulation that verifies the asymptotic efficiency of the maximum likelihood estimator for
the parameter of a Bernoulli distribution.

3. Write a simulation that verifies the asymptotic efficiency of the maximum likelihood estimator for
the variance parameter of a univariate Gaussian distribution.
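
A minimal sketch for the first item, assuming N(0, 4) and increasing sample sizes; the average maximum likelihood variance estimate should approach 4 as n grows, while the sample variance stays centered on 4 for every n:

```python
import numpy as np

rng = np.random.default_rng(7)
sigma2, reps = 4.0, 20_000

for n in (5, 20, 100, 1000):
    x = rng.normal(scale=np.sqrt(sigma2), size=(reps, n))
    # ML estimator mean -> 4 as n grows; sample variance mean ≈ 4 for every n
    print(n, x.var(axis=1, ddof=0).mean(), x.var(axis=1, ddof=1).mean())
```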

(11) Confidence intervals


1. Write a simulation that verifies that the T statistic is distributed according to a t-distribution with
n − 1 degrees of freedom.

2. Write a simulation that verifies that the 95%-confidence interval for the expectation parameter
of a Gaussian distribution with unknown variance contains the true, but unknown, expectation
parameter in ≈ 95% of its realizations (a sketch follows this list).

3. Write a simulation that verifies that the approximate 95%-confidence interval for the expectation
parameter of a Bernoulli distribution contains the true, but unknown, expectation parameter in
≈ 95% of its realizations.
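
A minimal sketch for the second item, assuming N(1, 4), n = 10, and 10,000 replications; the t-based interval x̄ ± t(0.975, n−1) · s/√n should contain µ = 1 in roughly 95% of the replications:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(8)
mu, sigma, n, reps = 1.0, 2.0, 10, 10_000

x = rng.normal(mu, sigma, size=(reps, n))
xbar = x.mean(axis=1)
s = x.std(axis=1, ddof=1)
t_crit = stats.t.ppf(0.975, df=n - 1)

half_width = t_crit * s / np.sqrt(n)
covered = (xbar - half_width <= mu) & (mu <= xbar + half_width)
print(covered.mean())                  # ≈ 0.95
```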

(12) Hypothesis testing


1. By means of simulation, show that a two-sided T test with simple null hypothesis Θ0 := {µ0} of
significance level α0 is exact (a sketch follows this list).

2. By means of simulation, demonstrate that the δ-confidence interval-based test for the expectation
parameter of a univariate Gaussian distribution is of significance level α0 = 1 − δ.
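
A minimal sketch for the first item, assuming N(µ0, 1) data with µ0 = 0, n = 12, and α0 = 0.05; under the null hypothesis the two-sided test should reject in roughly a fraction α0 of the replications:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(9)
mu0, n, reps, alpha0 = 0.0, 12, 20_000, 0.05

x = rng.normal(mu0, 1.0, size=(reps, n))
t_stat = (x.mean(axis=1) - mu0) / (x.std(axis=1, ddof=1) / np.sqrt(n))
t_crit = stats.t.ppf(1 - alpha0 / 2, df=n - 1)

print((np.abs(t_stat) > t_crit).mean())    # rejection rate ≈ alpha0 = 0.05 (exactness)
```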

(13) Conjugate inference


1. For n = 10, implement batch and recursive Bayesian estimation for the Beta-Binomial model.
Compare the results based on identical samples (a sketch follows this list).
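
A minimal sketch, assuming a Beta(1, 1) prior, a true parameter µ = 0.6, and n = 10 Bernoulli observations; the batch update and the one-observation-at-a-time recursive update should produce identical posterior parameters:

```python
import numpy as np
from scipy import stats

x = stats.bernoulli.rvs(0.6, size=10, random_state=10)   # n = 10 observations (assumed µ = 0.6)

a0, b0 = 1.0, 1.0                                         # Beta(1, 1) prior (assumed)

# batch update: incorporate all observations at once
a_batch, b_batch = a0 + x.sum(), b0 + (1 - x).sum()

# recursive update: incorporate one observation at a time
a_rec, b_rec = a0, b0
for xi in x:
    a_rec, b_rec = a_rec + xi, b_rec + (1 - xi)

print((a_batch, b_batch), (a_rec, b_rec))                 # identical posterior parameters
```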

(14) Numerical methods


1. Estimate the expected value of a Beta(α, β) distribution for varying values of α and β by means of Monte
Carlo integration, using a Beta distribution random number generator. Compare the results to
the true expected values (a sketch follows this list).

2. Estimate the expected value of a Beta(α, β) distribution for varying values of α and β by means of Monte
Carlo integration using an importance sampling scheme and a uniform random number generator.

3. Use an acceptance-rejection algorithm to sample random numbers from Beta(2, 6).
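
A minimal sketch for the first item, assuming a few example (α, β) pairs; the Monte Carlo estimate is the sample mean of draws from Beta(α, β), and the true expected value is α/(α + β):

```python
from scipy import stats

for alpha, beta in [(2, 6), (1, 1), (5, 2)]:              # example parameter pairs (assumed)
    draws = stats.beta.rvs(alpha, beta, size=100_000, random_state=11)
    mc_estimate = draws.mean()                             # Monte Carlo integration
    true_value = alpha / (alpha + beta)
    print(alpha, beta, mc_estimate, true_value)
```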
