Mstat Note7 Random Variable f23
Right continuous
Non-decreasing
Probability mass function (PMF)
• Corresponding CDF
Probability density function (PDF)
• PDF is not a probability!!
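A small numerical sketch of why a PDF value is not a probability: a density can exceed 1, while probabilities come from integrating the density over an interval. The Normal distribution with sd 0.1 is an illustrative choice, not taken from the slides.

```r
# Density of N(0, 0.1^2) at x = 0: larger than 1, so it cannot be a probability
dens <- dnorm(0, mean = 0, sd = 0.1)

# A probability is the integral of the density over an interval
prob <- integrate(dnorm, lower = -0.1, upper = 0.1, mean = 0, sd = 0.1)$value

dens   # density value at 0
prob   # P(-0.1 <= X <= 0.1), always between 0 and 1
```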
Bernoulli Distribution
Examples
• Coin Toss
• 0: T
• 1: H
• p: probability of H
• Disease Probability
• 0: Non disease
• 1: Disease
• p: probability of having the disease
Bernoulli Distribution
Examples
• Suppose there are 5 individuals, and each has
probability p=0.2 of having the disease
# R-code
N=5                # number of individuals
p=0.2              # common disease probability
rbinom(N, 1, p)    # N Bernoulli(p) draws (size = 1)
Bernoulli Distribution
Examples
• Suppose there are 5 individuals, and their
probabilities of having the disease all differ:
p1=0.1, p2=0.2, p3=0.3, p4=0.4, p5=0.5
# R-code
N=5                            # number of individuals
p=c(0.1, 0.2, 0.3, 0.4, 0.5)   # individual disease probabilities
rbinom(N, 1, p)                # i-th draw uses p[i]
Binomial Distribution
[Figure: PMF of Binomial(n=10, p=0.3) for x = 0, ..., 10]
Binomial Distribution
• Sum of n independent Bernoulli(p) random variables
follows Binomial(n, p)
• Disease Probability
• Suppose we sample 50 individuals at SNU
• x: number of individuals with the disease
• p: probability of having the disease
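The claim that a sum of n independent Bernoulli(p) variables follows Binomial(n, p) can be checked by simulation. This is a minimal sketch; n = 50 matches the sampling example, while p = 0.2, the seed, and the number of replications are assumptions chosen for illustration.

```r
set.seed(1)     # arbitrary seed, only for reproducibility
n <- 50         # individuals sampled
p <- 0.2        # assumed disease probability
reps <- 10000   # assumed number of simulated samples

# Each column holds n Bernoulli(p) draws; a column sum counts diseased individuals
bern <- matrix(rbinom(n * reps, 1, p), nrow = n)
x_sum <- colSums(bern)

# Direct draws from Binomial(n, p) for comparison
x_binom <- rbinom(reps, n, p)

# Both sample means should be close to n * p
mean(x_sum)
mean(x_binom)
```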
Binomial Distribution
[Figure: PMF of Binomial(n=1000, p=0.3); y-axis: Probability]
Large n: the binomial distribution has a bell shape
=> Close to the Normal distribution
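The bell shape for large n can be checked numerically by comparing the Binomial(1000, 0.3) PMF with a Normal density that has the same mean n p and standard deviation sqrt(n p (1 - p)). The range 250 to 350 is an assumption chosen to cover most of the probability mass.

```r
n <- 1000
p <- 0.3
x <- 250:350   # assumed range around the mean n * p = 300

pmf_binom <- dbinom(x, n, p)

# Normal density with matching mean and standard deviation
pdf_norm <- dnorm(x, mean = n * p, sd = sqrt(n * p * (1 - p)))

# Largest absolute gap between the two curves on this range
max_gap <- max(abs(pmf_binom - pdf_norm))
max_gap
```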
Geometric Distribution
[Figure: PMF of Geometric(p=0.3); y-axis: Probability]
Ex. Number of trials needed until the first head in repeated coin tosses
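One point to watch when simulating this in R: `rgeom` and `dgeom` count the number of failures before the first success, so the number of trials until the first head is that count plus one. A minimal sketch, with the seed and sample size as assumptions:

```r
set.seed(1)                   # arbitrary seed
p <- 0.3

failures <- rgeom(10000, p)   # failures before the first success
trials <- failures + 1        # trials needed until the first success

dgeom(0, p)                   # P(0 failures) = p
mean(trials)                  # should be close to 1 / p
```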
Poisson Distribution
[Figure: PMF of Poisson(lambda=1); y-axis: Probability]
Poisson Distribution
Siméon Denis Poisson
[Figure: PMF of Poisson(lambda=1) next to PMF of Binomial(n=1000, p=0.001); y-axis: Probability]
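The similarity between Poisson(lambda=1) and Binomial(n=1000, p=0.001) can be checked directly, since n p = 1 matches lambda. A short sketch comparing the two PMFs on x = 0, ..., 20:

```r
x <- 0:20

pmf_pois <- dpois(x, lambda = 1)
pmf_binom <- dbinom(x, size = 1000, prob = 0.001)

# Largest pointwise difference between the two PMFs
max_gap <- max(abs(pmf_pois - pmf_binom))

# First few values side by side
round(cbind(x, pmf_pois, pmf_binom)[1:5, ], 4)
max_gap
```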
Poisson Distribution
• DNA data
• The number of mutations in a region
• x: number of mutations
• 𝜆: average number of mutations
Well known Continuous RVs
Normal Distribution
[Figure: PDF of Normal(mu=0, sigma=1); x-axis: x, y-axis: Density]
Normal Distribution
• Ex. X ~ N(3, 5)
• Dist. of $(X - \mu)/\sigma$ ?
• Any measurement
• Noise (error) in the observation
• Linear regression is a good example
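For the example X ~ N(3, 5), standardizing X gives a standard Normal variable. The sketch below assumes that the 5 in N(3, 5) is the variance. R's `pnorm` and `rnorm` take the standard deviation, so sqrt(5) is passed. The cutoff 1, the seed, and the sample size are illustrative.

```r
mu <- 3
sigma <- sqrt(5)   # assumption: the 5 in N(3, 5) is the variance

# P(X <= 1) computed directly and through the standardized value
p_direct <- pnorm(1, mean = mu, sd = sigma)
p_standard <- pnorm((1 - mu) / sigma)

# Simulated standardized values should have mean near 0 and sd near 1
set.seed(1)
z <- (rnorm(10000, mean = mu, sd = sigma) - mu) / sigma
c(mean(z), sd(z))
```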
$\chi^2$ distribution
Exponential Distribution
[Figure: PDF of the Exponential distribution; x-axis: x, y-axis: Density]
Exponential Distribution
• CDF
$F(x) = 1 - \exp(-x/\beta)$
• Memorylessness
$P(X > t + s \mid X > t) = P(X > s)$
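Memorylessness follows from the CDF F(x) = 1 - exp(-x/β): the conditional probability is P(X > t + s) / P(X > t) = exp(-(t + s)/β) / exp(-t/β) = exp(-s/β). A numerical check, with β = 2, t = 1.5 and s = 0.7 as arbitrary illustrative values (R's `pexp` uses rate = 1/β):

```r
beta <- 2     # assumed scale parameter
t0 <- 1.5     # assumed time already waited
s0 <- 0.7     # assumed additional time

# Survival function P(X > x) for the Exponential with mean beta
surv <- function(x) pexp(x, rate = 1 / beta, lower.tail = FALSE)

lhs <- surv(t0 + s0) / surv(t0)   # P(X > t + s | X > t)
rhs <- surv(s0)                   # P(X > s)
c(lhs, rhs)
```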
Independent?
Example
Joint distribution
of X and Y?
Independence
• The following theorem is very useful for identifying
independence
Independence
Independent?
Conditional Distribution
Discrete
Continuous
Example
Marginal distribution of Y?
Multivariate Dist.
• For multivariate random variables, vector
notation is more convenient
• X= (X1,…, Xn)
• Corresponding PDF is f(X1,…, Xn)
• Independence of X1,…, Xn
• Can be confirmed using
• Or
IID sampling
• X ~ Multinomial (n, p)
Multinomial
• Each element Xj marginally follows Binomial(n, pj)
• Preference
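The marginal statement for the Multinomial can be illustrated by simulation: the first component of a Multinomial(n, p) draw should behave like Binomial(n, p1), with mean n p1 and variance n p1 (1 - p1). The values n = 20, p = (0.2, 0.3, 0.5), the seed, and the number of draws are assumptions for illustration.

```r
set.seed(1)             # arbitrary seed
n <- 20                 # assumed number of trials
p <- c(0.2, 0.3, 0.5)   # assumed category probabilities

# Each column is one Multinomial(n, p) draw (a 3 x 10000 matrix)
draws <- rmultinom(10000, size = n, prob = p)
x1 <- draws[1, ]

mean(x1)   # compare with n * p[1]
var(x1)    # compare with n * p[1] * (1 - p[1])
```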
Multivariate Normal
• Two parameters
• Mean: 𝜇=(𝜇1, …, 𝜇k)
• Variance (k×k matrix): Σ
• Variance should be symmetric and positive definite!!
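A sketch of how the two conditions on Σ can be checked in base R, followed by one way to draw from a multivariate Normal starting from IID N(0, 1) values. The matrix `Sigma` and vector `mu` are hypothetical, and the Cholesky factor is used here as a convenient matrix square root, which may differ from the construction the slides have in mind.

```r
# Hypothetical 2 x 2 variance matrix and mean vector
Sigma <- matrix(c(2.0, 0.5,
                  0.5, 1.0), nrow = 2, byrow = TRUE)
mu <- c(1, -1)

is_symmetric <- isSymmetric(Sigma)
is_pos_def <- all(eigen(Sigma, symmetric = TRUE)$values > 0)

# Cholesky factor R is upper triangular with t(R) %*% R equal to Sigma
set.seed(1)
R <- chol(Sigma)
Z <- matrix(rnorm(2 * 10000), nrow = 2)   # IID N(0, 1) entries
X <- mu + t(R) %*% Z                      # each column is one draw

cov(t(X))   # should be close to Sigma
```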
Multivariate Normal (Extra)
• If each Xj follows IID N(0, 1) (so Z value) and then
Multivariate Normal (Extra)
Distribution of Y?
Transformation of multivariate RV
• Transform of several random variables
• Max(X, Y), Min(X, Y), X+Y, X/Y
• Ex. Minimum waiting time.
• Let Z=r(X,Y)
Transformation of multivariate RV
• Suppose X1 and X2 are independent RVs, each following the
Exp(1) distribution. Let Y = Min(X1, X2).
Distribution of Y?
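One way to approach this: Y > y exactly when both X1 > y and X2 > y, so by independence P(Y > y) = exp(-y) exp(-y) = exp(-2y). This means Y is Exponential with rate 2 (mean 1/2). A simulation sketch to check this, with the seed, sample size, and cutoff 0.5 as assumptions:

```r
set.seed(1)                   # arbitrary seed
x1 <- rexp(10000, rate = 1)
x2 <- rexp(10000, rate = 1)
y <- pmin(x1, x2)             # elementwise minimum

mean(y)                       # should be close to 1/2

# Empirical and theoretical P(Y > 0.5)
emp <- mean(y > 0.5)
theo <- exp(-2 * 0.5)
c(emp, theo)
```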
Summary
• Random variable
• Maps the sample space to a real number (or vector)
• In data analysis we actually work with random variables
(not the sample space)
• Discrete Random Variables
• Bernoulli, Binomial, Poisson, etc
• Continuous Random Variables
• Normal, chi-squared, Exponential, etc
• Multivariate RV
• Independence, conditional dist.
• Change of variables