Inferential Statistical Analysis Using
Python

Contents [ hide ]
1 Inferential Statistics

1.1 Contents
1.2 1. Z scores, Z-Test

1.2.1 1.1 Z Value

1.2.1.1 Computing z-score using default values
1.2.1.2 Computing z-score along a specified axis using degrees of freedom
1.2.1.3 Computing z-score using nan_policy

1.2.2 1.2 Z-Test

1.2.3 Example
1.2.4 T-Test
1.2.5 1.3 Two Sided One-Sample T-test

1.2.6 1.4 Independent t-Test

1.2.7 Using scipy library
1.2.8 Example

1.2.9 Using statsmodels

1.2.10 1.5 Paired t-Test

1.3 2. F-Test
1.4 3. Correlation coefficients
1.4.1 Calculating correlation coefficient using a pandas dataframe

1.4.2 4. Chi-Square Test

1.4.2.1 Importing Library

1.4.2.2 With f_obs values

1.4.2.3 With f_exp and f_obs values
1.4.2.4 With f_obs as 2d

1.4.2.5 With axis as None

1.4.2.6 With axis as 1

1.4.2.7 With ddof specified

Inferential Statistics
Inferential statistics is used to draw inferences and make predictions about a population from a given sample of data. It uses probability to reach conclusions.
There are several methods for performing inferential statistics on data. In this blog we will discuss the Z-Score, Z-Test, T-Test, F-Test, Correlation Coefficients and the Chi-Square Test for analysing data and reaching a probable conclusion based on it.

When do we use Inferential Statistics?

Inferential statistics is mainly used for drawing conclusions about data. The data can be a sample or a set of features, and since we often work with large amounts of data when building a model, inferential statistics comes in handy.

Contents
1. Z Scores, Z-Test

1.1 Z Value

1.2 Z-Test

1.3 Two-Sided One-Sample T-test

1.4 Independent t-Test

1.5 Paired T-test

2. F-Test

3. Correlation Coefficients

4. Chi-Square Test

1. Z Scores, Z-Test

1.1 Z Value
The Z-Value (Z-Score) tells how many standard deviations a value (x) lies below or above the population mean. If the Z-value is positive, the value (x) is higher than the mean; if the Z-value is negative, the value is lower than the mean.

The Z-Score is calculated as follows:

z = (X – μ) / σ

where,
X : Single data value
μ : Mean value
σ : Standard Deviation

The z-score in Python can be calculated by using scipy.stats.zscore, as scipy.stats.zscore(a, axis=0, ddof=0, nan_policy='propagate')

where,

a : array_like
An array-like object containing the sample data.

axis : int or None, optional
Axis along which to compute the z-scores. Default is 0.

ddof : int, optional
Degrees of freedom correction in the standard deviation calculation.
Default value is 0 (zero).

nan_policy : {'propagate', 'raise', 'omit'}, optional
This field defines how to handle an input that contains nan.
The default value is 'propagate', which returns nan.
The value 'raise' throws an error.
The value 'omit' ignores nan values and performs the calculation.

Note: when the policy is 'omit', the nan values in the input
propagate to the output, but they do not affect the z-scores
computed for the non-nan values.

Ex: a = [0.8976, 0.9989, 0.5678, 0.1234, 0.7765, 1, 1.675, 1.456]

==> Mean (μ) = sum of all the elements / N, where N = total number of elements
mean (μ) = (0.8976+0.9989+0.5678+0.1234+0.7765+1+1.675+1.456)/8 = 0.9369

==> Standard deviation (σ) = sqrt(Σ(X-μ)²/N), where X = element

standard deviation (σ) = sqrt(((0.8976-0.9369)^2+(0.9989-0.9369)^2+(0.5678-0.9369)^2+(0.1234-0.9369)^2+(0.7765-0.9369)^2+(1-0.9369)^2+(1.675-0.9369)^2+(1.456-0.9369)^2)/8) = 0.4537

==> Z-score (z) = (X-μ)/σ

z = [(0.8976-0.9369)/0.4537, (0.9989-0.9369)/0.4537, (0.5678-0.9369)/0.4537, (0.1234-0.9369)/0.4537,
(0.7765-0.9369)/0.4537, (1-0.9369)/0.4537, (1.675-0.9369)/0.4537, (1.456-0.9369)/0.4537]

Result ==> z ≈ [-0.0866, 0.1366, -0.8134, -1.7927, -0.3535, 0.1391, 1.6265, 1.1439]

Computing z-score using default values

In [2]: import numpy as np
        import scipy.stats as stats

        a = np.array([0.8976,0.9989,0.5678,0.1234,0.7765,1,1.675,1.456])
        stats.zscore(a)

Out[2]: array([-0.08660476, 0.13662837, -0.81337952, -1.79269639, -0.35347081,
               0.13905242, 1.62653867, 1.14393202])

Computing z-score along a specified axis using degrees of freedom

In [4]: a = np.array([[0.1234,0.4567,0.7890,0.9876],
                      [0.6789,0.7890,0.9987,0.6657],
                      [0.2234,0.9987,0.3345,0.5567]])

        stats.zscore(a,axis=1,ddof=1)

Out[4]: array([[-1.22576827, -0.3486311 , 0.52587439, 1.04852498],
               [-0.67641081, 0.03847117, 1.40005837, -0.76211873],
               [-0.88942498, 1.37202025, -0.56536131, 0.08276604]])

Computing z-score using nan_policy

In [15]: a = np.array([[0.1234,np.nan,0.7890,0.9876],
                       [0.6789,0.7890,0.9987,0.6657],
                       [np.nan,0.9987,0.3345,np.nan]])

         stats.zscore(a,axis=1) # default nan_policy is 'propagate', which returns nan

Out[15]: array([[ nan, nan, nan, nan],
                [-0.78105192, 0.04442268, 1.61664815, -0.88001891],
                [ nan, nan, nan, nan]])

In [16]: a = np.array([[0.1234,np.nan,0.7890,0.9876],
                       [0.6789,0.7890,0.9987,0.6657],
                       [np.nan,0.9987,0.3345,np.nan]])

         # nan_policy='raise' throws an error
         stats.zscore(a,axis=1,nan_policy='raise')

---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-16-7d7bc298bb30> in <module>
      3 [np.nan,0.9987,0.3345,np.nan]])
      4
----> 5 stats.zscore(a,axis=1,nan_policy='raise') # nan_policy='raise', throws error

~\anaconda3\lib\site-packages\scipy\stats\stats.py in zscore(a, axis, ddof, nan_policy)
   2545         return np.empty(a.shape)
   2546
-> 2547     contains_nan, nan_policy = _contains_nan(a, nan_policy)
   2548
   2549     if contains_nan and nan_policy == 'omit':

~\anaconda3\lib\site-packages\scipy\stats\stats.py in _contains_nan(a, nan_policy)
    237
    238     if contains_nan and nan_policy == 'raise':
--> 239         raise ValueError("The input contains nan values")
    240
    241     return contains_nan, nan_policy

ValueError: The input contains nan values


In [17]: a = np.array([[0.1234,np.nan,0.7890,0.9876],
                       [0.6789,0.7890,0.9987,0.6657],
                       [np.nan,0.9987,0.3345,np.nan]])

         stats.zscore(a,axis=1,nan_policy='omit') # 'omit' computes z-scores ignoring nan values

Out[17]: array([[-1.37976297, nan, 0.4211984 , 0.95856458],
                [-0.78105192, 0.04442268, 1.61664815, -0.88001891],
                [ nan, 1. , -1. , nan]])

1.2 Z-Test
The Z-Test is used to test a population mean or proportion. A Z-test can be used to test a given mean when the sample is large, meaning the length of the data is more than 30, and when the population standard deviation (and hence the variance) is known. The two-sample version of the test checks whether two sample means are approximately equal or not.

To perform a z-test the samples should be taken at random from the population and the data should be normally distributed. If the sample size is larger than 30, the data is assumed to be normally distributed; if the sample size is less than 30, the t-test is considered instead.

We check whether the obtained value is approximately equal or not by considering two hypotheses:
Null Hypothesis (H0): the value is equal to the other value; this hypothesis is accepted if the test supports it.
Alternate Hypothesis (HA): the value is not equal to the other value; this hypothesis is accepted otherwise.

For the one-sample case, the z-test statistic is calculated using the formula

z = (x̄ – μ) / (σ / √n)

where x̄ is the sample mean, μ is the population mean under H0, σ is the population standard deviation and n is the sample size.

After performing the z-test, the obtained pvalue is compared with the alpha value, which is assumed to be 0.05 in the z-score table.

If the pvalue is less than the alpha value, the Null Hypothesis is rejected, which means the Alternate Hypothesis is considered; in other words, the means of the two samples are not equal.
If the pvalue is greater than the alpha value, the Null Hypothesis is accepted; in other words, the means of the two samples are equal.

In Machine Learning, we calculate the z-test by using the ztest method from statsmodels.stats.weightstats:

statsmodels.stats.weightstats.ztest(x1, x2=None, value=0,
alternative='two-sided', usevar='pooled', ddof=1.0)

where,
x1,x2 are arrays

value : float
In the one sample case, value is the mean of x1 under the Null
hypothesis. In the two sample case, value is the difference between
mean of x1 and mean of x2 under the Null hypothesis. The test statistic
is x1_mean - x2_mean - value.

alternative : str
The alternative hypothesis, H1, has to be one of the following
‘two-sided’: H1: difference in means not equal to value
(default)
‘larger’ : H1: difference in means larger than value
‘smaller’ : H1: difference in means smaller than value

usevar : str, ‘pooled’


Currently, only ‘pooled’ is implemented. If pooled, then the
standard deviation of the samples is assumed to be the same. see
CompareMeans.ztest_ind for different options.

ddof : int
Degrees of freedom used in the calculation of the variance of the
mean estimate. In the case of comparing means this is one; however, it
can be adjusted for testing other statistics (proportion, correlation).

Returns,
tstat : float
test statistic

pvalue : float
pvalue of the z-test

Example
In [27]: import numpy as np
         import pandas as pd
         from numpy.random import randn
         from statsmodels.stats.weightstats import ztest

         x1 = [20, 30, 40, 50, 10, 20]

         z = ztest(x1, value=25) # where value is the hypothesised mean
         z

Out[27]: (0.5547001962252289, 0.5790997419539189)

The first value in the result is the test statistic and the second is the pvalue. From the above output, the pvalue of the taken data is about 0.58, which is greater than the alpha value of 0.05. Hence the Null Hypothesis is accepted, which means that the mean of the given data and the assumed mean are approximately equal.

In [42]: x1 = [20, 30, 40, 50, 10, 20]
         x2 = [11, 12, 13, 14, 15, 16]

         z = ztest(x1, x2, value=0, alternative='larger')
         z

Out[42]: (2.448717008689441, 0.007168301924196878)

From the above output, the Null Hypothesis is rejected and the Alternate Hypothesis is accepted, as the pvalue (about 0.007) is less than the alpha value.

T-Test
The T-test, also known as Student's T-test, is used to determine the difference between two groups of variables by comparing their mean values. The T-test not only determines the difference but also the significance of that difference. In other words, this test tells us whether the differences between the variable groups occurred by chance or are relevant to the data taken.

The 3 types of T-test are

1. Independent T-test
2. Paired Sample T-test
3. One-Sample T-test

One-Sample T-test: a t-test where one group's mean is compared with a single significant value, which is the population mean.

Types of One-Sample T-test are

1. One-tailed One-Sample T-test
2. Two-tailed One-Sample T-test
3. Upper-tailed One-Sample T-test
4. Lower-tailed One-Sample T-test

1.3 Two Sided One-Sample T-test

A Two-Sided (two-tailed) One-Sample T-test checks whether the sample mean differs from a hypothesised population mean in either direction; a sketch follows below.
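A minimal sketch using scipy.stats.ttest_1samp: the sample reuses x1 from the z-test example above, and the hypothesised mean of 25 is assumed for illustration.

In [ ]: import numpy as np
        from scipy import stats

        a = np.array([20, 30, 40, 50, 10, 20])

        # H0: the population mean is 25; HA: it is not (two-sided)
        stats.ttest_1samp(a, popmean=25)
        # statistic ≈ 0.555; the two-sided pvalue (≈ 0.6) is greater than
        # alpha (0.05), so the Null Hypothesis is accepted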

1.4 Independent t-Test

The Independent T-test, also known as the Two-Sample T-test, is used to test whether the means of 2 groups are equal or not. By default, the Independent T-test assumes that the two populations have equal variances.

In Machine Learning, we can perform this test

- Using the scipy library
- Using statsmodels

Using scipy library

scipy.stats.ttest_ind(a, b, axis=0, equal_var=True)

where,
a, b are the arrays of the 2 groups

axis : int or None, optional
Axis along which to compute the test. If None, compute over the whole arrays, a and b.

equal_var : bool, optional
If True (default), perform a standard independent 2-sample test that assumes equal population variances. If False, perform Welch's t-test, which does not assume equal population variance.

Returns,
statistic : float or array
The calculated t-statistic.

pvalue : float or array
The p-value.

Example
For example, we are given 2 different groups of data: Bag A has a bunch of apples and Bag B has a bunch of mangoes. We need to check whether both bags have the same means.

For this we assume 2 hypotheses:
H0 -> The means of the two bags are equal
HA -> The means of the two bags are not equal

To find which hypothesis holds, we compare the alpha value, assumed to be 0.05, with the pvalue obtained after performing the t-test. If the pvalue is less than the alpha value, HA is considered true; if the pvalue is greater than the alpha value, H0 is considered true.

Let's test the above example with a t-test using the scipy library.

In [111]: import scipy.stats as stats

          a = np.array([5,6,7,8,2,3,4,5])
          b = np.array([12,13,14,15,16,2,3,4])

          stats.ttest_ind(a, b, equal_var=True) # assuming the 2 groups have equal variances

Out[111]: Ttest_indResult(statistic=-2.2331335038240865, pvalue=0.042379219768910015)

In [112]: stats.ttest_ind(a, b, equal_var=False) # assuming the 2 groups have unequal variances

Out[112]: Ttest_indResult(statistic=-2.2331335038240865, pvalue=0.05369587840008499)

Interpreting the results with an alpha value of 0.05: under the equal-variance assumption the pvalue (0.042) is less than alpha, so the means of the two groups are not considered equal; under the unequal-variance assumption (Welch's t-test) the pvalue (0.054) is slightly above alpha, so at that threshold the Null Hypothesis would not be rejected.
Using statsmodels
statsmodels.stats.weightstats.ttest_ind(x1, x2)

where,
x1 and x2 are two array groups

Returns:
tstat : float
test statistic

pvalue : float
pvalue of the t-test

df : int or float
degrees of freedom used in the t-test

In [115]: from statsmodels.stats.weightstats import ttest_ind

          a = np.array([12,14,16,4,5,11,12,11])
          b = np.array([12,13,14,15,16,2,3,4])

          ttest_ind(a,b)

Out[115]: (0.2963188789948769, 0.7713367820262194, 14.0)

Interpreting the result from the above test, with an assumed alpha value of 0.05, the resulting pvalue is 0.77, which is greater than alpha. Hence the assumption H0 holds, i.e., the means of the 2 groups are equal.

1.5 Paired t-Test

A Paired t-Test explains the difference between two variables for the same subject. It compares one set of measurements with a second set from the same sample. This test is also known as the Dependent Sample T-test.

In simple words, this T-test measures the difference between the means of two related sets of measurements. Like the other tests, it assumes two hypotheses:
Null Hypothesis (H0): the difference between the means of the two sets is zero.
Alternate Hypothesis (HA): the difference between the means of the two sets is not zero.

In Machine Learning, the Paired T-test can be calculated by using the ttest_rel() method defined in the scipy.stats library:

scipy.stats.ttest_rel(a, b, axis=0, nan_policy='propagate',
alternative='two-sided')

where,

a, b : array_like

axis : int or None, optional


Axis along which to compute test. If None, compute over the
whole arrays, a, and b.

nan_policy : {‘propagate’, ‘raise’, ‘omit’}, optional


Defines how to handle when input contains nan. The following
options are available (default is ‘propagate’):
‘propagate’: returns nan
‘raise’: throws an error
‘omit’: performs the calculations ignoring nan values

alternative : {‘two-sided’, ‘less’, ‘greater’}, optional


Defines the alternative hypothesis. The following options
are available (default is ‘two-sided’):
‘two-sided’: the means of the distributions underlying the
samples are unequal.
‘less’: the mean of the distribution underlying the first
sample is less than the mean of the distribution underlying the second
sample.
‘greater’: the mean of the distribution underlying the first
sample is greater than the mean of the distribution underlying the
second sample.

Returns
statistic : float or array
t-statistic.

pvalue : float or array


The p-value.

In [3]: import scipy.stats as stats

        a = np.array([12, 14, 16, 4, 5, 11, 12, 11])
        b = np.array([12, 13, 14, 15, 16, 2, 3, 4])

        stats.ttest_rel(a,b)

Out[3]: Ttest_relResult(statistic=0.26355219111613715, pvalue=0.7997147761519707)

From the above result, the obtained pvalue (0.80) is greater than the assumed alpha value of 0.05. Hence we accept the Null Hypothesis H0, saying that the difference between the means of the two groups is zero.

2. F-Test
The F-Test can be applied to test for a significant difference between the variances of two populations, based on small samples drawn from those populations. The test based on this statistic is known as the F-Test.

Simply put, the F-test compares the variances of 2 groups by dividing them. The result of the F-test is always positive, because variances are always positive. If the two sample variances are s1 and s2, the statistic is F = s1^2 / s2^2.

The hypotheses for the F-test are defined as:
Null Hypothesis (H0): the variances of the two variables are equal and there is no significant difference.
Alternate Hypothesis (HA): the variances of the two variables are not equal.
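As a minimal sketch, this variance-ratio test can be computed directly with numpy and the scipy.stats.f distribution; the arrays a and b reuse the samples from the t-test examples above.

In [ ]: import numpy as np
        from scipy import stats

        a = np.array([12, 14, 16, 4, 5, 11, 12, 11])
        b = np.array([12, 13, 14, 15, 16, 2, 3, 4])

        s1, s2 = np.var(a, ddof=1), np.var(b, ddof=1)

        # Put the larger variance in the numerator so that F >= 1
        f_stat = max(s1, s2) / min(s1, s2)
        df1 = (len(a) if s1 >= s2 else len(b)) - 1  # numerator dof
        df2 = (len(b) if s1 >= s2 else len(a)) - 1  # denominator dof

        # Two-tailed p-value from the F distribution's survival function
        p_value = 2 * stats.f.sf(f_stat, df1, df2)
        print(f_stat, p_value)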

The F-Statistic, also known as the F-Value, is used in Analysis of Variance (ANOVA) and in regression models to find the significance between the means of populations by comparing variances. The F-Statistic is used in the F-test. The F-Test is similar to the T-test, except that in the F-test we check for significance among a group of variables, whereas in the T-test we check for significance between 2 variables. The F-test is used to check for similarity among the means of different variables.

For an F-test to be conducted we need to assume

- that the data taken is normally distributed
- that the numerator variance is the larger one and the denominator variance the smaller one

There are many statistics in which the F-Statistic is used, but the most widely used F-test is the Analysis Of Variance (ANOVA).

ANOVA: Analysis Of Variance, called ANOVA, is used to test the differences between the means of two or more groups. ANOVA uses the F-Statistic to calculate the difference between the means of 2 or more groups.

The hypotheses here are taken as:

Null Hypothesis (H0): the group means are equal.
Alternate Hypothesis (H1): the group means are not equal.

There are different types of ANOVA, such as:

- One-way ANOVA
- Two-way ANOVA
- Factorial ANOVA
- Repeated Measures ANOVA
- MANOVA, etc.

The most used ANOVA is One-way ANOVA, which compares group means against one independent variable to check whether the groups are alike or not.

This test is performed by using the method f_oneway from scipy.stats:

scipy.stats.f_oneway(*samples, axis=0)

where,
samples can be any number of groups or array-like variables
axis defines the axis along which the test is performed; it is optional and set to 0 (zero) by default

returns,
statistic : float
The computed F statistic of the test.

pvalue : float
The associated p-value from the F distribution.

As before, if the pvalue is less than the alpha value (0.05) the Null Hypothesis is rejected, and if the pvalue is higher than the alpha value the Null Hypothesis is accepted.

In [ ]: # Importing Libraries

In [2]: import numpy as np
        import pandas as pd
        import scipy.stats as stats

In [10]: # Creating a dataset

In [12]: cities = ["punjab","delhi","hyderabad","bangalore","mumbai"]

In [15]: # np.random.choice draws random values from the given list with the
         # given probabilities; the p list and size are truncated in the
         # original, so the values below are assumptions (p must sum to 1,
         # and size=1000 matches the outputs that follow)
         people_of_spec_city = np.random.choice(a=cities, size=1000,
                                                p=[0.05, 0.15, 0.25, 0.05, 0.5])

         people_of_spec_city
Out[15]: array(['mumbai', 'hyderabad', 'mumbai', 'hyderabad', 'mumbai', 'mumbai',
                'delhi', 'hyderabad', 'mumbai', 'mumbai', 'mumbai', 'hyderabad',
                ...,
                'mumbai', 'mumbai', 'mumbai', 'delhi', 'hyderabad', 'delhi',
                'mumbai'], dtype='<U9')

In [16]: # stats.poisson.rvs generates random numbers from a Poisson
         # distribution, where loc and mu are the distribution parameters and
         # size defines the number of values (size=1000 assumed; the original
         # line is truncated)
         population_of_spec_city = stats.poisson.rvs(loc=18, mu=30, size=1000)

         population_of_spec_city
Out[16]: array([61, 43, 54, 46, 55, 52, 43, 48, 42, 52, 55, 50, 39, 50, 54, 41, 42,
                51, 48, 58, 50, 49, 43, 48, 44, 49, 48, 49, 44, 47, 48, 58, 51, 39,
                ...,
                49, 48, 54, 45, 60, 43, 46, 57, 54, 48, 45, 49, 56, 44],
               dtype=int64)

In [19]: # Forming the DataFrame from the obtained values
         population_frame = pd.DataFrame({"city": people_of_spec_city,
                                          "population": population_of_spec_city})

         # Dividing these values into groups by the categorical variable
         groups = population_frame.groupby("city").groups

         groups

Out[19]: {'bangalore': [12, 33, 45, 75, 96, 103, 105, 134, 147, 148, ...],
          'delhi': [6, 14, 29, 32, 57, 58, 59, 61, 65, 93, ...],
          'hyderabad': [1, 3, 7, 11, 13, 15, 25, 31, 34, 36, ...],
          'mumbai': [0, 2, 4, 5, 8, 9, 10, 16, 17, 18, ...],
          'punjab': [24, 30, 88, 128, 145, 146, 162, 175, 181, 243, ...]}
In [20]: # Extract individual groups into respective variables
         punjab = population_of_spec_city[groups["punjab"]]
         bangalore = population_of_spec_city[groups["bangalore"]]
         delhi = population_of_spec_city[groups["delhi"]]
         hyderabad = population_of_spec_city[groups["hyderabad"]]
         mumbai = population_of_spec_city[groups["mumbai"]]

In [21]: # Now calculate the one-way ANOVA test for the obtained individual groups
         stats.f_oneway(punjab, bangalore, delhi, hyderabad, mumbai)

Out[21]: F_onewayResult(statistic=0.9110431706569894, pvalue=0.45674036540270235)

From the above result, the pvalue, 0.46, is greater than the alpha value (0.05). Hence we can say that there is no significant difference among the means of the different groups; they are approximately equal, and the Null Hypothesis is accepted.

3. Correlation coefficients
The correlation coefficient is a statistical measure that shows the degree to which changes in the value of one variable predict changes in the value of another. The letter r is used to represent the correlation coefficient, and r is a unit-free value between -1 and 1.

A correlation coefficient measures the relationship within the data; the strength of that relationship is obtained using the correlation coefficient formulas. The value lies between -1 and 1, where -1 represents a strong negative relationship, 1 represents a strong positive relationship and 0 (zero) represents no relationship.

Let's suppose there are two variables x and y for which the correlation coefficient needs to be found.

If the value of y goes up when the value of x goes up, x is directly proportional to y, and the correlation coefficient between x and y comes out positive (towards 1).

If the value of y goes down whenever the value of x goes up, or vice versa, x is inversely proportional to y, and the correlation coefficient between x and y comes out negative (towards -1).

If one variable changes while there is no change in the other variable (in our case x), then the correlation coefficient between these 2 variables comes out to be 0 (zero).

For example,

Positive correlation: if the quantity of milk increases, the price also increases. Negative correlation: if the price of a stock goes down, the buying of that stock increases. Zero correlation: there is no relationship between scores in video games and grades in an examination.

Before we dig into how the correlation among 2 or more variables is calculated, it is necessary to understand a term called covariance.

So, what is covariance? Covariance is a term used to describe the linear relationship between 2 variables. If the covariance is positive, the variables have a positive linear relationship, i.e., both variables change in the same direction. If the covariance is negative, the variables tend to move in opposite directions.
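As a quick illustration, covariance can be computed with numpy; the x and y arrays here are made up:

In [ ]: import numpy as np

        x = np.array([1, 2, 3, 4, 5])
        y = np.array([2, 4, 6, 8, 10])  # moves in the same direction as x

        # np.cov returns the 2x2 covariance matrix; entry [0, 1] is cov(x, y)
        print(np.cov(x, y)[0, 1])       # positive, so x and y move together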

There are many types of correlation coefficients, but here we will discuss some of the important ones that are widely used.

1. Pearson's r
Pearson's r, also known as Pearson's product-moment correlation coefficient, is used to describe the strength of the linear relationship between 2 variables.
This correlation coefficient is used when the data follows a normal distribution, has no outliers, is not skewed and a linear relationship between the 2 variables is expected.
Pearson's r formula is as follows:

r = Σ((x - x̄)(y - ȳ)) / sqrt(Σ(x - x̄)² * Σ(y - ȳ)²)

The strength of the correlation is considered as
- weak positive correlation for 0 < r < 0.3, weak negative correlation for -0.3 < r < 0
- strong positive correlation for 0.5 < r < 1, strong negative correlation for -1 < r < -0.5
- no correlation for r = 0

2. Spearman's rho

Spearman's rho, also known as Spearman's rank correlation coefficient, is used as an alternative to Pearson's correlation coefficient. It is a rank correlation coefficient, as it uses the rankings of each variable (say, lowest to highest) to determine the strength of the relationship. Unlike Pearson's r, Spearman's rho can capture monotonic relationships, which may be non-linear. Spearman's rho formula is as follows:

ρ = 1 - (6 Σ d²) / (n (n² - 1))

where d is the difference between the ranks of each pair and n is the number of pairs.

3. Kendall's tau

Kendall's tau is used to calculate the correlation coefficient when there are 2 variables, which may be continuous variables with outliers or ordinals, but exhibit a monotonic relationship. Spearman's rho and Kendall's tau are very similar, but Kendall's tau is often preferred for better results.

These 3 correlation coefficients can be calculated in Machine Learning by using a function in pandas (a per-pair scipy sketch follows below):

DataFrame.corr(method='pearson', min_periods=1)

Parameters:
method : {'pearson', 'spearman', 'kendall'}
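For a single pair of variables, the same three coefficients can also be obtained from scipy; a minimal sketch, with made-up x and y arrays:

In [ ]: import numpy as np
        from scipy import stats

        x = np.array([1, 2, 3, 4, 5, 6, 7, 8])
        y = np.array([2, 1, 4, 3, 7, 8, 6, 9])

        print(stats.pearsonr(x, y))    # (r, p-value)
        print(stats.spearmanr(x, y))   # (rho, p-value)
        print(stats.kendalltau(x, y))  # (tau, p-value)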

Calculating correlation coefficient using pandas dataframe


In [2]: import pandas as pd
import numpy as np

df = pd.read_csv('C:/Users/leena.ganta/Desktop/DataVedas/happyscore_
df.head()
Out[2]:
COUNTRY     ADJUSTED_SATISFACTION  AVG_SATISFACTION  STD_SATISFACTION  AVG_INCOME  MEDIAN_INCOME  INCOME_INEQUALITY  REGION                         HAPPYSCORE  GDP      COUNTRY.1
ARMENIA     37                     4.9               2.42              2096.76     1731.506667    31.445556          'Central and Eastern Europe'   4.350       0.76821  Armenia
ANGOLA      26                     4.3               3.19              1448.88     1044.240000    42.720000          'Sub-Saharan Africa'           4.033       0.75778  Angola
ARGENTINA   60                     7.1               1.91              7101.12     5109.400000    45.475556          'Latin America and Caribbean'  6.574       1.05351  Argentina
AUSTRIA     59                     7.2               2.11              19457.04    16879.620000   30.296250          'Western Europe'               7.200       1.33723  Austria
AUSTRALIA   65                     7.6               1.80              19917.00    15846.060000   35.285000          'Australia and New Zealand'    7.284       1.33358  Australia

In [55]: df.corr(method='pearson') # df.corr() gives the same result, as 'pearson' is the default method

Out[55]:
                       ADJUSTED_SATISFACTION  AVG_SATISFACTION  STD_SATISFACTION  AVG_INCOME  MEDIAN_INCOME  INCOME_INEQUALITY  HAPPYSCORE       GDP
ADJUSTED_SATISFACTION               1.000000          0.978067         -0.527553    0.728006       0.704383          -0.123835    0.901213  0.755578
AVG_SATISFACTION                    0.978067          1.000000         -0.341201    0.689043       0.661883          -0.082471    0.885988  0.776679
STD_SATISFACTION                   -0.527553         -0.341201          1.000000   -0.478206      -0.481429           0.221831   -0.457896 -0.242038
AVG_INCOME                          0.728006          0.689043         -0.478206    1.000000       0.995605          -0.382587    0.782122  0.814024
MEDIAN_INCOME                       0.704383          0.661883         -0.481429    0.995605       1.000000          -0.449053    0.760328  0.797905
INCOME_INEQUALITY                  -0.123835         -0.082471          0.221831   -0.382587      -0.449053           1.000000   -0.187222 -0.303204
HAPPYSCORE                          0.901213          0.885988         -0.457896    0.782122       0.760328          -0.187222    1.000000  0.790061
GDP                                 0.755578          0.776679         -0.242038    0.814024       0.797905          -0.303204    0.790061  1.000000

In [61]: df.corr(method='spearman') # correlation coefficient using the spearman method

Out[61]:
                       ADJUSTED_SATISFACTION  AVG_SATISFACTION  STD_SATISFACTION  AVG_INCOME  MEDIAN_INCOME  INCOME_INEQUALITY  HAPPYSCORE       GDP
ADJUSTED_SATISFACTION               1.000000          0.981629         -0.497192    0.803010       0.779671          -0.168049    0.900697  0.766098
AVG_SATISFACTION                    0.981629          1.000000         -0.354810    0.808310       0.782479          -0.137139    0.893395  0.773521
STD_SATISFACTION                   -0.497192         -0.354810          1.000000   -0.317653      -0.309697           0.182610   -0.421175 -0.275832
AVG_INCOME                          0.803010          0.808310         -0.317653    1.000000       0.990839          -0.356069    0.819542  0.960969
MEDIAN_INCOME                       0.779671          0.782479         -0.309697    0.990839       1.000000          -0.448926    0.806704  0.961583
INCOME_INEQUALITY                  -0.168049         -0.137139          0.182610   -0.356069      -0.448926           1.000000   -0.242107 -0.409767
HAPPYSCORE                          0.900697          0.893395         -0.421175    0.819542       0.806704          -0.242107    1.000000  0.793673
GDP                                 0.766098          0.773521         -0.275832    0.960969       0.961583          -0.409767    0.793673  1.000000

In [58]: df.corr(method='kendall') # correlation coefficient using the kendall method

Out[58]:
                       ADJUSTED_SATISFACTION  AVG_SATISFACTION  STD_SATISFACTION  AVG_INCOME  MEDIAN_INCOME  INCOME_INEQUALITY  HAPPYSCORE       GDP
ADJUSTED_SATISFACTION               1.000000          0.905145         -0.378239    0.614896       0.593379          -0.124810    0.741682  0.581131
AVG_SATISFACTION                    0.905145          1.000000         -0.266347    0.618270       0.593810          -0.104128    0.732966  0.591166
STD_SATISFACTION                   -0.378239         -0.266347          1.000000   -0.237797      -0.233515           0.124672   -0.320795 -0.205190
AVG_INCOME                          0.614896          0.618270         -0.237797    1.000000       0.929566          -0.229994    0.622277  0.841441
MEDIAN_INCOME                       0.593379          0.593810         -0.233515    0.929566       1.000000          -0.299779    0.614087  0.847011
INCOME_INEQUALITY                  -0.124810         -0.104128          0.124672   -0.229994      -0.299779           1.000000   -0.166762 -0.264067
HAPPYSCORE                          0.741682          0.732966         -0.320795    0.622277       0.614087          -0.166762    1.000000  0.601310
GDP                                 0.581131          0.591166         -0.205190    0.841441       0.847011          -0.264067    0.601310  1.000000

4. Chi-Square Test
The Chi-Square Test is a non-parametric test which tests the significance of the difference between observed frequencies and theoretical frequencies of a distribution, without any assumption about the distribution of the population. In simple words, the chi-square test is used to determine the difference between the expected data and the observed data.

The chi-square test is also used to determine whether a built regression model is a good fit or not by assessing train and test datasets. This test is used on categorical variables.

The two Chi-Square tests that are mostly used are:

Independence: as the name suggests, it tests the dependence of two variable sets.
Goodness of Fit: this chi-square test depicts whether the taken sample of data is a representative sample that fits the expected outcome from the taken population of data.

The formula for the chi-square test is as follows:

χ² = Σ (Oi - Ei)² / Ei

where Oi is the observed frequency and Ei is the expected frequency in each category.

In the Chi-Square Test we assume two hypotheses:

Null Hypothesis (H0): the 2 variables are independent.

Alternate Hypothesis (HA): the 2 variables are not independent.

We perform the chi-square test and arrive at one reasonable hypothesis as the solution. This is done by comparing the pvalue obtained from the test (via the chi-square table) with the alpha value, i.e., 0.05.

If the pvalue is less than alpha we accept the alternate hypothesis (HA). If the pvalue is greater than alpha then we accept the Null Hypothesis (H0).

The expected value is calculated as Ei = (Row Total * Column Total) / Total number of observations; the sketch below uses exactly this rule.
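For the independence test on a contingency table, scipy provides chi2_contingency, which computes these expected values automatically; a minimal sketch with made-up observed counts:

In [ ]: import numpy as np
        from scipy.stats import chi2_contingency

        # Hypothetical 2x3 table of observed counts
        observed = np.array([[16, 12, 16],
                             [24, 12, 32]])

        chi2, p, dof, expected = chi2_contingency(observed)
        print(chi2, p, dof)
        print(expected)  # Ei = (row total * column total) / grand total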


In Machine Learning, to perform the chi-squared test we use a method
named chisquare, which is imported from scipy.stats.

scipy.stats.chisquare(f_obs, f_exp=None, ddof=0, axis=0)

where,
f_obs - an array with the observed frequencies in each category
f_exp - an array, optional; the expected frequencies in each category. If no array is given, the categories are assumed to be equally likely.
ddof - int, optional; stands for delta degrees of freedom, an adjustment to the degrees of freedom used for obtaining the p-value. The p-value is determined using a chi-squared distribution with k - 1 - ddof degrees of freedom, where k is the number of observed frequencies. By default the ddof value is 0.
axis - int or None, optional; the axis along which to apply the test. If axis is None, all the f_obs values are treated as a single data set. The default value is 0.

Returns,
chisq - float or ndarray; the value is a float if axis is None or if f_obs and f_exp are 1-dimensional
p-value - float or ndarray; the value is a float if ddof and the return value chisq are scalars

Importing Library

In [37]: from scipy.stats import chisquare

With f_obs values

In [92]: f_obs=[16, 12, 16, 18, 14, 12]

         chisq = chisquare(f_obs).statistic
         pvalue = chisquare(f_obs).pvalue
         print("chisquare statistic :", chisq)
         print("p_value :", pvalue)

         chisquare statistic : 2.0
         p_value : 0.8491450360846096

In [ ]: # Since the obtained pvalue (0.85) > alpha (0.05), we accept the Null Hypothesis (H0)

With f_exp and f_obs values

In [100]: f_obs=[16, 12, 16, 18, 14, 12]
          f_exp=[16, 8, 16, 16, 16, 16]
          chisquare(f_obs, f_exp)

Out[100]: Power_divergenceResult(statistic=3.5, pvalue=0.6233876277495822)

With f_obs as 2d

In [101]: f_obs=[[16, 12, 16, 18, 14, 12],[24, 12, 32, 16, 32, 12]] # the test is applied to each column by default
          chisquare(f_obs)

Out[101]: Power_divergenceResult(statistic=array([1.6, 0., 5.33333333, 0.11764706, 7.04347826, 0.]),
          pvalue=array([0.20590321, 1., 0.02092134, 0.73160059, 0.00795544, 1.]))

In [ ]: # Some of the obtained pvalues are less than the alpha value, so for those columns the Null Hypothesis is rejected

With axis as None

In [102]: f_obs=[[16, 12, 16, 18, 14, 12],[24, 12, 32, 16, 32, 12]] # with axis=None, the test is applied to the flattened data
          chisquare(f_obs, axis=None)

Out[102]: Power_divergenceResult(statistic=33.33333333333333, pvalue=0.0004645423926184954)

With axis as 1

In [107]: f_obs=[16, 12, 16, 18, 14, 12]
          f_exp=[[16, 12, 16, 18, 14, 12],[16, 8, 16, 16, 16, 16]]
          chisquare(f_obs, f_exp, axis=1)

Out[107]: Power_divergenceResult(statistic=array([0. , 3.5]), pvalue=array([1. , 0.62338763]))

With ddof specified

In [103]: f_obs=[16, 12, 16, 18, 14, 12]
          chisquare(f_obs, ddof=1)

Out[103]: Power_divergenceResult(statistic=2.0, pvalue=0.7357588823428847)

In [104]: chisquare(f_obs, ddof=[0,1])

Out[104]: Power_divergenceResult(statistic=2.0, pvalue=array([0.84914504, 0.73575888]))

In [106]: chisquare(f_obs, ddof=[0,1,2])

Out[106]: Power_divergenceResult(statistic=2.0, pvalue=array([0.84914504, 0.73575888, 0.5724067]))

The above results with the different degrees of freedom all generate pvalues higher than the alpha value, so we accept the Null Hypothesis (H0).
