Associative Hypothesis Testing
ROUTINE TASK
By Group 4 :
DEPARTMENT OF BIOLOGY
FACULTY OF MATHEMATICS AND NATURAL SCIENCE
MEDAN
2021
TABLE OF CONTENTS
Cover
Table of Contents
CHAPTER I PRELIMINARY
1.1 Background
1.2 Purpose
1.3 Benefits
CHAPTER II MATERIAL
2.1 Definition of Associative Hypothesis Testing
2.2 Associative Hypothesis Testing Techniques
2.2.1 Product Moment Correlation
2.2.2 Multiple Correlation
2.2.3 Partial Correlation
2.2.4 Contingency Correlation
REFERENCES
CHAPTER I
PRELIMINARY
1.1 Background
A hypothesis can be interpreted as a statistical statement about a population parameter. In research, a hypothesis is a provisional answer to the research question. It can be a statement about the relationship between two or more variables (associative), a comparison between groups (comparative), or a statement about a single variable on its own (descriptive).
An associative hypothesis is an assumption that a relationship exists between variables in the population. It is tested through the relationship between those variables in a sample drawn from that population. There are three forms of relationship: symmetrical, causal, and interactive (mutual influence). A relationship between two or more variables is examined by calculating the correlation between them. The correlation coefficient is a number that shows the direction and strength of the relationship. Direction is expressed as positive or negative, while strength is expressed by the magnitude of the coefficient.
A relationship between variables is positive if an increase in one variable is accompanied by an increase in the other, and a decrease in one by a decrease in the other. A relationship is negative if an increase in one variable is accompanied by a decrease in the other, and vice versa.
Various correlation techniques can be used to test associative hypotheses. Which coefficient to use depends on the type of data to be analyzed: nonparametric statistics are used for nominal and ordinal data, and parametric statistics are used for interval and ratio data.
1.2 Purpose
The objectives of this paper are as follows :
1. To understand the concept of associative hypothesis testing
2. To understand how to use the formulas and under what conditions each associative hypothesis test applies
3. To understand the correlation techniques that can be used to test associative hypotheses
1.3 Benefits
The benefits of this paper are as follows :
1. Being able to describe a research problem using associative hypothesis testing
2. Being able to describe the variables that will be tested using associative hypothesis testing
3. Helping students who are preparing a thesis to process the data they obtain
CHAPTER II
MATERIAL
In statistical analysis, the magnitude of the correlation coefficient can be illustrated by the distribution of the data points on an X-Y scatter plot, as in the following three figures :
[Figures 1-3: scatter plots of Y against X]
Figure 1 shows the distribution of the relationship between variable X and variable Y
which does not show a certain pattern. That is, when variable X is low, variable Y can be low
or high. Likewise, when the variable X is high. This pattern shows that there is no
relationship between the two variables.
Figure 2 shows that when variable X is low, variable Y is also low, and when variable X is high, variable Y is also high. This pattern indicates a strong positive relationship between the two variables.
Figure 3 shows that when variable X is low, variable Y is high, and when variable X is high, variable Y is low. This pattern indicates a strong negative relationship between the two variables.
Data type    Correlation technique
Ordinal      Spearman rank correlation; Kendall's tau
Nominal      Contingency coefficient
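2.2.1 Product Moment Correlation
$$ r_{xy} = \frac{\sum xy}{\sqrt{\left(\sum x^2\right)\left(\sum y^2\right)}} $$
Where :
rxy = Correlation between variable X and variable Y
x = (Xi - mean of X)
y = (Yi - mean of Y)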
Sample case
Is there a relationship between income and expenses? To answer this question, data were collected from 10 randomly selected respondents. The data obtained are as follows :
Income (X)     800  900  700  600  700  800  900  600  500  500
Expenses (Y)   300  300  200  200  200  200  300  100  100  100
Working in units of 100 (which does not change the correlation) and applying the correlation formula above gives :
Mean of X = 7;
Mean of Y = 2;
Σx² = 20;
Σy² = 6;
Σxy = 10;
so rxy = 10 / √(20 × 6) = 0.9129
So there is a positive correlation of 0.9129 between income and expenses: the greater the income, the greater the expenses. The question is whether this sample correlation is significant, that is, whether it can be generalized to the population. To decide, the calculated r is compared with the r table (product moment r table) at a chosen significance level. For a 5% significance level and N = 10, the r table value is 0.632. The calculated r (0.9129) is greater than the r table value, so Ho is rejected and Ha is accepted. It can be concluded that there is a strong and significant relationship between income and expenses.
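The sample case can be checked with a short Python sketch. It is illustrative only: the names `pearson_r` and `R_TABLE_5PCT_N10` are chosen here and do not come from the source, and the critical value is simply the r table figure quoted in the text.

```python
import math

# Income (X) and expenses (Y) for the 10 respondents in the sample case.
income = [800, 900, 700, 600, 700, 800, 900, 600, 500, 500]
expenses = [300, 300, 200, 200, 200, 200, 300, 100, 100, 100]


def pearson_r(xs, ys):
    """Product moment correlation: sum(xy) / sqrt(sum(x^2) * sum(y^2)),
    where x and y are deviations from the respective means."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    dx = [v - mean_x for v in xs]
    dy = [v - mean_y for v in ys]
    sum_xy = sum(a * b for a, b in zip(dx, dy))
    sum_x2 = sum(a * a for a in dx)
    sum_y2 = sum(b * b for b in dy)
    return sum_xy / math.sqrt(sum_x2 * sum_y2)


r = pearson_r(income, expenses)

# r table value for N = 10 at the 5% level, as quoted in the text.
R_TABLE_5PCT_N10 = 0.632

print(round(r, 4))
print("Ho rejected" if abs(r) > R_TABLE_5PCT_N10 else "Ho not rejected")
```

Because r does not depend on the unit of measurement, the raw values give the same coefficient as the calculation in units of 100.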
Besides comparing the calculated r with the product moment r table, the significance of a correlation can also be tested by comparing the calculated t with the t table.
Here t is found with the formula :
$$ t = \frac{r\sqrt{n-2}}{\sqrt{1-r^2}} $$
For the case above, the calculated t is 6.33.
From the t table, with a 5% significance level, a two-tailed test, and dk = n - 2 = 8, the t table value is 2.306.
Because the calculated t is greater than the t table value, Ho is rejected and Ha is accepted, which means that there is a strong and significant relationship between income and expenses.
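The t test can be sketched in the same way. The sketch assumes the usual formula for testing a correlation coefficient, t = r√(n - 2) / √(1 - r²). The function name `t_from_r` is hypothetical, and the critical value is the t table figure quoted above.

```python
import math


def t_from_r(r, n):
    """t statistic for a correlation r computed from n pairs (dk = n - 2)."""
    return r * math.sqrt(n - 2) / math.sqrt(1 - r ** 2)


t_count = t_from_r(0.9129, 10)

# Two-tailed 5% t table value for dk = 8, as quoted in the text.
T_TABLE = 2.306

print(round(t_count, 2))
print("Ho rejected" if t_count > T_TABLE else "Ho not rejected")
```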
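2.2.2 Multiple Correlation
[Figure: multiple correlation between three independent variables and one dependent variable]
X1 = Employee welfare
X2 = Leadership model
X3 = Supervision
Y = Work effectiveness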
From the picture above, it can be seen that the multiple correlation R is not the sum of the simple correlations of the individual variables (R is not r1 + r2 + r3). Multiple correlation (R) is the joint relationship of X1, X2, and X3 with Y.
The multiple correlation formula for two independent variables is as follows :
$$ R_{y.x_1x_2} = \sqrt{\frac{r_{yx_1}^2 + r_{yx_2}^2 - 2\,r_{yx_1}\,r_{yx_2}\,r_{x_1x_2}}{1 - r_{x_1x_2}^2}} $$
Where :
R y.x1x2 = Correlation between variables X1 and X2 together with the variable Y
r yx1 = Product moment correlation between X1 and Y
r yx2 = Product moment correlation between X2 and Y
r x1x2 = Product moment correlation between X1 and X2
Case in point :
To examine the problems of leadership models and office layout in relation to job
satisfaction, relevant data were collected. From these data, the simple correlation is
calculated, and obtained :
1. Correlation between the leadership model and job satisfaction = 0.45
2. Correlation between office layout and job satisfaction = 0.48
3. The correlation between the leadership model and office layout = 0.22
Entering these simple correlations into the multiple correlation formula gives R = 0.5959.
From these calculations, it turns out that the magnitude of the multiple correlation R is
greater than the individual correlation ryx1 and ryx2.
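As a check on this case, the two-predictor multiple correlation can be sketched in Python. The sketch assumes the standard formula R = √((r_yx1² + r_yx2² - 2·r_yx1·r_yx2·r_x1x2) / (1 - r_x1x2²)). The function name `multiple_r` is chosen here for illustration.

```python
import math


def multiple_r(r_y1, r_y2, r_12):
    """Multiple correlation of Y with X1 and X2 from three simple correlations."""
    numerator = r_y1 ** 2 + r_y2 ** 2 - 2 * r_y1 * r_y2 * r_12
    return math.sqrt(numerator / (1 - r_12 ** 2))


# Simple correlations from the leadership model / office layout case.
R = multiple_r(r_y1=0.45, r_y2=0.48, r_12=0.22)

print(round(R, 4))
```

The result is larger than either 0.45 or 0.48 but smaller than their sum, which matches the point that R is not simply r1 + r2.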
The significance of the multiple correlation coefficient can be tested with the following F test formula :
$$ F_h = \frac{R^2 / k}{(1 - R^2)/(n - k - 1)} $$
Where :
R = coefficient of multiple correlation
k = number of independent variables
n = number of sample members
Based on the figures found above, if n = 30, then Fh can be calculated with this formula, which gives Fh = 7.43.
The calculated F is then compared with the F table with dk numerator = k = 2 and dk denominator = n - k - 1 = 30 - 2 - 1 = 27. At a 5% significance level, the F table value is 3.35.
Since the calculated F is greater than the F table value, Ho is rejected and Ha is accepted. Thus, the multiple correlation coefficient found is significant (it applies to the population from which the sample was drawn).
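The F statistic can be reproduced with a minimal sketch. It assumes the usual form Fh = (R²/k) / ((1 - R²)/(n - k - 1)) and takes R = 0.5959, the value obtained by applying the two-predictor formula to the three correlations in the case. The function name `f_for_multiple_r` is hypothetical.

```python
def f_for_multiple_r(R, k, n):
    """F statistic for a multiple correlation R with k predictors and n cases."""
    return (R ** 2 / k) / ((1 - R ** 2) / (n - k - 1))


k, n = 2, 30
F_h = f_for_multiple_r(0.5959, k, n)

# Degrees of freedom used to look up the F table value.
dk_numerator = k
dk_denominator = n - k - 1

print(round(F_h, 2), dk_numerator, dk_denominator)
```

The degrees of freedom follow from the same n that is used to compute Fh, so the table lookup and the statistic stay consistent.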
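2.2.3 Partial Correlation
Suppose the question is: for students whose study time is the same (held constant), what is the correlation between IQ and college achievement? The answer can be found with the following partial correlation formula :
$$ r_{yx_1.x_2} = \frac{r_{yx_1} - r_{yx_2}\,r_{x_1x_2}}{\sqrt{\left(1 - r_{x_1x_2}^2\right)\left(1 - r_{yx_2}^2\right)}} $$
The formula above can be read: the correlation between Y and X1 when X2 is controlled (held constant).
Meanwhile, if X1 is controlled, the formula is :
$$ r_{yx_2.x_1} = \frac{r_{yx_2} - r_{yx_1}\,r_{x_1x_2}}{\sqrt{\left(1 - r_{x_1x_2}^2\right)\left(1 - r_{yx_1}^2\right)}} $$
The significance of a partial correlation can be tested with the following formula :
$$ t = \frac{r_p\sqrt{n-3}}{\sqrt{1 - r_p^2}} $$
where rp is the partial correlation and n is the number of sample members.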
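A minimal sketch of the first-order partial correlation and its t test follows. It assumes the standard formulas r_yx1.x2 = (r_yx1 - r_yx2·r_x1x2) / √((1 - r_x1x2²)(1 - r_yx2²)) and t = rp√(n - 3) / √(1 - rp²). The source gives no numbers for the IQ example, so the inputs below reuse the three correlations and n = 30 from the job satisfaction case purely as an illustration. The function names are hypothetical.

```python
import math


def partial_r(r_y1, r_y2, r_12):
    """Correlation between Y and X1 with X2 held constant."""
    return (r_y1 - r_y2 * r_12) / math.sqrt((1 - r_12 ** 2) * (1 - r_y2 ** 2))


def t_partial(r_p, n):
    """t statistic for a first-order partial correlation (dk = n - 3)."""
    return r_p * math.sqrt(n - 3) / math.sqrt(1 - r_p ** 2)


# Illustrative inputs only: values borrowed from the job satisfaction case.
r_p = partial_r(r_y1=0.45, r_y2=0.48, r_12=0.22)
t_count = t_partial(r_p, n=30)

print(round(r_p, 3), round(t_count, 2))
```

To control X1 instead of X2, swap the first two arguments: `partial_r(r_y2, r_y1, r_12)`.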
2.2.4 Contingency Correlation
Example :
A study was conducted to find out whether there is a relationship between profession and the type of sport most often practiced. Professions are grouped into Doctor, Lawyer, Lecturer, and Businessman (Dr, Pc, Ds, Bs), and sports are grouped into Golf, Tennis, Badminton, and Football (Gf, T, Bt, Sp). The number of respondents in each professional group is as follows :
Dr = 58
Pc = 75
Ds = 68
Bs = 81
Total number = 282
From this random sample of the four professional groups, the following data were obtained :
To calculate the expected frequencies (fh), first calculate the proportion of the whole sample that enjoys golf, tennis, badminton, and football.
The proportions are :
a. Proportion who like golf = 80/282 = 0.284
b. Proportion who like tennis = 80/282 = 0.284
c. Proportion who like badminton = 70/282 = 0.248
d. Proportion who like football = 52/282 = 0.184
Next, the expected frequency (fh) of each professional group for each type of sport can be calculated :
1. Golf :
a. fh Doctor : 0.284 x 58 = 16.472
b. fh Lawyer : 0.284 x 75 = 21.300
c. fh Lecturer : 0.284 x 68 = 19.312
d. fh Businessman : 0.284 x 81 = 23.004
2. Tennis :
a. fh Doctor : 0.284 x 58 = 16.472
b. fh Lawyer : 0.284 x 75 = 21.300
c. fh Lecturer : 0.284 x 68 = 19.312
d. fh Businessman : 0.284 x 81 = 23.004
3. Badminton :
a. fh Doctor : 0.248 x 58 = 14.384
b. fh Lawyer : 0.248 x 75 = 18.600
c. fh Lecturer : 0.248 x 68 = 16.864
d. fh Businessman : 0.248 x 81 = 20.088
4. Football :
a. fh Doctor : 0.184 x 58 = 10.672
b. fh Lawyer : 0.184 x 75 = 13.800
c. fh Lecturer : 0.184 x 68 = 12.512
d. fh Businessman : 0.184 x 81 = 14.904
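The expected frequencies above can be generated with a short sketch. It uses the group sizes and sport totals stated in the text. The dictionary names are chosen here for illustration. The sketch keeps the proportions unrounded, so its results differ slightly in the second decimal from the hand calculation, which rounds each proportion to three places first.

```python
# Number of respondents per professional group (from the text).
group_sizes = {"Doctor": 58, "Lawyer": 75, "Lecturer": 68, "Businessman": 81}

# Number of respondents who prefer each sport (numerators of the proportions).
sport_totals = {"Golf": 80, "Tennis": 80, "Badminton": 70, "Football": 52}

n = sum(group_sizes.values())

# Expected frequency = proportion preferring the sport x size of the group.
expected = {
    sport: {group: total / n * size for group, size in group_sizes.items()}
    for sport, total in sport_totals.items()
}

for sport, row in expected.items():
    print(sport, {group: round(fh, 3) for group, fh in row.items()})
```

A useful sanity check is that all sixteen expected frequencies add up to the total sample size.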
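These results are then entered into the table of observed and expected frequencies.
The value of Chi Square (χ²) can then be calculated with the formula :
$$ \chi^2 = \sum \frac{(f_o - f_h)^2}{f_h} $$
where fo is the observed frequency and fh is the expected frequency.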
Thus, the calculated value of Chi Square (χ²) is 29.881. To calculate the contingency coefficient C, this value is entered into the formula :
$$ C = \sqrt{\frac{\chi^2}{N + \chi^2}} = \sqrt{\frac{29.881}{282 + 29.881}} = 0.31 $$
So the contingency coefficient between type of profession and preferred sport is 0.31. The significance of C is tested by comparing the calculated Chi Square (χ²) with the Chi Square table value at a given significance level and dk, where dk = (k - 1)(r - 1), k = number of sample groups = 4, and r = number of sport categories = 4. So dk = (4 - 1)(4 - 1) = 9. With dk = 9 and a significance level of 0.05, the Chi Square table value is 16.92. The rule is that if the calculated Chi Square is greater than the table value, the relationship is significant. Here the calculated value is greater than the table value (29.881 > 16.92), so Ho is rejected and Ha is accepted. The type of profession therefore has a significant relationship with the preferred type of sport, with a coefficient of 0.31. The correlation found in the sample reflects the state of the population from which the sample was drawn.
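The last steps of this example can be sketched as follows. The sketch assumes the usual definitions χ² = Σ(fo - fh)²/fh and C = √(χ² / (N + χ²)). Because the observed frequency table is not reproduced here, it starts from the χ² value reported in the text. The function names are hypothetical.

```python
import math


def chi_square(observed, expected):
    """Chi Square from paired lists of observed and expected frequencies."""
    return sum((fo - fh) ** 2 / fh for fo, fh in zip(observed, expected))


def contingency_c(chi_sq, n):
    """Contingency coefficient C for a Chi Square value and sample size n."""
    return math.sqrt(chi_sq / (n + chi_sq))


# Chi Square value and sample size as reported in the text.
C = contingency_c(29.881, 282)

# Degrees of freedom for 4 professional groups and 4 sport categories.
dk = (4 - 1) * (4 - 1)

print(round(C, 2), dk)
```

`chi_square` takes the cell frequencies flattened into two lists in the same order, for example all sixteen fo values followed by the matching sixteen fh values.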
The general form of the simple linear regression equation is :
$$ Y' = a + bX $$
Where :
Y' : Predicted value of the dependent variable
a : Value of Y when X = 0 (the constant)
b : Direction number or regression coefficient, which shows the rate of increase or decrease in the dependent variable per unit change in the independent variable
X : A given value of the independent variable
Technically, b is the tangent of the angle of the regression line, that is, the ratio of the change in the dependent variable to the change in the independent variable along the line, once the regression equation is found.
[Figure: regression line Y = 2 + 0.33X]
a = 2
b = (3 - 2) / (3 - 0) = 1/3 = 0.33
$$ b = r\,\frac{S_y}{S_x} $$
$$ a = \bar{Y} - b\bar{X} $$
Where:
r : product moment correlation coefficient between variable X and variable Y
Sy : standard deviation of variable Y
Sx : standard deviation of variable X
So b is a function of the correlation coefficient. For given standard deviations, if the correlation coefficient is high, b is also large, and if the correlation coefficient is low, b is also small. In addition, if the correlation coefficient is negative, b is also negative.
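The link between b and r can be illustrated with a small sketch. It assumes the relations b = r·Sy/Sx and a = mean(Y) - b·mean(X). It reuses the income and expenses data from the product moment sample case purely as an illustration; the source does not fit a regression to those data.

```python
import math
import statistics

# Illustrative reuse of the income (X) and expenses (Y) sample data.
income = [800, 900, 700, 600, 700, 800, 900, 600, 500, 500]
expenses = [300, 300, 200, 200, 200, 200, 300, 100, 100, 100]

mean_x = statistics.mean(income)
mean_y = statistics.mean(expenses)

# Product moment correlation from deviations about the means.
dx = [x - mean_x for x in income]
dy = [y - mean_y for y in expenses]
r = sum(p * q for p, q in zip(dx, dy)) / math.sqrt(
    sum(p * p for p in dx) * sum(q * q for q in dy)
)

# Sample standard deviations of X and Y.
s_x = statistics.stdev(income)
s_y = statistics.stdev(expenses)

b = r * s_y / s_x        # slope as a function of r
a = mean_y - b * mean_x  # intercept

print(round(b, 2), round(a, 2))
```

Since the standard deviations are always positive, the sign of b is the sign of r.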
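Apart from the formulas above, a and b can also be found with the following formulas :
$$ a = \frac{\left(\sum Y\right)\left(\sum X^2\right) - \left(\sum X\right)\left(\sum XY\right)}{n\sum X^2 - \left(\sum X\right)^2} $$
$$ b = \frac{n\sum XY - \left(\sum X\right)\left(\sum Y\right)}{n\sum X^2 - \left(\sum X\right)^2} $$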
Example of Simple Linear Regression
The following data are observations of the service quality score (X) and the average sales value (Y) for Balonpas :
No.  X   Y      No.  X   Y
1    54  167    18   45  160
2    50  155    19   47  155
3    53  148    20   53  159
4    45  146    21   49  159
5    48  170    22   56  172
6    63  173    23   57  168
7    46  149    24   50  159
8    56  166    25   49  150
9    52  170    26   58  165
10   56  174    27   48  159
11   47  156    28   52  162
12   56  158    29   56  168
13   55  150    30   54  166
14   52  160    31   59  177
15   50  157    32   47  149
16   60  177    33   48  155
17   55  166    34   56  160
No.  X   Y    XY     X²    Y²
20   53  159  8427   2809  25281
21   49  159  7791   2401  25281
22   56  172  9632   3136  29584
23   57  168  9576   3249  28224
24   50  159  7950   2500  25281
25   49  150  7350   2401  22500
26   58  165  9570   3364  27225
27   48  159  7632   2304  25281
28   52  162  8424   2704  26244
29   56  168  9408   3136  28224
30   54  166  8964   2916  27556
31   59  177  10443  3481  31329
32   47  149  7003   2209  22201
33   48  155  7440   2304  24025
34   56  160  8960   3136  25600
$$ b = \frac{(34)(288380) - (1782)(5485)}{(34)(94098) - (1782)^2} = 1.29 $$
Once the values of a and b are found, the regression equation can be determined, namely:
Y = 93.85 + 1.29 X
This regression equation can then be used to predict the value of the dependent variable for a given value of the independent variable. For example, if the service quality score is set to 64, the average sales value can be predicted as:
Y = 93.85 + 1.29 (64) = 176.41
From the regression equation above, it can be interpreted that if the service quality score increases by one unit, the average sales value per month increases by 1.29 units. Equivalently, if the service quality score increases by 10, the average sales value increases by 12.9.
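The whole regression example can be reproduced with a short sketch. It assumes the least squares formulas b = (nΣXY - ΣXΣY) / (nΣX² - (ΣX)²) and a = (ΣYΣX² - ΣXΣXY) / (nΣX² - (ΣX)²). The names `quality`, `sales`, and `predict` are chosen here for illustration.

```python
# Service quality score (X) and average sales value (Y), observations 1 to 34.
quality = [54, 50, 53, 45, 48, 63, 46, 56, 52, 56, 47, 56, 55, 52, 50, 60, 55,
           45, 47, 53, 49, 56, 57, 50, 49, 58, 48, 52, 56, 54, 59, 47, 48, 56]
sales = [167, 155, 148, 146, 170, 173, 149, 166, 170, 174, 156, 158, 150, 160,
         157, 177, 166, 160, 155, 159, 159, 172, 168, 159, 150, 165, 159, 162,
         168, 166, 177, 149, 155, 160]

n = len(quality)
sum_x = sum(quality)
sum_y = sum(sales)
sum_xy = sum(x * y for x, y in zip(quality, sales))
sum_x2 = sum(x * x for x in quality)

# Least squares slope and intercept from the raw sums.
denominator = n * sum_x2 - sum_x ** 2
b = (n * sum_xy - sum_x * sum_y) / denominator
a = (sum_y * sum_x2 - sum_x * sum_xy) / denominator


def predict(x):
    """Predicted average sales value for a service quality score x."""
    return a + b * x


print(f"Y' = {a:.2f} + {b:.2f} X")
print(f"Predicted sales at X = 64: {predict(64):.2f}")
```

The sketch keeps a and b unrounded, so its prediction at X = 64 differs slightly from a hand calculation that first rounds the coefficients to 93.85 and 1.29.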