Module AGRI 214
INTRODUCTION
This module aims to help agriculture students gain not only knowledge but, more
importantly, the necessary skills in statistical computation, as well as the ability to give
concise and valid interpretations of statistical results from various agricultural
experiments. Familiarity with the different experimental designs, in terms of their features,
advantages and limitations, is important in choosing the most appropriate
experimental design and in interpreting results correctly, thereby making an experiment
valid and reliable. Although computer applications are now available and more
convenient to use, the manual procedures presented here provide a strong
foundation for understanding the statistical principles that underlie
the computer-based applications. It is not the aim of this manual to cover all aspects of
agricultural research and statistics. However, it is designed to be a simple yet reliable
reference for students in agriculture.
Users of this manual are encouraged to patiently understand and follow the steps in
each of the statistical tools presented in order to gain familiarity and expertise in
utilizing them for various agricultural experiments. A good tactic is to practice the
procedures using the available sample data in this manual.
THE AUTHOR
Module 1. Elements of Experimentation
• Basic concepts of experimentation
• Formulation of hypothesis
• Design of a procedure
• Requirements of experimental design
• Essential components of experimental design
Overview
In this module, the student gains basic knowledge of how to conduct
scientific experiments, the principles involved and the associated procedures, with a focus
on valid experimental designs in agricultural research.
Learning Outcome
The student is expected to learn and implement the principles and procedures of
designing an agricultural experiment keeping in mind the associated limitations of each
of the designs based on formulated hypothesis and research objectives.
LEARNING CONTENT
Experimentation
It is the process of carrying out an operation under determined conditions and scientific
principles to discover, verify or illustrate a theory, prove or disprove a hypothesis.
Elements of Experimentation
1. Formulation of Hypothesis
A hypothesis is a statement that still needs to be tested. It is a tentative explanation
for a phenomenon, used as a basis for further investigation.
2. Design of a procedure
A procedure is designed to verify the hypothesis.
Example
A plant breeder would like to investigate why corn hybrids show lower yields than
native varieties in disease-prevalent areas in Mindanao. The test materials would
probably be the native and hybrid varieties while the characters to be measured would
be disease infection and grain yield.
Basic Concepts
An experimental design consists of two basic structures:
a. Treatment structure: how the experimental factors are used to derive the set of
treatments to be studied
b. Design (block) structure: how the experimental units are organized
Experimental unit (eu)
Unit or group of units of experimental materials to which a treatment is applied
Experimental Design
It is a plan of assigning experimental conditions or treatments to some experimental
units and the statistical analysis associated with the plan.
Why is it important?
• Experimental design ensures that the appropriate data will be obtained in a way
that permits an objective analysis leading to valid inferences with respect to the
stated problem.
• Experimental design guides statistical analysis. If the experiment was not
designed properly, there may be no statistical analysis appropriate for the
collected data.
• Lack of proper design results not only in wasted resources but also in
misleading results.
4. The specification of the rules by which the treatments are to be allocated to the
experimental units (Randomization)
5. The specification of the measurements or other observations to be made on
each treatment.
Estimate of Error
It is the separation of treatment difference from other sources of variation
If you say yes, you presume that any difference between the yields of the two plots is
caused by the varieties and nothing else. This is not TRUE.
Even if the same variety were planted on both plots, the yield would differ. This is
because other factors such as soil fertility, moisture, insect damage, diseases, and birds
may also have affected the yields of the plots.
A satisfactory evaluation of the two varieties must involve a procedure that can
separate varietal difference from other sources of variation.
A researcher therefore must be able to design an experiment that allows him to decide
whether the observed difference is caused by varietal difference or other factors.
The two varieties planted in two adjacent plots will be considered different in their yield
only if the yield difference is larger than that expected if both plots were planted to the
same variety.
Hence, the researcher needs to know not only the yield difference between plots
planted to different varieties, but also the yield difference between plots planted to the
same variety.
The difference between experimental plots treated alike is called experimental error.
This error is the primary basis for deciding whether an observed difference is real or just
due to chance. Every experiment must be designed to have a measure of the
experimental error.
1. Blocking – putting experimental units that are as similar as possible together in the same
group (block) and assigning all treatments to each block separately and independently.
2. Proper plot technique – ensures that all factors other than the treatments are maintained
uniformly for all experimental units. For example, in a variety trial, all factors other than
variety are uniformly applied.
3. Data analysis – the proper choice of data analysis can help achieve control of experimental
error.
Proper Interpretation of Results
An important feature of the design of an experiment is the ability to uniformly maintain
all environmental factors that are not part of the treatments being evaluated. The main
advantage is the achievement of uniformity. However, it limits the applicability of experimental
results, that is, they are often applicable only to conditions similar to those
present during the conduct of the experiment.
Treatment – is a level of a factor or the combination of levels of several factors that will
be applied to experimental units to measure the effect of its application.
Treatment Effect – is the expected increase or decrease in average response with the
application of a particular treatment.
Interaction Effect – is the combined effect of two or more factors, that is, 2 factors A
and B interact if the response to A is not the same at all levels of B, and vice versa.
Main Sources of Variation
1. Inherent variability which exists in the experimental units
2. Variation due to lack of uniformity in the physical conduct of the experiment
The Analysis of Variance
• Analysis of Variance (ANOVA) is a statistical method used to test differences
between two or more means.
• ANOVA is used to test general rather than specific differences among means
• ANOVA tests the non-specific null hypothesis that all treatment means are equal
• A statistical procedure used to test the degree to which two or more groups vary
or differ in an experiment
Example
a. Yield Trial - 8 rice varieties
Ho – the varieties have similar yields
Ha – the varieties have different yields
Learning Activities
Study Questions:
1. What is the importance of an experimental design?
2. Discuss the different strategies to minimize experimental error.
3. What is blocking? Discuss its significance.
4. Discuss the importance of the analysis of variance.
Activity 1:
Conceptualize a field experiment using the principles and procedures in this module.
Module 2. Completely Randomized Design (CRD)
a. Randomization and Layout
b. Analysis of Variance
Overview
This module aims to equip students with the skill of employing the principles and
procedures of establishing an experiment using the completely randomized design
Learning Outcome
The student is expected to learn as well as apply the principles and procedures of
implementing the completely randomized design in agricultural research studies, learn
and apply its limitations, and the proper interpretation of results
LEARNING CONTENT
number as 916. Proceeding further, the following numbers are sequentially
selected:
Random Number   Sequence    Random Number   Sequence    Random Number   Sequence    Random Number   Sequence
916             1           254             6           400             11          884             16
658             2           808             7           066             12          878             17
974             3           793             8           321             13          339             18
757             4           242             9           511             14          185             19
152             5           606             10          912             15          866             20
Group Number (Treatments) Ranks (Plot Numbers)
1 (A) 19, 11, 20, 12, 2
2 (B) 5, 14, 13, 4, 10
3 (C) 8, 1, 6, 9, 18
4 (D) 17, 16, 7, 3, 15
Plot Number    1    2    3    4
Treatment      C    A    D    B
Plot Number    5    6    7    8
Treatment      B    C    D    C
Plot Number    9    10   11   12
Treatment      C    B    A    A
Plot Number    13   14   15   16
Treatment      B    B    D    D
Plot Number    17   18   19   20
Treatment      D    C    A    A
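The rank-based randomization above can also be reproduced programmatically. The following is a minimal Python sketch, not part of the original manual. It reuses the 20 random numbers from the table, assumes ascending ranks (rank 1 for the smallest number, as in the worked example), and the function name `crd_layout` is hypothetical.

```python
# Three-digit random numbers in the order they were drawn (066 is written as 66).
random_numbers = [916, 658, 974, 757, 152, 254, 808, 793, 242, 606,
                  400, 66, 321, 511, 912, 884, 878, 339, 185, 866]
treatments = ["A", "B", "C", "D"]
replications = 5


def crd_layout(numbers, treatments, replications):
    """Return {plot_number: treatment} using the ranks of the random numbers."""
    # Rank 1 goes to the smallest random number (ascending ranks).
    order = sorted(range(len(numbers)), key=lambda i: numbers[i])
    ranks = [0] * len(numbers)
    for rank, index in enumerate(order, start=1):
        ranks[index] = rank
    layout = {}
    for sequence, rank in enumerate(ranks):
        # The first r ranks go to the first treatment, the next r to the second, etc.
        layout[rank] = treatments[sequence // replications]
    return layout


layout = crd_layout(random_numbers, treatments, replications)
for plot in sorted(layout):
    print(plot, layout[plot])
```

Each rank is used as a plot number, so the printed list can be checked directly against the field layout above.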
Sequence: 11 12 13 14 15 16 17 18 19 20
Rank: 10 2 1 3 19 5 17 9 18 6
Card No: A(D) 3(F) 2(F) 7(F) 10(S) A(F) 4(S) 9(D) 5(S) 4(D)
Where:
H = Heart D = Diamond F = Flower S = Spade
B.2. Assign the treatments to the plots by using the ranks as plot numbers
as follows:
Treatment Plot Number
A 15 11 14 7 20
B 13 4 8 16 12
C 10 2 1 3 19
D 5 17 9 18 6
1.1.2. ANALYSIS OF VARIANCE
The Analysis of Variance Table for a CRD is as follows:
Source of      Degree of     Sum of      Mean       Computed     Tabular F
Variation      Freedom       Squares     Square     F            5%     1%
Treatment      t – 1
Error          t(r – 1)
TOTAL          rt – 1
The formulas for computing the Correction Factor (CF) and the sums of squares are:
$$CF = \frac{G^2}{n}$$
where G = grand total and n = total number of observations.
$$\text{Total SS (TSS)} = \sum_{i=1}^{n} X_i^2 - CF$$
where X_i is the individual observation.
$$\text{Treatment SS (TrSS)} = \frac{\sum_{i=1}^{t} T_i^2}{r} - CF$$
where T_i are the treatment totals and r is the number of replications.
$$\text{Error SS} = TSS - TrSS$$
NOTE: As a general guideline, the F values should be computed only when the error
df is large enough, that is, when it is 6 or more.
Obtain the tabular F values from Appendix E of the book “Statistical Procedures for
Agricultural Research” by Gomez and Gomez, with f1 = treatment df (t – 1) and f2 = error
df t(r – 1). In our example, the tabular F values with f1 = 6 and f2 = 21 are 2.57 at the 5% level
of significance and 3.81 at the 1% level.
If the Computed F is greater than the Tabular F at 5% level of significance it is said that
there are significant differences among treatment means at 5% level. Such result is
shown by putting one asterisk above the computed F value. If the computed F is greater
than the Tabular F at 1% level, the treatment differences are said to be highly significant
and two asterisks are put above the computed F value. If the computed F is equal to or
smaller than the tabular F at 5% level, treatment differences are said to be
nonsignificant and ns is written above the computed F value.
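The computations and the significance rule above can be sketched in Python as follows. This is only an illustration for equal replication: the data set is hypothetical, the function names are not from the manual, the error SS is obtained by subtraction (Total SS minus Treatment SS), and the tabular F values still have to be read from an F table for the actual degrees of freedom.

```python
def crd_anova(data):
    """Compute CRD ANOVA quantities for equal replication.

    data: dict mapping each treatment label to its list of r observations.
    """
    t = len(data)                            # number of treatments
    r = len(next(iter(data.values())))       # replications per treatment
    n = r * t                                # total number of observations
    observations = [x for values in data.values() for x in values]

    grand_total = sum(observations)
    cf = grand_total ** 2 / n                # correction factor
    total_ss = sum(x ** 2 for x in observations) - cf
    treatment_ss = sum(sum(values) ** 2 for values in data.values()) / r - cf
    error_ss = total_ss - treatment_ss       # obtained by subtraction

    treatment_df, error_df = t - 1, t * (r - 1)
    treatment_ms = treatment_ss / treatment_df
    error_ms = error_ss / error_df
    return {
        "treatment_df": treatment_df, "error_df": error_df, "total_df": n - 1,
        "treatment_ss": treatment_ss, "error_ss": error_ss, "total_ss": total_ss,
        "treatment_ms": treatment_ms, "error_ms": error_ms,
        "f": treatment_ms / error_ms,
    }


def significance(f_computed, f_tab_05, f_tab_01):
    """Return '**', '*' or 'ns' following the rule described above."""
    if f_computed > f_tab_01:
        return "**"
    if f_computed > f_tab_05:
        return "*"
    return "ns"


# Hypothetical data: 3 treatments with 3 replications each.
example = {"A": [1, 2, 3], "B": [4, 5, 6], "C": [7, 8, 9]}
result = crd_anova(example)
print(result)

# Hypothetical computed F of 3.00 judged against the tabular values
# quoted in the text for f1 = 6 and f2 = 21 (2.57 and 3.81).
print(significance(3.00, 2.57, 3.81))
```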
The Coefficient of Variation (cv) indicates the degree of precision with which the
treatments are compared. It is a good index of the reliability of the experiment. It is an
expression of the experimental error as a percentage of the mean, thus, the higher the
cv, the lower is the reliability of the experiment. The cv varies with the type of
experiment, crop grown, and the trait being measured. An experienced researcher
should be able to make a good judgment on the acceptability of a cv value for a
particular type of experiment. According to Gomez and Gomez, experience at the
International Rice Research Institute (IRRI) with field experiments on transplanted rice
indicates that the acceptable cv for yield is 6 to 8% for
variety trials, 10 to 12% for fertilizer trials, and 13 to 15% for insecticide and herbicide
trials.
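Since the cv expresses the experimental error as a percentage of the mean, it can be computed from the error mean square and the grand mean. The short Python sketch below assumes the usual form cv = (√Error MS / grand mean) × 100; the input values are hypothetical and the function name is not from the manual.

```python
import math


def coefficient_of_variation(error_ms, grand_mean):
    """cv (%) = square root of the error mean square over the grand mean, times 100."""
    return 100 * math.sqrt(error_ms) / grand_mean


# Hypothetical values: error mean square of 1.0 and grand mean of 5.0.
print(coefficient_of_variation(1.0, 5.0))
```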
Learning Activities
Exercise 1. Prepare a randomization for an experiment in a Completely Randomized
Design with 8 treatments and 4 replications using draw lots, and the table of random
numbers.
Exercise 2. The following table shows hypothetical data from a CRD experiment with
7 treatments and 4 replications. Compute for the sums of squares, mean squares, and the
corresponding F value, then construct the ANOVA table and interpret the result.
• Most of the lectures would be delivered online via edmodo or google classroom
• Hard copies of lectures would also be provided to students allowed to visit the
campus during the semester
• Zoom meetings or google meet would be conducted as deemed necessary
Assessment Task
• Submission of accomplished activity 1.
• Answer the study questions
• Assessment Quiz 1 (online)
References
Gomez & Gomez. Statistical Procedures for Agricultural Research. 1984
Steel, R.G.D. & Torrie, J.H. Principles and Procedures of Statistics. 1960
a. Randomization (Follow the procedures in module 1)
b. Analysis of variance
Overview
This module aims to equip students with the skill of employing the principles and
procedures of establishing an experiment using the randomized complete block design
Learning Outcome
The student is expected to learn as well as apply the principles and procedures of
implementing the randomized complete block design in agricultural research studies,
learn and apply its limitations, and the proper interpretation of results
LEARNING CONTENT
Example:
Treatments (t) – A, B, C, D, E, F
Number of Replication (r) – 4
Steps
1. Divide the experimental area into r equal blocks
2. Divide the first block into t number of treatments. Number the plots from 1
to t. Assign t treatments at random to the t plots. In this example, block I is
divided into 6 equal plots, and the 6 treatments are assigned at random to
the plots.
[Figure: layout of Block I along the gradient. Recoverable plot assignments: plot 2 = D, plot 3 = F, plot 5 = B, plot 6 = A.]
$$CF = \frac{G^2}{rt}$$
where G = grand total, r = number of replications and t = number of treatments.
$$TSS = \sum_{i=1}^{n} X_i^2 - CF$$
where X_i is the individual observation and n = total number of observations.
$$TrSS = \frac{\sum_{i=1}^{t} T_i^2}{r} - CF$$
where T_i = the treatment totals and r = number of replications.
$$RepSS = \frac{\sum_{j=1}^{r} R_j^2}{t} - CF$$
where R_j = the replication totals and t = number of treatments.
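The sums of squares above can be computed as in the following Python sketch. The data are hypothetical, the function name is illustrative, and the error SS is assumed to be obtained by subtraction (TSS minus RepSS minus TrSS) with (r – 1)(t – 1) degrees of freedom.

```python
def rcbd_anova(table):
    """Compute RCBD ANOVA quantities.

    table: list of rows, one row per treatment and one column per replication.
    """
    t = len(table)          # number of treatments
    r = len(table[0])       # number of replications (blocks)

    grand_total = sum(sum(row) for row in table)
    cf = grand_total ** 2 / (r * t)
    total_ss = sum(x ** 2 for row in table for x in row) - cf

    treatment_totals = [sum(row) for row in table]
    rep_totals = [sum(row[j] for row in table) for j in range(r)]
    treatment_ss = sum(total ** 2 for total in treatment_totals) / r - cf
    rep_ss = sum(total ** 2 for total in rep_totals) / t - cf
    error_ss = total_ss - rep_ss - treatment_ss   # obtained by subtraction

    error_df = (r - 1) * (t - 1)
    treatment_ms = treatment_ss / (t - 1)
    error_ms = error_ss / error_df
    return {
        "rep_ss": rep_ss, "treatment_ss": treatment_ss,
        "error_ss": error_ss, "total_ss": total_ss,
        "error_df": error_df,
        "treatment_ms": treatment_ms, "error_ms": error_ms,
        "f_treatment": treatment_ms / error_ms,
    }


# Hypothetical data: 3 treatments (rows) in 3 replications (columns).
example = [[1, 2, 4],
           [4, 5, 6],
           [7, 8, 9]]
print(rcbd_anova(example))
```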
Learning Activities
Exercise 1. Construct the ANOVA Table from the following data from an RCB Design.
Compute for the cv and interpret the results.
• Most of the lectures would be delivered online via edmodo or google classroom
• Hard copies of lectures would also be provided to students allowed to visit the
campus during the semester
• Zoom meetings or google meet would be conducted as deemed necessary
Assessment Task
• Submission of accomplished activity 1.
• Answer the study questions
• Assessment Quiz 1 (online)
References
Gomez & Gomez. Statistical Procedures for Agricultural Research. 1984
Steel, R.G.D. & Torrie, J.H. Principles and Procedures of Statistics. 1960
a. Randomization
b. Analysis of Variance
Overview
An experiment involving two or more factors analyzed simultaneously and their
corresponding interaction, if any.
Learning Outcome
The student is expected to learn as well as apply the principles and procedures of
implementing the factorial randomized complete block design in agricultural research
studies, learn and apply its limitations, and the proper interpretation of results
LEARNING CONTENT
Introduction
An experiment in which more than one factor is studied and the treatments
consist of all possible combinations of the levels of two or more factors is called a
factorial experiment.
Example:
Treatment combination
Treatment Number    Variety    N rate, kg/ha
1                   X          0
2                   X          60
3                   Y          0
4                   Y          60
Analysis of Variance
Consider an experiment involving 5 rates of nitrogen fertilizer, 3 rice varieties, and 4
replications
SV               DF                        SS    MS    F Comp    F Tab 5%    F Tab 1%
Replication      (r – 1) = 3
Treatment        ab – 1 = 14
  Variety (A)    a – 1 = 2
  Nitrogen (B)   b – 1 = 4
  A x B          (a – 1)(b – 1) = 8
Error            (r – 1)(ab – 1) = 42
TOTAL            (rab – 1) = 59
Formulas:
$$C.F. = \frac{G^2}{rab}$$
$$A\ SS = \frac{\sum A^2}{rb} - C.F.$$
$$B\ SS = \frac{\sum B^2}{ra} - C.F.$$
$$A \times B\ SS = \text{Treatment SS} - A\ SS - B\ SS$$
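The partitioning of the treatment sum of squares can be sketched in Python as shown below. The data set is a small hypothetical 2 x 2 factorial with 2 replications, the function name is illustrative, and the Treatment SS is assumed to be computed as the sum of squared treatment totals divided by r, minus C.F.

```python
def factorial_ss(data):
    """Partition the Treatment SS of a two-factor experiment.

    data: dict mapping (level_of_A, level_of_B) to its list of r observations.
    """
    a_levels = sorted({key[0] for key in data})
    b_levels = sorted({key[1] for key in data})
    a, b = len(a_levels), len(b_levels)
    r = len(next(iter(data.values())))

    grand_total = sum(sum(obs) for obs in data.values())
    cf = grand_total ** 2 / (r * a * b)

    # Treatment SS from the ab treatment-combination totals (assumed formula).
    treatment_ss = sum(sum(obs) ** 2 for obs in data.values()) / r - cf

    # Factor totals: sum over the levels of the other factor and over replications.
    a_totals = [sum(sum(data[(i, j)]) for j in b_levels) for i in a_levels]
    b_totals = [sum(sum(data[(i, j)]) for i in a_levels) for j in b_levels]
    a_ss = sum(total ** 2 for total in a_totals) / (r * b) - cf
    b_ss = sum(total ** 2 for total in b_totals) / (r * a) - cf
    ab_ss = treatment_ss - a_ss - b_ss
    return {"cf": cf, "treatment_ss": treatment_ss,
            "a_ss": a_ss, "b_ss": b_ss, "ab_ss": ab_ss}


# Hypothetical data: varieties X and Y at 0 and 60 kg N/ha, 2 replications.
example = {("X", 0): [2, 4], ("X", 60): [6, 8],
           ("Y", 0): [3, 5], ("Y", 60): [11, 13]}
print(factorial_ss(example))
```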
Learning Activities
Activity 1.
Construct the analysis of variance of an experiment using the example above with data
shown below. Refer to Gomez & Gomez for the step by step computations.
Table 3. Grain Yield of 3 Rice Varieties Tested with 5 Levels of Nitrogen in an RCB Design
Nitrogen Level    Rep I     Rep II    Rep III   Rep IV    Treatment Total
(kg/ha)                                                   (T)
V1
0 kg N            3,852     2,707     3,014     2,865
40 kg N           4,757     4,806     4,532     4,673
70 kg N           4,566     4,504     4,899     5,635
100 kg N          6,030     5,314     5,970     5,531
130 kg N          5,870     5,870     5,906     5,513
V2
0 kg N            2,870     3,894     4,180     3,510
40 kg N           4,900     5,146     4,126     4,906
70 kg N           5,930     5,670     5,895     4,305
100 kg N          5,674     5,375     6,421     5,461
130 kg N          5,460     5,543     5,768     5,908
V3
0 kg N            4,200     3,895     3,756     3,452
40 kg N           5,256     4,562     4,809     4,236
70 kg N           5,820     4,895     5,690     4,907
100 kg N          5,678     5,543     6,012     4,765
130 kg N          5,860     6,276     6,025     5,321
Rep Total (R)
Grand Total (G)
Module 5. Split Plot Design
a. Randomization
b. Analysis of variance
Overview
An experiment involving two or more factors analyzed simultaneously and their
corresponding interaction, if any.
Learning Outcome
The student is expected to learn as well as apply the principles and procedures of
implementing the split plot design in agricultural research studies, learn and apply its
limitations, and the proper interpretation of results
LEARNING CONTENT
N3 N4 N1 N5 N2 N6 N2 N5 N6 N3 N1 N4 N1 N4 N2 N5 N3 N6
c. Divide each main plot (N rate) into b = 4 subplots and randomly assign
the 4 varieties to the subplots within each of the 18 main plots
N3 N4 N1 N5 N2 N6 N2 N5 N6 N3 N1 N4 N1 N4 N2 N5 N3 N6
V2 V3 V1 V2 V1 V4 V1 V2 V4 V2 V1 V3 V4 V3 V1 V2 V1 V1
V1 V4 V2 V4 V3 V2 V3 V4 V3 V3 V4 V2 V3 V4 V3 V3 V4 V4
V4 V1 V4 V1 V4 V3 V2 V3 V2 V1 V2 V4 V1 V1 V4 V4 V2 V2
V3 V2 V3 V3 V2 V1 V4 V1 V1 V2 V3 V1 V2 V2 V2 V1 V3 V3
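$$TSS = \sum X^2 - CF$$
where X = individual observations.
$$A\ MS = \frac{A\ SS}{a-1}$$
$$B\ MS = \frac{B\ SS}{b-1}$$
$$A \times B\ MS = \frac{A \times B\ SS}{(a-1)(b-1)}$$
$$\text{Error (b) MS} = \frac{\text{Error (b) SS}}{a(r-1)(b-1)}$$
$$F(A) = \frac{A\ MS}{\text{Error (a) MS}}$$
$$F(B) = \frac{B\ MS}{\text{Error (b) MS}}$$
$$F(A \times B) = \frac{A \times B\ MS}{\text{Error (b) MS}}$$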
Learning Activities
Exercise 1. Construct the ANOVA and interpret the results for the following data from an
experiment involving 6 seeding rates and 4 rice varieties in a Split-plot Design with 3
replications
S4 (50kg/ha)
V1 4562 4612 4200
V2 4891 4524 4633
V3 4541 4202 4215
V4 5010 4423 4688
S5 (60kg/ha)
V1 4131 4123 4522
V2 4562 4612 4010
V3 4401 4002 4314
V4 4203 4322 3965
S6 (70kg/ha)
V1 4412 4576 5012
V2 4367 4612 4655
V3 4862 4451 4001
V4 4103 4213 3985
Seeding Rate       Yield Total (RA)                   Seeding Rate
                   Rep I      Rep II     Rep III      Total (A)
S1                 15851      16727      16631        49209
S2                 20845      18930      17445        57220
S3                 20695      17534      17497        55726
S4                 19004      17761      17736        54501
S5                 17297      17059      16811        51167
S6                 17744      17852      17653        53249
Rep Total (R)      111436     105863     103773
Grand Total (G)                                       321072
S5 12776 13184 12717 12490
S6 14000 13634 13314 12301
Variety Total (B) 81466 82479 78091 79036
Overview
Where the result of the ANOVA is significant, we can determine which specific pairs of
treatments are significantly different using either the LSD or the DMRT test. The
procedures for performing treatment comparisons are discussed in detail in this module.
Learning Outcome
The student is expected to learn when and how to perform treatment mean comparison
and be able to interpret results correspondingly and arrive at valid conclusions.
LEARNING CONTENT
LSD Test for Experiments with equal number of replications is discussed in this Manual.
The Table below will be used as an example.
Formula:
$$LSD_\alpha = t_\alpha \sqrt{\frac{2s^2}{r}}$$
where t_α = tabular t value at a certain level of significance and the error df from the
ANOVA table (t values are found in Appendix C of Gomez and Gomez), s² = the error
mean square, and r = the number of replications.
Exercise 1. Compute for the LSD value at 5% and 1% level of significance using the data
in Table 5. Compare the treatment means against the Control and put asterisks above
the treatment differences to indicate the significance of the difference between a
particular pair of treatment means. Error Mean Square is 93665, Number of replications
is 4 and the tabular t0.05 = 2.080 while t0.01 = 2.831.
Table 5. Mean Yield Comparison Under 6 Insecticide Treatments Using LSD Test
Treatment Mean Yield (kg/ha) Difference from Control (kg/ha)
T1 2243 851
T2 2612 1220
T3 2543 1151
T4 2365 973
T5 1699 307
T6 1713 321
Control 1392 -----
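The LSD formula can be applied to the values given for Table 5 as in the following Python sketch, which may be used to check hand computations. The function names are illustrative, and the flagging rule (two asterisks above the 1% LSD, one asterisk above the 5% LSD, ns otherwise) is an assumption that mirrors the convention used for the F test.

```python
import math


def lsd(t_value, error_ms, replications):
    """LSD for treatments with an equal number of replications."""
    return t_value * math.sqrt(2 * error_ms / replications)


error_ms, r = 93665, 4              # values stated for Table 5
lsd_05 = lsd(2.080, error_ms, r)    # tabular t at the 5% level
lsd_01 = lsd(2.831, error_ms, r)    # tabular t at the 1% level


def flag(difference):
    """Mark a difference from the control as '**', '*' or 'ns'."""
    if abs(difference) > lsd_01:
        return "**"
    if abs(difference) > lsd_05:
        return "*"
    return "ns"


# Differences from the control, copied from Table 5.
differences = {"T1": 851, "T2": 1220, "T3": 1151, "T4": 973, "T5": 307, "T6": 321}
print(round(lsd_05), round(lsd_01))
for treatment, diff in differences.items():
    print(treatment, diff, flag(diff))
```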
2.3.1. Rank the data in ascending or descending order. For yield data, it is
customary to rank in descending order as follows:
Treatment Mean Yield (kg/ha) Rank
T2 2612 1
T3 2543 2
T4 2365 3
T1 2243 4
T6 1713 5
T5 1699 6
T7 1392 7
2.3.2. Compute for the standard error of the mean difference, Sd.
2.3.3. Compute for the (t – 1) values of the shortest significant ranges (Rp) as:
$$R_p = \frac{(r_p)(S_d)}{\sqrt{2}}$$
using the following rp values:
p    rp(0.05)
2    2.94
3    3.09
4    3.18
5    3.24
6    3.30
7    3.33
Using the above values, the (t – 1) Rp values are:
p    Rp = (rp)(Sd)/√2
2    (2.94)(216.41)/√2 = 450
3    (3.09)(216.41)/√2 = 473
4    (3.18)(216.41)/√2 = 487
5    (3.24)(216.41)/√2 = 496
6    (3.30)(216.41)/√2 = 505
7    (3.33)(216.41)/√2 = 510
2.3.4. Compute for the difference between the largest treatment mean and the
largest Rp value and declare all treatment means less than the computed
difference as significantly different from the highest treatment mean.
2612 – 510 = 2102 kg/ha
2.3.5 Next, compute the range between the remaining treatment means, i.e.,
those means which are larger than or equal to the difference between
the largest means and the largest Rp value, and compare this range with
the value of Rp for the group of treatments being compared. If the
computed range is smaller than the corresponding Rp value, all the
treatment means in the group are declared not significantly different
from each other. In the table of means in 2.3.1, T6, T5 and T7 are less
than the computed difference of 2102 kg/ha, hence, they are declared
significantly different from T2.
2.3.6 With the 4 remaining treatments (T2, T3, T4 and T1), whose values are
larger than 2102 kg/ha, compute the range as 2612 – 2243 = 369 kg/ha.
This range is less than the Rp value of 487 at p = 4, hence T2, T3, T4 and T1 are not
significantly different from each other. Either a vertical line is drawn
connecting the 4 means or a common letter is written to the right of
them.
Treatment Mean Yield (kg/ha)
T2 2612
T3 2543
T4 2365
T1 2243
T6 1713
T5 1699
T7 1392
2.3.7 Compute the difference between the second largest mean and the
second largest Rp value and declare all treatment means less than this
difference as significantly different from the second largest mean. For the
remaining treatment means which are larger than or equal to the
computed difference, compute the range and declare all treatments not
significantly different from each other if the range is smaller than the
corresponding Rp value. The second largest mean is T3 (2543kg/ha)
2543 – 505 = 2038 kg/ha.
Since T5, T6, and T7 are less than 2038 kg/ha, they are declared
significantly different from T3
2.3.8 Continue the process with the 3rd largest mean and the 3rd largest Rp
value. In the example, only T6, T5 and T7 are not compared with each
other. We then compute for the difference between each of them and
compare their differences with the appropriate Rp values as follows:
T6 vs T5 = 1713 – 1699 = 14 < 450 (at p = 2)
T6 vs T7 = 1713 – 1392 = 321 < 473 (at p = 3)
T5 vs T7 = 1699 – 1392 = 307 < 450 (at p = 2)
Since all the computed differences are less than the corresponding Rp
values, these treatments namely, T6, T5 and T7 are not significantly
different from each other. A vertical line is drawn to connect these
treatments.
Treatment Mean Yield (kg/ha)
T2 2612
T3 2543
T4 2365
T1 2243
T6 1713
T5 1699
T7 1392
Comparison of means is now complete. Instead of vertical lines, a more
convenient way is to put letters to the right of the means to designate the
comparisons. In this way, means followed by the same letter are not significantly
different.
Treatment Mean Yield (kg/ha)
T1 2243 a
T2 2612 a
T3 2543 a
T4 2365 a
T5 1699 b
T6 1713 b
T7 1392 b
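The DMRT computations in this example can be sketched in Python as follows. This is a simplified illustration, not the full stepwise procedure: it computes the Rp values from the error mean square (93665), r = 4 and the rp values listed earlier, and then tests any pair of ranked means against the Rp value for the number of means spanned by the pair. The variable and function names are illustrative.

```python
import math

error_ms = 93665    # error mean square used in the example
r = 4               # number of replications
rp_05 = {2: 2.94, 3: 3.09, 4: 3.18, 5: 3.24, 6: 3.30, 7: 3.33}
means = {"T1": 2243, "T2": 2612, "T3": 2543, "T4": 2365,
         "T5": 1699, "T6": 1713, "T7": 1392}

# Standard error of the mean difference, then the shortest significant ranges.
sd = math.sqrt(2 * error_ms / r)
Rp = {p: rp * sd / math.sqrt(2) for p, rp in rp_05.items()}

# Rank the treatment means in descending order (rank 1 = highest yield).
ranked = sorted(means.items(), key=lambda item: item[1], reverse=True)


def is_significant(i, j):
    """Compare the means at positions i and j of the ranking (0-based, i < j)."""
    p = j - i + 1    # number of means spanned by the pair
    return ranked[i][1] - ranked[j][1] > Rp[p]


print({p: round(value) for p, value in Rp.items()})
print("T2 vs T1:", is_significant(0, 3))
print("T1 vs T6:", is_significant(3, 4))
print("T6 vs T7:", is_significant(4, 6))
```

The pairwise check agrees with the letter grouping above: T2, T3, T4 and T1 form one group, and T6, T5 and T7 form another.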
Learning Activities
Exercise 1. Perform treatment means comparison on the following data using DMRT.
Use the Rp values and the error mean square in the previous example.
Table 6. Grain Yield of 8 Rice Varieties
Treatment Grain Yield (kg/ha)
V1 4612
V2 5425
V3 4314
V4 5892
V5 3894
V6 4863
V7 6124
V8 4325
b. Simple linear correlation
Overview
This module deals with the determination and interpretation of relationships among
variables in an agricultural experiment. It is assumed, however, that the relationship is
linear for the following statistical tools to be applicable.
Learning Outcome
The student should be able to properly apply the methods of detecting the presence or
degree of relationships or associations among variables of interest and be able to
interpret the extent of such association.
LEARNING CONTENT
3.1 Features
Regression analysis describes the effect of one or more variables (designated as
independent variables) on a single variable (designated as the dependent variable) by
expressing the latter as a function of the former. In this analysis it is important to clearly
distinguish between the dependent and the independent variable. For example, in an
experiment on yield response to nitrogen, it is obvious that the dependent variable is
yield. Generally, the character of higher importance becomes the dependent variable.
The regression and correlation analysis is called simple if only two variables are involved,
and multiple, if more than two variables are involved.
dependent variable Y and the independent variable X is represented by the following
equation:
Y = α + βX
Where:
α is the intercept of the line on the Y axis
β is the regression coefficient, that is, the slope of the line or the amount
of change in Y for each unit change in X.
Step 1. Compute for the means, the corrected sums of squares, and the corrected sum
of cross products between x and y (results are presented in Table 6 above) as follows:
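$$\bar{X} = \frac{\sum X}{n}$$
$$\bar{Y} = \frac{\sum Y}{n}$$
$$\sum x^2 = \sum (X_i - \bar{X})^2$$
$$b = \frac{\sum xy}{\sum x^2}, \qquad a = \bar{Y} - b\bar{X}$$
From Table 7,
$$b = \frac{216874}{8300} = 26.13$$
$$a = 5041.75 - (26.13)(75) = 3082$$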
Step 3. Draw a graphical illustration of the regression equation using the data in Table 6
Y = 3082 + 26.13X
[Figure: graph of the fitted regression line; horizontal axis from 0 to 140, vertical axis from 0 to 7000]
∑x2
Compare the tb value to the tabular t values of Appendix C (Gomez and Gomez) with (n –
2) degrees of freedom.
In our example, the linear response of yield to changes in the rate of nitrogen within 0
to 150kg/ha is significant at 5% level of significance.
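The least-squares estimates used in this section (b as the corrected sum of cross products over the corrected sum of squares of X, and a as the mean of Y minus b times the mean of X) can be sketched in Python as follows. The function name and the small data set are hypothetical; the last lines only re-check the arithmetic of the worked example.

```python
def linear_regression(x, y):
    """Return (a, b) for the fitted line Y = a + bX."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    # Corrected sum of squares of X and corrected sum of cross products.
    sum_x2 = sum((xi - x_bar) ** 2 for xi in x)
    sum_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    b = sum_xy / sum_x2
    a = y_bar - b * x_bar
    return a, b


# Hypothetical data lying exactly on the line Y = 1 + 2X.
a, b = linear_regression([0, 1, 2, 3], [1, 3, 5, 7])
print(a, b)

# Arithmetic of the worked example, using the sums quoted in the text.
b_example = round(216874 / 8300, 2)
a_example = 5041.75 - b_example * 75
print(b_example, round(a_example))
```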
3.4 Simple Linear Correlation
The simple linear correlation coefficient r is a measure of the degree of linear
association between two variables X and Y. In this analysis, there is no assignment of
independent and dependent variable, unlike in regression analysis.
The value of r ranges from -1 to +1, with the extreme values (-1 and +1) indicating a perfect
linear relationship while a value of zero indicates no linear association between the two
variables. A zero value does not mean, however, that there is no relationship between the
variables under consideration. There might be a relationship that is not linear.
For an r value of 0.8, for example, 64% ((100)(r²)) of the variation in Y is accounted for by the
linear function of X.
The formula for the simple linear correlation coefficient is
$$r = \frac{\sum xy}{\sqrt{(\sum x^2)(\sum y^2)}}$$
Procedure:
1. Compute for the means (Table 7), corrected sums of squares, and corrected
sums of cross products of the two variables.
2. Compute for the simple linear correlation coefficient r using the formula above
3. Compare the absolute value of the computed r to the tabular value (Appendix H)
with (n - 2) degrees of freedom (0.950 at the 5% level of significance, and 0.990 at
1%). Assuming the computed r is 0.985, 97% ((100)(0.985)²) of the variation in the
mean yield is accounted for by the linear function of nitrogen rates.
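The procedure above can be sketched in Python as follows. The function name and the small data sets are hypothetical; the tabular r value for the test of significance still has to be read from the appendix for (n – 2) degrees of freedom.

```python
import math


def correlation(x, y):
    """Simple linear correlation coefficient r between x and y."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    # Corrected sums of squares and corrected sum of cross products.
    sum_x2 = sum((xi - x_bar) ** 2 for xi in x)
    sum_y2 = sum((yi - y_bar) ** 2 for yi in y)
    sum_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    return sum_xy / math.sqrt(sum_x2 * sum_y2)


# Hypothetical data with a perfect positive and a perfect negative relationship.
print(correlation([1, 2, 3], [2, 4, 6]))
print(correlation([1, 2, 3], [6, 4, 2]))

# Percentage of the variation in Y accounted for by the linear function of X.
r_value = 0.8
print(100 * r_value ** 2)
```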
Learning Activities
Exercise 1. Using the data in Table 7 compute for the correlation coefficient and
interpret the results. Compute also for the correlation coefficient r using the data shown
in exercise 7.
x y x2 y2 (x)(y)
20 4352
25 5344
35 6455
40 6321
Total
Mean
Module 8. Computation of Missing Data
Overview
The module deals with the approaches in dealing with lost data without compromising
the reliability of an agricultural experiment as well as their applicability and limitations.
Learning Outcome
The student is expected to appropriately employ statistical remedies to deal with lost or
missing data in the light of the objective of arriving at conclusive findings in an
experiment without sacrificing reliability
LEARNING CONTENT
$$X = \frac{rB_0 + tT_0 - G_0}{(r-1)(t-1)}$$
Where:
r = number of replications
t = number of treatments
B0 = Replication Total containing the missing observation
T0 = Treatment Total containing the missing observation
G0 = Grand Total of all observations
Step 2. Replace the missing data in Table 7 and compute for the replication, treatment,
and grand totals and then proceed with the analysis of variance. Subtract one from both
the error df and the total df. In Table 7, the total df of 23 would become 22 and the
error df of 15 becomes 14.
Step 3. Compute for the correction factor for bias B:
$$B = \frac{[B_0 - (t-1)X]^2}{t(t-1)}$$
Step 4. Subtract the computed B from the TrSS and the TSS.
Step 5. Construct the Analysis of Variance for RCBD as shown:
Table 7.1 Analysis of Variance of Data from Table 7 with One Missing Data
SV           DF                           SS    MS    Fcomp    Ftab 5%    Ftab 1%
Rep          (r – 1) = 3
Treatment    (t – 1) = 5
Error        (r – 1)(t – 1) – 1 = 14
TOTAL        rt – 1 – 1 = 22
Step 6. For pair comparison involving the treatment with the missing observation, compute
for the standard error of the mean difference sd:
$$s_d = \sqrt{s^2\left[\frac{2}{r} + \frac{t}{r(r-1)(t-1)}\right]}$$
Step 7. The computed sd value can be used for the computation of the LSD or DMRT
values to compare the treatment means, where
$$LSD_\alpha = (t_\alpha)(s_d)$$
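The missing-value estimate, the bias correction and the standard error of the mean difference above can be sketched in Python as follows. The function names and all input values are hypothetical. The first check uses a small additive table (3 treatments by 3 replications with one cell removed), for which the formula recovers the removed value exactly.

```python
import math


def estimate_missing(r, t, b0, t0, g0):
    """Estimate one missing observation in an RCB design.

    b0, t0 and g0 are the replication, treatment and grand totals
    computed from the available observations only.
    """
    return (r * b0 + t * t0 - g0) / ((r - 1) * (t - 1))


def bias_correction(b0, t, x):
    """Correction factor for bias, subtracted from the TrSS and the TSS."""
    return (b0 - (t - 1) * x) ** 2 / (t * (t - 1))


def sd_with_missing(error_ms, r, t):
    """Standard error of the mean difference when one of the two
    treatments is the one with the missing observation."""
    return math.sqrt(error_ms * (2 / r + t / (r * (r - 1) * (t - 1))))


# Hypothetical additive table with rows [0, 1, 2], [10, ?, 12], [20, 21, 22];
# the removed value was 11.
x = estimate_missing(r=3, t=3, b0=1 + 21, t0=10 + 12, g0=88)
print(x)
print(bias_correction(b0=22, t=3, x=x))

# Hypothetical error mean square of 1.0 with r = 4 and t = 6.
print(sd_with_missing(1.0, 4, 6))
```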
To illustrate the procedure, Table 8 below is assumed to have one missing observation (X), as
shown:
Table 8. Data from an RCB Experiment with One Missing Data
Treatment              Grain Yield (kg/ha)
N rates (kg/ha)    Rep I     Rep II    Rep III   Rep IV    Treatment Total
60                 5235      5342      5309      4632
75                 5322      5679      4763      4253
90                 5277      5724      5495      4735
105                5155      X         4968      4421      (T0)
120                4801      4563      4425      4763
Rep Total                    (B0)
Grand Total (G0)
Learning Activities
Exercise 1. Compute for the missing data in Table 8 and then proceed with the analysis
of variance as well as treatment mean comparison using LSD Test.
Overview
A presentation on valid sampling techniques in agricultural experiments. It is designed
to guide students in proper sampling techniques in order to obtain valid experimental
data.
Learning Outcome
The student is expected to learn different sampling procedures in various agricultural
experiments as bases in determining the size of experimental plots
LEARNING CONTENT
Introduction
For an experiment to be valid and reliable, a proper plot size is required based on
the character of interest. Characters like grain yield, stover yield of forage crops, or cane
yield of sugarcane require a larger plot size because of the difficulty of making
measurements compared to other characters. For example, plant height can be
measured from 10 representative plants out of 200 in a plot; for tiller number, 1 sq.m. can
be counted in a 15 sq.m. plot.
Sampling Unit
A sampling unit is the unit on which the actual measurement is made. Considering each
plot as a population, the sampling unit should be smaller than a plot. Some of the
commonly used sampling units are a leaf, a plant, a group of plants, or a unit area. The
proper sampling unit differs among crops, among characters to be measured, and
among cultural practices.
3. High precision and low cost – Achieved when the variability among sampling
units within a plot is kept small. In transplanted rice for example, the variation
between single-hill sampling units for tiller count is much larger than that for
plant height.
Sample Size
The number of sampling units taken from the population is the sample size. In a
replicated field trial where each plot is a population, the sample size could be the number of
plants per plot used in measuring plant height, the number of leaves per plot used for
measuring leaf area, or the number of hills per plot used for counting tillers.
The sample size is governed by:
• The size of the variability among sampling units within the same plot (sampling
variance)
• The degree of precision desired for the character of interest
The required sample size can be determined based on the prescribed margin of error of
the plot mean or the treatment mean.
For example, the researcher can prescribe that the sample estimate should not deviate
from the true value by more than 5% or 10%.
To measure the number of panicles per hill in transplanted rice plots with a single hill as
the sampling unit, the researcher estimates the variance in panicle number between
individual hills within the same plot (vs) as 5.0429, or a cv of 28.4% based on the average
number of panicles per hill of 17.8. He prescribes that the estimate of the plot mean should
be within 8% of the true value. At 5% level of significance the sample size is
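A minimal Python sketch of this computation follows. It assumes the plot-mean formula from Gomez and Gomez, n = (Zα² · vs) / (d² · X̄²), where d is the prescribed margin of error as a fraction of the mean, and it assumes Zα = 1.96 for the 5% level of significance. The function name is hypothetical.

```python
import math


def required_sample_size(z_alpha, sampling_variance, margin, mean):
    """Sample size per plot for a prescribed margin of error of the plot mean.

    margin is expressed as a fraction of the mean (0.08 for 8%).
    """
    return (z_alpha ** 2) * sampling_variance / ((margin ** 2) * (mean ** 2))


# Values from the panicle-number example; 1.96 is the assumed Z value at 5%.
n = required_sample_size(z_alpha=1.96, sampling_variance=5.0429,
                         margin=0.08, mean=17.8)
# The computed value is rounded up to the next whole sampling unit.
print(round(n, 2), math.ceil(n))
```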
Sampling Design
Example:
Plant height of maize is to be measured in each plot consisting of 200 hills, using
6 single-hill sampling units.
In the above example, plant numbers 78, 17, 3, 173, 133 and 98 are the plants on which
plant height is to be measured.
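A simple random sample of hills like the one in this example can also be drawn with a random number generator instead of a table of random numbers. The Python sketch below is only an illustration; the function name and the optional seed are assumptions, and each run without a seed gives a different set of hills.

```python
import random


def choose_sampling_units(total_hills, sample_size, seed=None):
    """Draw sample_size distinct hill numbers from 1 to total_hills."""
    rng = random.Random(seed)
    # random.sample draws without replacement, so no hill is chosen twice.
    return rng.sample(range(1, total_hills + 1), sample_size)


# Six single-hill sampling units from a plot of 200 hills.
print(choose_sampling_units(200, 6))
```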
Learning Activities
Exercise 1. Make a random sampling for the measurement of number of productive
tillers in transplanted rice using simple random sampling technique assuming there are
150 hills in a plot.
Exercise 2. Compute for the sample size if d = 0.5, Mean (X) = 15, Zα = 2.05 and vs is
3.05. Show complete solutions.
References
Gomez, K & Gomez, A. Statistical Procedures for Agricultural Research. 2nd Edition. 1984
Steel, R.G.D. & Torrie, J.H. Principles and Procedures of Statistics. 1960
ACKNOWLEDGMENT
The author would like to thank his colleagues at the College of Agriculture, Isabela State
University, Echague, Isabela for their logistical support in the preparation and printing of this
Module. The support of the University in providing the privilege and opportunity to work at
home for the preparation of this module is also acknowledged. Lastly, heartfelt gratitude is
hereby expressed to the authors of the book “Statistical Procedures for Agricultural Research”
which is used as the primary reference in this Module.
This module is not perfect. Corrections, comments and suggestions for its improvement are
favorably accepted and thankfully acknowledged.
The author finished his B.S. in Agricultural Engineering at the Isabela State University, Echague,
Isabela with highest academic award, M.S. in Crop Science (Plant Breeding), and PhD in Plant
Breeding at the Central Luzon State University, Science City of Muñoz, Nueva Ecija. He passed
the Licensure Examination for Agricultural Engineers on first attempt. Before his involvement
with the ISU – Echague Campus as a faculty member at the College of Agriculture, he worked in
different private agricultural companies for 30 years here and abroad, and for the last 24 years
he continued his research work in rice as a breeder for irrigated lowland ecosystem.