Livestock Production Science 84 (2003) 63–73
www.elsevier.com/locate/livprodsci
Model comparison for genetic evaluation of milk yield in
Uruguayan Holsteins
J.I. Urioste a,*, R. Rekaya b,1, D. Gianola b, W.F. Fikse c, K.A. Weigel b
a Facultad de Agronomía, Universidad de la República, Av. Garzón 780, 12900 Montevideo, Uruguay
b Department of Dairy Science, University of Wisconsin, Madison, WI 53706, USA
c INTERBULL Centre, SLU, Box 7023, S750 07 Uppsala, Sweden
Received 23 May 2002; received in revised form 31 December 2002; accepted 7 March 2003
Abstract
Three models for genetic evaluation of milk yield of Uruguayan Holstein cattle were compared using 159 169 lactation
records from 81 928 cows calving between 1989 and 1998. Model I included the effects of herd-year-season, parity by age
group, additive genetic merit, permanent environment, and residual. Model II included all effects in Model I, as well as
number of days open and length of the dry period. Model III included all factors in Model II, and it accommodated
heterogeneity of variance within contemporary groups (CG) through a pre-adjustment of the data based on empirical Bayes
estimates of the CG variance. Estimates of heritability for milk yield were 0.23, 0.24 and 0.25, and estimates of repeatability
were 0.55, 0.56 and 0.57 for Models I, II and III, respectively. Models were contrasted by examining changes in sire ranking
and by a cross-validation procedure, based on the ability of the models to predict first, second and later lactations. Data were
divided into two subsets, and records from one subset were predicted using location parameters estimated from the other
subset. A resampling procedure was used to minimise the dependency on the sample structure. Correspondence between
observed and predicted values was assessed in terms of square root of empirical mean square errors of prediction, percentage
squared bias and the coefficient of determination ‘R²’. Adjustment for heterogeneous CG variance had a marked effect on
rankings of animals, especially elite cows, where correlations between solutions from Models I and II versus Model III
ranged from 0.53 to 0.80. The percentage of animals selected in common by each pair of models decreased when selection
intensity increased. Cross-validation analyses suggested that an assumption of heterogeneity of CG variance is tenable,
especially in later lactations, whereas some doubts arise in first lactations, most probably due to the data structure used in the
analyses.
© 2003 Elsevier B.V. All rights reserved.
Keywords: Dairy cattle; Milk yield; Heterogeneous variance; Genetic evaluation; Predictive ability
*Corresponding author. Tel.: +598-2-355-9636; fax: +598-2359-3004.
E-mail address: jorgeu@internet.com.uy (J.I. Urioste).
1 Present address: Department of Animal and Dairy Science, University of Georgia, Athens, GA 30603-2771, USA.
0301-6226/03/$ – see front matter © 2003 Elsevier B.V. All rights reserved.
doi:10.1016/S0301-6226(03)00051-4
1. Introduction
Best linear unbiased prediction (Henderson, 1973,
1984) has become the standard method for inferring
breeding values, especially in dairy cattle. A large
number of countries employ an animal model (INTERBULL, 2000), where an additive genetic effect
is fitted for each animal in the pedigree. Systematic
environmental effects, such as age and parity of cow,
duration of the dry period, number of days open and
length of the previous and current calving intervals
are often included as explanatory variables in the
models for genetic evaluation of milk yield (INTERBULL, 2000). The first applications of mixed linear
models for genetic evaluation of dairy cattle assumed
constant genetic and residual components of variance
across environments. However, considerable evidence has accumulated that there exists heterogeneity
of variance for milk yield (Hill et al., 1983; Brotherstone and Hill, 1986; Ibáñez et al., 1996, 1999;
Dodenhoff and Swalve, 1998). Ignoring such heterogeneity may lead to imprecise or biased predictions
of genetic merit. Most countries now account for
heterogeneity of variance in their national evaluation
programs (e.g., Brotherstone and Hill, 1986; Jones
and Goddard, 1990; Meuwissen et al., 1996; Ibáñez
et al., 1996; Wiggans and Van Raden, 1991; Robert-Granié et al., 1999). In several countries, a simple
pre-adjustment of records is made, prior to fitting the
animal model, in an attempt to reduce within-herd
heterogeneous phenotypic variance.
Across-herd genetic evaluation of dairy cattle in
Uruguay started in 1992, and the system evolved
rapidly towards a BLUP animal model for repeated
lactations, assuming homogeneous variance. For
computational reasons, and due to a lack of accurate
estimates of variance parameters, the model has been
kept as simple as possible. It is of interest to progress
toward more biologically plausible models, such as
those that consider heterogeneous variance and that
include effects of reproduction on milk yield. Reproductive and management variables such as calving interval, length of dry period and number of days
open have been shown to influence milk yield
(Schaeffer and Henderson, 1972; Funk et al., 1987;
Sadek and Freeman, 1992; Lee et al., 1997; Berger
and Lista, 1999).
Studies on model performance based on goodness
of fit or predictive ability are scarce in the literature.
Pérez-Enciso et al. (1993) compared linear and
Poisson mixed models for litter size in pigs. These
authors split the data into two subsets, and predicted
records in one of the subsets using location parameters estimated from the complementary subset. Estany and Sorensen (1995) in pigs, Olesen et al.
(1994) in sheep, and Ibáñez et al. (1999) and
Tempelman and Gianola (1999) in dairy cattle used
similar procedures. Once a model with reasonable
goodness of fit and predictive ability is found, it is
desirable to obtain precise estimates of genetic
parameters for this model.
The objective of this study was to estimate genetic
parameters for three alternative models
for genetic evaluation of milk yield in the Uruguayan
Holstein population, and to compare the models on their ability
to predict first, second and later lactations. The first model
was the repeatability animal model that is currently
used in practice. The second model included the
effects of reproductive and management variables on
milk yield, and the third model was the same as the second
but with an adjustment for heterogeneous
phenotypic variance.
2. Material and methods
2.1. Data
Milk yield records (305-day) from Holstein cows
calving between 1989 and 1998 that had been
included in the 1999 Uruguayan national genetic
evaluation were used in the analysis. Incomplete
records are routinely extended to 305-day lactation
records by the last test-day method, with extension
factors calculated by age, parity and month of
calving. Production data and pedigree were provided
by the Asociación Rural del Uruguay (ARU) and by the
Instituto Nacional para el Mejoramiento Lechero
(INML). A 305-day lactation record was included in
the analysis if it had at least four test-day yields and
if the cow had at least one test day beyond 150 days
in milk. Records for parities 3 through 5 were
included, provided the cow’s records for the preceding lactations qualified for inclusion in the data set.
Winter–Spring (June–November) and Summer–Autumn (December–May) calving seasons were defined. Herd-year-season contemporary groups (HYS)
were formed by combining herd, year and season of
calving classes, and only HYS-classes with more
than five records were retained. Extended records
were given the same weight as completed
records. After edits, 159 169 lactations from 81 928
cows in 3600 HYS were available for analysis.
Additional information is in Urioste et al. (2001).
The pedigree file included 99 192 animals, of which
82.6% had milk records and 64.3% had at least one
known parent.
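The season definition and the contemporary-group edit described above can be illustrated with a minimal Python sketch. The record layout (a dict with `herd` and `calving` keys) and the function names are hypothetical, and the use of the calendar year of calving as the "year" of the HYS class is an assumption, since the text does not say how December calvings were assigned.

```python
from collections import Counter
from datetime import date


def calving_season(calving: date) -> str:
    """Winter-Spring is June-November; Summer-Autumn is December-May."""
    return "WS" if 6 <= calving.month <= 11 else "SA"


def hys_key(herd: str, calving: date) -> tuple:
    """Herd-year-season class (calendar year of calving is an assumption)."""
    return (herd, calving.year, calving_season(calving))


def retain_large_hys(records: list, min_records: int = 6) -> list:
    """Keep only records whose HYS class holds more than five records."""
    counts = Counter(hys_key(r["herd"], r["calving"]) for r in records)
    return [
        r for r in records
        if counts[hys_key(r["herd"], r["calving"])] >= min_records
    ]
```

Counting classes first and filtering afterwards keeps the edit to two passes over the data, whatever the number of herds.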
2.2. Models
Three alternative models were compared. Model I
is the model currently used in the Uruguayan genetic
evaluation, and Model II included additional information on explanatory variables related to reproduction. Model III added an adjustment
for heterogeneous variance in addition to the reproductive variables. The specification for Model I
was:
\[ y_{ijkl} = \mathrm{HYS}_i + L_j + a_k + pe_k + e_{ijkl} \tag{1} \]
where $y_{ijkl}$ = 305-day milk yield of cow k in lactation
l, lactation-age class j and HYS i; $\mathrm{HYS}_i$ = fixed effect
of herd-year-season i (i = 1, 2, ..., 3600); $L_j$ = fixed
effect of the jth combination of lactation and age
class (j = 1, 2, ..., 26); $a_k$ = random additive genetic
effect of animal k (k = 1, 2, ..., 99 192); $pe_k$ = random
permanent environmental effect of cow k (k =
1, 2, ..., 81 928); $e_{ijkl}$ = random residual term.
The vector of additive genetic effects was assumed
to have the distribution $a \sim N(0, A\sigma^2_a)$, where 0 is a
vector of population means, A is a known matrix of
additive genetic relationships between animals and
$\sigma^2_a$ is the additive genetic variance. The permanent
environmental effects were taken to be independent
and identically distributed, with mean 0 and variance
$\sigma^2_{pe}$. The random residuals were independently distributed, each with mean 0 and variance $\sigma^2_e$. Additive
genetic, permanent and residual effects were mutually independent. A description of the age-parity
classes is given in Table 1.
Dry-off and calving dates are reported routinely in
the Uruguayan recording schemes. To identify information regarding the impact of reproductive measures, preliminary analyses were performed using a
set of fixed models. These included all fixed terms in
Model I plus the effects of length of previous calving
interval (CI), number of days open (DO) and length
of previous dry period (DP) or previous days in milk
(DIM). There was a part-whole relationship between
CI, DP and DIM, and these variables were consequently measuring the same biological trait. It was
decided to keep DP only. Further, because DO in the
current lactation is still not recorded routinely in
Uruguay, it was approximated as the difference
between current CI (computed from current and next
calving date) and a fixed gestation length (280 days),
to account for effects of pregnancy on current
lactation yield.
Table 1
Parity-age classes and age ranges within parity
Parity  No. of levels  No. of lactations  Range of ages (months)
1       5              57 111             22–53
2       7              46 760             30–71
3       6              29 859             48–93
4       6              16 694             54–180
5       2              8745               78–180
Final classes for DO and DP, and the number of
observations per class are in Table 2. Initially, a class
consisted of a 5-day interval, and adjacent classes
were combined when differences between levels
were not significant. Missing observations for DO or
DP were grouped into a specific class. A model
including both DO and DP accounted for little
additional variation (R² of 0.51) as compared with a
model including DP only (R² of 0.50). However, DO
constitutes a different source of biological variability, so it was decided to include both variables.
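The approximation of days open and its grouping into classes can be sketched as follows. This is a minimal illustration, not the authors' program: the function names are hypothetical, the class bounds are those listed for days open in Table 2, and sending negative or out-of-range values to the missing class is an assumption.

```python
from datetime import date
from typing import Optional

GESTATION_DAYS = 280  # fixed gestation length stated in the text

# Upper bounds (days) of days-open classes 1 to 6, as listed in Table 2.
DO_UPPER_BOUNDS = [40, 65, 120, 150, 360, 600]
MISSING_CLASS = 7  # missing observations form a class of their own


def days_open(current_calving: date, next_calving: Optional[date]) -> Optional[int]:
    """Approximate DO as the current calving interval minus gestation length."""
    if next_calving is None:
        return None
    return (next_calving - current_calving).days - GESTATION_DAYS


def do_class(do: Optional[int]) -> int:
    """Map days open to its class; out-of-range values count as missing (assumption)."""
    if do is None or do < 0:
        return MISSING_CLASS
    for cls, upper in enumerate(DO_UPPER_BOUNDS, start=1):
        if do <= upper:
            return cls
    return MISSING_CLASS
```

For a cow calving on 1 January 1995 and again on 1 January 1996, the calving interval is 365 days, so `days_open` gives 85 days and `do_class` places the record in class 3 (66–120 days).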
Model III had the same specification as Model II,
but heterogeneity of variance within contemporary
groups (CG) was accommodated through a pre-adjustment of records, using an empirical Bayes estimator of the CG variance (Urioste et al., 2001).
Briefly, herd-year-season levels were subdivided
into herd-year-season-parity-lactation length CG
classes, and the usual estimate of variance within a
CG ($\hat{\sigma}^2_i$, where i = 1, 2, ..., 8955) was obtained. A
fixed linear model including effects of herd, calving
period, season, production level, contemporary group
size, milk recording system, parity number and
length of lactation was fitted to the logarithm of
these estimated variances. Predicted values of variances using this model are represented as $e^{k'_i \gamma^0}$;
here, $k'_i$ is a suitable incidence vector and $\gamma^0$ is a
solution for the parameters of the model. Combining
$\hat{\sigma}^2_i$ with $e^{k'_i \gamma^0}$, an empirical Bayes estimator of the
posterior mean of the variance within the ith CG is:
\[ \tilde{\sigma}^2_i(\gamma^0, \nu) = e^{k'_i \gamma^0} + \frac{\nu_i}{\nu^*_i} \left( \hat{\sigma}^2_i - e^{k'_i \gamma^0} \right) \]
where $\nu$ = 8955 − rank(model for log-variances) is a
degree of belief parameter, $\nu_i = n_i - 1$ ($n_i$ is the
number of records for CG i), and $\nu^*_i = \nu + \nu_i$. Records were then adjusted as:
\[ y^C_{ij} = \hat{\mu}_i + \frac{y_{ij} - \hat{\mu}_i}{\tilde{\sigma}_i(\gamma^0, \nu)} \, \sigma_{base} \]
where $\sigma_{base}$ = 744.3 kg, based on preliminary analyses of the data.
Table 2
Classes for length of dry period and number of days open
Class  Dry period (days)  No. of lactations  Days open (days)  No. of lactations
1      0–20               2050               0–40              2633
2      21–30              1914               41–65             9703
3      31–40              3662               66–120            31 629
4      41–45              2696               121–150           10 383
5      46–55              7659               151–360           24 253
6      56–70              15 433             361–600           23 496
7      71–80              19 429             Missing data      77 072
8      81–90              17 461
9      91–110             10 750
10     111–130            16 373
11     131–150            13 853
12     151–360            17 757
13     Missing data       80 132
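The empirical Bayes estimator and the record adjustment described for Model III can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions: the function names are hypothetical, the predicted log-variance is taken as given (the fixed model for log-variances is not fitted here), and the numbers in the usage note are invented for illustration.

```python
import math


def eb_variance(sample_var: float, predicted_log_var: float, n_i: int, nu: float) -> float:
    """Shrink the within-CG sample variance toward the model prediction.

    predicted_log_var is the fitted log-variance for the CG, n_i is the
    number of records in the CG and nu is the degree of belief parameter.
    """
    prior = math.exp(predicted_log_var)
    nu_i = n_i - 1
    weight = nu_i / (nu + nu_i)  # nu_i / nu*_i
    return prior + weight * (sample_var - prior)


def adjust_record(y: float, cg_mean: float, eb_var: float, sigma_base: float = 744.3) -> float:
    """Rescale the deviation from the CG mean to the base standard deviation."""
    return cg_mean + (y - cg_mean) / math.sqrt(eb_var) * sigma_base
```

With a predicted variance of 500 000 kg², a sample variance of 900 000 kg², 11 records and a hypothetical degree of belief of 10, the weight is 10/20 and the shrunken variance is 700 000 kg². A contemporary group with a single record has a weight of zero, so its variance is taken entirely from the model prediction.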
In each of the three models, additive genetic,
permanent environmental and residual variances,
heritability and repeatability were estimated via
restricted maximum likelihood using the VCE software (Neumaier and Groeneveldt, 1998). Conditionally on these estimates, single trait BLUP breeding
values and estimates of fixed effects were computed
using the program JAA20 developed by Dr. Ignacy
Misztal.
2.3. Comparison between models
Models were contrasted by examining changes in
the predicted breeding values and by a cross-validation procedure with re-sampling. Rank correlations and regressions between predictions of breeding values were computed for bulls and cows at
different hypothetical percentages of animals selected. The cross-validation procedure focused on the
ability of the models to predict first, second and later
lactation phenotypic records. The objective of the
re-sampling was to reduce the dependency of several
end-points examined on the specific structure of the
sample to be predicted.
To illustrate, consider prediction of first lactation
records. These were divided into two randomly
created sets, each containing approximately 50% of
the records, and such that all fixed effects levels
were present in both sets. One of the two sets (which
we will refer to as ‘training set’) was chosen at
random and merged with the data on second and
later lactations to obtain BLUP of breeding values
and BLUE of fixed effects. The breeding values and
fixed effects were calculated only once, and these
were used to predict the appropriate lactation records
in the other set, the ‘prediction set’, which initially
included all first lactation records. This process was
repeated 1000 times, such that the ‘prediction set’
varied at random (a new sample of records was taken
each time) and contained, on average, 50% of
records coming from the ‘training set’.
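The re-sampling scheme can be sketched as below. This is only an illustration under stated assumptions: the function names are hypothetical, each replicate is assumed to draw half of the first lactation records (the text does not give the sample size), and the constraint that all fixed effect levels appear in both halves is not enforced.

```python
import random


def split_half(record_ids, rng: random.Random):
    """Randomly split record identifiers into a training half and the rest."""
    ids = list(record_ids)
    rng.shuffle(ids)
    half = len(ids) // 2
    return set(ids[:half]), set(ids[half:])


def draw_replicate(record_ids, training: set, rng: random.Random):
    """Draw one prediction set and split it by membership in the training set.

    Records also present in the training set measure goodness of fit;
    the remaining records measure predictive performance.
    """
    ids = list(record_ids)
    sample = rng.sample(ids, len(ids) // 2)  # sample size is an assumption
    fitted = [i for i in sample if i in training]
    predicted = [i for i in sample if i not in training]
    return fitted, predicted
```

Because the model solutions are computed once from the training set, only `draw_replicate` needs to be repeated for the 1000 replicates, which keeps the procedure cheap.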
The described procedure allowed computing two
measures of model quality for each ‘replication’.
One of the measures was goodness of fit, that is, the
ability of the model to reproduce an observation that
belongs to the ‘training set’ but that appears, with
50% chance, in the ‘prediction set’. The second
measure centered on predictive performance, or
ability of a model to predict an observation in the
‘prediction set’ that was not included in the ‘training
set’. Within each ‘replication’, for each lactation and
model, agreement between observed and fitted or
predicted values was assessed using the square root
of the empirical mean squared error statistic:
\[ \mathrm{SME} = \sqrt{ \frac{1}{n_S} \sum_{i \in S} (y_i - \hat{y}_i)^2 } \]
where $\hat{y}_i$ is the fitted or predicted value for $y_i$, $y_i$ is
the observed record, and S is the appropriate set of
observations. An additional end-point calculated was
the ‘percentage squared bias’ (PSB), proposed by Ali
and Schaeffer (1987), computed as:
\[ \mathrm{PSB} = 100 \, (y - \hat{y})'(y - \hat{y}) / y'y \]
where $y$ and $\hat{y}$ are observed and predicted values,
respectively. Finally, a simple measure of agreement
between predicted values and observations that had
been omitted in the prediction set, the determination
coefficient ‘R²’ (by analogy with the usual regression statistic), was calculated as:
\[ \text{‘}R^2\text{’} = \sum (y - \hat{y})^2 / \left( \sum y^2 - n\bar{y}^2 \right), \]
computed over members of the prediction set. For
each end-point, the mean and the standard deviations
were calculated over the 1000 replicates.
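The three end-points can be computed with a few lines of plain Python, shown here as a minimal sketch with hypothetical function names. One assumption should be noted: `r_squared` uses the conventional regression form, one minus the ratio of the error sum of squares to the corrected total sum of squares. The formula as printed gives only the ratio, but the values reported in Table 7 rise as SME falls, which matches the conventional form.

```python
import math


def sme(y, y_hat):
    """Square root of the empirical mean squared error."""
    sse = sum((a - b) ** 2 for a, b in zip(y, y_hat))
    return math.sqrt(sse / len(y))


def psb(y, y_hat):
    """Percentage squared bias: 100 (y - y_hat)'(y - y_hat) / y'y."""
    sse = sum((a - b) ** 2 for a, b in zip(y, y_hat))
    return 100.0 * sse / sum(a * a for a in y)


def r_squared(y, y_hat):
    """Coefficient of determination, 1 - SSE / SST (conventional form, an assumption)."""
    n = len(y)
    mean = sum(y) / n
    sse = sum((a - b) ** 2 for a, b in zip(y, y_hat))
    sst = sum(a * a for a in y) - n * mean ** 2
    return 1.0 - sse / sst
```

For observed yields of 4000, 5000 and 6000 kg predicted as 4100, 4900 and 6100 kg, every error is 100 kg, so SME is 100 kg, the corrected total sum of squares is 2 000 000 and the conventional R² is 0.985.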
3. Results and discussion
Average milk yield for the Uruguayan Holstein in
the sample was 4888 kg for a 305-day lactation, with
a standard deviation of 1144 kg, a minimum of 1544
kg and a maximum yield of 11 650 kg. Table 3
presents estimates of variance components,
heritability and repeatability for milk yield. Models
II and III gave slightly higher estimates of heritability and repeatability than Model I. Heritabilities were
in the lower range of estimates from the literature,
probably influenced by the weak structure of the
pedigree information: about 75% of the cows were
grade and reliable pedigree information is unavailable for a large percentage of animals. However, the
results are consistent with values obtained in populations with similar average levels of production.
Carabaño et al. (1989) found a heritability of 0.16
and an additive genetic variance of 108 608 kg² for
Spanish Holsteins producing 4982 kg milk, on
average. Stanton et al. (1991) estimated the variances between sires and heritabilities of milk yield
from the United States and three Latin American
countries. They found evidence of a lower additive
genetic variance in the latter. Heritability in the USA
was 0.25, whereas for the Latin American countries
it was 0.21. When milk records were adjusted to a
common standard deviation of 1000 kg, these authors found that heritability was 0.26 for the USA
population and 0.23 for the combined Latin American data. Hill et al. (1983) reported a heritability of
0.24 for British Friesian herds with low production
level. Estimates of genetic variance found by these
authors, however, were higher than those obtained
here. Estimates of the additive genetic, permanent environmental
and phenotypic variances were largest in Model III, whereas the
residual variance was similar to that of Model I. For example, the estimate of
additive genetic variance was about 15% larger in
Model III than in Models I or II. Increased heritability for Model III suggests that a failure to account
for heteroscedasticity between contemporary groups
may mask some of the existing genetic variation.
Table 3
REML estimates of genetic and phenotypic parameters, by model
($\sigma^2_a$ = additive genetic variance; $\sigma^2_{pe}$ = permanent environment
variance; $\sigma^2_e$ = residual variance; $\sigma^2_P$ = phenotypic variance)
                      Model I   Model II  Model III
$\sigma^2_a$ (kg²)    135 543   136 505   155 334
$\sigma^2_{pe}$ (kg²) 182 576   181 754   192 135
$\sigma^2_e$ (kg²)    265 355   253 633   264 242
$\sigma^2_P$ (kg²)    583 474   571 892   611 711
Heritability          0.23      0.24      0.25
Repeatability         0.55      0.56      0.57
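The heritability and repeatability in Table 3 follow from the variance components through the usual ratios for a repeatability model, $h^2 = \sigma^2_a / \sigma^2_P$ and $r = (\sigma^2_a + \sigma^2_{pe}) / \sigma^2_P$, with $\sigma^2_P$ the sum of the three components. The short Python sketch below recomputes them from the tabulated estimates; the function and variable names are hypothetical.

```python
def variance_ratios(var_a: float, var_pe: float, var_e: float) -> tuple:
    """Return phenotypic variance, heritability and repeatability."""
    var_p = var_a + var_pe + var_e
    return var_p, var_a / var_p, (var_a + var_pe) / var_p


# Additive genetic, permanent environmental and residual variances (kg^2), Table 3.
TABLE_3 = {
    "Model I": (135543, 182576, 265355),
    "Model II": (136505, 181754, 253633),
    "Model III": (155334, 192135, 264242),
}

for model, components in TABLE_3.items():
    var_p, h2, rep = variance_ratios(*components)
    print(f"{model}: phenotypic variance {var_p}, h2 {h2:.2f}, repeatability {rep:.2f}")
```

The same components give the genetic standard deviation as the square root of the additive genetic variance, roughly 368 kg in Model I and 394 kg in Model III.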
Conditional on the variance components estimated
(Table 3), solutions pertaining to the effects of age
(within lactation), number of days open, and length
of dry period on milk yield are presented in Fig. 1.
Lactation-age effects indicated that cows reached
mature production at the third lactation. Within
lactation, production increased with age at calving,
particularly in young cows. DO and DP both influenced yields, but their effects were smaller than
those reported in other studies (Schaeffer and Henderson, 1972; Funk et al., 1987; Sadek and Freeman,
1992; Lee et al., 1997). Milk production increased
with DO. Cows with DP less than 40 days produced
markedly less milk. Yield increased with DP up to
70–100 days and decreased thereafter. Schaeffer and
Henderson (1972) found an ‘optimum’ dry period of
50–59 days, whereas in the study of Funk et al.
(1987) maximum production was found at 60–69
days. In practice, differences are small within a range
of ±15 days. The estimates of the effects of DO and
DP on yield were similar for the two models (II and
III) where the reproductive information was included
in the explanatory structure.
Fig. 1. Systematic environmental effects affecting milk yield; (A) parity and age of calving; (B) number of days open (Model II: filled bars;
Model III: dotted bars); (C) length of previous dry period (Model II: filled bars; Model III: dotted bars).
Rank correlations between predicted
breeding values obtained with the three models are in
Table 4. For the entire data set, correlations were
0.98 or larger and were similar for bulls and cows.
For high selection intensity (0.1, 1 and 5% animals
hypothetically selected), correlations between
BLUPs from Models I and II were high. However,
rank correlations between evaluations from Models I
and II with those from Model III were much lower,
ranging from 0.63 to 0.88 for sires and 0.53 to 0.80
for cows. Clearly, the adjustment for heterogeneous
variance had a marked effect on rankings. A similar
picture was observed by Robert-Granié et al. (1999):
the overall correlation between evaluations from a
homogeneous and heterogeneous variance model was
high (0.98–0.99), but for elite cows the correlation
was 0.54–0.78, reflecting an important re-ranking in
this fraction of the population. This is possibly due
to the fact that a cow’s own records and those of her
dam and maternal relatives are usually made within a
single herd, a situation where the effects of heterogeneity of variance on genetic evaluation can be
especially important.
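The rank correlations within a selected fraction can be sketched as below: animals are first selected on the reference model (Model III in Table 4) and the rank correlation between the two sets of predictions is then computed among the selected animals only. This is a minimal sketch with hypothetical function names; the handling of ties by average ranks is an assumption.

```python
import math


def ranks(values):
    """Average ranks (1 = smallest); tied values share the mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = mean_rank
        i = j + 1
    return result


def rank_correlation(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    rx, ry = ranks(x), ranks(y)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sxx = sum((a - mx) ** 2 for a in rx)
    syy = sum((b - my) ** 2 for b in ry)
    return sxy / math.sqrt(sxx * syy)


def top_fraction(ebv, fraction):
    """Indices of the best `fraction` of animals by predicted breeding value."""
    n_sel = max(1, round(len(ebv) * fraction))
    return sorted(range(len(ebv)), key=lambda i: ebv[i], reverse=True)[:n_sel]


def top_rank_correlation(ebv_other, ebv_reference, fraction):
    """Rank correlation among animals selected on the reference model."""
    selected = top_fraction(ebv_reference, fraction)
    return rank_correlation(
        [ebv_other[i] for i in selected],
        [ebv_reference[i] for i in selected],
    )
```

Restricting the comparison to the selected animals is what lets the correlation fall well below the overall value: a shift of a few positions matters little among tens of thousands of cows but a great deal among the best hundred.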
Table 5 summarizes changes in absolute value of
the BLUP EBV for milk yield. Again, changes were
larger when the comparison involved Model III, and
especially when we analysed top fractions of animals
selected. In particular cases, absolute changes in the
BLUPs were around 450 kg, which is more than one
genetic standard deviation in any of the models.
Relationships between results from all models
were also assessed by means of regressions on
animals’ rankings. Departures from 0 for the intercept and from 1 for the slope of the regression line
indicate differing ranking positions for animals selected by two different models. In this sense, regression lines are more informative than simple rank
correlation measures, which cannot be tested for significant
departure from 1. Table 6 presents
regressions on rankings of sires and cows between
the three models, both for the overall data and for the
top 5% of the animals. Overall regressions showed a
clear departure from expected values. This was more
marked in cows than in sires. Regressions involving
top selected animals showed the same trend.
Fig. 2 relates the percentage of animals (both sires
and cows) selected in common by two different
models, at different percentages of selected animals.
The proportion of animals in common decreases
when selection intensity increases. The percentage of animals in common was lower for cows than for sires.
For example, approximately 10% of the sires were
not present in the top 5% of animals, when selection
was based on Models I or II instead of Model III. For
cows, the percentage of animals not present increased to 15–16%. Most visible changes should
occur with the top animals (sires and elite cows),
whereas selection at herd level, where 70–75% of
cows are retained, would not change very much. In
the study of Robert-Granié et al. (1999), a change
from homogeneous to heterogeneous models resulted
in 30–40% new cows in the top Holstein cow
population. Sire ranking may also be affected when
daughters are non-randomly distributed among high
and low variance environments (Hill, 1984). Robert-Granié et al. (1999) reported the case of foreign bulls
being used in herds with high production level and
more variable environments, and therefore being
more affected when a correction for heterogeneity is
performed.
Table 4
Rank correlations between predicted breeding values from different models, by type of animal and percentage selected
                               No. of animals  Model I vs. Model II  Model I vs. Model III  Model II vs. Model III
Overall
  Sires                        2321            0.99                  0.98                   0.99
  Cows                         96 871          0.99                  0.98                   0.99
Sires selected with Model III
  1%                           23              0.95                  0.63                   0.72
  5%                           116             0.97                  0.85                   0.88
Cows selected with Model III
  0.1%                         97              0.98                  0.53                   0.57
  1%                           969             0.97                  0.69                   0.69
  5%                           4844            0.98                  0.78                   0.80
Table 5
Absolute values of the changes in predicted breeding value, by type of animal and percentage selected (M-I: Model I; M-II: Model II; M-III:
Model III)
                    Average (kg)      Minimum (kg)      Maximum (kg)
Overall
  M-I vs. M-II      16.1              0                 165.5
  M-I vs. M-III     34.6              0                 449.5
  M-II vs. M-III    28.4              0                 467.6
                    Sires    Cows     Sires    Cows     Sires    Cows
Top 1%
  M-I vs. M-II      16.4     20.2     0.4      0        46.3     115.1
  M-I vs. M-III     61.2     71.9     0.9      0.1      187.8    449.5
  M-II vs. M-III    56.2     65.5     3.4      0.3      208.9    467.1
Top 5%
  M-I vs. M-II      21.7     18.9     0.1      0        73.2     149.8
  M-I vs. M-III     57.2     60.5     0.2      0        288.4    449.5
  M-II vs. M-III    45.6     53.3     0        0.1      234.5    467.6
Table 6
Regressions on rankings of animals, by type of animal and percentage selected (M-I: Model I; M-II: Model II; M-III: Model III)
                    Intercept                     Slope
                    Sires       Cows              Sires       Cows
Overall
  M-I vs. M-II      7.5±3.1     289.63±19.62      0.99±0.00   0.99±0.00
  M-I vs. M-III     19.3±5.0    762.24±31.75      0.98±0.00   0.98±0.00
  M-II vs. M-III    12.3±4.0    478.63±25.20      0.99±0.00   0.99±0.00
  M-II vs. M-I      7.5±3.1     289.63±19.62      0.99±0.00   0.99±0.00
  M-III vs. M-I     19.3±5.0    762.24±31.75      0.98±0.00   0.98±0.00
  M-III vs. M-II    12.3±4.0    478.63±25.20      0.99±0.00   0.99±0.00
5%
  M-I vs. M-II      21.3±1.9    4.83±14.51        21.3±1.9    1.02±0.00
  M-I vs. M-III     5.5±5.3     262.81±54.58      5.5±5.3     1.11±0.02
  M-II vs. M-III    3.5±4.3     244.90±49.93      3.5±4.3     1.08±0.02
  M-II vs. M-I      5.2±1.7     185.95±13.36      5.2±1.7     0.90±0.00
  M-III vs. M-I     21.4±3.5    1351.87±24.33     21.4±3.5    0.36±0.01
  M-III vs. M-II    15.9±3.2    1275.32±24.17     15.9±3.2    0.39±0.01
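The percentage of animals selected in common by two models, as discussed above for Fig. 2, can be computed with the short Python sketch below. The function name is hypothetical, and rounding the number of selected animals to the nearest integer is an assumption.

```python
def percent_in_common(ebv_a, ebv_b, fraction):
    """Percentage of the top `fraction` of animals shared by two sets of predictions."""
    n_sel = max(1, round(len(ebv_a) * fraction))

    def top(ebv):
        # Indices of the n_sel animals with the highest predicted breeding value.
        return set(sorted(range(len(ebv)), key=lambda i: ebv[i], reverse=True)[:n_sel])

    return 100.0 * len(top(ebv_a) & top(ebv_b)) / n_sel
```

With ten animals whose second and third positions are swapped between two models, the two models share only one of the two animals in the top 20%, but all five animals in the top 50%. This is the pattern described in the text: agreement falls as selection intensity increases.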
In Table 7, the predictive ability of the models is
illustrated. In general terms, SME was reduced with
the complexity of the model, and increased with
parity number. The coefficient of determination R²
increased with model complexity and decreased with
lactation number. The notable exception was
Lactation 1, for which there were no big differences
between Models I and II (recall that the effect of DP
is non-existent in lactation 1), and where Model III
seems to be the worst model. This result was
unexpected. The study of Ibáñez et al. (1999) used
first lactations only and compared three models:
homogeneous genetic and residual variance,
homogeneous genetic and residual variance after data
standardisation (like ours) and a heterogeneous genetic and residual variance. They found that a
phenotypic preadjustment for heterogeneity corrected
for the phenotypic dispersion and was better than the
purely homogeneous model, but dispersion of predicted genetic values of animals in environments
with large heritability was underestimated with respect to the heterogeneous variance model. For our
data, the larger SME values for Model III could be
interpreted as a consequence of the data structure:
new herds entering the evaluation typically reported
a few first lactation records, producing poor estimates of the Bayesian estimator and consequently
under- or overestimating the adjustment for heterogeneity of variances. In later parities, however,
production stress causes important variation in milk
yield, and a correction for heterogeneity of variance
seems to be advisable. Prediction bias, measured as
PSB, was very low in all models and lactations.
Ibáñez et al. (1999) also found small values of PSB
for milk yield, 3.02% for a homogeneous model and
slightly smaller values, 2.84 or 2.87%, for two
heterogeneous models. Some primary effects of
accounting for heterogeneous variance in national
genetic evaluations have been discussed by Weigel
and Lawlor (1994), and apply to the Uruguayan
situation: (1) increased fairness of genetic evaluations; (2) increased overall accuracy of evaluation
for superior cows, thereby increasing potential for
genetic progress through maternal selection pathways; (3) improved accuracy of sire evaluations for
(a) sires imported from foreign countries and used
non-randomly with respect to herd variance and (b)
breeder proven bulls with many progeny in a single
herd with high or low variance.
Fig. 2. Percentage of sires (○) and cows (●) selected in common between Models I and III (Model I in the legend), and Models II and III
(Model II in the legend), at different percentages of selected animals.
Table 7
Square root of empirical Mean Square Error (SME), Percentage
Squared Bias (PSB) and coefficient of determination ‘R²’ for
prediction of different lactations with Models I, II and III
Lactations                  Model I      Model II     Model III
1
  SME (lack of fit) (kg)    396.3±2.1    389.7±2.0    422.1±2.0
  SME (pred. error) (kg)    581.4±3.3    578.2±3.2    657.8±3.3
  Average SME (kg)          489.1±2.1    484.4±2.0    541.9±2.2
  PSB (%)                   1.18±0.01    1.15±0.01    1.44±0.01
  R² (%)                    82.3±0.2     82.6±0.2     80.6±0.2
2
  SME (lack of fit) (kg)    434.8±2.5    421.0±2.4    408.4±2.2
  SME (pred. error) (kg)    635.1±3.3    620.6±3.2    606.3±2.8
  Average SME (kg)          555.8±2.4    542.0±2.3    528.3±2.0
  PSB (%)                   1.20±0.01    1.13±0.01    1.08±0.01
  R² (%)                    77.3±0.2     78.4±0.2     79.4±0.2
3 and later
  SME (lack of fit) (kg)    470.7±2.6    459.0±2.5    443.4±2.1
  SME (pred. error) (kg)    677.2±2.9    664.4±2.8    657.0±2.7
  Average SME (kg)          599.6±2.2    587.3±2.2    577.4±2.0
  PSB (%)                   1.23±0.01    1.18±0.01    1.14±0.01
  R² (%)                    73.4±0.2     74.5±0.2     74.5±0.2
4. Conclusions
The general results obtained in this study show
that the current Uruguayan genetic evaluation model
could be improved in various aspects. Accurate
estimations of parameters are now available (Table
3), and the effects of DO and DP proved to be
significant (see Fig. 1). The problem of heterogeneous variances in genetic evaluations of dairy cattle
is that above average animals in the more variable
herds may be over-evaluated. A greater proportion of
animals would then be selected from the more
variable herds. Cow evaluation was found to be
much more sensitive to violations of the assumed
homogeneous variance. Sire ranking may also be
affected when daughters are non-randomly distributed among high and low variance environments (Hill,
1984). This could be the case in Uruguay, since
there is no organised progeny testing of young bulls,
and a majority of sires have daughters in just one
herd. Also, foreign bulls (more expensive semen)
tend to be used in herds with a high production level.
Accuracy of evaluations depends on how well the
assumptions of the model match the data. Our
results, presented in Table 7, suggest that the assumption of heterogeneous variance is tenable, especially in later lactations, whereas some doubts arise
in first lactations, most probably due to the structure
of the database and pedigree information used in the
analyses. Given the relative simplicity of the results
found here, and with the necessary care in data
selection to avoid poor estimates of the Bayesian
estimator proposed by Urioste et al. (2001) to deal
with heterogeneity of variances, incorporation of an
adjustment for heterogeneity of variances to the
Uruguayan Holstein genetic evaluation is strongly
recommended.
Acknowledgements
ARU and INML are thanked for supplying the
data, and the Department of Dairy Science, University of Wisconsin-Madison, for hosting J.I. Urioste’s
visit. Dr. Ignacy Misztal is acknowledged for his
JAA programs, and Facultad de Agronomía,
Uruguay, and World-Wide Sires, Inc., Visalia, CA,
for economic support.
References
Ali, T.E., Schaeffer, L.R., 1987. Accounting for covariances
among test day milk yield in dairy cows. Can. J. Anim. Sci. 67,
637–644.
Berger, A., Lista, O., 1999. Relación entre comportamiento
productivo y reproductivo de vacas Holando. Tesis Ingeniero
Agrónomo, Universidad de la República, Montevideo, Uruguay,
100 pp.
Brotherstone, S., Hill, W.G., 1986. Heterogeneity of variance
amongst herds for milk production. Anim. Prod. 42, 297–303.
Carabaño, M.J., Van Vleck, L.D., Wiggans, G.R., Alenda, R., 1989.
Estimation of genetic parameters for milk and fat yields of
dairy cattle in Spain and the United States. J. Dairy Sci. 72,
3013–3022.
Dodenhoff, J., Swalve, H.H., 1998. Heterogeneity of variances
across regions of northern Germany and adjustment in genetic
evaluation. Livest. Prod. Sci. 53, 225–236.
Estany, J., Sorensen, D., 1995. Estimation of genetic parameters
for litter size in Danish Landrace and Yorkshire pigs. Anim.
Sci. 60, 315–324.
Funk, D.A., Freeman, A.E., Berger, P.J., 1987. Effects of previous
days open, previous days dry and present days open on
lactation yield. J. Dairy Sci. 70, 2366–2373.
Henderson, C.R., 1973. Sire evaluation and genetic trends. In:
Proc. Animal Breeding and Genetics symposium in honor of
Dr. J.L. Lush. American Society of Animal Science and
American Dairy Science Association. Champaign, IL, USA,
pp. 10–41.
Henderson, C.R., 1984. In: Applications of Linear Models in
Animal Breeding. University of Guelph, Guelph, Ontario,
Canada.
Hill, W.G., 1984. On selection among groups with heterogeneous
variance. Anim. Prod. 39, 473–477.
Hill, W.G., Edwards, M.R., Ahmed, M.-K.A., Thompson, R.,
1983. Heritability of milk yield and composition at different
levels and variability of production. Anim. Prod. 36, 59–68.
Ibáñez, M.A., Carabaño, M.J., Alenda, R., 1999. Identification of
sources of heterogeneous residual and genetic variances in milk
yield data from the Spanish Holstein-Friesian population and
impact on genetic evaluation. Livest. Prod. Sci. 59, 33–49.
Ibáñez, M.A., Carabaño, M.J., Foulley, J.L., Alenda, R., 1996.
Heterogeneity of herd-period phenotypic variances in the
Spanish Holstein-Friesian cattle: sources of heterogeneity and
genetic evaluation. Livest. Prod. Sci. 45, 137–147.
INTERBULL, 2000. National Genetic Evaluation Programmes for
Dairy Production Traits Practiced in Interbull Member Countries 1999–2000. Bulletin no. 24, Department of Animal
Breeding and Genetics, SLU, Uppsala, Sweden, 111 pp.
Jones, L.P., Goddard, M.E., 1990. Five years experience with the
animal model for dairy evaluations in Australia. In: Proc. 4th
WCGALP, Edinburgh, Scotland, XIII, pp. 382–385.
Lee, J.K., Van Raden, P.M., Norman, H.D., Wiggans, G.R.,
Meinert, T.R., 1997. Relationship of yield during early lactation and days open during current lactation with 305-day yield.
J. Dairy Sci. 80, 771–776.
Meuwissen, T.H.E., De Jong, G., Engel, B., 1996. Joint estimation
of breeding values and heterogeneous variances of large data
files. J. Dairy Sci. 79, 310–316.
Neumaier, A., Groeneveldt, E., 1998. Restricted Maximum Likelihood estimation of covariances in sparse linear models. Genet.
Sel. Evol. 30, 3–26.
Olesen, I., Pérez-Enciso, M., Gianola, D., Thomas, D.L., 1994. A
comparison of normal and nonnormal mixed models for
number of lambs born in Norwegian sheep. J. Anim. Sci. 72,
1166–1173.
Pérez-Enciso, M., Tempelman, R.J., Gianola, D., 1993. A comparison between linear and Poisson mixed models for litter size
in Iberian pigs. Livest. Prod. Sci. 35, 303–316.
Robert-Granié, C., Bonaïti, B., Boichard, D., Barbat, A., 1999.
Accounting for variance heterogeneity in French dairy cattle
genetic evaluation. Livest. Prod. Sci. 60, 343–357.
Sadek, M.H., Freeman, A.E., 1992. Adjustment factors for previous and present days open considering all lactations. J. Dairy
Sci. 75, 279–287.
Schaeffer, L.R., Henderson, C.R., 1972. Effects of days dry and
days open on Holstein milk production. J. Dairy Sci. 55,
107–112.
Stanton, T.L., Blake, R.W., Quaas, R.L., Van Vleck, L.D.,
Carabaño, M.J., 1991. Genotype by environment interaction for
Holstein milk yield in Colombia, Mexico and Puerto Rico. J.
Dairy Sci. 74, 1700–1714.
Tempelman, R.J., Gianola, D., 1999. Genetic analysis of fertility
in dairy cattle using negative binomial mixed models. J. Dairy
Sci. 82, 1834–1847.
Urioste, J.I., Gianola, D., Rekaya, R., Fikse, W.F., Weigel, K.A.,
2001. Evaluation of extent and amount of heterogeneous
variance for milk yield in Uruguayan Holsteins. Animal Sci.
72, 259–268.
Weigel, K.A., Lawlor, T.J., 1994. Adjustment for heterogeneous
variance in genetic evaluations for conformation of United
States Holsteins. J. Dairy Sci. 77, 1691–1701.
Wiggans, G.R., Van Raden, P.M., 1991. Method and effect of
adjustment for heterogeneous variance. J. Dairy Sci. 74, 4350–
4357.