QTL explaining variation in production traits and udder health in the Danish Holstein population

The main objective was to locate QTL and estimate the proportion of total genetic variance attributable to quantitative trait loci (QTL) for production index traits and the udder health index identified on six Bos taurus autosomes in the Danish Holstein dairy cattle population. Data were obtained from a granddaughter design of 20 sire families with a total of 1 869 progeny tested sons. The number of sons per grandsire ranged from 20 to 284, with an average family size of 93.5. Indexes of the estimated breeding values were obtained for the milk production traits and for the udder health index from the Danish Agricultural Advisory Service database. A random-QTL model was applied to incorporate marker information into parameter estimation for each single QTL. The procedure allowed us to detect new QTL on BTA3, BTA16 and BTA28 and to estimate the proportion of total genetic variance attributed to different QTL on a total of six Bos taurus autosomes for the udder health index and yield index traits in the Danish Holstein population. Variance estimates vary between 2 to 58 % of the total variance for different QTL and seem to explain a substantial part of the variance at certain positions of the cattle genome. The results are discussed against the background of the failure of marker-assisted selection (MAS) and the recent availability of large panels of single nucleotide polymorphisms (SNPs) that have improved the search for mutations underlying variation in complex traits resulting in modern genomic selection.


Introduction
Most economically important traits in livestock species are quantitative by nature.These traits are influenced by many chromosomal regions, which are often referred to as quantitative trait loci (QTL).Many QTL have been identified in dairy cattle for primary production and health traits such as milk, fat and protein yield, clinical mastitis and type traits (Georges et al. 1995, Zhang et al. 1998, Spelman et al. 1999, Schrooten et al. 2000, Klungland et al. 2001, Thomsen et al. 2001, Kühn et al. 2002, Hiendleder et al. 2003, Ron et al. 2004).Danish breeders have put their emphasis mainly on health and conformation traits (Buitenhuis et al. 2007, Lund et al. 2008, Thomasen et al. 2008).
Locating these QTL and understanding their inheritance is worthwhile for animal breeders as it enables them to estimate an animal's breeding value more accurately and apply markerassisted selection (MAS) as described by Fernando & Grossman (1989).MAS is particularly useful when breeding for traits that are difficult or expensive to measure and/or are lowly heritable (Meuwissen & Goddard 1996, Goddard & Hayes 2002).Including random QTL effects in MAS is also useful in pre-selection of young candidates for progeny testing programs or the selection of heifers as candidates for dams of bulls.Well-described examples of MAS had been implemented in large-scale dairy cattle breeding schemes in New Zealand (Spelman et al. 2007), France (Boichard et al. 2002) and Germany (Bennewitz et al. 2004).In relation to this, Spelman et al. (2007) reported MAS to have led to minor to moderate improvement in genetic gain with, at best, a neutral cost-effectiveness.
The benefit of MAS is highly dependent on how accurately the location, effects, and the proportion of variance due to the polygenic and due to the QTL components are estimated (Lande & Thompson 1990).
A special challenge in estimating variance components for QTL arises from the fact that usually only a small part of the overall population is genotyped.As for France, Germany and New Zealand, where MAS had been added to existing breeding programs, only genotyped animals provide information for the QTL-specific evaluations, and therefore specific approaches had to be established for the estimation of QTL variance components and subsequent MA-BLUP breeding value estimation (Bennewitz et al. 2004, Liu et al. 2004, Druet et al. 2006, Neuner et al. 2008, 2009).
In principle, MAS involves selection on markers either in linkage disequilibrium (LD) or linkage equilibrium with the QTL.But as we have to perform very stringent tests for statistical significance to identify QTL, only a limited fraction of the genetic variation is explained by the identified QTL (Goddard & Hayes 2009).Besides that the progress in identifying causal genes has been slow as linkage mapping results result still in large confidence intervals.The recent availability of large panels of single nucleotide polymorphisms (SNPs) has improved search for mutations underlying variation in complex traits.The clear advantage of genomic selection (GS) is targeting potentially all QTL simultaneously in contrast to MAS typically targeting only a few selected QTL.Thus, there has been an evolution from MAS towards GS (for example, Goddard & Hayes 2007), and GS is now rapidly replacing traditional evaluation systems used for dairy cattle.
Despite this evolutionary development the objective of this study was twofold: first, scan for new QTL on chromosomes that have not yet been investigated for production and udder health indexes and secondly, estimate the proportion of total genetic variation attributed to QTL for production index traits and in particular for the udder-health trait index in Danish Holstein cattle.

Animals
Data used in this study was derived from a granddaughter design with Danish Holstein cattle.The design involved 20 sire families with 1 869 progeny-tested sons.The number of sons per grandsire ranged from 20 to 284 and the average family size was 93.5.Each progeny tested sire had at least 70 daughters with records.Indexes of the estimated breeding values were obtained for the milk-production traits (Milk index, MI; Protein Index, PI; Fat index, FI; and the compound yield index, YI) and for an udder-health index (UI) from the Danish Agricultural Advisory Service database.Details of methods and models used for breeding value estimation can be found in Danish Cattle Federation (2006).
As daughter-yield deviations (DYD) were not available for the observed traits, indexes of the estimated breeding values were de-regressed (de-regressed proofs=DRPF).The deregression as described by Lien et al. (1995) provides an approximate measure of daughteryield deviations.
In the subsequent statistical analysis the different number of daughters contributing to the calculation of de-regressed breeding values was accounted for by a weighting factor w, where the number of effective daughters of each sire is n e and h 2 is the heritability of the trait provided by the Danish Agricultural Advisory Service database.The weighting represents the variance of the DRPFs (Thomsen et al. 2001).

Pedigree information
The pedigree for animals of the latest generation was calculated as described by Kučerová et al. (2006).However, we modified it to specifically account also for genetic relationships between grandsire families.All male and female ancestors of the grandsires and sons were traced back until unknown parents were reached.This resulted in a pedigree with 10 134 animals that were related to the genotyped grandsires and their sons.The oldest ancestor in the pedigree was born in 1953.

Markers and maps
Grandsires and sons were genotyped for selected regions on the following chromosomes BTA3, BTA5, BTA7, BTA16, BTA23, and BTA28.Markers and their positions were selected from the Meat Animal Research Center (http://www.marc.usda.gov/genome/cattle/cattle. html; USDAMARC, Clay Center, USA).Details on marker typing, marker maps, distribution of markers across the chromosomes and the selection of the genomic regions are provided in Table 1 of Thomasen et al. (2008).Genotypes were produced on an automated sequence analyser.Inconsistent marker types or markers exhibiting evidence of segregation distortion were discarded.
Chromosomes for the current study have been chosen, because they harbour significant QTL for health traits such as clinical mastitis, somatic cell score, conformation and calving traits as described by Buitenhuis et al. (2007), Lund et al. (2008) and Thomasen et al. (2008), but have not been investigated for production index traits and the udder health index.

Statistical analyses
A random-QTL model, as described by Sørensen et al. (2003), was fitted to estimate variance components for each single trait.The model decomposes the overall genetic variance into a component due to the segregation of a putative QTL, and another due to the effect of a polygenic term (the collective effects of all other QTL affecting the trait).The model can be written as: where y is a vector of de-regressed proofs, X is a matrix relating records to the fixed effects, β is a vector of fixed effects including the overall trait means, Z is a matrix relating records to individuals, u is a vector of polygenic effects, W is a matrix relating each individual's record to its QTL effect, q is a vector of additive QTL effects corresponding to the QTL, and e is a vector of random residuals.The random variables u, q and e are assumed to be multivariate normally distributed (MVN) and mutually uncorrelated.Specifically, u is MVN (0, G⊗A), q is MVN (0, K⊗GRM i ), and e is MVN (0,E⊗R).
A is the additive genetic relationship matrix and GRM i is the gametic relationship matrix (GRM) for the QTL |M,p conditional on marker data (M) at the position p of the QTL i on the chromosome.G, K and E represent the variance-covariance structure of traits due to polygenic, additive QTL and residual effects, respectively.
In a single-trait single-QTL model used for the current across family analysis G, K, and E are reduced to scalars.The most likely marker-linkage phase in the sire was used to compute the GRM as described by Wang et al. (1995) and extended by Sørensen et al. (2003) for flanking markers and with a second extension for both sexes.
The variance components were estimated using the average-information restricted maximum-likelihood algorithm (AIREML and DMU, Jensen et al. 1997).The restricted likelihood was maximized with respect to the variance components associated with the random effects in the model.Maximizing a sequence of restricted likelihoods over a grid of specific positions yields a profile of the restricted likelihood of the QTL positions.Parameters were estimated for every cM along each chromosome and yielded in the restricted likelihood of the QTL position (Sørensen et al. 2003).
Estimates of variance components were only considered for specific QTL locations, which have been suggested by Thomasen et al. (2008) to be used in subsequent MAS in Danish Holstein cattle.
The test statistic is based on the likelihood ratio test (LRT) as described by Kučerová et al. (2006), in which the presence of QTL were based on the asymptotic distribution of the LRT statistic with LRT=−2ln(L reduced −L full ), where L full and L reduced were the maximized likelihoods under the full model and the reduced model representing no QTL effect for the chromosome.Significant thresholds were calculated as described by Piepho (2001) at levels of α=0.10, 0.05 and 0.01.

QTL positions
Using the expanded pedigree material as described above the single-trait single-QTL model was used to refine the precise locations of the QTL of pre-selected regions on chromosomes BTA3, BTA5, BTA7, BTA16, BTA23, and BTA28 and to subsequently estimate the variance components of the QTL at the maximized positions.
Likelihood profiles for the yield index and the udder health index on the respective chromosomes detected by the single-trait-single-QTL-analysis are shown in Figure 1 to 6.The significance thresholds in the figures are shown for 5 % chromosomewise, but significance of the results is also indicated in the Table 1.The axis of abscissae indicates the chromosomal positions and the ordinate shows the LRT statistics.
In total, 13 QTL for the production yield index traits, the compounded yield index and the udder health index were identified as to be significant on the chromosomes under investigation (Table 1).Table 1 also shows the corresponding significance and the related variance components.For the yield index, the QTL on BTA3 was just below the significance threshold, although the QTL profile was highest for the position that had been identified in the former regression analysis.Four QTL regions being significant for yield traits in previous investigations (FI on BTA5, YI on BTA5, YI on BTA23 and MI on BTA28) did not reach significance for yield index traits in our investigation.

Variance components
Model 1 was applied to the data to partition total variance into the QTL, polygenic, and the residual components.Among all traits the polygenic variance was largest except for the protein and yield index on BTA28.Corresponding QTL variance estimates varied between 2 to 58 %.Despite the fact that some estimates of the QTL variances were relatively low, the QTL term was significant for almost all yield index traits and for the udder health index as shown in the Table 1.Although variance components for the compounded yield index were not significant on BTA3, BTA5 and BTA23, the same chromosomes were highly significant for the protein index, milk index or fat index and were, therefore, considered in the analysis of the compounded yield index as well.The overall estimates are shown in the Table 1 for the udder health index and the production index traits.
In total, thirteen QTL for the production yield index traits, the compounded yield index and the udder health index were identified as to be significant on the chromosomes under investigation (Table 1).Table 1 also shows the corresponding significance and the related variance components.For the yield index, the QTL on BTA3 was just below the significance threshold, although the QTL profile was highest for the position that had been identified in the former regression analysis.Four QTL regions being significant for yield traits in previous investigations (FI on BTA5, YI on BTA5, YI on BTA23 and MI on BTA28) did not reach significance for yield index traits in our investigation.
In total, thirteen QTL for the production yield index traits, the compounded yield index and the udder health index were identified as to be significant on the chromosomes under investigation (Table 1).Table 1 also shows the corresponding significance and the related variance components.For the yield index, the QTL on BTA3 was just below the significance threshold, although the QTL profile was highest for the position that had been identified in the former regression analysis.Four QTL regions being significant for yield traits in previous investigations (FI on BTA5, YI on BTA5, YI on BTA23 and MI on BTA28) did not reach significance for yield index traits in our investigation.
In total, thirteen QTL for the production yield index traits, the compounded yield index and the udder health index were identified as to be significant on the chromosomes under investigation (Table 1).Table 1 also shows the corresponding significance and the related variance components.For the yield index, the QTL on BTA3 was just below the significance threshold, although the QTL profile was highest for the position that had been identified in the former regression analysis.Four QTL regions being significant for yield traits in previous investigations (FI on BTA5, YI on BTA5, YI on BTA23 and MI on BTA28) did not reach significance for yield index traits in our investigation.
In total, thirteen QTL for the production yield index traits, the compounded yield index and the udder health index were identified as to be significant on the chromosomes under investigation (Table 1).Table 1 also shows the corresponding significance and the related variance components.For the yield index, the QTL on BTA3 was just below the significance threshold, although the QTL profile was highest for the position that had been identified in the former regression analysis.Four QTL regions being significant for yield traits in previous investigations (FI on BTA5, YI on BTA5, YI on BTA23 and MI on BTA28) did not reach significance for yield index traits in our investigation.
In total, thirteen QTL for the production yield index traits, the compounded yield index and the udder health index were identified as to be significant on the chromosomes under investigation (Table 1).Table 1 also shows the corresponding significance and the related variance components.For the yield index, the QTL on BTA3 was just below the significance threshold, although the QTL profile was highest for the position that had been identified in the former regression analysis.Four QTL regions being significant for yield traits in previous investigations (FI on BTA5, YI on BTA5, YI on BTA23 and MI on BTA28) did not reach significance for yield index traits in our investigation.
In total, thirteen QTL for the production yield index traits, the compounded yield index and the udder health index were identified as to be significant on the chromosomes under investigation (Table 1).Table 1 also shows the corresponding significance and the related variance components.For the yield index, the QTL on BTA3 was just below the significance threshold, although the QTL profile was highest for the position that had been identified in the former regression analysis.Four QTL regions being significant for yield traits in previous investigations (FI on BTA5, YI on BTA5, YI on BTA23 and MI on BTA28) did not reach significance for yield index traits in our investigation.

Discussion
A random-QTL model was applied to incorporate marker information into parameter estimation for each single QTL.The procedure allowed us to detect new QTL and to estimate the proportion of total genetic variance attributed to different QTL on six different Bos taurus autosomes for the udder health index and yield index traits in the Danish Holstein population.
Variance estimates vary between 2 to 58 % of the total variance for different QTL and seem to explain a major part of the variance at certain positions of the cattle genome.

QTL for production traits
QTL affecting production index traits have been detected in our study on chromosomes BTA 3, 5, 16, 23 and 28.Several of these QTL have been detected in prior scans for different populations (Spelman et al. 1996, Kühn et al. 1999, Ashwell et al. 2001, De Koning et al. 2001, Thomsen et al. 2001, Nadesalingam et al. 2001, Bennewitz et al. 2004).
To our knowledge the QTL for PI on BTA3 has not been identified in this chromosomal region.Only Ashwell et al. (2001) have reported a QTL on BTA3 for PI near by position 54 cM, which in fact has been in agreement with the result of our prior regression analysis (as shown in brackets of Table 1 in column 2).The shift of the position might be specific to the model applied.The compounded YI did not reach chromosomewise significance, but its peak has been identified near position 87 cM, which is very close to a QTL described by Skelding et al. (2010) showing associations between the bovine interleukin-12 and interleukin-23 receptor genes and protein yield.
The QTL for MI and PI on BTA5 have been detected by Bonakdar et al. (2010) in a similar region between marker BMC1006 and ETH10.Although both index traits show high significance, the compounded YI did again not reach the significance level.Nonetheless, the peak was almost at the same position (38 cM).A possible explanation for this QTL might be the IGF-I gene polymorphism that has been associated with milk fat and protein in Holstein dairy cows (Bonakdar et al. 2010).
A new QTL for MI has been detected on BTA16.The closest QTL to our knowledge that has been detected for MI, is still approx.40 cM apart (Daetwyler et al. 2008).The QTL for PI in the same location has been detected earlier though by Lillehammer et al. in 2007.The compounded YI showed in contrast to BTA5 high significance at the same position of the chromosome.This was rather surprising because this chromosomal region has been chosen for investigation due to a strong QTL for conformation traits (Buitenhuis et al. 2007).
The QTL for FI, MI and PI on BTA23 have been also detected by Bennewitz et al. (2004) in the same chromosomal region.But again the compounded YI did not show up as a significant QTL in our study, although the peak for YI was highest in the same region as it was for the individual indexes.This might be explained by the different weight of the individual indexes in the compounded index.
The final chromosome under investigation (BTA28) for production traits also revealed a QTL for PI at position 49 cM and for YI at 50 cM.Even though QTL for MI and FI did not show significance in our QTL search, the highest peaks were also at the same position.These results are in good agreement with QTL detected in the identical region by Bagnato et al. (2008).Along with this chromosomal region a strong QTL for conformation traits has also been described by Buitenhuis et al. (2007) that might be in LD with the QTL alleles selected for production index traits.
One might argue that QTL in this study have only been detected with a single-trait-single-QTL model.As shown by Sørensen et al. (2003), the model will provide sufficient position estimates for QTL with small and big effects.The model dimension is very variable, and it usually has a good mixing character between the polygenic and QTL components and, thus, is easy to converge.There is no doubt, that the recent development using a genomewide panel of dense markers (SNPs) has the great advantage of targeting potentially all QTL simultaneously, whereas the traditional approach is limited due to their very stringent tests for statistical significance to identify QTL, so that only a limited fraction of the genetic variation will be explained by the identified QTL (Goddard & Hayes 2009).

QTL for the udder health index
The QTL for udder health index on BTA7 has been confirmed by Ron et al. (2004).They reported a putative QTL for somatic cell score at a nearby position (31.73 cM) on the chromosome.Our QTL for the udder health index on BTA23 has also been detected in the same marker bracket by Lund et al. (2008), although somatic cell score has been used as phenotypic trait in their study.Even though Lund et al. (2008) showed evidence for QTL affecting clinical mastitis, somatic cell score and udder conformation traits in the Danish Holstein Cattle on BTA5, the QTL for the udder health index on BTA5 did not reach significance in our study.The QTL for udder health index on BTA28, which has been identified in the same chromosomal region (Tal-Stein et al. 2010), dropped also below an appropriate level of significance in our QTL search.

Variance components
Overall, the QTL under investigation explained a proportion of 2 to 58 % of the genetic variance.The genetic variance due to the QTL is assumed to be a precise partitioning of polygenic variance components.This is primarily, because the relationship among grandsires and sons of different grandsire families has been accounted for by the inclusion of many former generations to the pedigree material compared to Neuner et al. (2008), who included only the last three generations.In a granddaughter design (Weller et al. 1990) the grandsires are assumed to be unrelated, but the current approach took all known male and female ancestors of the grandsires and sons into account.As a result the residual variances are lower compared to the study of Szyda et al. (2005), even though the material used in their study was rather large with respect to the number of individuals genotyped for the QTL regions of interest and the number of daughters per bull.The same effect has been shown by Kučerová et al. (2006), where variances were higher when using a more restricted pedigree compared to an extended pedigree.
Another explanation for the rather small proportion of residual variance might be the use of DRPFs as the dependent variable, which has been shown to be advantageous for MAS.Estimated breeding values might be slightly less correct and potentially biased, so that the estimated variances might be affected.Efficient weighting of the dependent variable as applied in this study should also improve the results of parameter estimation (Thomsen 2006, Neuner et al. 2008).
An important influence for the precision of the estimation of variance components is also due to marker informativity.As pointed out by Druet et al. (2006), non-informative sires will have the probability of identity-by-descent with his son equal to 0.5, which is identical to the additive relationship matrix, and therefore no information is available for the separation of QTL effects and the polygenic term.In order to counteract this fact, the setup of the study material has accounted for the non-informativity of sire families by choosing the most informative markers in the QTL regions (Thomasen et al. 2008).
Another way to account for this problem is to include genotypes of the female side, as it was originally planned in the development of routine genotyping for the MAS scheme, whereas data available for the current analysis consisted of genotypes of bulls only.Based on the large number of genotyped bulls, genotypes of some dams and maternal grandsires might have been reconstructed to improve information on QTL transmissions.
As an alternative Bolard & Boichard (2002) showed how the information on maternal grandsire genotypes and, consequently, on QTL transmissions can be incorporated in the QTL mapping and subsequently used for the estimation of parameters.
One general problem still seems to remain unsolved: the overestimation of QTL variances.For several QTL the variance seemed to be overestimated with values up to 58 % such as for i.e.YI and PI.This might be explained with the pre-selection of our data set as in many QTL studies before.Sires were always supposed to be QTL heterozygous and their sons should have high positive and high negative values for the phenotypic trait values (van der Werf et al. 2007).
In conclusion, the study has identified some new QTL and provided some good estimates of the genetic variances due to the QTL.Despite the examples of applied MAS as described above, the application of MAS will be limited, because estimating effects and making predictions from modern genome-wide association studies has moved forward with an enormous progress.Even though newly detected QTL seem to be interesting from the scientific standpoint, many QTL are not applicable to commercial populations, or they need to be fine-mapped and confirmed with adequate methods first.Only a few QTL with sufficiently large effect to be economically viable for the large costs and logistical demands of implementation might then be used in MAS in some smallholder production systems (Marshall et al. 2011).
Figure 1Loglikelihood ratio test statistic profile for Yield Index on BTA5

Table 1
BTA: chromosome, LRT: Loglikelihood Ratio Test statistic with related significance, QTL: QTL variance component with related standard errors as provided by DMU, PG: additive polygenic variance component with related standard errors, Rest: residual variance component with related standard errors, a expected position based on previous linkage analysis in brackets, *P<0.10,**P<0.05,***P<0.01chromosomewise