Genetic diversity and population structure in divergent German cattle selection lines on the basis of milk protein polymorphisms

Abstract The aim of this study was to analyze the genetic structure of the casein cluster in eight selection lines of the Holstein Friesian (HF), German Simmental (GS) and German Black Pied cattle (“Deutsches Schwarzbuntes Niederungsrind”, DSN) breeds. A total of 2962 milk samples were typed at αs1-casein (αs1-CN), β-casein (β-CN), αs2-casein (αs2-CN) and κ-casein (κ-CN) loci using isoelectric focusing. The number of alleles per locus ranged from one (αs2-CN) to five (β-CN), and the average expected heterozygosity and polymorphic information content of all loci were 0.33 and 0.27, respectively. The unrooted dendrogram revealed that the selection lines of the endangered DSN breed were clearly separated from the HF and GS breeds due to their predominance of the β-CN A1 allele and the comprehensive haplotype BA1A (in the abbreviation of αs1-β-κ-CN). Temporal changes in allele distributions indicated decreasing genetic diversity at the casein loci, explaining the moderate level of genetic differentiation among selection lines (7.1 %). The variability of the casein should be exploited in future using breeding programs to select genetic lines for specific protein production in bovine milk but also to preserve biodiversity.


Introduction
Since domestication 8000-10 000 years ago, natural as well as man-made factors including geography, environment, culture and directional artificial selection contributed to cattle trait modifications phenotypically and genetically (Loftus et al., 1994). From a time perspective, in contrast to natural selection, artificial selection has the ability to change the genome rapidly. The consequence is a targeted displacement in allele frequencies, implying deviations from Hardy-Weinberg equilibrium (HWE) (Lachance, 2009). Two types of selection appear on the genomic level. Positive (Darwinian) selection promotes the spread of beneficial alleles, so that frequencies for these alleles increase and the selected alleles might be fixed over generations (Maynard Smith and Haigh, 1974;Kreitman, 2000). Negative or purifying selection hinders the spread of unfavorable alleles, causing decreasing allele frequencies up to the complete loss from the population (Kreitman, 2000). Selection not only affects the favored or unfavored mutations directly. In addition, selec-tion causes a "hitchhiking" effect on the frequency of neutral alleles at linked loci (Maynard Smith and Haigh, 1974). The cattle genome therefore represents an opportunity for the identification of genetic variation that contributes to phenotypic diversity and for inferring genome responses to strong artificial selection. The different methods to detect selection signatures are based either on the distribution of allele frequencies, on the properties of haplotypes segregating within populations or on genetic differentiation between populations (reviewed by Hohenlohe et al., 2010).
Along with divergent selection criteria, the long-lasting intensive specific improvement of economically important traits contributed to the formation of diverse genetic lines within breeds. For example, artificial selection in the dualpurpose Simmental breed implied the establishment of divergent strains which are specialized for either milk or meat production (Campbell and Marshall, 2016). As the future requires promotion of more efficient sustainable livestock systems and utilization of greater proportions of non-human competitive products for animal feed, attention is given on adaptation to grazing systems (Delaby et al., 2018). Pasturebased systems reflect harsh environments, emphasizing the importance of animal traits associated with grazing behavior and robustness. Functional traits required under grazing include feed efficiency, health, fertility and longevity (Washburn and Mullen, 2014). In predestinated locations in Ireland or New Zealand, the development of grazing systems is accompanied by animal breeding and selection strategies on adaptation to local conditions (e.g., Lopez-Villalobos et al., 2000). The New Zealand total merit index favors robust, lightweight, long-living and efficient milk producing pasture converters (Jaeger, 2018). However, German Holstein (HF_G) cows have been selected for modern and large-scale indoor systems during decades, raising questions of possible genotype-environment interactions with impact on adaptation capabilities to harsh environments (König et al., 2005). In consequence, so-called "pasture breeding projects" were initiated in Germany (Brügemann et al., 2015;May et al., 2017), aiming at genetic line comparisons in grassland systems. Specific pasture-based selection lines within the HF_G breed were created by mating, e.g., HF_G cows with Holstein Friesian sires from New Zealand (HF_NZ). The close genetic relationships between selection lines with the same founder animals suggest genetic comparisons on the basis of milk protein compositions, in order to study effects of selection in dairy lines during the past decades.
Genes influencing milk yield and protein content are the casein genes CSN1S1, CSN1S2, CNS2 and CSN3, encoding the proteins α s1 -casein (α s1 -CN), β-casein (β-CN), α s2casein (α s2 -CN) and κ-casein (κ-CN), respectively (Ng-Kwai-Hang et al., 1984). Several single-nucleotide polymorphisms (SNPs) of the casein genes change their protein sequences, implying different casein variants. A recent review of milk protein nomenclature (Gallinat et al., 2013) indicated 10 variants for α s1 -CN (A, B, C, D, E, F, G, H, I, J), 15 for β-CN (A1, A2, A3, B, C, D, E, F, G, H 1 , H 2 , I, J, K, L), five for α s2 -CN (A, B, C, D, E) and 14 for κ-CN (A, A1, B, B2, C, D, E, F 1 , F 2 , G 1 , G 2 , H, I, J) in Bos genus. The tight genetic linkage among the casein genes within a 250 kb cluster on chromosome 6 (BTA6) implies limited recombination and suggests the creation of casein haplotypes (Ferretti et al., 1990;Lien et al., 1993). Casein polymorphisms were used for the characterization of domesticated breeds and for tracing the evolutionary history (Caroli et al., 2009). Beja-Pereira et al. (2002) and Jann et al. (2004) provided evidence for a geographically associated distribution of casein haplotypes, and they identified a decline of genetic diversity for taurine breeds in Europe from the south to the north and from the east to the west. Mahé et al. (1999) discriminated between Bos taurus and Bos indicus origins at the milk protein level. Furthermore, casein genes harbor a number of variants with beneficial effects on milk production, milk composition and technological properties (reviewed by Caroli et al., 2009). Additionally, numerous studies (e.g., Ehrmann et al., 1997;Çardak et al., 2003) focused on the effects of polymorph milk proteins on the individual milk protein content. For example, Ng- Kwai-Hang et al. (1984) identified causal relationships between the homozygote genotypes BB of the respective casein α s1 -CN and κ-CN with the protein and casein content of milk. Protein yield and protein percentage are included into the overall production index (RZM) for German dairy cattle since decades and have been used as a major selection criterion (König et al., 2007). In consequence, monitoring casein genetic variants is a useful tool to inferring signatures of selection.
To our knowledge, there are no studies addressing genetic diversity in individual selection lines -especially in pasturebased selection lines -based on alleles and haplotypes of the whole casein cluster. We hypothesize that divergent directions of positive selection (e.g., towards pasture ability, dairy or meat production) have altered allele and haplotype frequencies of the casein. Therefore, the aims of the present study were to (i) compare allele and haplotype frequencies across selection lines, (ii) study temporal changes in allele frequencies since the past 25 years and (iii) analyze genetic diversity between individual selection lines and evaluate effects of selection on casein frequencies.

Animals
Milk samples from 2962 cows from first to third lactation of the Holstein Friesian (HF), German Black Pied cattle ("Deutsches Schwarzbuntes Niederungsrind", DSN) and German Simmental (GS) breeds were collected in 2018. The samples were obtained from 50 small and medium-sized farms spread over Germany. Herd sizes ranged from 24 to 228 milking cows, with an average of 76 cows per farm.
The breeds were subdivided into eight selection lines based on divergent breeding strategies (Table 1). With regard to the HF breed, a total of four selection lines was considered. Three HF lines were established in the framework of the "German pasture genetics project" (Brügemann et al., 2015;May et al., 2017) considering a specific mating design in participating grazing herds. The first line in the grazing herds (HF_NZ) based on inseminations of HF_G cows with HF sires from New Zealand. The second line (HF_G p ) was established considering mating between HF_G cows from the grazing herds with HF_G "pasture" sires. The selected HF_G pasture sires are suited to grazing conditions and represented favorable breeding values for traits that were important in New Zealand (i.e., small body size, high fat percentage, high non-return rate, short interval from calving to first insemination) (May et al., 2017). The third HF line (HF_G m ) from the grazing herds included female offspring from mating of HF_G cows with HF_G sires representing outstanding breeding values for milk yield. The fourth HF line (HF_G ref ) considered HF_G cows from intensive indoor systems, i.e., herds with a strong selection focus on milk yield. Continuous se- lection strategies within lines contributed to production trait differences, especially for lactation milk yield and fat percentage as indicated in Table 1. The local dual-purpose DSN cattle population is the founder breed of the modern HF population and has a long breeding history in the grassland region of East Frisia, Lower Saxony, Germany (Mügge et al., 1999). The DSN breeding goal considers both output traits milk and meat. DSN is defined as robust cattle under harsh environmental conditions and showed superiority over HF in terms of physiological traits (Al-Kanaan, 2016). Due to divergent breeding strategies under different housing conditions after World War II (separation into East and West Germany), two selection lines for DSN were considered (DSN east and DSN west , respectively). For Simmental cattle, the most famous dual-purpose breed for milk and beef production in Germany, two selection lines were included: GS cows of the dual-purpose breed in milk production systems (GS m ) and GS suckler cows as used in beef production systems (GS b ).

Milk protein typing
Skimmed milk samples from 2962 cows were analyzed for milk protein polymorphisms of α s1 -CN, α s2 -CN, β-CN and κ-CN by isoelectric focusing in 0.3 mm thin polyacrylamide gels according to Seibert et al. (1985) and Erhardt (1989). This method describes the simultaneous separation of the known α s1 -CN, β-CN, α s2 -CN and κ-CN variants due to their isoelectric point and considers genetic variants which cannot be detected via commercial SNP chip applications.
L. G. Hohmann et al.: Bovine milk protein diversity in selection lines

Statistical analyses
Allele frequencies were calculated by direct counting, and HWE was tested by applying a χ 2 test using the packages adegenet version 2.1.1 (Jombart, 2008;Jombart and Ahmed, 2011) and pegas (Paradis, 2010), as implemented in the software package R, version 2.14.2 (R Core Team, 2019). The polymorphic information content (PIC) was computed for each locus within and across populations using the R package polysat (Clark and Jasieniuk, 2011). The observed (H o ) and expected (H e ) heterozygosity were calculated using the R package adegenet. Wright's F -statistic parameters (F IS , F IT , F ST ; Wright, 1965) describing the expected level of heterozygosity at various levels of population structure were calculated for each locus across all selection lines using the R package hierfstat (Goudet and Jombart, 2015). The most widely used fixation index (F ST ) serves as a measure of population differentiation due to genetic structure. The overall inbreeding coefficient (F IT ) measures the reduction in heterozygosity of an individual relative to the total population, whereas Wright's inbreeding coefficient (F IS ) measures the reduction in heterozygosity of an individual due to nonrandom mating within its subpopulation (Wright, 1965). The R package hierfstat was also applied for the calculation of F IS per population and loci. Negative F IS values indicate heterozygote excesses and positive F IS values imply a deficiency of heterozygotes, indicating a considerable level of inbreeding. Haplotypes were inferred using the software package PHASE version 2.1 (Stephens et al., 2001), in order to evaluate the haplotype variability within and among populations.
The standard genetic distance (D s ) according to Nei (1972) was calculated from haplotype frequencies using the R package adegenet. The unrooted dendrogram was constructed using the unweighted pair-group method with arithmetic mean (UPGMA) (Sneath and Sokal, 1973) to reconstruct phylogenetic relationships. The robustness of the phylogenies was evaluated by bootstrap values, considering 10 000 replications of resampling loci.
Discriminant analysis of principal components (DAPC) as implemented in the R packages ade4 (Bougeard and Dray, 2018) and adegenet was used to illustrate the admixture within the populations. In contrast to other common multivariate approaches (e.g., principal component analysis or factorial correspondence analysis), DAPC maximizes the separation between groups while minimizing variation within a group, providing a clear discrimination of pre-defined genetic groups (Jombart et al., 2010;Alves et al., 2015).

Allele frequencies and test for Hardy-Weinberg equilibrium
A total of 11 alleles were detected in eight selection lines at four casein loci. The number of alleles per locus ranged from five (β-CN) over three (κ-CN) to two (α s1 -CN) alleles. For α s2 -CN, only the allele A was identified, so that the monomorphic locus α s2 -CN was excluded from further analyses. Allele frequencies of the remaining casein loci α s1 -CN, β-CN and κ-CN in the eight studied selection lines are presented in Table 2. For α s1 -CN, all selection lines showed an average high frequency for the common B allele (97 %) and a minor allele frequency (MAF) of 3 % for the C allele. Only the selection lines HF_G p (7 %), GS b (6 %) and GS m (6 %) showed a MAF larger than 3 % for the C allele. For β-CN, the variant A2 (53 %) was the predominant allele, but the two DSN subpopulations had a higher proportion of the A1 allele (in average 67.3 %). The highest frequency of A2 was found in HF_NZ (68 %). The A3 allele revealed highest frequencies in HF_G p (7 %) and HF_NZ (2 %) but was zero in both GS subpopulations. The C allele of β-CN only occurred in GS b (2 %) and GS m (0.4 %) and thus represents a breedspecific allele for GS. With regard to κ-CN, the allele A had the largest frequency in most of the selection lines, but exceptions with a higher or equal frequency of the κ-CN B allele were HF_G p and HF_NZ with 51 % and 50 %, respectively ( Table 2). The highest frequency of the κ-CN E allele was found in HF_G ref (10 %). The rare allele C, which was detected by Erhardt (1993) in GS with a frequency of 0.02 %, was not identified in the sampled animals. In the χ 2 test for HWE, three and five selection lines showed significant deviation (P < 0.05) for the β-CN and κ-CN loci, respectively, with corresponding degrees of freedom (d.f.) ranging from 1 to 6 ( Table 2). The calculated χ 2 value for α s1 -CN was 0.57 on average (d.f. = 1), indicating HWE in all populations (P > 0.5, Table 2).

Genetic variation of casein loci
The breed-and casein-wise estimates of H o , H e and F IS as well as the PIC are presented in Table 3  (DSN west ) at κ-CN. The negative F IS values of some breeds indicated an excess of heterozygotes. The fixation coefficients of subpopulations within the total population, measured as F ST value for the three loci α s1 -CN, β-CN and κ-CN, varied from 0.016 (α s1 -CN) to 0.080 (β-CN), with a mean of 0.071. It means that 7.1 % of the total genetic variation in the selection lines corresponds to genetic differences among populations, while the remaining 92.9 % explained differences among individuals within population. Additionally, results of F statistics revealed on average an excess of heterozygotes of 11.3 % for each of the analyzed subpopulations (F IS ) and 3.4 % in the whole population (F IT ). In comparison to the negative F IS values of κ-CN (−0.233) and β-CN (−0.017) among the eight selection lines, the casein locus α s1 -CN showed a deficit of heterozygotes due to its positive F IS value (0.001). Table 4 represents the results of the haplotype analysis of the α s1 -β-κ-CN cluster (in order according to their location on BTA6). A total of 13 haplotypes was identified. More than 80 % of all individuals carried one of the BA1A, BA2A or BA2B haplotypes (abbreviation of the specific combination of α s1 -β-κ-CN alleles), with mean frequencies of 35 %, 32 % or 20 %, respectively. DSN east , DSN west and HF_G p revealed the highest frequencies for BA1A, while the most frequent haplotype for the remaining selection lines was BA2A (Table 4).

Genetic distances and population structure
The matrix of Nei's D s among the studied selection lines is presented in Table 5. We identified a very close relationship between GS b and GS m (0.004). A close relationship was also found between HF_G m and HF_NZ (0.011), followed by HF_G m and GS m (0.013). The selection line HF_G ref was the most divergent from the two DSN subpopulations: 0.318    Fig. 2. For the best discrimination of haplotypes into predefined clusters, DAPC was run using 12 principal components and seven discriminant functions. The first two linear discriminants, which are illustrated in the scatterplot, contributed to 56 % and 27 % of the total variation, respectively. The first linear discriminant separated DSN and HF popula-  (Nei, 1972). The x axis represents the genetic distances between the eight studied selection lines.
tions, whereas the second linear discriminant distinguished between the Simmental subpopulations from all other selection lines.

Temporal changes of milk protein polymorphisms
Temporal changes in allele frequencies of milk protein polymorphisms in the HF and GS common cattle breeds are evident when comparing results from the present study with allele frequencies for the same breeds 25 years ago (Erhardt, 1993; Table 2). The 25-year period reflects six generations of mating and selection schemes, with an additional possible impact of random genetic drift. At the β-CN locus, frequencies for the A2 allele were larger in the present than in the historical data, in particular for HF cows. In this regard, Erhardt (1993) , 2000, 2010and after 2010, respectively. Freyer et al. (1999 and Bech and Kristiansen (1990) reported a favorable impact of the A2 allele on milk and protein yield. Hence, ongoing selection of bulls and cows according to genetic merits for milk or protein yield indirectly increased the A2 variant for β-CN. The relatively high A2 frequencies in the "milk lines" HF_G ref (64 %) and GS m (65 %) support such hypothesis. Another explanation addresses the relation of A1 milk consumption with the release of the opioid peptide β-casomorphin-7, which may play a role in the development of some human diseases (i.e., ischemic heart disease, type 1 diabetes) (Tailford et al., 2003;Kamiński et al., 2007;Cieślińska et al., 2012;Sheng et al., 2019). As the production of milk with special nutrition properties (i.e., hypoallergenic milk) benefits from the A2 variant of bovine β-CN (De Noni, 2008), farmers are encouraged to select favorable alleles for milk production in niche markets. Additionally, A2 variant information for HF_G sires recently is given in sire catalogues, public journals and discussion forums (Gödert et al., 2017). Nevertheless, the pasturebased genetic line HF_NZ revealed the highest frequency with 68 % for the A2 allele. On the one hand, this may be the result of crossing with the breed Jersey, which generally displays a high frequency of 67 % for β-CN A2 (Erhardt, 1993). In New Zealand, the crosses between HF and Jersey generated the so-called "kiwi cross", a new synthetic breed with favorable values for milk composition traits (Rowarth, 2013;Buckley et al., 2014;Mogollón-García et al., 2020). Another explanation might be the intensive selection for the A2 variant as initiated by the "a2 Milk Company" founded in New Zealand (The a2 Milk Company, 2020). The "a2 Milk Company" initiated a milk marketing program, considering only cows carrying the homozygous β-CN genotype A2A2. Up to now, there have been no progresses regarding active marketing strategies for bovine milk with defined milk protein variants (e.g., A2 milk) in Germany (Gödert et al., 2017). In contrast to the increasing frequencies of the A2 allele, the β-CN A1 allele declined with progressing time in all selection lines, apart from HF_G p . The genetic line HF_G p reflects the A1 and A2 allele frequencies as identified by Erhardt (1993) in HF_G cows. Such a result indicates genomic characteristic similarities of low yielding HF_G cows from low input systems with the broad HF population 25 years ago.
With regard to κ-CN, frequencies of the B allele increased in all populations with progressing time. The B allele frequency in HF_G was 13 % in 1993 (Erhardt, 1993) but increased to 39 % (average from all HF selection lines). The increasing frequency may be due to the positive effect of the κ-CN B allele on milk protein percentage and therefore its favorable cheese-making properties (Hallén et al., 2008;Heck et al., 2009;Mohammadi et al., 2013). Such association stimulated interest in using casein polymorphism in marker-assisted selection schemes to improve milk performance traits in farm animals (Kumar et al., 2006). Simultaneously, the rare allele C was suppressed until its complete loss from the subpopulations. Departures from HWE in both loci (β-CN and κ-CN) reflect temporal changes, i.e., increasing frequencies of the favorable alleles A2 (β-C) and B (κ-CN), due to the impact of selection (Lachance, 2009).
For α s1 -CN, frequencies in HF did not differ between present and historic data, because high frequencies of the α s1 -CN B allele, close to fixation, were already reported by Erhardt (1993). In GS, the B allele frequency increased from 89 % (Erhardt, 1993) towards fixation (94 %; Table 2). With regard to the α s2 -CN locus, only the allele A was identified in all selection lines. The D allele is rather common in French breeds (e.g., Montbéliarde) (Grosclaude et al., 1979) but was also described for HF and GS with low frequencies of 0.2 % and 2 %, respectively (Erhardt, 1993;Meier et al., 2019). Such a loss of rare alleles (e.g., α s2 -CN D allele, κ-CN C allele) indicates genetic drift, a mechanism of evolution in which allele frequencies change over generations by chance (Hartl and Clark, 2007), with an impact on decreasing genetic diversity.

Genetic diversity parameters
Among selection lines, HF_G p displayed the highest gene diversity over all loci (average H e overall loci = 0.41). Alternative selection of HF in grazing systems with a focus on a broad pattern of functional traits including especially female fertility and somatic cells might explain their variability at protein loci. Additionally, observed (but rather limited) genetic exchange with DSN contributed to genetic diversity. The H e for each locus in the reference line HF_G ref is in agreement with the commercial Portuguese HF population (Beja-Pereira et al., 2002). The lowest values for H e across all loci were observed for DSN east , which might be due to the larger inbreeding increase in the DSN east subpopulation compared to the subset for DSN cows from former West Germany (Jaeger et al., 2018a). As the deficiency of heterozygotes is an indication of inbreeding, the positive F IS values for DSN east at both loci (β-CN and κ-CN) (0.27 and 0.02, respectively) underline this assumption. An explanation for the mating of closely related animals in the past DSN east is the restricted gene flow from foreign countries in the former German Democratic Republic. In contrast, in DSN west , sires from the Netherlands have been used in the period from 1970 to 1980 (Jaeger et al., 2018a). Nevertheless, also for DSN west , the diversity measurement (H e = 0.31) suggests a general small effective population size for DSN, reflecting a small real population with only 2800 registered cows in Germany (Rinderproduktion Berlin-Brandenburg GmbH, 2016). A decreasing population size is a major cause for losses in genetic diversity (Kantanen et al., 1999). In such context, Jaeger et al. (2018b) calculated an increase of inbreeding per year in DSN of 0.1 %, implying a rather small effective population size of 85 animals.

Relationships between selection lines
In the UPGMA dendrogram, the genetically closely related subpopulations of the DSN breed (DSN east , DSN west ; D S = 0.014) built their own cluster, clearly differentiated from the remaining selection lines. The majority of selection lines (i.e., selection lines within HF and GS) revealed the highest frequency for the haplotype BA2A, which was also detected for Italian Friesian cattle (Boettcher et al., 2004). In contrast, in both DSN subpopulations, BA1A was the most frequent casein haplotype with a frequency up to 57 %. The predominant impact of such chromosomal segment and especially of the A1 allele (67.3 % on average for both DSN subpopulations) is in agreement with results of Ng- Kwai-Hang et al. (1984) and Meier et al. (2019). Generally, in addition to DSN, breeds originating from northern Europe, including European Red cattle (Bech and Kristiansen, 1990), Black and White Lowland breeds (McLean et al., 1984) and Danish Red cattle (Meier et al., 2019), showed highest frequencies for the A1 allele and the corresponding haplotypes. These results indicate that the β-CN A1 allele is a major character-istic for breeds with a certain geographic location in Nordic countries. A further explanation for genomic similarities in Nordic breeds including DSN addresses identical breeding objectives towards a dual-purpose phenotype. A shared characteristic of Nordic breeds and DSN is the similarity in fat and protein percentages (Meier et al., 2019).
The average genetic distance of 0.30 between the founder DSN breed (DSN east , DSN west ) and the modern HF_G ref population indicates the breeding particularities in both lines in the past decades. The robust DSN cattle were subject of extensive breeding mostly in pasture-based production systems, predominantly considering mating with natural service sires. In contrast, HF_G ref cows were intensively selected for milk yield, indirectly favoring casein haplotypes as already reported in goats (Grosclaude and Martin, 1997). The results for Nei's D S indicate a close relationship between both DSN subpopulations and HF_G p . The predominance of the A1 allele in HF_G p makes them more similar to DSN east and DSN west than to the current HF_G ref population. This might be due to the higher genetic percentage of DSN in their ancestors, as the genetic pasture line was selected for robust animals (Jaeger et al., 2018a, b). The high frequencies for the α s1 -CN C allele in HF_G p (7 %) and both DSN subpopulations (6 %) support such hypothesis.
Genetic distances were observed between HF selection lines, as they clustered separately. With regard to allele frequencies at the κ-CN locus, some specific patterns in selection lines were noticed. First, the selection lines HF_G p , HF_NZ and HF_G m from the grazing herds displayed the highest allele frequencies for the favorable κ-CN B allele (51 %, 50 % and 42 %, respectively). The importance of specific breeding goal traits differed in divergent feeding systems (Washburn and Mullen, 2014;Delaby et al., 2018). In pasture-based systems, the focus of selection has emphasized fertility, fitness and robustness. The prevalence of the B allele in pasture-based selection lines may be the result of indirect selection for these traits. In this regard, Hiendleder et al. (2003) detected quantitative trait locus (QTL) linked with the milk protein genes on BTA6 for udder quality and limb conformation (e.g., quality of feed and leg), being traits reflecting the pasture ability.
In both GS subpopulations (GS b , GS m ), we identified a private allele in the β-CN C variant and therefore the breedspecific haplotype BCB. This is in agreement with results by Çardak (2005), who found the C allele with a frequency of 2.3 % in Simmental cows but not in HF. The occurrence of β-CN C explains the lowest D A between GS b and GS s as well as their differentiation from the other selection lines (Fig. 1). The breed-specific C allele may be linked to a favorable mutation on BTA6 for carcass and body weight, promoting the breeding value for beef production in a dual-purpose breed like GS. In this regard, QTL for growth traits (i.e., body length, carcass weight) have been detected within the NCAPG gene located on BTA6 in local beef cattle breeds (e.g., Chinese Qingchuan and Japanese Black and Brown beef cattle), indicating overlapping mechanisms of bone and muscle growth with lipid deposition (Setoguchi et al., 2009;Liu et al., 2015). Furthermore, the gene SPP1 on BTA6 was associated with body weight in Polish Holstein Friesian cattle (Pareek et al., 2008). In a functional genomic approach, Sheehy et al. (2009) suggested SPP1 as an important regulator of bovine milk protein gene expressions, explaining the possible link between the casein and SPP1.

Genetic differentiation among selection lines
In the present study, the average F ST among selection lines was 7.1 %, reflecting a moderate level of population differentiation (Hartl and Clark, 2007). Hence, 7 % of the total genetic variation corresponds to selection line particularities, and the remaining 93 % is due to individual differences. The illustration for DAPC (Fig. 2) indicates that the selection lines do not clearly distinguish divergent clusters. The DAPC visualizes a high admixture between the subpopulations. We only identified a separation between the DSN and HF subpopulations along the linear discriminant 1 and a separation between the Simmental subpopulations with the remaining selection lines. The slight genetic variation among subpopulations might be a result of the decreased variability at the casein loci, which is indicated by the average PIC of 0.27 over all loci. Genotyping of the casein genes (e.g., β-CN and κ-CN) is of increasing relevance for practical breeding and selection, also from a genetic diversity monitoring perspective.

Conclusions
The results of the present study indicate that different selection strategies (e.g., pasture ability, meat or dairy production) indirectly contributed to the variability of the casein polymorphisms linked to milk production traits. The selection lines of the endangered DSN breed showed the lowest gene diversity and clearly separated from the HF and GS breeds due to their predominance of the β-CN A1 allele. The pasture-based selection lines of the HF breed carried the favorable κ-CN B allele with highest frequency, which is related to a higher protein content in milk. Temporal changes in allele distributions reflect that casein loci or selected mutations in close proximity to the casein underlie selective breeding. Fixation of alleles and results for evaluated indicators of heterozygosity (e.g., H e , F ST , F IS ) showed diversity loss at the casein loci. The present study revealed differences in allele frequencies at casein loci across selection lines, indicating breeding potential for specific milk markets. Furthermore, genetic milk protein variants can be used to monitor genetic diversity.