Variation in the caprine keratin-associated protein 15-1 (KAP15-1) gene affects cashmere fibre diameter

Abstract Keratin-associated proteins (KAPs) are a structural component of cashmere fibre, and variation in some KAP genes (KRTAPs) has been associated with a number of caprine fibre traits. In this study, we report the identification of KRTAP15-1 in goats. Sequence variation in the gene was detected using the polymerase chain reaction single-strand conformation polymorphism (PCR-SSCP) technique in 250 Longdong goats, and six variants (named A to F) containing eight single nucleotide polymorphisms (SNPs) were identified. Five of the SNPs were non-synonymous and would lead to putative amino acid changes. Reverse-transcription polymerase chain reaction (RT-PCR) analysis revealed that KRTAP15-1 was expressed in secondary hair follicles but not in heart tissue, liver tissue, lung tissue, kidney tissue or the longissimus dorsi muscle. Despite being rich in cysteine, the caprine KAP15-1 protein possesses a high content of serine and moderate content of glycine and phenylalanine. Association analyses revealed that KRTAP15-1 variant A was associated with decreased mean fibre diameter (MFD), and this effect appeared to be dominant; while variant C was found to be associated with increased MFD, the effect being recessive. The findings suggest that caprine KRTAP15-1 is highly polymorphic and that variation in this gene affects cashmere MFD.


Introduction
Hair and cashmere are produced by the primary and the secondary hair follicles, respectively, of cashmere goats. As a consequence of the characteristic of being softer, lighter and stronger, with better insulating properties, cashmere fibres are sometimes called "soft gold", this reflecting how highly the fibre is valued. The price of cashmere fibres is comparatively much higher than wool and mohair, and cashmere fleece weight, mean fibre diameter (MFD) and unstraightened fibre length are the most important traits that determine the economic return for the production of cashmere.
The main components of cashmere fibres are keratins (Ks), which form keratin intermediate filaments and keratinassociated proteins (KAPs), which form a matrix cross-linking the keratin intermediate filaments. The Ks and KAPs are therefore believed to play an important role in determining the characteristics of the fibre.
KAPs have a high content of either cysteine or glycine and tyrosine, and based on this content, the proteins can be divided into three broad groups: the high-sulfur (HS) group, which contains less than 30 mol % cysteine; the ultra-highsulfur (UHS) group, which contains more than 30 mol % cysteine, and the high glycine and tyrosine (HGT) group, which has 35 mol %-60 mol % glycine and tyrosine (Gong et al., 2016). Within these groups, the KAPs can be further subdivided into families based on their sequence similarity, and over 100 KAP genes (called KRTAPs) belonging to 27 fam-126 M. Zhao et al.: Variation in the caprine keratin-associated protein 15-1 (KAP15-1) gene ilies have been identified to date across mammalian species (Gong et al., 2010).
KAP15-1 is a single gene-member family belonging to the HS-KAP group. The KAP15-1 gene (KRTAP15-1) has been described in mice and humans (Pruett et al., 2004;Rogers et al., 2002) and recently in sheep . In sheep, variation in KRTAP15-1 has been reported to affect wool yield . Despite a caprine KRTAP15-1 sequence having been deposited in the NCBI GenBank database (accession number AY510116.1), little is known about KRTAP15-1 variation and its effect on fibre traits in this species.
In this study, we attempted to identify KRTAP15-1 in goats, to search for potential variation in the gene and to investigate its effect on cashmere fleece traits.

Goats and DNA samples
Two hundred and fifty-three Longdong cashmere goats fed at the Yu sheng Cashmere Goat Breeding Company in Huan County of Gansu Province were investigated. At 1 year of age, cashmere fibres were collected by combing the goats, and the weight of fibre collected per goat was measured. A sample of fibre from the mid-side region was collected from each goat and sent for the measurement of mean crimped fibre length and MFD at the Inner Mongolia Agricultural University, Inner Mongolia, China.
Blood samples from these goats were collected directly onto FTA cards (Whatman BioScience, Middlesex, UK). A two-step washing procedure was used to purify the goat genomic DNA for polymerase chain reaction (PCR) amplification from 1.2 mm punches of the dried blood spots, using the protocol described in Zhou et al. (2006).

Animal tissues
Three 3-year-old Longdong cashmere goats were slaughtered, and tissue from the skin, heart, liver, lungs, kidneys and longissimus dorsi muscle was collected and rapidly frozen for storage in liquid nitrogen. The separation of primary and secondary hair follicles from the skin tissue used the method described by Jin et al. (2011). Briefly, the skin sample was cut into a strip of a width of 1.0 cm. Next, the hair follicle bulbs were exposed by removing the subcutaneous fat using a dissecting needle and the follicle tissues separated from the surrounding tissues. Once isolated, the hair follicles were sorted under a microscope into primary and secondary follicles, this being based on whether they had sweat glands or not.
Amplifications were undertaken in Bio-Rad S1000 thermal cyclers (Bio-Rad, Hercules, CA, USA) and performed in a 20 µL reaction including the purified genomic DNA from one 1.2 mm punch of dried blood, 0.25 µM of each primer, 2.0 µL of 10× PCR buffer (supplied with the DNA polymerase enzyme), deoxyribonucleoside triphosphates (dNTPs) at 150 µM (Takara, Dalian, China), 2.5 mM Mg 2+ and 0.5 U of Taq DNA polymerase (Takara), with deionised water (dH 2 O) to make up the volume. The PCR amplification conditions consisted of an initial denaturation at 94 • C for 5 min, followed by 35 cycles of 94 • C denaturation for 30 s, 60 • C annealing for 30 s and 72 • C extension for 30 s, and a final extension at 72 • C of 5 min. The quality of PCR products was examined using agarose gel electrophoresis (1 % gel in 1× TBE (Tris-borate-EDTA) buffer).

Sequencing of KRTAP15-1 variants and sequence analyses
Amplicons that appeared to be homozygous upon analysis of the SSCP gels were sequenced in both directions by the Beijing Genomics Institute (Beijing, China). Variants that appeared to be present in a heterozygous form were sequenced using an approach described previously (Gong et al., 2011). Briefly, a band corresponding to the variant was excised as a gel slice from the polyacrylamide gel, macerated, and then used as a template for re-amplification with the original primers. The second amplicon was then sequenced directly. The alignment of DNA sequences, translation, comparisons and the construction of the phylogenetic tree were undertaken using DNAMAN version 5.2.10 (Lynnon BioSoft, Vaudreuil, Canada).
The coding sequence of a caprine KRTAP15-1 sequence (accession number AY510116.1) was used to BLAST search the Caprine Genome Assembly (ASM170441 v1) to determine its chromosomal location.

Reverse-transcription polymerase chain reaction
Trizol reagent (Invitrogen, Carlsbad, CA, USA) was used to extract total RNA from between 50 and 100 mg of each of the five tissue samples and the secondary follicles. The quality and concentration of the RNA extracted was checked using 2 % agarose gels electrophoresis and UV spectrophotometry, respectively.
The PrimeScript ™ RT Reagent Kit with gDNA Eraser (Perfect Real Time, Takara) was used to produce cDNA, following the manufacturer's instructions. This cDNA was amplified using a second set of PCR primers (5'-ATCTTCCGCAGTCCCTG-3' and 5'-GATGACCGGCAACTCCT-3') located within the KRTAP15-1 coding region, which would amplify a 161 bp fragment. This amplification used the conditions and thermal profile described above, but the genomic DNA template was replaced with a 0.8 µL aliquot of the cDNA. The caprine β-actin gene was selected as an internal reference standard with the primers 5'-AGCCTTCCTTCCTGGGCATGGA-3' and 5'-GGACAGCACCGTGTTGGCGTAGA-3'. PCR products were examined by electrophoresis in 1.0 % agarose gels.

Statistical analyses
All statistical analyses were undertaken using SPSS v24.0 (IBM, NY, USA). General linear mixed-effects models (GLMMs) were used to evaluate the effect of the absence or presence (coded as 0 or 1, respectively) of KRTAP15-1 variants on the three cashmere traits that had been measured.
For those variants with sufficiently common homozygous forms (> 1 % of all genotypes), a second set of analyses was performed with the number of variant copies present included (in place of absence or presence), followed by planned orthogonal contrasts to ascertain whether additive, dominant or recessive effects were present. These models were conducted in an identical manner to the GLMMs used for testing the presence/absence of each variant. As with the absence or presence models, each variant was tested in separate models and then subsequently included in a model with all the variants that met the criteria (i.e. sufficient homozygosity) present. A Bonferroni correction was applied to reduce the probability of false positive results during the multiple comparisons in these models.
Gender and sire were found to affect (P < 0.05) all the fibre traits and so they were included in the models as fixed and random factors, respectively. Birth rank was not found to affect fibre characteristics and thus was not included in the models. Only the main effects were tested. Unless indicated, all P values were considered statistically significant when P < 0.05. Trends were noted when 0.05 ≤ P < 0.1.

Sequence variation in caprine KRTAP15-1
Six unique banding patterns (A to F) were identified in the Longdong cashmere goats by PCR-SSCP analysis (Fig. 2). Either one or a combination of two banding patterns were observed for each goat, and this is consistent with it being either homozygous or heterozygous for variation in the gene. Sequencing of PCR amplicons representative of these different banding patterns revealed six unique nucleotide sequences and eight single nucleotide polymorphisms (SNPs). Five of these SNPs were non-synonymous, and one nonsynonymous SNP would lead to the creation of an alternative start codon at the seventh codon position (Fig. 3). This would result in a loss of six N-terminal amino acids in the protein.
Four of the SNPs were found to be in linkage, and these were c.229C/T, c.245T/C, c.323G/A and c.339T/C, with the haplotypes through these SNPs being either T-C-A-C or C-T-G-T.
In the 250 Longdong cashmere goats investigated, the frequency of the six KRTAP15-1 variants A, B, C, D, E and F was 41.6 %, 27.5 %, 27.3 %, 3.2 %, 0.2 % and 0.2 %, respectively. The common variants were A, B and C, while the common genotypes were AA, BB, CC, AB, AC and BC, these collectively accounting for 92.8 % of the goats. The other genotypes were rare and occurred at a frequency of less than 5 %.

Phylogenetic relationship of caprine KRTAP15-1 to other HS-KRTAPs
Phylogenetic analysis of the caprine KRTAP15-1 sequences with all of the HS-KRTAPs identified in goats and sheep and the KRTAP15-1 sequences from humans and mice revealed that the caprine KRTAP15-1 sequences were clustered with the KRTAP15-1 sequences from sheep, human and The arrows represent the direction of transcription. The distances between these genes are approximate and refer to the caprine genome assembly NC_030808.1. mice and separated from ovine and caprine KRTAPs from other HS-KAP families (Fig. 4). This suggests the sequences identified represent caprine KRTAP15-1, and based on this analysis the variants A-F were formally named CAPHI-KRTAP15-1*A to CAPHI-KRTAP15-1*F to be consistent with the KAP/KRTAP nomenclature proposed by .
Other amino acids are rare or absent in the protein.

Expression of caprine KRTAP15-1
The concentration of RNA extracted from the secondary follicles and five tissues ranged from 931 to 1050 ng µL −1 . Expression of KRTAP15-1 was detected in secondary hair fol-licles from the Longdong goats but not in other tissues, including heart tissue, liver tissue, lung tissue, kidney tissue and tissue from the longissimus dorsi muscle (Fig. 5).

Association of KRTAP15-1 variation with cashmere fibre traits
Of the six variants identified, variants D, E and F were rare (with a frequency of less than 5 %) in the Longdong cashmere goats investigated. These variants were removed from the association study, and analyses were only carried out for variants A, B and C. All of these variants also had sufficiently common homozygous forms to enable the copy number analyses (i.e. > 1 % of all genotypes). In the absence or presence models, the presence of variant A was found to be associated with decreased cashmere MFD (absent: 13.6±0.06 µm; present: 13.4±0.05 µm; P = 0.006) ( Table 1). No associations were detected for the other variants.
When the number of variant copy was considered, goats that did not contain variant A produced fibres with higher mean fibre diameter than goats that contained one or two copies of A ( Table 2). Goats that carried two copies of variant C produced fibre with higher MFD than goats that did not have C or that had one copy of C (Table 2).

Discussion
This study reports the identification of caprine KRTAP15-1, and describes its effect on cashmere fibre traits. The caprine KRTAP15-1 gene was clustered with other HS-KRTAPs and HGT-KRTAPs in a region on chromosome one and in an arrangement similar to that found in sheep . The gene was expressed in secondary fibre follicles in the skin, and it exhibited sequence variation. This variation appeared to affect cashmere MFD.

Figure 5.
RT-PCR results for detection of β-actin and KRTAP15-1 in six tissues derived from Longdong cashmere goats. M: 750 bp DNA marker; lanes 1 and 8: secondary hair follicle tissue; lanes 2 and 9: heart tissue; lanes 3 and 10: liver tissue; lanes 4 and 11: lung tissue; lanes 5 and 12: kidney tissue; lanes 6 and 13: longissimus dorsi muscle tissue; lane 7: negative control. 13.4 ± 0.05 119 13.5 ± 0.05 112 0.358 * Estimated marginal means and standard errors (SEs) of those means from general linear mixed-effects models. "Gender" and "sire" were included in the models as a fixed factor and a random factor, respectively. P < 0.05 is in bold. stricted to expression in the skin. This is consistent with the expression pattern detected for caprine KRTAP20-1  and caprine KRTAP20-2  but different to that reported for caprine KRTAP11-1. Jin et al. (2017) reported that this gene was strongly expressed in skin, heart and liver. The reason for this is currently unknown.
While KAP15-1 could be classified as belonging to the HS group (Rogers et al., 2006;Gong et al., 2012), the putative caprine KAP15-1 protein would contain a lower level of cysteine and a higher amount of serine than might be expected. This difference is not unprecedented, and it is similar to what has been observed with the HS-KAPs: KAP24-1 and KAP26-1 Li et al., 2017). The caprine KAP15-1 protein would also be rich in phenylalanine. Having a high level of phenylalanine has only been reported for the KAP15 family but has not been found in other HS KAPs . The functional consequence of this is unknown, but it has been suggested that the possession of a high level of phenylalanine may facilitate interac-tion between KAP15-1 and the keratin intermediate filaments (KIFs) via a ring-stacking mechanism .
Caprine KRTAP15-1 appears to be polymorphic, with a high level of polymorphism being detected in a reasonably small group of cashmere goats. While ovine KRTAP15-1 is polymorphic , the level of polymorphism observed in sheep (four variants) was lower than detected here in the goats (six variants). Moreover, the number of SNPs identified in caprine KRTAP15-1 is higher than that identified in the sheep homologue, with eight SNPs found in caprine KRTAP15-1 compared to four SNPs in ovine KRTAP15-1 . The detection of eight SNPs in a 456 bp PCR fragment (excluding the primer binding regions) would give a density of approximately 17.5 SNPs per kilobase. This is much higher than the average SNP density of 4.9 SNPs per kilobase described for sheep (Kijas et al., 2009). Little is known about how this extensive variation came about in goats, but the linkage of SNPs suggests that gene conversion or non-reciprocal genetic exchange may have occurred and that this has contributed to the accumulation of sequence variation. The presence of two chi-  13.7 ± 0.10 a 19 0.024 * Estimated marginal means and standard errors (SEs) of those means derived from general linear mixed-effects models. "Gender" and "sire" were included in the models as a fixed factor and a random factor, respectively. Means within rows that do not share a superscript letter ( a or b ) are significantly (P < 0.05) different and are given in bold.
The association results obtained from the absence or presence models and the copy number models appear to match. For variant A, the absence of this variant was associated with a higher MFD than the presence of either one copy or two copies of the variant (see Table 2). This compares favourably with the single-variant absence or presence models, which reveal that the absence of variant A is associated with increased MFD, when compared to the presence of A (see Table 1). For variant C, goats that had one copy of this variant had no difference in MFD compared to goats that did not have variant C. This effect of variant C was not observed in the single-variant absence or presence models, but the finding might have been biased by the reasonably small number of CC goats investigated (n = 19).
It is interesting to note that despite both variants A and C appearing to have an effect on MFD, the nature of effects observed for these two variants appear to be different. The models suggest that variant A has a dominant effect, while variant C has a recessive effect. The reason for this is unknown, and it may simply once again reflect bias from the smaller number of CC goats, but it could favour the production of finer cashmere fibres, should A be selected for.
The effect of KRTAP15-1 in the cashmere goats appears to be different to that found in sheep. In sheep, KRTAP15-1 is found to mainly affect wool yield, but it only had a trend of association with wool MFD . Unfortunately fibre yield was not measured in the Longdong goats in this study. Whether this means that KRTAP15-1 has a larger effect on MFD in cashmere fibres from goats than in wool from sheep cannot be easily resolved. It might be as a consequence of the fineness of the fibres, as cashmere fibres are much finer than wool fibres, with the average MFD of the cashmere samples investigated in this study being 13.3 µm versus 19.2 µm for the wool samples investigated in Li et al. (2018). A small change in MFD in the goats may be more pronounced and thus more easily reach significance.
When compared to the other KRTAPs in goats, KRTAP15-1 appears to have a unique effect on cashmere traits. Of the 12 previously identified KRTAPs in goats, effects on cashmere fibre traits have been investigated for three: KRTAP8-2 (Liu et al., 2007), KRTAP20-1  and KRTAP20-2 . All of these KRTAPs are located on the same chromosome as KRTAP15-1. Variation in KRTAP8-2 (Liu et al., 2007) was found to be associated with cashmere weight and MFD but not with fibre length, while variation in KRTAP20-1 and KRTAP20-2 was associated with cashmere weight and fibre length , and only a trend of association was observed with MFD with KRTAP20-1 . Despite using the same population of cashmere goats as for the analysis of KRTAP20-1 and KRTAP20-2 , variation in KRTAP15-1 is found to be associated with MFD but not with cashmere weight and fibre length. This suggests that different KRTAPs may have different effects on cashmere traits. In this context, the effect of KRTAP15-1 on MFD would seem to be unlikely to be due to linkage to other KRTAPs on the same chromosome.