Population validation of reproductive gene mutation loci and association with the litter size in Nubian goat

Abstract Litter size is an important component trait of doe reproduction. By improving it, production efficiency and economic benefits can be significantly provided. Genetic marker-assisted selection (MAS) based on proven molecular indicators could enhance the efficacy of goat selection, as well as litter size trait. Many molecular markers have been identified that they can be used to improve litter size in different goat breeds. However, the presence and value of these markers vary among goat breeds. In the present study, we used the reported loci on other breeds of goat as candidate loci to detect whether these loci appear in this Nubian goat population; then we proceed to genotype and detect surrounding loci (50 bp) by multiplex PCR and sequencing technology. As a result, 69 mutations (59 SNPs and 10 indels) were screened out from 23 candidate genes in Nubian goat population, 12 loci were significantly associated with the litter size of first-parity individuals; 5 loci were significantly associated with the litter size of second-parity individuals; 3 loci were significantly associated with the litter size of third-parity individuals. In addition, five loci were significantly associated with the average litter size. The additive effect value of KITLG: g.18047318 G>A in first parity, KITLG: g.18152042G>A in third parity, KISS-1: g.1341674 C>G in first parity, and GHR: g.32134187G>A in second parity exceed more than 0.40, and the preponderant alleles are G, C, A and G, respectively. Further, linkage disequilibrium analysis of 21 mutation loci shows that 3 haplotype blocks are formed, and the litter size of combination type AACC in KISS-1 gene and AAGG in KITLG gene are significantly lower than that of other combinations genotype in first parity (P<0.05). These findings can provide effective candidate DNA markers for selecting superior individuals in Nubian goat breeding.


Introduction
The goat (Capra hircus) is believed to be one of the first domesticated ungulates that underwent domestication approximately 10 000 years ago; therefore it has been a witness to the historical progress of human civilization (Naderi et al., 2008). A variety of natural or artificial factors (e.g., environmental changes, human migration and socioeconomic influences) have shaped the phenotypic diversity of goat, which is one of the most important agricultural animals for various products (e.g., milk, meat and fiber) to humans (Silpa et al. 2018;Li et al. 2012;Takada et al. 1997b). The Nubian goat has good adaptability and reproductive performance with more than two kids per litter. They are farmed on a large scale in southwest China and also used to improve local doe.
Litter size is considered a trait with low heritability (h 2 = 0.10 − 0.15) , and traditional direct selection is ineffective in goat breeding (Cui et al., 2018). With the rapid development of animal molecular genetics and modern molecular biotechnology, the analysis of single gene or chromosome fragments has become the main object of analyzing fecundity traits. Marker-assisted selection (MAS), based on relevant genetic variants, is used with traits of low heritability (h 2 ), such as those associated with reproduction (Yh et al., 2020). The combined efficiency is significantly higher than that achievable by breeding based on the phenotype alone (Yuan et al., 2019). To facilitate MAS application to litter size in the goat industry, critical genetic variants causing phenotypic advantage should be verified, which is more convenient and efficient. Therefore, it is an effective way to scan single nucleotide polymorphisms (SNPs) in genes with known reproductive physiology functions. Multiplex PCR targeted amplification sequencing (MAT-seq) is an accurate, sensitive, widely applicable and economical method for loci detection and genotyping, which is a multiplex PCR amplification fragment compatible with Illumina HiSeq/MiSeq platform sequencing technology (Yoshihiko et al., 2018;Grandell et al., 2016). It is a good method to detect target sites or fragments (< 50 bp) and to genotype important candidate genes.
Based on current literature, some genes that affect the secretion of reproductive hormones in the hypothalamicpituitary-testicular axis, ovulation rate, embryonic/fetal survival and endometrial receptivity are considered to be the main candidate genes affecting litter size (Xu et al., 2018). The influence of candidate genes on reproductive traits differs considerably because these genes could affect the physiological pathways, metabolism and phenotypic expression differentially with different paracrine or autocrine effects (Wang et al., 2018b). The major focus of goat breeding for genetic improvement greatly relies on candidate genes, which influence reproductive traits so as to ascertain the functions of genes as causing changes in and association with phenotypic values of the trait, molecular structure analysis, expression profile, sequence variability and their association with phenotypes. The selection for favorable alleles of candidate genes could help the development of high litter size goat breeding. Therefore, to provide an improved understanding of the genetic complexity underlying the litter size in Nubian goat, the objectives of the present study were the following: (1) to detect 43 previously reported mutations in the 24 genes associated with the litter size of other goat breeds (e.g., Shaanbei White Cashmere goat, Jining Grey goat, Boer goat) and whether they appear in this Nubian goat population; (2) to investigate the effects of loci involved in reproduction on litter size in Nubian goat; and (3) to explore the effects of locus-locus combinations on litter size. These information will be useful in identifying valuable genetic markers for MAS in Nubian goat.

Animals and DNA extraction
The study was approved by the ethics committee of Guangxi University (Nanning, Guangxi, China) (approval number: 2016-Gxu205). A total of 355 Nubian doe (aged 12-36 months) were accessed from two farms (227 of Guangxi Fusui Guangyang Agricultural and Livestock Co. Ltd. and 128 of Guangxi Lvshijie Agricultural investment co. Ltd.) in Guangxi Province, China. All doe were mated naturally and raised under identical management conditions. Addi-tionally, all selected individuals were healthy and with detailed reproduction records (litter size of first-, second-and third-parity individuals) from 2016 to 2018. The experimental protocol across times is shown in Fig. 1. All efforts were made to minimize any discomfort during blood collection. Approximately 5.0 mL of blood has been collected aseptically from the jugular vein of each doe into 5 mL vacutainer tubes containing the anticoagulant, ethylenediaminetetraacetic acid (EDTA-K2). Genomic DNA is extracted from blood using the DNeasy Blood and QIAamp DNA extraction kit (Qiagen, Germany). The quality of DNA is checked by a spectrophotometer taking ratio of optical density (OD) value at 260 and 280 nm. The integrity of DNA was detected by 1.0 % agarose gel electrophoresis. DNA samples were each diluted to a working concentration of 30 ng/µL by TE buffer (pH 8.0) and stored at −20 • C for DNA pooling and genotyping.
2.2 Gene selection, primer design, DNA pool PCR amplification, polymorphism identification We hypothesized that molecular markers identified on goat breeds (e.g., Shaanbei White Cashmere goat, Jining Grey goat, Boer goat) that are significantly associated with litter size would have the same effect on Nubian goat. In the present study, genes with known biological functions in follicular and ovarian development were first selected as target genes. These genes or SNPs identified in this study and known to influence reproductive traits in a goat breed were then selected as candidate genes or markers. Therefore, 43 candidate loci (29 SNPs and 14 indels) on 24 genes with known reproductive physiology functions in animals were selected as potential candidate loci for goat litter size (Table 1). Among the 43 candidate loci (Table S1), 38 polymorphic loci (29 SNPs and 9 indels) of 23 genes were verified by Sanger sequencing of PCR products amplified from 5 DNA pools. These DNA pools comprised 30 randomly selected individual genomic DNA samples as a polymerase chain reaction (PCR) template for amplification. Specific primers for PCR were designed based on these candidate gene sequences. The PCR was performed in a volume of 40 µL, containing 20 µL 2 × Taq PCR MasterMix, 1.5 µL each of forward and reverse primers, 2 µL pool DNA (30 ng/µL), and 15 µL double-distilled H 2 O. The PCR protocol was as follows: initial denaturation for 3 min at 95 • C; denaturation for 15 s at 95 • C; annealing for 30 s at 50-60 • C; extension for 15-45 s at 72 • C; and a final extension for 5 min at 72 • C, with subsequent cooling to 4 • C. The PCR products were detected by 2.0 % agarose gel electrophoresis. The purified PCR products were sequenced using Sanger sequencing, and the sequences were compared using DNASTAR and Chromas software to detect putative SNPs. The identified SNPs were individually genotyped by the MTA-seq method (Sangon Biotech, Shanghai, China). Three DNA samples were randomly selected from 355 samples to genotype twice, and  2 blank samples (double-distilled water) were also used to eliminate cross-contamination.

Multiplex PCR and sequencing to genotype and detect surrounding loci (50 bp)
A panel which contains 38 target SNPs loci was designed (Table S2). Library preparation is performed by two-step PCR. First round PCR reaction was set up as follows: DNA (10 ng/µL) 2 µL; amplicon PCR forward primer mix (10 µM) 1 µL; amplicon PCR reverse primer mix (10 µM) 1 µL; 2 × PCR Ready Mix 15 µL (total 25 µL) (Kapa HiFi Ready Mix). The plate was sealed, and PCR was performed in a thermal instrument (BIO-RAD, T100TM) using the following program: 1 cycle of denaturing at 98 • C for 5 min, first 8 cycles of denaturing at 98 • C for 30 s, annealing at 50 • C for 30 s, elongation at 72 • C for 30 s, then 25 cycles of denaturing at 98 • C for 30 s, annealing at 66 • C for 30 s, elongation at 72 • C for 30 s, extension at 72 • C for 5 min and finally held at 4 • C. The PCR products were checked by using electrophoresis in 1 % (w/v) agarose gels in TBE buffer (Tris, boric acid, EDTA) stained with ethidium bromide (EB) and visualized under UV light. Then we used AM-Pure XP beads to purify the amplicon product. After that, the second-round PCR was performed. PCR reaction was set up as follows: DNA (10 ng/µL) 2 µL; universal P7 primer with barcode (10 µM) 1 µL; universal P5 primer (10 µM) 1 µL; 2 × PCR Ready Mix 15 µL (total 30 µL) (Kapa HiFi Ready Mix). The plate was sealed and PCR was performed in a thermal instrument (BIO-RAD, T100TM) by using the following program: 1 cycle of denaturing at 95 • C for 3 min, then 5 cycles of denaturing at 94 • C for 30 s, annealing at 55 • C for 20 s, elongation at 72 • C for 30 s, elongation at 72 • C for 30 s and a final extension at 7 • C for 5 min. Then we used AM-Pure XP beads to purify the amplicon product. The libraries were then quantified and pooled. Paired-end sequencing of the library is performed on the HiSeqXTen sequencers (Illumina, San Diego, CA).

Data quality control (QC) and SNP calling
Raw reads were filtered according to two steps: removing low-quality bases from reads 3 to 5 (Q < 20) by PRINSEQlite (v 0.20.3), and the remaining clean data were mapped to the reference genome by BWA (version 0.7.13-r1126) with default parameters. A Perl script was written to calculate each genotype of target locus. ANNOVAR (16 April 2018) was used to detect genetic variants.

Association analysis
Statistical analyses were performed using SAS (V. 9.2) (SAS Institute Inc.). The scattergrams of litter size at different parity were computed using Microsoft Excel (v16.42). The general linear model (GLM) of the analysis of variance (ANOVA) was used to analyze the relationships between the different genotypes and litter size. Because these two goat farms are national core farms, the geographical location is very close and all goats were fed with standardized feeding method. The effect of the farm on the number of goats' litter size is very small, which can be ignored. And leastsquares mean (LSM) was used for multiple comparisons in litter size of goat with different loci genotypes according to the adjusted linear model: In this model, Y ij k represents phenotypic value (litter size); µ is the population mean; P j shows the effect of the j th parity (j = first, second, third or average); G i represents the effect of the i th genotypes (i = Ref/Ref, Ref/Alt, or Alt/Alt); W k represents the effect of the k th weather (k = cool or hot) and ε ij k represents random error. P values < 0.05 were considered to be significant. All statistically significant single marker traits associated with LSM litter size were initially included in further combined haplotypes analysis of the combined effects of different genes on LSM litter size.
To estimate the mode of inheritance of the loci, the following functions were used: where D and A are the estimated values of dominant effect and additive effect, respectively; A 1 A 1 , A 1 A 0 and A 0 A 0 are the genotypic values of dominant homozygote, heterozygote and allozygote, respectively. The genetic effects are partially recessive, incompletely dominant, partially dominant, complete dominant and over dominant when D < 0, D = 0, 0 < D < |A|, D = |A| and D > |A|, respectively (Chong et al., 2018). The substitution effect of alleles was calculated by linear regression model SAS (V. 9.2): (3) In this model, Y ij represents phenotypic value (litter size); µ is the population mean; b is the corresponding regression coefficient; X ij is the value of the proportion of each allele in the genotype and ε ij represents random error.

Population parameter calculation
According to the results of genotype statistical analysis, genotypic and allelic frequencies, polymorphism information content (PIC), genetic homozygosity (Ho), genetic heterozygosity (He), effective allele number (Ne) and the Hardy-Weinberg equilibrium (HWE) were analyzed by POPGENE software version 3.2. Distribution differences for loci genotypic frequencies in Nubian goats were performed using the chi-square test in SPSS (Version 18.0) (IBM Corp., Armonk, NY, USA). Nubian goat populations with P > 0.05 (chi-square test) were considered to conform to the Hardy-Weinberg equilibrium.

Haplotype analysis
In the current study, only the significant SNPs in the same gene were used to estimate the extent of linkage disequilibrium (LD) and haplotype blocks. The extent of LD between significant SNP pairs and haplotype blocks was estimated using the HaploView 4.2 program. Using the default setting of HaploView 4.2 to run the program, the parameter r 2 represents the statistical association between the two SNPs. Generally, when r 2 > 0.8, the two SNPs can be regarded as closely linked and can be replaced each other. The association analysis between haplotype region and litter size, Hap score and p value of Hap score were computed using R software (haplo.states packages). The model of association analysis between combined haplotypes and litter size is as follows in Eq. (1) except that the genotype (G) is replaced by combined haplotypes.

Litter size trait analysis
Litter size from three parities was recorded in this study. The average litter size at the first to third parity was 2.130 ± 0.022; the litter size at the first parity was 2.065 ± 0.037; the second-parity litter size was 2.168 ± 0.040; and the third-parity litter size was 2.343 ± 0.066. Scatterplots also show that litter size tended to increase with the parity (Fig. 2).

Verification of polymorphic loci, genotyping and quality control
Among the 43 candidate loci (29 SNPs and 14 indels) of 24 genes, 38 polymorphic loci (29 SNPs and 9 indels) of 23 genes were identified by DNA pooling and Sanger sequencing (Table S2). Then, the primers of multiplex PCR were redesigned and sequenced. The genotypic correlation coefficients of three pairs of technical repeats equal to one and no genotyping signals were detected in two blank samples, indicating that the MTA-seq method is reliable. Sixty-nine mutations (59 SNPs and 10 indels) were screened out from 23 candidate genes by multiplex PCR and sequencing technology (Date S3).

Population genetic parameters
The genotypes and allele frequencies, He, Ho, Ne, PIC and Hardy-Weinberg equilibrium test P values, as well as other genetic parameters, associated with the significantly correlated loci were calculated to determine the genotype distribution among Nubian goat (Table 3). The significantly correlated loci data indicated the minimum (maximum) values of He were 0.02 (0.46), the minimum (maximum) value of Ho was 0.52 (0.98) and the minimum (maximum) value of Ne was 1.02 (1.94). Thirteen loci were low polymorphic status (PIC < 0.25); other SNPs were moderate (0.25 < PIC < 0.50) or high polymorphic (PIC > 0.50) status. Five loci deviated from Hardy-Weinberg equilibrium (P < 0.05).

Association analysis of multiple markers with reproductive traits
According to the results of linkage disequilibrium, we combined the loci with linkage disequilibrium and analyzed the correlation with litter size (Table 5). Two combinations (Block 1 and Block 2) were founded to be significantly associated with litter size in the first parity (P < 0.05). The litter size of combination type AACC in KISS-1 gene and combination type AAGG in KITLG gene were obviously lower than that of other combinations.

Discussion
MAS has been widely used in the selection of important economic characters of pigs, chickens, cattle and sheep. In the same way, there has been some research progress in goat. But the molecular markers found in goat are various in different breeds. Considering research progress of goat molecular markers, being significantly associated with litter size, we hypothesized that molecular markers identified on goat breeds (e.g., Shaanbei White Cashmere goat, Jining Grey goat, Boer goat) would have the same effect on Nubian goat. Therefore, we verify the polymorphisms in Nubian goat population of reported loci significantly associated with litter size and then perform genotyping and detection of surrounding loci (50 bp). By doing so, we are able to provide some scientific data to enable the efficient and rapid development of the Nubian goat industry. In this paper, 69 mutations were screened out from 23 candidate genes in Nubian goat by multiplex PCR and sequencing technology. However, only 21 mutations were significantly linked with litter size of Nubian goat. Interestingly, we found seven loci without mutation homozygous (INHA: g.2838305G>ins-C and g:28318345C>G; MARCHF1: g.1858739G>A; ATBF1: g.39535967C>T and g:39535996C>T; CTNNB1: c.1684-238_1684-222delACTTGGCTGTGCACAGT and c.1684-218_1684-211delGGAGGTGCinsT). It may be caused by the fact that homozygous lethal mutation and/or mutant homozygote is sterile (Nicol et al., 2009). In this sampling group of Nubian goat, 13 loci are of low polymorphic status (PIC < 0.25), which suggests that they have low selection potential; 8 loci are of moderate (0.25 < PIC < 0.50) or high polymorphic (PIC > 0.50) status, which suggests that they could provide more effective genetic information. Five loci obviously deviate from Hardy-Weinberg equilibrium (p < 0.05) in Nubian goat, which is mainly due to artificial  Litter size is presented as the least squares mean ± standard error (SE) of phenotypic values. Different superscript letters indicate significant differences between groups (P < 0.05). "-" denotes information is not available. A, G, C and T of the gene frequency and genotype frequency column respectively represent the adenine, guanine, cytosine and thymine; PIC, polymorphic information content; Ho, homozygosity; He, heterozygosity; Ne, number of effective allele; χ 2 and p value denote the chi-square value and P value of Hardy-Weinberg equilibrium test; "-" denotes information is not available. χ 2 0.05 = 5.991, χ 2 0.01 = 9.21, df = 1. P < 0.05 means the difference is significant, and P > 0.05 means the difference is not significant.  Hap frequency: the haplotype frequency in the population; Hap score: statistical measurement of correlation between each haplotype and traits. The positive or negative value of the score indicates the frequency of the haplotype in the population with positive or negative effects on the litter size. P value: the progressive chi-square P value is calculated by the square of score statistics.
selection and/or nonrandom mating/or genetic drift (Wang et al., 2015). In these 21 polymorphic loci, they are also significantly associated with litter size of other goat population. For example, KITLG: g.18047318 G>A is revealingly associated with litter size of Xinong Saanen (SN), Guanzhong (GZ) and Boer goat ; GDF9: g.66027701C>T is tellingly associated with litter size of Inner Mongolia cashmere goat (Nicol et al., 2009) as well. In quantitative genetics, the genetic variations of major economic traits are explained by both additive effect and dominant effect. Additive effect is considered to be a fixed component in the breeding, which also known as breeding value. Dominant effect refers to the deviation between the effect value of each gene and its additive effect value, which comes from the effect produced by the interaction between alleles. This effect is the main part of heterosis that can be inherited but not fixed in the breeding. The genetic effect value analysis of 21 significant loci showed that the additive effect values of KITLG as g.18047318 G>A in first parity, KITLG as g.18152042G>A in third parity, KISS-1 as g.1341674 C>G in first parity, and GHR as g.32134187G>A in second parity exceed more than 0.40, and the preponderant alleles are G, C, A and G, respectively.
In haplotype structure analysis, three linkage disequilibrium blocks are found in these 21 loci. The haplotypes are integrated into a GLM model to explore the effect of lociloci combinations on litter size. The litter size of combination type AACC in KISS-1 gene and AAGG in KITLG gene are expressively lower than that of other combinations genotype in first born (P < 0.05). Our findings indicated that homozygous individuals with the KISS-1 gene AACC and KITLG gene AAGG genotype should not be retained in MAS breeding programs for improving the frequency of favorable alleles in the Nubian goat.
In the functional annotation of mutation loci (Table S4), there are five SNPs (POU1F1: g.34236170A>G; GDF9: g.66027701C>T; NEDD4: g.48710049G>A; PGR: g.74589762C>T; KISS-1: g.1341600 A>G) in the coding region (exon) which are meaningfully related to litter size, and three (POU1F1, NEDD4, PGR) of them are synonymous mutations. They may change the efficiency, stability and accuracy of mRNA alternative splicing (Kimchisarfaty et al., 2007;Duan et al., 2003) as well as protein translation and fold (Sauna and Kimchisarfaty, 2011) to affect litter size in doe. Another two (KISS-1 and GDF9) directly change amino acids (KISS-1: p.Ala11Thr; GDF9: p.Ala273Val) and could change protein function, which maybe affects the secretion of reproductive hormone (e.g., GH, PRL, TSHβ, PGR) or/and early folliculogenesis (Hanrahan et al., 2004;Silva et al., 2011) to affect litter size in doe. There are 16 loci located in the non-coding region (5_prime_UTR, 3_prime_UTR, intron). Three loci (INHA: g.2838305G>ins-C, g:28318345C>G and g:28318349G>C) are located in 5_prime_UTR. It may lead to the change of INHA gene frame shift and/or translation activity to affect estrus start and litter size in doe (Zi et al., 2012). Two loci are located in 3_prime_UTR; it may affect the binding sites of miRNA and the stability of mRNA, subcellular localization and translation level (Stark et al., 2005), which affect litter size in doe. Eleven loci are located in intron, which mainly involve in expression regulation and selective shearing. They may affect the cutting of precursor mRNA, cause exon skipping or intron inclusion, and lead to the change of protein sequence, which causes functional difference  to affect litter size in doe. This suggests that these mutations have potential functional significance, although the underlying mechanism remains to be elucidated in Nubian goat.

Conclusions
In conclusion, 38 previously reported mutations of 23 reproduction genes were identified in Nubian goat, which are importantly related to litter size of other breeds. A total of 69 mutations (59 SNPs and 10 indels) were detected by multiplex PCR, and 21 loci were significantly correlated with litter size in Nubian goat. The combined haplotype analysis revealed that AACC genotype in KISS-1 gene and AAGG genotype in KTLG gene are notably lower than that of other combination genotypes in first born. This study suggests that these loci can be considered as effective SNP markers for genomic selection in Nubian goat breeding. However, further research with large sample sizes is needed to confirm the genetic effects of these mutations and molecular mechanisms.
Data availability. Data used and analyzed during this study are available from the corresponding author upon reasonable request. Table S1 contains primers and PCR condition applied for pooled-DNA sequencing for the 43 candidate loci. Table S2 contains the information of 38 polymorphic loci (29 SNPs and 9 indels) of 23 genes identified by DNA pooling and the primers of multiplex PCR. Date S3 contains the information of 69 polymorphic loci (59 SNPs and 10 indels). Table S4 contains the annotation on the significant mutation loci linked with litter size in Nubian goat. The supplement related to this article is available online at: https://doi.org/10.5194/aab-64-375-2021-supplement.

Supplement.
Author contributions. These studies were designed by SZ and QJ. SZ, XG, YJ and YS performed all the molecular experiments. PP, HX and YW assisted in explaining the results and preparing samples. SZ, QJ and YH performed the writing, review and editing. All the authors read and approved the final manuscript.
Competing interests. The authors declare that they have no conflict of interest.
Disclaimer. Publisher's note: Copernicus Publications remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.