Statistical modeling for growth data in linear mixed models – Implications derived from an example of a population comparison of Golden Hamsters

Using statistical modeling to determine the expectation and covariance structure employed during analysis is a common feature of analytical research. This paper describes the necessary methodology and illustrates those techniques that are of special importance in practical modeling and evaluation scenarios (likelihood ratio test, analytical criteria, residual analysis). Our approach is demonstrated on a population comparison, based on several measurement dates, of a wild population and a laboratory population of Golden Hamsters. The selected example is particularly suitable because, aside from the actual growth function of interest, additional fixed factors (e.g. effect of different mating periods, litter size) and random factors (e.g. maternal environment, repeated performances per animal) must be considered. The modeling substantially improves the analytical criteria. The recommended evaluation model leads to a very close match between the variance and covariance functions of the observed ordinary least squares residuals and those derived from the estimated covariance structure.


Introduction
Analyzing growth curves is a very common task in biological research. In general, this analysis comprises the acquisition of data for the same object (plant, animal) over a certain period. This leads to repeated observations per object and thus results in correspondingly complex covariance structures between these observations (LITTELL et al. 1998, MIELENZ & SCHUELER 2007b). A separate analysis per measurement date circumvents this dependency structure, but from a scientific perspective it must be considered suboptimal, since it does not allow any statements on the characteristics of the growth curves.
Often, however, the matter of interest is not only the growth curve itself but also the comparison of curves for two or more analysis classes, such as populations. In this case, the respective differences between the growth curves have to be estimated, and statements on the significance of these differences have to be made.
This briefly outlined task leads us to a linear model with fixed effects, such as the populations to be analyzed, and random effects, such as the selected animals with their repeated measurements as representatives of the population to be examined. The resulting linear mixed model is not given a priori. Rather, from a statistical point of view, the challenge lies in determining an expectation and covariance structure that corresponds to the data. Developing the expectation structure is required to obtain unbiased estimates of the desired treatment effects and their differences. Often, additional fixed noise factors may be significant and must not be neglected. An appropriate covariance structure is required in order to guarantee unbiased results for the statistical tests to be performed.
Thus, a variety of analyses initially demand statistical modeling to determine the expectation and covariance structure. This paper aims to describe the individual steps involved in the statistical modeling of growth data and a subsequent population comparison. In order to accomplish this, we make use of body mass data on both a wild population and a laboratory population of Golden Hamsters (Mesocricetus auratus). The data were gathered in intervals of 7 days between the hamsters' 28th and 70th day of life.
For more than 75 years, Golden Hamsters have been kept both in private homes and in research labs. In 1930, a sibling mating constituted the beginning of their history as pets cared for by humans (GATTERMANN 2000). In the study at hand, representatives from this population are compared with the offspring of wild hamsters from 1998 (GATTERMANN 2000). The population differences expected due to their different backgrounds have been demonstrated, for example, in their behavior, reproduction rates, body mass trends, organ mass, and blood characteristics (FRITZSCHE et al. 2000, GATTERMANN 2000, GATTERMANN et al. 2002, FRITZSCHE et al. 2006, KRAUSE 2008). The decreased genetic variability of the lab population has also been documented (FRITZSCHE et al. 2000, NEUMANN et al. 2005, KRAUSE 2008). Research also exists on the description of growth curves for Golden Hamsters, but this research does not describe the statistical modeling with sufficient precision (GATTERMANN et al. 2002, KRAUSE 2008).
In this paper, the repeatedly measured body masses per animal are analyzed simultaneously. Moreover, a suitable model-fitting strategy is illustrated with the example of these Golden Hamster data.
We address the challenges described above with the help of the SAS software package, though other software tools such as R may also be used for the approach presented. However, we prefer SAS, since the Kenward-Roger method (KENWARD & ROGER 1997) available in SAS provides a very powerful approximation of the degrees of freedom for hypothesis testing in linear mixed models and can be used with a diverse range of experimental set-ups.

Approach for the model selection
As indicated in the introduction, model selection for linear mixed models comprises the selection of both the expectation structure for the fixed effects and the covariance structure for the random effects. Depending on the problem to be tackled, the number of suitable variants may be rather large for both the expectation and the covariance structure; hence, the number of possible combinations can be large as well. For this reason, a multi-step approach (WOLFINGER 1993), together with the combined optimization of expectation and covariance structure (NGO & BRAND 1997), is suggested. We therefore employ the following steps in this paper:
1. Provisional selection of the expectation structure, assuming residual effects with diagonal structure and homogeneous variance, estimated with the ordinary least squares (OLS) method (SEARLE 1971).
2. Selection of the covariance structure while using the expectation structure resulting from the first step.
3. Final definition of the expectation structure via significance testing of noise factors while using the optimized covariance structure.
In step 1, our approach allows us to use the OLS residuals of every analyzed model to check for any bias, which is particularly convenient for repeated observations. Furthermore, the distribution of the residuals over time provides information on variance heterogeneity over time. Finally, we can use the variance function to compare the covariance structure observed for the residuals with the estimated structure derived at the end of step 2. The estimated covariance structure should essentially reproduce the distribution of the residuals.
In order to implement this approach, we have chosen an inverted sequence of the steps proposed by WOLFINGER (1993), who initially performs an optimization step for the covariance structure. When determining the provisional expectation structure, it is even recommended to employ overparameterized models where possible. That is to say, aside from all treatment factors, all potential noise factors should be included in the modeling. This ensures that the OLS method delivers consistent estimators for the fixed effects. Once the optimized covariance structure has been obtained, the third step may include testing the significance of the levels of fixed noise factors or of regression coefficients belonging to the independent variables (derived from the continuous noise factors); this can be done, for example, with the help of the approximate t- or F-test, respectively (GIESBRECHT & BURNS 1985, FAI & CORNELIUS 1996, KENWARD & ROGER 1997, SPILKE et al. 2005). It is quite possible that fixed noise factors identified as significant during the first step do not show any significance when used with the optimized covariance structure. In this case, the third step requires yet another analysis of the residuals. This is to avoid the accidental exclusion of significant noise factors in the sense of statistical type-II errors, which is necessary because the expectation and covariance structures are not independent of one another. Additionally, the normal distribution of the residuals and the random model effects may be tested at the end, for example, with the help of quantile-quantile plots.

Model selection methods
In the subsequent sections, we describe the use of the likelihood-ratio test, analytical criteria and residual analysis for the model selection.

Likelihood-ratio test (LRT)
The likelihood-ratio test allows the comparison of the fit of two models, provided that one of the models is hierarchically subordinated to the other. This is the case if one model can be seen as a specialization of a more general model due to certain model effects having been fixed. With $L_g$ and $L_s$ denoting the maximized likelihoods of the general model g and the special model s, the LRT then results from: $\mathrm{LRT} = -2 \log L_s - (-2 \log L_g)$. Given certain regularity conditions, the LRT test statistic asymptotically follows a $\chi^2$ distribution, with the degrees of freedom (df) resulting from the number of restrictions that are necessary to transform the general model g into the special model s (FAHRMEIER & HAMERLE 1984, GREENE 2003). The fit of the general model is considered better than that of the special model if $\mathrm{LRT} > \chi^2(1-\alpha, \mathrm{df})$, with α ≤ 0.05 in most cases. If two models are compared regarding the expectation structure, the likelihood functions are determined and maximized with the help of the classic Maximum Likelihood (ML) method. If the model comparison focuses on the covariance structure for a constant expectation structure, the likelihoods are obtained via the Restricted Maximum Likelihood (REML) method (WOLFINGER 1993, NGO & BRAND 1997, MIELENZ et al. 2007a).
The LRT based on the quotient of restricted likelihood functions is often referred to as RLRT in related works. Two of the classic requirements for the LRT are that the test has to be performed based on independent random variables $y_1$ to $y_n$, and that the parameters under the null hypothesis do not lie on the boundary of the parameter space. In the linear mixed model, for example, these requirements are not met when testing hypotheses of the form $H_0: \sigma_u^2 = 0$ against $H_1: \sigma_u^2 > 0$. Therefore, there are various research efforts that derive the exact asymptotic distribution of the LRT or RLRT, respectively, for special cases of linear mixed models and that analyze the derived results with the help of simulation studies (SELF & LIANG 1987, CRAINICEANU & RUPPERT 2004a, CRAINICEANU & RUPPERT 2004b). The authors of this paper are not aware of any research study that derives the asymptotic distributions of the RLRT without any restrictions regarding the complexity of the covariance structure in the linear mixed model. Thus, this paper assumes the classic asymptotic $\chi^2$ distribution when testing hypotheses on the covariance structure with the help of the RLRT.
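The test decision described in this section can be sketched numerically. The following minimal Python sketch takes the (−2 log L) values of a special and a general model, forms their difference, and compares it with the classic asymptotic chi-square distribution. The function names and the example values are illustrative assumptions and are not taken from the paper; the chi-square tail probability uses the closed-form series for integer degrees of freedom.

```python
import math


def chi2_sf(x: float, df: int) -> float:
    """Upper-tail probability P(X > x) of a chi-square variable, integer df >= 1."""
    if df < 1:
        raise ValueError("df must be a positive integer")
    if x <= 0:
        return 1.0
    if df % 2 == 0:
        # Even df: exp(-x/2) * sum_{k=0}^{df/2-1} (x/2)^k / k!
        term, total = 1.0, 1.0
        for k in range(1, df // 2):
            term *= (x / 2) / k
            total += term
        return math.exp(-x / 2) * total
    # Odd df: erfc(sqrt(x/2)) plus a finite series
    total = math.erfc(math.sqrt(x / 2))
    term = math.sqrt(2 * x / math.pi) * math.exp(-x / 2)
    for r in range(1, (df - 1) // 2 + 1):
        total += term
        term *= x / (2 * r + 1)
    return total


def likelihood_ratio_test(neg2logl_special: float, neg2logl_general: float, df: int):
    """Return (LRT statistic, p-value) under the classic chi-square approximation."""
    stat = neg2logl_special - neg2logl_general
    return stat, chi2_sf(stat, df)


# Hypothetical (-2 log L) values, for illustration only
stat, p = likelihood_ratio_test(110.0, 100.0, df=2)
print(f"LRT = {stat:.2f}, p = {p:.4f}")
```

When a single variance component is tested on the boundary of the parameter space, SELF & LIANG (1987) obtain, under their conditions, a 50:50 mixture of chi-square distributions with 0 and 1 degrees of freedom, which halves the p-value returned by `chi2_sf(stat, 1)`. The sketch above follows the classic approximation assumed in this paper.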

Information Criteria
For model comparisons that do not require hierarchical models, there are a number of analytical criteria. Often, Akaike's information criteria (AKAIKE 1969, 1973, 1974) and their modifications by HURVICH and TSAI (1989), as well as the criterion of SCHWARZ (1978), are used. In the formulas for calculating these criteria when comparing expectation structures with the ML method, $p_X$ denotes the rank of the design matrix X of the fixed effects, q is the number of variance components to be estimated, and n stands either for the number of observations (with only one observation per object) or for the number of objects (with multiple observations per object). The comparison of covariance structures for identical expectation structures is based on the REML method. In both cases, the formulas of the information criteria are given in such a way that the model with the lower value of the information criterion is preferred.
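The following Python sketch illustrates how such criteria are typically computed. It assumes one common formulation, AIC = −2 log L + 2d, AICC = −2 log L + 2dn/(n − d − 1) and BIC = −2 log L + d log n, where d is the number of parameters counted (for example $p_X + q$ under ML and q under REML). This formulation, the function name and the example values are assumptions; the exact formulas used in the paper may differ in detail, for instance in how n is adjusted under REML.

```python
import math


def information_criteria(neg2logl: float, d: int, n: int) -> dict:
    """Compute AIC, AICC and BIC from a (-2 log L) value.

    neg2logl: -2 times the maximized (restricted) log-likelihood
    d:        number of parameters counted by the criterion (assumption, see text)
    n:        number of observations or objects used by the criterion
    """
    if n - d - 1 <= 0:
        raise ValueError("AICC requires n > d + 1")
    return {
        "AIC": neg2logl + 2 * d,
        "AICC": neg2logl + 2 * d * n / (n - d - 1),
        "BIC": neg2logl + d * math.log(n),
    }


# Hypothetical values for two competing models; smaller is better
model_a = information_criteria(neg2logl=100.0, d=3, n=50)
model_b = information_criteria(neg2logl=98.5, d=6, n=50)
for name in ("AIC", "AICC", "BIC"):
    better = "A" if model_a[name] < model_b[name] else "B"
    print(f"{name}: A={model_a[name]:.2f}, B={model_b[name]:.2f}, preferred: {better}")
```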

Residual analysis for testing modeling results
An important tool for testing the modeling results is residual analysis (BELSLEY et al. 1980, COOK et al. 1982, NOBRE & SINGER 2007). With this tool, the residuals derived from an ordinary least squares analysis may be used for testing both the expectation and the covariance structure. For observations made over a certain time period, residual analysis may be employed particularly efficiently, since in this case each point in time has its own estimated values for expectation and variance, as well as residuals for verifying these values.
The purpose of verifying the expectation structure is to detect any systematic bias of the residuals caused by the omission of fixed effects. If all significant fixed effects have been considered, the mean of the residuals per measurement date should be approximately zero without any changes over time. This may be checked by using a locally adjusted regression (CLEVELAND et al. 1988, CLEVELAND et al. 1991).
Differences in the variability of the residuals over time may indicate whether or not a heterogeneous variance over time needs to be considered. Furthermore, the OLS residuals serve to check the covariance structure for repeated observations per object.
Thus, the covariance matrix estimated from the observed residuals may be compared with the estimated model covariance matrix. This can be illustrated by representing the variance function, the covariance function, or the correlations (see »Statistical model development«) between the residuals within one object over time.
If there are few observations per day of life, however, the squared OLS residuals are also plotted against time. Then, a smoothed mean trend of the squared residuals may serve to check the variance function over the measurement period.
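The basic checks described in this section can be sketched as follows: for each measurement date, the mean of the residuals should be close to zero, and the standard deviation shows whether the variance changes over time. The data layout (a dictionary mapping day of life to a list of residuals) and the values are hypothetical and only serve as an illustration.

```python
from statistics import mean, stdev


def residual_summary(residuals_by_day: dict) -> dict:
    """Return {day: (mean, standard deviation, n)} for residuals grouped by day of life."""
    summary = {}
    for day in sorted(residuals_by_day):
        values = residuals_by_day[day]
        summary[day] = (mean(values), stdev(values), len(values))
    return summary


# Hypothetical OLS residuals for three measurement dates
example = {
    28: [-1.0, 0.0, 1.0],
    49: [-2.0, 0.0, 2.0],
    70: [-3.0, 0.0, 3.0],
}
summary = residual_summary(example)
for day, (m, s, n) in summary.items():
    print(f"day {day}: mean = {m:.2f}, sd = {s:.2f}, n = {n}")
```

In this made-up example the means are zero for all dates (no indication of bias), while the standard deviation grows with age, which would point to a heterogeneous variance over time.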

Material
The data available for our investigation refer to a laboratory population and a wild population of Golden Hamsters (KRAUSE 2008). In total, data from 626 animals from 140 litters (lab population: 57, wild population: 83), recorded at 7 measurement dates, are available for analysis. Both the generation of the experimental animals and the testing took place in 7 successive periods (mating periods over 2 years, in short: periods). It can be expected that these periods have an effect on the body mass. As expected, the statistical measures shown in Table 1 reveal significant differences between both sexes and populations. Furthermore, there is a considerable increase in variability over time. These results have to be considered when modeling the expectation structure.

Modeling the expectation structure (step 1)
The foundation for modeling growth curves was provided by an estimation of the curves with the help of a locally adjusted regression (CLEVELAND et al. 1988, CLEVELAND et al. 1991) using the SAS procedure LOESS. The results of such a non-parametric regression allow a well-founded derivation of a functional approach for the design of the expectation structure. The results in Figure 1 are shown for the male animals of the lab population. They show a non-linear curve, and for the given observation period, a quadratic functional approach is most likely sufficient and may yield a good fit to the observations. Furthermore, Figure 1 illustrates an increase in variability for older animals, as the results in Table 1 already suggested. The results for the female lab animals and those for both sexes of the wild population are similar and thus omitted here. However, the comparison of the locally adjusted regressions also shows the necessity of modeling population- and sex-specific functions, which was already indicated by the results in Table 1.
Figure 1 Locally adjusted regression (smoothing parameter = 0.5) and observations of the body mass for male lab animals
The results of modeling the expectation structure, derived particularly from Figure 1, are presented in Table 2. It must be stressed that the initial assumptions made here only focus on the expectation structure, not on any interval and significance issues; examining these will only be possible after optimizing the covariance structure. Thus, the models shown in Table 2 correspond to an OLS analysis.
According to the data structure and the abovementioned statements, the expectation value for the body mass of an animal with sex j from population i at time t is to be modeled for the whole observation period. In doing so, the impact of period k, as well as the litter size of the litter of origin of animal l, has to be considered as well. For this purpose, the regression of body mass on litter size is modeled depending on the time of measurement. This takes into account that the impact of the size of the litter of origin decreases in importance as the animal's age increases. That is to say, with 7 measurement dates, 7 regression coefficients are estimated.
When considering all influential factors, the expectation value of $y_{ijkl}(t)$ is described by the model with exclusively fixed regression coefficients (fixed regression model, FRM).
The LRT for the comparison of the working models with the most complex model is significant in every case. That is to say, an increased model complexity leads to a significant reduction of the (−2logL) values. The analytical criteria AICC and BIC show the same pattern. Accordingly, one would prefer the more complex model M4 with 26 fixed effects when using these criteria as well. Aside from the period effect (7 effects), this model contains the sex- and population-specific fit of the growth curve depending on time by a 2nd-degree polynomial (4·3 coefficients), as well as the regression of body mass on litter size for every measurement date (7 coefficients).
In the SAS notation of this model, the option NOINT suppresses the estimation of a general mean independent of factor levels, and the term pop*sex invokes the estimation of a mean specific to each population*sex combination. The use of model M4 leads to the residuals shown in Figure 2 and to their smoothed curve, which was fitted by non-parametric regression with smoothing parameter 0.5 (SAS procedure LOESS). The residual analysis is based on the standardized OLS residuals; that is, the observed residuals are divided by their standard errors. Further details on the calculation can be found in GREGOIRE et al. (1995). The advantage lies in the improved comparability and interpretability, since under the assumption of a normal distribution nearly all standardized residuals are expected to lie within the range of ±3. Figure 2 illustrates that there is no systematic trend in the mean of the residuals depending on the animals' age; the locally adjusted regression line does not show any slope. Few standardized residuals go beyond the range of ±3. However, the figure does show a systematic increase in the spread of the residuals with the animals' age, as an expression of the expected age-dependent heterogeneity of the residual variance. This aspect has to be considered during the subsequent modeling of the covariance structure.
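The components of model M4 listed above (period effect, population- and sex-specific 2nd-degree polynomial, date-specific regression on litter size) suggest an expectation structure of the following form. This is a sketch reconstructed from the verbal description; the symbols $P_k$, $\beta_{ijr}$, $b_t$ and $w_l$ are assumptions and need not match the original notation.

```latex
\mathrm{E}\left[ y_{ijkl}(t) \right]
  = P_k + \sum_{r=0}^{2} \beta_{ijr}\, t^{r} + b_t\, w_l
% P_k        : effect of period k (k = 1, ..., 7)
% \beta_{ijr}: polynomial coefficient of degree r for population i and sex j
%              (4 combinations x 3 coefficients)
% b_t        : regression coefficient of body mass on litter size at
%              measurement date t (7 coefficients)
% w_l        : size of the litter of origin of animal l
```

Counting the parameters gives 7 + 4·3 + 7 = 26, which agrees with the 26 fixed effects stated for model M4.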
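The statement that only few standardized residuals should exceed ±3 can be made concrete with a short calculation. The sketch below computes the probability that a standard normal variable falls outside ±3 and the resulting expected number of such residuals. The number of observations is an assumption (626 animals times 7 measurement dates, i.e. complete records for every animal), not a figure reported in the paper.

```python
from statistics import NormalDist


def prob_beyond(limit: float = 3.0) -> float:
    """Two-sided probability that a standard normal variable exceeds +/- limit."""
    return 2.0 * (1.0 - NormalDist().cdf(limit))


n_obs = 626 * 7  # assumption: every animal measured on all 7 dates
p_out = prob_beyond(3.0)
expected_out = n_obs * p_out
print(f"P(|z| > 3) = {p_out:.5f}, expected count among {n_obs} residuals: {expected_out:.1f}")
```

Under these assumptions, roughly a dozen standardized residuals outside ±3 would be compatible with a normal distribution; a clearly larger number would point to outliers or heavier tails.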

Modeling of the covariance structure (step 2)
Thus far, the model design process has assumed independent residual effects. This is not realistic. For the data structure at hand, two major reasons contradict this assumption:
- The same animals were examined at various intervals. Accordingly, there are repeated observations per animal, and dependencies between them can be expected. Additionally, a time dependence is to be expected for this relation, since successive observations may be more similar than observations for the same animal at two measurement points that lie further apart.
- As Golden Hamsters are multiparous mammals, maternal environmental conditions are very important within our analysis. To a certain extent, the examined animals are from the same litter. Accordingly, the shared litter environment leads to an expected higher similarity between those animals when compared to individuals from different litters.
Consequently, the modeling of the covariance structure refers to these two complexes of causes while also considering the time-dependent heterogeneity of the residual effects, as the results presented in Figure 2 imply. Accordingly, a so-called random regression model (RRM) (SCHAEFFER & DEKKERS 1994, JAMROZIK & SCHAEFFER 1997) is assumed for the dam and animal effects. That is to say, one individual regression function is estimated for each dam and animal.
In the subsequent sections, we will first describe the appropriate models for the dam and animal effects. Let $d_m = (d_{m0}, \ldots, d_{mn})'$ be the vector of the random regression coefficients of dam m and let $a_{il} = (a_{il0}, \ldots, a_{iln})'$ be the vector of the random regression coefficients of animal l (with dam m) from population i. Then let $x = (x_0, \ldots, x_n)'$ denote the vector of the time-dependent covariates with $x_r = t^r$ for r = 0, 1, …, n. When considering random dam and animal effects, the random regression model (RRM) associated with (FRM) then takes on the following form:
$y_{ijkl}(t) = \mathrm{E}[y_{ijkl}(t)] + x' d_m + x' a_{il} + e_{ijkl}(t)$,
where $\mathrm{E}[y_{ijkl}(t)]$ is the expectation according to (FRM) and $e_{ijkl}(t)$ is the residual effect. Let $d_m$ and $a_{il}$ be multi-dimensionally normally distributed vectors that are independent of each other, with $d_m \sim N(0, K_d)$ and $a_{il} \sim N(0, K_{a(i)})$. Then, the variance function for a given record of animal l with dam m from population i at time t has the following form:
$\mathrm{Var}(y_{ijkl}(t)) = x' K_d x + x' K_{a(i)} x + \sigma_e^2$,
with $\sigma_e^2$ denoting the residual variance. Assuming independent residual effects, we arrive at the following representation of the covariance function for the days of life $t_1 \neq t_2$, with $x_1$ and $x_2$ denoting the covariate vectors at $t_1$ and $t_2$:
$\mathrm{Cov}(y_{ijkl}(t_1), y_{ijkl}(t_2)) = x_1' K_d x_2 + x_1' K_{a(i)} x_2$.
If we restrict the random regression model to the special case of $n_D = n_A = 1$ for the polynomial degrees of the dam and animal effects (i.e., only intercept and slope), the implementation in the SAS procedure MIXED requires the following additional statements:
RANDOM intercept t / SUBJECT=dam TYPE=UN;
RANDOM intercept t / SUBJECT=animal TYPE=UN GROUP=POP;
Here, METHOD=ML has to be replaced by METHOD=REML, and the CLASS statement has to be extended with the terms »dam« and »animal«. When using TYPE=UN, matrix $K_d$ contains three variance components to be estimated. The option GROUP=POP invokes the population-specific estimation of $K_{a(i)}$. Thus, a total of 6 variance components have to be estimated for the animal effects. The specific regression coefficients for the factors »dam« and »animal« may be considered as random deviations from the fixed regression coefficients in the quadratic approach for the expectation value according to model (FRM). If higher-degree polynomials are used, the simple transformation $t^* = t / t_{max}$ (with $t_{max}$ being the maximum age during the observation period) often leads to better convergence when optimizing the likelihood function.
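For the linear special case (intercept and slope), the variance and covariance functions implied by a random regression model can be evaluated directly: each random effect with coefficient covariance matrix K contributes x₁'K x₂ with x = (1, t)', and the residual variance is added only on the diagonal when independent residual effects are assumed. The following Python sketch illustrates this; all numerical values are hypothetical and are not estimates from the paper.

```python
def bilinear(x1, K, x2):
    """Compute x1' K x2 for small matrices given as nested lists."""
    return sum(x1[i] * K[i][j] * x2[j] for i in range(len(x1)) for j in range(len(x2)))


def rrm_covariance(t1, t2, K_dam, K_animal, resid_var):
    """Covariance of two records of one animal at times t1 and t2 (linear RRM)."""
    x1, x2 = (1.0, t1), (1.0, t2)
    cov = bilinear(x1, K_dam, x2) + bilinear(x1, K_animal, x2)
    if t1 == t2:
        cov += resid_var  # independent residual effects contribute to the variance only
    return cov


# Hypothetical 2x2 covariance matrices of (intercept, slope) and residual variance
K_dam = [[4.0, 0.1], [0.1, 0.01]]
K_animal = [[9.0, 0.2], [0.2, 0.02]]
resid_var = 2.0

var_0 = rrm_covariance(0.0, 0.0, K_dam, K_animal, resid_var)
var_10 = rrm_covariance(10.0, 10.0, K_dam, K_animal, resid_var)
cov_0_10 = rrm_covariance(0.0, 10.0, K_dam, K_animal, resid_var)
print(var_0, var_10, cov_0_10)
```

With positive slope variances, the variance function increases with t, which is how a random regression model can reproduce an age-dependent increase in variability.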
Up to this point, the covariance structure has been modeled using the subject-specific components »dam« and »animal«. Further possible relations between the dams are neglected, since the available data set most likely does not allow a realistic estimation of the genetic covariances. For the residual effects, this means that there is no deviation from the diagonal structure; all non-diagonal elements are zero. Subsequently, this covariance structure is to be specified in more detail, since covariances between the residual effects that depend on the age differences between measurement points may be possible (VERBEKE et al. 1998, LESAFFRE et al. 2000). For this purpose, the residual effect of an animal at age t is split as follows: $e(t) = e_1(t) + e_2(t)$. Here, $e_1(t)$ denotes the share of serial correlation between repeated performances of an animal; component $e_2(t)$ describes the portion that is independent and has the same variance for all residual effects. The serial correlation model has to be complemented with a distance function g(·). This function is chosen in such a way that all residual effects $e_1(t)$ of an animal have the same variance. Additionally, the correlation between them is always positive and decreases monotonically with increasing distance between two measurement points. Let $\sigma_e^2$ be the variance of e(t). The variance and covariance functions of the residual effects of an animal then depend on this variance and on g(d), where $d = |t_1 - t_2|$ is the distance between two measurement points. There are a number of possible correlation functions, such as the spherical and the exponential function (SCHABENBERGER & PIERCE 2002). In the simplest case, a linear function may be chosen for g(·). The larger the parameter ρ, the stronger the decrease of function g with increasing values of d. The formulation of a linear correlation function in SAS can be achieved by using the REPEATED statement:
REPEATED / SUBJECT=animal TYPE=SP(LIN) (day_of_life);
The results of the model development for optimizing the covariance structure are summarized in Table 3. There, an expectation structure according to model M4 from Table 2 has been assumed for all cases. Model M5, as the starting point for the model design, corresponds to model M4 with independent residual effects (OLS model); however, the REML method is used for estimating the random model parameters. This method has been used for models M5-M12.
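A linear correlation function can be illustrated with a few lines of Python. The sketch assumes the form g(d) = 1 − ρd for ρd ≤ 1 and g(d) = 0 otherwise, which is how the SAS documentation defines the SP(LIN) structure; the value of ρ used here is hypothetical and is not the estimate from the paper.

```python
def linear_correlation(d: float, rho: float) -> float:
    """Linear correlation function: 1 - rho*d, truncated at zero."""
    return max(0.0, 1.0 - rho * d)


def correlation_matrix(days, rho):
    """Serial correlation matrix of the residual effects of one animal."""
    return [[linear_correlation(abs(a - b), rho) for b in days] for a in days]


days = [28, 35, 42, 49, 56, 63, 70]  # days of life with measurements
rho = 0.02                           # hypothetical correlation parameter
R = correlation_matrix(days, rho)
for row in R:
    print(" ".join(f"{value:.2f}" for value in row))
```

The matrix has ones on the diagonal, and the correlation decreases by ρ·7 for each additional 7-day step between two measurement dates.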
Models M5-M11 can be considered special cases of model M12 that result from setting one or more random effects to zero. Accordingly, the LR test involves hypotheses on the boundary of the parameter space, which means that statements on significance can only be made with reservations. A test according to Section »Likelihood-ratio test (LRT)« can nevertheless be applied. The performed RLRT shows a continuous significant improvement of the restricted-likelihood values, and thus the model fit improves despite an increase in the number of model parameters to be estimated, from 1 for M5 to 11 for M12. The results for both AICC and BIC lead to the same conclusion, which strengthens reliability during model selection.
q number of variance components, df degrees of freedom for the restricted likelihood-ratio test (RLRT), differences of AICC or BIC, respectively, compared to model M12 (ΔAICC, ΔBIC); in M5 to M12 the following always applies
At the end of the model selection process, this results in a linear random-regression approach for modeling the dam and animal effects with a 2×2 covariance matrix each, with the animal effects modeled in a population-specific way. Since the animal-specific regression coefficients additionally depend on the population, it is necessary to estimate a total of three two-dimensional covariance matrices. Furthermore, the consideration of a serial correlation of repeated observations for one and the same animal is required. For this purpose, a linear serial correlation model is quite suitable.
Testing models with a quadratic component for the dam and animal effects did not lead to convergence, which is understandable given the rather low number of independent objects (dams and animals, respectively) used with these highly complex models. Also, the use of additional functional approaches, such as the exponential, the spherical, or the Gaussian approach for modeling the serial correlation, did not result in convergence when optimizing the restricted likelihood function. Thus, these models were discarded. In SAS notation, the selected model for the random effects can be summarized as follows:
RANDOM intercept t / SUBJECT=dam TYPE=UN;
RANDOM intercept t / SUBJECT=animal TYPE=UN GROUP=POP;
REPEATED / SUBJECT=animal TYPE=SP(LIN) (day_of_life);
The estimated values of the covariance components and of the correlation parameter resulting from the use of this model are summarized in Table 4.
The OLS residual analysis provides a fundamental and largely assumption-free option for confirming the covariance structure selected with the analytical criteria, for additionally testing the »correctness« of the proposed model, and for thereby increasing reliability during the model selection process. Per population and sex, at least 154 observations were available from a total of 7 measurement points (from the 28th to the 70th day of life in 7-day intervals). Therefore, it was possible to estimate, for all analyzed days of life, the standard deviations of the residuals and the correlations between the residuals on the various days of life. Figure 3 shows the estimated standard deviations of the residuals within one population and sex when using OLS. There, significant differences arise between the populations, but differences between the sexes remain small. Although the increase in variance with increasing age is mapped well, the variance function can only insufficiently model the curve of the residuals when the covariance structure is modeled independently of the population according to model M8. The match is much better when random population-specific regression coefficients are modeled; the figure thus shows graphically the model differences that were already apparent from the evaluation criteria in Table 3.
Another challenge in the context of residual analysis is testing the correlation structure of the residuals. For the example of the lab population, Table 5 shows the correlations resulting from the OLS residuals (model M4) and from the estimated model parameters of model M12; the correlations exhibit a very good match. The maximum deviation between observed and estimated correlation is 0.14 for the lab population and 0.10 for the wild population (which has been omitted here due to space restrictions). The mean of the deviations is 0.01 for the lab population and 0.03 for the wild population. If a mean correlation is formed by summarizing correlations with identical temporal distance (7, 14, …, 42), the result is a correlation function that depends on distance. If the correlations are ordered by distance, the curves shown in Figure 4 emerge.
Figure 4 shows a very good match between the correlations obtained from the OLS residuals (model M4) and those from model M12. The match for smaller distances is better than for larger temporal differences. For this result, however, it must be noted that fewer values are available for the mean calculation at larger distances, and hence the estimation is less precise. For comparison purposes, the results from model M10 are shown as well. The two models differ only in the linear component of the random-regression approach for the dam effects and the resulting possible consideration of temporal effects in model M12. As Figure 4 illustrates, this additional effect is necessary in order to arrive at a satisfactory mapping of the correlation structure of the residuals.
The graphical results in Figures 3 and 4 confirm the model selection made with the help of AICC/BIC and RLRT. The variance and correlation functions of model M12 adequately reflect the results of the OLS residual analysis.
Figure 4 Mean correlation between repeated observations per animal as a function of the temporal distance between the examined days of life, estimated with OLS residuals and models M10 and M12 from Table 3
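The averaging of correlations with identical temporal distance can be sketched as follows. The function takes the measurement days and a correlation matrix of the residuals and returns, for each distance, the mean correlation and the number of pairs it is based on. The correlation matrix used here is made up for illustration and does not reproduce Table 5.

```python
def mean_correlation_by_distance(days, corr):
    """Average the upper-triangle correlations that share the same temporal distance."""
    sums, counts = {}, {}
    for i in range(len(days)):
        for j in range(i + 1, len(days)):
            d = abs(days[j] - days[i])
            sums[d] = sums.get(d, 0.0) + corr[i][j]
            counts[d] = counts.get(d, 0) + 1
    return {d: (sums[d] / counts[d], counts[d]) for d in sorted(sums)}


days = [28, 35, 42, 49, 56, 63, 70]
# Hypothetical correlation matrix that decreases linearly with distance
corr = [[1.0 - 0.01 * abs(a - b) for b in days] for a in days]

result = mean_correlation_by_distance(days, corr)
for distance, (r, n_pairs) in result.items():
    print(f"distance {distance:2d}: mean correlation = {r:.2f} from {n_pairs} pair(s)")
```

With 7 measurement dates, distance 7 is based on 6 pairs, whereas distance 42 is based on a single pair, which illustrates why the mean correlations at larger distances are less precise.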

Testing the expectation structure (step 3)
The use of the expectation structure defined in Section 4, as well as the achievements from Section 5 on the covariance structure, lead to the following table with results of the significance check using the F-test while applying an approximation of the degrees of freedom from KENWARD and ROGER (1997).Thus, when considering the results from Table 6, we can also assume a significant influence from the noise factors »period« and »litter_size*day_of_life.«The model design of the fixed effects according to Section »Material« is confirmed.
In linear mixed models, marginal and conditional residuals can be distinguished (SCHABENBERGER 2004). A conditional residual is the difference between an observation and its predicted value, where the prediction includes the predicted random effects. The conditional residuals of model M12 are shown in Figures 5 and 6. They indicate a good agreement of the selected analysis model with the assumed normal distribution. Only the tails of the distribution show clearly visible deviations, and these concern a comparatively small number of observations (Figures 5 and 6). The same holds for the random effects for dams and animals, which are not shown here for reasons of space.
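The distinction between the two residual types can be written compactly. The notation below is generic mixed-model notation and an assumption here; it may differ from the symbols used elsewhere in this paper. Let y denote the observations, X and Z the design matrices of the fixed and random effects, \hat{\beta} the estimated fixed effects and \hat{u} the predicted random effects:

```latex
\begin{align*}
  r_m &= y - X\hat{\beta}
      && \text{(marginal residual)} \\
  r_c &= y - X\hat{\beta} - Z\hat{u} \;=\; r_m - Z\hat{u}
      && \text{(conditional residual)}
\end{align*}
```

The conditional residuals therefore measure what remains after both the fixed effects and the predicted dam and animal effects have been removed.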

Illustration of the growth curves and testing of differences
Based on the model developed in the previous sections, the growth curves, their differences and the associated confidence intervals can be estimated.
The resulting curves are shown separately for male and female animals in Figures 7 and 8.
The curves exhibit different growth dynamics between the populations and also between the sexes. The confidence intervals shown give an impression of the precision of the estimates. The different covariance estimates of the animal effects in the two populations are clearly reflected in the confidence intervals, which are much broader for the animals of the wild population. Furthermore, the differences between the growth curves and the associated confidence intervals can be determined for every day of life from the parameter estimates (Figures 9 and 10). These values can be used directly to test the growth curves for significant differences. Intervals that do not include zero indicate a significant difference at a type I error rate of 0.05. This test result corresponds to that of a t-test comparing the population means on a given day of life t. However, the curves describe the population difference in much more detail than multiple mean comparisons for different days of life t do. For example, the curves show that the population differences are not constant during the growth phase but follow distinct temporal dynamics. For the female animals, a significantly higher body mass in the lab population is found as early as the 31st day of life; for the male animals, this is not the case before the 34th day of life. The curves also show that the maximum difference is reached at different times in the female animals (58th day of life) and the male animals (64th day of life).
Differences in the speed of growth between the two hamster populations can be explained by their different breeding history. The lab population was founded in 1930 from only 2 animals (GATTERMANN 2000), which resulted in a highly restricted gene pool and thus a founder effect. Aside from this random factor, which might have led to a population of fast-growing animals, intentional breeding for fast growth cannot be ruled out completely. KRAUSE (2008), for example, showed that there were no differences in body mass between the lab and the wild Golden Hamsters on the first day of life. The fast increase in mass of the lab animals may have resulted from selection favoring high body mass after the end of the suckling period. Research reports by BACHMANOV et al. (2002), ADEOGUN and ADEOYE (2004), AKANNO and IBE (2005), GAYA et al. (2006) and KLINGT et al. (2006) show that the trait body mass is highly heritable and can thus be influenced rather efficiently by targeted selection.
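The pointwise comparison of the growth curves described in this section can be illustrated with a small sketch. Everything specific in it is an assumption made for illustration: the difference between the curves is taken to be a polynomial in the day of life, the coefficients and their covariance matrix are invented numbers, and a normal quantile replaces the t quantile with approximated degrees of freedom that a mixed-model analysis would use.

```python
from statistics import NormalDist


def difference_with_ci(t, delta, cov, level=0.95):
    """Estimate and confidence interval of a curve difference on day t.

    delta: estimated coefficient differences (lab minus wild) of a
           polynomial in t, lowest order first (hypothetical values).
    cov:   covariance matrix of `delta` as a nested list.
    """
    x = [t ** k for k in range(len(delta))]       # design vector (1, t, ...)
    estimate = sum(xk * dk for xk, dk in zip(x, delta))
    variance = sum(x[a] * cov[a][b] * x[b]        # x' V x
                   for a in range(len(x)) for b in range(len(x)))
    # Assumption: normal quantile instead of a t quantile.
    q = NormalDist().inv_cdf(0.5 + level / 2)
    half_width = q * variance ** 0.5
    return estimate, estimate - half_width, estimate + half_width


def is_significant(lower, upper):
    """A difference is significant if the interval does not include zero."""
    return lower > 0 or upper < 0


# Hypothetical linear difference between two growth curves (in g).
delta = [-6.0, 0.2]
cov = [[4.0, -0.1],
       [-0.1, 0.004]]

for day in (20, 50):
    estimate, lower, upper = difference_with_ci(day, delta, cov)
    print(day, round(estimate, 2), round(lower, 2), round(upper, 2),
          is_significant(lower, upper))

# First day in a chosen range on which the interval excludes zero.
first_day = next(t for t in range(20, 71)
                 if is_significant(*difference_with_ci(t, delta, cov)[1:]))
print(first_day)
```

Scanning the days in this way is how statements such as "significantly higher body mass from a given day of life onward" can be read off the estimated difference curve and its interval.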

Figure 3 Function of the standard deviation estimated with OLS residuals and with models M8 and M12 from Table 3
Figure 5 Histogram of the conditional residuals of model M12

Figure 7 Estimated growth curves and their confidence intervals for the lab and the wild population (P=0.95) for the male animals
Figure 9 Estimated differences of the growth curves and confidence interval of the differences between the lab and the wild population (P=0.95) for the male animals

Table 2 shows four working models of increasing complexity. Model M1 is included only to demonstrate the model development process. It does not allow any time-dependent estimation of model effects and therefore leads to a considerably worse model fit. According to the results in Table 2, model development brings a considerable reduction of −2logL (the log-likelihood multiplied by −2) and an improvement of the information criteria.
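How the information criteria are derived from −2logL can be sketched with the textbook definitions. This is an illustration under assumptions: the function name and the numbers are hypothetical and are not values from Table 2, and statistical packages differ in what they count as parameters and as sample size, particularly under restricted maximum likelihood.

```python
import math


def information_criteria(minus2_loglik, n_params, n_obs):
    """Textbook AIC, AICC and BIC from a -2 log-likelihood (smaller is better).

    n_params: number of parameters counted by the criterion.
    n_obs:    sample size used in the penalty terms.
    """
    aic = minus2_loglik + 2 * n_params
    aicc = minus2_loglik + 2 * n_params * n_obs / (n_obs - n_params - 1)
    bic = minus2_loglik + n_params * math.log(n_obs)
    return {"AIC": aic, "AICC": aicc, "BIC": bic}


# Hypothetical comparison of a simple and a richer covariance structure.
simple = information_criteria(5000.0, n_params=2, n_obs=1000)
richer = information_criteria(4900.0, n_params=5, n_obs=1000)

for name in ("AIC", "AICC", "BIC"):
    print(name, round(simple[name], 1), round(richer[name], 1))
```

In this toy comparison the richer model reduces −2logL by more than the additional penalty, so all three criteria favor it.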

Table 5
Product-moment correlations of the residuals within »animal« between the days of life (above the diagonal) and the correlation determined from the estimated model parameters (below the diagonal)