Longitudinal GWA Analyses Using Linear Mixed Effects Models: lme and lmer

Prepared by: Karolina Sikorska and Kelly Benke

This exercise provides R scripts to read in MaCH dosage and info files, convert a ‘wide’ dataset to ‘long’ format, and fit linear mixed effects models. We will use two functions from two packages. The first is the lme function in the nlme package. This function can be considered the ‘gold standard’: it offers many modeling options and reports p-values. The second is the lmer function in the lme4 package. This function runs much faster, but it does not automatically produce p-values and does not support many of the options that lme provides. Our goal is to familiarize you with linear mixed effects models through a fairly simple example, and to guide you toward scripting the many model fits that a GWA requires.
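
The wide-to-long conversion mentioned above can be sketched with base R's reshape(). The data frame and its column names (id, sex, y.1 to y.3) are hypothetical and only illustrate the idea; the course data set may be laid out differently.

```r
# Hypothetical wide data: one row per individual, one column per visit.
wide <- data.frame(
  id  = 1:3,
  sex = c(0, 1, 0),
  y.1 = c(21.0, 24.5, 19.8),
  y.2 = c(21.6, 25.1, 20.9),
  y.3 = c(22.3, 25.0, 21.7)
)

# Stack the three visit columns into one outcome column y,
# adding a time column that records which visit each row came from.
long <- reshape(
  wide,
  direction = "long",
  varying   = c("y.1", "y.2", "y.3"),
  v.names   = "y",
  timevar   = "time",
  times     = 1:3,
  idvar     = "id"
)

# Sort so that each individual's measurements are adjacent.
long <- long[order(long$id, long$time), ]
rownames(long) <- NULL
print(long)
```

In the long format each row is one measurement occasion, which is the layout that lme and lmer expect.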

1. How many individuals are present in this study?

2. How many measurements are taken in the study, and how many measurements per individual are there?

3. The scan function allows us to bring in the dose file efficiently, which is very useful for an entire chromosome. What columns are ignored from the MaCH dose file to create the object dose?
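
As a rough illustration of how scan can skip fields, the sketch below writes a tiny made-up file and reads it back. It assumes a layout of an ID field, a label field, and then one dosage per SNP; the file contents, the field layout, and the choice of which field to skip are assumptions for illustration, not the course script.

```r
# Write a small hypothetical dose file: ID, label, then one dosage per SNP.
dose_file <- tempfile(fileext = ".mldose")
writeLines(c(
  "1->1 ML_DOSE 0.012 1.998 1.003",
  "2->2 ML_DOSE 1.000 0.021 1.950",
  "3->3 ML_DOSE 1.874 1.002 0.004"
), dose_file)

n_snps <- 3

# In scan(), a NULL entry in `what` means "skip this field".
# Here the label field (second position) is skipped.
fields <- c(list(character(0), NULL), rep(list(numeric(0)), n_snps))
raw <- scan(dose_file, what = fields, quiet = TRUE)

# Drop the skipped (NULL) entries, then bind the dosage vectors into a matrix.
kept <- raw[!vapply(raw, is.null, logical(1))]
dose <- do.call(cbind, kept[-1])
rownames(dose) <- kept[[1]]
print(dose)
```

Reading fixed field types with scan avoids the type guessing that read.table performs, which is one reason it tends to be faster on wide files.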

4. How do the trajectories appear: are they linear, or do you detect curvature? What patterns do you observe by genotype?

5. Model 1 provides estimates for several parameters for a select SNP. Fill in the table below using the summary information from this model:

Table of Results for LME

Parameter                    Estimate    Pval (if relevant)
β0
β Time
β SNP
β SNP by Time
Random intercept variance
Random slope variance
Residual variance
ρ (ran int, ran slp)
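
The quantities in this table can be read off an lme fit. The sketch below uses simulated data, not the course data; the variable names (id, time, snp, y), the 0/1/2 genotype coding, and the random intercept and slope structure are assumptions chosen to match the parameters listed in the table.

```r
library(nlme)

# Simulate a hypothetical longitudinal data set in long format.
set.seed(1)
n_id  <- 200
times <- 0:3
snp <- rbinom(n_id, size = 2, prob = 0.3)  # genotype coded 0/1/2
b0  <- rnorm(n_id, sd = 1.0)               # random intercepts
b1  <- rnorm(n_id, sd = 0.3)               # random slopes
dat <- data.frame(
  id   = rep(seq_len(n_id), each = length(times)),
  time = rep(times, times = n_id)
)
dat$snp <- snp[dat$id]
dat$y <- 20 + 0.5 * dat$time + 0.4 * dat$snp + 0.1 * dat$snp * dat$time +
  b0[dat$id] + b1[dat$id] * dat$time + rnorm(nrow(dat), sd = 0.5)

# Fixed effects: SNP, time, and their interaction.
# Random effects: intercept and time slope for each individual.
fit_lme <- lme(y ~ snp * time, random = ~ time | id, data = dat)

# Fixed-effect estimates with standard errors, DF, t-values, and p-values.
tt <- summary(fit_lme)$tTable
print(tt)

# Random intercept and slope variances, their correlation, and residual variance.
print(VarCorr(fit_lme))
```

The rows of the fixed-effects table correspond to β0, β SNP, β Time, and β SNP by Time; VarCorr supplies the variance components and ρ.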

6. Does the lmer function produce results similar to the lme function for this example? Why?

Table of Results for LMER

Parameter                    Estimate    Pval (if relevant)
β0
β Time
β SNP
β SNP by Time
Random intercept variance
Random slope variance
Residual variance
ρ (ran int, ran slp)
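
A comparable lmer fit might look like the sketch below, again on simulated data with hypothetical variable names. Because lmer reports t-values without p-values, the last step uses a normal approximation to the t-statistic; that approximation is an assumption made here for illustration and is not necessarily what the course script does.

```r
library(lme4)

# Simulate a hypothetical longitudinal data set in long format.
set.seed(1)
n_id  <- 200
times <- 0:3
snp <- rbinom(n_id, size = 2, prob = 0.3)  # genotype coded 0/1/2
b0  <- rnorm(n_id, sd = 1.0)               # random intercepts
b1  <- rnorm(n_id, sd = 0.3)               # random slopes
dat <- data.frame(
  id   = rep(seq_len(n_id), each = length(times)),
  time = rep(times, times = n_id)
)
dat$snp <- snp[dat$id]
dat$y <- 20 + 0.5 * dat$time + 0.4 * dat$snp + 0.1 * dat$snp * dat$time +
  b0[dat$id] + b1[dat$id] * dat$time + rnorm(nrow(dat), sd = 0.5)

# Same model as in the lme sketch, written in lme4 formula syntax.
fit_lmer <- lmer(y ~ snp * time + (time | id), data = dat, REML = TRUE)

# Fixed effects: Estimate, Std. Error, t value (no p-value column).
coefs <- coef(summary(fit_lmer))
print(coefs)

# Variance components as a data frame (variances in vcov, SD/correlation in sdcor).
print(as.data.frame(VarCorr(fit_lmer)))

# Approximate two-sided p-values, treating each t value as standard normal.
p_norm <- 2 * pnorm(-abs(coefs[, "t value"]))
print(p_norm)
```

With many individuals the normal approximation is usually close to a t-based p-value, but it can be anti-conservative in small samples.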

7. How does the output of a linear regression model (lm) compare with the output of the lme and lmer models? Why?
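
One way to set the two kinds of output side by side is sketched below on simulated data with hypothetical variable names. The lm fit treats every measurement as an independent observation, whereas the lme fit allows each individual their own intercept and slope.

```r
library(nlme)

# Simulate a hypothetical longitudinal data set in long format.
set.seed(1)
n_id  <- 200
times <- 0:3
snp <- rbinom(n_id, size = 2, prob = 0.3)  # genotype coded 0/1/2
b0  <- rnorm(n_id, sd = 1.0)               # random intercepts
b1  <- rnorm(n_id, sd = 0.3)               # random slopes
dat <- data.frame(
  id   = rep(seq_len(n_id), each = length(times)),
  time = rep(times, times = n_id)
)
dat$snp <- snp[dat$id]
dat$y <- 20 + 0.5 * dat$time + 0.4 * dat$snp + 0.1 * dat$snp * dat$time +
  b0[dat$id] + b1[dat$id] * dat$time + rnorm(nrow(dat), sd = 0.5)

# Ordinary linear regression: ignores which rows belong to the same person.
fit_lm <- lm(y ~ snp * time, data = dat)

# Mixed model: random intercept and slope per individual.
fit_lme <- lme(y ~ snp * time, random = ~ time | id, data = dat)

# Collect estimates and standard errors from both fits in one table.
cmp <- data.frame(
  est_lm  = coef(fit_lm),
  se_lm   = coef(summary(fit_lm))[, "Std. Error"],
  est_lme = fixef(fit_lme),
  se_lme  = summary(fit_lme)$tTable[, "Std.Error"]
)
print(cmp)
```

Comparing the standard error columns row by row is a useful starting point for the "Why?" part of the question.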

8. Please run the analyses with the lme, lmer, and lm functions and save the output files. How many SNPs are there in these files?

9. We would like to see a plot of these results, and will use the online program LocusZoom. Open a web browser to the following URL:

https://statgen.sph.umich.edu/locuszoom/genform.php?type=yourdata

a. Select the output file for either the lme or the lmer results

b. For P-value column name, type either the SNP main effect: P_SNP, or the SNP by Time effect: P_int.

c. For Marker column name, type: SNP

d. Select white space for the delimiter

e. Type in the SNP reference name: rs9939609

f. Allow other values to default

g. Click the ‘Plot your Data’ tab (upper left) and wait for the .pdf file to be made, then save it. Does this look like a real signal? What is already known about the SNPs in this region?
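
The upload settings in step 9 expect a whitespace-delimited file with the columns SNP, P_SNP, and P_int. A minimal sketch of producing such a file by looping lme over several SNPs follows. The dosage matrix, the SNP names (rs_hyp1 and so on), the phenotype data, and the output file name are all made up for illustration; the course script may organize this differently.

```r
library(nlme)

# Simulate hypothetical genotypes (individuals x SNPs) and long-format phenotypes.
set.seed(2)
n_id   <- 100
times  <- 0:3
n_snps <- 5
dose <- matrix(
  rbinom(n_id * n_snps, size = 2, prob = 0.3),
  nrow = n_id,
  dimnames = list(NULL, paste0("rs_hyp", seq_len(n_snps)))
)
pheno <- data.frame(
  id   = rep(seq_len(n_id), each = length(times)),
  time = rep(times, times = n_id)
)
b0 <- rnorm(n_id, sd = 1.0)
b1 <- rnorm(n_id, sd = 0.3)
pheno$y <- 20 + 0.5 * pheno$time + b0[pheno$id] + b1[pheno$id] * pheno$time +
  rnorm(nrow(pheno), sd = 0.5)

# Fit one SNP and return its main-effect and SNP-by-time p-values.
# A model that fails to converge yields NA rather than stopping the loop.
fit_one <- function(snp_name) {
  d <- pheno
  d$snp <- dose[d$id, snp_name]
  fit <- tryCatch(
    lme(y ~ snp * time, random = ~ time | id, data = d),
    error = function(e) NULL
  )
  if (is.null(fit)) {
    return(data.frame(SNP = snp_name, P_SNP = NA_real_, P_int = NA_real_))
  }
  tt <- summary(fit)$tTable
  data.frame(
    SNP   = snp_name,
    P_SNP = tt["snp", "p-value"],
    P_int = tt["snp:time", "p-value"]
  )
}

results <- do.call(rbind, lapply(colnames(dose), fit_one))

# write.table separates fields with a single space by default.
out_file <- file.path(tempdir(), "lme_results.txt")
write.table(results, out_file, quote = FALSE, row.names = FALSE)
print(results)
```

Wrapping each fit in tryCatch keeps one non-converging SNP from halting a scan over thousands of markers.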