Estimating HLM Models Using R: Part 1

Posted on
ANOVA HLM R

Estimating HLM Models Using R: Part 1

There are a number of different R packages that now exist for fitting mixed models, including hierarchical linear models. For cross-sectional applications, perhaps the most frequently used package is lme4 (Bates et al., 2015). However, due to ambiguity in how to appropriately determine the degrees of freedom for \(t\)-tests, lme4 does not provide \(p\)-values for the fixed effects. The lmerTest package (Kuznetsova et al. 2017) provides a wrapper to lme4 that will provide p-values using the same method as SAS’s popular mixed models procedures. Loading lmerTest automatically loads lme4.

The data used in this tutorial can be loaded from the merTools (Knowles and Frederick, 2018). What follows replicates the results from Raudenbush and Bryk’s (2002, herafter R&B) cannonical text on hierarchical linear models (see especially chapter 4).

library(lmerTest)
library(merTools)

The Empty Model

As a first step, R&B begin with an empty model containing no covariates.

\[ Y_{ij} = \beta_{0j} + e_{ij} \]

Each school’s intercept, \(\beta_{0j}\), is then set equal to a grand mean, \(\gamma_{00}\), and a random error \(u_{0j}\).

\[ \beta_{0j} = \gamma_{00} + u_{0j} \]

Substituting (2) into (1) produces:

\[ Y_{ij} = \gamma_{00} + u_{0j} + e_{ij} \]

We can fit this “empty” model as follows:

mod_1 <- lmer(mathach ~ (1 | schid), data = hsb)
summary(mod_1)
## Linear mixed model fit by REML. t-tests use Satterthwaite's method [
## lmerModLmerTest]
## Formula: mathach ~ (1 | schid)
##    Data: hsb
## 
## REML criterion at convergence: 47116.8
## 
## Scaled residuals: 
##     Min      1Q  Median      3Q     Max 
## -3.0631 -0.7539  0.0267  0.7606  2.7426 
## 
## Random effects:
##  Groups   Name        Variance Std.Dev.
##  schid    (Intercept)  8.614   2.935   
##  Residual             39.148   6.257   
## Number of obs: 7185, groups:  schid, 160
## 
## Fixed effects:
##             Estimate Std. Error       df t value Pr(>|t|)    
## (Intercept)  12.6370     0.2444 156.6473   51.71   <2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

The lmer function (for linear mixed effects regression) takes a formula as its first agrument. To the left of ~ is the dependent variable, to the right are fixed effects and random effects. The empty model does not contain any fixed effects beyond the intercept, which is included by default. The random effects are specified inside parentheses, with the pipe (|) used to represent nesting. The 1 to the left of the pipe indicates a random intercept, and the schid to the right of the pipe indicates that the intercept will vary across schools. The summary method reports teh results.

Typically, the results from the empty model are used to determine the amount of variance in the outcome that occurs at level-2 (school) versus level-1 (student), which is known as the intraclass correlation coefficient (ICC). The output tells us that the variance component for schid, our school grouping variable, is 8.614. The level-1 variance component is 39.148. Note that the output also displays the variance components in standard deviation units (the square root of the variance components). Be sure to use the variances, not the standard deviations, to estimate the intra-class correlation coefficient.

To get the ICC manually, we would calculate:

\[ ICC = \frac{8.614}{8.614 + 39.148} = .1804 \]

The convenience function icc from the sjstats package (Lüdecke, 2019) will do this for us.

sjstats::icc(mod_1)
## 
## Intraclass Correlation Coefficient for Linear mixed model
## 
## Family : gaussian (identity)
## Formula: mathach ~ (1 | schid)
## 
##   ICC (schid): 0.1804

This tells us that 18.04% of total variability exists at the school level.

The next step is to estimate the means-as-outcomes model.

Citations

Douglas Bates, Martin Maechler, Ben Bolker, Steve Walker (2015). Fitting Linear Mixed-Effects Models Using lme4. Journal of Statistical Software, 67(1), 1-48. doi:10.18637/jss.v067.i01.

Jared E. Knowles and Carl Frederick (2018). merTools: Tools for Analyzing Mixed Effect Regression Models. R package version 0.4.1. https://CRAN.R-project.org/package=merTools.

Kuznetsova A, Brockhoff PB, Christensen RHB (2017). “lmerTest Package: Tests in Linear Mixed Effects Models.” Journal of Statistical Software, 82(13), 1-26. doi: 10.18637/jss.v082.i13 (URL: http://doi.org/10.18637/jss.v082.i13).

Lüdecke D (2019). sjstats: Statistical Functions for Regression Models (Version 0.17.4)_. doi: 10.5281/zenodo.1284472 (URL: http://doi.org/10.5281/zenodo.1284472), <URL: https://CRAN.R-project.org/package=sjstats>.

Stephen W. Raudenbush and Anthony S. Bryk (2002). Hierarchical Linear Models: Applications and Data Analysis Methods. Thousand Oaks: Sage.