Random intercept & random slopes

For this exercise we load again the sleepstudy dataset:

library(lme4)
## Loading required package: Matrix
data("sleepstudy")
str(sleepstudy)
## 'data.frame':    180 obs. of  3 variables:
##  $ Reaction: num  250 259 251 321 357 ...
##  $ Days    : num  0 1 2 3 4 5 6 7 8 9 ...
##  $ Subject : Factor w/ 18 levels "308","309","310",..: 1 1 1 1 1 1 1 1 1 1 ...

and fit again the same full model, with both random intercepts and slopes:

model_full <- lmer(Reaction ~ Days + (Days|Subject), data=sleepstudy)

Questions

(Click on ‘solution’ buttons below to toggle open the answers to each question.)

1 Fit a model with only random intercepts, and compare it to the full model using a likelihood ratio test. What can you conclude from this test?

A model with only random intercepts assumes that individual participants have different “baseline” reaction times but are affected in the same way by sleep deprivation. We can estimate the model with only random intercepts using the code.

model_rintercept <- lmer(Reaction ~ Days + (1|Subject), data=sleepstudy)
summary(model_rintercept)
## Linear mixed model fit by REML ['lmerMod']
## Formula: Reaction ~ Days + (1 | Subject)
##    Data: sleepstudy
## 
## REML criterion at convergence: 1786.5
## 
## Scaled residuals: 
##     Min      1Q  Median      3Q     Max 
## -3.2257 -0.5529  0.0109  0.5188  4.2506 
## 
## Random effects:
##  Groups   Name        Variance Std.Dev.
##  Subject  (Intercept) 1378.2   37.12   
##  Residual              960.5   30.99   
## Number of obs: 180, groups:  Subject, 18
## 
## Fixed effects:
##             Estimate Std. Error t value
## (Intercept) 251.4051     9.7467   25.79
## Days         10.4673     0.8042   13.02
## 
## Correlation of Fixed Effects:
##      (Intr)
## Days -0.371

and run the likelihood ratio test using the function anova()

anova(model_full, model_rintercept)
## refitting model(s) with ML (instead of REML)
## Data: sleepstudy
## Models:
## model_rintercept: Reaction ~ Days + (1 | Subject)
## model_full: Reaction ~ Days + (Days | Subject)
##                  npar    AIC    BIC  logLik deviance  Chisq Df Pr(>Chisq)    
## model_rintercept    4 1802.1 1814.8 -897.04   1794.1                         
## model_full          6 1763.9 1783.1 -875.97   1751.9 42.139  2  7.072e-10 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

The likelihood ratio test is significant, with \(\chi^2=42.14\) and \(p=7.07 \times 10^{-10}\). This indicates that different participants vary substantially from one another in their individual values of the slope (that is in the effect of sleep deprivation) and that ignoring these differences result in a model that provides a worse fit to the data. (Or , similarly, that including random slopes to the model improve its ability to fit the data more than what you would expect by chance alone.)


2 Fit a model with only random slopes, and compare it to the full model using a likelihood ratio test. What can you conclude from this test?

This model would assume that all participants have the same baseline response time, but may be affected differently by sleep deprivation. We can estimate the model with only random slopes using the code:

model_rslope <- lmer(Reaction ~ Days + (0 + Days |Subject), data=sleepstudy)
summary(model_rslope)
## Linear mixed model fit by REML ['lmerMod']
## Formula: Reaction ~ Days + (0 + Days | Subject)
##    Data: sleepstudy
## 
## REML criterion at convergence: 1766.5
## 
## Scaled residuals: 
##     Min      1Q  Median      3Q     Max 
## -3.5104 -0.5588  0.0541  0.6244  4.6022 
## 
## Random effects:
##  Groups   Name Variance Std.Dev.
##  Subject  Days  52.71    7.26   
##  Residual      842.03   29.02   
## Number of obs: 180, groups:  Subject, 18
## 
## Fixed effects:
##             Estimate Std. Error t value
## (Intercept)   251.41       4.02  62.539
## Days           10.47       1.87   5.599
## 
## Correlation of Fixed Effects:
##      (Intr)
## Days -0.340

and run the likelihood ratio test using the function anova()

anova(model_full, model_rslope)
## refitting model(s) with ML (instead of REML)
## Data: sleepstudy
## Models:
## model_rslope: Reaction ~ Days + (0 + Days | Subject)
## model_full: Reaction ~ Days + (Days | Subject)
##              npar    AIC    BIC  logLik deviance  Chisq Df Pr(>Chisq)    
## model_rslope    4 1782.1 1794.8 -887.04   1774.1                         
## model_full      6 1763.9 1783.1 -875.97   1751.9 22.141  2  1.557e-05 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

The likelihood ratio test is significant, with \(\chi^2=22.14\) and \(p=1.56 \times 10^{-5}\). This indicates that different participants vary substantially from one another in their individual baseline response times.


3 Plot the data together with the predictions of the full models and the models with random intercept and random slopes using ggplot2 (for more advanced R users)

First, make sure we have loaded the plotting package

library(ggplot2)

Then, we add colums to the datasets with the predicted values of the model (these can be calculated easily using the predict() function).

sleepstudy$full <- predict(model_full)
sleepstudy$rslope <- predict(model_rslope)
sleepstudy$rintercept <- predict(model_rintercept)

In order to create a plot with a meaningful legend that identify the different model’s predictions, is necessary to reshape the data such that all predictions are in the same column, and we have an additional categorical variable that identify from which model they come frome. This can be done quite easily using the pivot_longer() function, in the tidyr package that is contained in the tidyverse - a collection of packages that aims to simplify data processing.

library(tidyverse)
## ── Attaching packages ─────────────────────────────────────── tidyverse 1.3.1 ──
## ✔ tibble  3.1.7     ✔ dplyr   1.0.9
## ✔ tidyr   1.2.0     ✔ stringr 1.4.0
## ✔ readr   2.1.2     ✔ forcats 0.5.1
## ✔ purrr   0.3.4
## ── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
## ✖ tidyr::expand() masks Matrix::expand()
## ✖ dplyr::filter() masks stats::filter()
## ✖ dplyr::lag()    masks stats::lag()
## ✖ tidyr::pack()   masks Matrix::pack()
## ✖ tidyr::unpack() masks Matrix::unpack()
data_long <- pivot_longer(sleepstudy,
                          cols=c("full","rslope","rintercept"), 
                          values_to="prediction",
                          names_to="model")
str(data_long)
## tibble [540 × 5] (S3: tbl_df/tbl/data.frame)
##  $ Reaction  : num [1:540] 250 250 250 259 259 ...
##  $ Days      : num [1:540] 0 0 0 1 1 1 2 2 2 3 ...
##  $ Subject   : Factor w/ 18 levels "308","309","310",..: 1 1 1 1 1 1 1 1 1 1 ...
##  $ model     : chr [1:540] "full" "rslope" "rintercept" "full" ...
##  $ prediction: num [1:540] 254 251 292 273 271 ...

We can now create the plot:

ggplot(data_long, aes(x=Days, y=Reaction))+ # indicate which variables are plotted in x and y axis
  facet_wrap(~Subject,ncol=9)+ # split the plot in separate facets for each participant
  geom_point()+ # add data points
  geom_line(aes(y=prediction, color=model)) # add prediction lines, with different colors for each model



Categorical predictors & dummy variable

The dataset ergoStool in the MEMSS package contains contains data from an ergonomic study in which 9 subjects evaluated the difficulty to adjusting the height of 4 different types of stool chairs.

data(ergoStool,package="MEMSS")
str(ergoStool)
## 'data.frame':    36 obs. of  3 variables:
##  $ effort : num  12 15 12 10 10 14 13 12 7 14 ...
##  $ Type   : Factor w/ 4 levels "T1","T2","T3",..: 1 2 3 4 1 2 3 4 1 2 ...
##  $ Subject: Factor w/ 9 levels "A","B","C","D",..: 1 1 1 1 2 2 2 2 3 3 ...

Questions

1 Fit this dataset with an appropriate multilevel model.

The model is estimated using the following command

m <- lmer(effort ~ Type + (1|Subject), ergoStool)
summary(m)
## Linear mixed model fit by REML ['lmerMod']
## Formula: effort ~ Type + (1 | Subject)
##    Data: ergoStool
## 
## REML criterion at convergence: 121.1
## 
## Scaled residuals: 
##      Min       1Q   Median       3Q      Max 
## -1.80200 -0.64317  0.05783  0.70100  1.63142 
## 
## Random effects:
##  Groups   Name        Variance Std.Dev.
##  Subject  (Intercept) 1.775    1.332   
##  Residual             1.211    1.100   
## Number of obs: 36, groups:  Subject, 9
## 
## Fixed effects:
##             Estimate Std. Error t value
## (Intercept)   8.5556     0.5760  14.853
## TypeT2        3.8889     0.5187   7.498
## TypeT3        2.2222     0.5187   4.284
## TypeT4        0.6667     0.5187   1.285
## 
## Correlation of Fixed Effects:
##        (Intr) TypeT2 TypeT3
## TypeT2 -0.450              
## TypeT3 -0.450  0.500       
## TypeT4 -0.450  0.500  0.500

Note that we use only random intercept and not random slopes. If we try to run a model with also random slopes we receive an error:

lmer(effort ~ Type + (Type|Subject), ergoStool)
## Error: number of observations (=36) <= number of random effects (=36) for term (Type | Subject); the random-effects parameters and the residual variance (or scale parameter) are probably unidentifiable

This error means that we do not have enough information to estimate the full model with random slopes. Since we only have 1 measurement for each subject-chair combination, the model would not be able to disentangle the random effects from the residual random variability. Estimating such model would be possible if we had mode than 1 observations for each subject-chair (e.g. if subject repeatedly evaluated the same chairs at different times).


2 Use the model estimates to identify which seat type requires more effort to adjust, and which it requires less. Find the 2 seats that requires the most effort to adjust and test whether they significantly different from each other.

From the output of the model we can see T2 is the seat that requires more effort, and the baseline T1 is the seat that requires less effort, whereas T2 and T3 are the two seat that requires greater effort.

print(summary(m)$coefficients, digits=4)
##             Estimate Std. Error t value
## (Intercept)   8.5556     0.5760  14.853
## TypeT2        3.8889     0.5187   7.498
## TypeT3        2.2222     0.5187   4.284
## TypeT4        0.6667     0.5187   1.285

In this model, however, each chair is contrasted with the baseline T1, so it doesn’t provide information about whether T2 and T3 are different from each other. You can double check this by examining the contrast matrix of the factor Type, which reveal the underlying dummy variable coding explicitly.

contrasts(ergoStool$Type)
##    T2 T3 T4
## T1  0  0  0
## T2  1  0  0
## T3  0  1  0
## T4  0  0  1

To test whether we T2 really worse than seat T3, we can modify the contrast matrix as follow:

levels(ergoStool$Type) # these are the levels of the Type variable; we want T2 (the second listed) as baseline
## [1] "T1" "T2" "T3" "T4"
contrasts(ergoStool$Type) <- contr.treatment(levels(ergoStool$Type), base=2)

next we can re-run the model

m <- lmer(effort ~ Type + (1|Subject), ergoStool)
summary(m)
## Linear mixed model fit by REML ['lmerMod']
## Formula: effort ~ Type + (1 | Subject)
##    Data: ergoStool
## 
## REML criterion at convergence: 121.1
## 
## Scaled residuals: 
##      Min       1Q   Median       3Q      Max 
## -1.80200 -0.64317  0.05783  0.70100  1.63142 
## 
## Random effects:
##  Groups   Name        Variance Std.Dev.
##  Subject  (Intercept) 1.775    1.332   
##  Residual             1.211    1.100   
## Number of obs: 36, groups:  Subject, 9
## 
## Fixed effects:
##             Estimate Std. Error t value
## (Intercept)  12.4444     0.5760  21.604
## TypeT1       -3.8889     0.5187  -7.498
## TypeT3       -1.6667     0.5187  -3.213
## TypeT4       -3.2222     0.5187  -6.212
## 
## Correlation of Fixed Effects:
##        (Intr) TypeT1 TypeT3
## TypeT1 -0.450              
## TypeT3 -0.450  0.500       
## TypeT4 -0.450  0.500  0.500

The \(t\) value is quite large (>3) suggesting a significant difference. We could test this with bootstrapping by estimating confidence intervals around the parameters of interest. Note that we could also correct the confidence intervals for multiple comparisons. We have \(\binom{4}{2}=6\) possible pair of chairs to compare, so using Bonferroni correction we should use as confidence level \(1 - \frac{\alpha}{6}\), where \(\alpha\) is our chosen acceptable probability of a Type 1 error (usually \(0.05\)).

confint(m, parm="TypeT3", method="boot", level = 1 - 0.05/6)
## Computing bootstrap confidence intervals ...
##          0.417 %   99.583 %
## TypeT3 -2.874605 -0.1793896



Treatment effects in longitudinal studies

The cd4_mod.csv dataset has measurements of the strength of the immune system (CD4 percentages; these are white blood cells also known as T-helper cells) for a set of HIV-positive children who were not given zinc and who were measured several times over a period of about 2 years. The dataset also includes the ages of the children at baseline.

Load the dataset into R:

cd4 <- read.csv("https://raw.githubusercontent.com/mattelisi/RHUL-stats/main/data/cd4_mod.csv")
str(cd4)
## 'data.frame':    1228 obs. of  7 variables:
##  $ treatmnt: int  1 1 1 1 1 1 1 2 2 1 ...
##  $ baseage : num  3.91 3.91 3.91 3.91 3.91 ...
##  $ CD4PCT  : num  18 37 13 NA 13 NA 12 1 0.3 28 ...
##  $ time    : num  0 0.559 0.789 NA 1.422 ...
##  $ newpid  : int  1 1 1 1 1 1 1 2 2 3 ...
##  $ VISIT   : int  1 4 7 10 13 16 19 1 4 1 ...
##  $ CD4CNT  : int  323 610 324 NA 626 220 220 30 4 714 ...

We will use CD4PCT (percentage of CD4 cells) as main dependent variable and time as temporal variable (this is the time passed from the visit, expressed in years). baseage is the age at baseline and newpid is a numerical code that identifies each child.

Questions

(Click on ‘solution’ buttons below to toggle open the answers to each question.)

1 Analyse the CD4 percentages as a function of time (regardless of treatment) with a model with both random intercepts and random slopes. Include also the age a baseline as covariate. Use the lmerTest library to estimate p-values. What can we conclude from this analysis?

Load in the workspace the required libraries

library(lme4)
library(lmerTest)

Estimate the model and examine the result

model1 <- lmer(CD4PCT ~ baseage + time + (time |newpid), cd4)
summary(model1)
## Linear mixed model fit by REML. t-tests use Satterthwaite's method [
## lmerModLmerTest]
## Formula: CD4PCT ~ baseage + time + (time | newpid)
##    Data: cd4
## 
## REML criterion at convergence: 7670.6
## 
## Scaled residuals: 
##     Min      1Q  Median      3Q     Max 
## -4.7704 -0.4206 -0.0646  0.3476  6.2823 
## 
## Random effects:
##  Groups   Name        Variance Std.Dev. Corr 
##  newpid   (Intercept) 137.91   11.744        
##           time         24.97    4.997   -0.29
##  Residual              47.20    6.870        
## Number of obs: 1049, groups:  newpid, 227
## 
## Fixed effects:
##             Estimate Std. Error       df t value Pr(>|t|)    
## (Intercept)  28.7548     1.4866 238.2089  19.342  < 2e-16 ***
## baseage      -1.0800     0.3498 220.4910  -3.088  0.00228 ** 
## time         -2.8061     0.6170 182.8976  -4.548 9.82e-06 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Correlation of Fixed Effects:
##         (Intr) baseag
## baseage -0.819       
## time    -0.213 -0.014

The fixed effect of time is \(\beta_{\text{time}}=-2.81\); this is the average slope for time across all children. The standard deviation of the slope across children is \(\sigma_{\beta_{\text{time}}}=5.00\) (see Random effects: part of the output). The results thus indicate that most children - although not all - have declining CD4 levels during the period examined.


2 Create another model including also treatment as predictor. What can be concluded from this analysis?

Note: treatment is currently encoded as 1 and 2 (two integer numbers), so you need to make sure that R knows it is actually a categorical predictor. Moreover, since the treatment can only have effect after it is started, it cannot influence the CD4 percentage as baseline; to account for that formulate the model such that only the slope differs across treatment groups, while the intercept (that is the CD4 percentage a baseline) can be assumed to be similar.

First, tell R that treatment is a categorical predictor using the factor function

cd4$treatmnt <- factor(cd4$treatmnt)
contrasts(cd4$treatmnt) # this reveal how this is encoded as dummy variable
##   2
## 1 0
## 2 1

Next, lets formulate the model. Since the intercept should not differ across treatment groups, the treatment should be entered only in interaction with time but not as ‘main effect’.

model2 <- lmer(CD4PCT ~ baseage + time + time:treatmnt + (time |newpid), cd4)
summary(model2)
## Linear mixed model fit by REML. t-tests use Satterthwaite's method [
## lmerModLmerTest]
## Formula: CD4PCT ~ baseage + time + time:treatmnt + (time | newpid)
##    Data: cd4
## 
## REML criterion at convergence: 7667.4
## 
## Scaled residuals: 
##     Min      1Q  Median      3Q     Max 
## -4.7662 -0.4121 -0.0661  0.3450  6.2416 
## 
## Random effects:
##  Groups   Name        Variance Std.Dev. Corr 
##  newpid   (Intercept) 137.90   11.743        
##           time         25.30    5.030   -0.30
##  Residual              47.22    6.872        
## Number of obs: 1049, groups:  newpid, 227
## 
## Fixed effects:
##                Estimate Std. Error       df t value Pr(>|t|)    
## (Intercept)     28.7377     1.4838 238.3252  19.368  < 2e-16 ***
## baseage         -1.0755     0.3488 219.6681  -3.084  0.00231 ** 
## time            -3.3527     0.8135 189.9055  -4.121 5.63e-05 ***
## time:treatmnt2   1.1944     1.1380 175.8608   1.050  0.29533    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Correlation of Fixed Effects:
##             (Intr) baseag time  
## baseage     -0.818              
## time        -0.161 -0.015       
## tim:trtmnt2 -0.006  0.008 -0.650

The positive interaction coefficient time:treatmnt2 would suggest that children who were given the treatment CD4 levels declined with a slightly lower rate than children who were not given the treatment. However, the test on the interaction coefficient is not significant, thus we cannot reject the null hypothesis that the difference between treatment and control is due to chance.


3 Compare the two models you have created at the previous steps with a likelihood ratio test.

anova(model1, model2)
## refitting model(s) with ML (instead of REML)
## Data: cd4
## Models:
## model1: CD4PCT ~ baseage + time + (time | newpid)
## model2: CD4PCT ~ baseage + time + time:treatmnt + (time | newpid)
##        npar    AIC    BIC  logLik deviance  Chisq Df Pr(>Chisq)
## model1    7 7686.5 7721.2 -3836.3   7672.5                     
## model2    8 7687.4 7727.1 -3835.7   7671.4 1.1055  1     0.2931

(Note that the p-value is similar to that of the t-test above).


4 Plot the data alongside the fixed-effects predictions of model2 (advanced R users)

In order to plot the predictions, it is in this case simpler to use the predict function on a separated, simplified dataframe object. This dataset can have only 2 time points (the min and max of time) as this is sufficient to represent a line. Furthermore, we set baseage to the average of the children in the study.

pred_df <- expand.grid(time = range(cd4$time, na.rm=T),
                       treatmnt = unique(cd4$treatmnt),
                       baseage = mean(cd4$baseage, na.rm=T))

pred_df
##       time treatmnt  baseage
## 1 0.000000        1 3.580798
## 2 1.939726        1 3.580798
## 3 0.000000        2 3.580798
## 4 1.939726        2 3.580798

Next, we can calculate the predictions. Note the use of the argument re.form=NA: it indicates R that we do not want to include random effects when calculating the predictions. (see ?predict.merMod for details on the arguments available)

pred_df$CD4PCT <- predict(model2, re.form=NA, newdata=pred_df)
pred_df
##       time treatmnt  baseage   CD4PCT
## 1 0.000000        1 3.580798 24.88651
## 2 1.939726        1 3.580798 18.38318
## 3 0.000000        2 3.580798 24.88651
## 4 1.939726        2 3.580798 20.70007

Now we can use these to include in the plot

ggplot(cd4, aes(x=time, y=CD4PCT))+ 
  facet_grid(.~treatmnt)+ 
  geom_point()+ 
  geom_line(data=pred_df,size=2,color="blue")
## Warning: Removed 176 rows containing missing values (geom_point).

Bonus: use bootstrapping to estimate and plot uncertainty around mean predictions

The code below use parametric bootstrapping to estimate a 95% confidence interval around the predicted mean CD4PCT value.

bootFUN <- function(model){
   pred_cd4 <- predict(model, re.form=NA, newdata=pred_df)
   return(pred_cd4)
}

boot_res <- bootMer(model2, bootFUN, nsim=250)
dim(boot_res$t)
## [1] 250   4
pred_df$lower <- apply(boot_res$t, 2, function(x){quantile(x, probs = 0.025)})
pred_df$upper <- apply(boot_res$t, 2, function(x){quantile(x, probs = 0.975)})
pred_df$CD4PCT <- predict(model2, re.form=NA, newdata=pred_df)

ggplot(cd4, aes(x=time, y=CD4PCT))+ 
  facet_grid(.~treatmnt)+ 
  geom_point()+ 
  geom_ribbon(data=pred_df,aes(ymin=lower, ymax=upper, fill=treatmnt), alpha=0.5)+
  geom_line(data=pred_df,size=2,aes(color=treatmnt))
## Warning: Removed 176 rows containing missing values (geom_point).



Multiple predictors & interactions

The mathachieve dataset is from the 1982 “High School and Beyond” survey, and containts data from 7185 high-school students from 160 schools.

mathachieve <- read.csv("https://raw.githubusercontent.com/mattelisi/RHUL-stats/main/data/mathachieve.csv")
str(mathachieve)
## 'data.frame':    7185 obs. of  8 variables:
##  $ school  : int  1224 1224 1224 1224 1224 1224 1224 1224 1224 1224 ...
##  $ minority: chr  "No" "No" "No" "No" ...
##  $ sex     : chr  "Female" "Female" "Male" "Male" ...
##  $ ses     : num  -1.528 -0.588 -0.528 -0.668 -0.158 ...
##  $ mathach : num  5.88 19.71 20.35 8.78 17.9 ...
##  $ meanses : num  -0.428 -0.428 -0.428 -0.428 -0.428 -0.428 -0.428 -0.428 -0.428 -0.428 ...
##  $ size    : int  842 842 842 842 842 842 842 842 842 842 ...
##  $ sector  : chr  "Public" "Public" "Public" "Public" ...

The response variable variable mathach is the student’s score on a math-achievement test; the remaining are possible explanatory variables. In particular note that - ses, the adjusted socioeconomic status of the student’s family; - meanses, the average socioeconomic status for students in each school; - size, the number of students in each school; - sector, a factor coded Catholic or Public for the type of student’s school.

The variable school is an identification number for the student’s school. Note that sector and meanses are a school-level variable and hence is identical for all students in the same school.

Questions

(Click on ‘solution’ buttons below to toggle open the answers to each question.)

2 Add the interaction term ses:sector to the model, and evaluate whether it improves the model ability to fit the data using a likelihood ratio test. Interpret the results and calculate the slope for ses in catholic and public schools.

We can add the interaction by using:

math.2 <- lmer(mathach ~ meanses + sector + ses + size+ ses:sector + (ses | school), data=mathachieve)
## Warning: Some predictor variables are on very different scales: consider
## rescaling

## Warning: Some predictor variables are on very different scales: consider
## rescaling
summary(math.2)
## Linear mixed model fit by REML. t-tests use Satterthwaite's method [
## lmerModLmerTest]
## Formula: mathach ~ meanses + sector + ses + size + ses:sector + (ses |  
##     school)
##    Data: mathachieve
## 
## REML criterion at convergence: 46525.7
## 
## Scaled residuals: 
##     Min      1Q  Median      3Q     Max 
## -3.0909 -0.7285  0.0204  0.7553  3.0364 
## 
## Random effects:
##  Groups   Name        Variance Std.Dev. Corr
##  school   (Intercept)  2.30811 1.5192       
##           ses          0.05868 0.2422   0.51
##  Residual             36.79626 6.0660       
## Number of obs: 7185, groups:  school, 160
## 
## Fixed effects:
##                    Estimate Std. Error         df t value Pr(>|t|)    
## (Intercept)       1.311e+01  2.916e-01  1.437e+02  44.965  < 2e-16 ***
## meanses           3.098e+00  3.779e-01  1.642e+02   8.199 6.59e-14 ***
## sectorPublic     -1.520e+00  3.345e-01  1.435e+02  -4.543 1.17e-05 ***
## ses               1.449e+00  1.624e-01  1.810e+02   8.923 4.95e-16 ***
## size              4.416e-04  2.507e-04  1.415e+02   1.761   0.0804 .  
## sectorPublic:ses  1.328e+00  2.126e-01  1.588e+02   6.246 3.71e-09 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Correlation of Fixed Effects:
##             (Intr) meanss sctrPb ses    size  
## meanses     -0.161                            
## sectorPublc -0.226  0.317                     
## ses          0.017 -0.209 -0.047              
## size        -0.667 -0.032 -0.427  0.002       
## sctrPblc:ss  0.013  0.018  0.063 -0.735 -0.003
## fit warnings:
## Some predictor variables are on very different scales: consider rescaling

To run the likelihood ratio test use:

anova(math.1, math.2)
## refitting model(s) with ML (instead of REML)
## Data: mathachieve
## Models:
## math.1: mathach ~ meanses + sector + ses + size + (ses | school)
## math.2: mathach ~ meanses + sector + ses + size + ses:sector + (ses | school)
##        npar   AIC   BIC logLik deviance  Chisq Df Pr(>Chisq)    
## math.1    9 46557 46619 -23270    46539                         
## math.2   10 46524 46593 -23252    46504 35.232  1  2.927e-09 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

This indicates that the interaction term does indeed improve the model.

Note that the interaction term is significant and positive. One possible way to interpret this would be that this means that the different in achievements that can be explained by ses is larger in public schools than in catholic schools. Indeed, the slope for ses is the following in Catholic schools:

fixef(math.2)["ses"]
##      ses 
## 1.448998

and is the following in the Public schools:

fixef(math.2)["ses"] + fixef(math.2)["sectorPublic:ses"]
##     ses 
## 2.77657

Another possible interpretation is that ses matter more in the lower end of the range (e.g. from lower to medium ses) that in the upper end (e.g. from medium to high ses). In fact, the following plot show the distribution of mean ses values split by sector and we can see that students in catholic schools had, on average, higher ses

mathachieve %>%
  group_by(school, sector) %>%
  summarise(ses = mean(ses)) %>%
  ggplot(aes(x=sector, y=ses))+
  geom_violin() 
## `summarise()` has grouped output by 'school'. You can override using the
## `.groups` argument.