ggpredict() and ggemmeans() compute predicted values for all possible levels or values of a model’s predictors. Essentially, ggpredict() wraps the predict() method for the respective model, while ggemmeans() wraps the emmeans() function from the emmeans package. Both ggpredict() and ggemmeans() prepare the data so that it has the shape required by the newdata argument of predict() or the at argument of emmeans(), respectively. If you haven’t done so yet, it is recommended to read the general introduction first.
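To make the idea of wrapping predict() more concrete, the following is a minimal sketch of how a data grid could be built by hand and passed to the newdata argument. It is not the actual implementation of ggpredict(); the object names mf, nd and pr and the chosen c12hour values are only illustrative, and the model uses the efc example data shipped with ggeffects.

```r
library(ggeffects)  # only needed here for the efc example data
data(efc)
fit <- lm(barthtot ~ c12hour + neg_c_7, data = efc)

# Rows actually used in the fit (lm() drops incomplete cases)
mf <- model.frame(fit)

# Data grid: vary the focal term, hold the other predictor at its mean
nd <- data.frame(
  c12hour = c(0, 20, 45, 65, 85, 105, 125, 170),
  neg_c_7 = mean(mf$neg_c_7)
)

# predict() evaluates the model on the grid passed via newdata
pr <- predict(fit, newdata = nd, interval = "confidence")
cbind(nd, pr)
```

Whether these intervals agree exactly with those reported by ggpredict() depends on how standard errors are converted into intervals, so small discrepancies are to be expected.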
For models without categorical predictors, the results from ggpredict() and ggemmeans() are identical (apart from slight, negligible differences in the associated confidence intervals).
library(ggeffects)
data(efc)
fit <- lm(barthtot ~ c12hour + neg_c_7, data = efc)
ggpredict(fit, terms = "c12hour")
#>
#> # Predicted values of Total score BARTHEL INDEX
#> # x = average number of hours of care per week
#>
#> x predicted std.error conf.low conf.high
#> 0 75.072 1.077 72.962 77.183
#> 20 70.155 0.895 68.400 71.909
#> 45 64.008 0.818 62.405 65.610
#> 65 59.090 0.902 57.323 60.857
#> 85 54.172 1.087 52.042 56.302
#> 105 49.255 1.331 46.645 51.864
#> 125 44.337 1.609 41.184 47.490
#> 170 33.272 2.289 28.787 37.758
#>
#> Adjusted for:
#> * neg_c_7 = 11.83
ggemmeans(fit, terms = "c12hour")
#>
#> # Predicted values of Total score BARTHEL INDEX
#> # x = average number of hours of care per week
#>
#> x predicted std.error conf.low conf.high
#> 0 75.072 1.077 72.959 77.186
#> 20 70.155 0.895 68.398 71.912
#> 45 64.008 0.818 62.403 65.612
#> 65 59.090 0.902 57.320 60.860
#> 85 54.172 1.087 52.039 56.305
#> 105 49.255 1.331 46.641 51.868
#> 125 44.337 1.609 41.180 47.494
#> 170 33.272 2.289 28.780 37.764
#>
#> Adjusted for:
#> * neg_c_7 = 11.83
As can be seen, the continuous predictor neg_c_7 is held constant at its mean value, 11.83. For categorical predictors, ggpredict() and ggemmeans() behave differently. While ggpredict() holds each categorical predictor constant at its reference level, ggemmeans(), like ggeffect(), averages over the proportions of the categories of factors.
library(sjmisc)
data(efc)
efc$e42dep <- to_label(efc$e42dep)
fit <- lm(barthtot ~ c12hour + neg_c_7 + e42dep, data = efc)
ggpredict(fit, terms = "c12hour")
#>
#> # Predicted values of Total score BARTHEL INDEX
#> # x = average number of hours of care per week
#>
#> x predicted std.error conf.low conf.high
#> 0 92.745 2.173 88.485 97.004
#> 20 91.317 2.169 87.067 95.567
#> 45 89.532 2.208 85.206 93.859
#> 65 88.105 2.274 83.649 92.561
#> 85 86.677 2.368 82.037 91.318
#> 105 85.250 2.486 80.376 90.123
#> 125 83.822 2.627 78.674 88.970
#> 170 80.610 3.005 74.721 86.499
#>
#> Adjusted for:
#> * neg_c_7 = 11.83
#> * e42dep = independent
ggemmeans(fit, terms = "c12hour")
#>
#> # Predicted values of Total score BARTHEL INDEX
#> # x = average number of hours of care per week
#>
#> x predicted std.error conf.low conf.high
#> 0 73.515 0.846 71.853 75.176
#> 20 72.087 0.734 70.646 73.528
#> 45 70.302 0.718 68.894 71.711
#> 65 68.875 0.809 67.287 70.462
#> 85 67.447 0.966 65.550 69.344
#> 105 66.019 1.164 63.735 68.304
#> 125 64.592 1.384 61.875 67.309
#> 170 61.380 1.922 57.608 65.152
#>
#> Adjusted for:
#> * neg_c_7 = 11.83
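The difference between holding a factor at its reference level and averaging over its categories can be sketched by hand with predict(). This is only an illustration of the two strategies, assuming that the averaging weights are the observed category proportions in the model frame. It is not the code that ggpredict() or ggemmeans() run internally, and the names x_vals, nd_ref, nd_all and props are hypothetical.

```r
library(ggeffects)
library(sjmisc)
data(efc)
efc$e42dep <- to_label(efc$e42dep)
fit <- lm(barthtot ~ c12hour + neg_c_7 + e42dep, data = efc)

mf <- model.frame(fit)
lvls <- levels(mf$e42dep)
x_vals <- c(0, 20, 45)

# Strategy 1: hold the factor at its reference (first) level
nd_ref <- data.frame(
  c12hour = x_vals,
  neg_c_7 = mean(mf$neg_c_7),
  e42dep = factor(lvls[1], levels = lvls)
)
pred_ref <- unname(predict(fit, newdata = nd_ref))

# Strategy 2: predict for every level, then average the predictions
# using the observed category proportions as weights (an assumption)
props <- prop.table(table(mf$e42dep))
nd_all <- expand.grid(
  c12hour = x_vals,
  neg_c_7 = mean(mf$neg_c_7),
  e42dep = factor(lvls, levels = lvls)
)
nd_all$pred <- predict(fit, newdata = nd_all)
pred_avg <- vapply(
  split(nd_all, nd_all$c12hour),
  function(d) sum(d$pred * as.numeric(props[as.character(d$e42dep)])),
  numeric(1)
)

data.frame(c12hour = x_vals, reference = pred_ref, averaged = unname(pred_avg))
```

The reference column corresponds to the first strategy and the averaged column to the second; how closely the latter matches the ggemmeans() output depends on the weighting that emmeans applies.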
Both functions yield the same predictions again if condition is used to define specific levels at which variables, in our case the factor e42dep, should be held constant.
ggpredict(fit, terms = "c12hour")
#>
#> # Predicted values of Total score BARTHEL INDEX
#> # x = average number of hours of care per week
#>
#> x predicted std.error conf.low conf.high
#> 0 92.745 2.173 88.485 97.004
#> 20 91.317 2.169 87.067 95.567
#> 45 89.532 2.208 85.206 93.859
#> 65 88.105 2.274 83.649 92.561
#> 85 86.677 2.368 82.037 91.318
#> 105 85.250 2.486 80.376 90.123
#> 125 83.822 2.627 78.674 88.970
#> 170 80.610 3.005 74.721 86.499
#>
#> Adjusted for:
#> * neg_c_7 = 11.83
#> * e42dep = independent
ggemmeans(fit, terms = "c12hour", condition = c(e42dep = "independent"))
#>
#> # Predicted values of Total score BARTHEL INDEX
#> # x = average number of hours of care per week
#>
#> x predicted std.error conf.low conf.high
#> 0 92.745 2.173 88.479 97.010
#> 20 91.317 2.169 87.061 95.573
#> 45 89.532 2.208 85.199 93.865
#> 65 88.105 2.274 83.642 92.567
#> 85 86.677 2.368 82.030 91.324
#> 105 85.250 2.486 80.370 90.130
#> 125 83.822 2.627 78.667 88.977
#> 170 80.610 3.005 74.712 86.507
#>
#> Adjusted for:
#> * neg_c_7 = 11.83
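Instead of comparing the printed tables by eye, the agreement can also be checked programmatically. The sketch below assumes that both returned data frames contain the columns predicted and conf.low, as shown in the printed output; p1 and p2 are illustrative names.

```r
library(ggeffects)
library(sjmisc)
data(efc)
efc$e42dep <- to_label(efc$e42dep)
fit <- lm(barthtot ~ c12hour + neg_c_7 + e42dep, data = efc)

p1 <- ggpredict(fit, terms = "c12hour")
p2 <- ggemmeans(fit, terms = "c12hour", condition = c(e42dep = "independent"))

# Compare the point predictions, tolerating tiny numerical differences
all.equal(p1$predicted, p2$predicted)

# Largest absolute difference between the lower confidence limits
max(abs(p1$conf.low - p2$conf.low))
```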
Creating plots is as simple as described in the vignette Plotting Marginal Effects.
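As a brief sketch, and assuming that ggplot2 is installed, the object returned by ggpredict() (or ggemmeans()) can be passed to plot(). The names pr and p are illustrative.

```r
library(ggeffects)
data(efc)
fit <- lm(barthtot ~ c12hour + neg_c_7, data = efc)

# Predictions for the focal term, then the plot method for the result
pr <- ggpredict(fit, terms = "c12hour")
p <- plot(pr)
p
```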