Additive Models

Solutions

Load packages and data

Today

By the end of today you will…

  • understand the difference between an additive and an interaction model
  • understand the geometric picture of multiple linear regression
  • be able to build, fit and interpret linear models with \(>1\) predictor

Additive vs interaction model

We are going to fit a model that looks at how flipper length relates to bill length, while also accounting for the species of penguin. In this context, define what an additive model is and what an interaction model is.

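additive: The relationship between flipper length and bill length does not depend on species. Every species has the same slope; only the intercept differs.

interaction: The relationship between flipper length and bill length depends on species. Each species can have its own slope as well as its own intercept.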
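
As a sketch of how the two definitions look as equations: the symbols below are notation assumed here for illustration, not taken from the slides. \(x\) is bill length, \(\hat{y}\) is predicted flipper length, and \(\text{Chinstrap}\) and \(\text{Gentoo}\) are 0/1 indicator variables, with Adelie as the baseline species.

\[
\begin{aligned}
\text{additive:} \quad \hat{y} &= b_0 + b_1 x + b_2\,\text{Chinstrap} + b_3\,\text{Gentoo} \\
\text{interaction:} \quad \hat{y} &= b_0 + b_1 x + b_2\,\text{Chinstrap} + b_3\,\text{Gentoo} + b_4\,(x \cdot \text{Chinstrap}) + b_5\,(x \cdot \text{Gentoo})
\end{aligned}
\]

In the additive form the slope on \(x\) is \(b_1\) for every species. In the interaction form the slope is \(b_1\) for Adelie, \(b_1 + b_4\) for Chinstrap and \(b_1 + b_5\) for Gentoo.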

Plots

additive model

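# Drop rows with missing values so the predictions line up with the rows of the data
penguins <- na.omit(penguins)
fitlm <- lm(flipper_length_mm ~ bill_length_mm + species, data = penguins)
# Store the fitted values so each species' line can be drawn
penguins$predlm <- predict(fitlm)
ggplot(penguins, aes(x = bill_length_mm,
                     y = flipper_length_mm,
                     color = species)) +
  geom_point() +
  geom_line(aes(y = predlm), linewidth = 1)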

interaction model

penguins |>
  ggplot(aes(x = bill_length_mm,
             y = flipper_length_mm,
             color = species)) +
  geom_point() +
  # A separate least-squares line per species, so the slopes can differ
  geom_smooth(method = "lm", se = FALSE)
`geom_smooth()` using formula = 'y ~ x'

Fitting the additive model

To fit the additive model, join the predictors with a + sign in the model formula. Here we add species to the linear model from Monday’s class.

model1 <- lm(flipper_length_mm ~ bill_length_mm + species, data = penguins)

summary(model1)

Call:
lm(formula = flipper_length_mm ~ bill_length_mm + species, data = penguins)

Residuals:
     Min       1Q   Median       3Q      Max 
-24.8669  -3.4617  -0.0765   3.7020  15.9944 

Coefficients:
                 Estimate Std. Error t value Pr(>|t|)    
(Intercept)      147.5633     4.2234  34.940  < 2e-16 ***
bill_length_mm     1.0957     0.1081  10.139  < 2e-16 ***
speciesChinstrap  -5.2470     1.3797  -3.803  0.00017 ***
speciesGentoo     17.5517     1.1883  14.771  < 2e-16 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 5.833 on 329 degrees of freedom
Multiple R-squared:  0.8283,    Adjusted R-squared:  0.8268 
F-statistic: 529.2 on 3 and 329 DF,  p-value: < 2.2e-16

What are the equations for the three different species in the additive model?

See the slides!
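
As one possible way to check an answer against the summary(model1) output, the sketch below copies the printed estimates by hand into a named vector and builds the three lines from them. The names b, intercepts and slope are made up for this illustration. Because the estimates are rounded as printed, the results are approximate, and the slides remain the reference answer.

```r
# Estimates copied by hand from the summary(model1) output (rounded as printed)
b <- c(intercept      = 147.5633,
       bill_length_mm = 1.0957,
       chinstrap      = -5.2470,
       gentoo         = 17.5517)

# Additive model: every species shares one slope; the species terms shift the intercept.
# Adelie is the baseline level, so its shift is 0.
intercepts <- c(Adelie    = b[["intercept"]],
                Chinstrap = b[["intercept"]] + b[["chinstrap"]],
                Gentoo    = b[["intercept"]] + b[["gentoo"]])
slope <- b[["bill_length_mm"]]

# Print one fitted line per species
for (sp in names(intercepts)) {
  cat(sprintf("%-9s: predicted flipper_length_mm = %.4f + %.4f * bill_length_mm\n",
              sp, intercepts[[sp]], slope))
}
```

With these rounded estimates the intercepts come out to 147.5633 for Adelie, 142.3163 for Chinstrap and 165.1150 for Gentoo, each with the common slope 1.0957, so the three lines are parallel.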