Testing mtcars (simulation)

This code simulates the sampling distribution of the sample mean under the null hypothesis. The key idea is to shift the original data by the difference between the sample mean and the null value, so that resampling with replacement produces a sampling distribution centered at the null value. See the slides for the written-out steps!
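As a rough illustration of why the shift works, here is a minimal base R sketch. It is not the course's infer pipeline. It uses the built-in `mtcars` data and the null value of 12 from class; the names `null_value`, `obs_mean`, `shifted`, and `boot_means`, the seed, and the number of resamples are assumptions made only for this example.

```r
# Sketch of the shift-then-resample idea in base R (no extra packages).
set.seed(1)                      # arbitrary seed, only for reproducibility
null_value <- 12                 # mu under the null hypothesis
obs_mean   <- mean(mtcars$mpg)   # observed sample mean

# Shift every observation so the shifted data have mean equal to null_value.
shifted <- mtcars$mpg - (obs_mean - null_value)

# Resample the shifted data with replacement and record each resample's mean.
boot_means <- replicate(5000, mean(sample(shifted, replace = TRUE)))

mean(shifted)      # equals null_value, up to floating-point error
mean(boot_means)   # should land close to null_value
```

Shifting changes only the center of the data, not its spread, so the resampled means keep the variability of the original sample while being centered where the null hypothesis says they should be.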


The Data

Rows: 32
Columns: 11
$ mpg  <dbl> 21.0, 21.0, 22.8, 21.4, 18.7, 18.1, 14.3, 24.4, 22.8, 19.2, 17.8,…
$ cyl  <dbl> 6, 6, 4, 6, 8, 6, 8, 4, 4, 6, 6, 8, 8, 8, 8, 8, 8, 4, 4, 4, 4, 8,…
$ disp <dbl> 160.0, 160.0, 108.0, 258.0, 360.0, 225.0, 360.0, 146.7, 140.8, 16…
$ hp   <dbl> 110, 110, 93, 110, 175, 105, 245, 62, 95, 123, 123, 180, 180, 180…
$ drat <dbl> 3.90, 3.90, 3.85, 3.08, 3.15, 2.76, 3.21, 3.69, 3.92, 3.92, 3.92,…
$ wt   <dbl> 2.620, 2.875, 2.320, 3.215, 3.440, 3.460, 3.570, 3.190, 3.150, 3.…
$ qsec <dbl> 16.46, 17.02, 18.61, 19.44, 17.02, 20.22, 15.84, 20.00, 22.90, 18…
$ vs   <dbl> 0, 0, 1, 1, 0, 1, 0, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 0,…
$ am   <dbl> 1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 0, 0,…
$ gear <dbl> 4, 4, 4, 3, 3, 3, 3, 4, 4, 4, 4, 3, 3, 3, 3, 3, 3, 4, 4, 4, 3, 3,…
$ carb <dbl> 4, 4, 1, 1, 2, 1, 4, 2, 2, 4, 4, 3, 3, 3, 4, 4, 4, 1, 2, 1, 1, 2,…

Null and Alternative

From class context:

\(H_0: \mu = 12\)

\(H_a: \mu > 12\)

Sample Mean

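library(tidyverse)  # summarize(), mutate(), ggplot()
library(infer)      # specify(), generate(), calculate(), get_p_value()
xbar <- mtcars |>
  summarize(stat = mean(mpg))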

1 20.09062

The Difference

xbar - 12
1 8.090625

The difference is about 8.09 (8.090625 before rounding).

Shift MPG by diff

mtcars <- mtcars |>
  mutate(mpg_shift = mpg - 8.09)
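Because the shift uses the rounded value 8.09 rather than the exact difference, the shifted data are centered very slightly above 12. The base R check below is a small sketch of how one might confirm the offset is negligible; `mpg_shift_check` and `offset` are hypothetical names used only here so the data frame above is left untouched.

```r
# Apply the same rounded shift to the raw mpg values.
mpg_shift_check <- mtcars$mpg - 8.09

# The exact difference is mean(mpg) - 12, so the rounded shift leaves a tiny offset.
offset <- mean(mpg_shift_check) - 12
offset   # a small positive number, well below 0.001
```

An offset this small has no practical effect on the null distribution. Subtracting the unrounded difference instead would remove it entirely.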

The Process


null_df <- mtcars |>
  specify(response = mpg_shift) |>
  generate(reps = 10000, type = "bootstrap") |>
  calculate(stat = "mean")

The Graph

null_df |>
  ggplot(
    aes(x = stat)
  ) + 
  geom_density(fill = "gray") +
  labs(title = "Bootstrap distribution",
       y = "")

The Calculation

null_df |>
  get_p_value(obs_stat = xbar, direction = "right")
Warning: Please be cautious in reporting a p-value of 0. This result is an approximation
based on the number of `reps` chosen in the `generate()` step.
ℹ See `get_p_value()` (`?infer::get_p_value()`) for more information.
# A tibble: 1 × 1
1       0

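None of the 10,000 bootstrap means was as large as the observed sample mean, so we report the p-value as < 0.001 rather than exactly 0.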
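For a right-tailed test, the simulated p-value is the proportion of null-distribution means that are at least as large as the observed mean. The base R sketch below shows that calculation by hand. It is an approximation of the idea, not the internals of `get_p_value()`; the seed and the names `obs_mean`, `shifted`, `boot_means`, and `p_value` are assumptions for this example.

```r
set.seed(2)                                # arbitrary seed for reproducibility
obs_mean <- mean(mtcars$mpg)               # observed sample mean
shifted  <- mtcars$mpg - (obs_mean - 12)   # data re-centered at the null value

# Null distribution: means of 10,000 resamples drawn with replacement.
boot_means <- replicate(10000, mean(sample(shifted, replace = TRUE)))

# Right-tailed p-value: share of simulated means at or above the observed mean.
p_value <- mean(boot_means >= obs_mean)
p_value
```

With 10,000 resamples, the smallest nonzero proportion the simulation can produce is 1/10,000, which is why a result of 0 is better reported as a bound than as an exact value.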