Hypothesis Testing Intro

Solutions using data from class 2

You need to install the tidymodels package to use some of the functions in this activity. To do so, type the following code in your Console and press Enter:

install.packages("tidymodels")

Packages

── Attaching core tidyverse packages ──────────────────────── tidyverse 2.0.0 ──
✔ dplyr     1.1.4     ✔ readr     2.1.5
✔ forcats   1.0.0     ✔ stringr   1.5.1
✔ ggplot2   3.5.1     ✔ tibble    3.2.1
✔ lubridate 1.9.3     ✔ tidyr     1.3.1
✔ purrr     1.0.2     
── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
✖ dplyr::filter() masks stats::filter()
✖ dplyr::lag()    masks stats::lag()
ℹ Use the conflicted package (<http://conflicted.r-lib.org/>) to force all conflicts to become errors
── Attaching packages ────────────────────────────────────── tidymodels 1.2.0 ──
✔ broom        1.0.6     ✔ rsample      1.2.1
✔ dials        1.3.0     ✔ tune         1.2.1
✔ infer        1.0.7     ✔ workflows    1.1.4
✔ modeldata    1.4.0     ✔ workflowsets 1.1.0
✔ parsnip      1.2.1     ✔ yardstick    1.3.1
✔ recipes      1.1.0     
── Conflicts ───────────────────────────────────────── tidymodels_conflicts() ──
✖ scales::discard() masks purrr::discard()
✖ dplyr::filter()   masks stats::filter()
✖ recipes::fixed()  masks stringr::fixed()
✖ dplyr::lag()      masks stats::lag()
✖ yardstick::spec() masks readr::spec()
✖ recipes::step()   masks stats::step()
• Use tidymodels_prefer() to resolve common conflicts.
Loading required package: airports
Loading required package: cherryblossom
Loading required package: usdata

Attaching package: 'openintro'

The following object is masked from 'package:modeldata':

    ames

Bumba or Kiki

How well can humans distinguish one “Martian” letter from another? In today’s activity, we’ll find out. When shown the two Martian letters, Kiki and Bumba, answer the poll.

– Option 1: 47

– Option 2: 1

The question is: “Which letter is Bumba?”

Option 1

Once it’s revealed which option is correct, please write our sample statistic below:

\(\hat{p} = 47/48 \approx 0.98\)

Let’s write out the null and alternative hypotheses below

\(H_0: \pi = 0.5\)

\(H_a: \pi > 0.5\)

Now, let’s quickly make a data frame of the data we just collected as a class. Replace the … with the number of correct and incorrect guesses.

class_data <- tibble(
  correct_guess = c(rep("Correct", 47), rep("Incorrect", 1))
)
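As a quick sanity check, the counts and the sample proportion can be recomputed from the data itself. The sketch below uses only base R and a plain data frame as a stand-in for the tibble; the names `class_check`, `guess_counts`, and `p_hat` are hypothetical and not part of the activity.

```r
# Base R stand-in for class_data (class_check is a hypothetical name)
class_check <- data.frame(
  correct_guess = c(rep("Correct", 47), rep("Incorrect", 1))
)

# Count how many guesses fall in each category
guess_counts <- table(class_check$correct_guess)

# Sample proportion of correct guesses
p_hat <- mean(class_check$correct_guess == "Correct")

print(guess_counts)
print(round(p_hat, 2))
```

The proportion should agree with the sample statistic written above.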

Capture Variability

Now let’s simulate our null distribution by filling in the blanks. First, describe how this distribution is created.

This distribution is created by assuming the null hypothesis is true, that is, that students are really just guessing which option is Bumba. To generate one sample, we can imagine flipping a fair coin, where heads represents a correct guess and tails represents an incorrect guess. We flip this coin 48 times, matching the size of our original sample, and then calculate the sample proportion of the simulated data. This is one dot on the null distribution. If we repeat this process many, many times, we get a sampling distribution under the assumption of the null hypothesis!
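The coin-flipping process described above can be sketched by hand in base R. This is only an illustration of the idea, not the code the activity uses, and it will not reproduce the exact draws from the infer pipeline; the names `n_students`, `n_reps`, `null_p`, `simulate_one_prop`, and `sim_props` are hypothetical.

```r
set.seed(333)  # fixed seed so the sketch is reproducible

n_students <- 48    # same size as the class sample
n_reps     <- 1000  # how many simulated samples to generate
null_p     <- 0.5   # chance of a correct guess if students are just guessing

# One simulated sample: 48 fair "coin flips", then the proportion correct
simulate_one_prop <- function() {
  flips <- sample(c("Correct", "Incorrect"),
                  size = n_students,
                  replace = TRUE,
                  prob = c(null_p, 1 - null_p))
  mean(flips == "Correct")
}

# Repeat the process many times; each value is one dot on the null distribution
sim_props <- replicate(n_reps, simulate_one_prop())

summary(sim_props)
```

The simulated proportions should cluster around 0.5, which is what we expect when the null hypothesis is true.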

set.seed(333)

null_dist <- class_data |> 
  specify(response = correct_guess, success = "...") |>
  hypothesize(null = "point", p = 0.5) |> # 0.5 is the null value
  generate(reps = 1000, type = "draw") |> #reps is how many times we do the entire process
  calculate(stat = "prop") #prop is the stat we want to calculate

Helpful Hint: Remember that you can use ? next to the function name to pull up the help file!

Calculate and visualize the distribution below.

visualize(null_dist) +
  shade_p_value(.98, direction = "right") #.98 is our sample statistic; right is our alternative hypothesis

null_dist |>
  get_p_value(.98, direction = "right") # proportion of simulated statistics at or above our sample statistic

What did we just calculate?

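We just calculated our p-value! A p-value is the probability of observing our statistic of 0.98, or something larger, given that the true proportion of students who can correctly identify Bumba really is 50%. Our simulation gives us a value of 0 because none of the 1000 simulated proportions were as large as 0.98, but we know that the area under a curve cannot be exactly 0. With 1000 repetitions, the smallest nonzero p-value we could observe is 1/1000, so we report this as < 0.001.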
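The "< 0.001" report can be cross-checked against the exact binomial probability of seeing 47 or more correct guesses out of 48 when each guess has a 50% chance of being correct. This is a sketch of a complementary calculation in base R, not part of the simulation-based workflow in this activity; the variable names are hypothetical.

```r
n_students <- 48   # sample size
n_correct  <- 47   # observed number of correct guesses
null_p     <- 0.5  # null value

# P(X >= 47) for X ~ Binomial(48, 0.5); lower.tail = FALSE gives P(X > 46)
exact_p <- pbinom(n_correct - 1, size = n_students, prob = null_p,
                  lower.tail = FALSE)

# The same probability, summing P(X = 47) and P(X = 48) directly
exact_p_sum <- sum(dbinom(n_correct:n_students, size = n_students, prob = null_p))

print(exact_p)
print(exact_p < 0.001)
```

The exact probability is tiny but not zero, which is why a simulated p-value of 0 is reported as < 0.001 rather than as 0.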

What is our decision?

Reject the null hypothesis

What is our conclusion?

We have strong evidence to conclude that the true proportion of students who can correctly identify Bumba is actually larger than 50%.

So, can we read Martian?

Ted Talk

http://www.ted.com/talks/vilayanur_ramachandran_on_your_mind