You need to install the tidymodels package to use some of the functions in this activity. To do so, type the following code in your Console and press Enter:
install.packages("tidymodels")
Bumba or Kiki
How well can humans distinguish one “Martian” letter from another? In today’s activity, we’ll find out. When shown the two Martian letters, Kiki and Bumba, answer the poll.
– Option 1: 47
– Option 2: 1
The question is: “Which letter is Bumba?”
Option 1
Once it’s revealed which option is correct, please write our sample statistic below:
\(\hat{p} = 47/48 \approx 0.98\)
Let’s write out the null and alternative hypotheses below
\(H_0: \pi = 0.5\)
\(H_a: \pi > 0.5\)
Now, let’s quickly make a data frame of the data we just collected as a class. Replace the … with the number of correct and incorrect guesses.
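As a minimal sketch, the class data could be built in base R as shown here, using the poll counts above (47 and 1) and the column name `correct_guess` that appears later in the activity. The level labels "correct" and "incorrect" are placeholders chosen for illustration; the activity does not specify them.

```r
# Poll results from the class: 47 chose Option 1 (correct), 1 chose Option 2.
n_correct <- 47
n_incorrect <- 1

# One row per student. The labels "correct"/"incorrect" are illustrative assumptions.
class_data <- data.frame(
  correct_guess = c(rep("correct", n_correct), rep("incorrect", n_incorrect))
)

# Check the counts and the sample proportion.
table(class_data$correct_guess)
mean(class_data$correct_guess == "correct")
```

Storing one row per student (rather than two summary counts) matches the shape that `specify()` expects for a categorical response.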
Now let’s simulate our null distribution by filling in the blanks. First, describe how this distribution is created.
This distribution is created by assuming the null hypothesis is true, that is, that students are really just guessing at which option is Bumba. To generate one sample, we can use the idea of a coin, where heads represents a correct guess and tails represents an incorrect guess. We flip this coin 48 times, the same as the sample size of our original sample. We then calculate the sample proportion of the simulated data. This is one dot on the sampling distribution. If we repeat this process many, many times, we get a sampling distribution under the assumption of the null hypothesis!
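The coin-flipping process described above can be sketched directly in base R. This is an illustration of the idea, not the infer pipeline used in the activity, so its random draws will not match infer's output even with the same seed. The names `n`, `reps`, `heads`, and `sim_props` are made up for this sketch.

```r
set.seed(333)  # any seed works; it only makes the sketch reproducible

n <- 48        # sample size (number of students who answered the poll)
reps <- 1000   # number of simulated samples

# Each simulated sample: flip a fair coin n times and count the heads
# (heads = correct guess). rbinom() does this reps times.
heads <- rbinom(reps, size = n, prob = 0.5)

# Convert counts to sample proportions: one value (one "dot") per sample.
sim_props <- heads / n

# Under the null hypothesis the simulated proportions center near 0.5.
mean(sim_props)
range(sim_props)
```

Plotting `sim_props` with `hist(sim_props)` gives a picture comparable to the null distribution built in the next step.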
set.seed(333)

null_dist <- class_data |>
  specify(response = correct_guess, success = "...") |>
  hypothesize(null = "point", p = 0.5) |> # 0.5 is the null value
  generate(reps = 1000, type = "draw") |> # reps is how many times we repeat the entire process
  calculate(stat = "prop") # prop is the statistic we want to calculate
Helpful Hint: Remember that you can type ? in front of a function name (for example, ?generate) to pull up its help file!
Calculate and visualize the distribution below.
visualize(null_dist) +
  shade_p_value(.98, direction = "right") # .98 is our sample statistic; right is our alternative hypothesis

null_dist |>
  get_p_value(.98, direction = "right") # fill in the blank
What did we just calculate?
We just calculated our p-value! A p-value is the probability of observing our statistic of 0.98, or something larger, given that the true proportion of students who can correctly identify Bumba really is 50%. Our simulation gives us a value of 0 because none of the 1000 simulated proportions were as large as 0.98, but we know that the area under the curve cannot be exactly 0. We report this as < 0.001.
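To see why the true p-value is tiny but not zero, it can be computed exactly from the binomial distribution instead of by simulation. This is a supplementary sketch in base R, not part of the infer workflow; `p_exact` is a name made up here.

```r
n <- 48   # sample size
x <- 47   # number of correct guesses

# P(X >= 47) when X ~ Binomial(48, 0.5).
# With lower.tail = FALSE, pbinom(46, ...) returns P(X > 46) = P(X >= 47).
p_exact <- pbinom(x - 1, size = n, prob = 0.5, lower.tail = FALSE)
p_exact

# The same one-sided test using base R's exact binomial test.
binom.test(x, n, p = 0.5, alternative = "greater")$p.value
```

Only 49 of the 2^48 equally likely guess patterns have 47 or more correct, so the exact probability is 49/2^48: far below 0.001, yet strictly positive.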
What is our decision?
Reject the null hypothesis
What is our conclusion?
We have strong evidence to conclude that the true proportion of students who can correctly identify Bumba is actually larger than 50%.