Estimating Single Mean

Lecture 13?

Dr. Elijah Meyer

NC State University
ST 511 - Fall 2024

Checklist

– Are you keeping up with Slack?

– Quiz-6 Wednesday (due Sunday)

– Exam-1: October 9th (in-class)

– Exam-1: Assigned October 9th; Due 11:59pm October 15th

Announcements

– All videos are live

– All solutions are posted

– Exam equation sheet is posted

Warm Up

Notation check

\(\mu\)

\(\pi\)

\(s\)

\(\hat{p}\)

\(\bar{x}\)

\(\sigma\)

Notation check

\(\mu\) = population mean

\(\pi\) = population proportion

\(s\) = sample standard deviation

\(\hat{p}\) = sample proportion

\(\bar{x}\) = sample mean

\(\sigma\) = population standard deviation

Warm Up

What is the difference between a population distribution and sampling distribution?

Warm Up

A population distribution is the distribution of all observational units of interest, while a sampling distribution is the distribution of a statistic (such as \(\bar{x}\) or \(\hat{p}\)) across repeated random samples from that population.
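A small simulation can make the distinction concrete. The sketch below is illustrative only: the population is simulated (a made-up, right-skewed set of 100,000 values), and the names `population` and `sample_means`, the sample size of 50, and the 5,000 repetitions are all assumptions, not course data.

```r
set.seed(511)  # arbitrary seed so the sketch is reproducible

# Hypothetical population: 100,000 right-skewed values (mean and sd both near 10)
population <- rexp(100000, rate = 1 / 10)

# Sampling distribution: the mean of each of 5,000 random samples of size 50
n <- 50
sample_means <- replicate(5000, mean(sample(population, size = n)))

# The population distribution describes individual units ...
c(center = mean(population), spread = sd(population))

# ... while the sampling distribution describes a statistic (here, x-bar)
c(center = mean(sample_means), spread = sd(sample_means))

# The spread of the sample means should be close to sigma / sqrt(n)
sd(population) / sqrt(n)
```

Both distributions share roughly the same center, but the sampling distribution is much narrower, and a histogram of `sample_means` should look far more symmetric than one of `population`.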

Warm Up

Null distribution (sampling distribution under the assumption of the null hypothesis), and the connection to a named distribution (Z and t).

Make the connection

Howling Cow Example

\(H_o:\) \(\pi\) = .5

\(H_a:\) \(\pi\) < .5

\(\hat{p}\) = \(\frac{37}{100}\)

Our entire goal is to make a null distribution! This distribution is:

– centered at \(\pi_o\)

– has spread (standard error for the null) of

\[ \sqrt{\frac{.5 * (1-.5)}{100}} \]

It has this standard error because we checked our assumptions!

What’s this look like

Here is the approximated null distribution. And we can calculate the p-value straight from here!

Z = \(\frac{\hat{p} - \pi_o}{SE}\)

Z = \(\frac{.37 - .5}{0.05}\) = -2.6
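The same calculation can be sketched in R. This assumes the normal (Z) approximation used on this slide; the variable names are chosen here for illustration.

```r
p_hat <- 37 / 100   # sample proportion
pi_0  <- 0.5        # null value
n     <- 100        # sample size

se_null <- sqrt(pi_0 * (1 - pi_0) / n)  # standard error under the null
z       <- (p_hat - pi_0) / se_null     # test statistic

# H_a is pi < .5, so the p-value is the area to the left of z
p_value <- pnorm(z)

c(se = se_null, z = z, p_value = p_value)
```

Because the alternative is one-sided (less than), only the lower tail is used. A two-sided alternative would double that area.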

Confidence intervals

When do we make confidence intervals? What question are we trying to answer?

What is the population mean bill length (mm) of penguins?

Penguins data set

Includes measurements for penguin species on islands in the Palmer Archipelago. There are 342 penguins with a recorded bill length in the data set. For this example, you can assume that one penguin does not influence another penguin; that is, the penguins are independent of each other.

What is the other assumption we need to check? How do we check this?

Normality

Normality

Still a bit subjective (as we learned in hypothesis testing). Stick to the general sample-size rules (n < 30, n > 30, n > 60), and we will continue practicing in different scenarios.

Confidence interval

\(\bar{x} \pm \text{Margin of Error}\)

Confidence interval

Let’s make a 95% confidence interval!

\(\bar{x} \pm t^* * SE\)

\(\bar{x} \pm t^* * \frac{s}{\sqrt{n}}\)

t-distribution

Standard deviation

Confidence interval

Let’s make a 95% confidence interval!

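library(dplyr)           # summarize(), n()
library(palmerpenguins)  # assumed source of the penguins data

penguins |>
  summarize(center = mean(bill_length_mm, na.rm = TRUE),  # sample mean
            spread = sd(bill_length_mm, na.rm = TRUE),    # sample standard deviation
            count = n() - 2) # n() counts all 344 rows; subtract the 2 with missing bill length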
# A tibble: 1 × 3
  center spread count
   <dbl>  <dbl> <dbl>
1   43.9   5.46   342
qt(.975, df = 341)
[1] 1.966945

Hmmm.. that t* looks familiar…
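A quick comparison shows why. The sketch below puts the t critical value next to the standard normal one; the extra degrees of freedom (5 and 30) are arbitrary values chosen here for contrast.

```r
t_star <- qt(0.975, df = 341)  # t* used for the penguins interval
z_star <- qnorm(0.975)         # matching standard normal critical value

c(t_star = t_star, z_star = z_star, difference = t_star - z_star)

# With smaller samples the t critical value is noticeably larger
qt(0.975, df = c(5, 30, 341))
```

With 341 degrees of freedom the t-distribution is nearly indistinguishable from the standard normal, so t* lands very close to the familiar 1.96.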

Confidence interval

Let’s make a 95% confidence interval!

\(43.9 \pm 1.97 * \frac{5.46}{\sqrt{342}} = (43.318, 44.482)\)
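The arithmetic can be checked in R from the summary statistics alone. This is a minimal sketch with variable names chosen here; it uses the unrounded t* rather than 1.97, so the endpoints agree with the interval above to about two decimal places.

```r
x_bar <- 43.9   # sample mean (mm)
s     <- 5.46   # sample standard deviation (mm)
n     <- 342    # penguins with a recorded bill length

se     <- s / sqrt(n)             # standard error of the mean
t_star <- qt(0.975, df = n - 1)   # critical value for 95% confidence
margin <- t_star * se             # margin of error

ci <- c(lower = x_bar - margin, upper = x_bar + margin)
round(ci, 2)
```

For a different confidence level, only the first argument to `qt()` changes, for example `qt(0.95, df = n - 1)` for a 90% interval.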

Interpret your confidence interval in the context of the problem

We are 95% confident that the true mean bill length of penguins in the Palmer Archipelago is between 43.318 and 44.482 mm.

Simulation

App

Steps

  • Sample 342 penguins with replacement from the original data

  • Calculate the mean of your new resampled data

  • Plot it

  • Do this process many many times!
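The steps above can be sketched in base R. To keep the example self-contained, it uses stand-in data: 342 simulated values with roughly the center and spread of the penguin bill lengths. The names `bill_lengths`, `resample_mean`, and `boot_means`, and the choice of 5,000 repetitions, are assumptions made here. The app used in class may work differently.

```r
set.seed(511)  # arbitrary seed so the sketch is reproducible

# Stand-in data (an assumption): 342 simulated bill lengths. With the real
# data, use the non-missing values of penguins$bill_length_mm instead.
bill_lengths <- rnorm(342, mean = 43.9, sd = 5.46)

# One bootstrap resample: draw n values with replacement, then take the mean
resample_mean <- function(x) {
  mean(sample(x, size = length(x), replace = TRUE))
}

# Do this process many, many times
boot_means <- replicate(5000, resample_mean(bill_lengths))

# Plot the bootstrap distribution of the sample mean
hist(boot_means,
     main = "Bootstrap distribution",
     xlab = "Resampled mean bill length (mm)")

# One common way to get a 95% interval: the middle 95% of the resampled means
quantile(boot_means, probs = c(0.025, 0.975))
```

The histogram should be centered near the original sample mean, with a spread close to the standard error \(\frac{s}{\sqrt{n}}\) used in the t-based interval.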

Review