Data Viz II

Lecture 5

Dr. Elijah Meyer

NC State University
ST 511 - Fall 2024

2024-09-04

Checklist

– Are you keeping up with the prepare material?

– Are you posting on Slack?

– Take advantage of TA office hours!

– Data Viz quiz due Sunday

– HW-1 due Sep 8th on Gradescope

– Pull up slides + follow along!

Announcements

HW-1 question 1b

Create a data frame that displays the minimum and maximum lot area, in acres. Name your columns min_lot and max_lot.

The acres column is in square feet. It is fine to leave the units as square feet for this question.

Announcements

The Workbench license went through (the week before last)

You can still use local install if you would like

Announcements

HW-1

-- Gradescope is set up; mark all pages associated with each exercise

-- Do not assign any pages of your PDF submission to Workflow + Formatting

AE

-- Solutions from last week are posted

-- Will be posted shortly after quiz due date 

Announcements

New resources

– Functions + Arguments

– Tidyverse stylings (also was prepare material)

Warm Up

Practice reading the following code as a sentence

mtcars |>
  ggplot(
    aes(
      x = mpg, y = wt) 
  ) +
  geom_point()
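One possible way to read this code aloud is sketched below, with each piece of the sentence written as a comment above the line it describes. The object name p_warmup is only for illustration and is not part of the original slide.

```r
library(ggplot2)

# "Take the mtcars data frame, and then..."
p_warmup <- mtcars |>
  # "...start a plot that maps mpg to the x-axis and wt to the y-axis,"
  ggplot(aes(x = mpg, y = wt)) +
  # "and add a layer of points."
  geom_point()

# Printing the object draws the scatterplot
p_warmup
```

Reading |> as "and then" and + as "and add" is a common convention, not the only valid wording.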

Goals for today

– More plots!

– More practice with R

Warm Up: Style Guide

Based on the style guide, how can the code below be improved?

iris |>
  filter(Species == "setosa")|>ggplot(aes(x = Sepal.Width,y = Sepal.Length))+ geom_point()

Warm Up: Style Guide
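A possible restyling, assuming the usual tidyverse conventions: spaces around operators and after commas, and a new line after each |> and +. The name p_setosa is only for illustration.

```r
library(tidyverse)

# Same plot as the warm-up code, with one step per line
p_setosa <- iris |>
  filter(Species == "setosa") |>
  ggplot(aes(x = Sepal.Width, y = Sepal.Length)) +
  geom_point()

p_setosa
```

The code does exactly what it did before; only the spacing and line breaks change.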

col vs bar

A bar chart or bar graph is a chart or graph that presents categorical data with rectangular bars

The bars represent either counts or a proportional value

In ggplot2, there are two geoms for making bar charts: geom_bar() and geom_col()

col vs bar

Help File

geom_bar() performs a statistical transformation to your data before plotting (i.e., it counts up the cases for you)

library(palmerpenguins)
library(tidyverse)
penguins |>
  select(species) 
# A tibble: 344 × 1
   species
   <fct>  
 1 Adelie 
 2 Adelie 
 3 Adelie 
 4 Adelie 
 5 Adelie 
 6 Adelie 
 7 Adelie 
 8 Adelie 
 9 Adelie 
10 Adelie 
# ℹ 334 more rows
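A minimal sketch of geom_bar() on this data: only x is mapped, and the geom counts the rows in each species for us. The name p_bar is only for illustration.

```r
library(palmerpenguins)
library(tidyverse)

# One row per penguin goes in; geom_bar() counts rows per species
p_bar <- penguins |>
  ggplot(aes(x = species)) +
  geom_bar()

p_bar
```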

geom_col

geom_col() uses stat_identity(): it leaves the data as is.

# A tibble: 3 × 2
  species   count
  <fct>     <int>
1 Adelie      152
2 Chinstrap    68
3 Gentoo      124
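A sketch of how a summary table like this could be built and then plotted with geom_col(). Here the heights of the bars come straight from the count column, so both x and y are mapped. The names species_counts and p_col are only for illustration.

```r
library(palmerpenguins)
library(tidyverse)

# Count the penguins in each species ourselves
species_counts <- penguins |>
  count(species, name = "count")

# geom_col() plots the values in the data as they are
p_col <- species_counts |>
  ggplot(aes(x = species, y = count)) +
  geom_col()

p_col
```

This should give the same picture as geom_bar() on the raw data; the difference is who does the counting.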

bar plots

The bars represent either counts or a proportional value

We can control which one is shown within the geom function!

segmented bar chart
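A minimal sketch of a segmented bar chart, assuming we want species proportions within each island (the choice of island and species is an assumption for illustration, as is the name p_segmented). Setting position = "fill" inside geom_bar() stretches every bar to a height of 1, so the segments show proportions instead of counts.

```r
library(palmerpenguins)
library(tidyverse)

# One bar per island, split into segments by species
p_segmented <- penguins |>
  ggplot(aes(x = island, fill = species)) +
  geom_bar(position = "fill") +
  labs(y = "Proportion")

p_segmented
```

Dropping position = "fill" gives a stacked bar chart of counts instead.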

note

This is a little more nuanced for a plain (non-segmented) bar chart: we need to calculate the proportions ourselves before plotting. This can be covered in a separate resource or in Slack for those interested.
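For those interested, one way to do this is sketched below: count each species, turn the counts into proportions, and plot the result with geom_col(). The names species_props, prop, and p_prop are only for illustration.

```r
library(palmerpenguins)
library(tidyverse)

# Calculate the proportion of penguins in each species
species_props <- penguins |>
  count(species) |>
  mutate(prop = n / sum(n))

# Plot the proportions as they are
p_prop <- species_props |>
  ggplot(aes(x = species, y = prop)) +
  geom_col()

p_prop
```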

ae-sep4

Recreate