library(tidyverse)
mpg <- ggplot2::mpg
Classwork 6
ggplot - First Steps; Aesthetic Mappings; Facets
Direction
- Open the Quarto document, danl-200-quarto.qmd.
- Use danl-200-quarto.qmd as a template to answer the following questions.
- Delete everything except for the YAML header.
- Shortcut for creating a code chunk:
  - Windows: Alt + Ctrl + I
  - Mac: option + command + I
- To run a code chunk:
  - Click the play button at the top right corner of the code chunk.
  - Shortcut:
    - Windows: Ctrl + Shift + Enter
    - Mac: command + shift + return
- To run the current line of code in a code chunk:
  - Shortcut:
    - Windows: Ctrl + Enter
    - Mac: command + return
Question 1. First Steps
Load the following data.frame, mpg.
library(DT)  # provides datatable()
datatable(mpg)
library(skimr)
skim(mpg)
Name | mpg |
Number of rows | 234 |
Number of columns | 11 |
_______________________ | |
Column type frequency: | |
character | 6 |
numeric | 5 |
________________________ | |
Group variables | None |
Variable type: character
skim_variable | n_missing | complete_rate | min | max | empty | n_unique | whitespace |
---|---|---|---|---|---|---|---|
manufacturer | 0 | 1 | 4 | 10 | 0 | 15 | 0 |
model | 0 | 1 | 2 | 22 | 0 | 38 | 0 |
trans | 0 | 1 | 8 | 10 | 0 | 10 | 0 |
drv | 0 | 1 | 1 | 1 | 0 | 3 | 0 |
fl | 0 | 1 | 1 | 1 | 0 | 5 | 0 |
class | 0 | 1 | 3 | 10 | 0 | 7 | 0 |
Variable type: numeric
skim_variable | n_missing | complete_rate | mean | sd | p0 | p25 | p50 | p75 | p100 | hist |
---|---|---|---|---|---|---|---|---|---|---|
displ | 0 | 1 | 3.47 | 1.29 | 1.6 | 2.4 | 3.3 | 4.6 | 7 | ▇▆▆▃▁ |
year | 0 | 1 | 2003.50 | 4.51 | 1999.0 | 1999.0 | 2003.5 | 2008.0 | 2008 | ▇▁▁▁▇ |
cyl | 0 | 1 | 5.89 | 1.61 | 4.0 | 4.0 | 6.0 | 8.0 | 8 | ▇▁▇▁▇ |
cty | 0 | 1 | 16.86 | 4.26 | 9.0 | 14.0 | 17.0 | 19.0 | 35 | ▆▇▃▁▁ |
hwy | 0 | 1 | 23.44 | 5.95 | 12.0 | 18.0 | 24.0 | 27.0 | 44 | ▅▅▇▁▁ |
Q1a.
Run ggplot(data = mpg). What do you see?
ggplot(data = mpg)
Q1b.
Write ggplot code to make a scatterplot of hwy vs. cyl.
ggplot(data = mpg) +
  geom_point(mapping = aes(x = hwy, y = cyl))
Q1c.
- What happens if you make a scatterplot of class vs. drv?
- Why is the plot not useful?
ggplot(data = mpg) +
  geom_point(mapping = aes(x = class, y = drv))
Question 2. Aesthetic Mapping
Q2a.
- Which variables in the data.frame mpg are categorical?
- Which variables are continuous?
Q2b.
- Consider the ggplot below:
ggplot(data = mpg) +
  geom_point(mapping = aes(x = displ, y = hwy))
- Map a continuous variable to color, size, and shape.
- How do these aesthetics behave differently for categorical vs. continuous variables?
  - To treat a variable as categorical, use as.factor(VARIABLE).
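As one possible starting point for Q2b, the sketch below maps the continuous variable cty to color and to size, and maps as.factor(cyl) to shape. The choice of cty and cyl is an assumption made for illustration; other columns of mpg would work the same way. ggplot2 refuses to map a continuous variable to shape, so only the categorical version is shown for that aesthetic.

```r
library(ggplot2)

mpg <- ggplot2::mpg

# Continuous variable mapped to color: ggplot2 draws a color gradient.
p_color <- ggplot(data = mpg) +
  geom_point(mapping = aes(x = displ, y = hwy, color = cty))

# Continuous variable mapped to size: larger values get larger points.
p_size <- ggplot(data = mpg) +
  geom_point(mapping = aes(x = displ, y = hwy, size = cty))

# shape needs discrete values, so cyl is converted to a factor first.
p_shape <- ggplot(data = mpg) +
  geom_point(mapping = aes(x = displ, y = hwy, shape = as.factor(cyl)))

p_color
p_size
p_shape
```

Comparing the legends of the three plots is a quick way to see how each aesthetic treats continuous and categorical inputs.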
Q2d.
What happens if you map an aesthetic to something other than a variable name, like aes(color = displ < 5)?
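A minimal sketch for exploring Q2d: the expression displ < 5 is evaluated for every row, and the resulting TRUE/FALSE values are what get mapped to color. The object names is_small and p_logical are made up for this example.

```r
library(ggplot2)

mpg <- ggplot2::mpg

# The comparison yields one logical value per row of mpg.
is_small <- mpg$displ < 5

# The same expression can be written directly inside aes().
p_logical <- ggplot(data = mpg) +
  geom_point(mapping = aes(x = displ, y = hwy, color = displ < 5))

p_logical
```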
Question 3. Facets
Q3a.
What happens if you facet on a continuous variable?
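One way to try Q3a is to facet on cty, which is an assumption made here for illustration; any continuous column could be used instead. The sketch also counts the distinct values of cty, since facet_wrap() creates one panel per distinct value.

```r
library(ggplot2)

mpg <- ggplot2::mpg

# Number of distinct values in the faceting variable.
n_values <- length(unique(mpg$cty))

# Faceting on a continuous variable: one panel per distinct value.
p_facet_cont <- ggplot(data = mpg) +
  geom_point(mapping = aes(x = displ, y = hwy)) +
  facet_wrap(~ cty)

n_values
p_facet_cont
```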
Q3b.
- What do the empty cells in the scatterplot of displ vs. hwy with facet_grid(drv ~ cyl) mean?
- How do they relate to this plot?
ggplot(data = mpg) +
  geom_point(mapping = aes(x = drv, y = cyl))
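The faceted plot that Q3b refers to is not shown, so here is a sketch of how it could be drawn, together with a cross-tabulation of drv and cyl. Using base R's table() is a choice made for this example; it gives a numeric view of which drv and cyl combinations have no observations.

```r
library(ggplot2)

mpg <- ggplot2::mpg

# Scatterplot faceted by drv (rows) and cyl (columns).
p_grid <- ggplot(data = mpg) +
  geom_point(mapping = aes(x = displ, y = hwy)) +
  facet_grid(drv ~ cyl)

# Count of cars for every drv/cyl combination; zeros mark empty facets.
combos <- table(drv = mpg$drv, cyl = mpg$cyl)

p_grid
combos
```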
Q3c.
What plots does the following code make? What does . do?
ggplot(data = mpg) +
  geom_point(mapping = aes(x = displ, y = hwy),
             alpha = .5) +
  facet_grid(drv ~ .)
ggplot(data = mpg) +
  geom_point(mapping = aes(x = displ, y = hwy),
             alpha = .5) +
  facet_grid(. ~ cyl)
Q3d.
Consider the following faceted plot:
ggplot(data = mpg) +
  geom_point(mapping = aes(x = displ, y = hwy),
             alpha = .5) +
  facet_wrap(~ class, nrow = 2)
- What are the advantages to using faceting instead of the color aesthetic?
- What are the disadvantages?
Q3e.
Use the following data.frame.
tvshows <- read_csv(
  'https://bcdanl.github.io/data/tvshows.csv')
Rows: 40 Columns: 6
── Column specification ────────────────────────────────────────────────────────
Delimiter: ","
chr (3): Show, Network, Genre
dbl (3): PE, GRP, Duration
ℹ Use `spec()` to retrieve the full column specification for this data.
ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
datatable(tvshows)
- Provide both (1) ggplot code and (2) a comment to describe the relationship between audience size (GRP) and audience engagement (PE).
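A possible starting point for the plotting half of Q3e is sketched below. The helper plot_grp_vs_pe() is hypothetical and not part of the classwork; it assumes only that the data.frame has numeric columns named GRP and PE, as the column specification above reports. The fitted straight line is one of several reasonable choices for summarizing the trend.

```r
library(ggplot2)

# Hypothetical helper: expects a data.frame with numeric columns GRP and PE.
plot_grp_vs_pe <- function(data) {
  ggplot(data = data, mapping = aes(x = GRP, y = PE)) +
    geom_point(alpha = .5) +
    # Straight-line fit to make the overall direction easier to read.
    geom_smooth(method = "lm", formula = y ~ x, se = FALSE)
}
```

With tvshows loaded as above, plot_grp_vs_pe(tvshows) would draw the scatterplot; the written comment on the relationship should come from reading that plot.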