Classwork 2
Relationship Plots
R Packages
For Classwork 2, please load the tidyverse package:

library(tidyverse)
Question 1. Ice Cream Sales and Shark Attacks
Consider the data frame df, which records monthly ice cream sales and shark attacks.
df <- read_csv("http://bcdanl.github.io/data/icecream-shark-df.csv")

Part A
- Task 1: Fill in the blanks in the provided ggplot() code chunk.
- Task 2: Add a brief comment describing the relationship between ice cream sales (IceCreamSales) and shark attacks (SharkAttacks).

ggplot(data = __BLANK_1__,
       mapping = aes(x = __BLANK_2__,
                     y = __BLANK_3__)) +
  geom___BLANK_4__() +
  geom___BLANK_5__()

ggplot(data = df,
       mapping = aes(x = IceCreamSales,
                     y = SharkAttacks)) +
  geom_point() +
  geom_smooth() +
  scale_y_continuous(breaks = seq(2, 12, 2))

The scatterplot with the fitted line shows a positive linear relationship between ice cream sales and shark attacks.
As ice cream sales increase, the number of shark attacks also tends to increase.
Part B


- Is the observed relationship one of correlation or causation? Explain your reasoning.
- Consider the following monthly trends for IceCreamSales and SharkAttacks:
Monthly Trend of IceCreamSales

Monthly Trend of SharkAttacks

ggplot(data = df,
       mapping = aes(x = Month,
                     y = IceCreamSales)) +
  geom_point() +
  geom_line() + # a geometric object for a line chart
  scale_x_continuous(breaks = seq(1, 12, 1)) +
  scale_y_continuous(breaks = seq(2, 7, 1) * 10^3)

The observed relationship is correlation, not causation. Although higher ice cream sales are associated with more shark attacks, this does not imply that buying more ice cream causes more shark attacks.
A confounding factor, such as warmer weather, could explain the correlation: it increases both ice cream consumption and water-related activities, and more people in the water leads to more shark attacks.
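
The confounding argument above can be illustrated with a small simulation. This is only a sketch under stated assumptions: the data are simulated, not the classwork data, and the temperature variable `temp`, the coefficients, and the noise levels are all hypothetical. Both outcomes are generated from temperature alone, so any correlation between them comes from the shared driver.

```r
# Simulated illustration only: these numbers are made up and are NOT the classwork data.
set.seed(2024)

n_months <- 120                                        # ten hypothetical years of monthly data
month    <- rep(1:12, times = n_months / 12)
temp     <- 18 + 10 * sin(2 * pi * (month - 4) / 12)   # hypothetical temperature, warmest in July

# Both outcomes depend on temperature, but not on each other.
ice_cream_sales <- 2000 + 150 * temp + rnorm(n_months, sd = 300)
shark_attacks   <- 2 + 0.3 * temp + rnorm(n_months, sd = 1)

# Raw correlation between the two outcomes.
raw_cor <- cor(ice_cream_sales, shark_attacks)

# Remove the part of each variable explained by temperature,
# then correlate what is left over.
resid_ice   <- resid(lm(ice_cream_sales ~ temp))
resid_shark <- resid(lm(shark_attacks ~ temp))
partial_cor <- cor(resid_ice, resid_shark)

print(c(raw = raw_cor, controlling_for_temp = partial_cor))
```

In this simulated setting, the raw correlation should come out strongly positive, while the correlation after accounting for temperature should be close to zero. That is the pattern a confounder produces.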
Question 2. NBC Show Data
The nbc_show dataset comes from NBC's TV pilots, containing information about television shows, their viewership metrics, and audience engagement.

nbc_show <- read_csv("https://bcdanl.github.io/data/nbc_show.csv")

- Gross Ratings Points (GRP): Measures the total viewership of a show, an indicator of its broadcast marketability.
  - A higher GRP suggests broader exposure and a more marketable program.
- Projected Engagement (PE): Captures how attentive and engaged viewers were after watching a show, a more suitable measure of audience engagement.
  - After viewing, audiences take a short quiz testing order and detail recall.
  - This reflects their level of attention and retention (for both the show and its ads).
  - High PE values indicate strong viewer engagement.
Tasks
Since GRP reflects how many people watch a show and PE reflects how engaged or attentive those viewers are, it's reasonable to expect some connection between the two: shows that reach more people may also have higher engagement, although not always.
Our goal is to see whether greater viewership tends to coincide with stronger engagement (and how this varies by genre in Classwork 12).
- Task 1: Fill in the blanks in the provided ggplot() code chunks.
- Task 2: Add a brief comment describing the relationship between GRP and PE.
(1) Scatterplot with a Non-Linear Fitted Line

ggplot(data = __BLANK_1__,
       mapping = aes(x = __BLANK_2__,
                     y = __BLANK_3__)) +
  geom_point() +
  geom___BLANK_4__()

ggplot(data = nbc_show,
       mapping = aes(x = GRP,
                     y = PE)) +
  geom_point() +
  geom_smooth()

(2) Scatterplot with a Linear Fitted Line

ggplot(data = __BLANK_1__,
       mapping = aes(x = __BLANK_2__,
                     y = __BLANK_3__)) +
  geom_point() +
  geom___BLANK_4__(method = __BLANK_5__)

ggplot(data = nbc_show,
       mapping = aes(x = GRP,
                     y = PE)) +
  geom_point() +
  geom_smooth(method = "lm")

Question 3. GDP per capita vs. Life Expectancy
For Question 3, please install the R package gapminder before starting:
install.packages("gapminder")
library(gapminder)
??gapminder

The gapminder package provides a built-in dataset named gapminder, which contains country-level data on life expectancy, GDP per capita, and population across time.
Let's assign it to a new object called df_gapminder:

df_gapminder <- gapminder::gapminder

Tasks
- Task 1: Fill in the blanks in the provided ggplot() code chunks.
- Task 2: Add a brief comment describing the relationship between GDP per capita (gdpPercap) and life expectancy (lifeExp).
(1) gdpPercap vs. lifeExp

ggplot(data = __BLANK_1__,
       mapping = aes(x = __BLANK_2__,
                     y = __BLANK_3__)) +
  geom_point(__BLANK_4__ = .1) + # Add transparency to reduce overplotting
  geom_smooth(__BLANK_5__ = "darkorange") +
  geom_smooth(__BLANK_6__)

ggplot(data = df_gapminder,
       mapping = aes(x = gdpPercap,
                     y = lifeExp)) +
  geom_point(alpha = .1) + # Add transparency to reduce overplotting
  geom_smooth(color = "darkorange") +
  geom_smooth(method = "lm")

(2) log(gdpPercap) vs. lifeExp

ggplot(data = __BLANK_1__,
       mapping = aes(x = __BLANK_2__,
                     y = __BLANK_3__)) +
  geom_point(__BLANK_4__ = .2) + # Add transparency to reduce overplotting
  geom_smooth(__BLANK_5__ = "darkorange") +
  geom_smooth(__BLANK_6__)

ggplot(data = df_gapminder,
       mapping = aes(x = log(gdpPercap),
                     y = lifeExp)) +
  geom_point(alpha = .2) + # Add transparency to reduce overplotting
  geom_smooth(color = "darkorange") +
  geom_smooth(method = "lm")

- The log transformation reduces visual clutter: the highly dense cluster of points at low GDP per capita is now spread out.
- Additionally, the linear model now fits the data well.
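
One way to check the claim about the linear fit numerically is to compare the R-squared of the two linear models. The sketch below is not part of the original classwork: it assumes the gapminder package is installed as in the setup step, and the object names `fit_raw`, `fit_log`, `r2_raw`, and `r2_log` are introduced here only for illustration.

```r
library(gapminder)  # assumes the gapminder package is installed, as in Question 3

df_gapminder <- gapminder::gapminder

# Fit the same linear model on the raw and the log-transformed predictor.
fit_raw <- lm(lifeExp ~ gdpPercap, data = df_gapminder)
fit_log <- lm(lifeExp ~ log(gdpPercap), data = df_gapminder)

# R-squared: share of the variation in lifeExp explained by each model.
r2_raw <- summary(fit_raw)$r.squared
r2_log <- summary(fit_log)$r.squared

print(c(raw = r2_raw, log = r2_log))
```

A higher R-squared for the log model would be consistent with what the second plot shows, although R-squared alone does not establish that the relationship is linear.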
Discussion
Welcome to our Classwork 2 Discussion Board!
This space is designed for you to engage with your classmates about the material covered in Classwork 2.
Whether you are looking to delve deeper into the content, share insights, or have questions about the content, this is the perfect place for you.
If you have any specific questions for Byeong-Hak (@bcdanl) regarding the Classwork 2 materials or need clarification on any points, don't hesitate to ask here.
All comments will be stored here.
Let's collaborate and learn from each other!