Homework 1
Personal Website; ggplot Visualization
Direction
Please submit your Quarto Document for Part 2 in Homework 1 to Brightspace with the name below:
danl-310-hw1-LASTNAME-FIRSTNAME.qmd
(e.g., danl-310-hw1-choe-byeonghak.qmd)
The due date is February 19, 2025, 2:00 P.M.
Please send Byeong-Hak an email (bchoe@geneseo.edu) if you have any questions.
Descriptive Statistics
The following provides the descriptive statistics for each part of Homework 1:
Part 1. Personal Website
- Decorate your website:
  - Replace YOUR NAME with your name in _quarto.yml and index.qmd.
  - Describe yourself in index.qmd.
  - Add the picture file (e.g., png) of your profile photo to the img directory. Then correct img/profile.png in index.qmd accordingly.
  - Add the PDF file of your resumé to the website working directory on your laptop.
  - Correct the links for your resumé, LinkedIn, email, and optionally social media.
  - Make sure that you do not have any broken links in your website.
- Add a “ggplot Basics” blog post to your blog using a Quarto document.
  - In your “ggplot Basics” blog post, briefly explain the ggplot basics we discussed in Lecture 3, Lecture 4, and Classwork 4.
  - Choose a proper image file for the thumbnail of the blog post.
  - A YAML header template for a blog post can be found below, including an image option:
---
title: BLOG_TITLE
author: YOUR_NAME
date: 2025-02-14
categories: [tag_1, tag_2, tag_3] # tags for a blog post (e.g., python)
image: image.png
execute:
  warning: false
  message: false
toc: true
---
- Use the 3-step git commands (git add ., git commit -m "...", and git push) to update your online website.
Part 2. ggplot visualization
Setup
This is the setup R code chunk for this Quarto document:
library(tidyverse)
library(datasets)
library(gapminder)
library(skimr) # a better summary of data.frame
library(scales) # scales for ggplot
library(ggthemes) # additional ggplot themes
library(hrbrthemes) # additional ggplot themes and color palettes
library(lubridate)
library(ggridges)
library(DT)
theme_set(theme_minimal()) # setting the minimal theme for ggplot
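If the setup chunk fails on a library() call, the usual cause is a package that has not been installed yet. The following is a minimal sketch for checking this before rendering; it is not part of the assignment, and the names required_pkgs and missing_pkgs are illustrative.

```r
# Packages loaded by the setup chunk (datasets ships with base R, so it is omitted)
required_pkgs <- c("tidyverse", "gapminder", "skimr", "scales", "ggthemes",
                   "hrbrthemes", "lubridate", "ggridges", "DT")

# Required packages that are not found in the current library paths
missing_pkgs <- setdiff(required_pkgs, rownames(installed.packages()))

if (length(missing_pkgs) > 0) {
  message("Install these first: ", paste(missing_pkgs, collapse = ", "))
  # install.packages(missing_pkgs)  # uncomment to install from CRAN
} else {
  message("All setup packages are installed.")
}
```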
Provide ggplot code to replicate the given figures.
Use the following data.frame for Questions 1, 2, and 3.
ncdc_temp <- read_csv(
  'https://bcdanl.github.io/data/ncdc_temp_cleaned.csv')
Question 1
ggplot(ncdc_temp, aes(x = date, y = temperature)) +
  # Draw one temperature line per location, with a size of 1.
  geom_line(aes(color = location), size = 1) +
  # Add points on the first day of January, April, July, and October.
  geom_point(data = ncdc_temp |>
               filter(month %in% c("01", "04", "07", "10"),
                      day == 1)) +
  # Label the x-axis "month", limit it to Jan 1, 0000 through Jan 4, 0001,
  # and place labeled breaks at the beginning of each quarter (Jan, Apr, Jul, Oct).
  scale_x_date(name = "month",
               limits = c(ymd("0000-01-01"), ymd("0001-01-04")),
               breaks = c(ymd("0000-01-01"), ymd("0000-04-01"), ymd("0000-07-01"), ymd("0000-10-01"), ymd("0001-01-01")),
               labels = c("Jan", "Apr", "Jul", "Oct", "Jan"),
               expand = c(1/366, 0)) +
  # Limit the y-axis to 19.9 through 107, with breaks every 20 units and the label "temperature (°F)".
  scale_y_continuous(limits = c(19.9, 107),
                     breaks = seq(20, 100, by = 20),
                     name = "temperature (°F)") +
  theme(legend.title.align = 0.5) # Centers the legend title.
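The points in the Question 1 answer come from a filtered copy of the data passed to geom_point(). The sketch below shows what that filter keeps, using a small made-up tibble (toy_temp is hypothetical, not the real ncdc_temp data), and assumes month is stored as a zero-padded string and day as a number, as the comparison values in the answer suggest.

```r
library(dplyr)

# Hypothetical stand-in for ncdc_temp with only the columns the filter uses
toy_temp <- tibble(
  month       = c("01", "01", "04", "05", "07", "10"),
  day         = c(1, 2, 1, 1, 1, 15),
  temperature = c(30, 31, 55, 65, 85, 60)
)

# Keep rows that fall on day 1 of a quarter-start month;
# comma-separated conditions in filter() are combined with AND
quarter_starts <- toy_temp |>
  filter(month %in% c("01", "04", "07", "10"),
         day == 1)

quarter_starts
```

Only the rows for January 1, April 1, and July 1 remain: May is not a quarter-start month, and the January 2 and October 15 rows fail the day condition.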
Question 2
p <- ggplot(ncdc_temp,
            aes(x = month, y = temperature))
# add a box plot with grey fill
p + geom_boxplot(fill = 'grey90') +
  # add labels for x and y axes
  labs(x = "month",
       y = "mean temperature (°F)") +
  # apply a custom theme to the plot
  theme_clean()
Question 3
Use ggridges::geom_density_ridges() for Question 3.
p <- ggplot(ncdc_temp,
            aes(x = temperature, y = month))
p + geom_density_ridges( # Adds a layer with smoothed density curves of the temperature data, drawn as a ridgeline plot.
    scale = 3,
    rel_min_height = 0.01, # Sets the scaling and minimum relative height for the ridges.
    bandwidth = 3.4,
    fill = "#56B4E9",
    color = "white" # Sets the bandwidth, as well as the fill and outline color of the ridges.
  ) +
  scale_x_continuous( # Adds a scale to the x-axis for continuous values.
    name = "mean temperature (°F)", # Sets the label for the x-axis.
    expand = c(0, 0),
    breaks = c(0, 25, 50, 75) # Sets the expansion and the break points for the x-axis.
  ) +
  scale_y_discrete( # Adds a scale to the y-axis for discrete (categorical) values, with a label and a custom expansion.
    name = "month",
    expand = c(0, .2, 0, 2.6)) +
  theme( # Adjusts theme elements of the ggplot object.
    plot.margin = margin(3, 7, 3, 1.5) # Sets the margin of the plot.
  )
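The four-number expand vector on the y-axis in the Question 3 answer is easy to misread. As a small sketch, the same values can be written with ggplot2's expansion() helper, which names the multiplicative and additive padding for the lower and upper ends of the axis; the variable names positional and named are illustrative.

```r
library(ggplot2)

# Positional form used in the answer:
# c(lower multiplicative, lower additive, upper multiplicative, upper additive)
positional <- c(0, 0.2, 0, 2.6)

# Equivalent named form: no proportional padding, 0.2 units below the first
# month and 2.6 units above the last one
named <- expansion(mult = c(0, 0), add = c(0.2, 2.6))

all.equal(positional, as.numeric(named))
```

On a discrete axis the additive values are in units of one category, so the larger upper value leaves room for the topmost ridge, which extends above its own baseline when scale is greater than 1.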
Question 4
Use datasets::mtcars for Question 4.
m <- ggplot(data = mtcars,
            aes(x = disp, y = mpg, color = hp))
m + geom_point(aes(color = hp)) + # add scatter plot with color mapped to the "hp" variable
  labs(x = "displacement(cu. in.)", y = "fuel efficiency(mpg)") + # add labels to x and y axes
  scale_color_gradient() # add a continuous color gradient scale and its legend for "hp"
Question 5
Use the following data.frame for Question 5.
popgrowth_df <- read_csv(
  'https://bcdanl.github.io/data/popgrowth.csv')
p <- ggplot(popgrowth_df,
            aes(y = fct_reorder(state, popgrowth),
                x = 100*popgrowth,
                fill = region))
p + geom_col() + # Add the geom for the columns
  scale_x_continuous(
    limits = c(-.6, 37.5), expand = c(0, 0), # Set x axis limits and expansion
    labels = scales::percent_format(accuracy = 1, scale = 1), # Set percent labels for x axis
    name = "population growth, 2000 to 2010" # Set name for x axis
  ) +
  theme(legend.position = c(.67, .4), # Set legend position
        axis.text.y = element_text(size = 6,
                                   margin = margin(t = 0, r = 0, b = 0, l = 0))) # Adjust the size and margin for y axis text
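The bars in the Question 5 answer are sorted because fct_reorder() reorders the levels of state by popgrowth. The sketch below illustrates this on a made-up data frame (toy_growth and its values are hypothetical, not taken from popgrowth.csv).

```r
library(forcats)

# Hypothetical data: three made-up states and growth rates
toy_growth <- data.frame(
  state     = c("Alpha", "Beta", "Gamma"),
  popgrowth = c(0.10, 0.35, -0.01)
)

# Reorder the factor levels of state by ascending popgrowth
ordered_state <- fct_reorder(toy_growth$state, toy_growth$popgrowth)

levels(ordered_state)
```

The levels come out from the lowest to the highest growth rate. Because ggplot2 places the first level of a discrete y-axis at the bottom, the fastest-growing state ends up at the top of the chart.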
Question 6
Use the following data.frame for Question 6.
male_Aus <- read_csv(
  'https://bcdanl.github.io/data/aus_athletics_male.csv')
# Define color and fill vectors for use in plot
colors <- c("#BD3828", rep("#808080", 4))
fills <- c("#BD3828D0", rep("#80808080", 4))

p <- ggplot(male_Aus,
            aes(x = height, y = pcBfat,
                shape = sport,
                color = sport,
                fill = sport))
# Add geom_point layer with custom size
p + geom_point(size = 3) +
  # Set shape values for different sports
  scale_shape_manual(values = 21:25) +
  # Set color values for different sports
  scale_color_manual(values = colors) +
  # Set fill values for different sports
  scale_fill_manual(values = fills) +
  # Set x and y axis labels
  labs(x = "height (cm)",
       y = "% body fat")
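The two hex vectors in the Question 6 answer differ only in their last two digits, which encode transparency. The sketch below, using base R only, unpacks them; fill_alpha is an illustrative name.

```r
# Same vectors as in the answer: one highlight color followed by four greys
colors <- c("#BD3828", rep("#808080", 4))
fills  <- c("#BD3828D0", rep("#80808080", 4))

# One value per shape in scale_shape_manual(values = 21:25)
length(colors)

# The last two hex digits of an 8-digit color are the alpha channel (0 to 255)
fill_alpha <- as.vector(col2rgb(fills, alpha = TRUE)["alpha", ])
fill_alpha

# Alpha as a proportion: the highlight fill is more opaque than the grey fills
round(fill_alpha / 255, 2)
```

Shapes 21 through 25 are the point shapes that take both an outline color and a fill, which is why the answer sets both scales. With unnamed value vectors, the first value goes to the first level of sport, so which sport is highlighted depends on the level order in the data.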
Question 7
Use the following data.frame for Question 7.
titanic <- read_csv(
  'https://bcdanl.github.io/data/titanic_cleaned.csv')
p <- ggplot(titanic, aes(x = age, y = after_stat(count)))

# Add a density layer for all passengers with a transparent outline, labeled "all passengers" in the fill legend
p + geom_density(
    data = select(titanic, -gender),
    aes(fill = "all passengers"),
    color = "transparent"
  ) +
  # Add another density layer for each gender with a transparent outline, filled by gender
  geom_density(aes(fill = gender),
               bw = 2,
               color = "transparent") +
  # Set the x-axis limits, name, and expand arguments
  scale_x_continuous(limits = c(0, 75),
                     name = "passenger age (years)",
                     expand = c(0, 0)) +
  # Set the y-axis limits, name, and expand arguments
  scale_y_continuous(limits = c(0, 26),
                     name = "count",
                     expand = c(0, 0)) +
  # Set the manual fill values, breaks, and labels for the legend
  scale_fill_manual(
    values = c("#b3b3b3a0", "#0072B2", "#D55E00"),
    breaks = c("all passengers", "male", "female"),
    labels = c("all passengers ", "males ", "females"),
    name = NULL,
    guide = guide_legend(direction = "horizontal")
  ) +
  # Turn off clipping so that drawing can extend beyond the panel area
  coord_cartesian(clip = "off") +
  # Create a separate panel for each gender
  facet_wrap(~gender) +
  # Set the x-axis line to blank, increase the strip text size, and set the legend position and margin
  theme(
    axis.line.x = element_blank(),
    strip.text = element_text(size = 14, margin = margin(0, 0, 0.2, 0, "cm")),
    legend.position = "bottom",
    legend.justification = "right",
    legend.margin = margin(4.5, 0, 1.5, 0, "pt"),
    legend.spacing.x = grid::unit(4.5, "pt"),
    legend.spacing.y = grid::unit(0, "pt"),
    legend.box.spacing = grid::unit(0, "cm")
  )
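The grey "all passengers" density in the Question 7 answer appears in both panels because its layer data has the gender column removed: a layer that lacks the faceting variable is repeated in every facet. The sketch below checks this on a small made-up data frame (toy and its values are hypothetical, not the Titanic data).

```r
library(ggplot2)

# Hypothetical data: eight passengers, four per gender
toy <- data.frame(
  age    = c(20, 30, 40, 50, 25, 35, 45, 55),
  gender = rep(c("female", "male"), each = 4)
)

# Background data without the faceting variable
background <- toy["age"]

p_toy <- ggplot(toy, aes(x = age, y = after_stat(count))) +
  geom_density(data = background, aes(fill = "all passengers"), bw = 5) +
  geom_density(aes(fill = gender), bw = 5) +
  facet_wrap(~gender)

# Number of distinct panels each layer is drawn in
panels_per_layer <- sapply(1:2, function(i) {
  length(unique(layer_data(p_toy, i)$PANEL))
})
panels_per_layer
```

Both layers report two panels: the gender layer is split across them, while the background layer is drawn in full in each one.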
Question 8
Use the following data.frame for Question 8.
cows_filtered <- read_csv(
  'https://bcdanl.github.io/data/cows_filtered.csv')
p <- ggplot(cows_filtered,
            aes(x = butterfat,
                color = breed,
                fill = breed))
# add a density curve for each breed with some transparency
p + geom_density(alpha = .2) +
  # set x-axis properties
  scale_x_continuous(
    expand = c(0, 0), # remove padding from axis limits
    labels = scales::percent_format(accuracy = 1, scale = 1), # format axis labels as percentages rounded to whole numbers
    name = "butterfat contents" # set axis label
  ) +
  # set y-axis properties
  scale_y_continuous(limits = c(0, 1.99),
                     expand = c(0, 0)) +
  # set plot area properties
  coord_cartesian(clip = "off") + # allow density curves to be drawn beyond the panel area
  theme(axis.line.x = element_blank()) # remove x-axis line
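The percent_format(accuracy = 1, scale = 1) labeller used in the Question 5 and Question 8 answers can be tried on its own. The sketch below uses made-up input values; label_whole_percent and label_proportion are illustrative names.

```r
library(scales)

# scale = 1: values are already in percent units, so they are not multiplied by 100
# accuracy = 1: labels are rounded to the nearest whole number
label_whole_percent <- percent_format(accuracy = 1, scale = 1)
label_whole_percent(c(4.2, 5, 12.6))

# With the default scale of 100, the labeller expects proportions instead
label_proportion <- percent_format(accuracy = 1)
label_proportion(0.05)
```

This is why the Question 5 answer multiplies popgrowth by 100 before plotting: with scale = 1 the axis values must already be in percent units.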
Question 9
Provide your GitHub username.