# Install the nycflights13 package
install.packages("nycflights13")
# Load the package into your R session
library(nycflights13)
left_join()
Classwork 3
Question 1
- Install the nycflights13 R package and load it into your R session in your Posit Cloud project.
Answer:
Answer: In this step, we first install the nycflights13 package using install.packages(). The package contains datasets about flights that departed from New York City airports in 2013. Once it is installed, we load it into the R session with library(nycflights13), which makes the flights and airlines data.frames available for analysis.
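Because install.packages() downloads the package again every time it runs, a script that is rerun often can skip the installation when the package is already present. The following is a minimal sketch of that idea; the helper name load_or_install is hypothetical and is not part of the classwork or of any package.

```r
# Hypothetical helper (an illustration, not part of the classwork):
# install a package only when it is missing, then attach it.
load_or_install <- function(pkg) {
  # requireNamespace() returns FALSE when the package is not installed
  if (!requireNamespace(pkg, quietly = TRUE)) {
    install.packages(pkg)
  }
  # character.only = TRUE lets library() accept the name as a string
  library(pkg, character.only = TRUE)
  invisible(TRUE)
}
```

Under these assumptions, calling load_or_install("nycflights13") would have the same effect as the two-step install and load shown in the answer, without reinstalling on later runs.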
Question 2
- The nycflights13 package provides two data.frames: flights and airlines, which are related by the carrier variable.
  - carrier: A two-letter abbreviation that identifies the airline.
- Use the left_join() function to create a new data.frame, flight_airline, that includes all observations and variables from the flights data.frame, along with the name variable from the airlines data.frame that corresponds to the carrier variable in the flights data.frame.
Answer:
library(tidyverse)
# Perform left join to merge flights and airlines data.frames
flight_airline <- flights |> left_join(airlines)
# View the first few rows of the new data.frame
head(flight_airline)
# A tibble: 6 × 20
   year month   day dep_time sched_dep_time dep_delay arr_time sched_arr_time
  <int> <int> <int>    <int>          <int>     <dbl>    <int>          <int>
1  2013     1     1      517            515         2      830            819
2  2013     1     1      533            529         4      850            830
3  2013     1     1      542            540         2      923            850
4  2013     1     1      544            545        -1     1004           1022
5  2013     1     1      554            600        -6      812            837
6  2013     1     1      554            558        -4      740            728
# ℹ 12 more variables: arr_delay <dbl>, carrier <chr>, flight <int>,
# tailnum <chr>, origin <chr>, dest <chr>, air_time <dbl>, distance <dbl>,
# hour <dbl>, minute <dbl>, time_hour <dttm>, name <chr>
Answer: Here, we use the left_join() function from dplyr, one of the packages in the tidyverse. The left_join() function merges two data.frames based on a common key variable. In this case the key is the carrier variable, which is present in both the flights and airlines data.frames. Because no key is specified in the call, left_join() uses the variables the two data.frames have in common, and carrier is the only one. This operation adds the name variable (airline name) from the airlines data.frame to the flights data.frame, while keeping all observations and variables from flights. The result is stored in flight_airline.
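The "keeps all observations" behavior of a left join is easier to see on a small example. The sketch below uses two toy data.frames, toy_flights and toy_airlines, whose rows and airline names are made up for illustration and are not taken from nycflights13. It also names the key explicitly with by = "carrier", which is an optional variation on the answer above rather than a required change.

```r
library(dplyr)

# Toy stand-ins for flights and airlines (made-up rows, not real data)
toy_flights <- tibble(
  flight  = c(1, 2, 3),
  carrier = c("AA", "UA", "ZZ")  # "ZZ" has no match in toy_airlines
)
toy_airlines <- tibble(
  carrier = c("AA", "UA"),
  name    = c("Airline A", "Airline U")
)

# Naming the key with `by` makes the join explicit
toy_joined <- toy_flights |>
  left_join(toy_airlines, by = "carrier")

# All three rows of toy_flights are kept; the unmatched carrier gets NA in `name`
toy_joined
```

The same pattern applied to the real data would be flights |> left_join(airlines, by = "carrier"), which should produce the same flight_airline result as the call without by.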
Discussion
Welcome to our Classwork 3 Discussion Board! 👋
This space is designed for you to engage with your classmates about the material covered in Classwork 3.
Whether you want to delve deeper into the content, share insights, or ask questions, this is the perfect place for you.
If you have any specific questions for Byeong-Hak (@bcdanl) or a classmate (@GitHub-Username) regarding the Classwork 3 materials, or need clarification on any points, don’t hesitate to ask here.
Let’s collaborate and learn from each other!