df <- data.frame(num = c(1, 4, NA, 2, NA, 3, 7),
                 chr = c("A", "A", "A", "B", "X", "Z", "D"))
Classwork 9
Filtering observations
Question 1. Filtering with filter()
Q1a.
- Find all flights that
- Had an arrival delay of two or more hours
- Flew to Houston (IAH or HOU)
- Were operated by United, American, or Delta
- Departed in summer (July, August, and September)
- Arrived more than two hours late, but didn’t leave late
- Were delayed by at least an hour, but made up over 30 minutes in flight
- Departed between midnight and 6am (inclusive)
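As a starting point for conditions like these, here is a minimal sketch of filter() on a small, hypothetical toy table (toy_flights and its values are invented for illustration; the column names and the carrier codes UA, AA, and DL are assumed to match the flights data used in class).

```r
library(dplyr)

# Hypothetical toy data that mimics a few columns of the flights table.
toy_flights <- tibble(
  dest      = c("IAH", "HOU", "ORD", "IAH"),
  carrier   = c("UA", "WN", "AA", "DL"),
  month     = c(7, 8, 1, 9),
  dep_delay = c(0, 130, 65, -3),   # minutes
  arr_delay = c(125, 140, 20, 121) # minutes
)

# %in% tests membership in a set of values.
to_houston <- filter(toy_flights, dest %in% c("IAH", "HOU"))

# Conditions separated by commas are combined with "and".
late_but_on_time_departure <- filter(toy_flights, arr_delay > 120, dep_delay <= 0)

# Delayed by at least an hour, but made up over 30 minutes in flight.
caught_up <- filter(toy_flights, dep_delay >= 60, dep_delay - arr_delay > 30)
```

The same pattern should carry over to the real table by swapping toy_flights for flights; the remaining conditions are left as part of the exercise.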
Q1b.
- How many flights have a missing dep_time? What other variables are missing? What might these rows represent?
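One way to approach counting missing values is sketched below in base R. The data frame toy is hypothetical and only borrows column names from the flights table; the idea is that is.na() returns TRUE/FALSE, so summing it counts the missing entries.

```r
# Hypothetical toy data; dep_time and dep_delay are missing together in rows 2 and 4.
toy <- data.frame(
  dep_time       = c(517, NA, 542, NA),
  dep_delay      = c(2, NA, -1, NA),
  sched_dep_time = c(515, 600, 540, 630)
)

# TRUE counts as 1, so the sum is the number of missing values.
n_missing_dep <- sum(is.na(toy$dep_time))

# The same count for every column at once.
missing_per_column <- colSums(is.na(toy))
```

Comparing the per-column counts can hint at which variables tend to be missing together.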
Q1c.
- Why is NA^0 not missing? Why is NA | TRUE not missing? Why is FALSE & NA not missing? Can you figure out the general rule? (NA * 0 is a tricky counterexample!)
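Before reasoning about the rule, it may help to evaluate the expressions directly. The sketch below only stores each result (the variable names are arbitrary); the last line is an added hint, not part of the original question.

```r
power_zero <- NA^0        # evaluates to 1
or_true    <- NA | TRUE   # evaluates to TRUE
and_false  <- FALSE & NA  # evaluates to FALSE
times_zero <- NA * 0      # evaluates to NA

# Hint for the counterexample: not every number times zero is zero.
inf_times_zero <- Inf * 0 # evaluates to NaN
```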
Q1d.
- Was there a flight on every day of 2013?
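A possible approach is to count the distinct (month, day) pairs and compare the result with 365, the number of days in 2013. The sketch below uses a hypothetical toy table with one day deliberately removed; the column names month and day are assumed to match the flights data.

```r
library(dplyr)

# Hypothetical toy data: one row per day of 2013, with 14 February removed.
all_days <- seq(as.Date("2013-01-01"), as.Date("2013-12-31"), by = "day")
kept <- all_days[all_days != as.Date("2013-02-14")]
toy_flights <- tibble(
  month = as.integer(format(kept, "%m")),
  day   = as.integer(format(kept, "%d"))
)

# distinct() keeps one row per unique (month, day) combination.
days_with_flights <- nrow(distinct(toy_flights, month, day))
every_day_covered <- days_with_flights == 365
```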
Question 2. Arrange rows with arrange()
Q2a.
- How could you use arrange() to sort all missing values to the start? (Hint: use is.na().)
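Following the hint, one possible sketch sorts on is.na() first. It uses the small df from the top of this sheet (redefined here so the example runs on its own); arrange() places NA last by default, so sorting by desc(is.na(num)) moves the missing rows to the front.

```r
library(dplyr)

df <- data.frame(num = c(1, 4, NA, 2, NA, 3, 7),
                 chr = c("A", "A", "A", "B", "X", "Z", "D"))

# is.na(num) is TRUE for missing rows; desc() puts TRUE before FALSE.
# The second key, num, orders the remaining rows in ascending order.
na_first <- arrange(df, desc(is.na(num)), num)
```

The same idea should apply to flights with a column such as dep_time in place of num.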
Q2b.
- Sort flights to find the most delayed flights. Find the flights that left earliest.
Q2c.
- Sort flights to find the fastest (highest speed) flights.
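Speed is not a column of its own, so one option is to compute it before sorting. The sketch below uses a hypothetical toy table and assumes distance is recorded in miles and air_time in minutes, as in the nycflights13 flights data; speed_mph is a name introduced here for illustration.

```r
library(dplyr)

# Hypothetical toy data with distance in miles and air_time in minutes.
toy_flights <- tibble(
  flight   = c(1, 2, 3),
  distance = c(500, 1000, 300),
  air_time = c(60, 150, 30)
)

# Convert to miles per hour, then sort from fastest to slowest.
by_speed <- toy_flights %>%
  mutate(speed_mph = distance / air_time * 60) %>%
  arrange(desc(speed_mph))
```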
Q2d.
- Which flights traveled the farthest? Which traveled the shortest distance?